Retrieval evaluation measures, Management Information Sys

Retrieval Evaluation Measures 

Objective retrieval evaluation measures have been used from the beginning to assess the performance of retrieval systems. Many different parameters can in principle be used to measure retrieval performance. Of these, the best known and most widely used are recall (R) and precision (P): recall is the proportion of relevant items that are retrieved in answer to a search request, and precision is the proportion of retrieved items that are relevant. The main assumption behind measures such as recall and precision is that the average user is interested in retrieving large amounts of relevant material (producing high recall) while at the same time rejecting a large proportion of the extraneous items (producing high precision). These assumptions may not always be satisfied. Nevertheless, recall-precision measurements have formed the basis for the evaluation of the better-known studies of both operational and laboratory-type retrieval systems.
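The two definitions above can be illustrated with a minimal Python sketch. The function name `precision_recall` and the document identifiers are hypothetical and chosen only for illustration; relevance is treated here as a simple yes/no set membership, which is a simplifying assumption.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: identifiers of the items the system returned
    relevant:  identifiers of the items judged relevant to the query
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant items that were actually retrieved

    # Precision: proportion of retrieved items that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: proportion of relevant items that were retrieved.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 items retrieved, 5 items relevant, 2 in common.
retrieved_ids = ["d1", "d2", "d3", "d4"]
relevant_ids = ["d1", "d3", "d5", "d6", "d7"]
p, r = precision_recall(retrieved_ids, relevant_ids)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this example two of the four retrieved items are relevant (precision 0.5), and two of the five relevant items were retrieved (recall 0.4).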

Recall-precision measurements have not proved universally acceptable, however. Objections of both a theoretical and a practical nature have been raised. The most serious questions relate to the fact that recall, in particular, is apparently incompatible with the utility-theoretic approach to retrieval, which forms the basis of a good deal of existing information retrieval theory. Under the utility-theoretic perspective, retrieval effectiveness is measured by determining the utility to the users of the documents retrieved in answer to a user query. The problem is that a document may be relevant to a query while nevertheless proving useless for a variety of reasons. The system utility might be optimised in some cases by bringing a single relevant document to the user's attention, at which point the recall might be very low. Hence, recall and relevance are quite different notions.
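The tension between recall and utility can be shown with a small Python sketch. The function `toy_utility` is a hypothetical, deliberately crude stand-in for a utility measure (it returns 1.0 as soon as one useful document is retrieved); it is an assumption made for illustration, not a measure defined in the text.

```python
def recall(retrieved, relevant):
    """Proportion of relevant documents that were retrieved."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


def toy_utility(retrieved, useful):
    """Hypothetical utility: 1.0 if at least one retrieved document
    is actually useful to the user, otherwise 0.0."""
    return 1.0 if set(retrieved) & set(useful) else 0.0


# Assumed scenario: ten documents are relevant to the query,
# but a single one of them already satisfies the user's need.
relevant_docs = [f"d{i}" for i in range(1, 11)]
useful_docs = ["d1"]
retrieved_docs = ["d1"]

print("recall :", recall(retrieved_docs, relevant_docs))
print("utility:", toy_utility(retrieved_docs, useful_docs))
```

Under these assumptions the system retrieves one document, the user is fully served, and recall is only 0.1.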

The most important factor in determining how well a system is working is the relevance of the items retrieved from the system in response to a query from the user. Relevancy of an item, however, is not a binary evaluation but a continuous function, ranging from the item being exactly what is being looked for to its being totally unrelated. To discuss relevance, it is necessary to define the context in which the concept is used. From a human judgement standpoint, relevancy can be considered:

  1. Subjective, i.e., depends upon a specific user's judgement. 
  2. Situational, i.e., relates to a user's requirements. 
  3. Cognitive, i.e., depends on human perception and behaviour.
  4. Temporal, i.e., changes over time. 
  5. Measurable, i.e., observable at points in time.

The subjective nature of relevance judgements has been documented by Saracevic (1995) and was shown in the TREC experiments. In a dynamic environment each user has his own understanding of the requirement and his own threshold for what is acceptable. Based upon his cognitive model of the information and the problem, the user judges a particular item to be relevant or not. Some users consider information they already know to be non-relevant to their information need. Judgements of relevance can also vary over time. Thus, a relevance judgement is measurable only at a point in time, constrained by the particular users and their thresholds on the acceptability of information.

Another way of specifying relevance is from the information, system and situational views. Here again, the information view is subjective in nature and pertains to human judgement of the conceptual relatedness between an item and the search. It involves the user's personal judgement of the relevancy of the item to the user's information need. When information professionals assist the user, it is assumed that they can reasonably predict whether certain information will satisfy the user's needs. Ingwersen (1992) divides the information view into four categories of 'aboutness':

  1. 'Author Aboutness' - determined by the author's language as matched by the system in natural language retrieval. 
  2. 'Index Aboutness' - determined by the indexer's transformation of the author's natural language into a controlled vocabulary. 
  3. 'Request Aboutness' - determined by the user's or intermediary's processing of a search statement into a query. 
  4. 'User Aboutness' - determined by the indexer's attempt to represent the document according to presuppositions about what the user will want to know. 

In this context, the system view relates to a match between query terms and terms within an item. It can be objectively observed and tested without relying on human judgement. The situational view, on the other hand, pertains to the relationship between the information and the user's information problem situation. It assumes that only users can make valid judgements regarding the suitability of information to solve their information need. Lancaster and Warner refer to the information and situational views as relevance and pertinence respectively. Pertinent items can be defined as those that satisfy the user's information need at the time of retrieval. Pertinence therefore depends on each situation.
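The system view, being a match between query terms and item terms, can be computed without any human judgement. The following Python sketch is one simple way to do so, assuming whitespace tokenisation and case-insensitive exact matching; the function name `term_match_score` and the sample strings are hypothetical, and real systems typically use more elaborate matching.

```python
def term_match_score(query, item_text):
    """Fraction of distinct query terms that also occur in the item.

    Assumes whitespace tokenisation and case-insensitive exact matching.
    """
    query_terms = set(query.lower().split())
    item_terms = set(item_text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & item_terms) / len(query_terms)


query = "retrieval evaluation measures"
item = "Recall and precision are evaluation measures"
print(term_match_score(query, item))
```

Here two of the three query terms ("evaluation" and "measures") occur in the item. A score of this kind says nothing about pertinence, which only the user can judge.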

Evaluation of IR systems is essential for understanding the sources of weakness in existing systems and for improving their effectiveness. The standard measures of Precision, Recall and Relevance have been used for the last 25 years as the major measures of algorithmic effectiveness.

The major problem associated with the evaluation of IR systems is the subjective nature of the information. There is no deterministic methodology for understanding what is relevant to a user's search. Users have trouble translating their mental perception of the information being sought into the written language of a search statement. When facts are needed, users are able to provide a specific relevance judgement on an item. But when general information is needed, relevancy goes from a classification process to a continuous function, and users are not able to distinguish relevant items from non-relevant ones easily.

The Text REtrieval Conferences (TRECs) provide yearly forums where developers of algorithms can share their techniques with their peers and contribute theories of evaluation, which may lead to the design and development of effective and efficient IR systems in future.

Posted Date: 10/24/2012 3:28:44 AM | Location : United States






