June 22, 2005
Binary relevance measurement in today’s information management systems
Within the LIS (Library and Information Science) sector there exist two concepts that everyone knows: precision and recall. They are worth a closer look because they are very traditional measures within the domain and, at least in earlier days, were widely used and respected parameters.
The Cranfield II Studies
These parameters were developed through a series of experiments called the Cranfield II studies, carried out in the UK in the 1960s. In these studies there was a distinction between two kinds of relevance: ‘stated relevance’ and ‘user relevance’. The difference is that the former is predefined, while the latter is based on the user's own assessment of relevance and is therefore subjective (Ellis 1990, p. 15-16).
In the Cranfield II studies, external parameters such as context, user level, etc. were not taken into account. This has of course been strongly criticized in much of the later theory within this domain (Harter, 1996, p. 45).
The Cranfield II studies treat relevance as binary: either a document is relevant and is retrieved, or it is not relevant and therefore is not retrieved (Bruhns, 1998, p. 17). Pretty basic.
The question is whether these parameters, precision and recall, are adequate for measuring relevance in today's intelligent information retrieval systems.
There is an ongoing discussion about how many aspects relevance as a concept has. As the Cranfield II studies operate with a binary concept of relevance, it can be argued that this concept is far too narrow. On the other hand, if the judgement ‘relevant’ is drawn rather broadly, it can cover all the grey areas of what is really relevant to the user or the search.
Nowadays, though, the most common approach is to use more than two levels for the concept of relevance.
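To make the narrow versus broad distinction concrete, here is a minimal sketch of how graded judgements could be collapsed into a binary one. The 0-3 scale, the document ids and the function name `binary_relevant` are all assumptions made up for illustration; nothing here comes from the Cranfield material itself.

```python
# Hypothetical graded judgements on a 0-3 scale
# (0 = not relevant, 3 = highly relevant). The scale is an assumption.
graded = {"d1": 3, "d2": 1, "d3": 2, "d4": 0, "d5": 1}


def binary_relevant(judgements, threshold):
    """Collapse graded judgements to a binary set: relevant if grade >= threshold."""
    return {doc for doc, grade in judgements.items() if grade >= threshold}


narrow = binary_relevant(graded, threshold=3)  # only the top grade counts
broad = binary_relevant(graded, threshold=1)   # every grey-area document counts

print(sorted(narrow))  # ['d1']
print(sorted(broad))   # ['d1', 'd2', 'd3', 'd5']
```

The same five documents give one relevant document under the narrow reading and four under the broad one, which is exactly the choice a binary measure forces on you.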
Precision is defined as:
The ratio between the number of documents that are both relevant and retrieved and the total number of retrieved documents.
Recall is defined as:
The ratio between the number of documents that are both relevant and retrieved and the total number of relevant documents in the system, retrieved or not. This measure presupposes that the total number of relevant documents in the system for the specific search is known.
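The two definitions can be sketched in a few lines of code. This is only an illustration under stated assumptions: the document ids are invented, `precision_recall` is a hypothetical helper name, and returning 0.0 when a set is empty is a convention chosen here, not part of the definitions.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search, given collections of document ids."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # documents that are both relevant and retrieved

    # Assumption: report 0.0 instead of dividing by zero when a set is empty.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 documents retrieved, 5 relevant documents in the system.
retrieved_docs = {"d1", "d2", "d3", "d4"}
relevant_docs = {"d1", "d3", "d5", "d6", "d7"}

precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
# precision = 0.50, recall = 0.40
```

Note that `relevant_docs` has to be complete for the recall figure to mean anything, which is the practical side of the requirement that the total number of relevant documents is known.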
The tentative conclusion must then be that if a system can be trained sufficiently and is intelligent enough, the precision and recall measures should be acceptable for measuring binary relevance. But then a new question emerges: is anyone interested in a relatively narrow or very broad definition of ‘relevant’?
References:
Bruhns, 1998.
Ellis, P., 1990.
Harter, 1996.
Come on - there's gotta be more interesting things in the world than this (yawn)
Well well I happen to think it is interesting, but I do understand your lack of excitement though!