Information Retrieval and Extraction
Spring 2011

Homework Webpage


Homework: Evaluation Measures

The query-document relevance information (AssessmentTrainSet.txt) for a set of 16 queries and a collection of 2,265 documents is provided. An IR model was tested on this query set, and the corresponding ranking results were saved in a file (ResultsTrainSet.txt). Please evaluate the overall model performance using the following two measures.

1. Interpolated Recall-Precision Curve:
    (for each query)
    (overall performance)

2. (Non-interpolated) Mean Average Precision:


Here, "non-interpolated average precision" is the "average precision at seen relevant documents" introduced in the textbook.

Example 1: Interpolated Recall-Precision Curve
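
A minimal sketch of how the interpolated curve might be computed, assuming the common convention that the interpolated precision at a standard recall level is the highest precision observed at any recall greater than or equal to that level, with the 11 standard levels 0.0, 0.1, ..., 1.0. The function names and the in-memory inputs (a ranked list of document IDs and a set of relevant IDs per query) are illustrative assumptions; reading ResultsTrainSet.txt and AssessmentTrainSet.txt is left out because their layout is not described here.

```python
from typing import Hashable, List, Sequence, Set

# The 11 standard recall levels: 0.0, 0.1, ..., 1.0 (an assumed convention).
RECALL_LEVELS = [i / 10 for i in range(11)]


def interpolated_precision(ranking: Sequence[Hashable],
                           relevant: Set[Hashable]) -> List[float]:
    """Interpolated precision at each standard recall level for one query."""
    if not relevant:
        return [0.0] * len(RECALL_LEVELS)

    # Record (recall, precision) at every rank where a relevant document appears.
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))

    # Interpolate: best precision at any recall >= the level (0.0 if none).
    curve = []
    for level in RECALL_LEVELS:
        candidates = [p for r, p in points if r >= level]
        curve.append(max(candidates, default=0.0))
    return curve


def average_curve(curves: Sequence[Sequence[float]]) -> List[float]:
    """Overall performance: mean interpolated precision over all queries."""
    n = len(curves)
    return [sum(c[j] for c in curves) / n for j in range(len(RECALL_LEVELS))]
```

For a query with relevant documents {d1, d4} and the ranking d1, d2, d3, d4, the sketch gives precision 1.0 at recall levels 0.0 through 0.5 and 0.5 at levels 0.6 through 1.0.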

Example 2: (Non-interpolated) Mean Average Precision
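
A minimal sketch of non-interpolated mean average precision, assuming "average precision at seen relevant documents" means averaging the precision values measured at each rank where a relevant document appears in the ranking. The function names and input structures are illustrative assumptions. When every relevant document appears somewhere in the ranking, this equals dividing by the total number of relevant documents.

```python
from typing import Dict, Hashable, Sequence, Set


def average_precision(ranking: Sequence[Hashable],
                      relevant: Set[Hashable]) -> float:
    """Average of the precision values at each seen relevant document."""
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumption: average over the relevant documents actually seen.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Dict[Hashable, Sequence[Hashable]],
                           assessments: Dict[Hashable, Set[Hashable]]) -> float:
    """Mean of the per-query average precision over all queries."""
    scores = [average_precision(rankings[q], assessments.get(q, set()))
              for q in rankings]
    return sum(scores) / len(scores) if scores else 0.0
```

For example, a ranking d1, d2, d3, d4 with relevant documents {d1, d4} has precision 1/1 at d1 and 2/4 at d4, so its average precision is 0.75.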


Homework: Classic Retrieval Models

A set of text queries (16 queries) and a collection of text documents (2,265 documents) are provided, in which each word is represented as a number, except that the number "-1" is a delimiter.
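
A minimal sketch of reading such number-coded text, assuming the numbers are separated by whitespace and that "-1" simply closes the current run of word IDs. What each run corresponds to (a sentence, a field, or something else) is not stated here, so the function name and this interpretation are assumptions.

```python
from typing import List


def parse_segments(text: str, delimiter: int = -1) -> List[List[int]]:
    """Split whitespace-separated word IDs into runs ended by the delimiter."""
    segments: List[List[int]] = []
    current: List[int] = []
    for token in text.split():
        value = int(token)
        if value == delimiter:
            # The delimiter closes the current run; empty runs are skipped.
            if current:
                segments.append(current)
                current = []
        else:
            current.append(value)
    if current:  # keep a trailing run that has no closing delimiter
        segments.append(current)
    return segments
```

If only a flat bag of words is needed for a document, the runs can be concatenated afterwards.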

Try to implement an information retrieval system based on the classic retrieval models. The query-document relevance information is in "AssessmentTrainSet.txt". You should evaluate your system with the two measures described above.
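
As one possible starting point, the sketch below ranks documents with a vector space model, one of the classic retrieval models, using TF-IDF weights and cosine similarity. The weighting scheme (raw term frequency times log(N/df)), the function names, and the in-memory inputs (lists of word IDs) are assumptions chosen for brevity, not a prescribed solution.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def build_idf(docs: Dict[Hashable, Sequence[int]]) -> Dict[int, float]:
    """Inverse document frequency log(N / df) for every term in the collection."""
    n_docs = len(docs)
    df: Counter = Counter()
    for terms in docs.values():
        df.update(set(terms))  # count each term once per document
    return {term: math.log(n_docs / count) for term, count in df.items()}


def tfidf_vector(terms: Sequence[int], idf: Dict[int, float]) -> Dict[int, float]:
    """Sparse TF-IDF vector; terms unseen in the collection are ignored."""
    tf = Counter(terms)
    return {t: freq * idf[t] for t, freq in tf.items() if t in idf}


def cosine(u: Dict[int, float], v: Dict[int, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(query: Sequence[int],
                   docs: Dict[Hashable, Sequence[int]]) -> List[Hashable]:
    """Document IDs sorted by decreasing similarity to the query."""
    idf = build_idf(docs)
    q_vec = tfidf_vector(query, idf)
    scores = {d: cosine(q_vec, tfidf_vector(terms, idf))
              for d, terms in docs.items()}
    # Ties are broken by document ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

The resulting ranked list for each query can be passed directly to the evaluation measures. For a collection of this size, precomputing the document vectors once, rather than per query as done here, would be the natural next step.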