Calculate the appropriate weight for each query term, Programming Languages

1-Create ir3.py based on ir2.py

2-Repeatedly prompt the user for a query (if they enter "q", then quit)

3-Find the terms in the query, and calculate the appropriate weight for each query term

• (hint:) weight for a query term = log2(total number of documents / number of documents that contain the term).

• In Python: weight = log(float(len(documents)) / docfreq[term]) / log(2)

• The output for the query "quick brown vex zebras" should be:

Doc name    Term      Weights
Q           Quick     0.58
Q           Brown     1.58
Q           Vex       0.58
Q           Zebras    1.58
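The weight formula from the hint can be sketched roughly as below. This is only an illustration: `query_weights` is a hypothetical helper name, and the `docfreq` values and the document count of 3 are assumptions chosen because they reproduce the expected weights (log2(3/2) ≈ 0.58 and log2(3/1) ≈ 1.58). In ir3.py these values would come from whatever ir2.py already computes. Skipping terms that appear in no document is also an assumption, made here to avoid a missing-key error.

```python
from math import log

def query_weights(querystring, docfreq, num_docs):
    """Return {term: log2(num_docs / docfreq[term])} for the query terms."""
    weights = {}
    for term in querystring.lower().split():
        # Assumption: a term found in no document gets no weight at all.
        if term in docfreq:
            weights[term] = log(float(num_docs) / docfreq[term]) / log(2)
    return weights

# Assumed document frequencies, picked only to match the expected table.
docfreq = {'quick': 2, 'brown': 1, 'vex': 2, 'zebras': 1}
query = 'quick brown vex zebras'
weights = query_weights(query, docfreq, 3)
for term in query.split():
    print('Q %-8s %.2f' % (term, weights[term]))
```

The code is written so that it runs under both Python 2 and Python 3.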

4-Calculate the similarity for each query/document pair

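(hint:) similarity = (Q · D1) / (|Q| |D1|), that is, the dot product of the query and document weight vectors divided by the product of their lengths. For example:

[Image: 2361_Calculate the appropriate weight for each query term.png]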
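One possible way to code the similarity from the hint is shown below, with each vector stored as a dictionary mapping a term to its weight. The function name `cosine_similarity` and the sample vectors are made up for illustration and are not the assignment's documents. Returning 0.0 for an empty vector is an assumption, not something the assignment states.

```python
from math import sqrt

def cosine_similarity(q, d):
    """Q . D / (|Q| |D|) for two sparse vectors stored as {term: weight}."""
    dot = sum(w * d.get(term, 0.0) for term, w in q.items())
    norm_q = sqrt(sum(w * w for w in q.values()))
    norm_d = sqrt(sum(w * w for w in d.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0  # assumption: an empty vector is similar to nothing
    return dot / (norm_q * norm_d)

# Made-up vectors used only to exercise the function.
q = {'quick': 1.0, 'brown': 2.0}
d1 = {'quick': 2.0, 'brown': 4.0}  # points in the same direction as q
d2 = {'zebras': 3.0}               # shares no terms with q
print('%.2f' % cosine_similarity(q, d1))
print('%.2f' % cosine_similarity(q, d2))
```

Vectors that point in the same direction score close to 1, and vectors with no terms in common score 0.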

5-List the documents in order of decreasing similarity to the query, along with their similarity value

• Your results for "quick brown vex zebras" should be:

D1.txt 0.42, D3.txt 0.33, D2.txt 0.08

7-Make sure that querying "quick brown vex zebras" a 2nd time gives the same result

8-What is the result for the query "quick brown vex lion"?

General hints:

• For user input:

    while True:
        querystring = raw_input('\nEnter query (q to quit): ')
        if querystring == 'q':
            print '\nGoodbye!\n'
            break
        # ...do more stuff...

• To sort a dictionary in descending order by value:

    from operator import itemgetter
    items = results.items()
    items.sort(key=itemgetter(1), reverse=True)
    for (document, ranking) in items:
        print document, "%.2f" % ranking
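The sorting hint relies on `results.items()` returning a list, which is only true in Python 2. A variant using `sorted()` should behave the same way in both Python 2 and Python 3. The helper name `rank` is hypothetical, and the sample values are the expected similarities quoted in step 5.

```python
from operator import itemgetter

def rank(results):
    """Return (document, similarity) pairs, highest similarity first."""
    return sorted(results.items(), key=itemgetter(1), reverse=True)

# Expected similarities for "quick brown vex zebras", in arbitrary order.
results = {'D2.txt': 0.08, 'D1.txt': 0.42, 'D3.txt': 0.33}
for document, ranking in rank(results):
    print('%s %.2f' % (document, ranking))
```

Because `sorted()` builds a new list and leaves `results` untouched, running the same query again starts from the same data.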

Posted Date: 2/15/2013 12:12:47 AM | Location : United States






