Calculate the appropriate weight for each query term, Programming Languages

Assignment Help:

1-Create ir3.py based on ir2.py

2-Repeatedly prompt the user for a query (if they enter "q", then quit)

3-Find the terms in the query, and calculate the appropriate weight for each query term

• (Hint) Weight for a query term = log2(total number of documents / number of documents in which the term appears, i.e. its document frequency).

• In Python: weight = log(float(len(documents)) / docfreq[term]) / log(2)

• The output for the query "quick brown vex zebras" should be:

Doc name    Term      Weights
Q           Quick     0.58
Q           Brown     1.58
Q           Vex       0.58
Q           Zebras    1.58
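The weight formula can be sketched as follows. This is a minimal illustration in Python 3 syntax (the hints on this page use Python 2), not the assignment's reference solution. The collection statistics are assumptions chosen to match the expected weights: three documents, with "quick" and "vex" each appearing in two of them and "brown" and "zebras" in one. The function name `query_weights` is hypothetical.

```python
import math

# Assumed collection statistics (not given in the source): three documents,
# with "quick" and "vex" in two of them and "brown" and "zebras" in one.
num_documents = 3
docfreq = {"quick": 2, "brown": 1, "vex": 2, "zebras": 1}


def query_weights(querystring, docfreq, num_documents):
    """Return {term: log2(N / df)} for each query term present in docfreq."""
    weights = {}
    for term in querystring.lower().split():
        if term in docfreq:
            weights[term] = math.log2(num_documents / docfreq[term])
    return weights


weights = query_weights("quick brown vex zebras", docfreq, num_documents)
for term, weight in weights.items():
    print("Q", term, "%.2f" % weight)
```

Under these assumed document frequencies the formatted weights come out as 0.58 and 1.58, matching the table.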

4-Calculate the similarity for each query/document pair

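(Hint) The similarity is the cosine similarity between the query vector and the document vector: similarity(Q, D1) = (Q · D1) / (|Q| |D1|), where Q · D1 is the dot product of the two term-weight vectors and |Q| and |D1| are their lengths. For example:

[Figure: 2361_Calculate the appropriate weight for each query term.png]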
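One possible way to compute this similarity is sketched below in Python 3 syntax. It assumes the query and each document are stored as dictionaries mapping a term to its weight; that representation, the function name `cosine_similarity`, and the sample weights are all assumptions for illustration and do not come from ir2.py.

```python
import math


def cosine_similarity(query_vec, doc_vec):
    """Cosine of the angle between two sparse term-weight vectors (dicts)."""
    # Dot product: only terms present in both vectors contribute.
    dot = sum(w * doc_vec.get(term, 0.0) for term, w in query_vec.items())
    q_len = math.sqrt(sum(w * w for w in query_vec.values()))
    d_len = math.sqrt(sum(w * w for w in doc_vec.values()))
    if q_len == 0.0 or d_len == 0.0:
        return 0.0  # assumed policy: an empty vector has no similarity
    return dot / (q_len * d_len)


# Hypothetical weights, for illustration only.
q = {"quick": 1.0, "brown": 2.0}
d1 = {"quick": 1.0, "brown": 2.0, "fox": 2.0}
similarity = cosine_similarity(q, d1)
print("%.2f" % similarity)
```

Here the dot product is 5, |Q| is the square root of 5 and |D1| is 3, so the similarity is 5 / (3 × √5), roughly 0.75.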

5-List the documents in order of decreasing similarity to the query, along with their similarity value

• Your results for "quick brown vex zebras" should be:

D1.txt 0.42, D3.txt 0.33, D2.txt 0.08

7-Make sure that querying "quick brown vex zebras" a 2nd time gives the same result

8-What is the result for the query "quick brown vex lion"?
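The page does not say whether "lion" occurs in any document. If it does not, a direct docfreq[term] lookup would raise a KeyError (or divide by zero if the count is stored as 0). A minimal sketch of one way to guard against that, in Python 3 syntax, is shown here; skipping unseen terms is an assumed policy, and `safe_query_weights` and the sample docfreq values are hypothetical.

```python
import math


def safe_query_weights(terms, docfreq, num_documents):
    """Weight each term by log2(N / df), skipping terms found in no document."""
    weights = {}
    for term in terms:
        df = docfreq.get(term, 0)
        if df == 0:
            continue  # assumed policy: an unseen term gets no weight
        weights[term] = math.log2(num_documents / df)
    return weights


# Hypothetical document frequencies in which "lion" never appears.
docfreq = {"quick": 2, "brown": 1, "vex": 2}
weights = safe_query_weights("quick brown vex lion".split(), docfreq, 3)
print(sorted(weights))
```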

General hints:

• For user input:
    while True:
        querystring = raw_input( '\nEnter query (q to quit): ' )
        if querystring == 'q':
            print '\nGoodbye!\n'
            break
        ...do more stuff...
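The same loop can be sketched in Python 3 syntax, where `raw_input` becomes `input` and `print` is a function. In this sketch the line reader and the per-query handler are passed in as arguments so the loop can be exercised without a keyboard; `run_queries`, `read_line` and `handle_query` are hypothetical names, not part of the assignment.

```python
def run_queries(read_line, handle_query):
    """Call handle_query for each query until the user enters 'q'."""
    handled = 0
    while True:
        querystring = read_line("\nEnter query (q to quit): ")
        if querystring == "q":
            print("\nGoodbye!\n")
            break
        handle_query(querystring)
        handled += 1
    return handled


# Interactive use would be: run_queries(input, some_handler)
# Scripted demonstration: the same query twice, then quit.
scripted = iter(["quick brown vex zebras", "quick brown vex zebras", "q"])
seen = []
count = run_queries(lambda prompt: next(scripted), seen.append)
```

Keeping all per-query state inside the handler, rather than in variables shared across iterations, is one way to make sure a repeated query returns the same result.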

• To sort a dictionary in descending order by value:
    from operator import itemgetter
    items = results.items()
    items.sort( key = itemgetter(1), reverse=True )
    for (document, ranking) in items:
        print document, "%.2f" % ranking
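In Python 3, `dict.items()` returns a view that has no `sort` method, so the hint above needs a small change. A minimal sketch using the built-in `sorted` follows; the `results` values are the similarities from the expected output for "quick brown vex zebras", entered by hand here rather than computed.

```python
from operator import itemgetter

# Similarity values copied from the expected output, not computed here.
results = {"D2.txt": 0.08, "D1.txt": 0.42, "D3.txt": 0.33}

# sorted() accepts any iterable and returns a new list of (name, value) pairs.
ranked = sorted(results.items(), key=itemgetter(1), reverse=True)
for document, ranking in ranked:
    print(document, "%.2f" % ranking)
```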



All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd