Calculate the appropriate weight for each query term, Programming Languages

Assignment Help:

1-Create ir3.py based on ir2.py

2-Repeatedly prompt the user for a query (if they enter "q", then quit)

3-Find the terms in the query, and calculate the appropriate weight for each query term

• (hint:) weight for a query term = log2(total number of documents / number of documents that contain the term).

• In Python: weight = log( float( len(documents) ) / docfreq[term] ) / log(2), with log taken from the math module

• the output for the query "quick brown vex zebras" should be:

Doc name    Term      Weights
Q           Quick     0.58
Q           Brown     1.58
Q           Vex       0.58
Q           Zebras    1.58
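
The weighting in step 3 can be sketched as follows. This is only a minimal illustration, not the ir2.py code: the function name query_weights, the whitespace tokenization, and the choice to skip terms that appear in no document are assumptions, and the sample document frequencies (for a 3-document collection) are picked only because they reproduce the expected weights listed above. The sketch uses print() so that it also runs under Python 3.

```python
import math

def query_weights(querystring, docfreq, num_docs):
    """Return {term: log2(num_docs / docfreq[term])} for the query's terms."""
    weights = {}
    for term in querystring.lower().split():
        # Assumption: a term found in no document has no defined weight
        # (it would divide by zero), so it is left out of the query vector.
        if docfreq.get(term, 0) == 0:
            continue
        weights[term] = math.log(float(num_docs) / docfreq[term]) / math.log(2)
    return weights

# Hypothetical document frequencies for a 3-document collection.
docfreq = {"quick": 2, "brown": 1, "vex": 2, "zebras": 1}
weights = query_weights("quick brown vex zebras", docfreq, 3)
for term in ["quick", "brown", "vex", "zebras"]:
    print("Q %s %.2f" % (term, weights[term]))
```

Because the weights are built in a fresh dictionary on every call, asking the same query twice gives the same weights.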

4-Calculate the similarity for each query/document pair

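(hint:) the similarity is the cosine of the angle between the query vector Q and the document vector D1: similarity = (Q · D1) / (|Q| |D1|), where Q · D1 is the dot product of the term weights and |Q| and |D1| are the vector lengths. For example:

[Image: 2361_Calculate the appropriate weight for each query term.png]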
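
The similarity formula in step 4 might be coded along these lines. It is a sketch under stated assumptions: both vectors are taken to be dictionaries mapping a term to its weight, the name cosine_similarity is hypothetical, and the sample weights are made up purely to exercise the function (they are not the assignment's documents).

```python
import math

def cosine_similarity(query_vec, doc_vec):
    """Cosine of the angle between two sparse vectors stored as {term: weight}."""
    # Dot product: only terms present in both vectors contribute.
    dot = sum(w * doc_vec.get(term, 0.0) for term, w in query_vec.items())
    q_len = math.sqrt(sum(w * w for w in query_vec.values()))
    d_len = math.sqrt(sum(w * w for w in doc_vec.values()))
    if q_len == 0 or d_len == 0:
        return 0.0  # assumption: an empty vector is similar to nothing
    return dot / (q_len * d_len)

# Made-up weights: the vectors share only the term "a".
q = {"a": 3.0, "b": 4.0}
d = {"a": 3.0, "c": 4.0}
print("%.2f" % cosine_similarity(q, d))  # 9 / (5 * 5) = 0.36
```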

5-List the documents in order of decreasing similarity to the query, along with their similarity value

• Your results for "quick brown vex zebras" should be:

D1.txt 0.42, D3.txt 0.33, D2.txt 0.08

7-Make sure that querying "quick brown vex zebras" a second time gives the same result

8-What is the result for the query "quick brown vex lion"?

General hints:

• For user input:
    while True:
        querystring = raw_input( '\nEnter query (q to quit): ' )
        if querystring == 'q':
            print '\nGoodbye!\n'
            break
        # ...do more stuff...

• To sort a dictionary in descending order by value:
    from operator import itemgetter
    items = results.items()
    items.sort( key = itemgetter(1), reverse=True )
    for (document, ranking) in items:
        print document, "%.2f" % ranking
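
The sorting hint relies on Python 2, where results.items() returns a list. In Python 3 it returns a view with no sort() method, so a version that runs under either interpreter can use sorted() instead. The sketch below assumes results maps a document name to its similarity. The helper name rank is hypothetical, and the sample values are simply the expected results from step 5 rather than anything computed here.

```python
from operator import itemgetter

def rank(results):
    """Return (document, similarity) pairs, highest similarity first."""
    return sorted(results.items(), key=itemgetter(1), reverse=True)

# Similarities copied from the expected output for "quick brown vex zebras".
results = {"D2.txt": 0.08, "D1.txt": 0.42, "D3.txt": 0.33}
line = ", ".join("%s %.2f" % (doc, sim) for doc, sim in rank(results))
print(line)  # D1.txt 0.42, D3.txt 0.33, D2.txt 0.08
```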



All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd