Write a program to print the names of the files with the highest and lowest similarity

Reference no: EM1388359

The basic task is to measure the similarity between any two files in our collection. To do this, we need an appropriate universe of words. It consists of all words in the collection that (a) are more than four letters long, (b) do not occur more than 20 times overall, and (c) do not occur in more than 7 files in the collection. We then construct a vector (in the mathematical sense) for each file. The vector has as many coordinates as there are words in the universe, one coordinate per word. If the word occurs in the file, the corresponding coordinate is 1; otherwise it is 0.
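The three filtering rules above can be sketched as follows. This is a minimal sketch, not a reference solution: the function name `build_universe`, the `texts` dictionary (file name mapped to file contents), and the tokenization rule (lowercase runs of the letters a to z) are all assumptions, since the assignment does not say how words are to be split.

```python
import re
from collections import Counter

def build_universe(texts, min_len=5, max_total=20, max_files=7):
    """Return the sorted universe of words for a collection.

    texts: dict mapping file name -> file contents (hypothetical input shape).
    Tokenization (lowercase runs of a-z) is an assumption, not part of the task.
    """
    total = Counter()     # occurrences of each word across the whole collection
    in_files = Counter()  # number of files in which each word appears
    for content in texts.values():
        words = re.findall(r"[a-z]+", content.lower())
        total.update(words)
        in_files.update(set(words))  # count each file at most once per word
    return sorted(
        w for w in total
        if len(w) >= min_len            # (a) more than four letters
        and total[w] <= max_total       # (b) at most 20 occurrences overall
        and in_files[w] <= max_files    # (c) appears in at most 7 files
    )
```

Sorting the universe fixes the order of the coordinates, so every file's vector uses the same word at the same position.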

For example, assume the universe consists of five words: apple, grapes, banana, doctor, program. Assume file1 contains apple, banana, and program. Then the vector for file1 is (1,0,1,0,1).
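The example can be reproduced with a few lines of code. The helper name `file_vector` is an assumption made for illustration; it takes the universe as an ordered list and the words found in one file.

```python
def file_vector(universe, file_words):
    """Build the 0/1 vector of a file: one coordinate per universe word."""
    present = set(file_words)  # set membership makes each lookup cheap
    return [1 if word in present else 0 for word in universe]

universe = ["apple", "grapes", "banana", "doctor", "program"]
print(file_vector(universe, ["apple", "banana", "program"]))  # [1, 0, 1, 0, 1]
```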

We need to normalize each vector so that it has unit length. The vector above has three coordinates equal to 1, so its length is the square root of 3, and each coordinate is divided by the square root of 3.
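A minimal sketch of the normalization step is shown here. The function name `normalize` is an assumption, and so is the handling of a file that contains no universe words: the assignment does not cover that case, so the all-zero vector is simply returned unchanged to avoid dividing by zero.

```python
import math

def normalize(vector):
    """Scale a vector to unit (Euclidean) length."""
    length = math.sqrt(sum(x * x for x in vector))
    if length == 0:
        # Assumption: a file with no universe words keeps its all-zero vector.
        return list(vector)
    return [x / length for x in vector]

unit = normalize([1, 0, 1, 0, 1])  # each nonzero coordinate becomes 1/sqrt(3)
```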

The similarity of two files is defined as the scalar product of their two normalized vectors. The scalar product of two vectors is obtained by multiplying corresponding components and adding the results. For instance, the scalar product of (2,1,3) and (0,5,6) is 2 * 0 + 1 * 5 + 3 * 6 = 23.
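The scalar product translates directly into code. The name `scalar_product` is chosen here for illustration, and the length check is an added safeguard rather than something the assignment requires.

```python
def scalar_product(u, v):
    """Multiply corresponding components of u and v and add the results."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of coordinates")
    return sum(a * b for a, b in zip(u, v))

print(scalar_product((2, 1, 3), (0, 5, 6)))  # 23
```

Applied to two normalized file vectors, the result lies between 0 (no shared universe words) and 1 (identical sets of universe words).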

Your task is to write a program that prints the names of the two files with the highest similarity among the files in the collection, and the names of the two files with the lowest similarity.
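One possible way to select the two pairs is to score every unordered pair of files and take the maximum and the minimum. The sketch below assumes the normalized vectors have already been computed and stored in a dictionary keyed by file name; the function name `extreme_pairs` and the sample file names are hypothetical. When several pairs tie, the first one in alphabetical pair order is reported, which is a choice made here and not specified by the assignment.

```python
from itertools import combinations

def extreme_pairs(unit_vectors):
    """Return (most similar pair, least similar pair) of file names.

    unit_vectors: dict mapping file name -> normalized vector (assumed input).
    """
    scored = []
    for f, g in combinations(sorted(unit_vectors), 2):  # each pair exactly once
        score = sum(a * b for a, b in zip(unit_vectors[f], unit_vectors[g]))
        scored.append((score, f, g))
    if not scored:
        raise ValueError("need at least two files to compare")
    highest = max(scored, key=lambda item: item[0])
    lowest = min(scored, key=lambda item: item[0])
    return (highest[1], highest[2]), (lowest[1], lowest[2])

# Hypothetical collection of three files with already normalized vectors.
vectors = {"a.txt": [1.0, 0.0], "b.txt": [1.0, 0.0], "c.txt": [0.0, 1.0]}
most, least = extreme_pairs(vectors)
print("Highest similarity:", most[0], most[1])
print("Lowest similarity:", least[0], least[1])
```

Comparing all pairs takes time proportional to the square of the number of files, which is reasonable for a small collection.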

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd