Reference no: EM132460659
CITS1401 Computational Thinking with Python Assignment, School of Computer Science and Software Engineering - The University of Western Australia, Australia
Problem Solving and Programming Project - Finding Synonyms by Association
You should construct a Python 3 program containing your solution to the following problem and submit your program electronically using cssubmit. No other method of submission is allowed. Please submit only the Python 3 program file.
Tasks -
The only function definition that you must have is: def main(corpus_file_name, commonwords_file_name = None)
The names of the parameters are not relevant, but the function with that name and the arguments representing those file names must be present so my testing program will be able to interact properly with your program. (The file of common words is optional.) If the main() definition is not present, it will not be possible to test your program.
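As an illustration of the required signature, here is a minimal sketch of main() that only reports which files it was given. The file names "sample.txt" and "common.txt" are hypothetical, and the body is a placeholder, not the solution.

```python
def main(corpus_file_name, commonwords_file_name=None):
    # The second parameter defaults to None, so it may be left out of the call.
    if commonwords_file_name is None:
        print("No common-words file supplied")
    else:
        print("Common words will be read from", commonwords_file_name)
    print("Corpus will be read from", corpus_file_name)


# Both call styles are valid (hypothetical file names).
main("sample.txt")
main("sample.txt", "common.txt")
```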
Otherwise, how you structure your program is up to you. The tasks your program will have to undertake are:
If the commonwords file is present, open the file and read the words into an appropriate data structure.
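One possible way to handle the optional common-words file is sketched below. It assumes the file holds one word per line and that words should be compared in lower case; neither detail is stated above, so check them against the supplied files. The helper name read_common_words is hypothetical. A set is used because the only operation needed later is a membership test.

```python
def read_common_words(commonwords_file_name=None):
    """Return the common words as a set of lower-case strings."""
    common_words = set()
    if commonwords_file_name is None:
        # No file was supplied, so no word is treated as common.
        return common_words
    with open(commonwords_file_name) as common_file:
        for line in common_file:
            word = line.strip().lower()  # assumption: one word per line
            if word:
                common_words.add(word)
    return common_words
```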
Open the corpus file and read in the lines of text.
As the file is being read in, break it up into sentences.
- As each sentence is completed, break the sentence up into words.
- For each unique word (other than common words) in a sentence, update the counts of the profile entries for that word with respect to all the others. For example, after reading in the first sentence in the sample, the profile associated with "mississippi" will contain references to "worth" and "reading", but not "is", "well" or "about" (as these are in the commonwords). Each term will now have a count of 1. Similarly, "mississippi" will have a count of 1 in the profile of "worth", and so on.
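The sentence and profile steps above could be sketched as follows. Several details here are assumptions, not part of the specification: sentences are taken to end at '.', '?' or '!', words are runs of letters compared in lower case, and profiles are stored as a dictionary of dictionaries. The text is an invented example held in a string; in the real program it would be built up from the lines of the corpus file. All function names are hypothetical.

```python
import re


def split_sentences(text):
    # Assumption: a sentence ends at '.', '?' or '!'.
    pieces = re.split(r"[.?!]", text)
    return [piece.strip() for piece in pieces if piece.strip()]


def split_words(sentence):
    # Assumption: words are runs of letters, compared in lower case.
    return re.findall(r"[a-z]+", sentence.lower())


def update_profiles(profiles, sentence, common_words):
    # Each unique non-common word counts once per sentence.
    unique_words = set(split_words(sentence)) - common_words
    for word in unique_words:
        profile = profiles.setdefault(word, {})
        for other in unique_words:
            if other != word:
                profile[other] = profile.get(other, 0) + 1


profiles = {}
common_words = {"is", "well", "about"}
text = "Mississippi is well worth reading about. Reading is worth it."  # invented text
for sentence in split_sentences(text):
    update_profiles(profiles, sentence, common_words)

print(sorted(profiles["mississippi"].items()))
```

After the first sentence, "mississippi" has "worth" and "reading" in its profile with a count of 1 each; the second sentence raises the count that "worth" and "reading" hold for each other to 2.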
I recommend you write a function that prints out the data structure you have created to represent the profiles. I won't be testing that you have such a function, but I find it good practice as you can make sure that things are the way you expect.
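A sketch of such a printing function is given below, assuming the profiles are held as a dictionary mapping each word to a dictionary of counts. That representation and the names format_profiles and print_profiles are assumptions; adapt them to whatever structure you chose. Sorting the keys keeps the output stable from run to run, which makes it easier to compare against what you expect.

```python
def format_profiles(profiles):
    """Return one text line per word, with its profile entries in sorted order."""
    lines = []
    for word in sorted(profiles):
        entries = ", ".join(
            f"{other}: {count}" for other, count in sorted(profiles[word].items())
        )
        lines.append(f"{word} -> {entries}")
    return lines


def print_profiles(profiles):
    for line in format_profiles(profiles):
        print(line)


# Hand-built profiles matching the "mississippi" example above.
example_profiles = {
    "mississippi": {"worth": 1, "reading": 1},
    "worth": {"mississippi": 1, "reading": 1},
    "reading": {"mississippi": 1, "worth": 1},
}
print_profiles(example_profiles)
```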
Process query word sets until just a blank line is read in.
- Read in words, one per line until a blank line is read in. The first word is the target word.
- For each word, extract the corresponding profile and compare the second and subsequent profiles to the first. Generate a score for each comparison and record the score against the corresponding word.
- Sort the word-score list and print the words and the scores in order of descending score.
- Print the computed synonym.
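The query loop above might be sketched like this. The scoring measure is not named in the steps above, so cosine similarity between the two profiles is used purely as an assumption; substitute whatever measure the project requires. The query lines come from a list here so the example runs on its own; in the real program they would be read from the user. The profiles and all function names are invented for illustration.

```python
import math


def cosine_score(profile_a, profile_b):
    # Assumption: cosine similarity; the task text does not name a measure.
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[w] * profile_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in profile_a.values()))
    norm_b = math.sqrt(sum(c * c for c in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def read_query_set(lines):
    # Collect words, one per line, until a blank line (or the end of input).
    words = []
    for line in lines:
        word = line.strip().lower()
        if not word:
            break
        words.append(word)
    return words


def rank_candidates(profiles, words):
    # The first word is the target; the rest are compared against it.
    target_profile = profiles.get(words[0], {})
    scores = [(cosine_score(target_profile, profiles.get(w, {})), w) for w in words[1:]]
    scores.sort(key=lambda pair: pair[0], reverse=True)  # descending score
    return scores


profiles = {  # invented profiles
    "big": {"house": 2, "dog": 1},
    "large": {"house": 1, "dog": 1},
    "small": {"mouse": 3},
}
query_lines = iter(["big", "small", "large", "", ""])
while True:
    words = read_query_set(query_lines)
    if not words:
        break  # a set consisting of just a blank line ends the session
    ranked = rank_candidates(profiles, words)
    for score, word in ranked:
        print(word, round(score, 3))
    if ranked:
        print("Synonym for", words[0], "is", ranked[0][1])
```

Because each query set ends with a blank line and an empty set stops the loop, the same code handles one set or many.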
A script from a sample session can be found at: typescript. The sample session only contains a single set of target word plus query words; your program should be able to handle multiple sets (with a blank line between sets).