Write a program that will analyse and visualise mined feelings

Reference no: EM132489877

COMP 5070 Statistical Programming for Data Science

Aim: 1. Code all requested components

Aim: 2. Your written analysis of the output produced by your code

Aim: 3. Aim for optimised code in terms of computational overhead.

It is not always possible to avoid loops; however, you should avoid them where possible (e.g. use NumPy vectorisation as much as possible).
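As an illustration of the difference between a loop and a vectorised operation, the sketch below scales a set of made-up frequency counts both ways. The array contents and variable names are hypothetical and are not taken from the assignment data.

```python
import numpy as np

# Hypothetical frequency counts for five feelings (illustrative values only).
counts = np.array([12, 0, 7, 3, 0])

# Loop version: scale each non-zero count by the largest count.
scaled_loop = []
for c in counts:
    if c > 0:
        scaled_loop.append(c / counts.max())

# Vectorised version: a boolean mask selects the non-zero counts,
# and the division is applied to the whole array at once.
nonzero = counts[counts > 0]
scaled_vec = nonzero / counts.max()
```

Both versions produce the same values; the vectorised one avoids the explicit Python loop.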

Aim: 4. Use proper coding style

Code clarity is an important part of your submission. Thus you should choose meaningful variable names and adopt the use of comments - you don't need to comment every single line, as this will affect readability - however you should aim to comment at least each section of code.

Aim: 5. Have the code run successfully when I try to run it

If you have special files not pre-supplied with the assignment, you should provide these as a final part of your submission (ask how if you're unsure). Also, **do not hardcode** your computer path directory into your program - **I should be able to open your .ipynb file and run the code successfully without editing your code.**
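One way to avoid a hardcoded directory is to build every file path relative to the notebook's working directory. The sketch below assumes the **countries** folder sits next to the notebook and that each file is named after its country with a `.txt` extension; the function name `build_country_path` and that naming scheme are assumptions for illustration.

```python
from pathlib import Path

# Assumption: the data folder "countries" sits next to the notebook,
# and each country's file is named "<Country>.txt" (hypothetical naming).
DATA_DIR = Path("countries")

def build_country_path(country, data_dir=DATA_DIR):
    """Return a relative path to the data file for one country."""
    return data_dir / f"{country}.txt"

path = build_country_path("United States")
print(path.as_posix())  # countries/United States.txt
```

Because the path is relative, the same code should run on another machine as long as the folder layout is preserved.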

Aim: 6. Documentation of any code limitations including, but not limited to, the requested functionalities

For this assignment you are asked to write a program that will analyse and visualise mined feelings from the We Feel Fine data sets based on a default search and then user-driven searches. Note: You do NOT have to search for the phrases "I feel" and "I am feeling" as We Feel Fine have already done this work for you. We are going to analyse what they found.

There are five components in this job: user prompt, data loading, data analysis, plotting, report.

1. Prompt the user for the country which will be mined. If the user chooses not to provide this information, then assume a default search of the United States. Try to make your communication with the user as friendly as possible, that is, the least restrictive in how the user should enter countries. E.g. make no distinction between lower and upper case, and accept some common abbreviations, like US or USA for United States, or UK for United Kingdom.

If an illegal value is entered (e.g. 'new transavia' for country), you can ask again or try to fix it - search for the Levenshtein distance. Then ask the user to confirm your fix or change it to the right one. If your program fails to fix the illegal value for the country name, then do not include it in the data loading routine.

You may wish to use a [text list of all countries in the world](world_countries.txt) to help define valid countries. Note that the We Feel Fine data set does not necessarily cover all of the countries in this list.

Please don't be overwhelmed by the complexity of this part: start with a basic prompt and then gradually add functionality. The suggested features are desirable but not compulsory.

2. Allow the user a maximum of 5 countries to be successfully mined, although they are also allowed to enter fewer than 5 countries. Load the corresponding data files from the folder **countries**. Successful mining occurs when the feelings for each country have been recorded and returned to your program.

3. For each feeling in the [full list of over 5000 feelings and their frequencies], determine the number of times each feeling appears in the mined text, for each country. For any counts that are larger than 0, you will need to retain the third column of information, which is the hexadecimal equivalent of the colour of the prescribed feeling.

4. For each country, produce a plot of ellipses where each ellipse represents a feeling, has a size proportional to the frequency of its occurrence, and is coloured based on the full list of feelings referenced above. Ellipse position can be random. The code for this component is provided and explained below, however you will need to make a number of adjustments to it.

5. Run the base query of data file **World.txt** to determine the first 1500 feelings mined by We Feel Fine from anywhere in the world. We will compare these mined feelings with the chosen countries. There is a substantial hint below explaining how you need to do this.

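The user-prompt component (case-insensitive input, common abbreviations, a default country, and a Levenshtein-based fix for misspellings) could be sketched as follows. This is only one possible approach: the short country list, the abbreviation table, the `max_distance` threshold and the function names are all assumptions for illustration, and a real program would read the valid names from world_countries.txt and ask the user to confirm any suggested fix.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

# Hypothetical data: a real program would read world_countries.txt.
VALID_COUNTRIES = ["United States", "United Kingdom", "Australia", "France"]
ABBREVIATIONS = {"us": "United States", "usa": "United States",
                 "uk": "United Kingdom"}
DEFAULT_COUNTRY = "United States"

def resolve_country(raw, max_distance=2):
    """Return (country, needs_confirmation); country is None if unresolved."""
    # Normalise case and collapse repeated whitespace.
    cleaned = " ".join(raw.lower().split())
    if not cleaned:
        return DEFAULT_COUNTRY, False
    if cleaned in ABBREVIATIONS:
        return ABBREVIATIONS[cleaned], False
    for country in VALID_COUNTRIES:
        if country.lower() == cleaned:
            return country, False
    # No exact match: suggest the closest valid name if it is close enough.
    best = min(VALID_COUNTRIES, key=lambda c: levenshtein(cleaned, c.lower()))
    if levenshtein(cleaned, best.lower()) <= max_distance:
        return best, True
    return None, False
```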
When your code is ready you have to choose any five countries and provide the following output:

1. The constructed path to load a data file for the country selected by the user or yourself.

2. The most popular feeling across the 5 countries you have chosen to explore plus the base query. If there are no feelings mined, then report this fact.

3. A plot for each country of the ellipses generated by each country's feelings, as well as a plot of the results of the base query from Step 5.

4. Assuming darker colours and blues correspond to negative feelings and lighter, happier colours correspond to positive feelings, write a short description summarising the nature of each country as being generally optimistic or pessimistic. This description is to be written by yourself (not your program, unless you want to be REALLY fancy!) and at most two paragraphs will be sufficient. For the purpose of this assignment, one paragraph is 6-8 lines long.

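Counting feelings per country and reporting the most popular one might look like the sketch below. The feelings table, the colour codes and the sample text are made up, and the sketch assumes the mined text can be treated as whitespace-separated words; the actual We Feel Fine file formats may differ.

```python
from collections import Counter

# Hypothetical stand-ins for the feelings list (feeling -> hex colour)
# and for the text of one country's data file; real formats may differ.
FEELING_COLOURS = {"happy": "FFD700", "sad": "1E3A8A", "tired": "6B7280"}
mined_text = "happy sad happy calm happy sad"

def count_feelings(text, feeling_colours):
    """Map each feeling that occurs at least once to (count, hex colour)."""
    word_counts = Counter(text.lower().split())
    return {feeling: (word_counts[feeling], colour)
            for feeling, colour in feeling_colours.items()
            if word_counts[feeling] > 0}

def most_popular(counts):
    """Return the feeling with the highest count, or None if nothing was mined."""
    if not counts:
        return None
    return max(counts, key=lambda feeling: counts[feeling][0])

counts = count_feelings(mined_text, FEELING_COLOURS)
```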
You have to carry out an analysis of the data and provide answers to the following research questions, which are of great interest to many people in the retail industry:

1. Study the distribution of basket sizes measured by the number of items in a basket. What basket size is the most popular?

2. Study the relation between the number of items in the basket and the dollar value of the basket. Considering the different "popularity" of different basket sizes from question 1, how much money does the store get from each basket size? What kind of customers are more important - light (small baskets) or heavy (large baskets)?

3. What day of the week is the busiest for the supermarket in terms of the number of shopping trips (one basket = one shopping trip)? What day is the most profitable? For the last question please consider total revenue or total sales as a proxy for the profit.

For each question you have to provide an appropriate graph (or multiple graphs) and a brief discussion to present your findings, answer the research question and explain your graph.

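The three research questions reduce to grouping and counting. A minimal sketch with the standard library is shown below; the basket records are invented, and the assumed layout (day of week, number of items, dollar value) is hypothetical because the supermarket data format is not described here. The graphs themselves are left out.

```python
from collections import Counter, defaultdict

# Hypothetical baskets: (day of week, number of items, dollar value).
baskets = [
    ("Sat", 3, 12.50), ("Sat", 10, 48.00), ("Sun", 3, 9.75),
    ("Mon", 1, 2.20), ("Sat", 3, 14.10),
]

# Question 1: distribution of basket sizes.
size_counts = Counter(items for _, items, _ in baskets)
most_popular_size = size_counts.most_common(1)[0][0]

# Question 2: total revenue contributed by each basket size.
revenue_by_size = defaultdict(float)
for _, items, value in baskets:
    revenue_by_size[items] += value

# Question 3: shopping trips and revenue per day of the week.
trips_by_day = Counter(day for day, _, _ in baskets)
revenue_by_day = defaultdict(float)
for day, _, value in baskets:
    revenue_by_day[day] += value

busiest_day = trips_by_day.most_common(1)[0][0]
most_profitable_day = max(revenue_by_day, key=revenue_by_day.get)
```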
Attachment:- Assignment 2020.rar

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd