COMP20008 Elements of Data Assignment

Reference no: EM132494303

COMP20008 Elements of Data Assignment: Processing Project - University of Melbourne

Learning outcome 1: To gain practical experience in written communication skills for data science projects.
Learning outcome 2: To practice a selection of processing and exploratory analysis techniques through visualisation discussed in lectures and workshops.
Learning outcome 3: To practice crawling and scraping data from the Internet.
Learning outcome 4: To practice using a widely used Python library for data processing and gain experience using library functions which may be unfamiliar and which require consultation of additional documentation from resources on the Web.

Your Parts

You are to perform a small data science project including some data processing and analysis using Python. Your responses to Parts 1-5 must be contained in a single .py file. Specifically, you have the following Parts:

Part 1

Produce a csv file containing the URL and headline of each of the articles your crawler has found. The csv file should have two column headings, url and headline, and be called Part1.csv.

Note: You might want to test your crawling implementation on a smaller website before running it on this site.
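One possible shape for the crawler is sketched below using only the Python standard library. Several things here are assumptions rather than part of the task description: the seed URL is not given in this text, so it is passed in as a parameter; the headline is assumed to be the text of the first h1 element on a page; the seed page is treated as an index rather than an article; and the names PageParser, crawl, fetch_with_urllib and write_part1 are hypothetical.

```python
import csv
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects every <a href> and the text of the first <h1> on a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.headline = None
        self._in_h1 = False
        self._h1_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "h1" and self.headline is None:
            self._in_h1 = True

    def handle_data(self, data):
        if self._in_h1:
            self._h1_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            self._in_h1 = False
            # Collapse runs of whitespace inside the headline.
            self.headline = " ".join("".join(self._h1_parts).split())


def fetch_with_urllib(url):
    """Return the page text, or None if the request fails."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return None


def crawl(seed_url, fetch=fetch_with_urllib):
    """Breadth-first crawl of pages on the seed's host; returns (url, headline) pairs."""
    host = urlparse(seed_url).netloc
    seen = {seed_url}
    queue = deque([seed_url])
    articles = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        # Assumption: the seed page is an index, every other page with an <h1> is an article.
        if url != seed_url and parser.headline:
            articles.append((url, parser.headline))
        for href in parser.links:
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return articles


def write_part1(articles, path="Part1.csv"):
    """Write the (url, headline) pairs with the required column headings."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["url", "headline"])
        writer.writerows(articles)
```

Passing the fetch function as a parameter makes it easy to try the crawler on a handful of in-memory pages before pointing it at a real site.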

Part 2
For each article found in Part 1,

a) extract the name of the first player mentioned in the article. You can find a list of player names as part of the tennis.json file provided. We will assume the article is written about that player (and only that player).

b) extract the first complete match score identified in the article. You will need to use regular expressions to accomplish this. We will assume this score relates to the first named player in the article.

Produce a csv file containing the URL, headline, first player mentioned and first complete match score of each of the articles your crawler has found. The csv file should have four column headings, url, headline, player and score, and be called Part2.csv.

Note: Some articles may not contain a player name and/or a match score. These articles can be discarded.
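The two extraction steps could be sketched as follows. This is only one interpretation: a "complete" match score is assumed here to mean two to five sets written as games separated by a hyphen, with an optional tiebreak in parentheses, and the player names are taken as a plain list because the layout of tennis.json is not described in this text. The names SET_PATTERN, SCORE_PATTERN, first_player and first_score are hypothetical.

```python
import re

# One set such as "6-4", optionally followed by a tiebreak like " (7-3)" or "(3)".
SET_PATTERN = r"\d{1,2}-\d{1,2}(?:\s?\(\d{1,2}(?:-\d{1,2})?\))?"

# Assumption: a complete match score is two to five sets separated by whitespace.
SCORE_PATTERN = re.compile(
    r"(?<![\d-])" + SET_PATTERN + r"(?:\s" + SET_PATTERN + r"){1,4}(?![\d-])"
)


def first_player(text, player_names):
    """Return the listed name that occurs earliest in the text, or None."""
    lowered = text.lower()
    best_name, best_pos = None, len(lowered)
    for name in player_names:
        pos = lowered.find(name.lower())
        if pos != -1 and pos < best_pos:
            best_name, best_pos = name, pos
    return best_name


def first_score(text):
    """Return the first substring that looks like a complete match score, or None."""
    match = SCORE_PATTERN.search(text)
    return match.group(0) if match else None
```

An article for which either function returns None would be one of those discarded under the note above.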

Part 3

For each article used in Part 2, identify the game difference: the absolute value of the difference between the total games won by each side across all sets. E.g. a 6-2 6-2 score has a game difference of 8 (12 games to 4), while a 6-4 4-6 6-4 score has a game difference of 2 (16 games to 14).

Produce a csv file containing the player name and average game difference for each player that at least one article has been written about. The csv file should have two column headings player and avg game difference and be called Part3.csv.
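The arithmetic in the two examples can be checked with a short sketch like the one below. It assumes scores are strings in the form produced for Part 2, and that any tiebreak points in parentheses should be ignored when counting games; the names game_difference and average_game_difference are hypothetical.

```python
import re
from collections import defaultdict

TIEBREAK = re.compile(r"\([^)]*\)")   # e.g. "(7-3)", not counted as games
SET_GAMES = re.compile(r"(\d+)-(\d+)")


def game_difference(score):
    """Absolute difference between the total games won by each side."""
    without_tiebreaks = TIEBREAK.sub("", score)
    first = second = 0
    for a, b in SET_GAMES.findall(without_tiebreaks):
        first += int(a)
        second += int(b)
    return abs(first - second)


def average_game_difference(rows):
    """rows is an iterable of (player, score) pairs; returns {player: mean difference}."""
    differences = defaultdict(list)
    for player, score in rows:
        differences[player].append(game_difference(score))
    return {player: sum(values) / len(values) for player, values in differences.items()}
```

For the scores in the text, game_difference("6-2 6-2") gives 8 and game_difference("6-4 4-6 6-4") gives 2.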

Part 4

Generate a suitable plot showing the five players that articles are most frequently written about and the number of times an article is written about that player.
Save this plot as a png file called Part4.png
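A bar chart is one plot type that might suit this part. The sketch below assumes matplotlib is available and that the player column from Part 2 has already been read into a list; the function names top_players and plot_top_players are hypothetical.

```python
from collections import Counter


def top_players(players, n=5):
    """Return the n most frequent names as (name, article_count) pairs."""
    return Counter(players).most_common(n)


def plot_top_players(players, path="Part4.png"):
    # Imported here so the counting helper can be used without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no display needed
    import matplotlib.pyplot as plt

    top = top_players(players)
    names = [name for name, _ in top]
    counts = [count for _, count in top]

    fig, ax = plt.subplots()
    ax.bar(names, counts)
    ax.set_xlabel("Player")
    ax.set_ylabel("Number of articles")
    ax.set_title("Five most frequently written-about players")
    plt.setp(ax.get_xticklabels(), rotation=45, ha="right")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
```

Keeping the counting separate from the drawing makes it straightforward to quote the same numbers in the report.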

Part 5

Generate a suitable plot showing the average game difference for each player that at least one article has been written about and their win percentage. You can find a player's win percentage in the tennis.json file.
Save this plot as a png file called Part5.png
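A scatter plot with one point per player is one way to show the two quantities together. The sketch below assumes both values are already available as dictionaries keyed by player name, with the win percentage as a number; how tennis.json stores the win percentage is not described in this text, so any parsing of that file is left out. The names paired_points and plot_difference_vs_win are hypothetical.

```python
def paired_points(avg_difference, win_percentage):
    """Return (names, xs, ys) for players present in both mappings, sorted by name."""
    names = sorted(set(avg_difference) & set(win_percentage))
    xs = [avg_difference[name] for name in names]
    ys = [win_percentage[name] for name in names]
    return names, xs, ys


def plot_difference_vs_win(avg_difference, win_percentage, path="Part5.png"):
    # Imported here so the pairing helper can be used without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no display needed
    import matplotlib.pyplot as plt

    names, xs, ys = paired_points(avg_difference, win_percentage)

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    for name, x, y in zip(names, xs, ys):
        ax.annotate(name, (x, y), fontsize=7)  # label each point with the player
    ax.set_xlabel("Average game difference")
    ax.set_ylabel("Win percentage")
    ax.set_title("Average game difference vs win percentage")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
```

Players missing from either mapping are simply left out of the plot, which may be worth mentioning when discussing limitations.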

Part 6

Write a 3-4 page report to communicate the process and activities undertaken in the project, the analysis, and some limitations. Specifically, the report should contain the following information:
• A description of the crawling method and a brief summary of the output for Part 1.
• A description of how you scraped data from each page, including any regular expressions used for Part 2 and a brief summary of the output.
• An analysis of the information shown in the two plots produced for Parts 4 & 5, including a brief summary of the data used. The plots are to be shown (included) along with your analysis.
• A discussion of the appropriateness of associating the first named player in the article with the first match score.
• At least one suggested method for how you could figure out from the contents of the article whether the first named player won or lost the match being reported on.
• A discussion of what other information could be extracted from the articles to better understand player performance and a brief suggestion for how this could be done.

