Draw a bar chart to compare the artists of the songs

Reference no: EM133144584

BUS5DWR Data Wrangling and R - La Trobe University

Assignment Requirements

Part 1
The given data files Movie.csv, Rating.csv and Continent.csv contain information about IMDb movie ratings.

Write R code in an Rmd file to answer the following questions. Each question should be presented in one code chunk:

Load the datasets from the given files into three data frames called Movie, Rating, and Continent. Rename the columns to remove any spaces they contain. (Hint: use str_replace_all to do this automatically for all columns.) Remove the column Writer from the Movie dataframe. Display the summary of each dataframe.
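A minimal sketch of the renaming step, using a toy data frame in place of Movie.csv. The column names other than Writer are assumptions made for illustration; with the real files, the data frame would come from a CSV reader such as read.csv or readr's read_csv.

```r
library(dplyr)
library(stringr)

# Toy stand-in for Movie.csv; "Title" and "Release Year" are assumed names.
Movie <- data.frame(
  "Title" = c("A", "B"),
  "Release Year" = c(1999, 2005),
  "Writer" = c("X", "Y"),
  check.names = FALSE  # keep the space in "Release Year"
)

# Remove every space from every column name in one step.
Movie <- Movie %>% rename_with(~ str_replace_all(.x, " ", ""))

# Drop the Writer column, then summarise what is left.
Movie <- Movie %>% select(-Writer)
summary(Movie)
```

The same rename_with call can be applied unchanged to the Rating and Continent data frames.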

How many movies produced by 'Universal Pictures' have the actor 'Arnold Schwarzenegger'?

Display the five most-reviewed movies that belong to both the Action and Drama genres. Display only the Title and the number of reviews.
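One possible approach, sketched on toy data. It assumes the genres are stored as a single text column (here called Genre) and the review count in a column called Reviews; both names and the storage format are assumptions and may differ in the real file.

```r
library(dplyr)
library(stringr)

# Toy data; Genre and Reviews are assumed column names.
Movie <- data.frame(
  Title = c("A", "B", "C"),
  Genre = c("Action, Drama", "Drama", "Action, Drama, Crime"),
  Reviews = c(120, 300, 450)
)

# Keep movies whose genre text mentions both genres,
# then take the five with the most reviews.
top_reviewed <- Movie %>%
  filter(str_detect(Genre, "Action"), str_detect(Genre, "Drama")) %>%
  slice_max(Reviews, n = 5) %>%
  select(Title, Reviews)

top_reviewed
```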

Display movie rating information including Title, average rating and two new columns: (1) 'TotalVote', showing the total votes from both males and females, and (2) 'Popular', showing 'Male' for movies whose MalesTotalVotes is greater than FemalesTotalVotes and 'Female' otherwise. (Hint: see the Workshop 9 exercise.) Show only the ten movies with the highest average rating.
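The two derived columns can be sketched with mutate and if_else. The data below is invented, and AverageRating is an assumed column name; MalesTotalVotes and FemalesTotalVotes follow the wording of the question.

```r
library(dplyr)

# Toy stand-in for the Rating data; AverageRating is an assumed column name.
Rating <- data.frame(
  Title = c("A", "B", "C"),
  AverageRating = c(8.1, 7.4, 9.0),
  MalesTotalVotes = c(100, 40, 70),
  FemalesTotalVotes = c(60, 90, 70)
)

top_rated <- Rating %>%
  mutate(
    TotalVote = MalesTotalVotes + FemalesTotalVotes,
    # Ties fall into "Female" because the condition is strictly "greater than".
    Popular = if_else(MalesTotalVotes > FemalesTotalVotes, "Male", "Female")
  ) %>%
  select(Title, AverageRating, TotalVote, Popular) %>%
  slice_max(AverageRating, n = 10)

top_rated
```

Note that slice_max keeps ties by default, so it can return more than ten rows; passing with_ties = FALSE caps the result at exactly ten.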

Display the number of Comedy movies and their average rating from each continent.

Analyse the distribution of the average rating of all the movies released after the year 2000. (Hint: draw a boxplot and a histogram, then write a short paragraph of fewer than 100 words describing your insight.)
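A small base-R sketch of the two plots on invented data. The column names Year and AverageRating are assumptions; in the real data the year and the rating may sit in different data frames and need joining first.

```r
# Toy data; Year and AverageRating are assumed column names.
Rating <- data.frame(
  Year = c(1995, 2003, 2010, 2015, 2018),
  AverageRating = c(7.9, 6.5, 8.2, 7.1, 5.8)
)

# Keep only movies released after 2000.
recent <- subset(Rating, Year > 2000)

# The boxplot shows median, spread and outliers; the histogram shows shape.
boxplot(recent$AverageRating, horizontal = TRUE,
        main = "Average rating, movies after 2000")
hist(recent$AverageRating,
     main = "Average rating, movies after 2000", xlab = "Average rating")

# Numeric summary to support the written paragraph.
summary(recent$AverageRating)
```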

Part 2
The given Spotify.xlsx file records the summary of Australia's top 200 daily-streamed songs (or tracks) in the first three months of 2017 and 2018. The Data worksheet records the total streams and the highest position of each song in each month. You will see that the data is far from being ready for analysis and needs to be 'wrangled'. The given Artist.csv file records the artists who perform the songs. You are required to write R code to perform the following steps.

Load the data from the Spotify worksheet into a dataframe named Spotify. Replace the spaces in the column names with underscores ("_"). Show the structure of Spotify.

You can see that most column names contain the month information, which should instead be stored as row values. To fix this:

• Use pivot_longer to transform the dataframe into four columns, namely Artist_ID, Track_Name, Month, and Value.
• Drop all rows having NA in Value.
• Split the Month column into Month and Year.
• Display the number of columns and rows.
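The reshaping steps above can be sketched on a toy data frame. The month column names (Jan_2017, Feb_2017), the "_" separator and the format of the cell values are all assumptions; the real worksheet may encode them differently.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the Spotify worksheet; month columns and cell format are assumed.
Spotify <- data.frame(
  Artist_ID = c(1, 2),
  Track_Name = c("Song A", "Song B"),
  Jan_2017 = c("150000/3", NA),
  Feb_2017 = c("120000/5", "90000/40")
)

long <- Spotify %>%
  # Every column except the two identifiers becomes a (Month, Value) pair.
  pivot_longer(
    cols = -c(Artist_ID, Track_Name),
    names_to = "Month",
    values_to = "Value"
  ) %>%
  # Drop rows where the song did not chart in that month.
  drop_na(Value) %>%
  # Split "Jan_2017" into Month = "Jan" and Year = "2017".
  separate(Month, into = c("Month", "Year"), sep = "_")

dim(long)  # rows, then columns
```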

You can see that the Value column contains both the total streams and the highest position of the song in the corresponding month. Note that a smaller position value means a higher rank.

• Split the Value column into two columns with appropriate names.
• For each month-year, show the total streams and the number of songs appearing in the daily top 200.
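A sketch of the split and the monthly summary, assuming for illustration that Value looks like "streams/position". The separator and the new column names Total_Streams and Highest_Position are hypothetical choices, not taken from the data.

```r
library(dplyr)
library(tidyr)

# Toy long-format data; the "streams/position" format is an assumption.
long <- data.frame(
  Track_Name = c("Song A", "Song A", "Song B"),
  Month = c("Jan", "Feb", "Feb"),
  Year = c("2017", "2017", "2017"),
  Value = c("150000/3", "120000/5", "90000/40")
)

# Split Value into two numeric columns (convert = TRUE parses the numbers).
tracks <- long %>%
  separate(Value, into = c("Total_Streams", "Highest_Position"),
           sep = "/", convert = TRUE)

# Total streams and number of distinct songs per month-year.
monthly <- tracks %>%
  group_by(Year, Month) %>%
  summarise(
    Streams = sum(Total_Streams),
    Songs = n_distinct(Track_Name),
    .groups = "drop"
  )

monthly
```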

Find all tracks that appeared in all six months with more than 100,000 streams in each month. Display their name, total streams and highest position. Export the result to a CSV file.
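One way to express "all six months, each above 100,000" is a grouped filter, sketched here on invented data. The column names and the output file name are assumptions; with real data it may be safer to group by Artist_ID as well, since two artists can share a track name.

```r
library(dplyr)

# Toy data: Song A qualifies, Song B dips below 100,000 in one month.
tracks <- data.frame(
  Track_Name = rep(c("Song A", "Song B"), each = 6),
  Month = rep(c("Jan", "Feb", "Mar"), times = 4),
  Year = rep(rep(c(2017, 2018), each = 3), times = 2),
  Total_Streams = c(rep(150000, 6),
                    150000, 150000, 90000, 150000, 150000, 150000),
  Highest_Position = c(3, 2, 4, 1, 5, 6, 10, 12, 40, 9, 8, 7)
)

consistent <- tracks %>%
  group_by(Track_Name) %>%
  # Keep a track only if it has six distinct month-year pairs
  # and every one of them exceeds 100,000 streams.
  filter(n_distinct(Month, Year) == 6, all(Total_Streams > 100000)) %>%
  summarise(
    Total_Stream = sum(Total_Streams),
    Highest_Position = min(Highest_Position),  # smaller number = higher rank
    .groups = "drop"
  )

# Written to a temporary folder here; the file name is a placeholder.
write.csv(consistent, file.path(tempdir(), "consistent_tracks.csv"),
          row.names = FALSE)
```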

Load the data from the Artist.csv file into a new dataframe. Rename the columns to remove spaces. How many artists do not have songs listed in the Spotify dataframe?
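Counting artists with no matching songs is a natural fit for an anti-join. The sketch below uses toy data and assumes both data frames share a key column named Artist_ID; Artist_Name is a hypothetical column.

```r
library(dplyr)

# Toy data; Artist_Name is an assumed column name.
Artist <- data.frame(Artist_ID = c(1, 2, 3), Artist_Name = c("X", "Y", "Z"))
Spotify <- data.frame(Artist_ID = c(1, 1, 3), Track_Name = c("a", "b", "c"))

# anti_join keeps the rows of Artist that have no match in Spotify.
no_songs <- anti_join(Artist, Spotify, by = "Artist_ID")
nrow(no_songs)
```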

Draw a bar chart to compare the artists of the songs/tracks returned in Q2.4 based on their total streams. Order the bars from the highest to the lowest total stream. Write a short paragraph describing the insight you gained from this chart.
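A minimal ggplot2 sketch of an ordered bar chart, using invented totals. The data frame artist_totals and its column names are hypothetical; in practice they would come from joining the filtered tracks to the Artist data.

```r
library(ggplot2)

# Toy per-artist totals; names and values are invented.
artist_totals <- data.frame(
  Artist_Name = c("X", "Y", "Z"),
  Total_Stream = c(500000, 900000, 700000)
)

# reorder() with a negated value sorts the bars from highest to lowest.
p <- ggplot(artist_totals,
            aes(x = reorder(Artist_Name, -Total_Stream), y = Total_Stream)) +
  geom_col() +
  labs(x = "Artist", y = "Total streams")

# Call print(p) to render the chart.
```

With many artists, adding coord_flip() usually makes the labels easier to read.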
