Programming for Data Science
1 Project Description
In this assignment there are 4 parts. For each part you should:
• Write the appropriate R code.
• Include comments within the code to explain the algorithm.
• Test the code to ensure its correctness.
• Format and structure the code to maximise its readability.
A report must be submitted containing a cover page and the solutions to each of the four parts.
Project Tasks
"Shmelchemist's Alchemist's Supplies" manufactures and sells components that are used by alchemists in their quest to transmute lead into gold. Shmelchemist's have been operating for the past 47 years and have had many satisfied customers. Recently, they have had a higher than usual number of returned products from customers claiming that their parts are faulty. Shmelchemist's have hired you to investigate the problem.
Shmelchemist's Alchemist's Supplies have provided you with their sales data from 2021, found in the file sales.csv. Each row of the file records a single customer purchase, and the columns are the following variables:
• date: the date of the customer purchase.
• partId: the ID number of the part that was sold.
• unitsSold: the number of parts purchased by the customer.
• returned: the number of the purchased parts that were later returned due to being faulty.
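As a starting point, the sketch below shows one way the sales data could be loaded and checked. It reads a small inline stand-in rather than the real file, and it assumes the date column is written as YYYY-MM-DD; the actual format in sales.csv is not stated here and should be checked before relying on this.

```r
# Toy stand-in for sales.csv (illustrative rows only, not the real data).
# ASSUMPTION: dates are stored as YYYY-MM-DD.
csv_text <- "date,partId,unitsSold,returned
2021-01-05,1,10,2
2021-03-17,2,4,0
2021-07-22,1,6,1
2021-11-30,3,8,3"

# For the real file, use read.csv("sales.csv", stringsAsFactors = FALSE).
sales <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Convert the date column from character to Date so it can be compared.
sales$date <- as.Date(sales$date, format = "%Y-%m-%d")

# Sanity checks: every date parsed, and returns never exceed units sold.
stopifnot(!anyNA(sales$date))
stopifnot(all(sales$returned <= sales$unitsSold))

str(sales)
```

If any date fails to parse, as.Date returns NA and the first check stops the script, which is a cheap way to catch a wrong format string early.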
You have also been provided with the parts description file parts.json. For each part, this file contains:
• id: the part ID number.
• name: the name of the part.
• ingredients: the elements used to create the part.
They want you to complete the following tasks.
Poor Advice
Shmelchemist's have a suspicion that a previous employee, who worked from the start of January to the end of May, was providing poor advice to customers, leading them to purchase the wrong parts for their task. This can be identified by examining the distribution of parts sold with and without the employee. To accomplish this, they want you to provide a two-row table, where the first row shows the number of purchases of each part ID for the time the employee worked and the second row shows the same for the time they did not work.
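A minimal sketch of such a table is given below, using made-up purchases in place of sales.csv. It assumes that "number of purchases" means the number of rows (one row per purchase) and that the employee's period runs from 1 January to 31 May 2021 inclusive; both are interpretations of the brief rather than stated facts.

```r
# Illustrative purchases only; the real data come from sales.csv.
sales <- data.frame(
  date = as.Date(c("2021-01-05", "2021-03-17", "2021-05-31",
                   "2021-06-01", "2021-07-22", "2021-11-30")),
  partId = c(1, 2, 1, 2, 1, 3),
  unitsSold = c(10, 4, 2, 5, 6, 8),
  returned = c(2, 0, 1, 0, 1, 3)
)

# ASSUMPTION: the employee worked 1 January to 31 May 2021, inclusive.
employed <- sales$date >= as.Date("2021-01-01") &
  sales$date <= as.Date("2021-05-31")

# Fixing the factor levels puts the "present" row first in the table.
period <- factor(ifelse(employed, "employee present", "employee absent"),
                 levels = c("employee present", "employee absent"))

# Each row of sales is one purchase, so counting rows counts purchases.
purchase_table <- table(period, partId = sales$partId)
print(purchase_table)
```

If the intended quantity is units sold rather than purchase count, xtabs(unitsSold ~ period + partId, data = cbind(sales, period)) gives the corresponding two-row table.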
Top Ingredients
There is also the thought that the ingredients used this year are of poorer quality than in previous years, so Shmelchemist's want to know whether any ingredient is used more than the others. To observe the prevalence of each ingredient, Shmelchemist's want you to provide a bar chart showing each ingredient and the number of parts it is an ingredient of.
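The counting step behind this chart could look like the sketch below. It assumes parts.json has already been parsed into a list with one entry per part, each holding id, name and a character vector of ingredients; the exact layout of the file, and the part names and element symbols used here, are hypothetical.

```r
# Hypothetical parsed form of parts.json (names and ingredients are made up).
parts <- list(
  list(id = 1, name = "Hypothetical crucible", ingredients = c("Fe", "C")),
  list(id = 2, name = "Hypothetical retort",   ingredients = c("Cu", "Sn")),
  list(id = 3, name = "Hypothetical alembic",  ingredients = c("Cu", "Fe", "Pb"))
)

# Collect the ingredients of every part into one vector.
# unique() ensures an ingredient listed twice in a part is counted once.
all_ingredients <- unlist(lapply(parts, function(p) unique(p$ingredients)))

# Number of parts each ingredient appears in, most common first.
ingredient_counts <- sort(table(all_ingredients), decreasing = TRUE)

barplot(ingredient_counts,
        xlab = "Ingredient",
        ylab = "Number of parts",
        main = "Parts containing each ingredient",
        las = 2)  # rotate axis labels so they stay readable
```

Sorting the counts before plotting makes it easier to see whether one ingredient dominates.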
Change in Part Quality
Comments from customers suggested that the previous employee was tampering with the part ingredients. Shmelchemist's want you to provide two plots, one showing the set of elements and the proportion of failed parts that they were involved in for the period that the employee worked, and a second plot for the period that they did not work.
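One possible reading of "proportion of failed parts" is, for each element, the returned units divided by the units sold across all parts containing that element. The sketch below follows that reading on made-up data; the definition, the parsed layout of parts.json, and the helper name failure_by_element are all assumptions.

```r
# Illustrative data only.
sales <- data.frame(
  date = as.Date(c("2021-02-10", "2021-04-02", "2021-08-15", "2021-10-01")),
  partId = c(1, 2, 1, 2),
  unitsSold = c(10, 20, 10, 10),
  returned = c(5, 2, 1, 1)
)
parts <- list(
  list(id = 1, name = "Hypothetical crucible", ingredients = c("Fe", "C")),
  list(id = 2, name = "Hypothetical retort",   ingredients = c("Fe", "Cu"))
)

# Lookup table with one row per (part, element) pair.
part_elements <- do.call(rbind, lapply(parts, function(p) {
  data.frame(partId = p$id, element = p$ingredients, stringsAsFactors = FALSE)
}))

# Hypothetical helper: returned / sold per element for a subset of sales.
failure_by_element <- function(sales_subset) {
  joined <- merge(sales_subset, part_elements, by = "partId")
  totals <- aggregate(cbind(unitsSold, returned) ~ element,
                      data = joined, FUN = sum)
  setNames(totals$returned / totals$unitsSold, totals$element)
}

# ASSUMPTION: the employee worked 1 January to 31 May 2021, inclusive.
employed <- sales$date >= as.Date("2021-01-01") &
  sales$date <= as.Date("2021-05-31")
prop_present <- failure_by_element(sales[employed, ])
prop_absent  <- failure_by_element(sales[!employed, ])

# Two panels with a shared y-axis range so the periods are comparable.
op <- par(mfrow = c(1, 2))
barplot(prop_present, ylim = c(0, 1), main = "Employee present",
        xlab = "Element", ylab = "Proportion returned")
barplot(prop_absent, ylim = c(0, 1), main = "Employee absent",
        xlab = "Element", ylab = "Proportion returned")
par(op)
```

Using the same ylim in both panels avoids the visual distortion that comes from each plot choosing its own scale.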
Effect on Atomic Mass
An investigation into the previous employee's tampering indicated that the tampering process only affected ingredients with a low atomic number. Shmelchemist's want you to provide two plots, each showing the atomic number of the element against the proportion of failed parts containing the element: the first for the time the previous employee worked and the second for the time they did not work.
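The plotting step could be sketched as below, assuming per-element proportions are already available as named vectors and that ingredients are identified by chemical symbol. The proportions shown are placeholders for illustration, not results, and plot_by_atomic_number is a hypothetical helper name; the atomic numbers are the standard periodic table values for the three symbols used.

```r
# Atomic numbers for the elements in this toy example.
# A full analysis would need a lookup covering every ingredient in parts.json.
atomic_number <- c(C = 6, Fe = 26, Cu = 29)

# Placeholder proportions for illustration only (not results from sales.csv).
prop_present <- c(C = 0.50, Cu = 0.10, Fe = 0.23)
prop_absent  <- c(C = 0.10, Cu = 0.10, Fe = 0.10)

# Hypothetical helper: scatter plot of proportion returned vs atomic number.
plot_by_atomic_number <- function(props, title) {
  z <- atomic_number[names(props)]
  stopifnot(!anyNA(z))  # every element must have a known atomic number
  plot(z, props, ylim = c(0, 1), pch = 19,
       xlab = "Atomic number",
       ylab = "Proportion of parts returned",
       main = title)
  text(z, props, labels = names(props), pos = 3)  # label each point
  invisible(data.frame(element = names(props),
                       atomicNumber = unname(z),
                       proportion = unname(props)))
}

op <- par(mfrow = c(1, 2))
res_present <- plot_by_atomic_number(prop_present, "Employee present")
res_absent  <- plot_by_atomic_number(prop_absent, "Employee absent")
par(op)
```

Looking the atomic numbers up by element name, rather than relying on vector order, keeps each point paired with the right element even if the two inputs are sorted differently.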
Write a PDF report containing your code and all required analysis and results. The report is being marked using the marking criteria, so make sure that each piece of analysis covers all of the criteria.