Reference no: EM132512493
MATH2349 Data Wrangling - RMIT University
Purpose
The purpose of this final assignment is to put to work the tools and knowledge that you have gained throughout this course. This gives you several benefits:
• It will provide you with more experience using data preprocessing tools on real life data sets.
• It helps you to self-direct your learning and interests to find unique and creative ways to wrangle your data.
• It starts to build your data analytics portfolio. Portfolios (or e-portfolios) are a great way to show potential employers what you are capable of.
Overview
This assignment requires you to find some open data and use the knowledge and skills gained during the course to preprocess it. You will create a report in R Markdown explaining the steps you took to perform the data preprocessing tasks. You will also publish this report online (on RPubs), which gives you the opportunity to build your data analytics portfolio and show potential employers what you are capable of. The more clearly you demonstrate your skills, the more marks you will be awarded.
Learning outcome 1. Accurately, logically and ethically combine data from multiple sources to make it suitable for statistical analysis and draw valid interpretations.
Learning outcome 2. Articulate how data meets the best practice standards (e.g. tidy data principles).
Learning outcome 3. Select, perform and justify data validation processes for raw datasets.
Learning outcome 4. Use leading open source software (e.g. R) for reproducible, automated data processing.
1. Create the report using R Markdown
The Assignment 2 report must be completed using the R Markdown template provided here:
R Markdown Template - Assignment 2
In the report, all R chunks and outputs need to be visible. Failure to do so will result in a loss of marks.
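One way to keep every chunk and its output visible is to set knitr's chunk options once near the top of the report. The following is a minimal sketch, assuming the provided template does not already set these options. `echo`, `include` and `results` are standard knitr chunk options, and the values shown are knitr's defaults, stated explicitly so that a later edit does not hide code by accident.

```r
# Place this inside the first (setup) chunk of the R Markdown report.
# Assumption: the template has such a chunk; if not, add one at the top.
#
# echo = TRUE         show the R source code of each chunk
# include = TRUE      keep each chunk and its output in the knitted report
# results = "markup"  print text output below the code that produced it
knitr::opts_chunk$set(echo = TRUE, include = TRUE, results = "markup")
```

Options written in an individual chunk header, such as `echo = FALSE` or `include = FALSE`, override this global setting, so check each chunk header before publishing.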
2. Publish your Report to RPubs
Publish your report to RPubs (see here) and submit your report's RPubs link to the Google Form given below.
This online version of the report will be used for marking. Failure to submit your link will delay your feedback and risk late penalties.
Attachment: Data Wrangling.rar