Reference no: EM132011876
Introduction to Data Science Assessment Task
Assignment Task - You are a member of a team and need to perform a data analysis of countries in the East Asia & Pacific region.
The team has not set a specific goal for the analysis, so you are free to explore the data and dig out anything you find interesting or significant.
You have been asked to prepare a data analysis report that describes your work and explains your findings. The potential audience includes other researchers, business representatives, and government agencies, who may have limited ICT or mathematical knowledge.
To prepare the report, please use the following outline:
1. Introduction
Provide an introduction to the problem. Include background material as appropriate: who cares about this problem, what impact it has, and where the data comes from.
2. Data Setup
Describe how to load the data and which libraries are needed. Provide an overview of the data, including its dimensions and structure.
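As a minimal sketch of this step, assuming the data arrives as a CSV file: the column names and values below are invented placeholders, not the actual assignment data, and only the Python standard library is used so the example has no dependencies. It loads the rows and reports the dimensions, the column names, and the number of missing values per column.

```python
import csv
import io

# Invented stand-in for the real file. The actual columns in the
# assignment data are not known here, so these names are hypothetical.
SAMPLE_CSV = """country,year,indicator_a,indicator_b
Country A,2000,1.0,10.0
Country A,2001,1.5,11.0
Country B,2000,2.0,
Country B,2001,2.5,14.0
"""


def load_rows(text_stream):
    """Read CSV rows into a list of dicts keyed by column name."""
    return list(csv.DictReader(text_stream))


def overview(rows):
    """Return (row count, column names, missing-value count per column)."""
    columns = list(rows[0].keys()) if rows else []
    missing = {c: sum(1 for r in rows if r[c] == "") for c in columns}
    return len(rows), columns, missing


# With a real file, pass open("your_file.csv", newline="") instead.
rows = load_rows(io.StringIO(SAMPLE_CSV))
n_rows, columns, missing = overview(rows)
print(f"{n_rows} rows x {len(columns)} columns")
print("columns:", columns)
print("missing values per column:", missing)
```

Reporting missing values alongside the dimensions is a design choice: gaps in the data affect every later analysis, so it helps to surface them in the setup section.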
3. Exploratory Data Analysis
Perform 3 one-variable analyses. Plot at least one graph for each variable. Explain why the selected graph is appropriate.
Perform 2 two-variable analyses. Plot at least one graph for each pair of variables. Explain why the selected graph is appropriate.
The analysis can be performed on all years and all countries, or on a subset of your interest.
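Before choosing a graph, it can help to look at the numbers the graph would display. The sketch below is one possible starting point, not a required method: it computes the five-number summary that a box plot draws for a single variable, and the Pearson correlation that summarises the linear trend visible in a scatter plot of two variables. The variable names and values are invented placeholders.

```python
import math
import statistics


def five_number_summary(values):
    """Minimum, quartiles and maximum: the values a box plot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def pearson_r(xs, ys):
    """Pearson correlation: strength of the linear trend between xs and ys."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented placeholder values, not taken from the assignment data.
indicator_a = [1.0, 2.0, 3.0, 4.0, 5.0]
indicator_b = [2.0, 4.0, 6.0, 8.0, 10.0]

summary = five_number_summary(indicator_a)
r = pearson_r(indicator_a, indicator_b)
print("min, Q1, median, Q3, max:", summary)
print("Pearson r:", round(r, 3))
```

A strongly skewed summary suggests a histogram or box plot over a bar of averages, and a correlation near zero is a hint that a scatter plot may show no clear linear pattern.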
4. Advanced Analysis
4.1 Clustering
Briefly explain the concept of clustering and k-means.
Try a clustering analysis that groups countries according to selected attributes.
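To illustrate the concept, here is a small sketch of k-means written with the standard library only. It alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points, until the centroids stop changing. Two things are assumptions made for this example: the points are invented placeholders (each tuple standing for one country's two attributes), and the starting centroids are chosen with a deterministic farthest-first rule instead of the usual random initialisation, so that the output is reproducible.

```python
import math


def farthest_first(points, k):
    """Deterministic seeding (a simplification): start at points[0], then
    repeatedly add the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Return (centroids, labels) after alternating assign/update steps."""
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:  # converged: nothing moved
            break
        centroids = updated
    return centroids, labels


# Invented placeholder points: (attribute_1, attribute_2) per country.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = kmeans(points, k=2)
print("labels:", labels)
print("centroids:", centroids)
```

Because k-means relies on distances, attributes measured on very different scales are usually standardised first; otherwise the attribute with the largest numeric range dominates the grouping.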
4.2 Linear Regression
Briefly explain the concept of linear regression.
Try 2 linear regression analyses. Plot the learned models.
The analysis can be performed on all years and all countries, or on a subset of your interest.
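As a sketch of what a simple linear regression computes, the example below fits a line y = intercept + slope * x by ordinary least squares and reports R-squared as a measure of fit. It uses only the standard library, and the x and y values are invented placeholders, not assignment data. The fitted slope and intercept are what a plot of the learned model would draw as a line over the scatter of points.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Invented placeholder values, not taken from the assignment data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(f"y = {intercept:.2f} + {slope:.2f} * x, R^2 = {r2:.3f}")
```

For a non-technical audience, the slope is usually the number worth explaining: it is the expected change in y for a one-unit increase in x.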
5. Conclusion
6. Reflections
In this part, discuss any difficulties you had performing the analysis and how you solved those difficulties. Reflect on how the analysis process went for you, what you learnt, and what you might do differently next time.
Report Format - Your report should be at least 1,200 words and ideally no more than 2,000 words. All comments and graph titles count towards the word total.
Assignment Guidelines - This assignment will take several weeks to complete and requires a good understanding of data science and management. It is imperative that students take heed of the following points:
1. Ensure that you clearly understand the requirements for the assignment: what has to be done and what the deliverables are.
2. If you do not understand any of the assignment requirements, please ASK the course coordinator or your tutor.
3. Each time you work on any aspect of the assignment, reread the requirements to ensure that you clearly understand what is required.
Attachment:- Assignment Files.rar