SIT742 Modern Data Science Assignment

Reference no: EM132487084

SIT742 Modern Data Science - Deakin University

ASSESSMENT TASK ONE

DATA EXPLORATION: DATA SCIENTISTS SURVEY

Background
In 2017, Kaggle (a data science community and competition platform) conducted a survey of a large number of users registered as data scientists on its platform. The survey data broadly cover the skill sets and demographics of the data scientists, their feedback on the platform, and much other information.

Task Description
We provide one Jupyter notebook 2020SIT742Task1.ipynb at GitHub-SIT742, together with three data files at the data subfolder:
MCQResponses.csv: The CSV file contains participants' answers to multiple choice questions. Each column contains the answers of one respondent to a specific question.
ConversionRates.csv: Currency conversion rates to USD.
JobPostings.csv: Data scientist job advertisements in the US with job descriptions, from JobPikr.

You are required to develop a data exploration report by completing the provided Jupyter notebook, applying exploratory data analysis and visualization skills to finish the required analysis. Detailed requirements can be found in the provided notebook; you need to follow them to complete the coding and include the results in the report SIT742T1Report.pdf.

Data Exploration
For a data scientist, the first and most crucial task after obtaining a dataset is to gain a good understanding of the data he or she is dealing with. This includes examining the data attributes (or equivalently, data fields), seeing what they look like and what the data type of each field is, and from this information determining suitable numerical and visual descriptions.
In this part of the assessment task, you need to complete the coding parts of the provided notebook and finish the required analysis of attributes such as 'education', 'salary' and related demographic information (70%).
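The attribute inspection described above can be sketched in a few lines. This is only an illustration, not the notebook's required solution: the inline sample stands in for MCQResponses.csv, and the column names (Age, FormalEducation, CompensationAmount) and the helper `describe` are assumptions made for the example. It uses only the standard library; the same idea carries over to a dataframe library.

```python
import csv
import io
from collections import Counter
from statistics import mean, median

# Hypothetical miniature stand-in for MCQResponses.csv; the real column
# names and values may differ.
SAMPLE = """Age,FormalEducation,CompensationAmount
29,Master's degree,85000
35,Bachelor's degree,
41,Doctoral degree,120000
23,Bachelor's degree,40000
"""


def is_number(text):
    """Return True if the text can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def describe(rows, column):
    """Infer whether a column is numeric or categorical and summarize it."""
    values = [row[column] for row in rows if row[column] != ""]
    missing = len(rows) - len(values)
    if values and all(is_number(v) for v in values):
        nums = [float(v) for v in values]
        return {
            "type": "numeric",
            "missing": missing,
            "min": min(nums),
            "median": median(nums),
            "mean": mean(nums),
            "max": max(nums),
        }
    # Non-numeric fields get a frequency table instead of moments.
    return {
        "type": "categorical",
        "missing": missing,
        "top": Counter(values).most_common(3),
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
summary = {column: describe(rows, column) for column in rows[0]}
for column, info in summary.items():
    print(column, info)
```

Numeric fields such as salary suit a histogram or box plot, while categorical fields such as education suit a bar chart of the frequency table.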

Text analysis
For the job advertisement data JobPostings.csv, you are required to write Python code to remove the stop-words and to extract the high-frequency words used in job advertisements. After that, you can do one self-defined text analysis task to gain insight into the advertisement information (30%).
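A minimal sketch of stop-word removal and word-frequency extraction follows. The stop-word list, the sample advertisements, and the helper names `tokenize` and `top_words` are all assumptions for illustration; in practice the texts would come from the job description column of JobPostings.csv, and a fuller stop-word list would be used.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption); a fuller published
# English stop-word list could be swapped in.
STOP_WORDS = {
    "a", "an", "and", "are", "for", "in", "is", "of",
    "on", "the", "to", "we", "with", "you",
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_words(descriptions, n=5):
    """Count non-stop-words across all descriptions; return the n most common."""
    counts = Counter()
    for text in descriptions:
        counts.update(w for w in tokenize(text) if w not in STOP_WORDS)
    return counts.most_common(n)


# Hypothetical job descriptions standing in for JobPostings.csv.
ads = [
    "We are looking for a data scientist with experience in Python and SQL.",
    "The data scientist will build machine learning models in Python.",
    "Experience with machine learning and data visualization is required.",
]

result = top_words(ads, n=3)
print(result)
```

Tokenizing on alphabetic runs drops punctuation and digits, which is usually acceptable for a frequency overview but would split terms such as "C++" or "3D"; a self-defined follow-up analysis may need a different tokenizer.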

ASSESSMENT TASK TWO

DATA ANALYTICS: FIFA 2019

Background
Recently, Kaggle (a data science community and competition platform) released the FIFA19 data set, which consists of 18K+ FIFA 19 players with around 90 attributes extracted from the FIFA database. Here, we redistribute this data set for this assessment task:
2020T2Data.csv: The file contains detailed information about each FIFA 19 player.

Task Description
We provide one Jupyter notebook 2020SIT742Task2.ipynb together with 2020T2Data.csv at the data subfolder.

You are required to analyse this dataset using Jupyter notebook with Spark packages including spark.sql and pyspark.ml.

FIFA19 Data Analytics
To systematically investigate this dataset, your Jupyter notebook should complete the following 3 kinds of analysis:
Part 1 - Exploratory Data Analysis: data visualization and understanding.
Part 2 - Clustering Analysis: Identify the inherent clusters among players and, for each cluster, identify its profile.
Part 3 - Classification Analysis: Build classifiers to predict the 'position_group' of the player. You are also required to evaluate the performance of at least 3 models using cross-validation.
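The cross-validation procedure asked for in Part 3 can be illustrated in plain Python. This is a conceptual sketch only: the task itself calls for pyspark.ml, where a tool such as CrossValidator would normally play this role. The two toy models, the feature pairs, and the labels "FWD" and "DEF" are hypothetical and are not taken from 2020T2Data.csv.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(fit, X, y, k=3, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average accuracy."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        predict = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        correct = sum(predict(X[j]) == y[j] for j in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k


def fit_majority(X, y):
    """Baseline model: always predict the most frequent training label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label


def fit_nearest_centroid(X, y):
    """Predict the label whose mean feature vector is closest to x."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    centroids = {
        label: [sum(col) / len(col) for col in zip(*points)]
        for label, points in groups.items()
    }

    def predict(x):
        return min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
        )

    return predict


# Hypothetical (finishing, tackling) ratings with made-up position groups.
X = [[85, 30], [80, 35], [90, 25], [78, 40], [88, 28], [82, 33],
     [30, 85], [35, 80], [25, 90], [40, 78], [28, 88], [33, 82]]
y = ["FWD"] * 6 + ["DEF"] * 6

models = [("majority", fit_majority), ("nearest_centroid", fit_nearest_centroid)]
results = {name: cross_val_accuracy(fit, X, y, k=3) for name, fit in models}
print(results)
```

Scoring every model on the same folds (fixed seed) keeps the comparison fair; the same pattern extends to the three or more classifiers the task requires.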

Project Report

Based on your implementation as required in the Jupyter notebook, you are required to write a report SIT742T2Report.pdf of 1000-1500 words, which should include the following information:

(1) The required report 'Section 1' to 'Section 3' (results and analysis) as specified in the notebook.
(2) In the report's 'Section 4', discuss any findings you can reveal from this data set, such as any rising star? any omni player? etc. (10%)
(3) In the report's 'Section 5', reflect on the project group activities, such as the task distribution and contributions from each group member, and what you have learnt during this project. (10%)

