ICT 583 Data Science Applications Assignment

Reference no: EM133017759

ICT 583 Data Science Applications - Murdoch University

Assignment: Data Science Project

Assignment overview:
The healthcare industry has been one of the most prominent beneficiaries of the emergence of data science. Successful applications such as AI-assisted diagnosis and prognosis, computerized drug discovery, and virtual assistants can greatly improve patient care and save public money. Your final assignment is to apply your data science knowledge to two healthcare datasets: the mammographic masses dataset and the Global Burden of Disease dataset. The goal of this project is to follow the data science analysis pipeline: pose interesting questions of your own choosing, acquire the data, perform data manipulation, design your visualizations, build predictive models using machine learning techniques, and present the results in a report.

Classification -- Mammographic Mass Dataset
Step 1: Get your dataset: You will use one healthcare dataset, the Mammographic Mass Data Set.
Step 2: You will raise two interesting questions about the dataset and prepare to answer them in your subsequent analysis via data manipulation, visualization, predictive modeling, etc.
Step 3: Data manipulation and cleaning: Inspect your dataset, pre-process the data if necessary, and justify each pre-processing step.
Step 4: Exploratory data analysis: Perform initial investigations on the data using summary statistics and visualizations.
Step 5: You will select two classification methods and apply them to the dataset for predictive modeling. The performance of the different models should be evaluated.
Step 6: Analyze the results.
Step 7: Document all your findings.
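Steps 3 to 5 above could be sketched in base R roughly as follows. This is only a minimal illustration under stated assumptions, not a model solution: the data are simulated rather than read from the real file, the column names (Age, Shape, Margin, Severity) follow the public description of the Mammographic Mass Data Set, logistic regression and a hand-written k-nearest-neighbours classifier are arbitrary example choices for the "two classification methods", and treating Shape and Margin as numeric is a simplification.

```r
set.seed(583)

# --- Simulated stand-in for the real data (assumption: replace with your own import) ---
n <- 400
masses <- data.frame(
  Age    = round(rnorm(n, mean = 55, sd = 14)),
  Shape  = sample(1:4, n, replace = TRUE),
  Margin = sample(1:5, n, replace = TRUE)
)
lin <- -6 + 0.05 * masses$Age + 0.6 * masses$Shape + 0.4 * masses$Margin
masses$Severity <- rbinom(n, 1, plogis(lin))   # 0 = benign, 1 = malignant
masses$Age[sample(n, 15)] <- NA                # inject missing values

# --- Step 3: cleaning (here: drop incomplete rows; justify this in the report) ---
masses <- na.omit(masses)

# --- Step 4: quick exploratory summaries ---
print(summary(masses))
print(table(masses$Severity))

# --- Step 5: train/test split and two classifiers ---
idx   <- sample(nrow(masses), size = floor(0.7 * nrow(masses)))
train <- masses[idx, ]
test  <- masses[-idx, ]

# Classifier 1: logistic regression
fit_glm  <- glm(Severity ~ Age + Shape + Margin, data = train, family = binomial)
pred_glm <- as.integer(predict(fit_glm, newdata = test, type = "response") > 0.5)

# Classifier 2: k-nearest neighbours on standardised predictors (majority vote)
knn_predict <- function(train_x, train_y, test_x, k = 5) {
  apply(test_x, 1, function(row) {
    d <- sqrt(colSums((t(train_x) - row)^2))   # Euclidean distance to every training row
    nearest <- order(d)[seq_len(k)]
    as.integer(mean(train_y[nearest]) > 0.5)
  })
}
predictors <- c("Age", "Shape", "Margin")
mu      <- colMeans(train[, predictors])
sdv     <- apply(train[, predictors], 2, sd)
train_x <- scale(train[, predictors], center = mu, scale = sdv)
test_x  <- scale(test[, predictors],  center = mu, scale = sdv)
pred_knn <- knn_predict(train_x, train$Severity, test_x, k = 5)

# --- Evaluation: accuracy, sensitivity, specificity from the confusion matrix ---
evaluate <- function(truth, pred) {
  cm <- table(Truth = factor(truth, levels = 0:1),
              Predicted = factor(pred, levels = 0:1))
  c(accuracy    = sum(diag(cm)) / sum(cm),
    sensitivity = cm["1", "1"] / sum(cm["1", ]),
    specificity = cm["0", "0"] / sum(cm["0", ]))
}
results <- rbind(logistic = evaluate(test$Severity, pred_glm),
                 knn      = evaluate(test$Severity, pred_knn))
print(round(results, 3))
```

The test-set scaling reuses the training means and standard deviations so that no information from the test rows leaks into the model. A single 70/30 split is used here for brevity; cross-validation would give a more stable comparison.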
Clustering -- GBD Dataset
Step 1: Get your dataset: You will use one healthcare dataset, the Global Burden of Disease Study (GBD) Data Set, available from the LMS.
NOTE: IHME GBD data 2017_F_csv is the GBD data of females in 2017; IHME GBD data 2017_M_csv is the GBD data of males in 2017. YOU ONLY NEED TO SELECT ONE OF THEM FOR THE FOLLOWING ANALYSIS.
Step 2: You will raise two interesting questions about the dataset and prepare to answer them in your subsequent analysis via data manipulation, visualization, clustering, etc.
Step 3: Data manipulation and cleaning: Inspect your dataset, pre-process the data if necessary, and justify each pre-processing step.
Step 4: Exploratory data analysis: Perform initial investigations on the data using summary statistics and visualizations.
Step 5: You will select two clustering methods to identify groups of countries in the dataset. The performance of the different models should be evaluated.
Step 6: Analyze the results.
Step 7: Document all your findings.
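The clustering and evaluation steps could be sketched in base R along the following lines. This is a hedged illustration only: the data frame below is simulated, the column names (country, rate_a, rate_b) are hypothetical placeholders rather than the real GBD column names, k-means and Ward hierarchical clustering are example choices for the "two clustering methods", and the average silhouette width is just one possible evaluation measure (computed by hand here to avoid extra packages).

```r
set.seed(2017)

# --- Simulated stand-in: 30 hypothetical countries in three loose groups ---
gbd <- data.frame(
  country = paste0("Country_", 1:30),
  rate_a  = c(rnorm(10, 200, 20), rnorm(10, 400, 30), rnorm(10, 650, 40)),
  rate_b  = c(rnorm(10, 50, 8),   rnorm(10, 120, 10), rnorm(10, 80, 9))
)

# Standardise so that no single variable dominates the distance
x <- scale(gbd[, c("rate_a", "rate_b")])
rownames(x) <- gbd$country

# Method 1: k-means with several random starts
km <- kmeans(x, centers = 3, nstart = 25)

# Method 2: agglomerative hierarchical clustering (Ward linkage), cut into 3 groups
hc        <- hclust(dist(x), method = "ward.D2")
hc_groups <- cutree(hc, k = 3)

# Average silhouette width: values near 1 indicate compact, well-separated clusters
mean_silhouette <- function(x, groups) {
  d <- as.matrix(dist(x))
  s <- sapply(seq_len(nrow(d)), function(i) {
    own <- groups == groups[i]
    if (sum(own) == 1) return(0)                        # convention for singleton clusters
    a <- sum(d[i, own]) / (sum(own) - 1)                # mean distance within own cluster
    b <- min(tapply(d[i, !own], groups[!own], mean))    # nearest other cluster
    (b - a) / max(a, b)
  })
  mean(s)
}

scores <- c(kmeans       = mean_silhouette(x, km$cluster),
            hierarchical = mean_silhouette(x, hc_groups))
print(round(scores, 3))

# Cross-tabulate the two solutions to see where they agree
print(table(kmeans = km$cluster, hierarchical = hc_groups))
```

The number of clusters is fixed at three here only because the simulated data were built that way; on the real data, repeating the silhouette calculation over a range of k values is one way to justify the choice.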

What you need to submit:
R file
An essential part of your project is your R coding. Your R file should record the steps in developing your solutions and obtaining the final data analysis results. Make sure your code matches the findings you put in the report. For example, if there are three separate plots in the report, your code should produce exactly the same three separate plots.
Report
You also need to submit an in-depth report with two parts: classification and clustering. The following components and discussion points might be considered in each part:
Overview of the project: Provide an overview of the project, the goals, and the motivation for it. Assume it will be read by people seeing your project for the first time.
Dataset: Describe the background of the dataset and provide summary statistics.
Interesting questions: What questions are you trying to answer? Do any questions evolve throughout the project? Are there any new questions you consider in the course of your analysis? ...
Data manipulation and cleaning: Are there any data pre-processing steps performed, and why? Are there any questions that can be answered via data manipulation? ...
Exploratory data analysis: What visualizations did you use to look at your data in different ways? Are there any detected outliers? ...
Predictive modelling: What are the various machine learning methods you considered? Justify the decisions you made. What are the main ideas of the selected methods? How do you build the models? Are there any concerns when designing your model? ...
Final analysis: What did you learn about the data? Which method statistically outperformed the rest? Have you found the answers to the questions you raised? How can you justify your answers? ... Present your results engagingly using text and visualizations.
Conclusion: Are there any limitations of your study? What is your future work?
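For the question of whether one method statistically outperformed another, one possible approach (an assumption here, not something the brief prescribes) is McNemar's test on the paired correct/incorrect outcomes of two classifiers evaluated on the same test set. The sketch below uses simulated labels and predictions; pred_a and pred_b are hypothetical stand-ins for the outputs of your two models.

```r
set.seed(42)

# Simulated ground truth and two hypothetical sets of predictions
truth  <- rbinom(200, 1, 0.5)
pred_a <- ifelse(runif(200) < 0.85, truth, 1 - truth)  # roughly 85% accurate
pred_b <- ifelse(runif(200) < 0.75, truth, 1 - truth)  # roughly 75% accurate

# Paired 2x2 table: was each test case classified correctly by A and by B?
correct_a <- factor(pred_a == truth, levels = c(TRUE, FALSE))
correct_b <- factor(pred_b == truth, levels = c(TRUE, FALSE))
agreement <- table(A_correct = correct_a, B_correct = correct_b)
print(agreement)

# McNemar's test compares the two off-diagonal counts
# (cases where exactly one of the models was right)
result <- mcnemar.test(agreement)
print(result)
```

A small p-value suggests the two error rates differ by more than chance would explain; a large one means the observed accuracy gap may not be meaningful on a test set of this size.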

Attachment:- Data Science Applications.rar


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd