Briefly explain the concept of clustering and k-means

Reference no: EM132308866

Assignment Task

A research team planned to study Australian road transport crash fatalities from 2010 to 2018 (inclusive). As a team member, you were given the dataset about Australian Road Death Fatalities, and were requested to analyse the data and prepare a report about your work and findings.

The dataset can be downloaded from Blackboard or the above website. The dataset contains basic demographic and crash details of Australian road crashes between 1989 and 2019. As the team does not have any specific goal for the analysis, you have the freedom to explore the data and dig out anything you find interesting or significant. However, you are to limit your research and analysis to the years 2010 to 2018.

The potential audiences include other researchers, business representatives, and government agencies. They may have limited ICT or mathematical knowledge.

To prepare the report, please include the following sections:

1. Introduction
Provide an introduction to the problem. Include background material as appropriate: who cares about this problem, what impact it has, where does the data come from, what are the dimensions and structures of the data.

2. Data Setup
Describe how to load the data, and how the pre-processing is performed.

The original dataset is not ready for analysis and it is different from the data forms that we are familiar with in previous practices. This means we need to do some pre-processing, either for the whole dataset, or for a subset of the dataset required for each sub task described later.

Once you have some ideas for exploratory or advanced analysis, you need to adjust the form of the dataset. This can be achieved either by manipulating records in R through transposition or subsetting, or with other tools (e.g. Notepad or Excel) before reading them into R. Please explain your solution in this section.
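
The subsetting step described above could look roughly like the following sketch. The toy data frame and its column names (Year, State, Age) are assumptions made for illustration, not the real dataset's headers, and the file name in the comment is hypothetical.

```r
# Toy stand-in for the fatalities file. The column names (Year, State, Age)
# are assumptions, not the real dataset's headers.
crashes <- data.frame(
  Year  = c(1995, 2009, 2010, 2014, 2018, 2019),
  State = c("NSW", "VIC", "QLD", "NSW", "WA", "QLD"),
  Age   = c(34, 51, 19, 72, 45, 28)
)
# With the real file this would be something like:
# crashes <- read.csv("fatalities.csv")  # hypothetical file name

# Keep only the study period, 2010 to 2018 inclusive
study <- subset(crashes, Year >= 2010 & Year <= 2018)

# Count records per year to get one row per year
per_year <- as.data.frame(table(Year = study$Year), responseName = "Deaths")
print(per_year)
```

Subsetting first and aggregating second keeps the later plots and models restricted to the required years.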

3. Exploratory Data Analysis

One-variable analysis

One-variable analysis studies one variable (one row or one column) each time. For example, we can select a particular Australian state or year to get a column of numbers and the histogram can be used.

Perform 2 one-variable analyses. Plot one graph for each variable. Explain the finding for each graph.
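
A one-variable analysis of this kind might be sketched as below. The ages are randomly generated for illustration only; a real analysis would use a column from the crash dataset instead.

```r
# Made-up ages for illustration only; not taken from the real dataset
set.seed(1)
ages <- round(runif(200, min = 0, max = 95))

# Histogram of a single variable, using 10-year age bands
h <- hist(ages,
          breaks = seq(0, 100, by = 10),
          main   = "Age at death (toy data)",
          xlab   = "Age (years)")

# The bin counts are also available for the written explanation
print(h$counts)
```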

Two-variable analysis

Two-variable analysis studies the relation between two variables. For example, we can select "Diseases of the nervous system" and "Year", then a time series (scatter) plot can be drawn. Or, we can select "2015" and "Causes".

Perform 2 two-variable analyses. Plot one graph for each analysis. Explain the finding for each graph.
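
A two-variable analysis against "Year" could be sketched as a simple time series plot. The yearly counts here are invented placeholders, not real fatality figures.

```r
# Invented yearly counts for illustration; not real fatality figures
years  <- 2010:2018
deaths <- c(140, 130, 132, 120, 115, 121, 128, 122, 113)

# Time series (scatter) plot of one variable against the year
plot(years, deaths,
     type = "b",                      # points joined by lines
     main = "Deaths per year (toy data)",
     xlab = "Year",
     ylab = "Number of deaths")

# A single number summarising the direction of the relationship
trend <- cor(years, deaths)
print(trend)
```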

4. Advanced Analysis
Clustering
Briefly explain the concept of clustering and k-means.
Perform 1 clustering analysis to group years according to a selected cause.
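
As a rough illustration of the clustering task: k-means splits observations into k groups by repeatedly assigning each observation to its nearest group centre and then recomputing the centres. The sketch below groups years by a single yearly count using base R's kmeans(). The counts are invented, and k = 3 is an arbitrary choice for the example.

```r
# Invented yearly counts for illustration; not real fatality figures
years  <- 2010:2018
deaths <- c(300, 295, 305, 200, 210, 205, 100, 95, 105)

# k-means on one variable; nstart reruns the algorithm from several
# random starting centres and keeps the best result
set.seed(42)
fit <- kmeans(matrix(deaths, ncol = 1), centers = 3, nstart = 25)

# One row per year with its assigned cluster label
groups <- data.frame(Year = years, Deaths = deaths, Cluster = fit$cluster)
print(groups)
```

The cluster labels (1, 2, 3) are arbitrary identifiers; only the grouping of the years is meaningful.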

Linear Regression
Briefly explain the concept of linear regression.
Perform 2 linear regression analyses. Plot the learned models.
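
For orientation, simple linear regression fits a straight line y = a + b*x that minimises the squared vertical distances between the line and the data points. A minimal sketch with base R's lm() follows. The yearly counts are invented and built around a slope of about -10 per year, so they say nothing about the real data.

```r
# Invented yearly counts built around a downward trend; not real figures
years  <- 2010:2018
deaths <- 500 - 10 * (years - 2010) + c(3, -2, 1, -4, 2, 0, -1, 4, -3)

# Fit deaths as a linear function of year
fit <- lm(deaths ~ years)
print(coef(fit))   # intercept and slope of the fitted line

# Plot the data and overlay the learned model
plot(years, deaths,
     main = "Linear trend (toy data)",
     xlab = "Year",
     ylab = "Number of deaths")
abline(fit, col = "red")
```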

5. Conclusion

6. Reflections

In this part, discuss any difficulties you had performing the analysis and how you solved those difficulties. Reflect on how the analysis process went for you, what you learnt, and what you might do differently next time.

For the data analysis (Sections 3 and 4), you need to provide the R code, an explanation of the code, and the result. Please present each R code snippet in a box with some comments. For example:

# Draw a boxplot on the attribute "Income"
boxplot(MyData$income)

Report Format

Your report should be no fewer than 1,200 words and ideally no more than 2,000 words. Text in R code snippets is not counted.

The report MUST be formatted using the following guidelines:

• Title Page - Must not contain headers, footers, or page numbering. Include your name as the report's author.

• Header - Report title

• Footer - your name and the page number

• Paragraph text - 12 point Calibri single line spacing

• Headings - Arial in an appropriate type size

• Margins - 2.5cm on all margins

• Page numbering

• Executive summary to the last page of Table of Figures to use roman numerals (i, ii, iii, iv)

• Introduction and onwards to use conventional numerals (1, 2, 3, 4) starting on page 1 from the introduction.

• The report is to be created as a single Microsoft Word document (version 2007 or later).

Attachment:- Introduction to Data Science.rar

Verified Expert

This task provides a clear working example of simple linear regression analysis. Simple linear regression was used to predict the dependent variable from one independent variable. If there is a strong relationship between these two variables, we can conclude that the independent variable strongly influences the dependent variable.


Questions Cloud

Briefly explain the concept of clustering and k-means : ICT110 - Introduction to Data Science - University of the Sunshine Coast, Queensland - Describe how to load the data, and how the pre-processing is performed

Reviews

len2308866

5/20/2019 3:05:58 AM

This assignment will take several weeks to complete and will require a good understanding of data science theories and practices for successful completion. It is imperative that students take heed of the following points in relation to doing this assignment:

1. Ensure that you clearly understand the requirements for the assignment – what must be done and what are the deliverables.
2. If you do not understand any of the assignment requirements – Please ASK the course coordinator or your tutor.
3. Each time you work on any aspect of the assignment reread the assignment requirements to ensure that what is required is clearly understood.

len2308866

5/20/2019 3:05:48 AM

2 references for the explanation of Clustering and 2 for linear regression are required. These references should follow the Harvard method of referencing. Note that ALL references should be from journal articles, conference papers, technical papers or a recognized expert in the field. DO NOT use Wikipedia as a reference. The use of unqualified references will result in the deduction of marks.

len2308866

5/20/2019 3:05:35 AM

Requests for an extension to an assignment MUST be made to the course coordinator prior to the date of submission and requests made on the day of submission or after the submission date will only be considered in exceptional circumstances. Assignment submission extensions will only be made using the official University guidelines.

len2308866

5/20/2019 3:05:19 AM

Submit your assignment to Blackboard Task 2. Please follow the submission instructions in Blackboard. The assignment will be marked out of a total of 100 marks and forms 30% of the total assessment for the course. ALL assignments will be checked for plagiarism automatically by the SafeAssign system provided by Blackboard. Refer to your Course Outline or the Course Web Site for a copy of the “Student Misconduct, Plagiarism and Collusion” guidelines. Late submission will be penalised according to the policy in the course outline. Please note that Saturday and Sunday are included in the count of days late.
