ICT707 Data Science Practice Assignment Problem

Reference no: EM132380088

ICT707 Data Science Practice Assessment Task - Big Data Assignment, University of the Sunshine Coast, Australia

Goal: To demonstrate a comprehensive view of big data analysis in terms of definitions and concepts, techniques, and producing big-data solutions to business problems.

Product: Artefact-Technical and Scientific, and Written Piece.

Format: A computer program that uses big-data analysis techniques to solve a business problem, plus a report (1000 words) describing and justifying the design of that program.

This task is being used for measuring assurance of learning towards Association to Advance Collegiate Schools of Business (AACSB) accreditation. The following Program Learning Objectives will be assessed:

1: Problem Solving

Demonstrate critical and creative thinking to identify and solve complex business problems and arrive at innovative solutions.

Further details of this assessment will be given on Blackboard. This is an individual assessment.

Criteria:

  • Presentation and organisation of report.
  • Demonstrate critical analysis of the given problem and apply creative thinking and approaches to solve the problem.
  • Application of relevant programming concepts.
  • Accuracy of the program output.
  • Adherence to the recommended programming styles.

Assignment Task -

This assignment consists of two deliverables, being:

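  • One code implementation: the code file in Jupyter Notebook format and the relevant data set files should be contained within a folder named Task-Your Name-Student Number. The folder is then to be zipped and uploaded to Blackboard.
  • A report, uploaded as a separate file.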

Part I - PySpark source code

Important Note: For reproducibility, your code must be self-contained. That is, it should not require any libraries beyond the PySpark environment we have used in the workshops, and the data files must be packaged with your code file.

In this component, you need to use Python 3 and PySpark to complete the following data analysis tasks:

1. Exploratory data analysis

2. Recommendation engine

3. Classification

4. Clustering

You need to choose a dataset from Kaggle to complete these tasks. Remember to include the data set file in your source code submission.

Note: In your notebook, please use a Heading 1 Markdown cell to separate each subtask.

Task 1.1: Exploratory data analysis

This subtask requires you to explore your dataset by:

  • reporting its number of rows and columns;
  • cleaning the data (missing values or duplicated records) if necessary;
  • selecting 3 columns and drawing 1 plot (e.g. bar chart, histogram, boxplot) for each to summarise it.
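
The three steps above can be sketched roughly as follows. This is only an illustration, not a model answer: the helper names (bin_labels, describe_and_clean, plot_histogram), the file name and the column name are all hypothetical, and matplotlib is assumed to be available in the workshop environment.

```python
def bin_labels(edges):
    """Turn histogram bucket edges into readable 'low-high' labels."""
    return [f"{lo:.2f}-{hi:.2f}" for lo, hi in zip(edges, edges[1:])]


def describe_and_clean(df):
    """Report the shape, then drop duplicate rows and rows with missing values."""
    print(f"Rows: {df.count()}, columns: {len(df.columns)}")
    cleaned = df.dropDuplicates().dropna()
    print(f"Rows after cleaning: {cleaned.count()}")
    return cleaned


def plot_histogram(df, column, bins=10):
    """Bin a numeric column with Spark and draw the counts as a bar chart."""
    # Assumption: matplotlib is part of the workshop environment.
    import matplotlib.pyplot as plt

    # RDD.histogram(n) returns n + 1 bucket edges and n counts.
    edges, counts = df.select(column).rdd.flatMap(lambda row: row).histogram(bins)
    plt.bar(bin_labels(edges), counts)
    plt.xticks(rotation=45)
    plt.title(f"Distribution of {column}")
    plt.show()


# Example notebook usage (file and column names are placeholders):
# df = spark.read.csv("my_dataset.csv", header=True, inferSchema=True)
# cleaned = describe_and_clean(df)
# plot_histogram(cleaned, "price")
```

Binning on the Spark side keeps only the bucket counts on the driver, which matters once the dataset is too large to collect in full.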

Task 1.2: Recommendation engine

This subtask requires you to implement a recommender system based on collaborative filtering with the Alternating Least Squares (ALS) algorithm. You need to include

  • Model training and predictions
  • Model evaluation using MSE
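
One possible shape for this subtask, using the RDD-based pyspark.mllib API, is sketched below. It assumes a CSV file with a header row whose first three columns are integer user id, integer item id and numeric rating; the function names, the 80/20 split and the parameter values are illustrative choices rather than requirements.

```python
def parse_rating(line, sep=","):
    """Parse 'user,item,rating,...' into a (user, item, rating) tuple."""
    user, item, rating = line.split(sep)[:3]
    return int(user), int(item), float(rating)


def squared_error(actual_and_predicted):
    """Squared difference between an actual and a predicted rating."""
    actual, predicted = actual_and_predicted
    return (actual - predicted) ** 2


def train_als(sc, path, rank=10, iterations=10, reg=0.1, seed=42):
    """Train ALS on 80% of the ratings and return (model, MSE on the rest)."""
    from pyspark.mllib.recommendation import ALS

    lines = sc.textFile(path)
    header = lines.first()
    ratings = lines.filter(lambda line: line != header).map(parse_rating)
    train, test = ratings.randomSplit([0.8, 0.2], seed=seed)

    model = ALS.train(train, rank=rank, iterations=iterations,
                      lambda_=reg, seed=seed)

    # Predict a rating for every (user, item) pair in the test set.
    predictions = (model.predictAll(test.map(lambda r: (r[0], r[1])))
                   .map(lambda p: ((p.user, p.product), p.rating)))
    actual = test.map(lambda r: ((r[0], r[1]), r[2]))

    # The join keeps only pairs that received a prediction, so users or
    # items unseen during training are left out of the MSE.
    mse = actual.join(predictions).values().map(squared_error).mean()
    return model, mse


# Example notebook usage (the file name is a placeholder):
# model, mse = train_als(sc, "ratings.csv")
# print(f"Test MSE: {mse:.4f}")
```

The rank, number of iterations and regularisation value are the parameters most worth discussing in the report, since each one changes the reported MSE.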

Task 1.3: Classification

This subtask requires you to implement a classification system using logistic regression with the LogisticRegressionWithLBFGS class. You need to include

  • Logistic Regression model training
  • Model evaluation
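
A minimal sketch of this subtask is shown below. It assumes a headerless CSV file in which every line holds a 0/1 label followed by numeric features; the function names, the 80/20 split and the use of accuracy as the evaluation measure are assumptions made for illustration.

```python
def parse_labeled_line(line, sep=","):
    """Split 'label,f1,f2,...' into a float label and a list of float features."""
    values = [float(v) for v in line.split(sep)]
    return values[0], values[1:]


def is_correct(label_and_prediction):
    """True when the predicted class equals the true label."""
    label, prediction = label_and_prediction
    return label == prediction


def train_logistic_regression(sc, path, iterations=100, seed=42):
    """Train on 80% of the rows and return (model, accuracy on the rest)."""
    from pyspark.mllib.classification import LogisticRegressionWithLBFGS
    from pyspark.mllib.regression import LabeledPoint

    points = (sc.textFile(path)
              .map(parse_labeled_line)
              .map(lambda lf: LabeledPoint(lf[0], lf[1])))
    train, test = points.randomSplit([0.8, 0.2], seed=seed)

    model = LogisticRegressionWithLBFGS.train(train, iterations=iterations)

    # Pair every true label with the class predicted for its features.
    labels_and_predictions = test.map(
        lambda p: (p.label, model.predict(p.features)))
    accuracy = labels_and_predictions.filter(is_correct).count() / float(test.count())
    return model, accuracy


# Example notebook usage (the file name is a placeholder):
# model, accuracy = train_logistic_regression(sc, "labelled_data.csv")
# print(f"Test accuracy: {accuracy:.3f}")
```

Accuracy is only one option; on an imbalanced dataset, precision, recall or the area under the ROC curve may describe the model better.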

Task 1.4: Clustering

This subtask requires you to implement a clustering system with K-means. You need to include

  • Model training
  • Model evaluation
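
As a rough sketch, K-means training and evaluation could look like the following. It assumes a headerless CSV file of numeric features and evaluates the model with the within-set sum of squared errors (WSSSE), which is one common choice and not necessarily the expected one; the function names and the value of k are hypothetical.

```python
def parse_features(line, sep=","):
    """Parse a line of separated numbers into a list of floats."""
    return [float(v) for v in line.split(sep)]


def nearest_center_cost(point, centers):
    """Squared Euclidean distance from a point to its closest centre."""
    return min(
        sum((p - c) ** 2 for p, c in zip(point, center))
        for center in centers
    )


def train_kmeans(sc, path, k=3, max_iterations=20, seed=42):
    """Fit K-means and return (model, within-set sum of squared errors)."""
    from pyspark.mllib.clustering import KMeans

    points = sc.textFile(path).map(parse_features).cache()
    model = KMeans.train(points, k, maxIterations=max_iterations, seed=seed)

    # Copy the centres into plain lists so the closure below is cheap to ship.
    centers = [center.tolist() for center in model.clusterCenters]
    wssse = points.map(lambda p: nearest_center_cost(p, centers)).sum()
    return model, wssse


# Example notebook usage (the file name is a placeholder):
# for k in (2, 3, 4, 5):
#     _, wssse = train_kmeans(sc, "features.csv", k=k)
#     print(k, wssse)
```

Comparing WSSSE across several values of k, as in the usage comment, is one way to justify the chosen number of clusters in the report.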

Part II - Report

You are required to write a report to explain your design and implementation of the machine learning parts in your code, including the following topics:

  • Introduction/summary/explanation to the ML algorithm/concepts.
  • The learning settings, such as how you prepared the training and testing sets, what the key parameters are, and how you set them.
  • Comments/evaluation for the models learnt.

Your report should use the following template:

Table of Contents

1.0 Introduction

Explain the data set you've chosen, including its source URL. Demonstrate your exploratory data analysis in this section.

2.0 Machine learning implementation

2.1 Collaborative filtering

2.3 Logistic regression

2.4 K-Means

3.0 Conclusion

References

Your report should be about 1000 words, but no more than 1500 words. The report is to include (at least 5) appropriate references and these references should follow the Harvard method of referencing. Note that ALL references should be from journal articles, conference papers, technical papers or a recognized expert in the field.

Please follow the conventions detailed in: Summers, J. & Smith, B., 2014, Communication Skills Handbook, 4th Ed, Wiley, Australia.


Reviews

len2380088

10/1/2019 9:48:14 PM

This assignment consists of two deliverables, being: One code implementation - The code file in Jupyter Notebook format and the relevant data set files should be contained within a folder named: Task-Your Name-Student Number, the folder is then to be zipped and uploaded to blackboard. A report - The report must be uploaded as a separate file. Report Format - Your report should be about 1000 words, but no more than 1500 words.

len2380088

10/1/2019 9:48:08 PM

The report MUST be formatted using the following guidelines: Title Page - Must not contain headers, footers, or page numbering. Include your name as the report's author. Header - Report title, Footer - your name and the page number, Paragraph text - 12 point Calibri single line spacing, Headings - Arial in an appropriate type size, Margins - 2.5cm on all margins, Page numbering. Introduction and onwards to use conventional numerals (1, 2, 3, 4) starting at page 1 from the introduction.

len2380088

10/1/2019 9:47:57 PM

The report is to be created as a single Microsoft Word document (version 2007 or later). No other format is acceptable and doing so will result in the deduction of marks. The report is to include (at least 5) appropriate references and these references should follow the Harvard method of referencing. Note that ALL references should be from journal articles, conference papers, technical papers or a recognized expert in the field. DO NOT use Wikipedia as a reference. The use of unqualified references will result in the deduction of marks.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd