Implement a recommender system on collaborative filtering

Reference no: EM132993099

ICT707 Data Science Practice - University of the Sunshine Coast

Part I - PySpark source code

Important Note:
- For code reproduction, your code must be self-contained. That is, it should not require any libraries beyond the PySpark environment we used this semester. The data files must be packaged with your code file.

- The data sets used in the lecture slides must not be used as the data set for the assignment. Doing so will result in a mark of 0 for the coding component.

In this component, we need to utilise Python 3 and PySpark to complete the following data analysis tasks:
1. Exploratory data analysis
2. Recommendation engine
3. Classification

You need to choose a dataset from Kaggle to complete these tasks.

Task I.1: Exploratory data analysis

This subtask requires you to explore your dataset by
• reporting its number of rows and columns,
• cleaning the data (handling missing values or duplicated records) if necessary,
• selecting 3 columns and drawing 1 plot (e.g. bar chart, histogram, boxplot) for each to summarise it.
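
The exploration steps above could be sketched in PySpark roughly as follows. This is a minimal sketch, not a model answer: the function names (`load_csv`, `shape`, `clean`, `numeric_histogram`, `category_counts`) and the CSV path are hypothetical, and it assumes the chosen Kaggle dataset is a CSV file with a header row. The PySpark import sits inside `load_csv` so the remaining helpers only rely on DataFrame methods.

```python
def load_csv(path):
    """Read a CSV file into a Spark DataFrame (path is a placeholder)."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("eda").getOrCreate()
    return spark.read.csv(path, header=True, inferSchema=True)


def shape(df):
    """Return (number of rows, number of columns)."""
    return df.count(), len(df.columns)


def clean(df):
    """Drop rows containing nulls, then drop exact duplicate rows."""
    return df.dropna().dropDuplicates()


def numeric_histogram(df, column, bins=10):
    """Compute histogram data for a numeric column on the cluster.

    Returns (edges, counts) with len(edges) == len(counts) + 1.
    Run clean() first: null values would break the bucket computation.
    """
    values = df.select(column).rdd.map(lambda row: row[0])
    return values.histogram(bins)


def category_counts(df, column):
    """Return [(value, count), ...] for a categorical column, largest first."""
    rows = df.groupBy(column).count().orderBy("count", ascending=False).collect()
    return [(row[column], row["count"]) for row in rows]
```

The small summaries returned by `numeric_histogram` and `category_counts` can then be handed to whatever plotting library the course environment provides (matplotlib is a common choice, but that is an assumption here) to draw the three required plots.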

Task I.2: Recommendation engine

This subtask requires you to implement a recommender system based on collaborative filtering with the Alternating Least Squares (ALS) algorithm. You need to include
• Model training and predictions
• Model evaluation using MSE
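
One possible shape for the training and evaluation steps, using the ALS estimator and `RegressionEvaluator` from `pyspark.ml`, is sketched below. The column names (`userId`, `itemId`, `rating`) and the hyperparameter values are assumptions for illustration and would need to match and be tuned for the chosen dataset. The `mse` helper is a plain-Python restatement of the metric, included only as a sanity check on small samples.

```python
def train_als(train_df, user_col="userId", item_col="itemId", rating_col="rating"):
    """Fit an ALS collaborative-filtering model (column names are assumed)."""
    from pyspark.ml.recommendation import ALS

    als = ALS(
        userCol=user_col,
        itemCol=item_col,
        ratingCol=rating_col,
        rank=10,        # number of latent factors (illustrative value)
        maxIter=10,     # number of alternating passes (illustrative value)
        regParam=0.1,   # regularisation strength (illustrative value)
        coldStartStrategy="drop",  # drop users/items unseen in training so MSE is not NaN
        seed=42,
    )
    return als.fit(train_df)


def evaluate_mse(model, test_df, rating_col="rating"):
    """Predict ratings for test_df and return the mean squared error."""
    from pyspark.ml.evaluation import RegressionEvaluator

    predictions = model.transform(test_df)
    evaluator = RegressionEvaluator(
        metricName="mse", labelCol=rating_col, predictionCol="prediction"
    )
    return evaluator.evaluate(predictions)


def mse(pairs):
    """Plain-Python reference: mean of squared (actual - predicted) differences."""
    pairs = list(pairs)
    return sum((actual - predicted) ** 2 for actual, predicted in pairs) / len(pairs)
```

A typical use would split the ratings first, for example `train_df, test_df = ratings.randomSplit([0.8, 0.2], seed=42)`, then call `train_als(train_df)` and `evaluate_mse(model, test_df)`. ALS expects numeric user and item identifiers, so string IDs would need to be indexed beforehand.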

Task I.3: Classification

This subtask requires you to implement a classification system with logistic regression. You need to include
• Logistic Regression model training
• Model evaluation
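
The classification steps might look something like the sketch below. It assumes a binary label stored as 0/1 in a column named `label` and numeric feature columns. Those names, the hyperparameter values, and the choice of metrics (area under ROC and accuracy) are illustrative assumptions, since the task does not prescribe them. The `confusion_counts` helper is plain Python and takes (label, prediction) pairs, for example collected from a small sample of the predictions.

```python
def train_logistic_regression(train_df, feature_cols, label_col="label"):
    """Assemble feature columns into a vector and fit logistic regression."""
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import VectorAssembler

    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
    lr = LogisticRegression(
        featuresCol="features",
        labelCol=label_col,
        maxIter=20,      # illustrative value
        regParam=0.01,   # illustrative value
    )
    return Pipeline(stages=[assembler, lr]).fit(train_df)


def evaluate_classifier(model, test_df, label_col="label"):
    """Return area under ROC and accuracy on test_df (binary label assumed)."""
    from pyspark.ml.evaluation import (
        BinaryClassificationEvaluator,
        MulticlassClassificationEvaluator,
    )

    predictions = model.transform(test_df)
    auc = BinaryClassificationEvaluator(
        labelCol=label_col, metricName="areaUnderROC"
    ).evaluate(predictions)
    accuracy = MulticlassClassificationEvaluator(
        labelCol=label_col, predictionCol="prediction", metricName="accuracy"
    ).evaluate(predictions)
    return {"areaUnderROC": auc, "accuracy": accuracy}


def confusion_counts(pairs):
    """Count tp/fp/tn/fn from (label, prediction) pairs with 0/1 values."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for label, prediction in pairs:
        if prediction == 1:
            counts["tp" if label == 1 else "fp"] += 1
        else:
            counts["tn" if label == 0 else "fn"] += 1
    return counts
```

Wrapping the assembler and the classifier in a `Pipeline` keeps the same feature preparation applied at training and prediction time, which is one reasonable design choice among several.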

Part II - Report

You are required to write a report with the following content:
• Provide a high-level survey on the advances of data science in the past 2 years.
• Compare the features of Spark version 2.4 that we used this semester and the new version 3.0.
• Explain your design and implementation of the machine learning parts in your code, including the following topics:
o Background of your selected data set
o For each task, which learning algorithm is used, what its key parameters are, and how you set them
o For each task, provide comments/evaluation for the model learnt

Your report should use the following template:

Table of Contents

1.0 Advancement of Data Science (500 words)

2.0 Comparison of Spark 2.4 and 3.0 (250 words)

3.0 Machine Learning Implementation (250 words)
3.1 Data set
3.2 Collaborative filtering
Features of the model, key parameters and configuration
Evaluation
3.3 Logistic regression
Features of the model, key parameters and configuration
Evaluation

References

Attachment:- Data Science Practice.rar




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd