Implement practical machine learning

Reference no: EM133126075

MLN 601 Machine Learning - Torrens University Australia

Assessment - Classification

Learning Outcome 1: Apply learning algorithms to perform machine learning tasks.
Learning Outcome 2: Implement practical machine learning: data pre-processing, analysis, model selection, and interpret the results.
Learning Outcome 3: Communicate clearly and effectively using the technical language of machine learning to a range of stakeholders.

Task Summary

Assessment 1 considered a wine data set as a regression task. This brief revisits the data set as a classification task. In this Assessment, you will use a decision tree Machine Learning (ML) algorithm to analyse data and draw conclusions. To help you create and document this ML model and the results, you will follow the end-to-end CRoss-Industry Standard Process for Data Mining (CRISP-DM) (Chapman et al., 2000) methodology. To guide you through the analysis, the development of your model, and the writing of your report and 7-10 minute presentation, a commented template for your Jupyter Notebook has been provided. Your presentation should touch on the key steps of the template, including the lessons you learned and your experiences.

Please refer to the Task Instructions (below) for further details on how to complete this task.

Context
In addition to giving you an opportunity to complete an ML exercise, this Assessment also gives you an opportunity to practise hyperparameter tuning using the scikit-learn library. In your future workplaces, you will often be expected to perform similar exercises using suitable data sets with different machine learners and tune the hyperparameters. Model building requires you to review the parameter settings after each run and tune them for the next model run.

For this Assessment, the practice data set is available from the UCI ML repository, which contains nearly 500 real-world data sets. Your focus will be on the wine quality data set. This data set provides wine quality data across 11 traits, including acidity, residual sugar and alcohol concentration. Importantly, this Assessment requires you to develop a model to predict wine quality, which is scored on a scale from 1 to 10.

You will revisit this data set to complete a classification task. To achieve this, you will set up a categorical variable with two categories. Thus, you will be required to allocate levels for your wine quality (the dependent variable) to assign either a 'low' quality (1) (below the selected value of 6) or a 'high' quality (0) (at or above the selected value of 6). You will use this binary classification to help generate a prediction model for high or low quality wine using decision tree algorithms.
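The binary coding can be sketched as follows. This is a minimal illustration, assuming the data is held in a pandas DataFrame with a `quality` column (the column name used in the UCI files) and that 'high' means a score of 6 or above; the function name `add_quality_label` and the new column name `quality_label` are hypothetical.

```python
import pandas as pd

QUALITY_THRESHOLD = 6  # cut-off value named in the brief


def add_quality_label(df: pd.DataFrame, threshold: int = QUALITY_THRESHOLD) -> pd.DataFrame:
    """Return a copy of df with a binary column: 1 = low quality, 0 = high quality."""
    out = df.copy()
    # Scores below the threshold are coded 'low' (1); all other scores are 'high' (0).
    out["quality_label"] = (out["quality"] < threshold).astype(int)
    return out


# Tiny hand-made sample for illustration, not real wine data.
sample = pd.DataFrame({"quality": [3, 5, 6, 7, 8]})
labelled = add_quality_label(sample)
print(labelled["quality_label"].tolist())  # prints [1, 1, 0, 0, 0]
```

Note that this coding makes 'low' the positive class (1), so precision, recall and the ROC curve will describe how well the model detects low-quality wine.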

Follow the steps of the CRISP-DM model using the template CRISP_DM _Template_(assessment_2_ classification).ipynb to document and develop your ML model. At the modelling stage, you should practice tuning the hyperparameters for the decision tree to ascertain the effects on the model and determine the optimal performance using the AUC-ROC curve.
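One possible way to practise this tuning is a cross-validated grid search scored by the area under the ROC curve. The sketch below is not the template's prescribed method: it assumes scikit-learn's `GridSearchCV` is acceptable, uses synthetic data from `make_classification` as a stand-in for the 11 wine traits, and the grid values are arbitrary starting points.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in with 11 features; replace with the prepared wine data.
X, y = make_classification(n_samples=600, n_features=11, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative grid; the values are assumptions, not recommended settings.
param_grid = {
    "max_depth": [2, 4, 6, None],
    "min_samples_leaf": [1, 5, 20],
    "criterion": ["gini", "entropy"],
}

# scoring="roc_auc" ranks each parameter combination by cross-validated AUC.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, scoring="roc_auc", cv=5
)
search.fit(X_train, y_train)

# Check the selected tree on data that played no part in the tuning.
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
print("best parameters:", search.best_params_)
print("cross-validated AUC:", round(search.best_score_, 3))
print("held-out AUC:", round(test_auc, 3))
```

Comparing the cross-validated AUC with the held-out AUC gives a rough indication of whether the chosen settings overfit the training folds.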

Task Instructions
You will use your Jupyter Notebook on the Microsoft Azure ML platform or Google Colab and Python 3.6 as the language for all three assessments.
Ultimately, the Notebook will contain your ML code, your data and your report documentation.
Your Assessment will be evaluated based on the major stages of the CRISP-DM process as set out in the Notebook template with prompts. The process comprises:
1. Business Understanding;
2. Data Understanding;
3. Data Preparation;
4. Modelling;
5. Evaluation; and
6. Deployment.
All six multi-step stages of CRISP-DM must be undertaken to complete this Assessment. Note: To make the work easier, you should document what you are doing in your Notebook as you progress through the activities (e.g., the steps undertaken and the rationale for the selection of the code). The template will prompt you on how to work through the end-to-end ML process.

Stage 1: Business Understanding
1. This section serves as an introduction. You should write a clear and concise narrative expressing what you are trying to achieve. Think in terms of ML; for example, the prediction algorithm, the data set selected, what you are seeking from the data set and how you intend to understand the value of your prediction capability.
2. Assess the current situation, following the prompts in the CRISP-DM template. (1.1).

Stage 2: Data Understanding
1. Acquire the relevant wine quality data set from the UCI repository for your prediction model (https://archive.ics.uci.edu/ml/datasets/wine+quality). Explicitly specify the data source by providing a specific link and the name of the data set (e.g., red wine, white wine or both) and the method of acquisition (e.g., direct from the URL or a download of the .csv file). The steps taken need to be clearly stated. (2.1).
2. Read this data set into your Notebook. (2.1).
3. Describe the data set inclusive of variables, units and levels. (2.2).
4. Verify the data quality by analysing the data set for structure and missing data. (2.3).
5. Conduct an initial data exploration using data visualisation, reporting and querying of the data. (2.4).
6. Use the pairplot function in seaborn to determine the relationship, if any, between the variables. Include the output or the visualisation of the pairplot function in your Notebook and comment on it. (2.4.2).
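The Stage 2 steps above can be sketched as follows. This is an illustration under stated assumptions: the direct file URL in `RED_WINE_URL` is assumed rather than taken from the brief, the UCI files are assumed to be semicolon-separated, and the helper names `load_wine`, `quality_report` and `show_pairplot` are hypothetical. The demonstration at the bottom uses a made-up three-row extract so that it runs without a network connection.

```python
import io

import pandas as pd

# Assumed direct link to the red wine file; verify it against the UCI page.
RED_WINE_URL = (
    "https://archive.ics.uci.edu/ml/machine-learning-databases/"
    "wine-quality/winequality-red.csv"
)


def load_wine(source) -> pd.DataFrame:
    """Read a wine quality CSV from a URL, a file path or a file-like object."""
    return pd.read_csv(source, sep=";")


def quality_report(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise the dtype, missing values and distinct values of each column."""
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "missing": df.isna().sum(),
            "unique": df.nunique(),
        }
    )


def show_pairplot(df: pd.DataFrame, hue: str = "quality"):
    """Draw a seaborn pairplot; seaborn is imported here so the rest runs without it."""
    import seaborn as sns

    return sns.pairplot(df, hue=hue)


# Offline demonstration on made-up rows (not real measurements).
demo_csv = io.StringIO(
    '"fixed acidity";"alcohol";"quality"\n'
    "7.4;9.4;5\n"
    "7.8;;5\n"
    "11.2;9.8;6\n"
)
demo = load_wine(demo_csv)
report = quality_report(demo)
print(demo.shape)
print(report)
```

In the Notebook, `load_wine(RED_WINE_URL)` would read the data directly from the URL, `df.describe()` would give the summary statistics for step 3, and `show_pairplot(df)` would produce the visualisation for step 6.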

Stage 3: Data Preparation
1. Select the data that you will use for the analysis. (3.1).
2. Clean the data you have selected to improve the quality of the data. (3.2).
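A minimal sketch of these two steps is shown below. It assumes that cleaning means dropping exact duplicate rows and rows with missing values, which is only one of several defensible choices; the function name `prepare` and the sample rows are hypothetical, and the target coding assumes 1 = low quality (below 6) and 0 = high quality.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def prepare(df: pd.DataFrame, target: str = "quality", threshold: int = 6):
    """Clean df and return (X, y), where y is 1 for low quality and 0 for high."""
    # Assumed cleaning policy: remove exact duplicates, then incomplete rows.
    cleaned = df.drop_duplicates().dropna()
    X = cleaned.drop(columns=[target])
    y = (cleaned[target] < threshold).astype(int)
    return X, y


# Made-up rows for illustration: one exact duplicate and one missing value.
raw = pd.DataFrame(
    {
        "alcohol": [9.4, 9.4, 10.5, None, 12.0, 11.1],
        "pH": [3.5, 3.5, 3.2, 3.3, 3.1, 3.4],
        "quality": [5, 5, 6, 7, 8, 4],
    }
)
X, y = prepare(raw)

# Hold back part of the data so the model can be assessed on unseen rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
print(X.shape, y.tolist())  # prints (4, 2) [1, 0, 0, 1]
```

Whatever cleaning policy is chosen, the number of rows removed and the reason for removing them should be recorded in the Notebook.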

Stage 4: Modelling
1. For this Assessment, you are only required to consider one classification modelling technique (e.g., a decision tree).
2. Import the decision tree model in your code. (4.1).
3. Record any modelling assumptions. (4.2).
4. Run your model over the data set. (4.3).
5. Record the parameter settings, your rationale for your choice of values and the actual model generated. (4.3).
6. Revise any parameter settings for subsequent model runs. Document all the revisions until the best model is reached. (4.4).
7. Assess the model or models according to the performance measurement set to meet your evaluation criteria. The AUC-ROC curve is useful for the performance measurement of classification.
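The run, record, assess and revise cycle above can be sketched as a loop that logs each model run. This is an illustration only: it uses synthetic data from `make_classification` in place of the wine data, varies a single hyperparameter (`max_depth`) with arbitrary values, and the `run_log` structure is a hypothetical way of keeping the record asked for in steps 5 and 6.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in with 11 features; replace with the prepared wine data.
X, y = make_classification(n_samples=600, n_features=11, n_informative=5, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

run_log = []  # one entry per model run, to document every revision
for max_depth in [None, 2, 4, 8]:
    model = DecisionTreeClassifier(max_depth=max_depth, random_state=1)
    model.fit(X_train, y_train)
    # Probability of class 1 is needed for the ROC curve and its AUC.
    scores = model.predict_proba(X_val)[:, 1]
    run_log.append(
        {"max_depth": max_depth, "auc": roc_auc_score(y_val, scores), "model": model}
    )

# Keep the run with the highest validation AUC and compute its ROC curve points.
best = max(run_log, key=lambda run: run["auc"])
fpr, tpr, thresholds = roc_curve(y_val, best["model"].predict_proba(X_val)[:, 1])

for run in run_log:
    print("max_depth =", run["max_depth"], "AUC =", round(run["auc"], 3))
```

The `fpr` and `tpr` arrays can be passed to a plotting library to draw the AUC-ROC curve. Because the best run is chosen on the validation split, a separate untouched test split would be needed for an unbiased final estimate.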

Stage 5: Evaluation
1. Assess the ML results. Ensure you include a statement as to whether the model meets the initial objective.
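A statement on whether the model meets the objective is easier to defend when it is tied to explicit numbers. The sketch below uses hand-made labels and scores rather than real model output, assumes 1 = low quality is the positive class, and the `TARGET_AUC` value is a hypothetical objective that would come from the Business Understanding stage.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score

# Hand-made example values, not results from the wine data.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]  # predicted P(low quality)
y_pred = [int(score >= 0.5) for score in y_score]  # default 0.5 decision threshold

cm = confusion_matrix(y_true, y_pred)  # rows: true class, columns: predicted class
accuracy = accuracy_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_score)

TARGET_AUC = 0.80  # hypothetical objective set during Business Understanding
meets_objective = auc >= TARGET_AUC

print(cm.tolist())  # prints [[3, 1], [1, 3]]
print("accuracy:", accuracy)  # prints accuracy: 0.75
print("AUC:", auc)  # prints AUC: 0.9375
print("meets objective:", meets_objective)  # prints meets objective: True
```

The confusion matrix shows which kind of error the model makes, which matters when mislabelling a low-quality wine as high quality is costlier than the reverse.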

Stage 6: Deployment

1. For this Assessment, you are not required to deploy your model. For this stage, simply include any lessons that you learned and that you wish to share in relation to the things that went right and wrong, the areas in which you did well and in which you could improve. You can also detail any of your other experiences in completing this Assessment.

Stage 7: Presentation
1. Once complete, you should set up the Jupyter Notebook for screen recording. You are required to make a screen recording of your Jupyter Notebook and a webcam video of yourself narrating for 7-10 minutes. You should state your name and any other student details at the beginning. Work your way through the Notebook as you discuss the key aspects of the CRISP-DM steps, the lessons you learned and any other experiences.

2. A wide variety of tools are available to record videos of a webcam and screen simultaneously (i.e., picture in picture). In this case, your video will show you discussing your Notebook on your screen. Use the large screen for your Notebook. Available tools include the inbuilt recorder for Windows 10, QuickTime on Apple, fluvid.com, panopto.com or Zoom. Owing to the size of the video file, you will submit the URL for the file rather than the file itself. Practise the presentation beforehand to ensure clarity and conciseness.

Attachment:- Machine Learning.rar




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd