Explore the data to gain insights

Reference no: EM132943296

Applied Machine Learning Report

1 Report Overview
Given the "Life Expectancy Data.csv" dataset, build a model to predict a country's "Life expectancy" using some of the following features from the dataset:
• Year;
• Status;
• Adult Mortality;
• infant deaths;
• Alcohol;
• percentage expenditure;
• Hepatitis B;
• Measles;
• BMI;
• under-five deaths;
• Polio;
• Total expenditure;
• Diphtheria;
• HIV/AIDS;
• GDP;
• Population;
• thinness 1-19 years;
• thinness 5-9 years;
• Income composition of resources;
• Schooling.

Ideally, your report should contain the following contents corresponding to the machine learning project checklist we discussed during Week 2's lecture.

2.1 Frame the Problem
At this initial step, first consider what type of machine learning solution the problem calls for, e.g.:
• supervised or unsupervised learning;
• batch or mini-batch/online learning;
• instance-based or model-based.

2.2 Get the Data
Preferably, the data should be loaded automatically from a fixed folder on your local machine, e.g., see the download script from Slide No. 134 of Week 2's lecture. It is also a good idea to convert the dataset into a pandas DataFrame.
Examine the general structure of the dataset and check for missing (null) values in the columns (attributes) of some instances (you may also profile your dataset using the info() method of pandas DataFrame objects). Recall also that this is the step at which you should create your test set.
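As a minimal sketch of this step, the snippet below loads a CSV into a pandas DataFrame, profiles it, and sets aside a test set. The helper names (load_life_expectancy, split_train_test), the 20% test size, and the tiny made-up frame are assumptions for illustration only; in the real file, column names may carry stray whitespace, which is why they are stripped here.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def load_life_expectancy(csv_path):
    """Read the CSV into a DataFrame and strip stray whitespace from column names."""
    df = pd.read_csv(csv_path)
    df.columns = df.columns.str.strip()
    return df


def split_train_test(df, test_size=0.2, seed=42):
    """Set aside a test set before any further exploration (sizes are assumptions)."""
    return train_test_split(df, test_size=test_size, random_state=seed)


# Tiny made-up frame standing in for the real file (values are illustrative only).
demo = pd.DataFrame({
    "Year": [2010, 2011, 2012, 2013, 2014],
    "Status": ["Developing", "Developed", "Developing", "Developing", "Developed"],
    "Alcohol": [1.2, None, 3.4, 0.5, 9.1],
    "Life expectancy": [60.1, 80.3, 65.2, None, 81.0],
})

demo.info()                      # column dtypes and non-null counts
missing = demo.isnull().sum()    # number of missing values per column
train_set, test_set = split_train_test(demo)
```

With the real data, the call would look like load_life_expectancy("Life Expectancy Data.csv"), assuming the file sits in the working folder.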

2.3 Explore the Data to Gain Insights
Visualise the data to look for possible correlations. You may also want to experiment with different attribute combinations.
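One possible way to look for correlations numerically, before plotting, is to rank each numeric attribute by its Pearson correlation with the target. The sketch below uses synthetic data in place of the real training set, and the combined attribute schooling_per_mortality is a hypothetical example, not one suggested by the brief.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the training set (values are made up for illustration).
rng = np.random.default_rng(0)
n = 200
schooling = rng.uniform(4, 18, n)
adult_mortality = rng.uniform(50, 400, n)
demo = pd.DataFrame({
    "Schooling": schooling,
    "Adult Mortality": adult_mortality,
    "Life expectancy": 50 + 1.5 * schooling - 0.03 * adult_mortality
                       + rng.normal(0, 1, n),
})

# Pearson correlation of every numeric column with the target, strongest first.
corr_with_target = (
    demo.select_dtypes("number")
        .corr()["Life expectancy"]
        .drop("Life expectancy")
        .sort_values(key=abs, ascending=False)
)
print(corr_with_target)

# Experimenting with an attribute combination (hypothetical, for illustration).
demo["schooling_per_mortality"] = demo["Schooling"] / demo["Adult Mortality"]
combo_corr = demo["schooling_per_mortality"].corr(demo["Life expectancy"])
```

For a visual check, pandas.plotting.scatter_matrix can be applied to a handful of the most correlated columns.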

2.4 Prepare the Data for Machine Learning Algorithms
At this step, you may consider:
• Data cleansing: null/missing values cannot be handled by some machine learning algorithms.
• Handling non-numerical data: convert text/categorical data into numerical data.
• Custom transformers: create your own custom transformers, e.g., see the code in Slide No. 322 of Week 2's lecture, which introduces combined attributes as new features.
• Feature scaling: some machine learning algorithms (e.g., SVMs) are sensitive to unscaled features, so consider scaling the features for these algorithms.
• Transformation pipelines: ideally, automate the whole data transformation and training process, e.g., see Slide No. 348 of Week 4's lecture.
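The preparation steps above can be sketched as a single scikit-learn transformation pipeline. This is only one possible arrangement: the RatioAdder class, the choice of GDP and Population as numeric columns, the median and most-frequent imputation strategies, and the demo values are all assumptions for illustration, not the code from the lecture slides.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


class RatioAdder(BaseEstimator, TransformerMixin):
    """Append the ratio of two columns as a new feature (hypothetical combination)."""

    def __init__(self, numerator=0, denominator=1):
        self.numerator = numerator
        self.denominator = denominator

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        ratio = X[:, self.numerator] / X[:, self.denominator]
        return np.c_[X, ratio]


numeric_cols = ["GDP", "Population"]      # assumed subset of numeric features
categorical_cols = ["Status"]

numeric_pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),        # data cleansing
    ("ratio", RatioAdder(numerator=0, denominator=1)),   # custom transformer
    ("scale", StandardScaler()),                         # feature scaling
])
categorical_pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),  # text -> numerical
])
preprocess = ColumnTransformer(
    [("num", numeric_pipeline, numeric_cols),
     ("cat", categorical_pipeline, categorical_cols)],
    sparse_threshold=0,  # always return a dense array
)

# Made-up rows with missing values, standing in for the real training set.
demo = pd.DataFrame({
    "Status": ["Developing", "Developed", np.nan, "Developing"],
    "GDP": [500.0, 40000.0, 1200.0, np.nan],
    "Population": [1e6, 5e6, np.nan, 2e6],
})
prepared = preprocess.fit_transform(demo)
```

Fitting the pipeline on the training set only and then calling transform on the test set keeps information from the test set out of the imputation and scaling statistics.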

2.5 Select and Train a Model
2.5.1 Consider several models and evaluate using cross-validation
For this step, you may further consider training several models, e.g.:
• Linear/logistic/softmax regression;
• Polynomial regression;
• SVM regression;
• Decision trees/random forests;
• Ensemble learning;
• Artificial neural networks, etc.
Each model can be further evaluated using cross-validation, e.g., see Slides No. 378-415. Preferably, you should also discuss why you have not considered some of the models above in your machine learning solution. Also consider the computational cost of training your models and of generating predictions from them.
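A rough sketch of comparing several candidate models with cross-validation follows. The data is synthetic, and the model choices, hyperparameters, and 5-fold setting are assumptions. Logistic and softmax regression are classifiers, so they are left out here because the target is continuous.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for the prepared training set.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
y = 70 + 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.5, 150)

models = {
    "linear": LinearRegression(),
    "tree": DecisionTreeRegressor(random_state=0),
    "forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "svr": make_pipeline(StandardScaler(), SVR()),  # SVMs need scaled inputs
}

# scikit-learn maximises scores, so MSE is reported negated; flip the sign
# and take the square root to get an RMSE per fold, then average the folds.
rmse = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y,
                             scoring="neg_mean_squared_error", cv=5)
    rmse[name] = float(np.sqrt(-scores).mean())

for name, value in sorted(rmse.items(), key=lambda item: item[1]):
    print(f"{name}: mean CV RMSE = {value:.3f}")
```

Timing each cross_val_score call (for example with time.perf_counter) is one simple way to compare the computational cost of the candidates.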

2.5.2 Fine-tuning the model
You may further consider fine-tuning your model using:
• Grid/Randomized search;
• Performance measures, e.g., accuracy, precision, F1 score, mean squared error, etc.;
• Ensemble methods;
• Evaluating on the test set.
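The fine-tuning steps above might look like the following sketch, which runs a small grid search and then evaluates the best estimator once on the held-out test set. The data is synthetic and the parameter grid is an arbitrary assumption. Since the target is continuous, a regression measure (RMSE) is used; accuracy, precision, and F1 apply to classification tasks.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic regression data standing in for the prepared dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 70 + 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.5, 200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Small, arbitrary grid; a real search would cover more values.
param_grid = {"n_estimators": [10, 30], "max_features": [2, 4]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_squared_error",
    cv=3,
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the test set, once, at the very end.
best_model = search.best_estimator_
test_rmse = float(np.sqrt(mean_squared_error(y_test, best_model.predict(X_test))))
print(search.best_params_, f"test RMSE = {test_rmse:.3f}")
```

RandomizedSearchCV accepts the same estimator and scoring arguments and may be the cheaper option when the grid grows large.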

3 Machine Learning Solution Format
Your machine learning solution should be coded in Python, with the machine learning algorithm classes taken from the scikit-learn library. You should submit a zipped folder containing both your report document and your Python code. Your report should contain enough empirical evaluation and argument to show that your machine learning model is indeed fit enough.




