Explore an interesting and relevant machine learning dataset

Reference no: EM132206822

Machine Learning Assignment -

Machine learning is an active area of research with a high level of impact on real-world problems.

The objective of this assignment is to allow you to explore an interesting and relevant machine learning dataset using Scikit-Learn. More specifically, you will be required to pre-process the data, build and evaluate machine learning models, and write a report on the results.

You will also be required to pick a specific area to research. This research should be integrated into your methodology and evaluation (more detail on this below).

Dataset - Your initial task will be to select an appropriate dataset. You should select either a regression or a classification dataset. Ideally, you should pick a dataset to which machine learning algorithms have already been applied (although this is not essential). When selecting a dataset, identify the column that will act as the classification or regression target for the model.

  • Avoid using time series, text classification, image or audio data.
  • Avoid datasets where you have to spend time merging a number of disparate datasets.
  • I recommend that you limit your dataset to 25MB. To give you an example, 10-fold cross-validation took 13 seconds on a 14MB file, about 20 seconds on a 20MB file and 25 seconds on a 25MB file. These tests were carried out on an Intel i5 using a decision tree model. These times will vary significantly depending on the model and the characteristics of the data. Keep in mind that you will also need to do hyper-parameter optimization, which will take much longer. Please note that this is just a recommendation; if you are really interested in adopting a bigger dataset, please let me know.
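If you want to check how long 10-fold cross-validation takes on your own machine before committing to a dataset, a minimal sketch along the following lines may help. It assumes a classification task and uses synthetic data from `make_classification` as a stand-in for your dataset; `DecisionTreeClassifier` is chosen here only to mirror the decision tree test described above.

```python
import time

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption); replace X and y with your own
# feature matrix and target column.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

model = DecisionTreeClassifier(random_state=0)

# Time a full 10-fold cross-validation run.
start = time.perf_counter()
scores = cross_val_score(model, X, y, cv=10)
elapsed = time.perf_counter() - start

print(f"10-fold CV took {elapsed:.2f} s, mean accuracy {scores.mean():.3f}")
```

The measured time depends on your hardware, the model and the data, so treat it only as a rough guide to whether hyper-parameter optimization will be feasible.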

Project Overview -

The project requires you to build machine learning models for your chosen dataset. You will need to perform pre-processing on your data, following the pre-processing steps outlined in the Scikit-Learn lecture notes. You will then need to build and comprehensively evaluate a range of machine learning models. The most promising models should then undergo hyper-parameter optimization.
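The workflow described above (pre-process, compare a range of models, then tune the most promising one) could be sketched roughly as follows. This is an illustration under stated assumptions, not the required method: the data is synthetic, the imputation and scaling steps are placeholders rather than the steps from the lecture notes, and the candidate models and parameter grids are arbitrary examples.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption); replace with your own dataset.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def build_pipeline(model):
    """Wrap a model with example pre-processing steps (hypothetical choices)."""
    return Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
        ("model", model),
    ])


# Example candidates; the choice of models is an assumption.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Step 1: compare all candidates with cross-validation on the training set.
cv_means = {}
for name, model in candidates.items():
    scores = cross_val_score(build_pipeline(model), X_train, y_train, cv=5)
    cv_means[name] = scores.mean()
    print(f"{name}: mean CV accuracy {cv_means[name]:.3f}")

# Step 2: tune the most promising candidate with a small grid search.
best_name = max(cv_means, key=cv_means.get)
param_grids = {
    "logistic_regression": {"model__C": [0.1, 1.0, 10.0]},
    "decision_tree": {"model__max_depth": [3, 5, None]},
    "random_forest": {"model__max_depth": [3, 5, None]},
}
search = GridSearchCV(build_pipeline(candidates[best_name]), param_grids[best_name], cv=5)
search.fit(X_train, y_train)

# Step 3: report performance on the held-out test set.
print(f"best model: {best_name}, params: {search.best_params_}")
print(f"test accuracy: {search.score(X_test, y_test):.3f}")
```

Keeping the pre-processing inside the `Pipeline` means it is refitted within each cross-validation fold, so no information from the validation folds leaks into the training folds.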

You are also required to pick a specific topic to research and then incorporate the result of this research into your models and evaluate the impact. For example, if your dataset is imbalanced your research could focus on the techniques that are commonly used to address imbalance. You would then proceed to incorporate some of these into your evaluation and assess the impact on your results.
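As one illustration of the imbalance example above, the sketch below compares a classifier trained with and without class weighting. It assumes a synthetic dataset with roughly a 95/5 class split, and class weighting is only one of several commonly used techniques; the choice of `LogisticRegression` and of balanced accuracy as the metric are likewise assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data (an assumption): about 95% class 0, 5% class 1.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Train the same model with and without class weighting and compare.
results = {}
for label, class_weight in [("unweighted", None), ("balanced", "balanced")]:
    model = LogisticRegression(max_iter=1000, class_weight=class_weight)
    model.fit(X_train, y_train)
    # Balanced accuracy averages recall over classes, so the minority
    # class counts as much as the majority class.
    results[label] = balanced_accuracy_score(y_test, model.predict(X_test))
    print(f"{label}: balanced accuracy {results[label]:.3f}")
```

Whichever research topic you choose, the same pattern applies: evaluate a baseline, apply the technique, and report the difference using a metric suited to the problem.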

You should also compose a research report detailing the work you have undertaken and the overall findings. You will find a template for the research paper in the assignment folder. This template adheres to the Springer paper specification. The paper you submit should contain the following sections:

(i) Abstract

(ii) Introduction

(iii) Research

(iv) Methodology

(v) Evaluation

(vi) Conclusions and Future Work

Attachment:- Assignment Files.rar



Please note that you should upload all deliverable files (the dataset, the Python file and your report) in a single .zip file for submission. Note: Before fully deciding on your proposal, it is very important that you discuss the idea with me in order to validate its objectives and scope. I recommend that you do not exceed 8 pages for the research paper. I understand that some of you may have difficulty adhering to this limit. This is a recommended guideline, not a requirement, and you will not be penalized if you exceed the page limit. More detail on each of these sections is provided above.



Distribution of Marks - This project will account for 50% of your overall module grade. The marks will be broken down as follows:

  • Report - Abstract and Introduction [10%]
  • Report - Research [15%]
  • Report - Methodology [10%]
  • Report - Evaluation and Conclusions [35%]
  • Project Code [30%]

Each of the above components is described in more detail above.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd