Implement an ML solution for a classification problem

Reference no: EM133129425, Length: 3500 words

COMP1804 Applied Machine Learning

Learning Outcome 1: Rationalise appropriate scenarios for Machine Learning applications and evaluate the choice of machine learning methods for given application requirements.

Learning Outcome 2: Demonstrate competency in using appropriate libraries/toolkits to solve given real-world Machine Learning problems and develop and evaluate suitable applications.

Learning Outcome 3: Understand and apply the relevant input data preparation and processing required for the Machine Learning models used, and quantitatively evaluate and qualitatively interpret the learning outcome.

Learning Outcome 4: Recognise and critically address the ethical, legal, social and professional issues that can arise when applying Machine Learning technologies.

Sub-task 1: Text Classification/regression - peer reviews.

This task is to implement an ML solution for text classification/regression (long texts). It uses a dataset of ML paper peer reviews from the International Conference on Learning Representations (ICLR) for the years 2017 to 2020 [1,2].
Specifically, you will use as input a text document that concatenates the title of the paper, the abstract of the paper, the review comments, and the final acceptance/rejection comment. This input should be used to predict the following attributes:
• Acceptance status (‘Accept’ or ‘Reject’)
• Review score (Integer number between 1 and 10).
Note that for the latter attribute you can choose whether to use multiclass classification or regression. You can choose whether to predict both features simultaneously or separately.
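If the review score is treated as a regression target, the model's real-valued outputs have to be mapped back onto the 1 to 10 integer scale before they can be compared with the labels. The following is a minimal sketch of that step only; the function names and the example numbers are made up for illustration and are not part of the assignment materials.

```python
def to_review_score(raw_prediction, low=1, high=10):
    """Round a real-valued regression output and clip it to [low, high].

    Note: Python's round() sends exact .5 ties to the nearest even integer.
    """
    rounded = int(round(raw_prediction))
    return max(low, min(high, rounded))


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up regressor outputs and labels, for illustration only.
raw_outputs = [0.3, 4.6, 7.2, 11.8]
true_scores = [1, 5, 6, 10]

predicted_scores = [to_review_score(r) for r in raw_outputs]
mae = mean_absolute_error(true_scores, predicted_scores)
```

A regression framing keeps the ordering of the scores (predicting 7 for a true 6 is a smaller error than predicting 1), whereas plain multiclass classification treats all wrong classes as equally wrong. Which framing suits the data is a design choice to justify in the report.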

Additionally, the dataset is provided with a further attribute, the reviewer confidence score (an integer number between 1 and 5), which is optional to use. If you want to explore the data further, a separate dataset with the text field split into the original fields "review comments", "paper title", "paper abstract" and "final acceptance/rejection comment" can be provided upon request.

Sub-task 2: Image classification - skin lesions.

This task is to implement an ML solution for a classification problem from images. Specifically, you are provided with images of skin lesions [3] and your task is to correctly predict the following attributes:
• Whether a skin lesion is benign or malignant (1 for ‘is_benign’, 0 for ‘is_malign’)
• The fine-grained diagnosis for the skin lesion (7 possible categories).
You can choose whether to predict both features simultaneously or separately. Additionally, the dataset is provided with a further attribute, the location of the skin lesion (for example, "scalp"), which is optional to use. If you want to explore the data further, a separate dataset with more attributes can be provided upon request. The dataset has been adapted to the requirements of this module; the original dataset was released under the terms of the CC BY-NC 4.0 licence by Tschandl et al. [3].

Sub-task 3: Image classification - advertisements.

This task is to implement an ML solution for a classification problem from images. Specifically, you are provided with images of advertisements [4] and your task is to correctly predict the topic of each advertisement.
• Images are of different sizes and there are 39 possible topic categories.
• You may choose to group together some of the categories (keeping no fewer than 12 categories). You should thoroughly discuss (and will be evaluated on) the reasons behind and the implications of grouping together different categories.
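One way to make a category grouping explicit and checkable is to store it as a mapping from original topic to merged group, and to verify that at least 12 categories remain before relabelling. The sketch below assumes placeholder topic names (`topic_00` to `topic_38`) and a hypothetical grouping; the real topic names and any sensible grouping come from the dataset itself.

```python
from collections import Counter

MIN_GROUPS = 12  # lower bound on remaining categories, taken from the brief

# Placeholder names: the real 39 topic names come from the dataset.
ALL_TOPICS = [f"topic_{i:02d}" for i in range(39)]

# Hypothetical grouping: original topic -> merged group name.
GROUPING = {
    "topic_00": "group_a", "topic_01": "group_a", "topic_02": "group_a",
    "topic_03": "group_b", "topic_04": "group_b",
}


def remaining_categories(all_topics, grouping):
    """Set of category names left after applying the grouping."""
    return {grouping.get(topic, topic) for topic in all_topics}


def apply_grouping(labels, grouping, all_topics, min_groups=MIN_GROUPS):
    """Relabel `labels`, refusing groupings that leave too few categories."""
    remaining = remaining_categories(all_topics, grouping)
    if len(remaining) < min_groups:
        raise ValueError(
            f"grouping leaves {len(remaining)} categories, need {min_groups}"
        )
    # Topics that are not mentioned in the grouping keep their own name.
    return [grouping.get(label, label) for label in labels]


sample_labels = ["topic_00", "topic_02", "topic_03", "topic_10"]
grouped_labels = apply_grouping(sample_labels, GROUPING, ALL_TOPICS)
group_counts = Counter(grouped_labels)  # label distribution after grouping
```

Comparing the label distribution before and after grouping (here via `Counter`) gives concrete evidence for the required discussion of why categories were merged and what that implies for class balance.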

Sub-task 4: Text classification - Amazon reviews.

This task is to implement an ML solution for a multi-task classification problem from text data (mostly short texts). Specifically, you are provided with Amazon reviews [5] (the text is the review title and the review main body joined together) and your task is to predict the following attributes:
• The number of stars associated with the review (on a scale of 1 to 5).
• Whether a product is from the category "Video Games" ("video_games") or "Musical Instrument" ("musical_instrument").
Note that for the first attribute you can choose whether to use multiclass classification or regression. You can choose whether to predict both features simultaneously or separately.

Additionally, the dataset is provided with a further attribute: whether the review is verified or not (either True or False), which is optional to use. If you want to explore the data further, a separate dataset with the text field split into the original fields "review title", "review main body" can be provided upon request.
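Predicting both attributes "simultaneously" can be done in several ways. One simple option, sketched here as an assumption rather than a recommended design, is to fold the star rating and the product category into a single joint class (5 x 2 = 10 classes) and decode the prediction afterwards. The helper names are hypothetical; only the category strings come from the brief.

```python
CATEGORIES = ("video_games", "musical_instrument")  # strings from the brief
N_STARS = 5


def encode_joint(stars, category):
    """Map (stars, category) to a single class id in the range 0..9."""
    if not 1 <= stars <= N_STARS:
        raise ValueError(f"stars must be between 1 and {N_STARS}")
    return CATEGORIES.index(category) * N_STARS + (stars - 1)


def decode_joint(class_id):
    """Inverse of encode_joint: recover (stars, category) from a class id."""
    category_index, star_index = divmod(class_id, N_STARS)
    return star_index + 1, CATEGORIES[category_index]


joint_id = encode_joint(5, "musical_instrument")
stars, category = decode_joint(joint_id)
```

A joint label lets any single-output classifier handle both attributes, but it multiplies the number of classes and spreads the training examples more thinly. Separate models, or one network with two output heads, are alternatives worth comparing.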

Tasks:

1. Practical Assignment (complete code that is executed without errors). The source code must be well documented and error free (i.e. no debugging necessary to run). For each dataset, the assignment includes:
o Exploratory Data Analysis (e.g. label distributions per attribute and per set).
o Data cleaning.
o Data Splitting (into training and test sets, optionally with a validation set as noted below) and Data Pre-processing (where appropriate: normalization/standardization, data augmentation, over/under-sampling, text processing).
o 2 ML Methodologies (a basic one & an additional one): appropriate ML methods should be used that have coherent implementations and sound pipelines, without any errors. If the basic ML method is a Neural Network, the additional one can be another Neural Network.
o Systematic experimentation: you should choose one parameter/attribute to change for each ML methodology (the attribute/parameter can be the same or different across the two methodologies) and show how it affects the results using clear and well-formatted figures and tables. Bonus points are given for experimenting using a validation dataset.
o Evaluation of the 2 methods using at least 2 metrics and showing 3-10 examples from the test dataset.
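The splitting and evaluation steps listed above can be sketched without any ML library. The example below is an illustration under stated assumptions: the 70/15/15 fractions, the function names, and the toy labels are all made up, and in practice a library routine would normally be used instead.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.15), seed=0):
    """Return (train, val, test) index lists with similar label proportions.

    `fractions` gives the train and validation shares; the rest is test.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)
    train, val, test = [], [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]
    return train, val, test


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denominator = 2 * tp + fp + fn
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# Toy imbalanced labels, for illustration only.
toy_labels = ["Accept"] * 20 + ["Reject"] * 80
train_idx, val_idx, test_idx = stratified_split(toy_labels)

# A model that always predicts the majority class.
y_true = ["Accept", "Reject", "Reject", "Reject"]
y_pred = ["Reject", "Reject", "Reject", "Reject"]
toy_accuracy = accuracy(y_true, y_pred)
toy_macro_f1 = macro_f1(y_true, y_pred)
```

In the toy evaluation, the majority-class predictor scores well on accuracy but poorly on macro F1, which is one reason the brief asks for at least two metrics. Tuning the chosen parameter on the validation indices and reporting final results on the test indices keeps the test set untouched during experimentation.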

2. Written Report:
• Document in IEEE conference format. Use the template available on Moodle or on Overleaf (make a copy of the Overleaf template).
• Should include references (citing other work) where appropriate (when images, data, code, or any other resources have been used from other sources).
• Document structure:
o Abstract: Briefly summarise what the report contains. That is: the task you are solving and why it is important; the outline of the ML methods you implemented and the systematic experimentation performed; the summary of your results and your conclusions. The abstract should be between 100 and 200 words.
o Introduction and related work: This section should talk about the following:
• The problem to be solved, why and to whom it matters, why it is challenging.
• Existing work related to your chosen task (it can be about the exact same task or a similar one).
• A brief overview of the dataset and the data pre-processing steps implemented.
• Your chosen ML implementations and a brief overview about why they are appropriate.
• What your systematic experiment is.
o Ethical discussion: Identify and discuss some of the social, ethical and legal implications of your chosen task, from data collection and processing to the ML prediction. The discussion should take into account communities and people that may be affected by the ML system.
o Dataset preparation: Describe exploratory data analysis, data cleaning, splitting and pre-processing and the reasons behind your design choices.
o ML methods: Describe and explain the 2 methods used and the reasons behind your design choices.
o Experiments and evaluation: Describe the systematic experimentation implemented for each ML method. Based on the experiments, evaluate, present, analyse and explain the performance of each method and the metrics used (why are the metrics appropriate?).
o Discussion and future work: Reflections on a) what worked well and what worked less well; b) reasons behind the performance obtained; c) how your work could be extended in the future and what additions could be made to it.
o Conclusions: A brief summary of the work done and what the main highlights were.
o References: All existing works and resources (code/images/etc) you used or talked about in your report must be cited properly.

