Reference no: EM132870401
CMP9137M Advanced Machine Learning - University of Lincoln
Assessment Task
Learning Outcome 1: Critically appraise a range of machine learning techniques, identifying their strengths and weaknesses, and selecting appropriate methods to serve particular roles.
Learning Outcome 2: Analyse the "state of the art" in machine learning, including an understanding of current applications.
Learning Outcome 3: Use machine learning software to solve complex real-world problems in an application domain of interest.
TASK 1:
You are required to use Machine Learning to tackle the problem of "Site specific weed management (SSWM)". SSWM refers to the spatially variable application of a weed control strategy rather than spraying herbicides over the whole field. This minimises herbicide usage and thereby potentially reduces the adverse effects on the environment and ecosystem.
However, one of the important and challenging components of SSWM is developing a weed recognition system. This task consists of creating a weed-crop classifier that predicts a plant species from an image, for use in an SSWM implementation.
The data used in this task is from the following Kaggle competition
The names of plant species are listed as follows:
Black-grass, Charlock, Cleavers, Common Chickweed, Common wheat, Fat Hen, Loose Silky-bent, Maize, Scentless Mayweed, Shepherds purse, Small-flowered Cranesbill, Sugar beet
You will use an adjusted dataset from the original competition. The delivery team will make three datasets available to you via Blackboard (training, validation, and test). The class labels are contained in the file names. Note that the test data must not be involved in training: it should only be used to test your trained classifiers, by comparing its labels against the predictions of your models.
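Since the class labels are contained in the file names, a label can be recovered by matching each file name against the list of species above. The following is a minimal sketch only. The exact naming scheme of the Blackboard files is not stated here, so the code assumes merely that the species name appears somewhere in the file name, ignoring case, spaces, and punctuation. The function name `label_from_filename` and the example paths are hypothetical.

```python
import re
from pathlib import Path

# Species names exactly as listed in the brief.
SPECIES = [
    "Black-grass", "Charlock", "Cleavers", "Common Chickweed",
    "Common wheat", "Fat Hen", "Loose Silky-bent", "Maize",
    "Scentless Mayweed", "Shepherds purse",
    "Small-flowered Cranesbill", "Sugar beet",
]


def _normalise(text):
    """Lower-case the text and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


# Longest keys first, so a longer species name is never shadowed by a shorter one.
_LOOKUP = sorted(
    ((_normalise(name), name) for name in SPECIES),
    key=lambda pair: -len(pair[0]),
)


def label_from_filename(path):
    """Return the species whose name occurs in the file name (ASSUMED scheme)."""
    stem = _normalise(Path(path).stem)
    for key, name in _LOOKUP:
        if key in stem:
            return name
    raise ValueError(f"No known species name found in {path!r}")
```

For example, `label_from_filename("train/Loose Silky-bent_0042.png")` returns `"Loose Silky-bent"` under this assumption. If the real files follow a different scheme (for instance numeric class codes), the lookup would need to be adapted accordingly.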
You are expected to explore a range of machine learning classifiers, inspired by the various models and categories explored within the module and beyond (i.e. from reading and literature). At least one of the deep learning classifiers discussed in the lectures and/or workshops should be included as a baseline. In addition, at least one of your proposed classifiers should attempt to go beyond the module in terms of architecture, approach, and/or algorithmic details.
You will then investigate their performance, and compare and critique them to justify your recommended classifier(s). This should include metrics such as TP/FP rates, Precision-Recall, F-measure, and any other relevant metrics. In this assignment you are free to train any classifier, to do any pre-processing of the data, and to implement your own algorithm(s) instead of only using libraries. You are also encouraged to make your own implementations and to use libraries to train your models; for example, you can use any deep learning framework (e.g. Keras, TensorFlow, or PyTorch) to train a neural network. You will need to clearly mention your resources, acknowledge them appropriately, and compare the classifiers and their results in your report.
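The metrics named above (TP/FP rates, precision, recall, F-measure) can be computed per class by treating each species in turn as the positive class (one-vs-rest). The following is a minimal pure-Python sketch of that idea. The function names `per_class_metrics` and `macro_f1` are illustrative, and macro-averaging is only one possible way to summarise a multi-class result; it is an assumption here, not a requirement of the brief.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred, labels):
    """One-vs-rest TP rate, FP rate, precision, recall and F1 for each label."""
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    total = len(y_true)
    results = {}
    for c in labels:
        tp = pairs[(c, c)]
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        fp = sum(n for (t, p), n in pairs.items() if t != c and p == c)
        tn = total - tp - fn - fp
        # Guard every ratio against an empty denominator.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0  # identical to the TP rate
        fp_rate = fp / (fp + tn) if fp + tn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results[c] = {
            "tp_rate": recall,
            "fp_rate": fp_rate,
            "precision": precision,
            "recall": recall,
            "f1": f1,
        }
    return results


def macro_f1(results):
    """Unweighted mean of the per-class F1 scores."""
    return sum(m["f1"] for m in results.values()) / len(results)


# Toy example with three of the twelve species (made-up predictions).
y_true = ["Maize", "Maize", "Charlock", "Charlock", "Cleavers", "Cleavers"]
y_pred = ["Maize", "Charlock", "Charlock", "Charlock", "Cleavers", "Maize"]
metrics = per_class_metrics(y_true, y_pred, ["Maize", "Charlock", "Cleavers"])
```

In this toy example, Charlock has a recall of 1.0 but a precision of 2/3, because one Maize image was predicted as Charlock. Reporting the per-class values alongside an overall average makes such asymmetries visible when comparing classifiers.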
TASK 2:
You are required to use Machine Learning to tackle the problem of "Chatbot Learning". Your goal in this task is to train Sequence2Sequence models that receive text-based inputs (words in English) from a conversation partner and that output text-based responses (also in English). In this context, the part of the model that reads the input is called the encoder, and the part that generates the response is called the decoder.
You are required to use your knowledge acquired in the module regarding Sequence-to-Sequence models, and knowledge acquired from additional recommended readings. This will be useful for investigating the performance of those models, and for comparing and criticising them so you can recommend your best sequence-to-sequence model. You are expected to evaluate your models using the following metrics:
As in Task 1, (a) at least one of the Sequence2Sequence learners discussed in the lectures and/or workshops should be included as a baseline; and (b) at least one of your Sequence2Sequence learners should attempt to go beyond the module with regard to architecture, approach, and/or algorithmic details.
Three datasets of different size will be extracted by the delivery team and will be made available to you via Blackboard for training, validation and testing. While you can use the "training" and "validation" sets to report training results and to justify your choices of architectures and hyperparameters, you are required to use the "test" set only for testing your encoder-decoders.
In this assignment, you are free to train any sequence-to-sequence model (in any programming language, though Python is encouraged), to do any data pre-processing, and to implement your own solutions as much as possible. You are free to make use of libraries, but not to use fully available solutions. Please mention the resources you used, acknowledge them appropriately, and compare the models and their results in your report.