Reference no: EM132305801
TASMANIA ABALONE RING PREDICTION
Description
In this assignment we will apply machine learning techniques learned in the lectures and tutorials to analyse the population biology of abalone in Tasmania. In particular, we will predict the age of abalone from physical measurements. The age of an abalone can be estimated from its number of rings, which can be seen under a microscope. In this assignment you will predict the number of rings from the other attributes.
| Name | Data Type | Meas. | Description |
| ---- | --------- | ----- | ----------- |
| sex | nominal | | M, F, and I (infant) |
| length | continuous | mm | longest shell measurement |
| diameter | continuous | mm | perpendicular to length |
| height | continuous | mm | with meat in shell |
| whole weight | continuous | grams | whole abalone |
| shucked weight | continuous | grams | weight of meat |
| viscera weight | continuous | grams | gut weight (after bleeding) |
| shell weight | continuous | grams | after being dried |
| edible | boolean | | True and False |
| rings | integer | | +1.5 gives the age in years |
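The relationship between rings and age given in the last row of the table can be written as a one-line conversion. The function name `age_in_years` is illustrative only and is not part of the assignment material.

```python
def age_in_years(rings: int) -> float:
    """Estimate abalone age from the ring count, using the rule rings + 1.5."""
    return rings + 1.5

# Hypothetical example: an abalone with 9 rings.
example_age = age_in_years(9)
```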
Task 1: Data Collection
Identify irrelevant information from the data and remove it to clean the data. Hint: Use Weka or Excel.
Task 2: Data Pre-processing
Some values are missing for the height and rings attributes. Decide how to handle this issue and explain why.
Hint: Use Weka or Excel.
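If you prefer to write a program, the sketch below shows one possible strategy, not the required one: drop rows whose rings value is missing (rings is the prediction target) and fill missing height values with the median of the observed heights. The sample rows, the use of `None` as the missing-value marker, and the function name `handle_missing` are assumptions made for illustration. Note that dropping rows changes the total number of samples, which you will need to account for when splitting the data later.

```python
from statistics import median

# Hypothetical sample rows; None marks a missing value (an assumption,
# the real file may mark missing values differently, e.g. with "?").
rows = [
    {"height": 19.0, "rings": 15},
    {"height": None, "rings": 7},
    {"height": 27.0, "rings": None},
    {"height": 25.0, "rings": 10},
]

def handle_missing(rows):
    """Drop rows with missing rings, then median-fill missing heights."""
    # Keep only rows that have a target value; copy so the input is untouched.
    labelled = [dict(r) for r in rows if r["rings"] is not None]
    # Median of the heights that were actually observed in the kept rows.
    observed = [r["height"] for r in labelled if r["height"] is not None]
    fill_value = median(observed)
    for r in labelled:
        if r["height"] is None:
            r["height"] = fill_value
    return labelled

cleaned = handle_missing(rows)
```

Whatever strategy you choose, the report should justify it, for example why the median was preferred over the mean or over deleting the rows.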
Task 3: Data Transformation
We need to create a new attribute called volume from other attributes as: volume = length * diameter * height.
Normalise the data to the [0, 1] range.
Hint: Use Excel or write a program (if you know how to do it).
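For those writing a program, here is a minimal sketch of this task. It assumes min-max scaling, (x - min) / (max - min), as the normalisation method, which is one common way to reach the [0, 1] range. The sample values and the function names `add_volume` and `min_max_normalise` are hypothetical.

```python
def add_volume(rows):
    """Return copies of the rows with volume = length * diameter * height."""
    return [
        {**r, "volume": r["length"] * r["diameter"] * r["height"]}
        for r in rows
    ]

def min_max_normalise(rows, columns):
    """Rescale each listed column to [0, 1] via (x - min) / (max - min)."""
    scaled = [dict(r) for r in rows]
    for col in columns:
        values = [r[col] for r in rows]
        lo, hi = min(values), max(values)
        span = hi - lo
        for r in scaled:
            # A constant column has no spread; map it to 0.0 to avoid
            # dividing by zero.
            r[col] = (r[col] - lo) / span if span else 0.0
    return scaled

# Hypothetical measurements in mm.
rows = [
    {"length": 90.0, "diameter": 70.0, "height": 20.0},
    {"length": 70.0, "diameter": 50.0, "height": 18.0},
    {"length": 110.0, "diameter": 90.0, "height": 30.0},
]
with_volume = add_volume(rows)
normalised = min_max_normalise(
    with_volume, ["length", "diameter", "height", "volume"]
)
```

The volume is computed before normalising so that it is derived from the original measurements; the new column is then scaled along with the others.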
Task 4: Data Mining & Pattern Evaluation
Prepare your data to have:
- A training set of the first 2500 samples
- A validation set of the next 633 samples
- A test set of the last 1044 samples
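The three sets above add up to 2500 + 633 + 1044 = 4177 samples. A positional split can be sketched as follows; the function name `split_dataset` and the placeholder data are assumptions for illustration. The sketch raises an error if the sample count does not match, which can happen if rows were deleted during pre-processing.

```python
def split_dataset(samples, n_train=2500, n_val=633, n_test=1044):
    """Split samples by position into training, validation and test sets."""
    expected = n_train + n_val + n_test
    if len(samples) != expected:
        raise ValueError(f"expected {expected} samples, got {len(samples)}")
    train = samples[:n_train]                  # first 2500 samples
    val = samples[n_train:n_train + n_val]     # next 633 samples
    test = samples[-n_test:]                   # last 1044 samples
    return train, val, test

# Placeholder data: row indices stand in for the real records.
samples = list(range(4177))
train, val, test = split_dataset(samples)
```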
Run 15 machine learning algorithms and report their accuracy on the validation set in a table.
Select the algorithm that gives the highest accuracy on the validation set, then run it using the training set and the test set. Report its result on the test set.
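The selection procedure can be sketched in code as follows. This is an illustration only: it uses two toy algorithms (a majority-class baseline and a 1-nearest-neighbour classifier) in place of the 15 Weka algorithms, the data are made up, and it assumes that "run using the training set and test set" means fitting on the training set and measuring accuracy on the test set. All function names are hypothetical.

```python
from collections import Counter

def accuracy(predict, samples):
    """Fraction of samples whose predicted ring count equals the true one."""
    correct = sum(1 for features, rings in samples if predict(features) == rings)
    return correct / len(samples)

def fit_majority(train):
    """Baseline: always predict the most frequent ring count in training."""
    most_common = Counter(rings for _, rings in train).most_common(1)[0][0]
    return lambda features: most_common

def fit_one_nn(train):
    """1-nearest neighbour using squared Euclidean distance."""
    def predict(features):
        nearest = min(
            train,
            key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], features)),
        )
        return nearest[1]
    return predict

def select_best(candidates, train, val):
    """Fit every candidate on train and score it on val; return the winner."""
    scores = {name: accuracy(fit(train), val) for name, fit in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Made-up toy data: (normalised features, rings).
train = [((0.1,), 5), ((0.2,), 5), ((0.8,), 10), ((0.9,), 10), ((0.85,), 10)]
val = [((0.15,), 5), ((0.95,), 10), ((0.12,), 5)]
test = [((0.05,), 5), ((0.7,), 10)]

candidates = {"majority": fit_majority, "1-NN": fit_one_nn}
best_name, val_scores = select_best(candidates, train, val)
# Only the selected algorithm is evaluated on the test set.
test_accuracy = accuracy(candidates[best_name](train), test)
```

The test set is touched once, after the choice has been made on the validation set, so the reported test accuracy is not biased by the selection step.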
A table of top test results will be maintained. Email me your test-set results at any time (a screenshot of your results in Weka) to have your name added to the ranking.
Explain how the best algorithm works (in the report).
Tips: How to improve performance?
- Handle the missing data issue effectively, use data normalisation, and try different techniques learned from the lectures, selecting the ones that give the top accuracy on the validation set.
Task 5: Write a report
Write a report using the following template.
MAJOR ASSIGNMENT
Synopsis of the task and its context
This is an individual assignment making up 20% of the overall unit assessment. The assessment criteria for this task are:
1) Apply a machine learning pipeline to solve a real-world problem (biology of Tasmanian abalone).
a) Identify relevant data
b) Process and clean data
c) Transform data (create a new attribute and normalise the data)
d) Apply machine learning techniques to predict abalones' rings.
2) Write a scientific report (1.5-2 pages A4, double column)
a) Understand the impact of this work.
b) Analysis of the results.
c) Identify the best technique for this problem and understand how it works.
Unit learning outcomes
On successful completion of this unit...
1. understand the local and global impact of AI on individuals, organizations, and society
2. adapt and apply techniques for acquiring, representing, and reasoning with data, information, and knowledge
3. select and effectively apply techniques to develop simple AI solutions
4. analyze a problem, apply knowledge of AI principles, and use ICT technical skills to develop potential solutions
5. evaluate strengths and weaknesses of potential AI solutions
Attachment:- ARTIFICIAL INTELLIGENCE.rar