Reference no: EM133180761
Data Mining Algorithms & Techniques
Objectives
1. This assignment involves selecting a topic and a relevant dataset; defining the aims and objectives of mining; designing and implementing the right mining techniques; and reporting the results.
2. To successfully apply a set of data mining skills imparted in this module to a previously unseen dataset to achieve knowledge discovery.
3. Conduct an extensive and comprehensive literature review related to the selected problem.
1. Conference style paper (an 8-page report in IEEE conference format) (pdf format).
2. A comprehensive report file that covers all aspects of the work (pdf format).
3. A single zip containing the following:
For each part, a set of supporting files including but not limited to the following, which should be clearly referenced from your documentation. You only need to submit the files relevant to the techniques you have explored.
• The original dataset (if the dataset is large, then a link should be enough)
• If Weka is used, then the following files are needed:
o dataset.arff
o trainingSet.arff
o testingSet.arff
o ...
• If Python is used, then a notebook file that runs in Jupyter Notebook is needed.
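For groups that prepare their data in Python but run their experiments in Weka, the ARFF files listed above could be generated along the following lines. This is only a minimal sketch under stated assumptions: the helper names `to_arff` and `split_rows`, the toy attributes, the 70/30 split and the fixed seed are all illustrative and are not prescribed by this brief. It handles only numeric and nominal attributes and does not quote values that contain spaces or commas.

```python
import random


def to_arff(relation, attributes, rows):
    """Render rows as ARFF text (hypothetical helper).

    attributes is a list of (name, kind) pairs, where kind is either the
    string "numeric" or a list of allowed nominal values.
    """
    lines = ["@relation " + relation, ""]
    for name, kind in attributes:
        if isinstance(kind, str):
            lines.append("@attribute " + name + " " + kind)
        else:
            lines.append("@attribute " + name + " {" + ",".join(kind) + "}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


def split_rows(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and return (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Toy data, invented purely for illustration.
attributes = [
    ("age", "numeric"),
    ("income", "numeric"),
    ("label", ["yes", "no"]),
]
rows = [[20 + i, 1000 * i, "yes" if i % 2 == 0 else "no"] for i in range(10)]

train_rows, test_rows = split_rows(rows)
full_text = to_arff("toy", attributes, rows)
train_text = to_arff("toy_train", attributes, train_rows)
test_text = to_arff("toy_test", attributes, test_rows)
```

Each returned string can then be written to the corresponding `.arff` file with an ordinary `open(path, "w")` call. Keeping the seed fixed means the same train and test files are produced on every run, which supports the reproducibility that the report is expected to demonstrate.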
Your Task: Classification/Association/Clustering/Time Series
Choosing Your Dataset
• Your dataset should concern a real-world problem that lends itself to easy understanding.
• It should not have been used by another group during the semester.
Deliverables
1. By the end of this part, you are expected to produce an IEEE conference style paper (max 8 pages) that covers all aspects of data mining as discussed in the module. You must identify a testable, answerable, non-trivial research question and then formulate a methodology to answer that question, using one of the data mining frameworks (KDD or CRISP-DM).
You are expected to conduct an extensive literature review on the selected problem. Your techniques and methods should be informed by this review, which should draw on recently published work from reputable sources that is relevant to your topic.
The suggested paper structure:
i. Abstract
ii. Introduction
iii. Related Work
iv. Methodology
v. Evaluation and results
vi. Conclusions and Future Work
vii. References
Within your paper, you should be able to cover the following points:
i. Description of your dataset
ii. Preprocessing and EDA
iii. Training, testing and validation sets
iv. Classifier(s) used / Association / Clustering
v. Optimisation.
2. Detailed report with clear snapshots (max 10 pages in pdf) and supporting files that show the detailed steps for producing the results and the conclusions presented in the paper (this will ensure the reproducibility of the work).
3. Video presentation
i. 10 minutes max
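The training, testing and validation sets that the paper is expected to describe could be produced as sketched below. This is a hedged illustration only: the function name `stratified_split`, the 60/20/20 proportions, the seed and the toy labels are assumptions rather than requirements of this brief. The sketch splits each class separately so that all three subsets keep roughly the same class balance as the full dataset.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return (train, validation, test) lists of row indices.

    Indices are grouped by class label and each group is shuffled and cut
    separately, so class proportions are approximately preserved.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, validation, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = int(round(len(indices) * fractions[0]))
        n_val = int(round(len(indices) * fractions[1]))
        train += indices[:n_train]
        validation += indices[n_train:n_train + n_val]
        # Whatever remains after the first two cuts goes to the test set.
        test += indices[n_train + n_val:]
    return train, validation, test


# Toy, imbalanced labels invented for illustration.
labels = ["spam"] * 20 + ["ham"] * 80
train_idx, val_idx, test_idx = stratified_split(labels)
```

The validation indices would typically be used for tuning during the optimisation step, with the test indices held back until the final evaluation so that the reported results are not biased by that tuning.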
Attachment:- Data Mining Algorithms.rar