Reference no: EM132881600
Unsupervised learning is a machine learning technique in which users do not need to supervise the model. Instead, the model works on its own to discover previously undetected patterns and information. It mainly deals with unlabelled data. Unsupervised learning problems can be further grouped into clustering and association problems (for example: audience/customer segmentation, pattern recognition, medical diagnosis).
In this context you are requested to carry out a project with the following 7 phases:
Phase 1: Business Understanding
Provide the nature of your project: determine the scope of the business problem and its objectives. Describe what your project is about, including whether you will be performing data mining tasks, modifying some other system to incorporate data mining features, etc. It is critical that your problem is well defined.
Phase 2: Data understanding and preparation
Explore and collect data that will help solve the stated business problem. Prepare the data for further modeling procedures. Include the origin of the data set, an overview of the data set organization, attributes of the data, and challenges of the data set you've selected.
Phase 3: Data Mining Task
Provide the specific tasks you will perform on the data set. Include specific questions you will investigate, and the goals for the tasks. This should be independent of the specific techniques you will use to achieve your goals.
Phase 4: Methods and Models (unsupervised data mining techniques)
To achieve the goals you set in the data mining task section and to find valuable, hidden knowledge in the data, you need to apply two unsupervised data mining techniques (clustering and association) based on the type of dataset you are dealing with and your objectives (for example, you may apply the K-Means and Apriori algorithms).
You may use data mining packages (e.g. WEKA) or implement the data mining algorithms yourself in any programming language. Make clear in your report which existing software you are using.
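For students who choose to implement the algorithms themselves, the two example techniques named above can be sketched as follows. This is a minimal illustration in plain Python, not a required implementation: the function names, the toy data, and the deterministic farthest-point seeding of K-Means are assumptions made for the example, and a package such as WEKA would normally handle these details.

```python
from itertools import combinations


def dist2(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50):
    """Plain K-Means. Seeding (an assumption for this sketch): start from the
    first point, then repeatedly add the point farthest from chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    candidates = {frozenset([item]) for b in baskets for item in b}
    frequent = {}
    size = 1
    while candidates:
        # Keep the candidates whose support meets the threshold.
        level = {}
        for c in candidates:
            support = sum(1 for b in baskets if c <= b) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join frequent itemsets into candidates one item larger, then prune
        # any candidate that has an infrequent subset (the Apriori property).
        size += 1
        joined = {a | b for a in level for b in level if len(a | b) == size}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return frequent


# Toy data, invented for illustration only.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(points, k=2)

transactions = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}]
frequent = apriori(transactions, min_support=0.5)
```

On real data, numeric attributes would usually be scaled before clustering, and the minimum support threshold would be tuned to the size of the transaction set.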
Phase 5: Compare models and assess results
Compare the efficiency of the techniques used in phase 4, draw conclusions from the data models and assess their validity. Translate the results into a business decision and mention how they help the organization to improve decision-making processes and gain competitive advantage.
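One possible way to put numbers on the validity of each model is sketched below. The choice of metrics is an assumption for illustration, not a requirement of the brief: within-cluster sum of squares for a clustering result, and support, confidence and lift for a single association rule. The function names and toy data are hypothetical.

```python
def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances (lower means tighter clusters)."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, centroids[l]))
        for p, l in zip(points, labels)
    )


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    a, c = frozenset(antecedent), frozenset(consequent)
    support_a = sum(1 for b in baskets if a <= b) / n
    support_c = sum(1 for b in baskets if c <= b) / n
    support_ac = sum(1 for b in baskets if (a | c) <= b) / n
    confidence = support_ac / support_a  # how often the rule holds when a occurs
    lift = confidence / support_c        # values above 1 suggest a positive association
    return support_ac, confidence, lift


# Toy data, invented for illustration only.
wcss = inertia(
    points=[(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)],
    labels=[0, 0, 1],
    centroids=[(1.0, 0.0), (10.0, 0.0)],
)

transactions = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}]
support, confidence, lift = rule_metrics(transactions, {"butter"}, {"bread"})
```

Running time and memory use on the chosen data set can be reported alongside these measures when comparing the two techniques.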
Phase 6: Presentation and Visualization
Use data visualization tools and techniques to present and interpret the data mining results in a way that shows meaningful patterns in the data.
Phase 7: The Project Presentation
Each student should present his/her work in front of their classmates.
The assessment of the project presentation is based on the following criteria:
- Accuracy of the presentation;
- Presentation skills;
- Quality of discussion of each student in the group.
The deliverables of the project are:
A project report of at most 15 pages in MS Word format that includes most of the items below, plus other material if needed:
- data description
- problem definition
- data preprocessing
- data mining algorithms used and why
- evaluation, graphs of experiments, result tables
- screenshots if the program has an interesting user interface
- discussion on what was hard to achieve, limitations
- observations, conclusions
Attachment:- Project - Business.rar