How data mining should be considered

Reference no: EM133487140

Data Mining and Analysis

Learning Outcome 1: Describe and explain the concepts of data mining and business analytics.

Learning Outcome 2: Critically review and appreciate the role of data mining in business analytics.

Learning Outcome 3: Critically explain how and why data mining and business analytics can be used to create competitive advantage for businesses and enterprises.

Learning Outcome 4: Critically analyse when, why, and how data mining should be considered a possible problem-solving strategy from a business perspective.

Learning Outcome 5: Gain sufficient working knowledge of SAS Enterprise Miner and SAS Enterprise Guide for performing data exploration, modelling, model comparison, and reporting with real-world case studies.

Overview

The objective of this individual assignment is to evaluate your understanding of the basic theory, concepts, methods, and algorithms of data mining, and to assess your skill in applying appropriate Python packages, such as NumPy, pandas, Matplotlib, and scikit-learn, to carry out a data mining project.

The dataset contains all road accidents in London boroughs that were reported to the police over a certain period of time. Your role in this project is twofold: business client and data analyst. As a business client, you are expected to raise meaningful business concerns or problems in relation to the data given. As a data analyst, you are required to follow a proper data mining methodology and apply the techniques covered in lectures to analyse the data and address the business concerns and problems raised.

You must contact the module leader to find out which borough's data in the dataset you need to analyse.

Tasks

You are required to undertake the following tasks:

1. Problem Identification

1.1. Read the data description file (metadata) to learn the basic characteristics of the dataset, including the business context associated with the data; the total number of attributes (dimensions, variables); the data type of each attribute; the value range, mode, skewness, and kurtosis of each attribute; and the total number of instances. Carry out simple data exploration with essential plots.

1.2. Identify a set of meaningful business problems of interest with regard to the data for analysis.

1.3. Identify what data mining tasks need to be performed in order to address the business problems raised.
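The attribute profiling asked for in Task 1 could be sketched with pandas roughly as follows. This is only an illustration on synthetic data: the column names (casualties, speed_limit, severity) and the helper profile() are hypothetical and are not taken from the real accident dataset or its metadata.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data. The column names are hypothetical and are
# NOT taken from the real accident dataset or its metadata file.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "casualties": rng.poisson(1.5, size=200),
    "speed_limit": rng.choice([20, 30, 40], size=200),
    "severity": rng.choice(["slight", "serious", "fatal"],
                           size=200, p=[0.8, 0.17, 0.03]),
})


def profile(frame: pd.DataFrame) -> pd.DataFrame:
    """Summarise each numeric attribute: range, mode, skewness, kurtosis."""
    numeric = frame.select_dtypes(include="number")
    return pd.DataFrame({
        "min": numeric.min(),
        "max": numeric.max(),
        "mode": numeric.mode().iloc[0],   # first mode if there are ties
        "skewness": numeric.skew(),
        "kurtosis": numeric.kurt(),       # excess kurtosis (normal = 0)
    })


print(df.shape)                       # (number of instances, number of attributes)
print(df.dtypes)                      # data type of each attribute
print(profile(df))                    # numeric summary per attribute
print(df["severity"].value_counts())  # class distribution of a categorical attribute
```

Histograms or bar charts of the same columns (for example with Matplotlib) would cover the plotting part of the exploration.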

2. Data Preparation

Determine which variables are to be used in which analysis. See also 1.2 and 1.3 in Task 1.

Get your data for analysis. Choose appropriate methods for data pre-processing, including detecting and dealing with incorrect data types, irrelevant variables, missing values, outliers, imbalanced classes, and duplicates; changing data types; and, where appropriate, conducting dimensionality reduction, feature extraction, data transformation, data partitioning, and normalisation. See also 1.1 in Task 1.
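A few of the pre-processing steps listed above (duplicates, incorrect data types, missing values, irrelevant variables, partitioning, and normalisation) might look like this in pandas and scikit-learn. The data is synthetic and every column name is a hypothetical placeholder; median imputation and standardisation are just one possible choice among several.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with hypothetical column names.
rng = np.random.default_rng(0)
n = 200
raw = pd.DataFrame({
    "accident_id": np.arange(n),                          # identifier, not a feature
    "vehicles": rng.integers(1, 5, size=n).astype(float),
    "speed_limit": rng.choice(["20", "30", "40"], size=n),  # numbers stored as text
    "serious": rng.choice([0, 1], size=n, p=[0.8, 0.2]),  # imbalanced target
})
raw.loc[raw.index % 25 == 0, "vehicles"] = np.nan         # inject missing values
raw = pd.concat([raw, raw.iloc[1:11]], ignore_index=True)  # inject duplicate rows

# 1. Remove exact duplicate records.
clean = raw.drop_duplicates().copy()
# 2. Fix an incorrect data type (text to integer).
clean["speed_limit"] = clean["speed_limit"].astype(int)
# 3. Impute missing values with the median (one simple option).
clean["vehicles"] = clean["vehicles"].fillna(clean["vehicles"].median())
# 4. Drop the irrelevant identifier and separate features from target.
X = clean[["vehicles", "speed_limit"]]
y = clean["serious"]

# 5. Partition the data; stratify keeps the class ratio in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# 6. Normalise using statistics learned from the training part only,
#    so no information leaks from the test part.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(len(raw), "rows before cleaning,", len(clean), "after")
```

Fitting the scaler on the training partition alone is a deliberate design choice: it keeps the test partition untouched until evaluation.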

3. Model Construction

With the pre-processed dataset, undertake the data mining tasks you have identified in 1.2. You are required to apply two different algorithms for both predictive and descriptive modelling. For descriptive modelling, you may choose to use k-means clustering and various EDA (Exploratory Data Analysis) methods, e.g., histograms, bar charts, and Pearson's correlation coefficient. For predictive modelling, you may use, for example, decision trees and artificial neural networks, or decision trees and k-nearest neighbours.

In order to build the most appropriate and accurate models and identify meaningful hidden patterns, different settings for the relevant model parameters should be considered for each of the selected algorithms and methods.
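Trying different parameter settings, as described above, could be sketched as follows. This is a minimal illustration on synthetic data from make_classification, not the accident dataset; the parameter grids are arbitrary assumptions, and the silhouette score is only one possible way to compare values of k.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.metrics import silhouette_score
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (not the accident dataset).
X, y = make_classification(n_samples=300, n_features=6,
                           n_informative=4, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# Descriptive modelling: k-means for several k, compared by silhouette score.
silhouettes = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    silhouettes[k] = silhouette_score(X_scaled, labels)
best_k = max(silhouettes, key=silhouettes.get)
print("silhouette by k:", silhouettes, "-> best k:", best_k)

# Predictive modelling 1: decision tree, tuned by 5-fold cross-validation.
tree_search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 4, 6, None], "min_samples_leaf": [1, 5, 20]},
    cv=5,
)
tree_search.fit(X, y)

# Predictive modelling 2: k-nearest neighbours. Scaling sits inside the
# pipeline so that it is refitted on each training fold.
knn_search = GridSearchCV(
    make_pipeline(StandardScaler(), KNeighborsClassifier()),
    param_grid={"kneighborsclassifier__n_neighbors": [3, 5, 11, 21]},
    cv=5,
)
knn_search.fit(X, y)

print("tree:", tree_search.best_params_, round(tree_search.best_score_, 3))
print("k-NN:", knn_search.best_params_, round(knn_search.best_score_, 3))
```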

4. Model Interpretation and Evaluation

Interpret the descriptive models created, such as the clusters produced by the k-means algorithm, the correlations among variables, and the relevant plots.

Compare the performance of the different predictive models in terms of accuracy, error rate, generalisation capability (over-fitting), simplicity, and cost, where appropriate.

Discuss the meaningfulness and usefulness of the models built and the patterns revealed, and how the models and the patterns can be used to address the original business concerns. This includes both descriptive and predictive models.
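The comparison of predictive models in Task 4 might be tabulated along these lines. The sketch uses synthetic data and three arbitrarily chosen models; the gap between training and test accuracy is used here as a rough indicator of over-fitting, which is an assumption of this example rather than a prescribed measure.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (not the accident dataset).
X, y = make_classification(n_samples=400, n_features=6,
                           n_informative=4, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

# Hypothetical candidate models chosen for illustration only.
models = {
    "decision tree (depth 3)": DecisionTreeClassifier(max_depth=3, random_state=1),
    "decision tree (unpruned)": DecisionTreeClassifier(random_state=1),
    "k-NN (k=5)": KNeighborsClassifier(n_neighbors=5),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, model.predict(X_train))
    test_acc = accuracy_score(y_test, model.predict(X_test))
    results[name] = {
        "train_accuracy": train_acc,
        "test_accuracy": test_acc,
        "test_error_rate": 1.0 - test_acc,
        # A large positive gap suggests the model memorised the training data.
        "overfit_gap": train_acc - test_acc,
    }

for name, row in results.items():
    print(f"{name:26s} " + "  ".join(f"{k}={v:.3f}" for k, v in row.items()))

# Simplicity can be compared too, e.g. by the number of nodes in each tree.
for name in ("decision tree (depth 3)", "decision tree (unpruned)"):
    print(name, "nodes:", models[name].tree_.node_count)
```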

5. A summary of the main findings of the project.

Harvard Referencing
