Describe the steps involved in data mining

Reference no: EM132145547

Problem I (Answer each part in 75-150 words, with references, but do not quote directly)

What is data mining? In your answer, address the following:

- Is it another fad?

- Of the three prerequisite data science skills (database management, statistics, and machine learning), which one(s) are most important to master?

- Explain how the evolution of database technology led to data mining.

- Describe the steps involved in data mining when viewed as a process of knowledge discovery.

Problem II

Robust data loading poses a challenge in database systems because the input data are often dirty. In many cases, an input record may have several missing values and some records could be contaminated (i.e., with some data values out of range or of a different data type than expected).
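The three kinds of dirty input named above (missing values, out-of-range values, and values of the wrong data type) can be made concrete with a minimal sketch. The `SCHEMA` dictionary, its field names, and its ranges are hypothetical and chosen only for illustration; this is not a complete cleaning procedure, only one possible way to flag a record before it reaches the database.

```python
from typing import Any

# Hypothetical schema (an assumption, not from the assignment):
# field name -> (expected type, minimum allowed, maximum allowed).
SCHEMA = {
    "age": (int, 0, 120),
    "salary": (float, 0.0, 1_000_000.0),
}


def find_problems(record: dict[str, Any]) -> list[str]:
    """Return the problems found in one input record (empty list if clean)."""
    problems = []
    for field, (expected_type, low, high) in SCHEMA.items():
        value = record.get(field)
        if value is None or value == "":
            problems.append(f"{field}: missing")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(value, bool) or not isinstance(value, expected_type):
            problems.append(f"{field}: wrong type")
        elif not low <= value <= high:
            problems.append(f"{field}: out of range")
    return problems


clean = find_problems({"age": 34, "salary": 52000.0})
dirty = find_problems({"age": 250, "salary": "n/a"})
partial = find_problems({"salary": 10.0})
```

A loader built on this idea might insert only records for which `find_problems` returns an empty list and write the rest, together with their problem lists, to a separate reject file for review.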

Work out a step-by-step data cleaning and loading procedure so that the erroneous data will be marked and contaminated data will not be mistakenly inserted into the database during data loading.

Problem III (Answer in 75-100 words, with references, but do not quote directly)

Outline the major steps of decision tree classification.

Problem IV (Answer each part in 75-100 words, with references, but do not quote directly)

a. Compare the advantages and disadvantages of eager classification (e.g., Decision tree, Bayesian, neural network) versus lazy classification (e.g., k-nearest neighbor, case based reasoning).

b. Create a hypothetical example for one of the classifiers discussed in part a.
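The eager/lazy distinction in part a can be illustrated with a small sketch. The lazy classifier below is a plain 1-nearest-neighbor: it only stores the training data and does all of its work at prediction time. For the eager side, a nearest-centroid classifier is used as a simple stand-in (it is not one of the classifiers named in the question): it builds a compact model during training and no longer needs the training data afterwards. Class names and the toy data are assumptions made for illustration.

```python
import math
from collections import defaultdict


class LazyNearestNeighbor:
    """Lazy learner: training only stores the data."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, x):
        # All computation happens here, scanning every stored example.
        nearest = min(range(len(self.X)), key=lambda i: math.dist(self.X[i], x))
        return self.y[nearest]


class EagerNearestCentroid:
    """Eager learner: training builds a model (one centroid per class)."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for point, label in zip(X, y):
            groups[label].append(point)
        # Average each coordinate over the points of a class.
        self.centroids = {
            label: tuple(sum(coord) / len(points) for coord in zip(*points))
            for label, points in groups.items()
        }
        return self

    def predict(self, x):
        # Prediction only consults the small precomputed model.
        return min(self.centroids, key=lambda c: math.dist(self.centroids[c], x))


X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
y = ["a", "a", "b", "b"]
lazy = LazyNearestNeighbor().fit(X, y)
eager = EagerNearestCentroid().fit(X, y)
```

In this sketch the lazy model's prediction cost grows with the number of stored examples, while the eager model pays its cost once in `fit` and then compares against one centroid per class.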

Problem V (Answer in 75-100 words, with references, but do not quote directly)

Association rule mining often generates a large number of rules. Name at least one effective method that can be used to reduce the number of rules generated while still preserving most of the interesting rules.

Problem VI

You are a consultant working for the company "Data Mining R Us." Your client is a major luxury automobile manufacturer, Lexcedes. They have come up with a brand-new model called the "Chimera," and they want to target the car at young, filthy rich individuals.

Besides having their own company databases, Lexcedes purchased a large collection of databases containing historic information about people, their attributes, and what they buy. They want to use data mining to help sell their new model.

Describe in detail a comprehensive step-by-step data mining procedure you would follow if you were given this task. Make sure that your answer reflects the situation stated above (in other words, do not give a generic answer). State your assumptions.






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd