Reference no: EM132234122
Question: After reading chapter 2 of your textbook and watching the lecture on it, complete this week's assignment using the following guidelines:
1. Types of Data - "There are many types of data sets, and as the field of data mining develops and matures, a greater variety of data sets become available for analysis" (p. 34). The book lists the most common types (hint p. 34).
Assignment: Name and describe one (1) type of data set. You may provide a graphic of the data set to explain your answer.
2. Data Quality - "Data is often far from perfect. While most data mining techniques can tolerate some level of imperfection in the data, a focus on understanding and improving data quality typically improves the quality of the resulting analysis" (p. 23).
Assignment: Provide two (2) examples of measurement or data collection issues that may hinder data quality. Provide the term, how it hinders data quality, and what strategy is used to prevent or deal with the issue.
3. Data Preprocessing - "Often, the raw data must be preprocessed in order to make it suitable for analysis" (p. 23).
Assignment: Explain what the following statement means: "Often, the raw data must be preprocessed in order to make it suitable for analysis."
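As a rough illustration of what "preprocessed" can mean in practice, the sketch below fills a missing value and rescales a column to the range 0 to 1. The function names, the sample ages, and the choice of mean imputation and min-max scaling are assumptions made for illustration only; they are not taken from the textbook, and other preprocessing steps may be more appropriate for a given data set.

```python
def fill_missing_with_mean(values):
    """Replace None entries with the mean of the values that are present."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw attribute with one missing measurement.
raw_ages = [20, None, 40, 60]

filled = fill_missing_with_mean(raw_ages)  # [20, 40.0, 40, 60]
scaled = min_max_scale(filled)             # [0.0, 0.5, 0.5, 1.0]
```

The raw list could not be averaged or compared directly because of the missing entry; after these two steps every value is numeric and on a common scale.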
4. Measures of Similarity and Dissimilarity - "One approach to data analysis is to find the relationships among the data objects and then perform the remaining analysis using these relationships rather than the data objects themselves" (p. 24).
Assignment: Define similarity and dissimilarity. Describe the difference between the two terms.
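To make the two terms concrete, here is a minimal sketch that computes one common dissimilarity measure (Euclidean distance) and one common similarity measure (cosine similarity) for small numeric vectors. The function names and sample points are hypothetical, and these two measures are only examples; the chapter may emphasize others.

```python
import math


def euclidean_distance(x, y):
    """Dissimilarity: 0 for identical objects, larger as they differ more."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_similarity(x, y):
    """Similarity: 1 when two non-zero vectors point in the same direction."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Hypothetical data objects with two numeric attributes each.
p = [3.0, 4.0]
q = [0.0, 0.0]
r = [6.0, 8.0]

d_pq = euclidean_distance(p, q)  # 5.0
d_pp = euclidean_distance(p, p)  # 0.0, an object is not dissimilar to itself
s_pr = cosine_similarity(p, r)   # 1.0, same direction despite different length
```

Note that p and r are maximally similar under cosine similarity yet have a non-zero Euclidean distance, which shows that the two kinds of measure can capture different relationships.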