Data mining using unsupervised and supervised learning

Reference no: EM13779816

Objectives: Data Mining using Unsupervised and Supervised Learning Approaches

Assume that a local company has collected a data set from its e-commerce website and asks you to analyze it. The company did not provide much background information about the data, such as the nature of its attributes. However, based on discussions with the people who collected the data and your own observation of the data set, you suspect that the first or second column, X1 or X2, may be the decision column.

The basic strategy is first to determine the decision column (or class attribute) by using the K-means clustering algorithm (an unsupervised learning approach) to verify whether the clustering result is consistent with attribute X1, attribute X2, or both. Once the decision column(s) are determined, you build a model (or concepts) using a supervised learning approach, in the hope of offering advice to the company for its business. To complete the data analysis using this strategy, perform the following tasks:

(a) Use the K-means algorithm (unsupervised learning) to cluster the data set and verify the class field(s).

(b) Using the class field(s) determined in step (a), perform supervised learning with any of the learning algorithms discussed in class, such as Version Space, Decision Tree, or Neural Network, and build a model.
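
As an illustration of task (a), the following is a minimal sketch and not a prescribed solution. It assumes the candidate columns X1 and X2 are left out of the clustering, that the remaining attributes are numeric, and that "consistent" is measured by cluster purity (the share of rows whose candidate value matches the majority value of their cluster). The rows, the extra attribute columns, and the helper names `kmeans` and `purity` are all hypothetical.

```python
import random
from collections import Counter

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen rows
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

def purity(labels, column):
    """Fraction of rows whose column value is the majority value of their cluster."""
    by_cluster = {}
    for lab, value in zip(labels, column):
        by_cluster.setdefault(lab, Counter())[value] += 1
    return sum(max(c.values()) for c in by_cluster.values()) / len(labels)

# Hypothetical rows: (X1, X2, attribute 3, attribute 4). Not the company's data.
rows = [
    (0, 1, 1.0, 1.2), (0, 0, 0.8, 1.0), (0, 1, 1.1, 0.9),
    (1, 0, 5.0, 5.2), (1, 1, 4.8, 5.1), (1, 0, 5.2, 4.9),
]
features = [r[2:] for r in rows]          # cluster without the candidate columns
labels, _ = kmeans(features, k=2)
purity_x1 = purity(labels, [r[0] for r in rows])
purity_x2 = purity(labels, [r[1] for r in rows])
print("purity vs X1:", purity_x1)
print("purity vs X2:", purity_x2)
```

A candidate column whose purity is close to 1 agrees with the clustering and is a plausible class field; a noticeably lower purity suggests the column is not what the clusters capture. Purity is only one possible agreement measure, and real data would normally need attribute scaling before distances are meaningful.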

To perform the above tasks, you may use either an existing system or a program you implemented yourself. However, to receive the maximum bonus points, your program must work properly and be powerful enough for effective data analysis; otherwise, only partial bonus points may be given. Completing tasks (a) and (b) is therefore more important than implementing your own program.
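
If you choose to implement task (b) yourself, a decision tree is one of the listed options. The sketch below is a small ID3-style tree, assuming the attributes are categorical (numeric attributes would first need to be discretized) and that X1 turned out to be the class field. The rows, the attribute names X3 and X4, and the helper names are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def build_tree(rows, target, attrs):
    """Recursively split on the attribute that leaves the least entropy."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority  # leaf: pure node, or no attributes left to split on

    def remainder(attr):
        # Weighted entropy of the class labels after splitting on attr.
        counts = Counter(r[attr] for r in rows)
        return sum(
            cnt / len(rows) * entropy([r[target] for r in rows if r[attr] == v])
            for v, cnt in counts.items()
        )

    best = min(attrs, key=remainder)  # lowest remainder = highest information gain
    rest = [a for a in attrs if a != best]
    branches = {
        v: build_tree([r for r in rows if r[best] == v], target, rest)
        for v in set(r[best] for r in rows)
    }
    return {"attr": best, "branches": branches, "default": majority}

def predict(tree, row):
    """Walk the tree; fall back to the node's majority class for unseen values."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["attr"]], tree["default"])
    return tree

# Hypothetical training rows with X1 as the class field.
rows = [
    {"X1": "yes", "X3": "high", "X4": "new"},
    {"X1": "yes", "X3": "high", "X4": "returning"},
    {"X1": "no",  "X3": "low",  "X4": "new"},
    {"X1": "no",  "X3": "low",  "X4": "returning"},
    {"X1": "yes", "X3": "high", "X4": "new"},
]
tree = build_tree(rows, "X1", ["X3", "X4"])
print(tree)
print(predict(tree, {"X3": "low", "X4": "new"}))
```

The learned tree is itself the model to report: each path from the root to a leaf reads as a rule relating attribute values to the class field.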

Write a brief report that summarizes your data analysis activities and results, including:

(1) your name(s) and contact email address(es), plus the percentage contribution of each member if the assignment was completed by a team. If a team cannot reach a consensus on individual contributions, include each individual's claimed percentage contribution with a brief description of the specific tasks performed;

(2) the language used for the K-means implementation or the source of the software used, parameter settings such as K (specifying how you determined the best K), clustering results, the verified class field(s), and other information relevant to the task;

(3) the name of the supervised learning algorithm used, the source of the implementation or software, parameter settings if any, and the result of learning, including the learned model and other relevant information;

(4) the results of your data analysis, useful advice for the company's business, etc.; and

(5) other relevant discussion of your experience and data analysis results.
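
For item (2), the assignment does not say how the best K should be determined. One common heuristic, shown here only as a sketch, is to run K-means for several values of K, record the within-cluster sum of squared errors (SSE), and look for the "elbow" where adding clusters stops reducing the SSE substantially. The points and the helper names `dist2` and `kmeans_sse` are hypothetical.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_sse(points, k, seed):
    """Run Lloyd's K-means once and return the within-cluster SSE."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(100):
        new_labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        if new_labels == labels:  # converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return sum(dist2(p, centroids[lab]) for p, lab in zip(points, labels))

# Hypothetical 2-D points forming three visually separate groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 9), (5, 10)]

# K-means can stop in a local optimum, so keep the best of several random starts.
sse_by_k = {
    k: min(kmeans_sse(points, k, seed) for seed in range(10))
    for k in range(1, len(points) + 1)
}
for k, sse in sse_by_k.items():
    print(k, round(sse, 2))
```

The SSE usually keeps shrinking as K grows, so the smallest SSE alone is not a useful criterion; the report should state which K was chosen and why the curve (or another criterion) supports it.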




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd