How can we evaluate the k-means model on data

Assignment Help Other Engineering
Reference no: EM131760326

Problem 1: Explain why and when one would want to use k-means clustering, furthermore, give an explanation of how the algorithm works given a set of data points

Problem 2: Explain one method of picking the `k' in k-means clustering

Problem 3: What is the problem with picking such `k' centroids randomly? Can you devise a better method to pick k that could resolve this problem?

Problem 4: How can we evaluate the k-means model on data ?

Problem 5: Explain the similarities and differences between K-means and Linear Regression. When would you use linear regression instead of k-means?

Problem 6: Consider the following problem context:
We want to model the relationship between the number of students that complain about fees to the department head, with the time spent by the head to deal with such student complaints. We know that there are 4 classes in the department, `information science 101' (11 students), `programming 101' (5 students) and `statistics 202' (3 students) and 'distributed systems 402 (12 students)'. There are no common students between these classes.

- Suggest what the the outcome and input variables could be and whether the latter should be understood as categorical or numerical

- Write a mathematical expression for the regression line for this problem (see online help about how to write mathematical statements in latex)

- What would be the input variables for the above problem context?

- State how the answer from (b) and (c) would then be used to complete the model.

- Discuss briefly your strategy for validating the above model

Problem 7: Explain the difference between observed outcome, line fitting error, estimated/predicted values, and the residuals.

Problem 8: After designing a linear regression model for two variables, you discover the following residual distribution ref fig1. What does this mean?

Give an example of a plot that would correspond to this residual graph.

Classification and Validation

Problem 9: Explain the use case for logistic regression, and state at least one similarity and one difference between logistic regression and linear regression

Problem 10: Explain the function of the ROC curve for logistic regression
Your answer should mention null, alternative hypothesis, true positives and false positives and classifier thresholds.

Verified Expert

The assignment required to study and research about Roc curve, k-means clustering and answer the problems given in the assignment. The assignment further required to give examples along with the explanation and their relations.

Reference no: EM131760326

Questions Cloud

Compare the consistency of sales : You are employed as a statistician for a company that makes household products, which are sold by part-time salespeople who work during their spare time.
Condition of creative and new : The condition of creative and new, are always arrived at with a thought process first. It must be seen in the mind's eye before it can be expressed.
Determine the total allocated overhead cost : Determine the total allocated overhead cost for January, March, and August. (Do not round intermediate calculations. Round your answers to the nearest.
Describe competency models : Describe competency models, case-based decision making, and systems thinking.
How can we evaluate the k-means model on data : Explain why and when one would want to use k-means clustering, furthermore, give an explanation of how the algorithm works given a set of data points
Website troubleshooting-search engine optimization : You own a consultant firm that offers the following services: Website troubleshooting, Search Engine Optimization (SEO),
Discuss the concept of reasonable assurance : Discuss the concept of reasonable assurance and the degree of confidence that financial statement users should have in the financial statements
Predetermined overhead rate based on direct labor hours : Maureen Corporation estimated its overhead costs would be $22,800 per month except for January when it pays the $182,880 annual insurance premium.
Discussion of business ethics into the public domain : The turmoil in the world’s financial system and near collapse of the banking system has surfaced a discussion of business ethics into the public domain.

Reviews

inf1760326

4/25/2018 5:00:24 AM

Thanks so much, you did a fantastic job writing this custom assignment for me, I can't tell you how much I appreciate you making the changes. This is proof to me that this web site is truly one-of-a kind and really professional. I will definitely be using your services again

len1760326

12/11/2017 5:19:37 AM

please find attachment .... pdf for picture ... and document for questions But please I need best answer for every questions ... please take care about picture residual.pdf I put it with attachments please every question put ans under ... don't give me general please .... use pdf for answers and change your word of tutor ... but please take care some question Problem 8 for residual.pdf Used answers in assignments 2 CPIS 604 to change by with your own word please

Write a Review

Other Engineering Questions & Answers

  Observation and measurement of the components

Construct the circuit shown in Figure 1 below with the parameters shown in the circuit.

  Design a system to improve the safety

Consider the hazards of performing maintenance on machinery and design a system to improve the safety when doing the maintenance function.

  Question 1a pool of newly qualified doctors are not

question 1a pool of newly qualified doctors are not satisfied with their existing revenue and decide to set up an

  Draw a correctly labelled scatter diagram

What are the minimum, maximum and expected values for the total - What is the probability of matching or exceeding the values in (i), (ii) and (iii) if only five dice are used?

  Define the queueing system

Define the queueing system by describing the customers, arrival rate, expected interarrival time, server(s), service rate, and expected service time. Use Kendall's notation to define the queueing system.

  How can d-bart improve its performance appraisal process

How can D-Bart improve its performance appraisal process? How should D-Bart make reduction decisions when performance appraisal documents are inaccurate?

  How you would use the tool in the investigation

How you would use the tool in the investigation. How the tool helps the investigation and the evidence you expect it to provide. Why you believe the evidence the tool provides is critical to the investigation.

  Design digital clock that displays time

CET 1110C: Digital Logic Term Project: BCD Clock. Using your expert Digital Design Skills, design a 24 hour digital Clock that displays the time in Binary-Coded-Decimal

  The properties of an ideal op-amp related issues

Discuss how the frequency of the input signal to an op-amp affects the voltage gain.

  Different signal-flow graphs can represent the same system

Which form facilitates the calculation of the variable gains during controller design?

  Find the range of the tachometer constant

The block diagram of a motor-control system with tachometer feedback - Find lhe range of the tachometer constant K, so that the system is asymptotically stable

  Comparison of companies to adopt cloud computing

Comparison of companies to adopt cloud computing

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd