Reference no: EM131760326
Problem 1: Explain why and when one would want to use k-means clustering. Then explain how the algorithm works on a given set of data points.
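As a starting point for the "how the algorithm works" part, the usual assign-then-update loop (often called Lloyd's iteration) can be sketched as below. This is a minimal sketch, not a reference implementation: the function name `kmeans`, its parameters, and the choice to initialise from `k` randomly chosen data points are all assumptions made for illustration.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's iteration; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Assumed initialisation: k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its points;
        # an empty cluster keeps its previous centroid.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

Calling `kmeans(points, k=2)` on a small 2-D array returns one centroid per cluster and one integer label per point; a different `seed` can lead to a different result, which is worth keeping in mind for the later questions.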
Problem 2: Explain one method of picking the 'k' in k-means clustering.
Problem 3: What is the problem with picking the 'k' initial centroids randomly? Can you devise a better method of picking them that could resolve this problem?
Problem 4: How can we evaluate the k-means model on data?
Problem 5: Explain the similarities and differences between k-means and linear regression. When would you use linear regression instead of k-means?
Problem 6: Consider the following problem context:
We want to model the relationship between the number of students who complain about fees to the department head and the time the head spends dealing with those complaints. We know that there are 4 classes in the department: 'information science 101' (11 students), 'programming 101' (5 students), 'statistics 202' (3 students) and 'distributed systems 402' (12 students). No student is enrolled in more than one of these classes.
(a) Suggest what the outcome and input variables could be, and whether the latter should be understood as categorical or numerical.
(b) Write a mathematical expression for the regression line for this problem (see online help about how to write mathematical statements in LaTeX).
(c) What would be the input variables for the above problem context?
(d) State how the answers from (b) and (c) would then be used to complete the model.
(e) Discuss briefly your strategy for validating the above model.
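For the item that asks for a mathematical expression in LaTeX, a regression line with a single input is commonly typeset as shown below. The symbols are generic placeholders assumed here for illustration ($y$ for the outcome, $x$ for the input, $\beta_0$ and $\beta_1$ for intercept and slope, $\varepsilon$ for the error term); they are not the variables of this problem, which still have to be chosen.

```latex
\[
  y = \beta_0 + \beta_1 x + \varepsilon
\]
```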
Problem 7: Explain the difference between observed outcome, line fitting error, estimated/predicted values, and the residuals.
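The quantities named in Problem 7 can be computed side by side on a toy example. The numbers below are made up purely for illustration and are not data from the assignment; "fitting error" is taken here to mean the sum of squared residuals, which is one common reading and an assumption.

```python
import numpy as np

# Made-up illustrative numbers, not data from the assignment.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])    # observed outcomes

slope, intercept = np.polyfit(x, y, deg=1)  # least-squares line
y_hat = intercept + slope * x               # estimated / predicted values
residuals = y - y_hat                       # observed minus predicted
sse = float(np.sum(residuals ** 2))         # one summary of the fitting error
```

Printing `y`, `y_hat` and `residuals` together makes the relationship `y = y_hat + residuals` visible point by point.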
Problem 8: After designing a linear regression model for two variables, you discover the residual distribution shown in Figure 1. What does this mean?
Give an example of a plot that would correspond to this residual graph.
Classification and Validation
Problem 9: Explain the use case for logistic regression, and state at least one similarity and one difference between logistic regression and linear regression
Problem 10: Explain the function of the ROC curve for logistic regression
Your answer should mention the null and alternative hypotheses, true positives, false positives and classifier thresholds.
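To see how the classifier threshold ties true positives and false positives together, the points of an ROC curve can be computed by hand. This is a minimal sketch under stated assumptions: `roc_points` is a hypothetical helper, `scores` stands for predicted probabilities from some already-fitted logistic regression, and both classes are assumed to be present in `y_true`.

```python
import numpy as np

def roc_points(y_true, scores):
    """Return (fpr, tpr) arrays, one entry per candidate threshold."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    # Thresholds from strictest (nothing predicted positive) to loosest.
    thresholds = np.concatenate(([np.inf], np.sort(np.unique(scores))[::-1]))
    n_pos = y_true.sum()        # assumed > 0
    n_neg = (~y_true).sum()     # assumed > 0
    fpr, tpr = [], []
    for t in thresholds:
        predicted_pos = scores >= t
        tp = np.sum(predicted_pos & y_true)    # true positives at this threshold
        fp = np.sum(predicted_pos & ~y_true)   # false positives at this threshold
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    return np.array(fpr), np.array(tpr)
```

Plotting `tpr` against `fpr` gives the curve; each point corresponds to one threshold, so lowering the threshold moves along the curve from (0, 0) towards (1, 1).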