Create one dataset for all three with categorical predictor

Reference no: EM131926022

Problem

Calculating Distance with Categorical Predictors. This exercise uses a tiny dataset to illustrate the calculation of Euclidean distance and the creation of binary dummies. The online education company Statistics.com segments its customers and prospects into three main categories: IT professionals (IT), statisticians (Stat), and other (Other). It also tracks, for each customer, the number of years since first contact (years). Consider the following customers; information about whether or not they have taken a course (the outcome to be predicted) is included:

Customer 1: Stat, 1 year, did not take course

Customer 2: Other, 1.1 years, took course

a. Consider now the following new prospect:

Prospect 1: IT, 1 year

Using the above information on the two customers and one prospect, create one dataset for all three with the categorical predictor variable transformed into 2 binaries, and a similar dataset with the categorical predictor variable transformed into 3 binaries.
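One way to build the two datasets in part (a) is sketched below in plain Python. The helper `encode` and the variable names are illustrative and not part of the exercise. With 2 binaries, one category has to be dropped as the reference level; the sketch assumes that level is Other, which the exercise does not specify.

```python
# Raw records from the exercise: (label, category, years since first contact)
records = [
    ("Customer 1", "Stat", 1.0),
    ("Customer 2", "Other", 1.1),
    ("Prospect 1", "IT", 1.0),
]

CATEGORIES = ["IT", "Stat", "Other"]


def encode(records, levels):
    """Return {label: [one 0/1 dummy per level ..., years]} (hypothetical helper)."""
    return {
        label: [1 if category == level else 0 for level in levels] + [years]
        for label, category, years in records
    }


# Three binaries: one dummy per category. Columns: IT, Stat, Other, years.
three_dummies = encode(records, CATEGORIES)

# Two binaries. ASSUMPTION: "Other" is the dropped reference level.
# Columns: IT, Stat, years.
two_dummies = encode(records, ["IT", "Stat"])

for name, table in (("2 dummies", two_dummies), ("3 dummies", three_dummies)):
    print(name)
    for label, row in table.items():
        print(f"  {label}: {row}")
```

Under this coding, a customer in the Other category is represented by zeros in both the IT and Stat columns of the 2-binary dataset.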

b. For each derived dataset, calculate the Euclidean distance between the prospect and each of the other two customers. (Note: while it is typical to normalize data for k-NN, this is not an iron-clad rule and you may proceed here without normalization.)
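The distances in part (b) can be checked numerically with a short sketch like the following. The rows are typed in by hand, without normalization, and the 2-binary rows assume that Other is the dropped reference level (an assumption, since the exercise leaves that choice open). The function names are illustrative.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Columns: IT, Stat, years. ASSUMPTION: "Other" is the dropped level.
two_dummies = {
    "Customer 1": [0, 1, 1.0],
    "Customer 2": [0, 0, 1.1],
    "Prospect 1": [1, 0, 1.0],
}

# Columns: IT, Stat, Other, years.
three_dummies = {
    "Customer 1": [0, 1, 0, 1.0],
    "Customer 2": [0, 0, 1, 1.1],
    "Prospect 1": [1, 0, 0, 1.0],
}


def distances_to_prospect(table):
    """Distance from Prospect 1 to every other row of the table."""
    prospect = table["Prospect 1"]
    return {
        label: euclidean(prospect, row)
        for label, row in table.items()
        if label != "Prospect 1"
    }


d2 = distances_to_prospect(two_dummies)
d3 = distances_to_prospect(three_dummies)
print("2 dummies:", {k: round(v, 4) for k, v in d2.items()})
print("3 dummies:", {k: round(v, 4) for k, v in d3.items()})
```

With these rows, the prospect is about 1.4142 from Customer 1 in both datasets, while its distance to Customer 2 is about 1.0050 with 2 dummies and about 1.4177 with 3 dummies.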

c. Using k-NN with k = 1, classify the prospect as taking or not taking a course using each of the two derived datasets. Does it make a difference whether you use 2 or 3 dummies?
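A minimal 1-nearest-neighbor sketch for part (c) follows. It is not a reference solution: `encode` and `classify_1nn` are hypothetical helpers, the data are not normalized, and because the exercise does not say which category to drop for the 2-binary coding, the sketch tries each possible reference level.

```python
import math

# (label, category, years, outcome) for the two labeled customers
customers = [
    ("Customer 1", "Stat", 1.0, "did not take course"),
    ("Customer 2", "Other", 1.1, "took course"),
]
prospect = ("IT", 1.0)  # (category, years); outcome unknown


def encode(category, years, levels):
    """One 0/1 dummy per level in `levels`, followed by years."""
    return [1 if category == level else 0 for level in levels] + [years]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_1nn(levels):
    """Return (nearest customer label, predicted outcome) for the prospect."""
    p = encode(prospect[0], prospect[1], levels)
    nearest = min(
        customers,
        key=lambda c: euclidean(p, encode(c[1], c[2], levels)),
    )
    return nearest[0], nearest[3]


# Three dummies: every category gets its own column.
print("3 dummies:", classify_1nn(["IT", "Stat", "Other"]))

# Two dummies: the result depends on which level is dropped (an assumption
# the analyst has to make), so each choice is shown.
for dropped in ("Other", "IT", "Stat"):
    levels = [lvl for lvl in ("IT", "Stat", "Other") if lvl != dropped]
    print(f"2 dummies, dropping {dropped}:", classify_1nn(levels))
```

In this sketch the 3-dummy coding matches the prospect to Customer 1 (did not take course), whereas the 2-dummy coding with Other as the reference level matches it to Customer 2 (took course). Dropping IT or Stat instead gives Customer 1 again, so whether the number of dummies changes the prediction depends on the reference level chosen.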






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd