Build a predictive model to classify shots as missed or made

Reference no: EM132271207

Assignment - KOBE BRYANT SHOT SELECTION

OVERVIEW: Kobe Bryant marked his retirement from basketball by scoring 60 points in his final game as a member of the Los Angeles Lakers on Wednesday, April 13, 2016. Having entered professional basketball at the age of 17, Kobe earned the sport's highest accolades throughout his long career. Using 20 years of data on Kobe's shots made and shots missed, can you predict which shots will be successful?

DATA: The original data set contains the location and circumstances of every shot attempted by Bryant during his 20-year career. Your task is to predict whether the basket went in (shot_made_flag = 1) or missed (shot_made_flag = 0). The data for estimation is in Kobe.xlsx.

For this exercise, 5000 of the shot_made_flag values have been removed from the original data set and are shown as missing values in the project2Pred.xlsx file. These are the test set shots for which you must submit a classification. The provided sample classification file, project2Pred.xlsx, contains the shot_ids needed for your predicted classifications. Provide your predicted classifications in this file and submit both your paper and the prediction file. I have the actual values of the shot_made_flag for these missing shot_ids and will evaluate the classifications. Your goal is to provide the best predictions possible.

Each group is on the honor system to not use any information outside of the dataset to predict each of the missing shot flags.

DATA CONTINUED

The field names are given below (data descriptions are available on Kaggle):

action_type

combined_shot_type

game_event_id

game_id

lat - court location identifier (latitude)

loc_x - court location identifier (x/y axis)

loc_y - court location identifier (x/y axis)

lon - court location identifier (longitude)

minutes_remaining - (in period)

period

playoffs

season 

seconds_remaining

attendance

avgnoisedb - avg noise in arena (decibels)

shot_distance

shot_made_flag (this is what you are predicting)

shot_type

shot_zone_area

shot_zone_basic

shot_zone_range

team_id

team_name

game_date

matchup

opponent

shot_id

arena_temp (°F)

DELIVERABLE: Submit a paper with an 8-page limit and a separate Appendix of up to 5 pages. Code should be in a second appendix and can be as long as necessary. A separate file with predicted classifications should also be submitted.

PAPER REQUIREMENTS -

Introduction

Data Description

Exploratory Data Analysis

  • Address the need for any potential transformations.
  • Address and identify outliers.
  • Address and identify any multicollinearity.
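
One possible first screen for multicollinearity is a pairwise Pearson correlation between numeric fields that appear to describe the same thing (for example lat and loc_y, or lon and loc_x). The sketch below is in Python for illustration only (the assignment itself asks for SAS); the helper names and the toy values are assumptions, not taken from the Kobe data. For a model with exactly two predictors, the variance inflation factor reduces to 1 / (1 - r^2).

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(xs, ys):
    """VIF when the model contains only these two predictors: 1 / (1 - r^2)."""
    r = pearson(xs, ys)
    return 1.0 / (1.0 - r * r)

# Toy values (hypothetical, NOT from the assignment data).
a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 3.0, 2.0, 4.0]
r_ab = pearson(a, b)
vif_ab = vif_two_predictors(a, b)
print(f"r = {r_ab:.3f}, VIF = {vif_ab:.3f}")
```

A correlation close to +1 or -1 (and a correspondingly large VIF) would suggest keeping only one field of the pair in the model.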

Build models to provide arguments and evidence for or against the propositions below:

  • The odds of Kobe making a shot decrease as his distance from the hoop increases. If there is evidence of this, quantify this relationship. (CIs, plots, etc.).
  • The probability of Kobe making a shot decreases linearly with his distance from the hoop. If there is evidence of this, quantify this relationship. (CIs, plots, etc.).
  • The relationship between Kobe's distance from the basket and the odds of him making the shot is different in the playoffs. Quantify your findings.
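
The odds-based propositions above can be examined with a logistic regression of shot_made_flag on shot_distance, playoffs, and their interaction. The following is a minimal Python sketch for illustration only (the assignment asks for SAS). It fits the model by Newton-Raphson on simulated data; the simulated coefficients, sample size, and helper names are all assumptions and say nothing about the real Kobe data. exp(coefficient on distance) is the estimated odds ratio per additional unit of distance in non-playoff games, reported with a Wald 95% confidence interval; a confidence interval for the interaction coefficient that excludes 0 would be evidence that the distance effect differs in the playoffs.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def score_and_information(X, y, beta):
    """Gradient of the log-likelihood and the Fisher information matrix."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for row, yi in zip(X, y):
        p = sigmoid(sum(b * v for b, v in zip(beta, row)))
        w = p * (1.0 - p)
        for i in range(k):
            grad[i] += (yi - p) * row[i]
            for j in range(k):
                info[i][j] += w * row[i] * row[j]
    return grad, info

def fit_logistic(X, y, iters=25):
    """Newton-Raphson fit; returns (coefficients, Wald standard errors)."""
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        grad, info = score_and_information(X, y, beta)
        cov = invert(info)
        beta = [b + sum(c * g for c, g in zip(cov_row, grad))
                for b, cov_row in zip(beta, cov)]
    _, info = score_and_information(X, y, beta)
    cov = invert(info)
    return beta, [math.sqrt(cov[i][i]) for i in range(len(beta))]

# Simulated shots (hypothetical): log-odds fall by 0.06 per unit of distance,
# with no true playoff effect. Columns: intercept, distance, playoffs, interaction.
rng = random.Random(0)
X, y = [], []
for _ in range(2000):
    dist = rng.uniform(0, 30)
    playoff = 1.0 if rng.random() < 0.15 else 0.0
    X.append([1.0, dist, playoff, dist * playoff])
    y.append(1 if rng.random() < sigmoid(0.6 - 0.06 * dist) else 0)

beta, se = fit_logistic(X, y)
odds_ratio = math.exp(beta[1])
or_ci = (math.exp(beta[1] - 1.96 * se[1]), math.exp(beta[1] + 1.96 * se[1]))
interaction_ci = (beta[3] - 1.96 * se[3], beta[3] + 1.96 * se[3])
print("odds ratio per unit distance:", odds_ratio, "95% CI:", or_ci)
print("distance x playoffs coefficient 95% CI:", interaction_ci)
```

Because this model is linear in the log-odds rather than in the probability, the second proposition would need a separate check, for example by plotting observed make rates against distance.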

Build a predictive model to classify shots as missed or made. You should produce at least 1 of each type of model:

  • A logistic regression model.
  • A Linear Discriminant Analysis (LDA) model.
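
To make the LDA requirement concrete, here is a minimal one-predictor sketch in Python (illustration only; the assignment asks for SAS). It assumes a single numeric predictor such as shot_distance, normally distributed within each class with a shared (pooled) variance; the function names and toy values are hypothetical. Each class k gets the discriminant score x*mu_k/sigma^2 - mu_k^2/(2*sigma^2) + log(prior_k), and posterior probabilities follow by normalizing the exponentiated scores.

```python
import math

def fit_lda_1d(x, y):
    """Estimate class priors, class means, and the pooled within-class variance."""
    classes = sorted(set(y))
    n = len(x)
    params = {}
    pooled_ss = 0.0
    for c in classes:
        xc = [xi for xi, yi in zip(x, y) if yi == c]
        mu = sum(xc) / len(xc)
        pooled_ss += sum((xi - mu) ** 2 for xi in xc)
        params[c] = (len(xc) / n, mu)  # (prior, mean)
    variance = pooled_ss / (n - len(classes))
    return params, variance

def lda_posterior(x0, params, variance):
    """Posterior class probabilities at x0 from the linear discriminant scores."""
    scores = {
        c: x0 * mu / variance - mu * mu / (2.0 * variance) + math.log(prior)
        for c, (prior, mu) in params.items()
    }
    top = max(scores.values())  # subtract the max before exponentiating, for stability
    weights = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

# Toy distances (hypothetical): made shots cluster near the hoop.
distances = [2.0, 4.0, 6.0, 20.0, 22.0, 24.0]
made = [1, 1, 1, 0, 0, 0]
params, variance = fit_lda_1d(distances, made)
print(lda_posterior(5.0, params, variance))
print(lda_posterior(13.0, params, variance))
```

With equal priors, the decision boundary sits at the midpoint of the two class means, so a shot from 13 units in this toy example is assigned probability 0.5 for each class.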

Evaluation: Compare the competing models using AUC, misclassification rate, sensitivity, specificity, and the objective / loss function. The log loss of the model should be used to assess the model fit:

$$-\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log p_i + (1 - y_i)\log(1 - p_i)\right]$$

where $N$ is the total number of classifications, $y_i$ is the shot_made_flag for shot $i$, and $p_i$ is the model's predicted probability that shot $i$ was made.
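
The evaluation measures listed above can be sketched as follows in Python (illustration only; the assignment asks for SAS). The function names, the 0.5 classification threshold, and the clipping constant eps are assumptions added here; clipping is a guard against log(0) and is not part of the formula above. AUC is computed as the fraction of (made, missed) pairs in which the made shot receives the higher predicted probability, with ties counted as one half.

```python
import math

def log_loss(y, p, eps=1e-15):
    """Mean negative log-likelihood; p is the predicted probability of a made shot."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # guard against log(0)
        total += yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return -total / len(y)

def confusion_metrics(y, p, threshold=0.5):
    """Misclassification rate, sensitivity, and specificity at a given threshold."""
    tp = fp = tn = fn = 0
    for yi, pi in zip(y, p):
        predicted = 1 if pi >= threshold else 0
        if yi == 1 and predicted == 1:
            tp += 1
        elif yi == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "misclassification": (fp + fn) / len(y),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def auc(y, p):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Toy labels and predicted probabilities (hypothetical).
y_true = [1, 0, 1, 0]
p_hat = [0.9, 0.2, 0.4, 0.6]
print(log_loss(y_true, p_hat))
print(confusion_metrics(y_true, p_hat))
print(auc(y_true, p_hat))
```

The pairwise AUC is quadratic in the number of shots, which is fine for a sketch; a rank-based computation would scale better on the full data set.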

Note - A SAS programming assignment needs to be done. All relevant information is in the attached files.

Attachment:- Kobe-data file.rar

Attachment:- Assignment Files.rar

