Statistical analysis of the football team record

Reference no: EM132255617

Statistical Data Management Assignment -

Background: Suppose we were going to do a statistical analysis of the University of Illinois football team's record and how it impacts their AP ranking and Bowl game results. To do so, we would need a data set that's fully validated and cleaned.

Dataset: You will work with a data set containing information from the 127 seasons of University of Illinois football. The raw data set illinifb18.dat contains data from 1892 to 2018.

Field  Description  Notes
1      Obs          Observation number
2      Season
3      Conf         Conference
4      W            Wins
5      L            Losses
6      T            Ties
7      Pct          Win percentage
8      SRS          Simple Rating System: A rating that takes into account average point differential and strength of schedule. Average of all teams in a season is 0.
9      SOS          Strength of Schedule: Average of all teams in a season is 0.
10     AP_pre       Rank in pre-season AP poll. Possible values are 1-25 and missing if unranked.
11     AP_high      Highest rank of the team in the AP poll during that season. Possible values are 1-25 and missing if unranked.
12     AP_post      Rank in final AP poll at the end of the season. Possible values are 1-25 and missing if unranked.
13     ConfTitle    Did Illinois win its Conference Title: Y or N
14     Coach        Head coach (or coaches)
15     Record       Team record
16     Bowl         Name of post-season bowl game played in, or missing
17     BowlResult   Result of Bowl game: W or L
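
The value ranges in the field list translate directly into validation rules. The following is a minimal Python sketch of that idea (the assignment itself calls for SAS, so this only illustrates the logic). The helper name `domain_errors` and both sample rows are hypothetical and are not taken from illinifb18.dat. Missing values are assumed to be represented as `None`, and the rule that BowlResult should be missing when no bowl was played is an assumption as well.

```python
def domain_errors(row):
    """Return the names of fields whose values fall outside the documented domain."""
    errors = []
    # AP ranks: an integer from 1 to 25, or missing (None) if unranked.
    for field in ("AP_pre", "AP_high", "AP_post"):
        value = row.get(field)
        if value is not None and not (isinstance(value, int) and 1 <= value <= 25):
            errors.append(field)
    # ConfTitle must be Y or N.
    if row.get("ConfTitle") not in ("Y", "N"):
        errors.append("ConfTitle")
    # BowlResult: W or L when a bowl was played, missing otherwise (assumption).
    bowl, result = row.get("Bowl"), row.get("BowlResult")
    if bowl is None:
        if result is not None:
            errors.append("BowlResult")
    elif result not in ("W", "L"):
        errors.append("BowlResult")
    return errors


# Hypothetical rows, invented for illustration only.
clean_row = {"AP_pre": None, "AP_high": 12, "AP_post": 20,
             "ConfTitle": "N", "Bowl": "Example Bowl", "BowlResult": "W"}
dirty_row = {"AP_pre": 0, "AP_high": 30, "AP_post": None,
             "ConfTitle": "y", "Bowl": None, "BowlResult": "L"}

print(domain_errors(clean_row))
print(domain_errors(dirty_row))
```

Returning a list of offending field names, rather than a single pass/fail flag, makes it easy to tabulate which fields need the most cleaning.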

Goal: Adequately prepare this dataset for statistical analysis. Here is a list of items you may need to consider.

  • Reading raw data files.
  • Creating formats and labels.
  • Deriving new variables via calculations or recoding.
  • Subsetting data.
  • Checking data for errors.
  • Validating and cleaning data.

To help you along, here are some data details.

  • Each record is a unique season, so each value of Season must be unique.
  • The values of W, L, T, and Pct should coincide correctly. Winning percentage is equal to the number of wins (W) divided by the total number of games (W+L+T).
  • No one is expected to know the proper spelling of each Head Coach's name. If there are any typos in a coach's name, each unique spelling would appear in a frequency report.
  • In some years, Illinois switched their head coach in the middle of the season. If more than one coach is listed for a season, clean the Coach variable to note which coach had the most wins that season.
  • In some years, Illinois had more than one person simultaneously filling the role of head coach. If no coach is listed, it means that more than one person shared the duties of coach. Learn who they were by searching the internet and clean the Coach variable to contain the name of the coach whose last name comes first in alphabetical order.
  • Clean the Record variable to match the W, L, and T entries.
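
The consistency rules above (unique Season, Pct equal to W / (W + L + T), and Record matching W, L, and T) can be sketched as simple checks. This is an illustrative Python sketch, not the SAS program the assignment asks for. The function names, the "W-L-T" layout of Record, the rounding tolerance, and the sample rows are all assumptions made for the example.

```python
from collections import Counter


def expected_pct(w, l, t):
    """Win percentage as defined above: W / (W + L + T)."""
    games = w + l + t
    return w / games if games else None


def expected_record(w, l, t):
    """Record string rebuilt from W, L, T (the 'W-L-T' layout is an assumption)."""
    return f"{w}-{l}-{t}"


def check_seasons(rows, tol=0.0005):
    """Return (duplicate seasons, seasons with bad Pct, seasons with bad Record)."""
    counts = Counter(r["Season"] for r in rows)
    duplicates = sorted(s for s, n in counts.items() if n > 1)
    bad_pct, bad_record = [], []
    for r in rows:
        pct = expected_pct(r["W"], r["L"], r["T"])
        # Flag a missing Pct or one that differs from the recomputed value.
        if pct is None or r["Pct"] is None or abs(pct - r["Pct"]) > tol:
            bad_pct.append(r["Season"])
        if r["Record"] != expected_record(r["W"], r["L"], r["T"]):
            bad_record.append(r["Season"])
    return duplicates, bad_pct, bad_record


# Hypothetical rows with deliberately fake season numbers.
rows = [
    {"Season": 2101, "W": 6, "L": 4, "T": 0, "Pct": 0.600, "Record": "6-4-0"},
    {"Season": 2102, "W": 5, "L": 5, "T": 2, "Pct": 0.500, "Record": "5-5-2"},  # Pct is wrong
    {"Season": 2103, "W": 3, "L": 7, "T": 0, "Pct": 0.300, "Record": "7-3-0"},  # Record is wrong
    {"Season": 2101, "W": 6, "L": 4, "T": 0, "Pct": 0.600, "Record": "6-4-0"},  # duplicate season
]

duplicates, bad_pct, bad_record = check_seasons(rows)
print(duplicates, bad_pct, bad_record)
```

Keeping the three checks separate means each one can feed its own table in the report's validation section.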

Midterm Report: A summary report that includes the following -

1. Title of the project.

2. Your name.

3. Methods section:

  • Description of the original data file including what type of input style it uses.
  • Description of the guidelines used to validate the data.
  • Description of the issues that needed to be cleaned and how the cleaning was done (though not needing to explain the programming code specifically).
  • Description of additional data preparation that you performed.
  • Description of variables to be analyzed including attributes such as name and type.

You do not have to list all the variables in the original sourced file, but do mention the ones you bring to SAS for the creation of the SAS data set.

4. Results section:

  • Tables and visualizations pertaining to validation and cleaning.
  • Write-up of the results. Point out notable information from the charts and tables.

5. To verify that the cleaning was thoroughly completed, also answer these questions in your Results section:

a. Identify which Head Coach (or Coaches) had the most career wins with the University of Illinois football team.

b. Identify the Season(s) in which the football team achieved its highest ranking across all seasons. Note that #1 is the highest ranking possible.

c. Identify the number of times that Illinois won its conference title.

d. Identify which decade had the most wins. For example, 1892-1899 is the decade known as the 1890s; 1900-1909 is the 1900s; ... 2010-2019 is the 2010s.
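
The decade grouping in question (d) amounts to truncating each season to a multiple of ten and summing wins within each group. The following is a minimal Python sketch of that derivation (the assignment itself expects SAS). The function names and the win counts in `rows` are made up for illustration and are not values from the data set.

```python
from collections import defaultdict


def decade_label(season):
    """Map a season year to its decade label, e.g. 1892 -> '1890s'."""
    return f"{season // 10 * 10}s"


def wins_by_decade(rows):
    """Sum W within each decade label."""
    totals = defaultdict(int)
    for r in rows:
        totals[decade_label(r["Season"])] += r["W"]
    return dict(totals)


def top_decades(totals):
    """Return every decade tied for the most wins."""
    best = max(totals.values())
    return sorted(d for d, w in totals.items() if w == best)


# Made-up win counts, for illustration only.
rows = [
    {"Season": 1892, "W": 1},
    {"Season": 1899, "W": 2},
    {"Season": 1900, "W": 4},
    {"Season": 2018, "W": 3},
]

totals = wins_by_decade(rows)
print(totals)
print(top_decades(totals))
```

Because the data run from 1892 to 2018, the 1890s and the 2010s are partial decades, which is worth noting when comparing totals.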

6. Write in complete sentences and pay attention to grammar, spelling, readability and presentation. If you include a table or chart, make sure you say something about it. If you're not discussing a result, then it doesn't belong in your report.

In terms of length, it probably shouldn't take more than 2-3 pages to explain your work on this dataset. That does not include the space occupied by tables and other output. If you have a point to make, get to it. If you find yourself writing things simply for the sake of padding the word-count, you're writing the wrong things.

You must complete the exercises and turn in the SAS program file and Report just like with HW. Submissions must be uploaded to our Compass 2g site on the Midterm page.

Attachment:- Assignment Files.rar

Statistical analysis of the football team record : STAT 440 Statistical Data Management Assignment, University of Illinois, USA. Statistical analysis of the University of Illinois football team's record

Instructions - Please name the data file 'illinifb18.dat'. No email, hardcopy, or late submissions will be accepted.

Submitting your work to Compass 2g - You are to submit two (and only two) files for your homework submission:

  • Your SAS program file, which should be saved as midterm_YourNetID.sas. All program statements and code should be included in one program file.
  • Your Report file, which should be saved as midterm_YourNetID.pdf. First, use ODS to send your results to a Rich Text Format (RTF) file. Include all relevant output to address the exercises. Do not include output for every execution of your SAS program. After you have included comments and your own responses to the open-ended questions, save and print as a PDF file.

You have an unlimited number of submissions, but only the last one containing both of these two files will be viewed and graded. Homework submissions must always come as a pair of files, as described above. To submit, click on the title of the assignment (e.g., Midterm) in Compass, attach the two files one at a time (no .zip files), then click Submit. Do not click Save Draft.

Midterm Grading: The grading rubric for the final project is summarized by five criteria, each worth 10 points:

  • Methodology of data preparation
  • Validation and cleaning
  • Correctness of interpretation of results and output
  • SAS file/programming (similar to HW expectation)
  • Report organization and presentation (similar to HW expectation)

Maximum total points: 50.
