Create linear regression model

Reference no: EM132854527

Project Overview

This project is the first part of a two-part series. In the first part, you will blend and format data and deal with outliers.

For the second part, you will use your cleaned up dataset to create another linear regression model. The difference this time is that you will have to choose which variable(s) are the most important for the model using new techniques learned in the Selecting Predictor Variables section.

Scenario
Pawdacity is a leading pet store chain in Wyoming with 13 stores throughout the state. This year, Pawdacity would like to expand and open a 14th store. Your manager has asked you to perform an analysis to recommend the city for Pawdacity's newest store, based on predicted yearly sales.

How Do I Complete this Project?

This project uses skills learned throughout the "Data Preparation" lessons. To complete this project:
• Go through the course.
• Apply the skills learned in the course to solve the business problem given in the project details.
• Use our guidelines and rubric to help build your project.
• When you're ready, submit it to us for review using the submission template found in the supporting materials section.
Skills Required
In order to complete this project, you must be able to:
• Understand different data types. Review Lesson 1: Understanding Data.
• Deal with a variety of data issues. Review Lesson 2: Data Issues.
• Format data appropriately. Review Lesson 3: Data Formatting.
• Blend data together using joins and unions. Review Lesson 4: Data Blending.

The Business Problem
Your first step in predicting yearly sales is to format and blend data from the different datasets and deal with outliers.
Your manager has given you the following information to work with:
1. The monthly sales data for all of the Pawdacity stores for the year 2010.
2. NAICS data on the most current sales of all competitor stores where total sales is equal to 12 months of sales.
3. A partially parsed data file that can be used for population numbers.
4. Demographic data (Households with individuals under 18, Land Area, Population Density, and Total Families) for each city and county in the state of Wyoming. For people who are unfamiliar with the US city system, a state contains counties, and each county contains one or more cities.

Steps to Success

Step 1: Business and Data Understanding
Your project should include a description of the key business decisions that need to be made.

Step 2: Building the Training Set
To properly build the model and select predictor variables, create a dataset with the following columns:
City
2010 Census Population
Total Pawdacity Sales
Households with Under 18
Land Area
Population Density
Total Families
This dataset will be your training set to help you build a regression model in order to predict sales in the Practice Project in the next lesson. Every row should have sales data because we're trying to predict sales.
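As a rough sketch of the blending in this step, the Python below totals monthly sales per city and joins the result to population and demographic lookups keyed by city. The city names, figures, and the `build_training_set` helper are all made up for illustration and are not taken from the project files; cities without sales data are dropped so that every row has a value to predict.

```python
# Illustrative records only; real values come from the project data files.
monthly_sales = [
    {"City": "Alpha", "Month": "January", "Sales": 100},
    {"City": "Alpha", "Month": "February", "Sales": 120},
    {"City": "Beta", "Month": "January", "Sales": 80},
    {"City": "Beta", "Month": "February", "Sales": 90},
]
population = {"Alpha": 5000, "Beta": 3000, "Gamma": 7000}
demographics = {
    "Alpha": {"Households with Under 18": 700, "Land Area": 12.5,
              "Population Density": 400.0, "Total Families": 1300},
    "Beta": {"Households with Under 18": 400, "Land Area": 9.0,
             "Population Density": 333.3, "Total Families": 800},
    "Gamma": {"Households with Under 18": 950, "Land Area": 20.0,
              "Population Density": 350.0, "Total Families": 1900},
}


def build_training_set(monthly_sales, population, demographics):
    """Sum sales per city, then inner-join the other sources on City."""
    totals = {}
    for row in monthly_sales:
        totals[row["City"]] = totals.get(row["City"], 0) + row["Sales"]

    rows = []
    for city, total in sorted(totals.items()):
        # Skip cities missing from either lookup (inner join).
        if city not in population or city not in demographics:
            continue
        rows.append({
            "City": city,
            "2010 Census Population": population[city],
            "Total Pawdacity Sales": total,
            **demographics[city],
        })
    return rows


training_set = build_training_set(monthly_sales, population, demographics)
```

Because the loop starts from the sales totals, a city such as the hypothetical "Gamma", which has demographic data but no sales, never reaches the training set.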

Step 3: Dealing with Outliers
Once you have created the dataset, look for outliers and decide how to deal with them. Use the IQR method to determine whether any city has an outlier value for each of the variables, then justify which city with at least one outlier value should be removed.
IQR Steps
To calculate the upper fence and the lower fence, here are the exact steps:
1. Calculate the 1st quartile (Q1) and 3rd quartile (Q3) of the dataset. You can use the Excel function QUARTILE.INC or QUARTILE.EXC.
2. Calculate the interquartile range: IQR = Q3 - Q1
3. Add 1.5 × IQR to Q3 to get the upper fence: Upper Fence = Q3 + 1.5 × IQR
4. Subtract 1.5 × IQR from Q1 to get the lower fence: Lower Fence = Q1 - 1.5 × IQR
5. Values above the Upper Fence or below the Lower Fence are outliers.
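The steps above can be sketched in Python using the standard library's `statistics.quantiles`. Its `method="inclusive"` option should correspond to Excel's QUARTILE.INC and `method="exclusive"` to QUARTILE.EXC, though it is worth checking against a spreadsheet on your own data. The helper names and the sample values are made up for illustration.

```python
from statistics import quantiles


def iqr_fences(values, method="inclusive"):
    """Return (lower_fence, upper_fence) using the 1.5 * IQR rule."""
    # n=4 yields the three quartile cut points: Q1, median, Q3.
    q1, _median, q3 = quantiles(values, n=4, method=method)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, method="inclusive"):
    """Return the values that fall outside the fences."""
    lower, upper = iqr_fences(values, method)
    return [v for v in values if v < lower or v > upper]


# Illustrative values for a single variable, not project data.
sample = [10, 12, 14, 15, 18, 20, 22, 95]
lower_fence, upper_fence = iqr_fences(sample)
outliers = find_outliers(sample)
```

For this sample, Q1 is 13.5 and Q3 is 20.5, so the IQR is 7, the fences are 3 and 31, and 95 is the only outlier. Running the same check on each column of the training set shows which cities have at least one outlier value.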

Attachment:- Linear regression model.rar



