Create another linear regression model

Reference no: EM132870241

Project Overview

This project is the first part of a two-part series. In the first part, you will blend and format data and deal with outliers.

For the second part, you will use your cleaned-up dataset to create another linear regression model. The difference this time is that you will have to choose which variables are the most important for the model, using the techniques taught in the Selecting Predictor Variables section.

Scenario
Pawdacity is a leading pet store chain in Wyoming with 13 stores throughout the state. This year, Pawdacity would like to expand and open a 14th store. Your manager has asked you to perform an analysis to recommend the city for Pawdacity's newest store, based on predicted yearly sales.

How Do I Complete this Project?

This project uses skills learned throughout the "Data Preparation" lessons. To complete this project:
• Go through the course.
• Apply the skills learned in the course to solve the business problem given in the project details.
• Use our guidelines and rubric to help build your project.
• When you're ready, submit it to us for review using the submission template found in the supporting materials section.
Skills Required
In order to complete this project, you must be able to:
• Understand different data types. Review Lesson 1 Understanding Data
• Deal with a variety of data issues. Review Lesson 2 Data Issues
• Format data appropriately. Review Lesson 3 Data Formatting
• Blend data together using joins and unions. Review Lesson 4 Data Blending

The Business Problem


Your first step in predicting yearly sales is to format and blend together data from different datasets and deal with outliers.
Your manager has given you the following information to work with:

1. The monthly sales data for all of the Pawdacity stores for the year 2010.

2. NAICS data on the most current sales of all competitor stores where total sales is equal to 12 months of sales.

3. A partially parsed data file that can be used for population numbers.

4. Demographic data (Households with individuals under 18, Land Area, Population Density, and Total Families) for each city and county in the state of Wyoming. For people who are unfamiliar with the US city system: a state contains counties, and each county contains one or more cities.
Map of Wyoming Counties

Steps to Success

Step 1: Business and Data Understanding
Your project should include a description of the key business decisions that need to be made.

Step 2: Building the Training Set
To build the model properly and select predictor variables, create a dataset with the following columns:
• City
• 2010 Census Population
• Total Pawdacity Sales
• Households with Under 18
• Land Area
• Population Density
• Total Families
This dataset will be your training set to help you build a regression model in order to predict sales in the Practice Project in the next lesson. Every row should have sales data because we're trying to predict sales.
Notes
You should consolidate the data at the city level, not at the store level. The demographic data is only available at the city level, so a store-level dataset will not be sufficient to complete this analysis.
We simply need to focus on cleaning up and blending the data together in this step.
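As a rough illustration of the city-level consolidation described in the notes above, the following sketch sums store-level monthly sales into one total per city and keeps only cities that also have demographic data. The record layout, the city names, and every number are made up for the example; the real files will need their own parsing first.

```python
from collections import defaultdict

# Hypothetical store-level records: (city, month, sales). All values are made up.
store_sales = [
    ("City A", "January", 100.0),
    ("City A", "February", 150.0),
    ("City A", "January", 50.0),   # a second store in the same city
    ("City B", "January", 200.0),
]

# Hypothetical city-level demographics keyed by city name.
demographics = {
    "City A": {"Total Families": 10},
    "City B": {"Total Families": 20},
    "City C": {"Total Families": 30},  # no store here, so no sales data
}

def total_sales_by_city(records):
    """Sum every store's monthly sales into a single total per city."""
    totals = defaultdict(float)
    for city, _month, sales in records:
        totals[city] += sales
    return dict(totals)

def build_training_rows(sales_by_city, demo_by_city):
    """Inner join on city: keep only cities with both sales and demographics."""
    rows = []
    for city in sorted(sales_by_city):
        if city in demo_by_city:
            row = {"City": city, "Total Pawdacity Sales": sales_by_city[city]}
            row.update(demo_by_city[city])
            rows.append(row)
    return rows

rows = build_training_rows(total_sales_by_city(store_sales), demographics)
for row in rows:
    print(row)
```

An inner join is used here because every row of the training set should have sales data; a city without a store ("City C" in the toy data) is dropped.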
If you've done everything correctly, the sum for each of the above columns should be:
• Census Population: 213,862
• Total Pawdacity Sales: 3,773,304
• Households with Under 18: 34,064
• Land Area: 33,071
• Population Density: 63
• Total Families: 62,653
with 11 rows of data
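The column sums above work as a checksum for the blended dataset. A minimal sketch of such a check follows; the function name `check_training_set` and the row format (a list of dictionaries keyed by column name) are assumptions for illustration, and totals are rounded to whole numbers on the assumption that the published sums are rounded.

```python
# Expected totals, taken from the list above.
EXPECTED_SUMS = {
    "2010 Census Population": 213_862,
    "Total Pawdacity Sales": 3_773_304,
    "Households with Under 18": 34_064,
    "Land Area": 33_071,
    "Population Density": 63,
    "Total Families": 62_653,
}
EXPECTED_ROWS = 11

def check_training_set(rows, expected_sums=EXPECTED_SUMS, expected_rows=EXPECTED_ROWS):
    """Return a list of mismatch messages; an empty list means every check passed."""
    problems = []
    if len(rows) != expected_rows:
        problems.append(f"row count: got {len(rows)}, expected {expected_rows}")
    for column, expected in expected_sums.items():
        # Assumption: the published sums are rounded to the nearest whole number.
        total = round(sum(float(row[column]) for row in rows))
        if total != expected:
            problems.append(f"{column}: got {total}, expected {expected}")
    return problems

# Toy usage with made-up values and a made-up expectation, not the project data.
toy_rows = [{"x": 1.4}, {"x": 2.0}]
print(check_training_set(toy_rows, expected_sums={"x": 3}, expected_rows=2))
```

Calling `check_training_set(rows)` with the default arguments compares a finished training set against the sums and row count listed above.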
For Alteryx users:
• Use the Auto Field tool to quickly convert your data fields into the appropriate data types for analysis.
• Research these three specific formulas to help you get rid of unwanted characters in the Formula tool: ReplaceFirst, Left, FindString
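For readers working outside Alteryx, the same kind of cleanup can be sketched in Python. The helpers below are rough analogues of the three formulas, not Alteryx code, and they assume that FindString reports a zero-based position and -1 when the target is absent, as Python's `str.find` does. The raw strings are invented examples; the actual file may contain different debris.

```python
def replace_first(text, target, replacement):
    """Replace only the first occurrence of target (analogue of ReplaceFirst)."""
    return text.replace(target, replacement, 1)

def left(text, length):
    """Return the first `length` characters (analogue of Left)."""
    return text[:length]

def find_string(text, target):
    """Zero-based position of target, or -1 if absent (analogue of FindString)."""
    return text.find(target)

# Made-up examples of partially parsed values.
raw_city = "Exampletown?Wyoming"
raw_population = "12,345<td>"

# Keep everything to the left of the unwanted marker.
city = left(raw_city, find_string(raw_city, "?"))

# Cut off the trailing tag, then drop the thousands separator before converting.
digits = left(raw_population, find_string(raw_population, "<"))
population = int(replace_first(digits, ",", ""))

print(city, population)
```

Note that when the marker is missing, `find_string` returns -1 and `left` would then drop the last character, so real cleanup code should check for -1 first.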

Step 3: Dealing with Outliers
Once you have created the dataset, look for outliers and decide how to deal with them. Use the IQR method to determine whether any city is an outlier for each of the variables, then justify which city with at least one outlier value should be removed.

IQR Steps

To calculate the upper fence and the lower fence, here are the exact steps:

1. Calculate the 1st quartile (Q1) and the 3rd quartile (Q3) of the dataset. You can use the Excel function QUARTILE.INC or QUARTILE.EXC.

2. Calculate the interquartile range: IQR = Q3 - Q1

3. Add 1.5 × IQR to Q3 to get the upper fence: Upper Fence = Q3 + 1.5 × IQR

4. Subtract 1.5 × IQR from Q1 to get the lower fence: Lower Fence = Q1 - 1.5 × IQR

5. Values above the Upper Fence and values below the Lower Fence are outliers.
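The five steps above can be sketched in Python with the standard library's `statistics.quantiles`. Its "inclusive" and "exclusive" methods are used here as counterparts to QUARTILE.INC and QUARTILE.EXC; that correspondence is an assumption worth verifying against your own spreadsheet. The sample values are made up and are not project data.

```python
from statistics import quantiles

def iqr_fences(values, method="inclusive"):
    """Return (lower_fence, upper_fence) using the 1.5 x IQR rule."""
    q1, _median, q3 = quantiles(values, n=4, method=method)  # step 1
    iqr = q3 - q1                                            # step 2
    upper = q3 + 1.5 * iqr                                   # step 3
    lower = q1 - 1.5 * iqr                                   # step 4
    return lower, upper

def find_outliers(values, method="inclusive"):
    """Step 5: values below the lower fence or above the upper fence."""
    lower, upper = iqr_fences(values, method)
    return [v for v in values if v < lower or v > upper]

# Made-up sample with one obviously extreme value.
sample = [10, 12, 13, 14, 15, 16, 18, 19, 21, 22, 90]

print(iqr_fences(sample))
print(find_outliers(sample))
```

In the project, the same functions would be applied to each numeric column in turn, and a city would be flagged if it is an outlier in at least one column.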

Attachment:- Project Instructions.rar
