Here is a procedure you can use when developing a regression model.
Regression Methodology
1) Collect data. This will be the most time-consuming part of your job. Be sure to consider dummy variables and time. Be selective (remember: garbage in = garbage out).
Sources:
Interviews, focus groups
Market studies
Market experiments (can be very valuable)
Government data
Industry trade association data
2) Check for problems with covariance: different explanatory variables may be correlated and fight each other in the model. For example, GDP vs. housing starts vs. income. Pick the best of the variables that give you the same information. You can check covariance in Excel.
3) Regress your data first as a linear model.
4) Check the t-ratios and p-values of your coefficients. Make sure the t-ratios are at least 1.75-2.0 (the cutoff is up to you; decide on your desired confidence level). The p-values tell you exactly what the confidence level is.
5) Check R2. This tells you the percent of variability you have explained with your model. If it is very low, try adding more explanatory variables and/or a transformation. For multi-variable models you can also use the corrected (adjusted) R2.
6) Check the p-value on F for the regression. This tells you the confidence that your model as a whole is statistically valid. Note the standard error: the smaller it is, the less error you will have in your forecast. The regression SE is the standard deviation of the residuals. (Steps 2-6 are sketched in code after this list.)
7) Weed out the bad variables and develop a model.
8) Try a transformation and see if your diagnostics improve. (A detailed residual analysis is necessary to completely analyze your model; refer to a statistics text for this. Use diagnostic plots such as standardized residuals vs. fitted values and normal scores. Check for autocorrelation with time series. Check for heteroscedasticity (non-constant variance of the residuals), which can often be reduced with a proper transformation. Check normal probability plots. Check outliers. We will not go into this level of detail in this course.)
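For those working outside Excel, here is a minimal sketch of steps 2-6, assuming Python with pandas and statsmodels; the file name demand_data.csv and the column names are hypothetical placeholders, not part of the course example.

```python
# Rough sketch of steps 2-6 done outside Excel, using pandas and statsmodels.
# The file name and column names (demand, gdp, housing_starts, income) are
# illustrative only -- substitute your own collected data.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("demand_data.csv")                      # step 1: your collected data

# Step 2: see how candidate explanatory variables move together.
print(df[["gdp", "housing_starts", "income"]].corr())

# Step 3: first pass as a linear model.
fit = smf.ols("demand ~ gdp + housing_starts + income", data=df).fit()

# Steps 4-6: coefficient t-ratios and p-values, R-squared, F-test, standard error.
print(fit.summary())                                     # full table, like Excel's SUMMARY OUTPUT
print(fit.tvalues)                                       # want roughly |t| > 1.75-2.0
print(fit.pvalues)
print(fit.rsquared, fit.rsquared_adj)                    # R-squared and corrected R-squared
print(fit.f_pvalue)                                      # p-value on F for the whole regression
print(fit.mse_resid ** 0.5)                              # standard error of the regression
```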
CAVEATS:
1) R2 will increase as you add more explanatory variables; this is not necessarily a better model. You are not trying to maximize R2. You are trying to develop a simple, accurate model that predicts demand.
2) Collecting more bad data will only inflate your t-ratios; it will not make the model more valid.
3) Avoid collecting too many explanatory variables; they might mask the effect of the important ones.
Sample Regression Model
Here is a detailed example of the development of a regression model
You open a jewelry store in Naperville. You make most of your money from diamond sales. You want to better understand and forecast your diamond demand:
- Data collection: What variables do you think might influence your demand for diamonds? You develop the following list:
income
price
wealth
GDP
stock market
price of diamonds at WalMart
demographics (age)
wedding stats
time
advertising
season (high in holiday time)
You opened your store on Jan 1, 2002 and have gathered the following data.
Demand (carats/mo) | Price ($/carat) | Pwalmart ($/carat) | adv $ ($/mo) | Ave Age (Naperville) | Income ($/yr) | GDP ($B) | Dow Jones | month
47 | 1500 | 1200 | 200 | 50 | 35000 | 1155 | 11000 | Jan
51 | 1500 | 1000 | 200 | 49 | 35350 | 1260 | 12000 | Feb
40 | 1500 | 800 | 250 | 48.02 | 35703.5 | 1207.5 | 11500 | Mar
121 | 900 | 1500 | 225 | 47.0596 | 36060.54 | 1102.5 | 10500 | Apr
127 | 900 | 1600 | 180 | 46.11841 | 36421.14 | 1080.45 | 10290 | May
121 | 900 | 1400 | 120 | 45.19604 | 36785.35 | 1058.841 | 10084.2 | Jun
117 | 900 | 1000 | 500 | 44.29212 | 37153.21 | 1037.664 | 9882.516 | Jul
73 | 1200 | 650 | 126 | 43.40628 | 37524.74 | 1016.911 | 9684.866 | Aug
89 | 1200 | 890 | 600 | 42.53815 | 37899.98 | 996.5727 | 9491.168 | Sep
77 | 1200 | 658 | 140 | 41.68739 | 38278.98 | 976.6412 | 9301.345 | Oct
92 | 1200 | 986 | 220 | 40.85364 | 38661.77 | 957.1084 | 9115.318 | Nov
92 | 1200 | 659 | 120 | 40.03657 | 39048.39 | 937.9662 | 8933.012 | Dec
62 | 1500 | 888 | 50 | 39.23584 | 39438.88 | 919.2069 | 8754.352 | Jan
77 | 1500 | 1020 | 125 | 38.45112 | 39833.26 | 900.8228 | 8579.264 | Feb
55 | 1500 | 325 | 562 | 37.6821 | 40231.6 | 882.8063 | 8407.679 | Mar
88 | 1500 | 1500 | 300 | 36.92846 | 40633.91 | 865.1502 | 8239.526 | Apr
131 | 1100 | 1500 | 251 | 36.18989 | 41040.25 | 847.8472 | 8074.735 | May
112 | 1100 | 800 | 401 | 35.46609 | 41450.66 | 830.8902 | 7913.24 | Jun
115 | 1100 | 800 | 200 | 34.75677 | 41865.16 | 814.2724 | 7754.976 | Jul
131 | 1100 | 1350 | 150 | 34.06163 | 42283.81 | 797.987 | 7599.876 | Aug
140 | 1100 | 1350 | 687 | 33.3804 | 42706.65 | 782.0272 | 7447.879 | Sep
119 | 1100 | 800 | 200 | 32.71279 | 43133.72 | 766.3867 | 7298.921 | Oct
128 | 1100 | 825 | 500 | 32.05853 | 43565.06 | 751.059 | 7152.943 | Nov
148 | 950 | 789 | 140 | 31.41736 | 44000.71 | 736.0378 | 7009.884 | Dec
156 | 950 | 1200 | 263 | 30.78902 | 44440.71 | 721.317 | 6869.686 | Jan
166 | 950 | 1200 | 241 | 30.17324 | 44885.12 | 706.8907 | 6732.292 | Feb
175 | 950 | 1600 | 523 | 29.56977 | 45333.97 | 692.7529 | 6597.646 | Mar
146 | 950 | 600 | 526 | 28.97838 | 45787.31 | 678.8978 | 6465.694 | Apr
126 | 1200 | 600 | 641 | 28.39881 | 46245.18 | 665.3199 | 6336.38 | May
94 | 1500 | 600 | 320 | 27.83083 | 46707.64 | 652.0135 | 6209.652 | Jun
118 | 1300 | 600 | 265 | 27.27422 | 47174.71 | 638.9732 | 6085.459 | Jul
122 | 1250 | 600 | 145 | 26.72873 | 47646.46 | 626.1937 | 5963.75 | Aug
131 | 1250 | 700 | 285 | 26.19416 | 48122.92 | 613.6699 | 5844.475 | Sep
138 | 1200 | 700 | 521 | 25.67027 | 48604.15 | 601.3965 | 5727.585 | Oct
You select the following variables:
Price, Pwm, adv $, Dec, Feb, mos in business, GDP, Dow Jones, Ave Age, Income
This is your regression output:
SUMMARY OUTPUT

Regression Statistics
Multiple R | 0.999662
R Square | 0.999324
Adjusted R Square | 0.99903
Standard Error | 1.080827
Observations | 34

ANOVA
 | df | SS | MS | F
Regression | 10 | 39723.51 | 3972.351 | 3400.44
Residual | 23 | 26.86831 | 1.168188 |
Total | 33 | 39750.38 | |

 | Coefficients | Standard Error | t Stat | P-value | Upper 95.0%
Intercept | 218.594 | 279.748 | 0.781396 | 0.442539 | 797.2961
Price | -0.09904 | 0.001088 | -91.0638 | 6.28E-31 | -0.09679
Pwm | 0.030112 | 0.000706 | 42.67671 | 2.1E-23 | 0.031571
adv $ | 0.01156 | 0.001222 | 9.457305 | 2.17E-09 | 0.014089
Dec | 9.288621 | 0.887907 | 10.46125 | 3.24E-10 | 11.12539
Feb | 8.619078 | 0.763213 | 11.29314 | 7.35E-11 | 10.1979
mos in business | 5.951601 | 8.343354 | 0.713334 | 0.482815 | 23.21112
GDP | -0.35999 | 12383.11 | -2.9E-05 | 0.999977 | 25616.01
Dow Jones | 0.03772 | 1300.226 | 2.9E-05 | 0.999977 | 2689.757
Ave Age | 2.211563 | 3.854305 | 0.57379 | 0.571679 | 10.18479
Income | -0.00506 | 0.013574 | -0.37298 | 0.712582 | 0.023018
What do you think?
GDP, Dow Jones, and Income look insignificant, but is that reasonable?
The problem is that they are fighting each other: they each carry similar information, so you only need one of them. Think about which makes the most sense, or test to see which one works best alone in the model.
You can check the covariance (how correlated the variables are with each other). Correlated explanatory variables are undesirable; each explanatory variable should give you new information.
In Excel, this is under Tools > Data Analysis > Covariance. Highlight the data you want to check.
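The same check can be sketched in code, assuming pandas and illustrative file/column names (not part of the Excel workbook); correlations are often easier to compare than raw covariances because they are unit-free.

```python
# Sketch of the multicollinearity check: covariance and correlation matrices.
# File and column names are illustrative placeholders for the monthly data above.
import pandas as pd

df = pd.read_csv("diamond_data.csv")
cols = ["Price", "Pwm", "adv", "Dec", "Feb", "mos", "GDP", "DowJones", "AveAge", "Income"]

print(df[cols].cov())    # same idea as Excel's Tools > Data Analysis > Covariance
print(df[cols].corr())   # unit-free correlations (between -1 and +1) are easier to compare
```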
Covariance table:

 | Price | Pwm | adv $ | Dec | Dow Jones | Ave Age | Income
Price | 43194 | | | | | |
Pwm | -26060 | 117538 | | | | |
adv $ | -4557 | -6724 | 30249 | | | |
Dec | -6 | -14 | -10 | 0 | | |
Feb | 12 | 10 | -10 | 0 | | |
mos in business | -263 | -1076 | 541 | 0 | | |
GDP | 6630 | 17638 | -9701 | -1 | 305052 | | -707947
Dow Jones | 63141 | 167984 | -92392 | -14 | 2905255 | |
Ave Age | 212 | 769 | -397 | 0 | 12203 | 52 |
Income | -102177 | -449231 | 222270 | 5 | -6742349 | -28900 | 16335056
The larger the magnitude, the greater the covariance. Income and Dow Jones are strongly correlated, as are Income and GDP, and GDP and Dow Jones. We only need one of these variables.
Let's keep only Income, as this is the most logical choice, and re-run the regression:
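If you were doing this outside Excel, the re-run might look like the sketch below (same statsmodels assumptions and illustrative column names as the earlier sketches); the Excel output follows.

```python
# Sketch: refit after dropping the redundant macro variables (GDP, Dow Jones).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("diamond_data.csv")   # illustrative file holding the monthly data above
reduced = smf.ols("Demand ~ Price + Pwm + adv + Dec + Feb + mos + AveAge + Income",
                  data=df).fit()
print(reduced.summary())               # compare t-ratios and adjusted R-squared with the full model
```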
SUMMARY OUTPUT

Regression Statistics
Multiple R | 0.999662
R Square | 0.999324
Adjusted R Square | 0.999108
Standard Error | 1.036802
Observations | 34

ANOVA
 | df | SS | MS | F
Regression | 8 | 39723.51 | 4965.439 | 4619.188
Residual | 25 | 26.87398 | 1.074959 |
Total | 33 | 39750.38 | |

 | Coefficients | Standard Error | t Stat | P-value | Upper 95.0%
Intercept | 214.2703 | 261.6642 | 0.818875 | 0.420595 | 753.1774
Price | -0.09905 | 0.001025 | -96.6054 | 1.08E-33 | -0.09694
Pwm | 0.030119 | 0.000669 | 45.02481 | 1.87E-25 | 0.031497
adv $ | 0.011552 | 0.001167 | 9.900353 | 3.92E-10 | 0.013955
Dec | 9.285746 | 0.85082 | 10.91388 | 5.34E-11 | 11.03804
Feb | 8.598333 | 0.674088 | 12.75551 | 1.92E-12 | 9.986641
mos in business | 5.805573 | 7.746753 | 0.74942 | 0.460598 | 21.7603
Ave Age | 2.122573 | 3.488305 | 0.608483 | 0.548357 | 9.306867
Income | -0.00483 | 0.012633 | -0.38262 | 0.705234 | 0.021185
Income still looks bad - check covariance again.
 | Price | Pwm | adv $ | Dec | Income
Price | 43194.2 | | | |
Pwm | -26060 | 117537.7 | | |
adv $ | -4556.53 | -6723.71 | 30249.34 | |
Dec | -6.40138 | -13.9689 | -9.96021 | 0.055363 |
Feb | 11.72145 | 9.870242 | -9.76384 | -0.00519 |
mos in business | -262.5 | -1075.62 | 541.0441 | 0.029412 |
Ave Age | 212.2758 | 769.0929 | -397.053 | -0.04747 | -28899.6
Income | -102177 | -449231 | 222270 | 4.870607 | 16335056
Income co-varies positively with advertising and negatively with price and age (worth investigating). Remove Income and re-run the regression:
SUMMARY OUTPUT

Regression Statistics
Multiple R | 0.99966
R Square | 0.99932
Adjusted R Square | 0.999137
Standard Error | 1.019641
Observations | 34

ANOVA
 | df | SS | MS | F
Regression | 7 | 39723.35 | 5674.764 | 5458.249
Residual | 26 | 27.03136 | 1.039668 |
Total | 33 | 39750.38 | |

 | Coefficients | Standard Error | t Stat | P-value | Upper 95.0%
Intercept | 114.302 | 14.08797 | 8.113449 | 1.36E-08 | 143.2603
Price | -0.09905 | 0.001008 | -98.2357 | 5.91E-35 | -0.09697
Pwm | 0.030188 | 0.000634 | 47.64864 | 7.83E-27 | 0.031491
adv $ | 0.011618 | 0.001135 | 10.23668 | 1.3E-10 | 0.013951
Dec | 9.328693 | 0.829424 | 11.2472 | 1.74E-11 | 11.0336
Feb | 8.649533 | 0.649737 | 13.31237 | 4.06E-13 | 9.985087
mos in business | 2.842606 | 0.208486 | 13.63453 | 2.35E-13 | 3.271155
Ave Age | 0.792431 | 0.283334 | 2.796804 | 0.009581 | 1.374834
This looks good. You might question Age, since it doesn't make intuitive sense; you might investigate replacing Age with Income. This is the art. We decide to keep the model as is. This is our demand equation:
Demand = 114.302 - 0.09905*Price + 0.030188*Pwm + 0.011618*Adv + 9.329*Dec + 8.65*Feb + 2.84*Mos + 0.79*Age
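As a convenience, here is a small sketch of that equation written as a reusable function (coefficients copied from the output above, rounded as in the equation):

```python
# The fitted linear demand equation as a reusable function.
def demand(price, pwm, adv, dec, feb, mos, age):
    """Monthly demand in carats; dec/feb are 0/1 dummies, mos is months in business."""
    return (114.302 - 0.09905 * price + 0.030188 * pwm + 0.011618 * adv
            + 9.329 * dec + 8.65 * feb + 2.84 * mos + 0.79 * age)

# Quick check against the first data row (Jan 2002, mos = 1): about 46.6 vs. 47 observed.
print(demand(price=1500, pwm=1200, adv=200, dec=0, feb=0, mos=1, age=50))
```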
The above model looks good, but perhaps a transformation would be even better. Let's try a log-linear model: take logs of all the original data and place them in a new table (shown below).
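A sketch of how that transformation and re-fit might be done programmatically (same pandas/statsmodels assumptions and illustrative file/column names as the earlier sketches); the table of logged values follows.

```python
# Sketch: build the log-log version of the model by taking natural logs.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("diamond_data.csv")           # illustrative file with the original data
logged = pd.DataFrame({
    "lnDemand": np.log(df["Demand"]),
    "lnPrice":  np.log(df["Price"]),
    "lnPwm":    np.log(df["Pwm"]),
    "lnAdv":    np.log(df["adv"]),
    "lnMos":    np.log(df["mos"]),
    "lnAge":    np.log(df["AveAge"]),
    "lnInc":    np.log(df["Income"]),
    "Dec":      df["Dec"],                      # dummies stay as 0/1, not logged
    "Feb":      df["Feb"],
})
loglinear = smf.ols("lnDemand ~ lnPrice + lnPwm + lnAdv + Dec + Feb + lnMos + lnAge + lnInc",
                    data=logged).fit()
print(loglinear.summary())
```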
ln(Demand) | ln(Price) | ln(Pwm) | ln(mos) | ln(age) | ln(Inc) | ln(adv) | Dec | Feb
3.85 | 7.31 | 7.09 | 0.00 | 3.91 | 10.46 | 5.30 | 0 | 0
3.93 | 7.31 | 6.91 | 0.69 | 3.89 | 10.47 | 5.30 | 0 | 1
3.69 | 7.31 | 6.68 | 1.10 | 3.87 | 10.48 | 5.52 | 0 | 0
4.80 | 6.80 | 7.31 | 1.39 | 3.85 | 10.49 | 5.42 | 0 | 0
4.84 | 6.80 | 7.38 | 1.61 | 3.83 | 10.50 | 5.19 | 0 | 0
4.80 | 6.80 | 7.24 | 1.79 | 3.81 | 10.51 | 4.79 | 0 | 0
4.76 | 6.80 | 6.91 | 1.95 | 3.79 | 10.52 | 6.21 | 0 | 0
4.29 | 7.09 | 6.48 | 2.08 | 3.77 | 10.53 | 4.84 | 0 | 0
4.49 | 7.09 | 6.79 | 2.20 | 3.75 | 10.54 | 6.40 | 0 | 0
4.34 | 7.09 | 6.49 | 2.30 | 3.73 | 10.55 | 4.94 | 0 | 0
4.52 | 7.09 | 6.89 | 2.40 | 3.71 | 10.56 | 5.39 | 0 | 0
4.52 | 7.09 | 6.49 | 2.48 | 3.69 | 10.57 | 4.79 | 1 | 0
4.13 | 7.31 | 6.79 | 2.56 | 3.67 | 10.58 | 3.91 | 0 | 0
4.34 | 7.31 | 6.93 | 2.64 | 3.65 | 10.59 | 4.83 | 0 | 1
4.01 | 7.31 | 5.78 | 2.71 | 3.63 | 10.60 | 6.33 | 0 | 0
4.48 | 7.31 | 7.31 | 2.77 | 3.61 | 10.61 | 5.70 | 0 | 0
4.88 | 7.00 | 7.31 | 2.83 | 3.59 | 10.62 | 5.53 | 0 | 0
4.72 | 7.00 | 6.68 | 2.89 | 3.57 | 10.63 | 5.99 | 0 | 0
4.74 | 7.00 | 6.68 | 2.94 | 3.55 | 10.64 | 5.30 | 0 | 0
4.88 | 7.00 | 7.21 | 3.00 | 3.53 | 10.65 | 5.01 | 0 | 0
4.94 | 7.00 | 7.21 | 3.04 | 3.51 | 10.66 | 6.53 | 0 | 0
4.78 | 7.00 | 6.68 | 3.09 | 3.49 | 10.67 | 5.30 | 0 | 0
4.85 | 7.00 | 6.72 | 3.14 | 3.47 | 10.68 | 6.21 | 0 | 0
5.00 | 6.86 | 6.67 | 3.18 | 3.45 | 10.69 | 4.94 | 1 | 0
5.05 | 6.86 | 7.09 | 3.22 | 3.43 | 10.70 | 5.57 | 0 | 0
5.11 | 6.86 | 7.09 | 3.26 | 3.41 | 10.71 | 5.48 | 0 | 1
5.16 | 6.86 | 7.38 | 3.30 | 3.39 | 10.72 | 6.26 | 0 | 0
4.98 | 6.86 | 6.40 | 3.33 | 3.37 | 10.73 | 6.27 | 0 | 0
4.84 | 7.09 | 6.40 | 3.37 | 3.35 | 10.74 | 6.46 | 0 | 0
4.54 | 7.31 | 6.40 | 3.40 | 3.33 | 10.75 | 5.77 | 0 | 0
4.77 | 7.17 | 6.40 | 3.43 | 3.31 | 10.76 | 5.58 | 0 | 0
4.80 | 7.13 | 6.40 | 3.47 | 3.29 | 10.77 | 4.98 | 0 | 0
4.88 | 7.13 | 6.55 | 3.50 | 3.27 | 10.78 | 5.65 | 0 | 0
4.93 | 7.09 | 6.55 | 3.53 | 3.25 | 10.79 | 6.26 | 0 | 0
SUMMARY OUTPUT

Regression Statistics
Multiple R | 0.986941
R Square | 0.974052
Adjusted R Square | 0.965749
Standard Error | 0.069079
Observations | 34

ANOVA
 | df | SS | MS
Regression | 8 | 4.478337 | 0.559792
Residual | 25 | 0.119299 | 0.004772
Total | 33 | 4.597636 |

 | Coefficients | Standard Error | t Stat | Lower 95.0% | Upper 95.0%
Intercept | 27.29298 | 0 | 65535 | 27.29298 | 27.29298
ln(Price) | -1.21469 | 0.085059 | -14.2805 | -1.38987 | -1.0395
ln(Pwm) | 0.258325 | 0.040836 | 6.325837 | 0.17422 | 0.342429
ln(adv) | 0.023881 | 0.022677 | 1.053086 | -0.02282 | 0.070585
Dec | 0.060605 | 0.055786 | 1.086398 | -0.05429 | 0.175498
Feb | 0.01679 | 0.044114 | 0.380596 | -0.07406 | 0.107644
mos in business | 0.115075 | 0.036956 | 3.113827 | 0.038963 | 0.191188
Ave Age | -1.30596 | 0 | 65535 | -1.30596 | -1.30596
Income | -1.09152 | 0 | 65535 | -1.09152 | -1.09152
Remove Income:
SUMMARY OUTPUT

Regression Statistics
Multiple R | 0.986941
R Square | 0.974052
Adjusted R Square | 0.967066
Standard Error | 0.067738
Observations | 34

ANOVA
 | df | SS | MS
Regression | 7 | 4.478337 | 0.639762
Residual | 26 | 0.119299 | 0.004588
Total | 33 | 4.597636 |

 | Coefficients | Standard Error | t Stat | Lower 95.0% | Upper 95.0%
Intercept | 13.76921 | 1.097963 | 12.54069 | 11.51231 | 16.0261
ln(Price) | -1.21469 | 0.083407 | -14.5633 | -1.38613 | -1.04324
ln(Pwm) | 0.258325 | 0.040043 | 6.451113 | 0.176014 | 0.340635
ln(adv) | 0.023881 | 0.022237 | 1.073942 | -0.02183 | 0.069589
Dec | 0.060605 | 0.054702 | 1.107913 | -0.05184 | 0.173047
Feb | 0.01679 | 0.043257 | 0.388134 | -0.07213 | 0.105707
mos in business | 0.115075 | 0.036239 | 3.175493 | 0.040586 | 0.189565
Ave Age | -0.76836 | 0.155205 | -4.95063 | -1.08739 | -0.44933
Remove Dec, Feb, and adv (leaving ln(Price), ln(Pwm), and mos in business):
SUMMARY OUTPUT

Regression Statistics
Multiple R | 0.971709
R Square | 0.944219
Adjusted R Square | 0.938641
Standard Error | 0.092459
Observations | 34

ANOVA
 | df | SS | MS
Regression | 3 | 4.341174 | 1.447058
Residual | 30 | 0.256462 | 0.008549
Total | 33 | 4.597636 |

 | Coefficients | Standard Error | t Stat | Lower 95.0% | Upper 95.0%
Intercept | 10.67027 | 0.984992 | 10.83285 | 8.65865 | 12.68189
ln(Price) | -1.18737 | 0.105218 | -11.2848 | -1.40226 | -0.97249
ln(Pwm) | 0.237193 | 0.051394 | 4.615168 | 0.132232 | 0.342154
mos in business | 0.282874 | 0.020884 | 13.54498 | 0.240223 | 0.325524
No improvement; in fact, I had to remove variables that seem intuitive. I will stick with my first linear demand model (in reality I might also investigate squaring some of the variables):
Demand = 114.302 - 0.09905*Price + 0.030188*Pwm + 0.011618*Adv + 9.329*Dec + 8.65*Feb + 2.84*Mos + 0.79*Age
Now we can forecast our sales. These are the types of questions I can answer with my model:
- What type of demand loss should I expect if I raise my prices?
- What level of demand increase should I anticipate if I spend an additional $100 on advertising?
- How will my demand change as incomes rise? As my customers age?
- If I want to open a similar store in Westmont, I can forecast demand given the following:
Price at Walmart = $1200/carat
Age = 45
Advertising = $500/month
My price = $1000/carat
What should I expect to sell in the first 3 months of the year (Jan-Mar) if I open the store in January?
Let's answer this last one:
I use the demand equation to forecast sales in each month. Check this yourself in Excel and see if you can replicate these results (a sketch follows the table):
Inputs: Pwm = 1200, Age = 45, Adv = 500, Price = 1000

 | Jan | Feb | Mar | Total | Ave
Forecast demand (carats) | 95.6766 | 107.1666 | 101.3566 | 304.1998 | 101.3999
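Here is a sketch that replicates those numbers with plain arithmetic from the demand equation; any spreadsheet or language gives the same result.

```python
# Replicate the Westmont forecast for Jan-Mar using the linear demand equation.
def demand(price, pwm, adv, dec, feb, mos, age):
    return (114.302 - 0.09905 * price + 0.030188 * pwm + 0.011618 * adv
            + 9.329 * dec + 8.65 * feb + 2.84 * mos + 0.79 * age)

months = [("Jan", 1, 0, 0), ("Feb", 2, 0, 1), ("Mar", 3, 0, 0)]   # (name, mos, Dec dummy, Feb dummy)
forecasts = [demand(price=1000, pwm=1200, adv=500, dec=d, feb=f, mos=m, age=45)
             for _, m, d, f in months]

for (name, _, _, _), q in zip(months, forecasts):
    print(name, round(q, 4))                   # 95.6766, 107.1666, 101.3566
print("Total", round(sum(forecasts), 4))       # 304.1998
print("Average", round(sum(forecasts) / 3, 4)) # 101.3999
```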