Provide a correlation matrix of these three variables

Assignment Help Basic Statistics
Reference no: EM13244776

Harvey Weinstein has employed you as a consultant to brief him on film performance in the Australian cinema industry over the years 1997-2007. You have obtained some data on film revenues and associated variables from the Motion Picture Distributors Association of Australia (MPDAA). The file ‘Film data.xls' contains data on the top 100 films at the Australian box office for each of the years 1997-2007. However, due to some missing data (on budget) the sample is reduced and only includes 992 films with complete information. The following table provides descriptions of the data provided in the ‘Film data.xls' file:

2221_Confidence interval for expected revenue.png

a) To get a sense of the data, provide two well labeled scatter-plots of 1) revenue vs. budget, and 2) revenue vs. screens. Also, provide a correlation matrix of these three variables

b) After speaking to an industry expert, they tell you that budget and screens are the most important determinants of film success at the box office (advertising is important too, the expert tells you, but you don't have data on that). Produce a regression with ‘revenue' as the dependent variable and ‘budget' and ‘screens' as the independent variables. Interpret the regression coefficients.

c) With your regression results of part b), provide residual plots vs. each independent variable. Also, provide a histogram of the (standardised) residuals. Comment on the nature of the residuals based upon these three plots.

d) You show your results and plots to your econometrics professor at university and he tells you that you may have an issue with non-normal errors caused by outliers. He suggests you transform the revenue, screens, and budget data into natural logarithms and re-estimate the equation, which you do. Provide results of the new regression and interpret the regression coefficients [Hint: d lnY/d lnX = (dY/dX)*(X/Y) = (dY/Y)/(dX/X)].

e) With your new regression results of part d), provide residual plots vs. each independent variable. Also, provide a histogram of the (standardised) residuals. Comment on the nature of the new residuals based upon these three plots.

f) You discuss the results with the industry expert again. They tell you that apart from screens and budget, films with A-list stars and films which are sequels generally earn more at the box office. Therefore you decide to include these dummy variables in your updated model from part d). Are these new variables individually significant in the updated model? Do their coefficients' signs support the expert's intuition?

g) You are about to prepare your final report when an old friend calls you on the phone. You tell your friend about the project because they go to film school and you suspect they might find it interesting. Your friend wonders whether genre and rating might be important as well. You have data on these variables but need to construct some more dummy variables. Treating ‘Other' and ‘G' as the base categories for ‘genre' and ‘rating', respectively; estimate the new model and provide a partial-F test to test whether the extra variables are worthwhile adding assuming 5% level of significance.

h) You decide to present the model you estimated in f) to Harvey. You ring him on the phone to request a meeting and let him know that you've completed the work. He asks one more request of you: to provide a point estimate of revenue for a film he is planning on releasing in Australian cinemas soon. The film has a budget of $200,000,000, will release on a maximum of 250 screen, features an A list actor, and is also a sequel. Calculate the prediction of how much revenue this film might earn given this information (Hint: be mindful of the (natural) log transformations you have made).

i) Before the final meeting with Harvey, you remember from your ECMT1010 course that it would be better to provide an estimate of the average value of revenue (for the given values of the independent variables) by presenting a confidence interval for the conditional mean (recall equation 13.12 in Black 2E/3E). Because you've been paying close attention in class, you realise the confidence intervals computed by KaddStat are not the ones you are after and the equation you learnt in ECMT1010 is only suitable for simple regression, and not multiple regression.

You pay another visit to your econometrics professor who is rushing out the door on the way to the faculty Christmas party. He knows you've been learning some matrix algebra so scribbles you down an equation in matrix form for the confidence interval for the conditional mean:

861_Confidence interval for expected revenue1.png

Where y^o = point prediction of revenue (which you found in part h); t α/2,n-k-1 = the t-value for a two tailed test with level of significance α and n-k-1 degrees freedom; σ^ is the estimated standard error of the estimate;

x0 is the (5x1) vector of values for the independent variables (i.e. the four given by Harvey - and the constant term); and X is the (992x5) matrix of all observations for the independent variables (including the constant).

Using the matrix multiplication (MMULT) and matrix inverse (MINV) functions in Excel, provide the following:

785_Confidence interval for expected revenue2.png

iii) What are the upper and lower bounds of a 95% confidence interval for expected revenue in Australian dollars?

Attachment:- Film-data.xls

Reference no: EM13244776

Questions Cloud

Social conduct-gender relations and human sexuality : This assignment is to analyse a primary source which is the vatsyana's kama sutra. Who produced the source? When, where, how, why, and for whom? What form does it take? What are its assumptions? What trends in thought does it engage? What language an..
Influences stock buyback programs : Companies that are smart and conservative about their buybacks are not just creating value by maximizing float reduction, but they are also signaling to investors that they are likely well managed in other ways
Explain the molar mass for kclo3 : Determine the volume of 02 (at STP) formed when 50.0g of KClO3 cedomposes according to the following reaction. The molar mass for KClO3 is 122.55g/mol. 2KClO3(s)-->2KCl(s)+3O2(g)
Explain contents of tese two syringes : A sytringe contains 589 mL of CO at 325K and 1.2 atm pressure. A second sytringe contains 473mL of n2 at 298K and 2.6atm. What is the final pressure if the contents of tese two syringes are injected into a 1.00L container at STP
Provide a correlation matrix of these three variables : To get a sense of the data, provide two well labeled scatter-plots of 1) revenue vs. budget, and 2) revenue vs. screens. Also, provide a correlation matrix of these three variables
What quantity would aprofit maximizing monopolist produce : Suppose now thatbefore making this product, the firm has to pay fixed costs of research anddevelopment equal to $3,000,000. How large does N need to be before theprofit-maximizing firm chooses to pay the fixed cost
What is the smallest value of the incident angle theta : A flat piece of glass covers the top of a vertical cylinder that is completely filled with water, What is the smallest value of the incident angle theta_a for which none of the ray refracts into the water
An analyst with river investment : David Wright CEA. An analyst with River Investment is considering buying a Montrose Cable Company corporate bond.
Explain what is the pressure of co : A mixture of .220 moles CO, .350 moles H2 and .640 moles He has a total pressure of 2.95 atm. What is the pressure of CO

Reviews

Write a Review

Basic Statistics Questions & Answers

  Find program was effective in reducing cholesterol levels

After six months each pilot's cholesterol level is checked again. At the 0.01 significance level, can we conclude that the program was effective in reducing cholesterol levels?

  Difference in mean number of months before raise granted

At the .05 level of significance, is there a difference in the mean number of months before a raise was granted among the four CPA firms?

  Find probability for familiar product by correct formula

Use the correct formula to find the probability that, out of 10 customers, exactly 5 are familiar with the product. Show your calculations or explain how you found the probability.

  Find probability that oven cause a core to be defective

If the oven gets hotter than 475 degrees F, the core is defective. What is the probability that the oven will cause a core to be defective?

  Explain confidence interval estimate of true bmi

Body mass index is measured as the ratio of weight in kilograms to height in meters squared. Generate a 95% confidence interval estimate of the true BMI.

  Find probability of randomly selecting a score less

A population of scores forms a normal distribution with a mean of 40 and a standard deviation of 12. What is the probability of randomly selecting a score less than x=34?

  Reject the null hypothesis of equal broken cookie rates

At a large commercial bakery, one of the measures used to evaluate employees is number of broken cookies per 1000 produced.  The quality control department wants to determine if there is a relationship between years of experience and the broken co..

  Probability mean number of diagnostic services is more

If you select a random sample of 200 children and observe the number of diagnostic services each of them receives, what is the probability that the mean number of diagnostic services is more than 1.72?

  Probability based on the one sample kolmogorov-smirnov test

Would you conclude that your data set is approximately normal? If so, what percentage of your data values lie:

  Estimating normal distribution with standard deviation

According to the retail electronics chain, adults in 40 to 50 year old age group, the mean amount spent per year on electronics is $1998.

  Computing probability values using random number table

The choices are to be designated by the letter a, b, c, d, and e. The professor wishes to use a random-number table to determine which letter choice should correspond to the correct answer for a question. Using the number correspondence 1 for a, 2 fo..

  Estimating linear programming

A company producing a standard line and a deluxe line of dishwasher has the following time requirements (in minutes) in departments where either model can be processed

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd