• Pick a topic that interests you, and ask the lecturer for approval. It can be any issue at all, so long as it is amenable to a non-trivial statistical analysis. For example, this could be a multiple regression with at least 3 variables that tries to predict a dependent variable. This handout has a few suggestions, but you can pick any topic so long as the lecturer agrees. Each student should do a different topic, or a different focus on similar topics.
• Typical Requirements
• The student shall investigate the subject, research the data, and record both the collection process and the actual data. From this, the student selects the variables they deem important for creating the regression model.
• The student should develop regression models from these variables and then test them to select the most accurate model, utilising the adjusted R² to determine which model and number of variables best explains the dependent variable. The model should be used to predict the dependent variable Y from actual data, and the accuracy of the prediction should be determined.
• A residual analysis should be used in this analysis as well as confidence interval estimation.
• An overall F-test will be used to test the significance of the overall model, that is, to determine whether there is a significant relationship between the dependent variable and the entire set of independent variables.
• Test the slopes of the multiple regression model via hypothesis testing, and determine the confidence interval of each slope.
• The students will also test individual portions of the regression model by determining the contribution of an independent variable to the regression model, calculating the contribution of one variable given that another is included, and vice versa. The group shall determine the coefficient of partial determination for the regression model with k independent variables. Within the process, collinearity should be tested for, ensuring that each independent variable in the resulting model has a VIF of less than 5.
• The regression model may utilise dummy variables if the group deems that categorical variables are important to accurately predicting the dependent variable.
• All Excel printouts deemed important to the report should be placed in the report near where they are referenced. Other printouts or calculations that add weight or justification to the report but are not referenced directly should be placed in the appendix, as should any other material that justifies the model. Please note that the appendix should not contain any material that does not support or pertain to the content of the report.
• The report should explain the reason for the model, what the model is trying to predict or forecast, and the limitations of the resulting model. It should state what data is being used and why it is important to the process, where the data came from, and the method of its collection. The appendix should contain the actual raw data collected and its source.
• The report should contain an executive summary, an introduction and the reason or need for this model. The body shall include the information on the data, the variables chosen and why, the process of developing the models, the selection of the best model, the testing and proving of the model as outlined in the text, and a complete analysis of the model. The report should address, where necessary, any ethical considerations and how they were dealt with by the group. Finally, the report should end with the results, recommendations on use, and a conclusion.
• All graphs, drawings, figures and tables should be correctly placed as per standard business report conventions and format. All referencing and citing should follow APA referencing style.
• The final report should be a neat, concise and professional piece that conforms to the conventions of a Business report with title, contents, table of figures, executive summary, body, conclusion and appropriate Appendices.
• The report should be in a folder, not a ring binder or lever arch file, and pages should not be in plastic sleeves. The appropriate Business Schools Assignment Page should be attached as the last page of the report just inside the cover.
• The student will deliver a presentation of this report, covering all the highlights the audience needs to follow the group's reasoning and conclusions towards the resultant model. The presentation should last 15 minutes.
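The statistical requirements above (adjusted R², the overall F-test, the coefficient of partial determination and the VIF) can be cross-checked outside Excel. The following is only a minimal sketch in Python with NumPy, not a required method. The data are randomly generated placeholders, and the names fit_ols, vif, x1, x2 and x3 are invented for illustration. The F statistics still need to be compared with a critical value from tables or from Excel.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept. X must have at least one column."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])          # add the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sse = float(resid @ resid)                    # error sum of squares
    sst = float(((y - y.mean()) ** 2).sum())      # total sum of squares
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    f_stat = ((sst - sse) / k) / (sse / (n - k - 1))  # overall F-test statistic
    return {"beta": beta, "sse": sse, "r2": r2, "adj_r2": adj_r2, "f": f_stat}

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), regressing column j on the remaining columns."""
    values = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        values.append(1.0 / (1.0 - fit_ols(others, X[:, j])["r2"]))
    return np.array(values)

# Placeholder data, generated at random purely to make the sketch runnable.
rng = np.random.default_rng(0)
n = 40
x1, x2, x3 = rng.normal(size=(3, n))
y = 2.0 + 1.5 * x1 - 0.8 * x2 + rng.normal(scale=0.5, size=n)

X_full = np.column_stack([x1, x2, x3])
candidates = {
    "x1": X_full[:, [0]],
    "x1+x2": X_full[:, [0, 1]],
    "x1+x2+x3": X_full,
}
results = {name: fit_ols(X, y) for name, X in candidates.items()}

# Model selection: keep the candidate with the highest adjusted R².
best = max(results, key=lambda name: results[name]["adj_r2"])

# Contribution of x3 given that x1 and x2 are already in the model.
full, reduced = results["x1+x2+x3"], results["x1+x2"]
partial_r2 = (reduced["sse"] - full["sse"]) / reduced["sse"]
partial_f = (reduced["sse"] - full["sse"]) / (full["sse"] / (n - 3 - 1))

vifs = vif(X_full)

for name, res in results.items():
    print(f"{name}: adj R2 = {res['adj_r2']:.3f}, F = {res['f']:.1f}")
print("best by adjusted R2:", best)
print(f"partial r2 of x3 given x1, x2: {partial_r2:.4f} (partial F = {partial_f:.2f})")
print("VIFs:", np.round(vifs, 2))
```

With real project data, the random columns would be replaced by the collected variables, and any variable with a VIF of 5 or more would be a candidate for removal before the final model is chosen.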
Possible Topic: Global warming and urban heat adjustment
Is global warming real, or is it merely local warming caused by cities getting bigger? That's a question we can answer.
Each city needs 2 sources of data: population and the rise in temperature.
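One way to start is to regress each city's temperature rise on its population. The sketch below is only an illustration in Python with NumPy. The four cities and their values are invented placeholders, not real measurements, and using the base-10 logarithm of population as the predictor is an assumption rather than something this handout prescribes.

```python
import numpy as np

# Placeholder values only, invented to show the layout; replace with real city records.
population = np.array([50_000, 200_000, 1_000_000, 5_000_000], dtype=float)
temp_rise_c = np.array([0.6, 0.7, 0.9, 1.0])  # hypothetical temperature rise, degrees C

# Assumption: model the rise as linear in log10(population).
x = np.log10(population)
A = np.column_stack([np.ones_like(x), x])     # intercept column plus predictor
(intercept, slope), *_ = np.linalg.lstsq(A, temp_rise_c, rcond=None)

print(f"fitted rise = {intercept:.3f} + {slope:.3f} * log10(population)")
```

A slope close to zero would suggest that city size explains little of the observed rise, while a clearly positive slope would point towards an urban heat effect. A full project would still need additional variables to meet the multiple regression requirement.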
Possible Topic: Does executing murderers (plus a few innocent people as well) save 20 people per execution?
Traditionally, murderers were executed. The problem is, occasionally an innocent person is executed by mistake. Now that DNA evidence can be used, there have been some embarrassing examples of people found guilty of murder who turned out to be innocent after all. Oops!
Nonetheless, there are a number of statistical studies which appear to suggest that, if you have the death penalty for murderers, the strong deterrent will greatly reduce the total number of people killed.
So should we have the death penalty (and occasionally execute a few innocent people), and thus reduce the total number of innocent people killed?
Here are some rather controversial articles that discuss statistical papers on this topic: