Reference no: EM133573976
Task 2: Using the 'FLCRASH' dataset, answer the following questions.
Question 1.
Build a new custom category variable based on the 'Total Crash Injuries' variable. This new variable should contain only two categories: one for crashes with zero injuries, and one for crashes with one or more injuries. Visualize the frequency of the two new categories on a bar chart. How many crashes report zero injuries? (3 Marks)
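The recoding step in Question 1 can be sketched as follows. This is a minimal illustration in plain Python, assuming the injury counts have already been read into a list. The values in `total_crash_injuries` are made-up sample data, not taken from FLCRASH, and the category labels are placeholders. The text bars only stand in for a real bar chart.

```python
from collections import Counter

# Hypothetical sample values standing in for the 'Total Crash Injuries' column.
total_crash_injuries = [0, 2, 0, 1, 0, 0, 3, 1, 0, 0]


def to_injury_category(injuries):
    """Map an injury count to one of two categories."""
    return "No injuries" if injuries == 0 else "One or more injuries"


# Build the new binary category variable, one value per crash.
injury_category = [to_injury_category(n) for n in total_crash_injuries]

# Frequency of each category.
counts = Counter(injury_category)

# Crude text bar chart: one '#' per crash, followed by the count.
for label in ("No injuries", "One or more injuries"):
    print(f"{label:<22} {'#' * counts[label]} ({counts[label]})")

# Number of crashes reporting zero injuries (in the sample data only).
zero_injury_crashes = counts["No injuries"]
```

With the real dataset, the same counts would feed whatever bar chart tool the course uses, and the zero-injury count would be read off the first bar.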
Question 2.
In Q1, you created a new categorical variable with only two values (binary). Your task now is to develop two models that can predict the value this target variable takes, given other explanatory variables. In other words, you will attempt to predict whether or not a crash will result in injuries, given other important variables.
What are the two models (or techniques) you can use to predict this target variable?
Create one model to predict the target variable you created in Q1. Assess this model's accuracy. What are the most important variables in predicting this target variable?
Create a second model to predict the target variable. Assess this model's accuracy. What are the most important variables identified by the model in predicting the target variable?
Compare the performance of the two models. Report and discuss the results of your comparison. Which model is the champion?
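The comparison step can be sketched as below. This is a hedged illustration, not a prescribed method: `actual`, `model_a_pred`, and `model_b_pred` are hypothetical holdout labels and predictions (1 = one or more injuries, 0 = no injuries), and the two models could be any pair of classifiers, for example a logistic regression and a decision tree. Accuracy is used as the selection criterion here only because the questions ask for it; other criteria may apply in practice.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary target."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def accuracy(actual, predicted):
    """Share of cases where the predicted class equals the actual class."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical holdout data: none of these values come from FLCRASH.
actual       = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]
model_a_pred = [0, 1, 0, 1, 1, 0, 0, 1, 0, 0]
model_b_pred = [0, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# Score both models on the same holdout cases.
results = {
    "Model A": accuracy(actual, model_a_pred),
    "Model B": accuracy(actual, model_b_pred),
}

for name, acc in results.items():
    print(f"{name}: accuracy = {acc:.2f}")

# The champion is the model with the highest holdout accuracy.
champion = max(results, key=results.get)
print("Champion:", champion)
```

Both models are scored on the same holdout cases so that the accuracies are directly comparable. The confusion counts help explain where each model goes wrong, which matters when missed injury crashes are more costly than false alarms.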