Describe what you did to prepare the data for analysis

Reference no: EM132997711

Question: The "Movie Dataset (original)" tab is a partial dataset from Kaggle.com comprising 3515 movies scraped from the Internet Movie Database (IMDb). Use this dataset to answer the following questions.

Remember to create a partition (use a training set of 60% and a validation set of 40%).
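The assignment expects the partition to be built in Excel with Frontline Solver. Purely as an illustration of what a 60/40 split does, here is a minimal Python sketch. The function name `partition`, the seed value, and the stand-in list of record numbers are assumptions made for this example and are not part of the assignment.

```python
import random

def partition(rows, train_fraction=0.6, seed=42):
    """Shuffle a copy of rows and split it into (training, validation)."""
    shuffled = list(rows)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Stand-in for the 3515 movie records (just record numbers here).
movies = list(range(3515))
train, valid = partition(movies)
print(len(train), len(valid))
```

Fixing the random seed means the same records land in the same set every time the split is rerun, which makes results easier to reproduce. If scrubbing removes rows first, the two set sizes shrink accordingly.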

The dataset may require scrubbing and/or the creation of categorical variables.
Provide all your answers and working models in a single Excel workbook; use different worksheets as needed.
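One common form of the scrubbing mentioned above is dropping records whose numeric fields are blank or unreadable. The sketch below shows that idea in Python rather than Excel. The helper names `to_number` and `scrub` are hypothetical, the column name `gross` is an assumption (`budget` and `title_year` appear in the questions), and the three sample rows are made-up values, not taken from the dataset.

```python
def to_number(value):
    """Return value as a float, or None if it is blank or not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None

def scrub(rows, required):
    """Keep only rows where every required field parses as a number."""
    clean = []
    for row in rows:
        parsed = {name: to_number(row.get(name)) for name in required}
        if all(v is not None for v in parsed.values()):
            clean.append({**row, **parsed})  # store the parsed numeric values
    return clean

# Made-up rows for illustration only.
raw = [
    {"budget": "1000000", "gross": "2500000", "title_year": "2005"},
    {"budget": "", "gross": "400000", "title_year": "1999"},
    {"budget": "750000", "gross": None, "title_year": "2010"},
]
clean = scrub(raw, ["budget", "gross", "title_year"])
print(len(clean))
```

Whatever rule is chosen for incomplete records (dropping them, as here, or filling in values), it is worth stating as an assumption in the answer to question 1.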

Answer the following questions (which are also listed in the Assignment tab of the spreadsheet):

1. Describe what you did to prepare the data for analysis, and describe any assumptions you are making in answering the questions that follow.

2. Create a scatterplot comparing budget (y-axis) against title_year (x-axis). What pattern do you observe? What could explain this trend?

3. Define a movie to be a "success" if its gross revenue is at least double its budget. Create a logistic regression model to predict whether a movie will be a success, based on the number of critics, duration, actor_1 Facebook likes, director Facebook likes, budget, and year. What is the error rate on the validation set?

5. Of the given six predictors, what are the strongest three predictors of whether a movie is a success? How did you determine this?

6. Using a boosting neural network, can you reduce the error rate? If so, to what? What might be a drawback to using this type of method, compared to logistic regression?
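The "success" rule in question 3 and the validation error rate can both be written down precisely. The Python sketch below is only an illustration under stated assumptions: the function names `is_success` and `error_rate` are hypothetical, the gross and budget figures are made up, and the `predicted` list stands in for whatever a fitted model would output. It does not fit a logistic regression.

```python
def is_success(gross, budget):
    """Question 3's rule: gross revenue is at least double the budget."""
    return gross >= 2 * budget

def error_rate(actual, predicted):
    """Fraction of records whose predicted label differs from the actual one."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)

# Made-up (gross, budget) pairs standing in for a validation set.
validation = [
    (2_500_000, 1_000_000),
    (900_000, 1_000_000),
    (4_000_000, 2_000_000),
    (3_000_000, 2_000_000),
]
actual = [is_success(g, b) for g, b in validation]
predicted = [True, True, True, False]  # hypothetical model output
print(error_rate(actual, predicted))   # 0.25
```

Note that a movie grossing exactly twice its budget counts as a success under the "at least double" wording, which is why the comparison uses `>=`.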

The steps are as follows:

1. Use Excel and Frontline Solver to build a model (or models) for the problem.

2. Provide all your answers and working models in a single Excel workbook; use different worksheets (tabs) as needed. Give clear, simple names to your worksheets.

3. Ensure that your answers are in a format suitable for consumption by decision-makers (that is, it should not take a math professor to understand your answers).

4. Write in complete sentences; do not just provide numbers.

Attachment:- Project Supervised Learning Assignment.rar




