Briefly explain the concept of decision trees

Reference no: EM133188603

ICT110 Introduction to Data Science - University of the Sunshine Coast

Assignment Task

You work at Real Beer as a data scientist. The product development team have approached you because they want to develop a new line of beer. Real Beer has a long history in the brewery market, but its products have typically been pitched at the lower end of the market. The company is now looking to develop a range of beer for very discerning beer connoisseurs. This beer will be more expensive and will be sold through specialty stores or by direct sales on a new website.

The product development team aren't sure what taste characteristics this new beer should have, but they know that they want it to be distinctive. An executive in the product development team at Real Beer head office has provided you with a dataset covering most current producers and has asked you to provide a report with recommendations about what attributes this new beer could have. Note: not all columns are related to this purpose.

You need to use the data to develop a cohesive and convincing story that describes the process of finding the key features of a top beer.

First, the product development team would like to get a better understanding about what sorts of attributes top beers have. They have asked you to describe the data and find interesting phenomena.

Second, the product development team have asked you to explore the data in more detail. They would like you to use your expertise in data science to dig out anything you feel is interesting or significant. They are looking for attributes of top beers that could be put together to create a distinctive yet tasty beer.

You are required to prepare a report about your findings and to make suggestions about which attributes you would recommend be considered in the new product. Your recommendations may be based on the values of attributes such as style, IBU, sweet, sour, etc.

The potential audiences of this report include other staff within Real Beer, such as executives or sales staff. This means that each graph will need a detailed explanation and some narrative around why or how it adds to the story. Staff may have limited ICT or mathematical knowledge, so the report should be technical but include clear explanations of the findings.

To prepare the report, please include the following sections:

1. Introduction
Provide an introduction to the problem. Include background material as appropriate: who cares about this problem, what impact it has, where does the data come from, what are the dimensions and structure of the data.

2. Data Setup
Describe how to load the data, and how the pre-processing is performed.

The original dataset is not ready for analysis, and its form differs from the datasets we have worked with in previous practices. This means we need to do some pre-processing, either for the whole dataset or for the subset of the dataset required for each sub-task described later.

Once you have some ideas for exploratory or advanced analysis, you need to adjust the form of the dataset. This can be achieved either by manipulating records in R through transposition or subsetting, or with other tools (e.g. Notepad or Excel) before reading the data into R. Please clearly explain in this section how you have cleaned the data. If you use Excel, please still explain the steps that you used for cleaning.
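The subsetting and transposition mentioned above could be sketched in base R as follows. This is only an illustration on a made-up data frame: the `beers` object, its values, and the `name` and `rating` columns are assumptions, not the structure of the actual dataset.

```r
# Hypothetical toy data standing in for the real beer dataset.
beers <- data.frame(
  name   = c("A", "B", "C", "D"),
  style  = c("IPA", "Stout", "IPA", "Lager"),
  IBU    = c(60, 35, NA, 18),
  sweet  = c(40, 80, 55, 30),
  rating = c(4.2, 4.5, 3.9, 3.1),
  stringsAsFactors = FALSE
)

# Subsetting: keep only complete rows and the columns needed for one sub-task.
clean <- beers[complete.cases(beers), c("name", "IBU", "sweet", "rating")]

# Transposition: t() returns a matrix, so transpose the numeric columns only,
# convert back to a data frame, and use the beer names as column labels.
wide <- as.data.frame(t(clean[, c("IBU", "sweet", "rating")]))
names(wide) <- clean$name

print(clean)
print(wide)
```

Whichever cleaning route is taken, the report should state which rows and columns were dropped and why.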


3. Exploratory Data Analysis
Two, one-variable analyses with graphs

A one-variable analysis studies one variable (one column/attribute) at a time. You can choose the attributes you want for this, but the attributes you select need to add to the story you are telling about which features are key to a top beer.

• Perform 2 one-variable analyses and graph them
• Explain the findings for each graph
• Provide the code for each graph
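A one-variable analysis of this kind might look like the sketch below, which draws a histogram of a single numeric attribute. The IBU values are randomly generated for illustration only and do not come from the real dataset.

```r
# Hypothetical IBU values; replace with the real column after cleaning.
set.seed(1)
ibu <- round(runif(100, min = 10, max = 100))

# A histogram shows how the values of one variable are distributed.
h <- hist(ibu,
          breaks = 10,
          main   = "Distribution of IBU (toy data)",
          xlab   = "IBU",
          col    = "grey80")

# Numeric summary to quote alongside the graph.
print(summary(ibu))
```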

Two, two-variable analyses with graphs

A two-variable analysis studies the relationship between two variables. It is up to you to decide which attributes/variables you use for this analysis, but the attributes you select need to add to the story you are telling about which features are key to a top beer.

• Perform 2 two-variable analyses and graph them
• Explain the findings for each graph
• Provide the code for each graph
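As a rough sketch of two-variable analyses, the code below plots one numeric attribute against another and compares a numeric attribute across categories. The `rating` column and all values are invented for illustration; the real dataset may use different names.

```r
# Hypothetical toy data: three styles with different typical bitterness.
set.seed(2)
beers <- data.frame(
  style = rep(c("IPA", "Stout", "Lager"), each = 20),
  IBU   = c(rnorm(20, 60, 8), rnorm(20, 40, 8), rnorm(20, 20, 5))
)
beers$rating <- 3 + 0.01 * beers$IBU + rnorm(60, sd = 0.3)

# Numeric vs numeric: a scatter plot plus the correlation coefficient.
plot(beers$IBU, beers$rating,
     main = "Rating against IBU (toy data)",
     xlab = "IBU", ylab = "Rating")
r <- cor(beers$IBU, beers$rating)
print(r)

# Categorical vs numeric: one box per style.
boxplot(rating ~ style, data = beers,
        main = "Rating by style (toy data)",
        xlab = "Style", ylab = "Rating")
```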

4. Advanced Analysis

Two, linear regression analyses with graphs

Briefly explain the concept of linear regression (with references). It is up to you to decide which attributes you use for this analysis. You may choose any two attributes, but the attributes you select need to add to the story you are telling about which features are key to a top beer.

• Perform 2 linear regression analyses and graph them
• Explain the findings for each graph
• Provide the code for each graph
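A simple linear regression with its graph could be sketched as below using base R's `lm()`. The data are simulated, and the choice of IBU as the predictor and a hypothetical `rating` column as the response is an assumption made purely for illustration.

```r
# Simulated toy data with a known linear trend plus noise.
set.seed(3)
IBU    <- runif(80, 10, 100)
rating <- 3 + 0.012 * IBU + rnorm(80, sd = 0.25)
toy    <- data.frame(IBU, rating)

# Fit rating = intercept + slope * IBU by ordinary least squares.
fit <- lm(rating ~ IBU, data = toy)

# Scatter plot with the fitted regression line on top.
plot(toy$IBU, toy$rating,
     main = "Linear regression of rating on IBU (toy data)",
     xlab = "IBU", ylab = "Rating")
abline(fit, col = "red", lwd = 2)

# Coefficients and R-squared to report alongside the graph.
print(coef(fit))
print(summary(fit)$r.squared)
```

The slope estimates how much the response changes per unit of the predictor, and R-squared gives the share of variance the line explains; both are worth explaining in plain language for a non-technical audience.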

Decision tree
Briefly explain the concept of decision trees (with references). It is up to you to decide which attributes you use for this analysis, but the attributes you select need to add to the story you are telling about which features are key to a top beer.

• Create a decision tree and resulting visualisation
• Explain the findings for the decision tree
• Provide the code for the decision tree
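One possible way to build and draw a decision tree is sketched below with the `rpart` package, which ships with standard R installations as a recommended package. The brief does not prescribe a package, so this choice is an assumption, as are the simulated data and the hypothetical `top` label.

```r
library(rpart)

# Simulated toy data: a beer is labelled "top" by a made-up rule.
set.seed(4)
n   <- 200
toy <- data.frame(
  IBU   = runif(n, 10, 100),
  sweet = runif(n, 0, 100),
  sour  = runif(n, 0, 100)
)
toy$top <- factor(ifelse(toy$IBU > 50 & toy$sweet < 60, "top", "other"))

# Fit a classification tree predicting the label from the three attributes.
tree <- rpart(top ~ IBU + sweet + sour, data = toy, method = "class")

# Draw the tree; each split reads as a yes/no question on one attribute.
plot(tree, margin = 0.1)
text(tree, use.n = TRUE, cex = 0.8)

# Text version of the same splits for the report.
print(tree)
```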

5. Conclusion
Sum up your findings and provide some insight into them. Provide your overall recommendations in this section, e.g. which features you have selected and why.

6. Reflections
In this part, discuss any difficulties you had performing the analysis and how you solved those difficulties. Reflect on how the analysis process went for you, what you learnt, and what you might do differently next time. Aim to write 2-4 paragraphs.

For the data analysis (Sections 3 and 4), you need to provide the R code, an explanation of the code, and the result. Please present each R code snippet in your report in a box with some comments. For example:
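The original example box is not reproduced here. A commented snippet in that spirit might look like the following sketch, where the IBU values are hypothetical:

```r
# Purpose: summarise bitterness (IBU) before plotting it.
# Input:   a numeric vector of IBU values (hypothetical toy values here).
ibu <- c(18, 35, 60, 72, 45)

# mean() and median() describe the centre; range() gives the spread.
ibu_mean   <- mean(ibu)
ibu_median <- median(ibu)
ibu_range  <- range(ibu)

print(c(mean = ibu_mean, median = ibu_median))
print(ibu_range)
```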

Report Format

Your report should be no less than 1,200 words and ideally no longer than ~2,000 words. Text in R code snippets is not counted.

The report MUST be formatted using the following guidelines:
1. Title Page - Include your name as the report's author.
2. Header - Report title
3. Footer - your name and the page number
4. Paragraph text - 12 point Calibri or Times New Roman single line spacing
5. Headings - In an appropriate type and size
6. Margins - 2.5cm on all margins
7. Page numbering - Introduction and onwards to use conventional numerals (1, 2, 3, 4) starting on page 1 from the introduction.
8. The report is to be created as a single Microsoft Word document (version 2007 or later). No other format is acceptable; submitting in another format will result in a deduction of marks.

Referencing

References for the explanation of decision trees and linear regression are required. These references should follow the Harvard or APA method of referencing.

