Linear regression cross validation

Reference no: EM132095606

The Big Data Assignment comprises two parts:

- The first part is to create the algorithms in the tasks, namely Decision Tree, Gradient Boosted Tree and Linear regression, and then to apply them to the bike sharing dataset provided. Try to produce the output given in the task sections (also given in the Big-Data Assignment.docx provided on Blackboard).

- The second part is to take the algorithms created in the first part and apply them to another dataset chosen from Kaggle (other than the bike sharing dataset provided).

1. Using Python 3, build the following regression models:
- Decision Tree
- Gradient Boosted Tree
- Linear regression
2. Select a dataset (other than the example dataset given in section 3) and apply the Decision Tree and Linear regression models created above. Choose a dataset
3. Build the following in relation to the gradient boost tree and the dataset chosen in step 2
a) Gradient boost tree iterations (see Big-Data Assignment.docx section 6.1)
b) Gradient boost tree Max Bins (see Big-Data Assignment.docx section 7.2)
4. Build the following in relation to the decision tree and the dataset chosen in step 2
a) Decision Tree Categorical features
b) Decision Tree Log (see Big-Data Assignment.docx section 5.4)
c) Decision Tree Max Bins (see Big-Data Assignment.docx section 7.2)
d) Decision Tree Max Depth (see Big-Data Assignment.docx section 7.1)
5. Build the following in relation to the linear regression and the dataset chosen in step 2
a) Linear regression Cross Validation
i. Intercept (see Big-Data Assignment.docx section 6.5)
ii. Iterations (see Big-Data Assignment.docx section 6.1)
iii. Step size (see Big-Data Assignment.docx section 6.2)
iv. L1 Regularization (see Big-Data Assignment.docx section 6.4)
v. L2 Regularization (see Big-Data Assignment.docx section 6.3)
b) Linear regression Log (see Big-Data Assignment.docx section 5.4)
6. Follow the provided example of the bike sharing dataset and the guidelines in the sections that follow to develop the requirements given in steps 1, 3, 4 and 5
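
The "Categorical features" and "Log" items in the steps above can be illustrated with a small preprocessing sketch. It assumes that "Log" means log-transforming the target before training and that categorical columns are one-hot encoded; the referenced docx sections are not reproduced here, so both readings are assumptions. The row layout, the `seasons` list and all function names are hypothetical.

```python
import math

def one_hot(value, categories):
    """Encode one categorical value as a 0/1 vector over a fixed category list."""
    return [1.0 if value == c else 0.0 for c in categories]

def log_target(y):
    """log(1 + y): compresses large counts; assumes y >= 0."""
    return math.log1p(y)

def inverse_log_target(z):
    """Undo log_target so predictions are back on the original scale."""
    return math.expm1(z)

# Hypothetical rows: (season, normalised temperature, rental count)
rows = [("spring", 0.24, 16), ("summer", 0.70, 210), ("winter", 0.10, 5)]
seasons = ["spring", "summer", "autumn", "winter"]

# One-hot columns for the category, followed by the numeric feature.
features = [one_hot(season, seasons) + [temp] for season, temp, _ in rows]
# Models are then trained on the transformed target.
targets = [log_target(count) for _, _, count in rows]

print(features[0])
print(targets)
```

A model trained on `targets` predicts on the log scale, so its predictions would be passed through `inverse_log_target` before being compared with the raw counts.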

3.1 Task 1
Task 1 is comprised of developing:
1. Decision Tree
a) Decision Tree Categorical features
b) Decision Tree Log (see Big-Data Assignment.docx section 5.4)
c) Decision Tree Max Bins (see Big-Data Assignment.docx section 7.2)
d) Decision Tree Max Depth (see Big-Data Assignment.docx section 7.1)
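
To make the Max Depth and Max Bins items concrete, here is a minimal regression tree sketch in plain Python. It assumes "max bins" limits how many candidate split thresholds are tried per feature and "max depth" limits how many times the data may be split; this is one common reading, not necessarily the one used in the assignment's docx. The function names and the toy data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared errors around the mean, used as the split cost."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def candidate_thresholds(column, max_bins):
    """Return at most max_bins - 1 split points (assumes max_bins >= 2)."""
    distinct = sorted(set(column))
    mids = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    if len(mids) <= max_bins - 1:
        return mids
    step = len(mids) / (max_bins - 1)
    return [mids[int((i + 0.5) * step)] for i in range(max_bins - 1)]

def build_tree(X, y, max_depth, max_bins, depth=0):
    """Return a nested dict; leaves store the mean target as 'value'."""
    leaf = {"value": mean(y)}
    if depth >= max_depth or len(set(y)) == 1:
        return leaf
    best = None
    for j in range(len(X[0])):
        for t in candidate_thresholds([row[j] for row in X], max_bins):
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    if best is None:  # no feature has two distinct values
        return leaf
    _, j, t, left, right = best
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([X[i] for i in left], [y[i] for i in left],
                           max_depth, max_bins, depth + 1),
        "right": build_tree([X[i] for i in right], [y[i] for i in right],
                            max_depth, max_bins, depth + 1),
    }

def predict(tree, row):
    while "value" not in tree:
        side = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["value"]

# Toy data: one feature, a step-shaped target.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0], [8.0]]
y = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 9.0, 9.0]

shallow = build_tree(X, y, max_depth=1, max_bins=32)
deep = build_tree(X, y, max_depth=2, max_bins=32)
print(predict(shallow, [8.0]), predict(deep, [8.0]))
```

On this toy data the depth-1 tree can only separate the low values from the rest, while the depth-2 tree recovers all three levels. Repeating the fit over a range of `max_depth` or `max_bins` values and recording an error metric is one way to produce the kind of parameter study the task lists.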

3.2 Task 2
Task 2 is comprised of developing:
1. Gradient boost tree
a) Gradient boost tree iterations (see Big-Data Assignment.docx section 6.1)
b) Gradient boost tree Max Bins (see Big-Data Assignment.docx section 7.2)
c) Gradient boost tree Max Depth (see Big-Data Assignment.docx section 7.1)
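
The effect of the iterations setting can be sketched with a very small gradient boosting loop. This is an illustration under stated assumptions, not the assignment's reference implementation: it uses squared-error loss, a single feature, and depth-1 trees (stumps) as the base learner, and the `learning_rate` parameter and all function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split on one feature under squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    distinct = sorted(set(xs))
    for a, b in zip(distinct, distinct[1:]):
        t = (a + b) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        cost = (sum((r - lm) ** 2 for r in left)
                + sum((r - rm) ** 2 for r in right))
        if best is None or cost < best[0]:
            best = (cost, t, lm, rm)
    _, t, lm, rm = best
    return t, lm, rm

def stump_predict(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def fit_gbt(xs, ys, num_iterations, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(num_iterations):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate

def gbt_predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)

def training_mse(xs, ys, num_iterations):
    model = fit_gbt(xs, ys, num_iterations)
    return sum((gbt_predict(model, x) - y) ** 2
               for x, y in zip(xs, ys)) / len(ys)

# Toy data: one feature, a step-shaped target.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 9.0, 9.0]

for n in (0, 1, 5, 20):
    print(n, training_mse(xs, ys, n))
```

The training error falls as iterations are added. A held-out set would be needed to see where extra iterations stop helping, which is what an iterations study is normally meant to show.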

3.3 Task 3
Task 3 is comprised of developing:
1. Linear regression model
a) Linear regression Cross Validation
i. Intercept (see Big-Data Assignment.docx section 6.5)
ii. Iterations (see Big-Data Assignment.docx section 6.1)
iii. Step size (see Big-Data Assignment.docx section 6.2)
iv. L1 Regularization (see Big-Data Assignment.docx section 6.4)
v. L2 Regularization (see Big-Data Assignment.docx section 6.3)
b) Linear regression Log (see Big-Data Assignment.docx section 5.4)
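
The cross-validation items listed for Task 3 (intercept, iterations, step size, L1 and L2 regularization) can be explored with a sketch like the one below. It assumes a linear model trained by batch gradient descent and scored by k-fold RMSE; the assignment's docx may use a different trainer or metric. All function and parameter names (`train_linear`, `cross_validate`, `reg_type`, `reg_param`) are hypothetical, and the data is synthetic.

```python
import math

def sign(v):
    return (v > 0) - (v < 0)

def predict(w, b, row):
    return sum(wj * xj for wj, xj in zip(w, row)) + b

def train_linear(X, y, iterations=2000, step=0.5,
                 reg_type=None, reg_param=0.0, intercept=True):
    """Batch gradient descent on mean squared error.

    reg_type "l2" adds reg_param * w to the gradient; "l1" adds
    reg_param * sign(w) (a subgradient). The intercept is not regularized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iterations):
        errors = [predict(w, b, row) - yi for row, yi in zip(X, y)]
        grad = [sum(e * row[j] for e, row in zip(errors, X)) / n
                for j in range(d)]
        for j in range(d):
            if reg_type == "l2":
                grad[j] += reg_param * w[j]
            elif reg_type == "l1":
                grad[j] += reg_param * sign(w[j])
            w[j] -= step * grad[j]
        if intercept:
            b -= step * sum(errors) / n
    return w, b

def cross_validate(X, y, k=3, **params):
    """Mean RMSE over k folds; row i is held out in fold i % k."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        w, b = train_linear([X[i] for i in train],
                            [y[i] for i in train], **params)
        sq = [(predict(w, b, X[i]) - y[i]) ** 2 for i in test]
        scores.append(math.sqrt(sum(sq) / len(sq)))
    return sum(scores) / k

# Synthetic data: y = 2x + 1 with no noise.
X = [[i / 10] for i in range(12)]
y = [2 * row[0] + 1 for row in X]

settings = [
    ("intercept", {"intercept": True}),
    ("no intercept", {"intercept": False}),
    ("few iterations", {"iterations": 10}),
    ("small step", {"step": 0.01}),
    ("l1", {"reg_type": "l1", "reg_param": 0.1}),
    ("l2", {"reg_type": "l2", "reg_param": 1.0}),
]
for label, params in settings:
    print(label, round(cross_validate(X, y, **params), 4))
```

Each entry in `settings` changes one parameter at a time, so the printed scores can be compared directly. On real data, each parameter would be swept over several values and the scores plotted.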

Attachment:- Marking Creiteria.rar

Linear regression cross validation : ICT707 Big Data Assignment - Build the relation to the linear regression and the dataset - first part and apply them to another dataset chosen from Kaggle
All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd