Fit a neural network model to the data

Reference no: EM132291853

Assignment -

Part 1 -

Overview - The Institute for Statistics Education at Statistics.com asks students to rate a variety of aspects of a course as soon as they complete it. The Institute is contemplating a recommendation system that would suggest additional courses to students as soon as they submit their rating for a completed course. Consider the excerpt from student ratings of online statistics courses shown in Table 1 below, and the problem of what to recommend to student E.N.

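Table 1 - Ratings of online statistics courses: 4 = best, 1 = worst, blank = not taken

| Student | SQL | Spatial | PA 1 | DM in R | Python | Forecast | R Prog | Hadoop | Regression |
|---------|-----|---------|------|---------|--------|----------|--------|--------|------------|
| L N     | 4   |         |      |         | 3      | 2        | 4      |        | 2          |
| M H     | 3   |         |      |         | 4      |          |        |        |            |
| J H     | 2   |         |      |         |        |          |        |        |            |
| E N     | 4   |         |      | 4       |        |          | 4      |        | 3          |
| D U     | 4   | 4       |      |         |        |          |        |        |            |
| F L     |     | 4       |      |         |        |          |        |        |            |
| G L     |     | 4       |      |         |        |          |        |        |            |
| A H     |     | 3       |      |         |        |          |        |        |            |
| S A     |     |         | 4    |         |        |          |        |        |            |
| R W     |     |         | 2    |         |        |          |        | 4      |            |
| B A     |     |         | 4    |         |        |          |        |        |            |
| M G     |     |         | 4    |         |        | 4        |        |        |            |
| A F     |     |         | 4    |         |        |          |        |        |            |
| K G     |     |         | 3    |         |        |          |        |        |            |
| D S     | 4   |         |      | 2       |        |          | 4      |        |            |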

In R, your job is to:

  • Consider a user-based collaborative filter. This requires computing correlations between all student pairs. For which students is it possible to compute correlations with E.N.? Compute them.
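
The correlation step above can be sketched as follows. The assignment asks for R; this is only a language-neutral illustration in Python, and it rests on one assumption that the assignment does not fix: means and sums are taken over the courses both students rated. Some texts instead centre each student on the mean of all of their own ratings, which gives different values. The names `ratings` and `pearson_corated` are made up for this sketch. A correlation is reported as `None` when the two students share fewer than two courses or when either student's shared ratings have no variation.

```python
from math import sqrt

# Ratings transcribed from Table 1; a missing key means "not taken".
ratings = {
    "LN": {"SQL": 4, "Python": 3, "Forecast": 2, "R Prog": 4, "Regression": 2},
    "MH": {"SQL": 3, "Python": 4},
    "JH": {"SQL": 2},
    "EN": {"SQL": 4, "DM in R": 4, "R Prog": 4, "Regression": 3},
    "DU": {"SQL": 4, "Spatial": 4},
    "FL": {"Spatial": 4},
    "GL": {"Spatial": 4},
    "AH": {"Spatial": 3},
    "SA": {"PA 1": 4},
    "RW": {"PA 1": 2, "Hadoop": 4},
    "BA": {"PA 1": 4},
    "MG": {"PA 1": 4, "Forecast": 4},
    "AF": {"PA 1": 4},
    "KG": {"PA 1": 3},
    "DS": {"SQL": 4, "DM in R": 2, "R Prog": 4},
}


def pearson_corated(a, b):
    """Pearson correlation over co-rated courses, or None if undefined."""
    shared = sorted(set(a) & set(b))
    if len(shared) < 2:
        return None  # a single shared course cannot define a correlation
    xs = [a[c] for c in shared]
    ys = [b[c] for c in shared]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # constant ratings on the shared courses
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


correlations = {
    name: pearson_corated(ratings["EN"], other)
    for name, other in ratings.items()
    if name != "EN"
}
for name, r in correlations.items():
    print(name, r)
```

In R, the same quantity for two aligned rating vectors could be obtained with `cor(x, y)` after dropping the courses that either student left blank.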

Then, tell me:

  • Which single course should we recommend to E.N. based on the single nearest student to E.N.? Explain why.
  • Based on the cosine similarities of the nearest students to E.N., which course should be recommended to E.N.?
  • What is the conceptual difference between using the correlation as opposed to cosine similarities? [Hint: how are the missing values in the matrix handled in each case?]
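
A minimal sketch of the cosine-similarity comparison, again in Python purely for illustration. It assumes that blank cells are treated as 0, which is one common reading of the hint above but is not stated in the assignment. Only the students who share at least one course with E.N. are listed, since every other student has a cosine similarity of 0 under this assumption. The names `COURSES`, `to_vector`, and `cosine` are hypothetical.

```python
from math import sqrt

COURSES = ["SQL", "Spatial", "PA 1", "DM in R", "Python",
           "Forecast", "R Prog", "Hadoop", "Regression"]

# Subset of Table 1: E.N. and the students who share a course with E.N.
ratings = {
    "LN": {"SQL": 4, "Python": 3, "Forecast": 2, "R Prog": 4, "Regression": 2},
    "MH": {"SQL": 3, "Python": 4},
    "JH": {"SQL": 2},
    "EN": {"SQL": 4, "DM in R": 4, "R Prog": 4, "Regression": 3},
    "DU": {"SQL": 4, "Spatial": 4},
    "DS": {"SQL": 4, "DM in R": 2, "R Prog": 4},
}


def to_vector(user):
    """Full-length rating vector; courses not taken become 0 (assumption)."""
    return [user.get(course, 0) for course in COURSES]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


en = to_vector(ratings["EN"])
sims = {
    name: cosine(en, to_vector(other))
    for name, other in ratings.items()
    if name != "EN"
}

# Neighbours from most to least similar, with the courses each one could add.
ranked = sorted(sims, key=sims.get, reverse=True)
new_courses = {
    name: sorted(set(ratings[name]) - set(ratings["EN"])) for name in ranked
}
for name in ranked:
    print(name, round(sims[name], 3), new_courses[name])
```

Listing the untaken courses next to each neighbour matters here because a very similar student may have taken nothing that E.N. has not already taken.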

Then:

With large datasets, it is computationally difficult to compute user-based recommendations in real time, and an item-based approach is used instead. Returning to the rating data (not the binary matrix), let's now take that approach.

  • If the goal is still to find a recommendation for E.N., for which course pairs is it possible and useful to calculate correlations?
  • Just looking at the data, and without yet calculating course pair correlations, which course would you recommend to E.N., relying on item-based filtering? Calculate two course pair correlations involving your guess and report the results.

Finally:

  • Apply item-based collaborative filtering to this dataset (using R) and based on the results, recommend a course to E.N.
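
The item-based questions above come down to which course pairs have enough students in common. The sketch below (Python, for illustration only; the assignment itself asks for R) transposes the ratings into one dictionary per course, lists the students who rated both courses in each (untaken, taken) pair for E.N., and computes a correlation where at least two students overlap. As in any Pearson calculation restricted to co-rated entries, a pair with exactly two common students can only yield +1 or -1, so such a value carries little information. The names `by_course`, `co_raters`, and `item_corr` are hypothetical.

```python
from math import sqrt

# Ratings transcribed from Table 1; a missing key means "not taken".
ratings = {
    "LN": {"SQL": 4, "Python": 3, "Forecast": 2, "R Prog": 4, "Regression": 2},
    "MH": {"SQL": 3, "Python": 4},
    "JH": {"SQL": 2},
    "EN": {"SQL": 4, "DM in R": 4, "R Prog": 4, "Regression": 3},
    "DU": {"SQL": 4, "Spatial": 4},
    "FL": {"Spatial": 4},
    "GL": {"Spatial": 4},
    "AH": {"Spatial": 3},
    "SA": {"PA 1": 4},
    "RW": {"PA 1": 2, "Hadoop": 4},
    "BA": {"PA 1": 4},
    "MG": {"PA 1": 4, "Forecast": 4},
    "AF": {"PA 1": 4},
    "KG": {"PA 1": 3},
    "DS": {"SQL": 4, "DM in R": 2, "R Prog": 4},
}

# Transpose: course -> {student: rating}.
by_course = {}
for student, courses in ratings.items():
    for course, score in courses.items():
        by_course.setdefault(course, {})[student] = score

taken = sorted(ratings["EN"])
untaken = sorted(set(by_course) - set(taken))


def co_raters(c1, c2):
    """Students who rated both courses."""
    return sorted(set(by_course[c1]) & set(by_course[c2]))


def item_corr(c1, c2):
    """Pearson correlation between two courses over their co-raters."""
    shared = co_raters(c1, c2)
    if len(shared) < 2:
        return None
    xs = [by_course[c1][s] for s in shared]
    ys = [by_course[c2][s] for s in shared]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


overlap = {(u, t): co_raters(u, t) for u in untaken for t in taken}
usable = [pair for pair, students in overlap.items() if len(students) >= 2]
for pair in usable:
    print(pair, overlap[pair], item_corr(*pair))
```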

Part 2 -

Overview - The dataset ToyotaCorolla.csv contains 1436 records with details on 38 attributes, including Price, Age, KM, HP, and other specifications. The goal is to predict the price of a used Toyota Corolla based on its specifications.

In R, your job is to:

  • Fit a neural network model to the data. Use a single hidden layer with 2 nodes.
  • Use predictors Age_08_04, KM, Fuel_Type, HP, Automatic, Doors, Quarterly_Tax, Mfr_Guarantee, Guarantee_Period, Airco, Automatic_airco, CD_Player, Powered_Windows, Sport_Model, and Tow_Bar.
  • Remember to first scale the numerical predictor and outcome variables to a 0-1 scale (use the function preProcess() with method = "range"; see Chapter 7) and convert categorical predictors to dummies.
  • Record the RMS error for the training data and the validation data. Repeat the process, changing the number of hidden layers and nodes to {single layer with 5 nodes}, {two layers, 5 nodes in each layer}.
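
The preprocessing and error steps in the list above can be illustrated without fitting a network. The sketch below is in Python rather than R and shows three pieces: range scaling with the minimum and maximum learned from the training partition only, dummy coding of a categorical predictor, and the RMS error. All numbers are made up for illustration and are not taken from ToyotaCorolla.csv; the three Fuel_Type levels shown are an assumption about the file. The function names are hypothetical.

```python
from math import sqrt


def fit_range(values):
    """Learn the scaling range from the training partition only."""
    return min(values), max(values)


def apply_range(values, lo, hi):
    """Map values to the 0-1 scale defined by the training range."""
    return [(v - lo) / (hi - lo) for v in values]


def dummies(values, levels):
    """One 0/1 indicator column per level, in the order given."""
    return [[1 if v == level else 0 for level in levels] for v in values]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


# Made-up KM values for a training and a validation partition.
train_km = [10000, 50000, 90000]
valid_km = [30000, 110000]

lo, hi = fit_range(train_km)
train_scaled = apply_range(train_km, lo, hi)
valid_scaled = apply_range(valid_km, lo, hi)  # may fall outside 0-1

# Assumed levels for Fuel_Type.
fuel = dummies(["Petrol", "Diesel", "CNG", "Petrol"],
               ["CNG", "Diesel", "Petrol"])

print(train_scaled, valid_scaled)
print(fuel)
print(rmse([0.2, 0.4], [0.2, 0.8]))
```

Computing the same `rmse` on the training and validation predictions of each architecture gives the figures the prompts below ask you to compare. Because the error is measured on the 0-1 scale here, it would need to be multiplied by the training price range to be read in the original price units.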

Finally, answer the following prompts:

  • What happens to the RMS error for the training data as the number of layers and nodes increases?
  • What happens to the RMS error for the validation data?
  • Comment on the appropriate number of layers and nodes for this application.

Note - Immediate turn around file and csv file attached.

Attachment:- Assignment Files.rar


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd