What is the overall prediction accuracy

Reference no: EM133212676

Assignment:

The PollutionNom.csv dataset provides the age-adjusted mortality rate per 100,000 people for 60 locations. Additional climate and demographic information is available for each location as well. Refer to the list below for attribute data types and summary attribute descriptions.

Population Data Set Descriptors:

Precipitation: Average annual precipitation in inches
JanuaryF: Average January temperature in degrees F
JulyF: Average July temperature in degrees F
>65: % of population aged 65 or older
Household: Average household size
Education: Median school years completed by those over 22
Housing: % of housing units which are sound & with all facilities
Density: Population per sq. mile in urbanized areas
NonWhite: % non-white population in urbanized areas
WhiteCollar: % employed in white collar occupations
LowIncome: % of families with income < $3000
HC: Relative hydrocarbon pollution potential
NOX: Relative nitric oxides pollution potential
SO2: Relative sulphur dioxide pollution potential
Humidity: Annual average % relative humidity at 1pm
Mortality: Total age-adjusted mortality rate per 100,000

Use Weka to answer the following questions. (Always use the "Use training set" option for testing.)

Clustering

1) Perform SimpleKMeans clustering with default parameters (2 clusters). How would you describe the two clusters based on the attribute characteristics? Interpret how the identified clusters are different based on average attribute values. Which attributes were more important to differentiate the clusters?

2) Perform SimpleKMeans clustering with three clusters. How would you describe the three clusters based on the attribute characteristics? Discuss which subsets of the population each cluster represents.
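The interpretation step in the clustering questions (comparing average attribute values across clusters) can be sketched outside Weka as follows. This is a minimal illustration, not Weka's SimpleKMeans implementation: the rows are synthetic toy values rather than PollutionNom.csv data, only three of the attributes are used, and the helper names `min_max_scale`, `kmeans`, and `rank_attributes` are hypothetical. Attributes are scaled to [0, 1] here so that no single attribute dominates the distance; that choice is an assumption of the sketch.

```python
import random

def min_max_scale(rows):
    """Rescale every column to [0, 1] so attributes are comparable."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(x - l) / (h - l) if h > l else 0.0 for x, l, h in zip(row, lo, hi)]
        for row in rows
    ]

def kmeans(rows, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length numeric rows."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    assignment = [0] * len(rows)
    for _ in range(iterations):
        # Assign each row to its nearest centroid (squared Euclidean distance).
        for i, row in enumerate(rows):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(row, centroids[c])),
            )
        # Recompute each centroid as the mean of its current members.
        for c in range(k):
            members = [rows[i] for i in range(len(rows)) if assignment[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment

def rank_attributes(names, centroids):
    """Order attributes by how far apart the cluster averages are."""
    gaps = {
        name: max(c[j] for c in centroids) - min(c[j] for c in centroids)
        for j, name in enumerate(names)
    }
    return sorted(gaps, key=gaps.get, reverse=True)

# Synthetic toy rows (NOT taken from PollutionNom.csv), three attributes only.
names = ["Precipitation", "Education", "SO2"]
rows = [
    [10, 12.0, 5], [12, 12.1, 8], [11, 11.9, 6],
    [45, 10.0, 60], [48, 10.2, 70], [50, 9.9, 65],
]

scaled = min_max_scale(rows)
centroids, assignment = kmeans(scaled, k=2)
ranking = rank_attributes(names, centroids)
print("cluster of each row:", assignment)
print("attributes, most to least separating:", ranking)
```

Attributes near the top of `ranking` are the ones whose cluster averages differ most, which is the same comparison the questions ask for when reading the centroid table in Weka's clusterer output.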

Neural Networks

1) Perform neural network analysis (MultilayerPerceptron) with two hidden nodes ("hiddenLayers"=2; in Weka this value gives the number of nodes in a single hidden layer). What is the overall prediction accuracy? Identify the attributes that significantly impact each of the two hidden nodes. How would you characterize these two hidden factors identified by the neural network analysis?

2) Repeat the same analysis with three hidden nodes ("hiddenLayers"=3). What is the new prediction accuracy? Interpret the confusion matrix. Why do you think the accuracy is different? Identify the attributes that significantly impact each of the three hidden nodes. How would you characterize these three hidden factors identified by the neural network analysis?
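The two calculations behind these questions can be sketched as follows: overall accuracy is the share of instances on the diagonal of the confusion matrix, and the attributes that most affect a hidden node are those with the largest absolute input weights. The numbers below are hypothetical placeholders, not results from PollutionNom.csv; the two-class matrix, the weights, and the helper names `overall_accuracy`, `per_class_recall`, and `strongest_inputs` are all assumptions made for illustration.

```python
def overall_accuracy(confusion):
    """Share of instances on the main diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total

def per_class_recall(confusion):
    """For each actual class (row), the fraction that was predicted correctly."""
    return [
        row[i] / sum(row) if sum(row) else 0.0
        for i, row in enumerate(confusion)
    ]

def strongest_inputs(weights, top=3):
    """Attributes with the largest absolute weight into one hidden node."""
    return sorted(weights, key=lambda name: abs(weights[name]), reverse=True)[:top]

# Hypothetical confusion matrix: rows = actual class, columns = predicted class.
confusion = [
    [25, 5],
    [3, 27],
]
accuracy = overall_accuracy(confusion)
recall = per_class_recall(confusion)

# Hypothetical input weights for a single hidden node.
node_weights = {"Precipitation": 1.8, "Education": -2.4, "SO2": 0.3, "NonWhite": 2.1}
top_inputs = strongest_inputs(node_weights, top=2)

print(f"overall accuracy: {accuracy:.3f}")
print("per-class recall:", [round(r, 3) for r in recall])
print("strongest inputs:", top_inputs)
```

The sign of a weight matters for the characterization: in this made-up example the node responds positively to NonWhite and negatively to Education, so it would be described by that contrast rather than by the attribute names alone.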

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd