Calculate the variance of a series of values

Reference no: EM133091581

Describing and Visualising Statistical Data

Exercises

Question 1: In a new, empty .py file, write a small program that calculates and prints out the mean, median, and mode of the following set of values:

1978, 1936, 1941, 1999, 2000, 2001, 2020, 2049, 2000, 1801, 1664

When calculating the mode, it's easiest to use 'from collections import Counter' to get access to a Counter object, which will do the instance counting for you (see slides 10-12).
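One possible shape for Question 1 is sketched below. The helper names (calculate_median, calculate_mode) are illustrative rather than taken from the slides, and this version assumes the data has a single mode.

```python
from collections import Counter

values = [1978, 1936, 1941, 1999, 2000, 2001, 2020, 2049, 2000, 1801, 1664]


def calculate_mean(numbers):
    # Sum of the values divided by how many there are
    return sum(numbers) / len(numbers)


def calculate_median(numbers):
    # Sort a copy, then take the middle value (or the average of the
    # two middle values when the count is even)
    ordered = sorted(numbers)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


def calculate_mode(numbers):
    # most_common(1) returns a list holding one (value, count) pair
    return Counter(numbers).most_common(1)[0][0]


print('Mean:', calculate_mean(values))
print('Median:', calculate_median(values))
print('Mode:', calculate_mode(values))
```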

Question 2: Add the value 2001 to the above list of values. Now, both 2000 and 2001 occur twice - so the data has multiple modes. Modify your program to cater for this, i.e. if you ask it to calculate the mode of the list of values it will return a list containing both 2000 and 2001. See slide 13 if you need help.
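A minimal sketch of one way to handle multiple modes: find the highest count, then keep every value that reaches it. The function name calculate_modes is hypothetical, and sorting the result is a choice made here for readable output, not something the exercise requires.

```python
from collections import Counter


def calculate_modes(numbers):
    counts = Counter(numbers)
    # The largest number of times any single value occurs
    highest_count = max(counts.values())
    # Keep every value that occurs that many times
    return sorted(value for value, count in counts.items()
                  if count == highest_count)


values = [1978, 1936, 1941, 1999, 2000, 2001, 2020, 2049, 2000, 1801, 1664, 2001]
print('Modes:', calculate_modes(values))
```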

Question 3: Modify your code to print out the highest and lowest values in the list, and from these values calculate and print the range of the values (i.e. the difference between the highest and lowest values).
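The range calculation can be sketched with the built-in min and max functions. The helper name find_range is illustrative, and the list below assumes the extra 2001 from Question 2 has already been added.

```python
def find_range(numbers):
    lowest = min(numbers)
    highest = max(numbers)
    # The range is the difference between the extremes
    return lowest, highest, highest - lowest


values = [1978, 1936, 1941, 1999, 2000, 2001, 2020, 2049, 2000, 1801, 1664, 2001]
lowest, highest, value_range = find_range(values)
print('Lowest:', lowest)
print('Highest:', highest)
print('Range:', value_range)
```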

Question 4: The following functions can be used to calculate the variance of a series of values. From the variance you can calculate the standard deviation as its square root (in Python you can do this by raising a value to the power of 0.5, for example: value = 9, sqr_root = value ** 0.5).
def calculate_mean(numbers):
    s = sum(numbers)
    N = len(numbers)
    mean = s / N
    return mean


def find_differences(numbers):
    mean = calculate_mean(numbers)

    differences = []
    for num in numbers:
        differences.append(num - mean)

    return differences


def calculate_variance(numbers):
    differences = find_differences(numbers)

    squared_diff = []
    for d in differences:
        squared_diff.append(d ** 2)

    sum_squared_diff = sum(squared_diff)
    variance = sum_squared_diff / len(numbers)

    return variance
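As a rough sketch, the standard deviation step might look like the following. The function calculate_std_dev is a hypothetical addition, and calculate_variance is condensed here so the block runs on its own. Like the functions above, it divides by the number of values, so it gives the population variance. The sample list is made up for the check.

```python
def calculate_variance(numbers):
    # Condensed form of the variance calculation: the mean of the
    # squared differences from the mean (divides by N)
    mean = sum(numbers) / len(numbers)
    return sum((num - mean) ** 2 for num in numbers) / len(numbers)


def calculate_std_dev(numbers):
    # The standard deviation is the square root of the variance
    return calculate_variance(numbers) ** 0.5


sample = [2, 4, 4, 4, 5, 5, 7, 9]
print('Variance:', calculate_variance(sample))
print('Standard deviation:', calculate_std_dev(sample))
```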
There is a file on your Moodle shell under this week's materials called: pokemon_num_name_height_metres_weight_kgs.csv
This file contains all the numbers, names, heights and weights of over 800 Pokemon (which I found here: https://pokemondb.net/pokedex/stats/height-weight). Take a look at this file in a text editor or Excel to see the kind of data we're working with.
To open the file and split up each line into a list of four strings we can use code like this:
import csv

with open('pokemon_num_name_height_metres_weight_kg.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        # 0 is number, 1 is name, 2 is height (m), 3 is weight (kg)
        print(row[0], row[1], row[2], row[3])

Now, using the above calculate_variance function, load the file and calculate the variance and standard deviation of the Pokemon heights and weights - and print them to the screen.
REMEMBER: Each row value (row[0], row[1] etc.) will be a string - so if we want to do any math with any of the numerical fields (which we do) then we'll need to cast them to be a float!
If you've done this right, you should see output like this:
Height variance: 1.2243628057924427
Height standard deviation: 1.1065092886155283
-----
Weight variance: 15580.033463417874
Weight standard deviation: 124.82000425980554
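One way the loading step could be sketched is shown below. The helper load_heights_and_weights is hypothetical, and the three in-memory rows stand in for the real file so that the block runs without it. Skipping rows that fail to convert is an assumption made here to tolerate a possible header line. The column positions follow the comment in the code above.

```python
import csv
import io


def calculate_variance(numbers):
    # Population variance: mean of the squared differences from the mean
    mean = sum(numbers) / len(numbers)
    return sum((num - mean) ** 2 for num in numbers) / len(numbers)


def load_heights_and_weights(csvfile):
    heights, weights = [], []
    for row in csv.reader(csvfile, delimiter=','):
        try:
            # Every field arrives as a string, so convert before doing maths
            height = float(row[2])
            weight = float(row[3])
        except (ValueError, IndexError):
            # Assumption: rows that are not numeric (e.g. a header) are skipped
            continue
        heights.append(height)
        weights.append(weight)
    return heights, weights


# Stand-in for the real CSV file: number, name, height (m), weight (kg)
sample_file = io.StringIO(
    "1,Bulbasaur,0.7,6.9\n"
    "2,Ivysaur,1.0,13.0\n"
    "3,Venusaur,2.0,100.0\n"
)

heights, weights = load_heights_and_weights(sample_file)
print('Height variance:', calculate_variance(heights))
print('Height standard deviation:', calculate_variance(heights) ** 0.5)
print('-----')
print('Weight variance:', calculate_variance(weights))
print('Weight standard deviation:', calculate_variance(weights) ** 0.5)
```

To run this against the real data, pass the object returned by open() on the CSV file in place of sample_file.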

Question 5: As we have the heights and weights for our Pokemon - let's create a quick scatterplot of the data:

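from pylab import plot, show, title, xlabel, ylabel

# Have access to the Pokemon heights and weights here!

# Plot weight on the x-axis and height on the y-axis, marking each point with an 'x'
plot(weights, heights, 'x')
title('Pokemon Height Vs. Weight')
xlabel('Weight in Kilograms')
ylabel('Height in Metres')
# show() displays the current figure; it does not take the plot as an argument
show()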
If everything's going as planned you should see a plot like this:

Question 6: Our final task for the day will be to determine if there is a statistically significant correlation between the height and the weight of a Pokemon - that is, are bigger Pokemon usually heavier? From looking at the plot, what do you think? Is there any correlation? Or maybe a weak positive, or weak negative correlation?
Here's some code we can use to determine the correlation coefficient of two sets of values:
def find_correlation(x, y):
    # Find the length of the lists
    n = len(x)

    # Find the sum of the products
    products = []
    for xi, yi in zip(x, y):
        products.append(xi * yi)
    sum_products = sum(products)

    # Find the sum of each list
    sum_x = sum(x)
    sum_y = sum(y)

    # Find the squared sum of each list
    squared_sum_x = sum_x ** 2
    squared_sum_y = sum_y ** 2

    # Find the sum of the squared lists
    x_square = []
    for xi in x:
        x_square.append(xi ** 2)
    x_square_sum = sum(x_square)

    y_square = []
    for yi in y:
        y_square.append(yi ** 2)
    y_square_sum = sum(y_square)

    # Use formula to calculate correlation
    numerator = n * sum_products - sum_x * sum_y
    denominator1 = n * x_square_sum - squared_sum_x
    denominator2 = n * y_square_sum - squared_sum_y
    denominator = (denominator1 * denominator2) ** 0.5
    correlation = numerator / denominator
    return correlation
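Written out, the calculation in the last five lines of the function corresponds to the following formula for the correlation coefficient, where n is the number of value pairs and each sum runs over all of them. The symbol r is introduced here for the result.

```latex
r = \frac{n \sum x_i y_i - \sum x_i \sum y_i}
         {\sqrt{\left( n \sum x_i^2 - \left( \sum x_i \right)^2 \right)
                \left( n \sum y_i^2 - \left( \sum y_i \right)^2 \right)}}
```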
Use the above function to calculate and print the correlation between the height and weight of Pokemon - if you've done it correctly, you should get output similar to the following:
Correlation between height and weight is: 0.6424145098518806
Looking at the below, this means that there IS a positive correlation - that is, the height of a Pokemon is an indicator that can allow us to estimate its weight... but the correlation is weak, so any estimate that we come up with may have a large margin of error to the actual weight of the Pokemon!
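A quick way to sanity-check any correlation function is to try it on data where the answer is known in advance. The sketch below uses a mean-centred form of the same coefficient (the name pearson is hypothetical) on small made-up lists: a perfectly linear increasing relationship should give 1, and a perfectly linear decreasing one should give -1.

```python
def pearson(x, y):
    # Mean-centred form of the correlation coefficient
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    co_spread = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    spread_x = sum((xi - mean_x) ** 2 for xi in x)
    spread_y = sum((yi - mean_y) ** 2 for yi in y)
    return co_spread / (spread_x * spread_y) ** 0.5


x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # y doubles x exactly
falling = [10, 8, 6, 4, 2]   # y falls as x rises

print('Rising:', pearson(x, rising))
print('Falling:', pearson(x, falling))
```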

Attachment:- Visualising Statistical Data.rar




