Construct a fully labelled probability tree

Reference no: EM132364724

Task:

This assignment is designed to test your knowledge about the first two major topics, Descriptive Statistics and Probability and Probability Distributions as well as the sub topic Sampling and Sampling Distributions. Students will be required to use the Data Analysis Tools in Excel to complete some parts of the assignment.

As part of this assignment you will be required to summarise and describe a set of real data. The Data Analysis Tools in Excel will be used extensively to summarise the data.

Question 1

Download the data set ‘auction data.xlsx' from the Assignment folder in the resources section of Interact. The data given in the worksheet tab ‘Data' show the sales results (as compiled by the Australian Property Manager (APM)) for properties listed for auction in Sydney on Saturday 8 June, 2019. The variables in this data set are: Suburb, Address, Bedrooms, Type, Price, Result and Agent. The key for entries in the variable Result is provided in the worksheet tab ‘Results Category'.

a. Copy and then complete the table below by matching the correct overall outcome (shown below) to the appropriate result codes. Use the information given in the worksheet tab ‘key' in the data set when completing the table.

Overall Outcomes: Sold prior to the auction, sold at the auction, sold after the auction, did not sell, withdrawn from sale.

Result code | Overall Outcome
PI, NB, VB  |
SP, PN      |
S, SN       |
SA, SS      |
W           |

Most real data sets you encounter will contain problem data such as typographical errors, transcription errors, coding errors and possible outliers. This data set is no exception. In a real situation, we would make a note of these anomalies and ask for them to be investigated or checked. Since we cannot contact the owner of this data set, for the purpose of this assignment, we will ignore the anomalies and work with the data as best we can.

Read the document ‘working with real data sets.pdf' found in the Assignment folder, which explains some ways of identifying errors in a data set and how to deal with them.

b. There are two properties where the result has been incorrectly recorded. These both have the result recorded as SP instead of SN. Identify these two properties by Suburb and Address. Edit the result code to correct these errors.

Use the edited data set with the two errors identified in part b. corrected when answering all remaining questions.

c. James and Anna are property owners in the Sydney region and are planning to sell their four bedroom house over the next few months. They are considering putting it up for auction and are concerned about the falling market due to pressure on the economy. They would like to use these data to gain an insight into the current Sydney auction market. Use the data set (with errors corrected in part b) to generate a three way pivot table report of ‘Type' by ‘Bedrooms' by ‘Result'.

Hint: Use ‘Type' and ‘Bedrooms' as row labels and ‘Result' as a column label.

You might find it easier to read the information in the pivot table if you right align the Result column labels.
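For readers working outside Excel, the three way layout described in the hint can be sketched in Python. This is only an illustration: the rows below are hypothetical, not taken from the workbook, and the variable names are made up for the example.

```python
from collections import Counter

# Hypothetical (Type, Bedrooms, Result) rows; the real values come from the workbook.
rows = [
    ("House", 4, "S"),
    ("House", 4, "PI"),
    ("House", 3, "S"),
    ("Unit", 2, "SP"),
    ("Unit", 2, "S"),
]

# Count how often each (Type, Bedrooms, Result) combination occurs.
counts = Counter(rows)

# Row labels are (Type, Bedrooms) pairs; column labels are the Result codes.
row_labels = sorted({(t, b) for t, b, _ in rows})
col_labels = sorted({r for _, _, r in rows})

# Print a tab-separated table with a row total in the last column.
print("Type", "Bedrooms", *col_labels, "Total", sep="\t")
for t, b in row_labels:
    cells = [counts[(t, b, r)] for r in col_labels]
    print(t, b, *cells, sum(cells), sep="\t")
```

Each printed row corresponds to one Type and Bedrooms combination, which is the same shape the Excel pivot table produces.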

Use the data in the pivot table generated in part c. to answer the following vendors' questions in parts d. and e., about the properties listed for auction in Sydney on 8 June 2019.

d. James and Anna would like to calculate the clearance rate for all properties listed for auction that week.
i. How many properties were originally listed for auction for the day in question?
ii. How many of these were sold (at auction, prior or after)?
iii. Express the number of properties sold (at auction, prior or after) as a percentage of all properties listed for auction.

e. Given that they will be selling their four bedroom house soon, they are also interested in the clearance rate of all four bedroom houses that week. They would like to compare the clearance rate of four bedroom houses with the overall clearance rate.
i. How many four bedroom houses were originally listed for auction for the day in question?
ii. How many of these were sold (at auction, prior or after)?
iii. Express the number of four bedroom houses sold (at auction, prior or after) as a percentage of all four bedroom houses listed for auction.
iv. Was the clearance rate for four bedroom houses worse, the same or better than the clearance rate for all properties that week?
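The clearance rate arithmetic in parts d. and e. can be sketched as below. All counts are hypothetical, and the set of codes treated as "sold" is a placeholder only: the correct mapping has to come from the table completed in part a.

```python
def clearance_rate(counts_by_result, sold_codes):
    """Return the number sold as a percentage of all properties listed."""
    listed = sum(counts_by_result.values())
    sold = sum(n for code, n in counts_by_result.items() if code in sold_codes)
    return 100 * sold / listed

# Placeholder set of "sold" codes (assumption; take the real set from part a.).
sold_codes = {"S", "SP", "SA"}

# Hypothetical column totals read off a pivot table.
all_properties = {"S": 50, "SP": 20, "SA": 5, "PI": 20, "W": 5}
four_bed_houses = {"S": 6, "SP": 2, "PI": 2}

overall = clearance_rate(all_properties, sold_codes)
four_bed = clearance_rate(four_bed_houses, sold_codes)
print(f"overall {overall:.1f}%, four bedroom houses {four_bed:.1f}%")
```

Comparing the two percentages answers part e. iv. directly.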

f. James and Anna would like to compare the sales outcomes of all the properties that were listed for auction that week.
i. For all the properties that were listed for auction that week, generate a two way pivot table of ‘Type' by ‘Result'.
ii. Use this pivot table to generate a single horizontal 100% component bar chart with type along the vertical axis and the different types of ‘result' making up the components of each of the four bars. Insert an appropriate title on the chart.
iii. Use the graph to identify the two types of properties which had approximately the same proportion of properties passed in that week.
Include both the pivot table and the 100% component bar chart generated in parts i. and ii. with your assignment submission.

g. Sort the data by the ‘Type' variable and then extract the house data only to a separate file. We will use this new file to generate a table of descriptive statistics for the variable ‘Price'. We first need to clean up the data set however so we can calculate the correct descriptive statistics. You might like to make a copy of this new file before you start deleting entries just in case you delete an entry you shouldn't have.
• Sort the house data in the new file by ‘Result'.
• Delete any results for which there are no selling prices.
• You will notice that there are a number of properties which have a price associated with the result vendor bid. As this bid is made by the vendor only and is not an actual selling price, these should also be deleted.

Hint: You should have 171 house properties with a selling price

i. Use the data analysis tools in Excel to generate a table of descriptive statistics for the house data for the variable ‘Price' and include it in your assignment submission. Use the statistics from the table to answer the following questions.
ii. What were the mean and median selling prices of houses that week, expressed to the nearest thousand dollars? What was the standard deviation of these selling prices, expressed to the nearest hundred dollars?
iii. What was the selling price of the cheapest house sold that week? Look further afield in the data set to include information about the number of bedrooms and the suburb.
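As a cross-check on the Excel output, the same summary statistics can be computed with Python's standard library. The prices below are hypothetical stand-ins for the 171 cleaned house prices. The sample versions of the standard deviation and variance (dividing by n - 1) are used, which is what Excel's Descriptive Statistics tool reports.

```python
import statistics

# Hypothetical selling prices in dollars; replace with the cleaned house data.
prices = [650_000, 820_000, 1_150_000, 1_400_000, 3_900_000]

mean_price = statistics.mean(prices)
median_price = statistics.median(prices)
sd_price = statistics.stdev(prices)      # sample standard deviation (n - 1)
var_price = statistics.variance(prices)  # sample variance (n - 1)

# Round to the nearest thousand (mean, median) and nearest hundred (SD).
print(round(mean_price, -3), round(median_price, -3), round(sd_price, -2))
print("cheapest:", min(prices))
```

With a long right tail, as in this toy list, the mean sits well above the median.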

iv. The table produced in part i. contains a figure that represents the sample variance. Copy the table below into your assignment and fill in both cells with the sample variance expressed as an actual number value and in scientific notation. One of the answers will come straight from your table of descriptive statistics. To obtain the other answer, you will have to convert the sample variance from scientific notation to number form or from number form into scientific notation. When expressing the number using scientific notation, do not use Excel's method of representing such large numbers.

                    | sample variance
actual number value |
scientific notation |
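The conversion between number form and scientific notation can be sketched as follows. The input value is hypothetical and `to_scientific` is a helper invented for this illustration. Excel would display such a value in a form like 1.23E+12, whereas the conventional written form is a mantissa multiplied by a power of ten.

```python
import math

def to_scientific(value, sig_digits=4):
    """Return (mantissa, exponent) so that value is about mantissa x 10**exponent.

    Assumes value is non-zero.
    """
    exponent = math.floor(math.log10(abs(value)))
    mantissa = round(value / 10 ** exponent, sig_digits - 1)
    return mantissa, exponent

# Hypothetical sample variance, not the one from the data set.
m, e = to_scientific(1.23456e12)
print(f"{m} x 10^{e}")

# Going the other way: mantissa times a power of ten gives the number form.
print(f"{m * 10 ** e:,.0f}")
```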

h. Use the Data Analysis Tools in Excel to generate a frequency distribution and histogram for the selling prices of houses sold that week. Use $700 000 as the upper limit of the first class and a class width of $700 000.

Use the same 171 house selling prices we used to generate the table of descriptive statistics in part g.
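The class limits in part h. can be sketched in Python. The prices are hypothetical. The sketch assumes the convention used by Excel's Histogram tool, where a value equal to a bin's upper limit is counted in that bin.

```python
from bisect import bisect_left

# Hypothetical selling prices in dollars.
prices = [650_000, 700_000, 820_000, 1_150_000, 1_400_000, 3_900_000]

width = 700_000
n_classes = -(-max(prices) // width)  # ceiling division
upper_limits = [width * (i + 1) for i in range(n_classes)]

# bisect_left finds the first upper limit >= price, so a price that equals
# an upper limit falls into that class.
freq = [0] * n_classes
for p in prices:
    freq[bisect_left(upper_limits, p)] += 1

for limit, count in zip(upper_limits, freq):
    print(f"up to ${limit:,}: {count}")
```

The list `upper_limits` is what would go in the Bin Range of the Excel tool.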

i. When quoting the average house price in Sydney, sources always quote the median rather than the mean selling price. Why is this appropriate for the data in this question? Explain how the histogram generated in part h. and a key descriptive statistic generated in part g.i. support your answer.

Question 2

A bank reports that families often have more than one credit card. The most popular in Australia are Mastercard, Visa and American Express. For those families with one or more of these cards, 30% have a Mastercard, 10% have an American Express card and 60% a Visa card. They also reported that 4% of families have both a Mastercard and an American Express card, 7% have both a Mastercard and a Visa card and only 3% have both a Visa card and an American Express card. Further, 1% have all three cards.

a. Calculate the probability of selecting a family that has either a Visa card or a Mastercard.
b. If a family has a Mastercard, what is the probability it also has a Visa card?
c. Is possession of a Mastercard independent of possession of a Visa card? Why or why not?
d. Is possession of an American Express card mutually exclusive of possession of a Visa card? Justify your decision.
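The rules needed for Question 2 (addition rule, conditional probability, and the tests for independence and mutual exclusivity) can be sketched as below, using the percentages stated in the question. The variable names are illustrative, and the sketch is a check on hand working rather than a substitute for it.

```python
import math

# Probabilities stated in the question.
p_m, p_a, p_v = 0.30, 0.10, 0.60
p_m_and_v = 0.07
p_a_and_v = 0.03

# Addition rule: P(V or M) = P(V) + P(M) - P(V and M)
p_v_or_m = p_v + p_m - p_m_and_v

# Conditional probability: P(V | M) = P(V and M) / P(M)
p_v_given_m = p_m_and_v / p_m

# Independent only if P(M and V) equals P(M) * P(V).
independent = math.isclose(p_m_and_v, p_m * p_v)

# Mutually exclusive only if P(A and V) is zero.
mutually_exclusive = p_a_and_v == 0

print(p_v_or_m, p_v_given_m, independent, mutually_exclusive)
```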

Question 3

There are a number of companies that provide a delivery service for take away meals. One of the important factors for the customer is the time between placing an order and receiving the meal. A particular Thai restaurant uses two different delivery companies. Company A delivers 40% of their orders while Company B delivers the rest. A survey of customers using the service has indicated they want their food delivered within 30 minutes. Historically, Company B has experienced 10% of their orders taking longer than 30 minutes to deliver while Company A has been late on 15% of their orders.

a. We will use A to represent the event ‘Company A delivers the order' and L to represent the event ‘The meal is delivered late'.

Use the correct statistical notation and words to define the complement of both A and L. (Use this terminology in your working throughout the remainder of this question).

b. Construct a fully labelled probability tree to describe this problem with the outcomes and probabilities shown along each branch.

c. A customer has just received their order. What is the probability the order was delivered on time, that is, within 30 minutes of placing the order?

d. A customer who contacted the Thai restaurant reported receiving their order 45 minutes after placing the order. Which company is most likely to have delivered the order? Use probabilities to support your conclusion.

e. Considering the probability calculated in part c., should the restaurant owner have any concerns about the reliability (delivery times) of the delivery companies they use? Explain.
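The probability tree in Question 3 amounts to the law of total probability followed by Bayes' rule. A minimal sketch using the figures given in the question follows; the variable names are illustrative only.

```python
# First-level branches: which company delivers.
p_a = 0.40
p_b = 1 - p_a  # Company B delivers the rest

# Second-level branches: late given the company.
p_late_given_a = 0.15
p_late_given_b = 0.10

# Law of total probability: sum the joint probabilities of the two late paths.
p_late = p_a * p_late_given_a + p_b * p_late_given_b
p_on_time = 1 - p_late

# Bayes' rule: which company is more likely, given the order was late?
p_a_given_late = p_a * p_late_given_a / p_late
p_b_given_late = p_b * p_late_given_b / p_late

print(p_late, p_on_time, p_a_given_late, p_b_given_late)
```

Each product in `p_late` is the probability at the end of one path through the tree.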

Question 4

Suppose that 20% of all share market investors are retirees and we select a random sample of 25 share market investors.

a. If the random variable X is the number of investors in the sample who are retirees, provide two reasons why X has a binomial distribution.

b. State the value/s of parameter/s of X.
c. Calculate (using the appropriate statistical tables) the probability that between five and ten investors in the sample are retirees.
d. Calculate (using the appropriate statistical tables) the probability that at least ten investors in the sample are retirees.

e. Verify your answers to parts c. and d. using the appropriate Excel statistical function and demonstrate you have done this by including the Excel formula used.

f. You have to explain to someone who has not studied statistics what the probability calculated in part d. means. Use simple non-statistical language to explain this.

g. How many retirees would you expect to find in a random sample of 25 investors?
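The binomial calculations can be cross-checked without tables or Excel. The sketch below assumes "between five and ten" is inclusive at both ends; if the intended reading is different, the range needs adjusting. The helper names `pmf` and `cdf` are invented for this example.

```python
from math import comb

n, p = 25, 0.20  # sample size and probability an investor is a retiree

def pmf(k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def cdf(k):
    """P(X <= k), the quantity tabulated in cumulative binomial tables."""
    return sum(pmf(i) for i in range(k + 1))

p_5_to_10 = cdf(10) - cdf(4)   # P(5 <= X <= 10), inclusive (assumption)
p_at_least_10 = 1 - cdf(9)     # P(X >= 10)
expected = n * p               # mean of a binomial distribution

print(round(p_5_to_10, 4), round(p_at_least_10, 4), expected)
```

The `cdf` helper plays the same role as Excel's BINOM.DIST with its cumulative argument set to TRUE.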

Question 5

A maintenance worker in a large manufacturing plant knows that a filling machine breaks down on average 3 times per month. Assume these breakdowns occur randomly and independently of one another.

a. If the random variable X is the number of times the machine breaks down, identify the distribution of X and state the value/s of its parameter/s.

b. Calculate (using the appropriate statistical tables) the probability there are fewer than three breakdowns in the next month.

c. Calculate (using the appropriate statistical tables) the probability of seven breakdowns in the next two months.

d. Verify your answers to parts b. and c. using the appropriate Excel statistical function and demonstrate you have done this by including the Excel formula used.
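A minimal sketch of the Poisson calculations follows. It relies on the standard property that the mean scales with the length of the interval, so an average of 3 breakdowns per month corresponds to 6 over two months. The helper `poisson_pmf` is invented for this example.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

# Fewer than three breakdowns in one month: X = 0, 1 or 2 with mean 3.
p_fewer_than_3 = sum(poisson_pmf(k, 3) for k in range(3))

# Exactly seven breakdowns in two months: mean is 2 * 3 = 6.
p_seven_in_two_months = poisson_pmf(7, 6)

print(round(p_fewer_than_3, 4), round(p_seven_in_two_months, 4))
```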

Question 6

Suppose that the height of Australian males is a normally distributed random variable with a mean of 176.8cm and a standard deviation of 9.5cm.

a. If the random variable X is the height of an Australian male, identify the distribution of X and state the value/s of its parameter/s.

b. Calculate (using the appropriate statistical tables) the probability that a randomly selected Australian man is more than two metres tall.

c. To become a jockey, as well as a passion for the sport, you need to be relatively small, generally between 147cm and 168cm tall. Calculate (using the appropriate statistical tables) the proportion of Australian males who fit this height range.

d. Some of the smaller regional planes have small cabins; consequently, the ceilings can be quite low. Calculate (using the appropriate statistical tables) the ceiling height of a plane such that at most 2% of the Australian men walking down the aisle will have to duck their heads.

e. Verify your answers to parts b., c. and d. using the appropriate Excel statistical function and demonstrate you have done this by including the Excel formula used.

f. A random sample of forty Australian males is selected. State the type of distribution and the value/s of the parameter/s for the mean of this sample.

g. Calculate (using the appropriate statistical tables) the probability that the average height of this sample is less than 170cm.
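The normal distribution parts can be checked with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). The sketch assumes the stated mean and standard deviation, and uses the standard result that the mean of a sample of size n from a normal population is normal with standard deviation sigma / sqrt(n).

```python
from math import sqrt
from statistics import NormalDist

height = NormalDist(mu=176.8, sigma=9.5)  # height of one man, in cm

p_over_2m = 1 - height.cdf(200)                # taller than two metres
p_jockey = height.cdf(168) - height.cdf(147)   # between 147 cm and 168 cm
ceiling = height.inv_cdf(0.98)                 # only 2% are taller than this

# Sampling distribution of the mean for a sample of forty men.
sample_mean = NormalDist(mu=176.8, sigma=9.5 / sqrt(40))
p_mean_below_170 = sample_mean.cdf(170)

print(round(p_over_2m, 4), round(p_jockey, 4), round(ceiling, 1))
print(p_mean_below_170)
```

Table-based answers may differ slightly in the last decimal place because z-scores are rounded to two decimals before the lookup.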

Attachment:- Auction dataset assignment.rar

Construct a fully labelled probability tree : QBM117 - Business Statistics - Charles Sturt University - Calculate the probability of selecting a family that has either a Visa card or a Mastercard

The mark for each question is determined by the proportion of your solution that satisfies these criteria.

Full marks: Correct answer written as a clear response to the original question. Full worked solutions provided that are clear, adequate and legible and use the correct mathematical notation and reasoning, with neat diagrams and code excerpts where appropriate. Final answer includes appropriate units and (where specified) correct rounding.

The application of these general principles to each individual question is given with the questions themselves. Please read them to maximise your marks.

Marking criteria

All questions in this assignment involve problems with a sequence of several steps. These are marked using the following criteria.

• Correctness: Arithmetic, algebra and calculations are correct (except possibly some minor rounding errors).
• Process/Method: The indicated/correct method is selected and carried out completely.
• Communication/Working: You have made it clear what you have done using an appropriate mix of text, mathematical notation, neat diagrams and code excerpts.

Submission Requirements

The assignment can be submitted by hand in the Assignment box on your local campus or uploaded to EASTS. Pages must be numbered, and your name and student number must be included on every page whichever submission method is chosen. If submitting by hand the assignment must be submitted by 5pm on the due date. Please check your subject outline for information about the location of this box. If submitting through EASTS the assignment can be submitted up until 11.59pm (AEDT) on the due date. It is a requirement that students keep a copy of all assignments submitted to the University for marking.

The assignment must have a completed signed cover sheet attached to the front of the assignment. The cover sheet can be found in the Assignment folder. Assignments submitted without a signed coversheet will not be marked. If you choose to submit the assignment online through EASTS, it must be submitted as a single Word or PDF file. Assignments submitted in non-printable formats such as a ZIP file or as a collection of images will not be marked. If your scanner produces separate graphics files please paste them into a Word document before submitting to EASTS. Pages must be numbered, and your name and student number must be included on every page. Once you have submitted the assignment, please view your uploaded assignment to ensure that all the images, graphs, etc are visible and formatted correctly.

Presentation

The assignment must be neatly handwritten. Any Excel output should be inserted where required at the appropriate place in the assignment, not in an Appendix at the back of the assignment. Pages must be numbered, and your name and student number must be included on every page. This method of presentation has been chosen because it is very difficult and time consuming to type mathematical formulas in Word. Marks will be deducted for assignments which do not follow these guidelines.

This assignment is designed to assess the following learning outcomes. That students
• be able to summarise and interpret data graphically and numerically;
• be able to use a statistical package to analyse data appropriately, and then interpret the output;
• be able to explain the standard uses of Statistics in the media and in business environments, and judge whether the statistical methodology and conclusions drawn are appropriate;
• be able to calculate and interpret probabilities, and use standard discrete and continuous probability distributions;
• be able to evaluate if the assumptions underlying statistical techniques are valid in a given scenario;
• be able to apply basic principles of survey design, such as determination of appropriate sample sizes and sampling techniques.

A copy of the assignment, the cover sheet, the data set and the document ‘working with real data sets’ can be found at the Subject Interact site in the Assignment folder.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd