Calculate the probability of selecting a family

Reference no: EM132363415

Business Statistics Assignment -

Task: This assignment is designed to test your knowledge about the first two major topics, Descriptive Statistics and Probability and Probability Distributions as well as the sub topic Sampling and Sampling Distributions. Students will be required to use the Data Analysis Tools in Excel to complete some parts of the assignment.

As part of this assignment you will be required to summarise and describe a set of real data. The Data Analysis Tools in Excel will be used extensively to summarise the data.

Rationale - This assignment is designed to assess the following learning outcomes. That students

  • be able to summarise and interpret data graphically and numerically;
  • be able to use a statistical package to analyse data appropriately, and then interpret the output;
  • be able to explain the standard uses of Statistics in the media and in business environments, and judge whether the statistical methodology and conclusions drawn are appropriate;
  • be able to calculate and interpret probabilities, and use standard discrete and continuous probability distributions;
  • be able to evaluate if the assumptions underlying statistical techniques are valid in a given scenario;
  • be able to apply basic principles of survey design, such as determination of appropriate sample sizes and sampling techniques.

Question 1 - Download the data set 'auction data.xlsx' from the Assignment folder in the resources section of Interact. The data given in the worksheet tab 'Data' show the sales results (as compiled by the Australian Property Manager (APM)) for properties listed for auction in Sydney on Saturday 8 June, 2019. The variables in this data set are: Suburb, Address, Bedrooms, Type, Price, Result and Agent. The key for entries in the variable Result is provided in the worksheet tab 'Results Category'.

a. Copy and then complete the table below by matching the correct overall outcome (shown below) to the appropriate result codes. Use the information given in the worksheet tab 'key' in the data set when completing the table.

Overall Outcomes: Sold prior to the auction, sold at the auction, sold after the auction, did not sell, withdrawn from sale.

Result code  | Overall Outcome
PI, NB, VB   |
SP, PN       |
S, SN        |
SA, SS       |
W            |

Most real data sets you encounter will contain problem data such as typographical errors, transcription errors, coding errors and possible outliers. This data set is no exception. In a real situation, we would make a note of these anomalies and ask for them to be investigated or checked. Since we cannot contact the owner of this data set, for the purpose of this assignment, we will ignore the anomalies and work with the data as best we can.

Read the document 'working with real data sets.pdf' found in the Assignment folder, which explains some ways of identifying errors in a data set and how to deal with them.

b. There are two properties where the result has been incorrectly recorded. These both have the result recorded as SP instead of SN. Identify these two properties by Suburb and Address. Edit the result code to correct these errors.

Use the edited data set with the two errors identified in part b. corrected when answering all remaining questions.

c. James and Anna are property owners in the Sydney region and are planning to sell their four bedroom house over the next few months. They are considering putting it up for auction and are concerned about the falling market due to pressure on the economy. They would like to use these data to gain an insight into the current Sydney auction market. Use the data set (with errors corrected in part b) to generate a three way pivot table report of 'Type' by 'Bedrooms' by 'Result'.

Hint: Use 'Type' and 'Bedrooms' as row labels and 'Result' as a column label. You might find it easier to read the information in the pivot table if you right align the 'Result' column labels.
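For anyone who wants to cross-check the Excel pivot table outside Excel, the same three way count can be sketched in Python. The rows below are made-up placeholders, not values from 'auction data.xlsx', and the Type values 'House' and 'Unit' are assumptions; only the column names Type, Bedrooms and Result come from the question.

```python
from collections import Counter

# Hypothetical rows for illustration only; the real values live in 'auction data.xlsx'.
rows = [
    {"Type": "House", "Bedrooms": 4, "Result": "S"},
    {"Type": "House", "Bedrooms": 4, "Result": "PI"},
    {"Type": "House", "Bedrooms": 3, "Result": "S"},
    {"Type": "Unit", "Bedrooms": 2, "Result": "SP"},
    {"Type": "Unit", "Bedrooms": 2, "Result": "SP"},
]

# Count listings for every (Type, Bedrooms, Result) combination.
pivot = Counter((r["Type"], r["Bedrooms"], r["Result"]) for r in rows)

# Print one line per (Type, Bedrooms) row label, with Result codes as columns.
results = sorted({r["Result"] for r in rows})
for type_, beds in sorted({(r["Type"], r["Bedrooms"]) for r in rows}):
    counts = [pivot[(type_, beds, res)] for res in results]
    print(type_, beds, dict(zip(results, counts)))
```

A Counter returns 0 for combinations that never occur, which matches the blank cells of a pivot table.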

Use the data in the pivot table generated in part c. to answer the following vendors' questions in parts d. and e., about the properties listed for auction in Sydney on 8 June 2019.

d. James and Anna would like to calculate the clearance rate for all properties listed for auction that week.

i. How many properties were originally listed for auction for the day in question?

ii. How many of these were sold (at auction, prior or after)?

iii. Express the number of properties sold (at auction, prior or after) as a percentage of all properties listed for auction.

e. Given that they will be selling their four bedroom house soon, they are also interested in the clearance rate of all four bedroom houses that week. They would like to compare the clearance rate of four bedroom houses with the overall clearance rate.

i. How many four bedroom houses were originally listed for auction for the day in question?

ii. How many of these were sold (at auction, prior or after)?

iii. Express the number of four bedroom houses sold (at auction, prior or after) as a percentage of all four bedroom houses listed for auction.

iv. Was the clearance rate for four bedroom houses worse, the same or better than the clearance rate for all properties that week?
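The clearance rate in parts d. and e. is simply the number sold divided by the number listed, expressed as a percentage. A minimal sketch follows; the counts are placeholders chosen for illustration, not figures from the data set, and the function name is hypothetical.

```python
def clearance_rate(sold: int, listed: int) -> float:
    """Return the number sold as a percentage of the number listed."""
    if listed <= 0:
        raise ValueError("listed must be positive")
    return 100.0 * sold / listed

# Placeholder counts, not taken from the data set.
overall = clearance_rate(sold=60, listed=100)
four_bed_houses = clearance_rate(sold=9, listed=20)
print(f"overall {overall:.1f}%, four bedroom houses {four_bed_houses:.1f}%")
```

With the real pivot table, 'sold' would be the total of the result codes matched to the three sold outcomes in part a.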

f. James and Anna would like to compare the sales outcomes of all the properties that were listed for auction that week.

i. For all the properties that were listed for auction that week, generate a two way pivot table of 'Type' by 'Result'.

ii. Use this pivot table to generate a single horizontal 100% component bar chart with type along the vertical axis and the different types of 'result' making up the components of each of the four bars. Insert an appropriate title on the chart.

iii. Use the graph to identify the two types of properties which had approximately the same proportion of properties passed in that week.

Include both the pivot table and the 100% component bar chart generated in parts i. and ii. with your assignment submission.

g. Sort the data by the 'Type' variable and then extract the house data only to a separate file. We will use this new file to generate a table of descriptive statistics for the variable 'Price'. First, however, we need to clean up the data set so that we can calculate the correct descriptive statistics. You might like to make a copy of this new file before you start deleting entries, just in case you delete an entry you shouldn't have.

  • Sort the house data in the new file by 'Result'.
  • Delete any results for which there are no selling prices.
  • You will notice that there are a number of properties which have a price associated with the result vendor bid. As this bid is made by the vendor only and is not an actual selling price, these should also be deleted.

Hint: You should have 171 house properties with a selling price
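The cleaning steps above, and the statistics requested in the parts of g. that follow, can be cross-checked with a short Python sketch. The records are invented for illustration, and treating 'VB' as the vendor bid code is an assumption that should be confirmed against the key in the data set. The statistics module uses the sample (n - 1) formulas, which is also what Excel's Descriptive Statistics tool reports.

```python
import statistics

# Hypothetical house records; the real data set yields 171 usable prices.
# Treating "VB" as the vendor bid code is an assumption; check the key worksheet.
houses = [
    {"Result": "S", "Price": 650_000},
    {"Result": "SP", "Price": 820_000},
    {"Result": "S", "Price": 1_150_000},
    {"Result": "SA", "Price": 1_400_000},
    {"Result": "S", "Price": 3_900_000},
    {"Result": "VB", "Price": 2_000_000},  # vendor bid, not a sale
    {"Result": "PI", "Price": None},       # no selling price recorded
]

# Keep only records that have a price and are not vendor bids.
prices = [h["Price"] for h in houses
          if h["Price"] is not None and h["Result"] != "VB"]

mean_price = statistics.mean(prices)
median_price = statistics.median(prices)
sample_sd = statistics.stdev(prices)       # sample (n - 1) standard deviation
sample_var = statistics.variance(prices)   # sample (n - 1) variance

def to_scientific(x: float, digits: int = 4) -> str:
    """Format x as 'a x 10^b' rather than Excel's 'aE+b' style."""
    mantissa, exponent = f"{x:.{digits}e}".split("e")
    return f"{mantissa} x 10^{int(exponent)}"

# Nearest thousand for mean and median, nearest hundred for the standard deviation.
print(round(mean_price, -3), round(median_price, -3), round(sample_sd, -2))
print(to_scientific(sample_var))
```

Note how the single very expensive house pulls the mean well above the median in this toy sample.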

i. Use the data analysis tools in Excel to generate a table of descriptive statistics for the house data for the variable 'Price' and include it in your assignment submission. Use the statistics from the table to answer the following questions.

ii. What was the mean and median selling price of houses that week, expressed to the nearest thousand dollars? What was the standard deviation of these selling prices, expressed to the nearest hundred dollars?

iii. What was the selling price of the cheapest house sold that week? Look further afield in the data set to include information about the number of bedrooms and the suburb.

iv. The table produced in part i. contains a figure that represents the sample variance. Copy the table below into your assignment and fill in both cells with the sample variance expressed as an actual number value and in scientific notation. One of the answers will come straight from your table of descriptive statistics. To obtain the other answer, you will have to convert the sample variance from scientific notation to number form or from number form into scientific notation. When expressing the number using scientific notation, do not use Excel's method of representing such large numbers.

 

                     | sample variance
actual number value  |
scientific notation  |

h. Use the Data Analysis Tools in Excel to generate a frequency distribution and histogram for the selling prices of houses sold that week. Use $700 000 as the upper limit of the first class and a class width of $700 000.

Use the same 171 house selling prices we used to generate the table of descriptive statistics in part g.
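The class boundaries described in part h. can be sketched as follows. The prices are hypothetical; the only values taken from the question are the first upper limit and the class width of $700 000. The sketch assumes each class includes its upper limit, which is how Excel's Histogram tool assigns values to bins.

```python
import math
from collections import Counter

CLASS_WIDTH = 700_000  # from the question: first upper limit and class width

def class_upper_limit(price: float) -> int:
    """Return the upper limit of the class containing price.

    Classes are treated as (lower, upper], so a price equal to a bin
    value is counted in that bin.
    """
    return max(1, math.ceil(price / CLASS_WIDTH)) * CLASS_WIDTH

# Hypothetical prices for illustration only.
prices = [650_000, 700_000, 820_000, 1_150_000, 1_400_000, 3_900_000]
frequency = Counter(class_upper_limit(p) for p in prices)
for upper in sorted(frequency):
    print(f"up to ${upper:,}: {frequency[upper]}")
```

Classes with no observations are simply absent from the Counter, whereas Excel lists them with a frequency of 0.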

i. When quoting the average house price in Sydney, sources always quote the median rather than the mean selling price. Why is this appropriate for the data in this question? Explain how the histogram generated in part h. and a key descriptive statistic generated in part g.i. support your answer.

Question 2 - A bank reports that families often have more than one credit card. The most popular in Australia are Mastercard, Visa and American Express. For those families with one or more of these cards, 30% have a Mastercard, 10% have an American Express card and 60% a Visa card. They also reported that 4% of families have both a Mastercard and an American Express card, 7% have both a Mastercard and a Visa card and only 3% have both a Visa card and an American Express card. Further, 1% have all three cards.

a. Calculate the probability of selecting a family that has either a Visa card or a Mastercard.

b. If a family has a Mastercard, what is the probability it also has a Visa card?

c. Is possession of a Mastercard independent of possession of a Visa card? Why or why not?

d. Is possession of an American Express card mutually exclusive of possession of a Visa card? Justify your decision.
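The rules needed for Question 2 are the addition rule, the definition of conditional probability, and the definitions of independence and mutual exclusivity. The sketch below applies them to the percentages stated in the question; the variable names are illustrative, and the written answer would still need the probability statements and sentences the marking criteria ask for.

```python
# Probabilities given in Question 2.
p_m, p_a, p_v = 0.30, 0.10, 0.60          # Mastercard, American Express, Visa
p_m_and_a, p_m_and_v, p_v_and_a = 0.04, 0.07, 0.03

# a. Addition rule: P(V or M) = P(V) + P(M) - P(V and M)
p_v_or_m = p_v + p_m - p_m_and_v

# b. Conditional probability: P(V | M) = P(M and V) / P(M)
p_v_given_m = p_m_and_v / p_m

# c. Independence would require P(M and V) to equal P(M) * P(V).
independent = abs(p_m_and_v - p_m * p_v) < 1e-12

# d. Mutually exclusive events would have P(V and A) equal to 0.
mutually_exclusive = p_v_and_a == 0

print(round(p_v_or_m, 4), round(p_v_given_m, 4), independent, mutually_exclusive)
```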

Question 3 - There are a number of companies that provide a delivery service for take away meals. One of the important factors for the customer is the time between placing an order and receiving the meal. A particular Thai restaurant uses two different delivery companies. Company A delivers 40% of their orders while Company B delivers the rest. A survey of customers using the service has indicated they want their food delivered within 30 minutes. Historically, Company B has experienced 10% of their orders taking longer than 30 minutes to deliver while Company A has been late on 15% of their orders.

a. We will use A to represent the event 'Company A delivers the order' and L to represent the event 'The meal is delivered late'.

Use the correct statistical notation and words to define the complement of both A and L. (Use this terminology in your working throughout the remainder of this question).

b. Construct a fully labelled probability tree to describe this problem with the outcomes and probabilities shown along each branch.

c. A customer has just received their order. What is the probability the order was delivered on time, that is, within 30 minutes of placing the order?

d. A customer who contacted the Thai restaurant reported receiving their order 45 minutes after placing the order. Which company is most likely to have delivered the order? Use probabilities to support your conclusion.

e. Considering the probability calculated in part c., should the restaurant owner have any concerns about the reliability (delivery times) of the delivery companies they use? Explain.
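Parts c. and d. rest on the law of total probability and Bayes' rule. A minimal sketch using the figures stated in the question is shown below; B is used here as shorthand for the complement of A (Company B delivers the order), which is an assumption about notation rather than something the question prescribes.

```python
# Figures stated in Question 3.
p_a, p_b = 0.40, 0.60                        # share of orders per company
p_late_given_a, p_late_given_b = 0.15, 0.10  # late rates per company

# Law of total probability: P(L) = P(A)P(L|A) + P(B)P(L|B)
p_late = p_a * p_late_given_a + p_b * p_late_given_b
p_on_time = 1 - p_late

# Bayes' rule: P(A|L) = P(A)P(L|A) / P(L), and likewise for B.
p_a_given_late = p_a * p_late_given_a / p_late
p_b_given_late = p_b * p_late_given_b / p_late

print(round(p_on_time, 4), round(p_a_given_late, 4), round(p_b_given_late, 4))
```

The two posterior probabilities must sum to 1, which is a quick check on the arithmetic.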

Question 4 - Suppose that 20% of all share market investors are retirees and we select a random sample of 25 share market investors.

a. If the random variable X is the number of investors in the sample who are retirees, provide two reasons why X has a binomial distribution.

b. State the value/s of parameter/s of X.

c. Calculate (using the appropriate statistical tables) the probability that between five and ten investors in the sample are retirees.

d. Calculate (using the appropriate statistical tables) the probability that at least ten investors in the sample are retirees.

e. Verify your answers to parts c. and d. using the appropriate Excel statistical function and demonstrate you have done this by including the Excel formula used.

f. You have to explain to someone who has not studied statistics what this probability calculated in part d. means. Use simple non statistical language to explain this.

g. How many retirees would you expect to find in a random sample of 25 investors?
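The binomial probabilities in Question 4 can be checked directly from the probability mass function. The sketch below assumes 'between five and ten' is inclusive of both end points; if the subject treats it differently, the range should be adjusted. The function name is illustrative.

```python
from math import comb

n, p = 25, 0.20  # sample size and proportion of retirees from the question

def binom_pmf(k: int) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumes 'between five and ten' means 5 <= X <= 10.
p_between = sum(binom_pmf(k) for k in range(5, 11))
# 'At least ten' means X >= 10.
p_at_least_10 = sum(binom_pmf(k) for k in range(10, n + 1))
# Expected number of retirees: E(X) = n * p.
expected = n * p

print(round(p_between, 4), round(p_at_least_10, 4), expected)
```

In Excel, the same cumulative probabilities are available from BINOM.DIST with its cumulative argument set to TRUE, for example BINOM.DIST(10,25,0.2,TRUE) minus BINOM.DIST(4,25,0.2,TRUE) under the inclusive reading above.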

Question 5 - A maintenance worker in a large manufacturing plant knows that a filling machine breaks down on average 3 times per month. Assume these break downs occur randomly and independently of one another.

a. If the random variable X is the number of times the machine breaks down, identify the distribution of X and state the value/s of its parameter/s.

b. Calculate (using the appropriate statistical tables) the probability there are less than three breakdowns in the next month.

c. Calculate (using the appropriate statistical tables) the probability of seven breakdowns in the next two months.

d. Verify your answers to parts b. and c. using the appropriate Excel statistical function and demonstrate you have done this by including the Excel formula used.
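For Question 5, the key step is rescaling the mean to the time period in the question: 3 breakdowns per month becomes 6 for a two month period. A short sketch of the Poisson calculations, with an illustrative function name, is given below.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# b. Fewer than three breakdowns in one month: P(X <= 2) with mean 3.
p_less_than_3 = sum(poisson_pmf(k, 3) for k in range(3))

# c. Exactly seven breakdowns in two months: mean is 2 * 3 = 6.
p_seven_in_two_months = poisson_pmf(7, 6)

print(round(p_less_than_3, 4), round(p_seven_in_two_months, 4))
```

The corresponding Excel checks would be POISSON.DIST(2,3,TRUE) and POISSON.DIST(7,6,FALSE).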

Question 6 - Suppose that the height of Australian males is a normally distributed random variable with a mean of 176.8cm and a standard deviation of 9.5cm.

a. If the random variable X is the height of an Australian male, identify the distribution of X and state the value/s of its parameter/s.

b. Calculate (using the appropriate statistical tables) the probability that a randomly selected Australian man is more than two metres tall.

c. To become a jockey, as well as a passion for the sport, you need to be relatively small, generally between 147cm and 168cm tall. Calculate (using the appropriate statistical tables) the proportion of Australian males who fit this height range.

d. Some of the smaller regional planes have small cabins, consequently the ceilings can be quite low. Calculate (using the appropriate statistical tables) the ceiling height of a plane such that at most 2% of the Australian men walking down the aisle will have to duck their heads.

e. Verify your answers to parts b., c. and d. using the appropriate Excel statistical function and demonstrate you have done this by including the Excel formula used.

f. A random sample of forty Australian males is selected. State the type of distribution and the value/s of the parameter/s for the mean of this sample.

g. Calculate (using the appropriate statistical tables) the probability that the average height of this sample is less than 170cm.
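The normal distribution calculations in Question 6 can be cross-checked with Python's statistics.NormalDist (available from Python 3.8). The sketch below uses the mean and standard deviation stated in the question, and for part g. it uses the standard result that the sample mean has standard error sigma divided by the square root of n. Answers read from printed tables will differ slightly because z is rounded to two decimal places there.

```python
from math import sqrt
from statistics import NormalDist

height = NormalDist(mu=176.8, sigma=9.5)  # parameters from the question

# b. P(X > 200 cm)
p_over_2m = 1 - height.cdf(200)

# c. P(147 < X < 168)
p_jockey = height.cdf(168) - height.cdf(147)

# d. Ceiling height exceeded by at most 2% of men: the 98th percentile.
ceiling = height.inv_cdf(0.98)

# f./g. Sample mean of n = 40 heights: same mean, standard error sigma / sqrt(n).
sample_mean = NormalDist(mu=176.8, sigma=9.5 / sqrt(40))
p_mean_below_170 = sample_mean.cdf(170)

print(round(p_over_2m, 4), round(p_jockey, 4), round(ceiling, 1), p_mean_below_170)
```

In Excel, NORM.DIST(x, mean, sd, TRUE) gives the cumulative probability and NORM.INV(probability, mean, sd) gives the percentile.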

A copy of the assignment, the cover sheet, the data set and the document 'working with real data sets' can be found at the Subject Interact site in the Assignment folder.

Presentation - The assignment must be neatly handwritten. Any Excel output should be inserted where required at the appropriate place in the assignment not in an Appendix at the back of the assignment. Pages must be numbered, and your name and student number must be included on every page. This method of presentation has been chosen because it is very difficult and time consuming to type mathematical formulas in Word. Marks will be deducted for assignments which do not follow these guidelines. The assignment must have a completed signed cover sheet attached to the front of the assignment. The cover sheet can be found in the Assignment folder.

Assignments submitted without a signed cover sheet will not be marked. If you choose to submit the assignment online through EASTS, it must be submitted as a single Word or PDF file. Assignments submitted in non-printable formats such as a ZIP file or as a collection of images will not be marked. If your scanner produces separate graphics files please paste them into a Word document before submitting to EASTS. Pages must be numbered, and your name and student number must be included on every page. Once you have submitted the assignment, please view your uploaded assignment to ensure that all the images, graphs, etc. are visible and formatted correctly.

Submission Requirements - The assignment can be submitted by hand in the Assignment box on your local campus or uploaded to EASTS. Pages must be numbered, and your name and student number must be included on every page whichever submission method is chosen. If submitting by hand the assignment must be submitted by 5pm on the due date. Please check your subject outline for information about the location of this box. If submitting through EASTS the assignment can be submitted up until 11.59pm (AEDT) on the due date. It is a requirement that students keep a copy of all assignments submitted to the University for marking. Note: The photocopiers in the Learning Commons on each campus can be used to scan your assignment as a single pdf if you do not have this capacity at home. Make sure you choose colour scanning as your graphs will have been prepared in colour.

Marking criteria - All questions in this assignment involve problems with a sequence of several steps. These are marked using the following criteria.

  • Correctness: Arithmetic, algebra and calculations are correct (except possibly some minor rounding errors).
  • Process/Method: The indicated/correct method is selected and carried out completely.
  • Communication/Working: You have made it clear what you have done using an appropriate mix of text, mathematical notation, neat diagrams and code excerpts.

The mark for each question is determined by the proportion of your solution that satisfies these criteria.

Full marks - Correct answer written as a clear response to the original question. Full worked solutions provided that are clear, adequate and legible and use the correct mathematical notation and reasoning, with neat diagrams and code excerpts where appropriate. Final answer includes appropriate units and (where specified) correct rounding. The application of these general principles to each individual question is given with the questions themselves. Please read them to maximise your marks.

Question 1 (42 marks) - Marking Criteria for Question 1 - Pivot table generated correctly using Excel, using entire data set and with 'Type' and 'Bedrooms' as row labels and 'Result' as a column label. 100% component bar chart generated correctly using Excel with appropriate title. All statistics rounded correctly. Correct property and additional information. Correct justification clearly and concisely explained with relevant reference to histogram and statistic to support answer.

Question 2 (11 marks) - 10 marks as indicated +1 for answering the questions as sentences in parts a. and b. Marking Criteria for Question 2 - Correct translation of question into a mathematical probability statement and formula; correct numerical calculations and substitutions; correct answer.

Question 3 (13 marks) - 12 marks as indicated +1 for answering the question as a sentence in part c. Marking Criteria for Question 3 - Correct translation of question into a mathematical probability statement and formula; correct numerical calculations and substitutions; correct answer. Correct decision supported by relevant probability calculations. Answer clearly and concisely explained.

Question 4 (16 marks) - 15 marks as indicated +1 for answering the questions as sentences in parts c., d. and f. Marking Criteria for Question 4 - 2 valid reasons explained in the context of the question. Correct translation of 'between' into a mathematical inequality; correct working and answer using statistical tables. Correct translation of 'at least' into a mathematical inequality; correct working and answer using statistical tables.

Question 5 (11 marks) - 10 marks as indicated +1 for answering the questions as sentences in parts b. and c. Marking Criteria for Question 5 - Correct distribution, parameter value/s and notation. Correct translation of 'less than' into a mathematical inequality and correct answer. Correct probability statement, parameter/s, working and answer using statistical tables.

Question 6 (23 marks) - 22 marks as indicated +1 for answering the questions as sentences in parts b., c., d., g. Marking Criteria for Question 6 - Correct transformed value; relevant fully labelled diagram; correct working and answer using statistical tables. Relevant fully labelled diagram; correct working and answer using statistical tables. Correct transformed value; relevant fully labelled diagram; correct working and answer using statistical tables.
