Analysis of US Health Insurance data

Reference no: EM132769015

MIS770 Foundation Skills in Data Analysis - Deakin University

Assessment Task - Analysis of US Health Insurance data

Description

The purpose of this assignment is to investigate a dataset using the knowledge learned in Modules 1 and 2. This will enable conclusions to be drawn that ultimately assist in decision making.
The assignment requires you to analyse a given dataset, interpret the results, and then draw conclusions so that you can answer specific questions in the form of a business report. (The questions are set out in the email reproduced below.)
The aims of the assignment are to:
• provide you with some examples of the application of data analysis
• test your understanding of the material presented in the relevant topics
• test your ability to analyse data and interpret your results
• test your ability to effectively communicate your results to others

Before attempting the assignment, make sure that you have prepared yourself well by reading the relevant sections of the prescribed textbook and reviewing the materials provided in Modules 1 and 2 (i.e. Topics 1 to 7).

Specific Requirements
The UnitedHealth Group is America's most prominent health insurance provider. They want to better understand certain population characteristics that might contribute to the high medical costs being billed to insurance providers. They have access to a random sample of US Health Insurance data containing 1338 insured personnel with their Age, Gender, Body Mass Index (BMI), Number of Children, Smoking status, Region and Charges.

You are a Data Analyst working for UnitedHealth Group. Your Manager, Daisy Pearce, has asked you to conduct a preliminary analysis. In particular, you are expected to apply a series of statistical techniques and produce a report based on your findings.

Daisy's email is reproduced on the next page.

Email from Daisy Pearce
Hi,
As per our conversation, I have spoken with our reporting team and we have the following questions relating to the US health insurance data (contained in the file Insurance.xlsx). Please complete the required analysis and prepare a report for me containing answers to the following questions:

Q1. An Overall View of both "Charges" and "Smoking"
Can you provide me with overall summaries of:
a) Individual medical cost billed by health insurance
b) Smoking status

Q2. Relationships
a) Is there a relationship between the age of the primary beneficiary, their body mass index (BMI), number of children and medical cost?
b) We would also like to know whether there is a gender bias in the smoking behaviour of beneficiaries.
c) Can you further analyse whether the beneficiary's residential area/region in the US affects how health insurance providers bill their medical costs?

I realise that the US Health Insurance data contain a random sample of 1338 insured personnel, and that this information can be used to draw inferences about the specific attributes of the whole insured population and charges billed by health insurance providers. With that in mind, please provide me with answers to the following questions:

Q3. The UnitedHealth Group would like estimates of the following.
a) Average medical cost for an older beneficiary (older adulthood: 56 years and older)
b) Proportion of smokers who are obese (BMI of at least 30)

Q4. The UnitedHealth Group would like a comparison between this year's medical cost and the industry average.
a) The industry average medical cost for a single adult (i.e. without children) is at least $10,000. Is there any evidence to support this assertion?
b) Based on the industry average, less than 50% of beneficiaries are female. Can this claim also be substantiated?

Q5. Appropriate Sample Size
One of the company's overall goals is to estimate the average medical cost for all insured personnel to within $1000 (±$1000) and the proportion of all insured smokers to within 3%. Will a sample size of 1338 be large enough? If not, what size sample should be taken? What other factors should be taken into account when sampling?

Business Report Requirements
• Your report should be no longer than 4 pages and should not include any charts, tables, or appendices in the report. Charts/graphics and tables are only to be placed in the Data Analysis file i.e. the Excel spreadsheet and not reproduced in the report.
• Suggested formatting for the report: single-line spacing; no smaller than 10-point font; page margins approx. 25mm, and good use of white space.
• Your report must have a cover sheet containing your particulars and Unit details.
• The report is to be written as a stand-alone document (assume Daisy will only read your report). Thus, you should not have any references in the report to your data analysis output (e.g. "According to Table 1 in the analysis...").
• Your report must contain an executive summary that explains in plain language the purpose of the report and summarises the main findings. The executive summary should be no more than 300 words long.
• The body of your report must be set out in the same order as in the originating email from Daisy, with each section (question) clearly marked.
• Use plain language and succinct explanations. Avoid the use of technical or statistical jargon as Daisy cannot be expected to understand statistical terminology. As a guide to the meaning of "Plain Language", imagine you are explaining your findings to a person without any statistical training (e.g. someone who has not studied this unit). What type of language would you use in this case?
• Marks will be lost if you use unexplained technical terms, include irrelevant material, or have poor presentation/organisation.

Data Analysis Instructions

In order to prepare a reply to Daisy's email, you will need to examine and analyse the dataset Insurance.xlsx thoroughly.
Daisy has asked a number of questions, and your Data Analysis output (i.e. your charts/tables/graphs) should be structured so that you answer each question on the separate tab/worksheet provided in your Excel document. There are also three extra tabs in Insurance.xlsx called CI, HT and SampleSize, and you should use the templates contained in these tabs when arriving at your "Confidence Interval", "Hypothesis Test" and "Sample Size" answers.

Q1. An overall summary of Charges (in dollars) and summary of Smoking status
You are required to comprehensively describe the variable 'Charges' by itself and the variable 'Smoking' by itself using the most appropriate techniques from Module 1.
Your analysis should include numerical summaries, graphs and tables. The importance of other variables is considered in other questions. You should thoroughly investigate relevant summary measures (and their reliability) for these two variables. Also, there may well be suitable tables and charts/graphs that will illustrate more clearly other important features of charges and smoking. (See Topics 1-3 learning materials)
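
As an illustration of the kind of numerical summaries and frequency tables described above, the following is a minimal Python sketch using only the standard library. It is not part of the Excel workflow the assignment requires. The function names (summarise_numeric, frequency_table) and the toy values are hypothetical and are not taken from Insurance.xlsx.

```python
import statistics
from collections import Counter

# Hypothetical toy values for illustration only; NOT taken from Insurance.xlsx.
charges = [1200.0, 2500.0, 3100.0, 4800.0, 9500.0, 12000.0, 36000.0]
smoker = ["no", "no", "yes", "no", "no", "no", "yes"]


def summarise_numeric(values):
    """Return common summary measures for one numerical variable."""
    # "inclusive" treats the data as containing its own minimum and maximum.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": q2,
        "stdev": statistics.stdev(values),  # sample standard deviation
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "min": min(values),
        "max": max(values),
    }


def frequency_table(labels):
    """Return {category: (count, proportion)} for one categorical variable."""
    counts = Counter(labels)
    total = len(labels)
    return {category: (count, count / total) for category, count in counts.items()}


if __name__ == "__main__":
    for name, value in summarise_numeric(charges).items():
        print(f"{name}: {value:,.2f}")
    for category, (count, share) in frequency_table(smoker).items():
        print(f"{category}: {count} ({share:.1%})")
```

In the toy data the mean is well above the median, which is the typical signature of a right-skewed variable; in that situation the median and interquartile range are usually the more reliable summary measures.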

Q2. Descriptive measures and insights
Your course notes (Module One) give methods (numerical summaries/tables/graphs/charts) for summarising a single variable and for investigating the relationships (dependencies) between two variables. For example:
• Pie/Bar charts
• Summary/Frequency Distribution tables
• Comparative summary measures including quartiles and percentiles
• Scatter diagrams
• Coefficient of correlation, r value
• Contingency tables/Cross tabs
• Stacked bar charts, side-by-side bar charts
• Histograms/Frequency polygons/Ogives
• Single/Multiple box and whisker plots etc. (See Module One learning materials)
Use whatever techniques you have studied in Module 1 to investigate the associations/relationships. Generate suitable visualisations (Tables/Graphs/Charts) and numerical measure(s) demonstrating the existence or otherwise of a relationship. Remember to provide a brief overall summary when concluding these questions.
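
Two of the numerical measures listed above, the coefficient of correlation and the contingency table, can be sketched in Python as follows. This is an illustrative sketch only, with hypothetical function names (pearson_r, contingency_table, row_percentages) and made-up toy values; it does not use or reproduce the contents of Insurance.xlsx.

```python
import math
from collections import Counter


def pearson_r(x, y):
    """Pearson coefficient of correlation between two numerical variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def contingency_table(rows, cols):
    """Cross-tabulate two categorical variables as {row: {col: count}}."""
    counts = Counter(zip(rows, cols))
    row_levels = sorted(set(rows))
    col_levels = sorted(set(cols))
    # Counter returns 0 for combinations that never occur.
    return {r: {c: counts[(r, c)] for c in col_levels} for r in row_levels}


def row_percentages(table):
    """Convert each row of a contingency table to proportions of the row total."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: count / total for col, count in cells.items()}
    return result


if __name__ == "__main__":
    # Hypothetical toy values for illustration only.
    age = [19, 25, 33, 41, 52, 60]
    cost = [1500.0, 2100.0, 4200.0, 6100.0, 9800.0, 12500.0]
    gender = ["female", "male", "male", "female", "male", "female"]
    smoker = ["no", "yes", "no", "no", "yes", "no"]

    print(f"r = {pearson_r(age, cost):.3f}")
    table = contingency_table(gender, smoker)
    print(table)
    print(row_percentages(table))
```

Comparing row percentages rather than raw counts is what makes two groups of different sizes comparable, for instance when asking whether one gender smokes proportionally more than the other.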

Q3-Q4. The analysis required involves inferential statistics, which are covered in Module 2. Use the relevant Excel templates (CI and HT) provided in the Data file.
These questions will require you to complete either a confidence interval or a hypothesis test. Go through each of the questions asked by Daisy and decide which technique is the most appropriate. Below are some hints regarding the most appropriate technique:
• Do we have to make an estimate, and therefore need a confidence interval?
• Are we testing a theory/claim or comparing values, and therefore need a hypothesis test?
So decide which you think is the most appropriate technique (the tutorials for Topics 6 and 7 will help here).
• You can assume that a 95% confidence level is appropriate.
• Use 5% significance in any hypothesis tests you perform, and provide a summary of your conclusions.

Q5. Use the relevant Excel templates provided in the Data file.
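
For reference, the standard sample-size formulas behind this kind of question are n = (z·σ/E)² for a mean and n = z²·p(1−p)/E² for a proportion, rounded up to the next whole number. The Python sketch below applies them with the margins of error stated in Daisy's email ($1000 and 3%). The function names are hypothetical, the standard deviation of 12000 is a made-up placeholder rather than a value from Insurance.xlsx, and p = 0.5 is the conservative worst-case assumption used when no estimate of the proportion is available; the SampleSize template may instead use estimates from the sample.

```python
import math
from statistics import NormalDist


def sample_size_mean(sigma, margin, confidence=0.95):
    """Smallest n so that the mean is estimated to within +/- margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


def sample_size_proportion(margin, p=0.5, confidence=0.95):
    """Smallest n so that a proportion is estimated to within +/- margin.

    p = 0.5 maximises p * (1 - p), so it gives the largest (safest) n.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    # sigma = 12000 is a hypothetical placeholder, NOT the dataset's value.
    n_mean = sample_size_mean(sigma=12000.0, margin=1000.0)
    n_prop = sample_size_proportion(margin=0.03)  # conservative p = 0.5
    print(f"n for the mean (hypothetical sigma): {n_mean}")
    print(f"n for the proportion (p = 0.5):      {n_prop}")
    # When both targets must be met, the larger requirement governs.
    print(f"required n: {max(n_mean, n_prop)}")
```

Because both precision targets must be met by the same sample, the larger of the two required sizes is the one to compare against 1338.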

Learning Outcome 1: Manipulate and summarise data that accurately represents real world problems

Learning Outcome 2: Interpret and appraise statistical output to assist in real-world decision making

Learning Outcome 3: Critical thinking: evaluating information using critical and analytical thinking and judgment.

Attachment:- Analysis of US Health Insurance data.rar
