Create an initial multivariable regression model

Reference no: EM132379573

Case Study Instructions -

Background - These data were simulated. The motivation for this assessment comes from an article by Dr Leah Shepard (USYD). Reading it is not required for the assignment or the course; the link is provided purely for your own interest, should you wish to know more about the topic.

Subject matter background - Prostate-specific antigen (PSA) is a blood marker for prostate cancer and men with elevated levels of PSA (relative to their age) may be sent for further cancer testing and monitoring. However, there is evidence that HIV+ men may have lower PSA levels compared to their HIV-negative counterparts, which could lead to underdiagnosis in this group.

You are approached by a colleague who is interested in quantifying whether HIV+ men have lower PSA levels compared to HIV-negative men. They are concerned because PSA level is known to be higher in older men and the HIV-positive population tends to have a younger age distribution than the HIV-negative population, so a crude comparison of the two groups could be confounded by age.
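The colleague's concern can be illustrated with a small simulation. This is only a sketch under invented assumptions: the sample size, the age distributions, the effect sizes and the use of log(PSA) are all made up for illustration and are not taken from the assignment data. It compares the HIV coefficient from a crude model with the coefficient from an age-adjusted model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # invented sample size, far larger than the 296 men in the assignment

# Hypothetical data-generating process (every number here is an assumption):
# HIV-positive men are younger on average, and log(PSA) rises with age.
hiv = rng.integers(0, 2, size=n)
age = rng.normal(55 - 10 * hiv, 8, size=n)
true_hiv_effect = -0.20
log_psa = -1.0 + 0.03 * age + true_hiv_effect * hiv + rng.normal(0, 0.5, size=n)


def ols(y, *covariates):
    """Least-squares coefficients for y ~ intercept + covariates."""
    X = np.column_stack([np.ones(len(y)), *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


crude = ols(log_psa, hiv)[1]          # HIV coefficient with no adjustment
adjusted = ols(log_psa, hiv, age)[1]  # HIV coefficient adjusted for age

print(f"crude HIV coefficient:    {crude:.3f}")
print(f"adjusted HIV coefficient: {adjusted:.3f}")
```

Under these assumptions the crude coefficient overstates the HIV difference, because it also absorbs the age gap between the groups; the adjusted coefficient lands close to the simulated effect of -0.20.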

They provide you with a dataset on 296 men aged 18 and over attending participating health clinics. Serum samples were collected and analysed for total PSA and testosterone level. Information on several other variables that are possibly associated with PSA level was also collected. The variables in your data are:

  • id: participant identification number
  • psa: measured serum total PSA (ng/mL). This is the outcome variable for this assessment.
  • hiv: HIV-positive status (0: HIV-negative, 1: HIV-positive). This is the exposure variable for this assessment.
  • age: age at date of serum sample collection
  • test: measured serum total testosterone (ng/dL)
  • ethnicity: (0: Caucasian, 1: Other)
  • wc: waist circumference (cm)
  • pros_vol: prostate volume (mL)
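Before modelling, it can help to confirm that a data file matches the variable list above. The following is a minimal sketch using only the Python standard library. The function name check_rows, the CSV layout, the assumption that missing values appear as empty fields, and the two sample rows are all hypothetical; the real assignment file may be formatted differently.

```python
import csv
import io

EXPECTED_COLUMNS = ["id", "psa", "hiv", "age", "test", "ethnicity", "wc", "pros_vol"]
BINARY_COLUMNS = ["hiv", "ethnicity"]  # both are coded 0/1 in the variable list


def check_rows(rows):
    """Return a list of human-readable problems found in the parsed rows."""
    problems = []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        missing = [c for c in EXPECTED_COLUMNS if not row.get(c)]
        if missing:
            problems.append(f"line {line_no}: missing values for {missing}")
            continue  # skip the range checks when a value is absent
        for col in BINARY_COLUMNS:
            if row[col] not in ("0", "1"):
                problems.append(f"line {line_no}: {col}={row[col]!r} is not 0 or 1")
        if float(row["psa"]) <= 0:
            # A non-positive value would break any later log transformation.
            problems.append(f"line {line_no}: psa is not positive")
        if float(row["age"]) < 18:
            # The brief states that participants are aged 18 and over.
            problems.append(f"line {line_no}: age below 18")
    return problems


# Two invented rows; the second has an invalid hiv code and an age below 18.
sample = io.StringIO(
    "id,psa,hiv,age,test,ethnicity,wc,pros_vol\n"
    "1,1.2,0,54,480,0,96,31\n"
    "2,0.8,2,17,510,1,88,28\n"
)
rows = list(csv.DictReader(sample))
problems = check_rows(rows)
for p in problems:
    print(p)
```

With a real file, open it with open(path, newline="") in place of the in-memory sample, and also check that len(rows) equals the 296 participants described above.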

Exercise -

Your task is to conduct a regression analysis to assess whether HIV+ men have lower PSA levels after adjusting for differences in age and other possible confounders. This regression model should have psa as the outcome variable and hiv as the exposure variable. It should also include adjustment for potential or actual confounders as you see fit.

To this end, follow the model building steps below (in this order):

1. Investigate the individual associations between each variable and psa: identify which variables should potentially be included in a multivariable model; identify if any transformations are necessary; and identify any possible issues such as non-linearity or collinearity.

2. Create an initial multivariable regression model with psa as the outcome, hiv as the exposure and including all possible confounders identified in step 1.

3. Investigate possible collinearity in this model and deal with it appropriately (if needed).

4. Identify the most suitable multivariable regression model for this research question, excluding any further variables as you see fit (or none at all).

5. Check the assumptions of this model and make any adjustments as necessary.
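Steps 1, 2, 3 and part of step 5 can be sketched as follows. The brief implies that the analysis itself is done in Stata; Python with NumPy is used here only for illustration. The data are invented stand-ins (not the assignment dataset), the log transformation of psa is an assumption that step 1 is meant to test, and the helper names fit_ols and r_squared are hypothetical. Step 4 (choosing which variables to drop) is a judgment call and is not automated here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 296  # same size as the assignment dataset, but every value below is invented

hiv = rng.integers(0, 2, size=n)
age = rng.normal(55 - 8 * hiv, 10, size=n)
wc = rng.normal(95, 10, size=n)
pros_vol = 10 + 0.4 * age + rng.normal(0, 5, size=n)  # deliberately correlated with age
log_psa = -1.5 + 0.02 * age - 0.2 * hiv + 0.01 * pros_vol + rng.normal(0, 0.5, size=n)

predictors = {"hiv": hiv, "age": age, "wc": wc, "pros_vol": pros_vol}


def fit_ols(y, columns):
    """Coefficients, standard errors and residuals for y ~ intercept + columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se, resid


def r_squared(y, columns):
    _, _, resid = fit_ols(y, columns)
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)


# Step 1: association of each variable with the (log) outcome, one at a time.
for name, x in predictors.items():
    b, s, _ = fit_ols(log_psa, [x])
    print(f"univariable {name:9s} coef={b[1]: .3f} se={s[1]:.3f}")

# Step 2: initial multivariable model containing every candidate.
names = list(predictors)
beta, se, resid = fit_ols(log_psa, [predictors[k] for k in names])

# Step 3: variance inflation factor, VIF_j = 1 / (1 - R^2_j), where R^2_j comes
# from regressing predictor j on all the other predictors.
vif = {}
for k in names:
    others = [predictors[m] for m in names if m != k]
    vif[k] = 1 / (1 - r_squared(predictors[k], others))

# 95% intervals use the normal approximation (1.96) rather than a t quantile.
for k, b, s in zip(names, beta[1:], se[1:]):
    print(f"adjusted {k:9s} coef={b: .3f} "
          f"95% CI=({b - 1.96 * s: .3f}, {b + 1.96 * s: .3f}) VIF={vif[k]:.2f}")

# Step 5 (partial): residuals should be centred on zero without strong skew.
skew = np.mean(resid ** 3) / np.mean(resid ** 2) ** 1.5
print(f"residual mean={resid.mean():.3f}, skewness={skew:.3f}")
```

A fuller check of assumptions would also plot residuals against fitted values and against each continuous predictor, which this sketch omits.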

Written conclusion (no longer than 1 page):

In addition to the oral presentation, you must write a standalone summary of your findings for a clinical collaborator that includes the following:

1. A description and explanation of any issues that arose during the model building process and why you excluded any variables from the analysis (if any).

2. A specific answer to the research question by interpreting your final model including relevant P-values, regression coefficients and/or confidence intervals.

3. A summary of any other findings relevant to the research question.

4. An equation that describes your final model.

This summary should be targeted towards an audience (your hypothetical experimental/clinical collaborator) who is familiar with basic statistics (such as P-values and confidence intervals), but unfamiliar with the technical details of regression analysis. It should not include any Stata output or code.
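As an illustration of item 4, a final model equation might take the general form below. This is only a sketch: it assumes that a natural-log transformation of psa is chosen and that age is retained as a covariate; the dots stand for whichever other covariates survive model building, and none of the coefficients are known from the brief.

```latex
\log(\mathrm{psa}_i) = \beta_0 + \beta_1\,\mathrm{hiv}_i + \beta_2\,\mathrm{age}_i + \dots + \varepsilon_i,
\qquad
\exp(\beta_1) = \frac{\text{geometric mean PSA, HIV-positive}}{\text{geometric mean PSA, HIV-negative}}
```

Under that assumed form, the ratio is interpreted with all other covariates held fixed, so a value of exp(β1) below 1 would correspond to lower PSA in HIV-positive men.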

Attachment:- Assignment Files.rar

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd