BIOSTATISTICS I ASSIGNMENT

The assignment questions are below.

The assignment MUST be typed – note that you can copy and paste selected SPSS output into a Word document. Show working and reasoning. Do not hand in duplicated or unrequested SPSS output. Marks will be deducted for inadequate explanation and poor presentation.

The SPSS procedure used to obtain each answer must also be stated clearly for every question.

  • By submitting the attached work for assessment, you declare that all material drawn from other sources has been fully acknowledged.

Question 1. [7 marks] This question relates to the SPSS file comsurv05.sav. See the document on “Description of datasets” for information on this dataset. Use SPSS (showing appropriate output) and the file comsurv05.sav to produce the requested output and answer the following questions, providing answers to three decimal places:

  (a) [2 marks] Carry out a test that compares the mean body mass index (use BMI) for never smokers and current smokers (use SMOKING). What is the estimated difference in mean BMI between the groups? In your answer, provide the appropriate SPSS output, highlighting the relevant parts (including the p-value and confidence interval), state any assumptions made, and provide a conclusion.
  (b) [1 mark] Provide an interpretation of the p-value from (a) as a probability.
  (c) [2 marks] Produce a scatterplot that shows the relationship between BMI and SBP (ensure you use all observations in the dataset). Perform a linear regression of SBP on BMI and provide your regression output in your answer. What are the estimated intercept and slope of the linear regression of SBP on BMI? Provide an interpretation of the BMI coefficient.
  (d) [2 marks] Based on the fitted model in (c), calculate the estimated difference in mean SBP between people with a BMI of 30 and people with a BMI of 25, and obtain a 95% confidence interval for this difference.
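
The last part of Question 1 rests on the fact that, in a simple linear regression, the intercept cancels when two fitted means are subtracted, so a difference of 5 BMI units corresponds to 5 times the slope, and the confidence limits for the slope scale in the same way. The following Python sketch illustrates that arithmetic only. The function name and the numeric inputs are hypothetical and are not taken from comsurv05.sav; the slope and its 95% limits would come from the SPSS coefficients table.

```python
def scaled_slope_interval(slope, ci_lower, ci_upper, delta):
    """Scale a regression slope and its confidence limits to a change of `delta` in x.

    (b0 + b1 * x2) - (b0 + b1 * x1) = b1 * (x2 - x1), so the intercept drops out.
    """
    estimate = delta * slope
    # sorted() keeps the limits in order even if delta is negative
    lower, upper = sorted((delta * ci_lower, delta * ci_upper))
    return estimate, lower, upper


# Hypothetical slope and 95% limits (NOT results from the assignment dataset)
estimate, lower, upper = scaled_slope_interval(
    slope=1.2, ci_lower=0.9, ci_upper=1.5, delta=30 - 25
)
print(f"difference = {estimate:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")
```

The same scaling applies to any pair of BMI values, because only their difference enters the calculation.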

Question 2. [8 marks] A researcher investigating the association between a genetic marker and a disease in a population collects data from a random sample of 200 individuals. The overall prevalence of disease in this sample is 60% and of the 70 individuals who had the genetic marker 40 had the disease.

  (a) [2 marks] Based on the information provided above, and using hand calculations, construct a 2×2 contingency table defined by the variables ‘Disease’ (Disease/No disease) and ‘Genetic Marker’ (Yes/No) that contains the observed and expected counts. Show your working for all calculations.
  (b) [2 marks] Enter the data from (a) into SPSS and perform an appropriate test of the association between the genetic marker and disease. In your answer, state your hypotheses and any assumptions made, and provide the appropriate SPSS output, the p-value and a conclusion.
  (c) A second genetic marker was hypothesised to have a relationship with the disease. The same 200 individuals were examined and 134 of these had this second genetic marker, with 64 individuals having both genetic markers.
    (i) [2 marks] By hand, construct an appropriate table that summarises the information on both genetic markers. Calculate an appropriate statistic that measures the agreement between these two genetic markers and provide an interpretation of this statistic. Show all working.
    (ii) [1 mark] Perform a test that compares the proportion of individuals who have the first genetic marker to the proportion who have the second genetic marker. What is your conclusion based on the p-value from this test?
  (d) [1 mark] Without doing any additional calculations, and based on your answers to parts (a) to (c), can we conclude whether the second genetic marker is associated with disease? Provide reasoning.
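
The hand calculations in Question 2 use a small set of standard 2×2 formulas: expected count = row total × column total / n, the Pearson chi-square statistic, Cohen's kappa for agreement, and a McNemar-type statistic for paired proportions. The Python sketch below shows these formulas on a hypothetical table; the counts are illustrative and are not the assignment data. The function names are invented for this example, the McNemar statistic is shown without a continuity correction (SPSS may report a corrected or exact version), and the p-value helper assumes 1 degree of freedom.

```python
import math


def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_1df_p(stat):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


def pearson_chi_square(table):
    """Pearson chi-square statistic, the sum of (O - E)^2 / E over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def cohen_kappa(table):
    """Cohen's kappa for a square agreement table: (po - pe) / (1 - pe)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    p_observed = sum(table[i][i] for i in range(len(table))) / n
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n**2
    return (p_observed - p_expected) / (1 - p_expected)


def mcnemar_statistic(table):
    """Uncorrected McNemar statistic based on the two discordant cells."""
    b, c = table[0][1], table[1][0]
    return (b - c) ** 2 / (b + c)


# Hypothetical 2x2 counts (rows and columns are Yes/No), NOT the assignment data
table = [[30, 20], [10, 40]]
chi2 = pearson_chi_square(table)
print("expected:", expected_counts(table))
print(f"chi-square = {chi2:.3f}, p = {chi_square_1df_p(chi2):.4f}")
print(f"kappa = {cohen_kappa(table):.3f}")
print(f"McNemar = {mcnemar_statistic(table):.3f}")
```

Note that the chi-square test treats the rows and columns as two variables measured on independent individuals, whereas kappa and McNemar's test treat them as two measurements on the same individuals, so the same helper functions answer different questions depending on how the table was formed.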

Question 3. [5 marks] This question relates to the table (next page) extracted from an article by Rowe et al. entitled “Associations between COVID-19 and hospitalisation with respiratory and non-respiratory conditions: a record linkage study” published in the Medical Journal of Australia in 2023. Using the information provided in the table, answer the following questions, showing your working and providing answers to two decimal places.

  (a) [1 mark] Use hand calculations to estimate the odds of hospitalisation in Aboriginal/Torres Strait Islander individuals and the odds of hospitalisation in Non-Indigenous individuals.
  (b) [2 marks] Use hand calculation methods to estimate the hospitalisation odds ratio, and its 95% confidence interval, comparing the Aboriginal/Torres Strait Islander group with the Non-Indigenous group. Provide an interpretation of these.
  (c) [1 mark] Perform a test, by hand calculation, that compares the odds of hospitalisation between the Aboriginal/Torres Strait Islander and Non-Indigenous groups. What is your conclusion based on your p-value?
  (d) [1 mark] Are your results in (b) and (c) compatible? Explain your answer.
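
For reference, the quantities asked for in Question 3 can be written as: odds = events / non-events, odds ratio = ratio of the two odds, and a 95% confidence interval formed on the log scale with standard error sqrt(1/a + 1/b + 1/c + 1/d). The Python sketch below applies these formulas to hypothetical counts, since the table from the article is not reproduced here. The function names are invented for this example, and the Wald z-test shown is only one possible choice of test; a chi-square test on the 2×2 table is another.

```python
import math


def odds(events, non_events):
    """Odds of the outcome in one group."""
    return events / non_events


def odds_ratio_summary(a, b, c, d, z=1.96):
    """Odds ratio, log-scale 95% CI and Wald p-value for a 2x2 table.

    a, b = events and non-events in group 1; c, d = the same in group 2.
    """
    ratio = odds(a, b) / odds(c, d)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    z_stat = math.log(ratio) / se_log
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))
    return ratio, lower, upper, p_value


# Hypothetical counts, NOT taken from the article's table
ratio, lower, upper, p_value = odds_ratio_summary(a=20, b=80, c=10, d=90)
print(f"OR = {ratio:.2f}, 95% CI ({lower:.2f}, {upper:.2f}), p = {p_value:.3f}")
```

When the interval and the test are built from the same standard error, as here, a 95% interval that contains 1 goes together with a p-value above 0.05, which is the kind of compatibility the final part of the question asks about.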