Name: _______________________________ K E Y
Student ID: _______________________________

MATH102 11SP Final Exam
B [ answers in web view ] Total points: 150

Let the words of my mouth and the meditation of my heart
Be acceptable in Your sight, O LORD, my Rock and my Redeemer.
-- Psalm 19:14
  1. A particular FDG-PET (fludeoxyglucose positron-emission tomography) screening test for non-Hodgkin's lymphoma has an 18% false-positive rate (82% specificity) and 91% sensitivity (i.e., 91% of lymphomas are caught by the screening process).
    1. Suppose the screening test is applied to 250 patients, of which 50 have non-Hodgkin's lymphoma. Draw an event tree for the outcomes of the test, and label the tree with probabilities for each branch of the tree. [4]
      First level of event tree has two branches: has lymphoma (0.20) vs. does not have lymphoma (0.80).
      Second level of event tree has a total of four branches:
      P(test pos | lymph) = 0.91,
      P(test neg | lymph) = 0.09,
      P(test pos | no lymph) = 0.18,
      P(test neg | no lymph) = 0.82.
      There are four possible outcomes:
      P(test pos ∩ lymph) = (0.20)(0.91) = 0.182,
      P(test neg ∩ lymph) = (0.20)(0.09) = 0.018,
      P(test pos ∩ no lymph) = (0.80)(0.18) = 0.144,
      P(test neg ∩ no lymph) = (0.80)(0.82) = 0.656.
    2. On average, how many people in this group will test positive for non-Hodgkin's lymphoma? [3]
      (true positive) + (false positive) = n(test pos ∩ lymph) + n(test pos ∩ no lymph) = (0.182 + 0.144)(250) = (0.326)(250) = 81.5
    3. If a patient tests positive using this test, what is the probability that the patient really has non-Hodgkin's lymphoma? [2]
      (true positive) / (total positive) = 0.182 / (0.182 + 0.144) = 0.182 / 0.326 = 55.828%
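The arithmetic behind the event tree can be checked with a short script. This is only a sketch: the variable names are illustrative, and the inputs are the rates stated in the question.

```python
# Screening-test numbers taken from the problem statement.
n_patients = 250
prevalence = 50 / n_patients   # P(lymph) = 0.20
sensitivity = 0.91             # P(test pos | lymph)
false_pos_rate = 0.18          # P(test pos | no lymph)

# Joint probabilities along the two "test positive" branches of the tree.
p_true_pos = prevalence * sensitivity
p_false_pos = (1 - prevalence) * false_pos_rate
p_pos = p_true_pos + p_false_pos

# Expected number of positive tests, and P(lymph | test pos) by Bayes' rule.
expected_positives = p_pos * n_patients
ppv = p_true_pos / p_pos

print(f"P(test pos)          = {p_pos:.3f}")
print(f"Expected positives   = {expected_positives:.1f}")
print(f"P(lymph | test pos)  = {ppv:.5f}")
```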
  2. The home provinces of 16 students in a class are listed below. Draw a Pareto chart showing the distribution of home province in this sample. [4]
    AB, BC, BC, SK, AB, SK, BC, MB, AB, BC, ON, AB, BC, ON, MB, SK

    Frequencies (in order):
    BC = 5/16 = 31.25% (cum: 31.25%);
    AB = 4/16 = 25% (cum: 56.25%);
    SK = 3/16 = 18.75% (cum: 75%);
    MB = 2/16 = 12.5% (cum: 87.5%);
    ON = 2/16 = 12.5% (cum: 100%)
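The frequency table behind the Pareto chart can be tallied as sketched below. Ties in count are broken alphabetically here, which is an assumption made for the illustration; either order of MB and ON is acceptable on the chart.

```python
from collections import Counter

provinces = ("AB, BC, BC, SK, AB, SK, BC, MB, "
             "AB, BC, ON, AB, BC, ON, MB, SK").split(", ")
n = len(provinces)
counts = Counter(provinces)

# Pareto order: descending count (ties alphabetical), with running cumulative %.
rows = []
cumulative = 0
for province, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    cumulative += count
    rows.append((province, count, 100 * count / n, 100 * cumulative / n))

for province, count, pct, cum_pct in rows:
    print(f"{province}: {count}/{n} = {pct:.2f}% (cum: {cum_pct:.2f}%)")
```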
  3. The average number of hours of exercise per week was measured for a number of urban dwellers and rural dwellers. A 95% confidence interval for the difference of means (urban - rural) is (-0.27, 1.23). Based on this information, indicate whether each of the following statements is "True" or "False". (Please write the entire word, "True" or "False".) [6]
    1. We are 95% certain that urban dwellers exercise between 0.27 hrs less and 1.23 hrs more per week than rural dwellers. True
    2. 95% of urban dwellers exercise between 0.27 hrs less and 1.23 hrs more per week than rural dwellers. False
    3. There is no difference in the amount of exercise for urban and rural dwellers. False
    4. Urban dwellers exercise an average of between 0.27 hrs less and 1.23 hrs more per week than rural dwellers. False
    5. At a 5% level of significance, this study is unable to find a difference in amount of exercise between urban and rural dwellers. True
    6. With 95% confidence, the difference in hrs/week of exercise between urban and rural dwellers in this study is between -0.27 and 1.23. False
  4. Describe two events/conditions that one might reasonably expect are statistically independent, and justify why. Be sure also to indicate what the overall sample space is. [3]
    Plenty of options here: e.g., sample space is outcomes of flipping a coin twice: heads on first flip is independent of heads on second flip.
    Gender and eye colour: being female is independent of having brown eyes. (sample space is all people with any combination of gender and eye colour).
    As a counterexample: male and female are not independent: they are mutually exclusive.
  5. In clinical trials, a cholesterol pill produced by pharmaceutical company "Murck" exhibits severe side effects in 30 out of 250 patients. The pill produced by competitor "ZastroSeneca" exhibits severe side effects in 14 out of 200 patients.
    1. Build a 95% confidence interval for the risk of side effects with Murck's pill. Do the same for ZastroSeneca. [4]
      σM = √(npq)/n = √(pq/n) = √(.12*.88/250) = .02055
      95% ⇒ z = 1.96: confidence interval is .12 ± 1.96(.02055) or 7.972% < pM < 16.028%.
      σZ = √(npq)/n = √(pq/n) = √(.07*.93/200) = .01804
      95% ⇒ z = 1.96: confidence interval is .07 ± 1.96(.01804) or 3.46% < pZ < 10.54%.
    2. Does the risk of side effects differ significantly for the two companies' cholesterol pills? State the null and alternative hypotheses, both in words and in notation. [2]
      H0: binomial proportion pZ = pM
      HA: binomial proportion pZ ≠ pM
    3. What statistical test is appropriate to test the hypothesis? Number of tails? [2]
      Comparing two independent proportions, 2-tailed (HA is pZ ≠ pM).
    4. Run the test: find the test statistic and either bracket a p-value or find the critical value. [4]
      SEp1-p2 = √(pq/n + pq/n) = √(.12*.88/250 + .07*.93/200) = .02735
      z = ( (p'M - p'Z) - 0 ) / SE = (0.12 - 0.07) / .02735 = 1.8283
      P-value: p = 0.0678; Classical: critical z = 1.96
    5. Draw a conclusion and interpret it in the context of the two drug companies.
      Please use complete English sentences. [2]
      Fail to reject H0: the risk of side effects does not significantly differ for the two companies' cholesterol pills.
    6. What assumptions did you rely upon in conducting the test? Are the assumptions met? Why or why not? [2]
      Random sampling of patients (this is a big assumption here!); Murck: np = 30 > 5, nq = 220 > 5; ZastroSeneca: np = 14 > 5, nq = 186 > 5.
    7. Now let's approach this research question from a different approach: first, identify the two variables involved (i.e., for each participant in the study, what two questions need to be asked). What are the levels of measurement? [2]
      Company (categorical/nominal, dichotomous)
      Occurrence of severe side effects (categorical/nominal, dichotomous)
    8. Now, let the research question be: are these two variables independent? State the null and alternative hypotheses, both in words and in notation. [3]
      H0: side effects are independent of company; P(side|Murck) = P(side|Zastro) (or any one of several equivalent forms)
      HA: side effects are not independent of company; P(side|Murck) ≠ P(side|Zastro) (or any one of several equivalent forms)
    9. What statistical test is appropriate to test the hypothesis?
      Number of tails? [2]
      Chi-squared on 2x2 contingency table. Omnibus (2-tailed).
    10. Run the test: find the test statistic and either bracket a p-value or find the critical value. [5]
      Observed: 30, 14, 220, 186. Expected: 24.44, 19.56, 225.56, 180.44
      χ² = 3.15. df = 1.
      p = 0.076, or critical value of χ² = 3.84.
    11. Draw a conclusion and interpret it in the context of the two drug companies.
      Please use complete English sentences. [2]
      Fail to reject H0: there is insufficient evidence to show that having side effects depends on which company's pill you take.
    12. What assumptions did you rely upon in conducting the test? Are the assumptions met? Why? [2]
      Random sampling, no expected cell counts are zero. Yes, because smallest E is 19.56.
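Both approaches to this question (two-proportion z-test and chi-squared test of independence) can be reproduced with the standard library. This is a sketch under the same choices as the answers above: an unpooled standard error for the z-test and no continuity correction for chi-squared. The p-values come from the normal distribution, so they may differ from table-based values in the third decimal place.

```python
import math
from statistics import NormalDist

# Counts from the question: severe side effects out of patients treated.
x_m, n_m = 30, 250   # Murck
x_z, n_z = 14, 200   # ZastroSeneca
p_m, p_z = x_m / n_m, x_z / n_z

# Two-proportion z-test with an unpooled standard error, two-tailed.
se = math.sqrt(p_m * (1 - p_m) / n_m + p_z * (1 - p_z) / n_z)
z = (p_m - p_z) / se
p_value_z = 2 * (1 - NormalDist().cdf(abs(z)))

# Chi-squared test on the 2x2 table (rows: company; columns: side effect yes/no).
observed = [[x_m, n_m - x_m], [x_z, n_z - x_z]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)
expected = [[r * c / total for c in col_totals] for r in row_totals]
chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))
# With df = 1, chi-squared is the square of a standard normal variable,
# so the upper-tail probability is erfc(sqrt(chi2 / 2)).
p_value_chi2 = math.erfc(math.sqrt(chi2 / 2))

print(f"z = {z:.4f}, two-tailed p = {p_value_z:.4f}")
print(f"chi2 = {chi2:.2f}, p = {p_value_chi2:.3f}")
```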
  6. A biomedical lab recently purchased a new spirometer (measures lung functioning) to replace its old one. To assess precision, both spirometers were run 16 times on a standard test apparatus to measure forced vital capacity (FVC). The old spirometer had a standard deviation of 20 mL, and the new spirometer has a standard deviation of 12 mL. Is the new spirometer more precise than the old one?
    1. State the null and alternative hypotheses, both in words and in notation. [2]
      H0: σold ≤ σnew, or σold² / σnew² ≤ 1
      HA: σold > σnew, or σold² / σnew² > 1
    2. What statistical test is appropriate to test the hypothesis? Number of tails? [2]
      F-test comparing two variances, 1-tailed.
    3. Run the test: find the test statistic and either bracket a p-value or find the critical value. [4]
      F = 20² / 12² = 400 / 144 = 2.7778, df = (15, 15), 1-tail:
      p = 0.0283, or critical F(15, 15, 0.05) = 2.40
    4. Draw a conclusion and interpret it in the context of the original research question. Please use complete English sentences. [2]
      Reject H0: new spirometer is significantly more precise than old one.
    5. What assumptions did you rely upon in conducting the test? [2]
      Random sampling of measurements, normal distribution of measurements.
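The F statistic is just the ratio of the two sample variances, as the sketch below shows. The critical value is copied from the answer above rather than computed, since the standard library has no F distribution.

```python
# Sample standard deviations (mL) and run counts from the question.
sd_old, sd_new = 20.0, 12.0
n_old, n_new = 16, 16

# Larger variance goes in the numerator for the one-tailed test.
f_stat = sd_old ** 2 / sd_new ** 2
df = (n_old - 1, n_new - 1)

f_critical = 2.40   # F(15, 15) at alpha = 0.05, taken from the answer key
reject_h0 = f_stat > f_critical

print(f"F = {f_stat:.4f}, df = {df}, reject H0: {reject_h0}")
```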
  7. The peak expiratory flow (PEF) is the maximum flow rate (in L/min) from the lungs when a person blows out. The PEF for seven females with asthma is measured both before and after the patient inhales a corticosteroid:
                  PEF (L/min)                           Mean    SD
    No inhaler:   310  325  350  355  373  395  433     363     41.85
    With inhaler: 332  350  362  370  384  400  420     374     29.98
    1. Do corticosteroids enhance peak expiratory flow in females with asthma? State the null and alternative hypotheses, both in words and in notation. [2]
      H0: μd ≤ 0 (presuming d = after - before; you could also subtract in the other order), corticosteroids do not increase PEF.
      HA: μd > 0, corticosteroids do increase PEF.
    2. What statistical test is appropriate to test the hypothesis? Number of tails? [2]
      Dependent t-test on pairwise differences. 1-tailed.
    3. Run the test: find the test statistic and either bracket a p-value or find the critical value. [4]
      mean diff = 11, SD of diffs = 12.556
      SEd = 4.746, t = 2.318, df = 6, one-tailed.
      p = 0.0298, or critical t = 1.94 (one-tailed, α = 0.05)
    4. Draw a conclusion and interpret it in the context of the original research question. Please use complete English sentences. [2]
      Reject H0, corticosteroids increase peak expiratory flow in females with asthma.
    5. What assumptions did you rely upon in conducting the test? Are the assumptions met? Why or why not? [3]
      Random sampling of females with asthma; change in PEF is normally distributed. Not met: one outlier whose PEF decreased by 13 L/min
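The paired (dependent) t statistic can be recomputed from the seven before/after readings. This is a sketch that takes d = after - before, matching the hypotheses above; the list names are illustrative.

```python
import math
from statistics import mean, stdev

# PEF readings (L/min) for the same seven patients, in the same order.
before = [310, 325, 350, 355, 373, 395, 433]
after = [332, 350, 362, 370, 384, 400, 420]

# Work on the pairwise differences, d = after - before.
diffs = [a - b for a, b in zip(after, before)]
mean_d = mean(diffs)
sd_d = stdev(diffs)                    # sample SD (n - 1 in the denominator)
se_d = sd_d / math.sqrt(len(diffs))
t_stat = mean_d / se_d
df = len(diffs) - 1

print(f"differences = {diffs}")
print(f"mean = {mean_d}, SD = {sd_d:.3f}, SE = {se_d:.3f}")
print(f"t = {t_stat:.3f}, df = {df}")
```

The printed list of differences also makes the single negative value (-13) easy to spot when checking the normality assumption.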
  8. Three pharmaceutical companies, "Murck", "ZastroSeneca", and "Faizer", all produce diabetes medication which purports to reduce glycated hemoglobin (A1C, an indicator of plasma glucose concentration). Each company's medication is given to a different group of diabetic patients, and the percent reduction in A1C is recorded (see below). Do all three medications have the same efficacy on A1C?
                   A1C reduction (%)    Mean    SD
    Murck:         0.4  0.6             0.5     0.1414
    ZastroSeneca:  0.5  0.6  1.0        0.7     0.2646
    Faizer:        1.0  1.4  1.5        1.3     0.2646
    1. State the null and alternative hypotheses, both in words and in notation. [3]
      H0: μM = μZ = μF: Reduction in A1C is the same for all three medications.
      HA: μM ≠ μZ, or μM ≠ μF, or μZ ≠ μF. Some medications are more effective than others in reducing A1C.
    2. What statistical test is appropriate to test the hypothesis? Number of tails? [2]
      ANOVA, omnibus (2-tailed)
    3. Run the test: find the test statistic and either bracket a p-value or find the critical value. [5]
      Grand mean = 0.875.
      SS(factor) = 0.915, df(factor) = 2;
      SS(error) = 0.3, df(error) = 5.
      MS(factor) = 0.4575, MS(error) = 0.06.
      F = 7.625.
      p = 0.0303, or critical F = 5.79.
    4. Draw a conclusion and interpret it in the context of the original research question. Please use complete English sentences. [2]
      Reject H0: the three medications do differ in A1C reduction.
    5. What assumptions did you rely upon in conducting the test? [2]
      Random sampling in all 3 groups; A1C reduction is normally distributed within each group; variance of A1C reduction is similar in all groups.
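The one-way ANOVA table can be built directly from the eight observations. The sketch below follows the usual between-groups / within-groups decomposition; the dictionary layout is just a convenient illustration.

```python
from statistics import mean

# Percent reduction in A1C for each company's patients.
groups = {
    "Murck": [0.4, 0.6],
    "ZastroSeneca": [0.5, 0.6, 1.0],
    "Faizer": [1.0, 1.4, 1.5],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# Between-groups (factor) and within-groups (error) sums of squares.
ss_factor = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
ss_error = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

df_factor = len(groups) - 1
df_error = len(all_values) - len(groups)
ms_factor = ss_factor / df_factor
ms_error = ss_error / df_error
f_stat = ms_factor / ms_error

print(f"grand mean = {grand_mean:.3f}")
print(f"SS(factor) = {ss_factor:.3f}, SS(error) = {ss_error:.3f}")
print(f"F({df_factor}, {df_error}) = {f_stat:.3f}")
```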
  9. Anxiety is frequently measured on the Hamilton Anxiety Scale ("HAM-A"), a 14-parameter questionnaire, where a score of 0 indicates minimal anxiety, up to a score of 14 representing severe anxiety. Is there a relationship between coffee intake (cups/week) and anxiety level?
    1. Name the variable(s) which need to be measured and their levels of measurement. [2]
      Coffee and anxiety, both continuous.
    2. What statistical test is appropriate? Number of tails? [2]
      Linear regression: t-test on slope b1. 2-tailed.
    3. State the null and alternate hypotheses, both in words and in notation. [2]
      [ H0: β = 0 (also ok: b1 = 0), there is no linear relationship between coffee and anxiety.
      HA: β ≠ 0 (also ok: b1 ≠ 0), there is a linear relationship between coffee and anxiety. ]
    4. A study with 11 participants results in the following data (X represents coffee in cups/week, and Y represents anxiety in HAM-A points):
      SSX = 110, SSY = 41.26, SSXY = 55.
      Find the slope of the best-fit line, indicate its units, and interpret the slope in light of the original variables. [3]
      slope=0.5: for every additional cup of coffee drunk per week, anxiety increases by 0.5 points on the HAM-A scale. Units are HAM-A pts / (cups/wk).
    5. The average coffee intake in the study was 9 cups/week, and the average anxiety level in the study was 7.5. Find the equation of the best-fit line, and interpret the intercept of the line in light of the model. [3]
      Anxiety = 3 + 0.5*Coffee.
      According to the model, non-coffee drinkers have an anxiety level of 3.
    6. Find the correlation between coffee intake and anxiety level in this study. Is this a low, medium, or high level of correlation? [2]
      r = SSXY / √(SSX SSY) = 0.8164, quite high.
    7. What fraction of the variability in anxiety levels in this study is explained by the linear relationship with coffee intake? [2]
      r² = 66.65%
    8. What is the average anxiety level predicted by the linear model for people who drink 2 cups of coffee a day (14 per week)? [1]
      3 + 0.5*14 = 10 on the HAM-A scale.
    9. What is the standard deviation in anxiety level predicted by the linear model for people who drink 2 cups of coffee a day (14 per week)? [3]
      SS(err) = (1 - r²)SS(y) = (1 - 0.6665)(41.26) = 13.76,
      se = √( SS(err) / (n-2) ) = √(13.76/9) = 1.2366
    10. Conduct a significance test to answer the original research question: find the test statistic and either bracket a p-value or find the critical value. [4]
      SE = se / √(SSX) = 1.2366 / √(110) = 0.1179
      t = slope/SE = 0.5 / 0.1179 = 4.24.
      2-tailed, df=9: p < 0.01 (p = 0.0022), or critical value of t = 2.26
    11. Draw a conclusion and interpret it in the context of the research question.
      Please use complete English sentences. [2]
      Reject H0, coffee intake is linearly related to anxiety.
    12. What assumptions did you rely upon in conducting the test? [2]
      Random sampling, normality of the residuals (i.e., points are normally distributed about the line of best fit), homoscedasticity (i.e., variance of residuals is constant).
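The whole regression chain (slope, intercept, correlation, residual standard deviation, and the t-test on the slope) follows from the three sums of squares and the two means. The sketch below recomputes each step; small differences in the last digit relative to the answers above come from rounding of intermediate values.

```python
import math

# Summary statistics from the question.
n = 11
ss_x, ss_y, ss_xy = 110.0, 41.26, 55.0
mean_x, mean_y = 9.0, 7.5          # cups/week, HAM-A points

# Best-fit line: anxiety = intercept + slope * coffee.
slope = ss_xy / ss_x
intercept = mean_y - slope * mean_x

# Correlation and fraction of variability explained.
r = ss_xy / math.sqrt(ss_x * ss_y)
r_squared = r ** 2

# Residual sum of squares; algebraically equal to (1 - r^2) * SSY.
ss_err = ss_y - ss_xy ** 2 / ss_x
s_e = math.sqrt(ss_err / (n - 2))  # standard deviation about the line

# t-test on the slope, df = n - 2.
se_slope = s_e / math.sqrt(ss_x)
t_stat = slope / se_slope

predicted_at_14 = intercept + slope * 14

print(f"anxiety = {intercept:.1f} + {slope:.1f} * coffee")
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
print(f"s_e = {s_e:.4f}, SE(slope) = {se_slope:.4f}, t = {t_stat:.2f}")
print(f"prediction at 14 cups/week = {predicted_at_14:.1f}")
```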
  10. A factory needs to ensure that the widgets it produces have variance no more than 2.5 mm². An inspector from corporate headquarters randomly selects 41 widgets from the factory, to check if the factory is within specifications. Those 41 widgets have a variance of 3.52 mm² in length.
    1. State the null and alternative hypotheses, both in words and in notation. [2]
      H0: variance σ² ≤ 2.5
      HA: variance σ² > 2.5
    2. What statistical test is appropriate to test the hypothesis? Number of tails? [2]
      χ² (chi-squared), 1-tailed.
    3. Run the test: find the test statistic and either bracket a p-value or find the critical value. [4]
      χ² = 56.32, df = 40.
      p = 0.04503, or critical χ² = 55.8.
    4. Draw a conclusion and interpret it in the context of the original research question. Please use complete English sentences. [2]
      Reject H0: the factory is out of spec.
    5. What assumptions did you rely upon in conducting the test? [2]
      Random sampling; length of widgets is normally distributed.
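The test statistic for a single variance is (n - 1)s² / σ₀². The sketch below computes it and compares it with the critical value quoted in the answer above (copied, not computed, since the standard library has no chi-squared distribution).

```python
# Values from the question.
n = 41
sample_var = 3.52    # mm^2, observed variance of the 41 widgets
spec_var = 2.5       # mm^2, maximum variance allowed (sigma_0 squared)

# Chi-squared statistic for one variance.
df = n - 1
chi2 = df * sample_var / spec_var

chi2_critical = 55.8   # upper 5% point for df = 40, taken from the answer key
reject_h0 = chi2 > chi2_critical

print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {reject_h0}")
```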
  11. The forced expiratory volume in 1 second (FEV1) is the volume of air (in litres) that can forcibly be blown out by a person in 1 second. Does FEV1 depend on gender?
    1. State the null and alternative hypotheses, both in words and in notation. [2]
      H0: μM = μF: FEV1 levels are the same for both genders.
      HA: μM ≠ μF: FEV1 levels differ between males and females.
    2. What statistical test is appropriate to test the hypothesis? Number of tails? [2]
      t-test on independent groups, 2-tailed.
    3. Data for this experiment are given below. Sketch boxplots for the data, on a common axis (number line). [4]
                FEV1 (L)                                  Mean     SD
      Males:    3.0  3.1  3.3  3.4  3.4  3.7  4.0  4.1    3.5      0.4000
      Females:  2.7  2.9  3.0  3.1  3.2  3.2  3.4  3.5    3.125    0.2605
      M: (3.0, 3.2, 3.4, 3.85, 4.1). F: (2.7, 2.95, 3.15, 3.3, 3.5)
    4. Run the test: find the test statistic and either bracket a p-value or find the critical value. [4]
      SEM = 0.1414, SEF = 0.0921, SE = 0.1688.
      mean diff = 3.5 - 3.125 = 0.375, so t = 0.375 / 0.1688 = 2.222.
      Two-tailed; df = min(n1, n2) - 1 = 7 (using the formula, real df = 12.03)
      p = 0.0617, or critical t = 2.36 (with df=12, p=0.0433).
    5. Draw a conclusion and interpret it in the context of the original research question. Please use complete English sentences. [2]
      Fail to reject H0, average FEV1 level in males does not differ significantly from FEV1 level in females.
    6. What assumptions did you rely upon in conducting the test? [2]
      Random sampling on both groups; FEV1 levels are normally distributed in both groups; variance of FEV1 levels is similar in both groups.
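The five-number summaries, the unpooled standard error, and the Welch-Satterthwaite degrees of freedom can all be recomputed from the raw data. In this sketch the quartiles are taken as medians of the lower and upper halves, which is an assumption that reproduces the boxplot values above; other quartile conventions give slightly different numbers.

```python
import math
from statistics import mean, median, stdev

# FEV1 (litres) for each group.
males = [3.0, 3.1, 3.3, 3.4, 3.4, 3.7, 4.0, 4.1]
females = [2.7, 2.9, 3.0, 3.1, 3.2, 3.2, 3.4, 3.5]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), with quartiles as medians of halves."""
    s = sorted(values)
    half = len(s) // 2
    return (s[0], median(s[:half]), median(s), median(s[len(s) - half:]), s[-1])

# Standard error of each mean, then of the difference (variances not pooled).
se_m = stdev(males) / math.sqrt(len(males))
se_f = stdev(females) / math.sqrt(len(females))
se = math.sqrt(se_m ** 2 + se_f ** 2)

mean_diff = mean(males) - mean(females)
t_stat = mean_diff / se

# Conservative df and the Welch-Satterthwaite approximation.
df_conservative = min(len(males), len(females)) - 1
df_welch = se ** 4 / (se_m ** 4 / (len(males) - 1)
                      + se_f ** 4 / (len(females) - 1))

print("M:", five_number_summary(males))
print("F:", five_number_summary(females))
print(f"SE = {se:.4f}, mean diff = {mean_diff:.3f}, t = {t_stat:.3f}")
print(f"df = {df_conservative} (conservative) or {df_welch:.2f} (Welch)")
```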