Name: _______________________________ K E Y
Student ID: _______________________________

MATH102 11SP Final Exam
A Total points: 150

Let the words of my mouth and the meditation of my heart
Be acceptable in Your sight, O LORD, my Rock and my Redeemer.
-- Psalm 19:14
  1. The home provinces of 16 students in a class are listed below. Draw a Pareto chart showing the distribution of home province in this sample. [4]
    AB, BC, BC, SK, AB, SK, BC, MB, AB, BC, BC, AB, BC, ON, MB, SK

    Frequencies (in order):
    BC = 6/16 = 37.5% (cum: 37.5%);
    AB = 4/16 = 25% (cum: 62.5%);
    SK = 3/16 = 18.75% (cum: 81.25%);
    MB = 2/16 = 12.5% (cum: 93.75%);
    ON = 1/16 = 6.25% (cum: 100%)
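The frequency table behind the Pareto chart can be reproduced with a short script. This is only a sketch: the helper name `pareto_table` is made up for illustration, and it tabulates the bars (sorted by frequency, with the cumulative line) rather than drawing them.

```python
from collections import Counter

provinces = "AB BC BC SK AB SK BC MB AB BC BC AB BC ON MB SK".split()

def pareto_table(values):
    """Return (category, count, percent, cumulative percent) rows, most frequent first."""
    total = len(values)
    rows, running = [], 0
    for category, count in Counter(values).most_common():
        running += count
        rows.append((category, count, 100 * count / total, 100 * running / total))
    return rows

for row in pareto_table(provinces):
    print("%s  n=%d  %.2f%%  cum=%.2f%%" % row)
```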
  2. The average number of hours of exercise per week was measured for a number of urban dwellers and rural dwellers. A 95% confidence interval for the difference of means (urban - rural) is (-0.27, 1.23). Based on this information, indicate whether each of the following statements is "True" or "False". (Please write the entire word, "True" or "False".) [6]
    1. There is no difference in the amount of exercise for urban and rural dwellers. False
    2. Urban dwellers exercise an average of between 0.27 hrs less and 1.23 hrs more per week than rural dwellers. False
    3. We are 95% certain that urban dwellers exercise between 0.27 hrs less and 1.23 hrs more per week than rural dwellers. True
    4. With 95% confidence, the difference in hrs/week of exercise between urban and rural dwellers in this study is between -0.27 and 1.23. False
    5. 95% of urban dwellers exercise between 0.27 hrs less and 1.23 hrs more per week than rural dwellers. False
    6. At a 5% level of significance, this study is unable to find a difference in amount of exercise between urban and rural dwellers. True
  3. A particular FDG-PET (fludeoxyglucose positron-emission tomography) screening test for non-Hodgkin's lymphoma has a 15% false-positive rate (85% specificity) and 90% sensitivity (i.e., 90% of lymphomas are caught by the screening process).
    1. Suppose the screening test is applied to 200 patients, of which 80 have non-Hodgkin's lymphoma. Draw an event tree for the outcomes of the test, and label the tree with probabilities for each branch of the tree. [4]
      First level of event tree has two branches: has lymphoma (0.40) vs. does not have lymphoma (0.60).
      Second level of event tree has a total of four branches:
      P(test pos | lymph) = 0.90,
      P(test neg | lymph) = 0.10,
      P(test pos | no lymph) = 0.15,
      P(test neg | no lymph) = 0.85.
      There are four possible outcomes:
      P(test pos ∩ lymph) = (0.40)(0.90) = 0.36,
      P(test neg ∩ lymph) = (0.40)(0.10) = 0.04,
      P(test pos ∩ no lymph) = (0.60)(0.15) = 0.09,
      P(test neg ∩ no lymph) = (0.60)(0.85) = 0.51.
    2. On average, how many people in this group will test positive for non-Hodgkin's lymphoma? [3]
      (true positive) + (false positive) = n(test pos ∩ lymph) + n(test pos ∩ no lymph) = (0.36 + 0.09)(200) = (0.45)(200) = 90
    3. If a patient tests positive using this test, what is the probability that the patient really has non-Hodgkin's lymphoma? [2]
      (true positive) / (total positive) = 0.36 / (0.36 + 0.09) = 0.36 / 0.45 = 80%
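As a check on the event tree, the four joint probabilities and the two follow-up answers can be computed directly. The function and variable names below are illustrative assumptions, not part of the exam.

```python
def screening_outcomes(prevalence, sensitivity, specificity):
    """Joint probabilities for the four leaves of the event tree."""
    return {
        "true_pos": prevalence * sensitivity,
        "false_neg": prevalence * (1 - sensitivity),
        "false_pos": (1 - prevalence) * (1 - specificity),
        "true_neg": (1 - prevalence) * specificity,
    }

leaves = screening_outcomes(prevalence=80 / 200, sensitivity=0.90, specificity=0.85)

# Probability of a positive test, from either branch of the tree.
p_positive = leaves["true_pos"] + leaves["false_pos"]
# Expected number of positive tests among the 200 patients.
expected_positives = 200 * p_positive
# Probability of lymphoma given a positive test (Bayes' rule).
ppv = leaves["true_pos"] / p_positive

print(round(p_positive, 4), round(expected_positives, 2), round(ppv, 4))
```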
  4. A factory needs to ensure that the widgets it produces have variance no more than 2.5 mm². An inspector from corporate headquarters randomly selects 41 widgets from the factory, to check if the factory is within specifications. Those 41 widgets have a variance of 3.18 mm² in length.
    1. State the null and alternative hypotheses, both in words and in notation. [2]
      H0: variance σ² ≤ 2.5
      HA: variance σ² > 2.5
    2. What statistical test is appropriate to test the hypothesis? Number of tails? [2]
      χ² (chi-squared), 1-tailed.
    3. Run the test: find the test statistic and either bracket a p-value or find the critical value. [4]
      χ² = (n - 1)s² / σ² = (40)(3.18) / 2.5 = 50.88, df = 40.
      P-value: p > 0.10 (actual p = 0.116)
      Classical: χ²* = 55.8.
    4. Draw a conclusion and interpret it in the context of the original research question. Please use complete English sentences. [2]
      We fail to reject the null hypothesis: there is insufficient evidence to show that the factory is out of spec.
    5. What assumptions did you rely upon in conducting the test? [2]
      Random sampling; length of widgets is normally distributed.
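The test statistic for the variance test is simple enough to verify by hand or in a few lines. In this sketch the critical value is copied from the answer above, not computed, and the function name is an assumption.

```python
def variance_chi2(n, sample_var, hypothesized_var):
    """Statistic for a one-sample variance test: (n - 1) * s^2 / sigma0^2."""
    return (n - 1) * sample_var / hypothesized_var

chi2 = variance_chi2(n=41, sample_var=3.18, hypothesized_var=2.5)
df = 41 - 1
critical = 55.8  # table value quoted in the answer (df = 40, upper 5% tail)

# One-tailed test: reject H0 only if the statistic exceeds the critical value.
reject = chi2 > critical
print(round(chi2, 2), df, reject)
```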
  5. Describe two events/conditions that one might reasonably expect are statistically independent, and justify why. Be sure also to indicate what the overall sample space is. [3]
    Plenty of options here: e.g., sample space is outcomes of flipping a coin twice: heads on first flip is independent of heads on second flip.
    Gender and eye colour: being female is independent of having brown eyes. (sample space is all people with any combination of gender and eye colour).
    As a counterexample: male and female are not independent: they are mutually exclusive.
  6. A biomedical lab recently purchased a new spirometer (measures lung functioning) to replace its old one. To assess precision, both spirometers were run 16 times on a standard test apparatus to measure forced vital capacity (FVC). The old spirometer had a standard deviation of 20 mL, and the new spirometer has a standard deviation of 14 mL. Is the new spirometer more precise than the old one?
    1. State the null and alternative hypotheses, both in words and in notation. [2]
      H0: σold ≤ σnew, or σold² / σnew² ≤ 1
      HA: σold > σnew, or σold² / σnew² > 1
    2. What statistical test is appropriate to test the hypothesis? Number of tails? [2]
      F-test comparing two variances, 1-tailed.
    3. Run the test: find the test statistic and either bracket a p-value or find the critical value. [4]
      F = 20² / 14² = 400 / 196 = 2.0408, df = (15, 15), 1-tail:
      p = 0.089, or critical F(15, 15, 0.05) = 2.40
    4. Draw a conclusion and interpret it in the context of the original research question. Please use complete English sentences. [2]
      Fail to reject H0: new spirometer is not significantly more precise than old one.
    5. What assumptions did you rely upon in conducting the test? [2]
      Random sampling of measurements, normal distribution of measurements.
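The F statistic is the ratio of the two sample variances, with the larger hypothesized variance on top. A minimal sketch follows; the critical value is taken from the answer above, and the names are illustrative.

```python
def variance_ratio(sd_numerator, sd_denominator):
    """F statistic for comparing two variances, given the sample standard deviations."""
    return sd_numerator ** 2 / sd_denominator ** 2

f_stat = variance_ratio(sd_numerator=20, sd_denominator=14)  # old over new
df = (16 - 1, 16 - 1)
critical_f = 2.40  # F(15, 15) at the 5% level, as quoted in the answer

reject = f_stat > critical_f
print(round(f_stat, 4), df, reject)
```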
  7. The peak expiratory flow (PEF) is the maximum flow rate (in L/min) from the lungs when a person blows out. The PEF for seven females with asthma is measured both before and after the patient inhales a corticosteroid:
                   PEF (L/min), one column per patient    Mean   SD
    No inhaler:    310  325  350  355  373  395  440      364    43.84
    With inhaler:  332  350  362  370  384  400  420      374    29.98
    1. Do corticosteroids enhance peak expiratory flow in females with asthma? State the null and alternative hypotheses, both in words and in notation. [2]
      H0: μd ≤ 0 (presuming d = after - before; you could also subtract in the other order), corticosteroids do not increase PEF.
      HA: μd > 0, corticosteroids do increase PEF.
    2. What statistical test is appropriate to test the hypothesis? Number of tails? [2]
      Dependent t-test on pairwise differences. 1-tailed.
    3. Run the test: find the test statistic and either bracket a p-value or find the critical value. [4]
      mean diff = 10, SD of diffs = 14.855
      SEd = 5.615, t = 1.781, df = 6, one-tailed.
      p = 0.0626, or critical t = 1.94 (one-tailed, α = 0.05)
    4. Draw a conclusion and interpret it in the context of the original research question. Please use complete English sentences. [2]
      Fail to reject H0, insufficient evidence to show corticosteroids increase peak expiratory flow in females with asthma.
    5. What assumptions did you rely upon in conducting the test? Are the assumptions met? Why or why not? [3]
      Random sampling of females with asthma; change in PEF is normally distributed. Not met: one outlier whose PEF decreased by 20 L/min
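The paired computation can be checked as follows. This sketch assumes the two rows of the table are aligned patient by patient, which is what the stated SD of differences implies.

```python
from math import sqrt
from statistics import mean, stdev

before = [310, 325, 350, 355, 373, 395, 440]
after = [332, 350, 362, 370, 384, 400, 420]

# Pairwise differences (after - before), one per patient.
diffs = [a - b for a, b in zip(after, before)]

mean_diff = mean(diffs)
sd_diff = stdev(diffs)                 # sample SD, n - 1 in the denominator
se_diff = sd_diff / sqrt(len(diffs))   # standard error of the mean difference
t_stat = mean_diff / se_diff
df = len(diffs) - 1

print(diffs, mean_diff, round(sd_diff, 3), round(se_diff, 3), round(t_stat, 3), df)
```

Printing `diffs` also makes the outlier mentioned in the assumptions easy to spot: it is the only negative difference.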
  8. In clinical trials, a cholesterol pill produced by pharmaceutical company "Murck" exhibits severe side effects in 10 out of 250 patients. The pill produced by competitor "ZastroSeneca" exhibits severe side effects in 18 out of 200 patients.
    1. Build a 95% confidence interval for the risk of side effects with Murck's pill. Do the same for ZastroSeneca. [4]
      σM = √(npq)/n = √(pq/n) = √(.04*.96/250) = .01239
      95% ⇒ z = 1.96: confidence interval is .04 ± 1.96(.01239) or 1.57% < pM < 6.43%.
      σZ = √(npq)/n = √(pq/n) = √(.09*.91/200) = .02024
      95% ⇒ z = 1.96: confidence interval is .09 ± 1.96(.02024) or 5.03% < pZ < 12.97%.
    2. Does the risk of side effects differ significantly for the two companies' cholesterol pills? State the null and alternative hypotheses, both in words and in notation. [2]
      H0: binomial proportion pZ = pM
      HA: binomial proportion pZ ≠ pM
    3. What statistical test is appropriate to test the hypothesis? Number of tails? [2]
      z-test comparing two independent proportions, 2-tailed (HA is "≠").
    4. Run the test: find the test statistic and either bracket a p-value or find the critical value. [4]
      SE(p'M - p'Z) = √(pMqM/nM + pZqZ/nZ) = √(.04*.96/250 + .09*.91/200) = .02373
      z = ( (p'M - p'Z) - 0 ) / SE = (0.04 - 0.09) / .02373 = -2.107
      P-value: p = 0.035 (2-tailed); Classical: critical z = ±1.96
    5. Draw a conclusion and interpret it in the context of the two drug companies. Please use complete English sentences. [2]
      Reject H0: the risk of side effects is different for the two companies' cholesterol pills.
    6. What assumptions did you rely upon in conducting the test? Are the assumptions met? Why? [2]
      Random sampling of patients (this is a big assumption here!); for Murck, np = 10 > 5 and nq = 240 > 5; for ZastroSeneca, np = 18 > 5 and nq = 182 > 5.
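The interval and the z-test for the two proportions can be sketched with the standard library's `NormalDist`. The helper names are assumptions. The code uses the exact normal quantile, not the rounded 1.96, so the last digit may differ slightly from hand calculation.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a binomial proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

def two_proportion_z(x1, n1, x2, n2):
    """Unpooled z statistic and two-tailed p-value for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

ci_murck = proportion_ci(10, 250)
ci_zastro = proportion_ci(18, 200)
z, p_value = two_proportion_z(10, 250, 18, 200)  # Murck minus ZastroSeneca

print(ci_murck, ci_zastro, round(z, 3), round(p_value, 3))
```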
    7. Now let's approach this research question from a different approach: first, identify the two variables involved (i.e., for each participant in the study, what two questions need to be asked). What are the levels of measurement? [2]
      Company (categorical/nominal, dichotomous)
      Occurrence of severe side effects (categorical/nominal, dichotomous)
    8. Now, let the research question be: are these two variables independent? State the null and alternative hypotheses, both in words and in notation. [3]
      H0: side effects are independent of company; P(side|Murck) = P(side|Zastro) (or any one of several equivalent forms)
      HA: side effects are not independent of company; P(side|Murck) ≠ P(side|Zastro) (or any one of several equivalent forms)
    9. What statistical test is appropriate to test the hypothesis? Number of tails? [2]
      Chi-squared on 2x2 contingency table. Omnibus (2-tailed).
    10. Run the test: find the test statistic and either bracket a p-value or find the critical value. [5]
      Observed: 10, 18, 240, 182. Expected: 15.56, 12.44, 234.44, 187.56
      χ² = 4.76, df = 1.
      p = 0.029, or critical value of χ² = 3.84.
    11. Draw a conclusion and interpret it in the context of the two drug companies. Please use complete English sentences. [2]
      Reject H0: having side effects is not independent of which company's pill you take.
    12. What assumptions did you rely upon in conducting the test? Are the assumptions met? Why? [2]
      Random sampling, no expected cell counts are zero. Yes, because smallest E is 12.44.
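The expected counts and the chi-squared statistic for the 2x2 table can be reproduced as below. The function name is an assumption. The p-value line relies on the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal, so it only applies to df = 1.

```python
from math import sqrt
from statistics import NormalDist

# Rows: severe side effects yes / no. Columns: Murck, ZastroSeneca.
observed = [[10, 18], [240, 182]]

def chi2_independence(table):
    """Pearson chi-squared statistic and expected counts for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    return chi2, expected

chi2, expected = chi2_independence(observed)

# df = 1 only: upper-tail chi-squared p-value equals the two-tailed normal
# p-value of sqrt(chi2).
p_value = 2 * (1 - NormalDist().cdf(sqrt(chi2)))

print(round(chi2, 2), [[round(e, 2) for e in row] for row in expected], round(p_value, 3))
```

This statistic is not the square of the z statistic from parts 1-6, because that z used unpooled standard errors; a z-test with a pooled proportion would give z² equal to this χ².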
  9. The forced expiratory volume in 1 second (FEV1) is the volume of air (in litres) that can forcibly be blown out by a person in 1 second. Does FEV1 depend on gender?
    1. State the null and alternative hypotheses, both in words and in notation. [2]
      H0: μM = μF: FEV1 levels are the same for both genders.
      HA: μM ≠ μF: FEV1 levels differ for males from females.
    2. What statistical test is appropriate to test the hypothesis? Number of tails? [2]
      t-test on independent groups, 2-tailed.
    3. Data for this experiment are given below. Sketch boxplots for the data, on a common axis (number line). [4]
                 FEV1 (L)                                  Mean:  SD:
      Males:     3.1  3.1  3.3  3.5  3.9  4.1  4.2  4.4    3.7    0.5155
      Females:   2.6  2.7  2.9  2.9  3.1  3.2  3.2  3.4    3.0    0.2726
      M: (3.1, 3.2, 3.7, 4.15, 4.4). F: (2.6, 2.8, 3.0, 3.2, 3.4)
    4. Run the test: find the test statistic and either bracket a p-value or find the critical value. [4]
      SEM = 0.1822, SEF = 0.0964, SE = 0.2062.
      mean diff = 0.7, so t = 3.3955.
      Two-tailed; conservative df = min(n1, n2) - 1 = 7 (the Welch-Satterthwaite formula gives df = 10.63).
      p = 0.0115, or critical t = 2.36 (with df = 10.63, p ≈ 0.006; a pooled-variance test with df = 14 gives p = 0.0044).
    5. Draw a conclusion and interpret it in the context of the original research question. Please use complete English sentences. [2]
      Reject H0, gender does impact FEV1 level.
    6. What assumptions did you rely upon in conducting the test? [2]
      Random sampling on both groups; FEV1 levels are normally distributed in both groups; variance of FEV1 levels is similar in both groups.
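The standard errors, the t statistic, and both choices of degrees of freedom can be checked with a short script. This is a sketch using the unpooled (Welch) standard error, as in the answer; the function name is an assumption.

```python
from math import sqrt
from statistics import mean, stdev

males = [3.1, 3.1, 3.3, 3.5, 3.9, 4.1, 4.2, 4.4]
females = [2.6, 2.7, 2.9, 2.9, 3.1, 3.2, 3.2, 3.4]

def welch_t(a, b):
    """t statistic and Welch-Satterthwaite df for two independent samples."""
    va = stdev(a) ** 2 / len(a)  # squared standard error of mean(a)
    vb = stdev(b) ** 2 / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

t_stat, welch_df = welch_t(males, females)
conservative_df = min(len(males), len(females)) - 1

print(round(t_stat, 4), round(welch_df, 2), conservative_df)
```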
  10. Three pharmaceutical companies, "Murck", "ZastroSeneca", and "Faizer", all produce diabetes medication which purports to reduce glycated hemoglobin (A1C, an indicator of plasma glucose concentration). Each company's medication is given to a different group of diabetic patients, and the percent reduction in A1C is recorded (see below). Do all three medications have the same efficacy on A1C?
                     A1C reduction (%)    Mean:  SD:
    Murck:           0.4  0.6             0.5    0.1414
    ZastroSeneca:    0.5  0.6  1.0        0.7    0.2646
    Faizer:          0.9  1.3  1.4        1.2    0.2646
    1. State the null and alternative hypotheses, both in words and in notation. [3]
      H0: μM = μZ = μF: Reduction in A1C is the same for all three medications.
      HA: μM ≠ μZ, or μM ≠ μF, or μZ ≠ μF. Some medications are more effective than others in reducing A1C.
    2. What statistical test is appropriate to test the hypothesis? Number of tails? [2]
      ANOVA, omnibus (2-tailed)
    3. Run the test: find the test statistic and either bracket a p-value or find the critical value. [5]
      Grand mean = 0.8375.
      SS(factor) = 0.6788, df(factor) = 2;
      SS(error) = 0.3, df(error) = 5.
      MS(factor) = 0.3394, MS(error) = 0.06.
      F = 5.6563.
      p = 0.052, or critical F = 5.79.
    4. Draw a conclusion and interpret it in the context of the original research question. Please use complete English sentences. [2]
      Fail to reject H0, insufficient evidence to show a difference in A1C reduction amongst the three medications.
    5. What assumptions did you rely upon in conducting the test? [2]
      Random sampling in all 3 groups; A1C reduction is normally distributed within each group; variance of A1C reduction is similar in all groups.
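The sums of squares in the ANOVA table can be rebuilt from the raw data. The sketch below uses an illustrative function name and computes the F statistic only, leaving the p-value to tables or software.

```python
from statistics import mean

groups = {
    "Murck": [0.4, 0.6],
    "ZastroSeneca": [0.5, 0.6, 1.0],
    "Faizer": [0.9, 1.3, 1.4],
}

def one_way_anova(samples):
    """Grand mean, SS(factor), SS(error), and F for a one-way ANOVA."""
    all_values = [x for g in samples for x in g]
    grand_mean = mean(all_values)
    # Between-group variation: group means around the grand mean.
    ss_factor = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in samples)
    # Within-group variation: observations around their own group mean.
    ss_error = sum((x - mean(g)) ** 2 for g in samples for x in g)
    df_factor = len(samples) - 1
    df_error = len(all_values) - len(samples)
    f_stat = (ss_factor / df_factor) / (ss_error / df_error)
    return grand_mean, ss_factor, ss_error, f_stat

grand_mean, ss_factor, ss_error, f_stat = one_way_anova(list(groups.values()))
print(round(grand_mean, 4), round(ss_factor, 4), round(ss_error, 4), round(f_stat, 4))
```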
  11. Anxiety is frequently measured on the Hamilton Anxiety Scale ("HAM-A"), a 14-parameter questionnaire, where a score of 0 indicates minimal anxiety, up to a score of 14 representing severe anxiety. Is there a relationship between coffee intake (cups/week) and anxiety level?
    1. Name the variable(s) which need to be measured and their levels of measurement. [2]
      Coffee and anxiety, both continuous.
    2. What statistical test is appropriate? Number of tails? [2]
      Linear regression: t-test on slope b1. 2-tailed.
    3. State the null and alternate hypotheses, both in words and in notation. [2]
      H0: β = 0 (also ok: b1 = 0), there is no linear relationship between coffee and anxiety.
      HA: β ≠ 0 (also ok: b1 ≠ 0), there is a linear relationship between coffee and anxiety.
    4. A study with 11 participants results in the following data (X represents coffee in cups/week, and Y represents anxiety in HAM-A points):
      SSX = 110, SSY = 41.26, SSXY = 55.
      Find the slope of the best-fit line, indicate its units, and interpret the slope in light of the original variables. [3]
      slope=0.5: for every additional cup of coffee drunk per week, anxiety increases by 0.5 points on the HAM-A scale. Units are HAM-A pts / (cups/wk).
    5. The average coffee intake in the study was 9 cups/week, and the average anxiety level in the study was 7.5. Find the equation of the best-fit line, and interpret the intercept of the line in light of the model. [3]
      Anxiety = 3 + 0.5*Coffee.
      According to the model, non-coffee drinkers have an anxiety level of 3.
    6. Find the correlation between coffee intake and anxiety level in this study. Is this a low, medium, or high level of correlation? [2]
      r = SSXY / √(SSX SSY) = 0.8164, quite high.
    7. What fraction of the variability in anxiety levels in this study is explained by the linear relationship with coffee intake? [2]
      r² = 66.65%
    8. What is the average anxiety level predicted by the linear model for people who drink 2 cups of coffee a day (14 per week)? [1]
      3 + 0.5*14 = 10 on the HAM-A scale.
    9. What is the standard deviation in anxiety level predicted by the linear model for people who drink 2 cups of coffee a day (14 per week)? [3]
      SS(err) = (1 - r²)SS(y) = (1 - 0.6665)(41.26) = 13.76,
      se = √( SS(err) / (n-2) ) = √(13.76/9) = 1.2366
    10. Conduct a significance test to answer the original research question: find the test statistic and either bracket a p-value or find the critical value. [4]
      Standard error SE = sb1 = se / √(SSX) = 1.2366 / √(110) = 0.1179
      t = slope/SE = 0.5 / 0.1179 = 4.24.
      2-tailed, df=9: p < 0.01 (p = 0.0022), or critical value of t = 2.26
    11. Draw a conclusion and interpret it in the context of the research question. Please use complete English sentences. [2]
      Reject H0, coffee intake is linearly related to anxiety.
    12. What assumptions did you rely upon in conducting the test? [2]
      Random sampling, normality of the residuals (i.e., points are normally distributed about the line of best fit), homoscedasticity (i.e., variance of residuals is constant).
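The whole regression chain in this question follows from the three sums of squares and the two means, so it can be checked in one short script. Variable names are illustrative assumptions.

```python
from math import sqrt

n, ss_x, ss_y, ss_xy = 11, 110, 41.26, 55
mean_x, mean_y = 9, 7.5

slope = ss_xy / ss_x                    # HAM-A points per (cup/week)
intercept = mean_y - slope * mean_x     # the line passes through (mean_x, mean_y)
r = ss_xy / sqrt(ss_x * ss_y)           # correlation coefficient
ss_err = (1 - r ** 2) * ss_y            # unexplained variation in anxiety
s_e = sqrt(ss_err / (n - 2))            # residual standard deviation
se_slope = s_e / sqrt(ss_x)             # standard error of the slope
t_stat = slope / se_slope               # test of H0: slope = 0, df = n - 2

def predict(cups_per_week):
    """Predicted HAM-A score from the fitted line."""
    return intercept + slope * cups_per_week

print(slope, intercept, round(r, 4), round(r ** 2, 4),
      round(s_e, 4), round(se_slope, 4), round(t_stat, 2), predict(14))
```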