# MATH 108 Fall 2010 Supplemental HW: Answer Key

## HW2

1. Classify each of the following statements (which may or may not be true) as either (D)escriptive or (I)nferential:
1. 20% of plastic water bottles in Canada contain measurable amounts of bisphenol A. [Inferential]
2. Out of 30 plastic water bottles we measured, 6 bottles contained measurable amounts of bisphenol A. [Descriptive]
3. 33.9% of adults in the U.S. are clinically obese. [Inferential]
4. The mean age at diagnosis of self-reported diabetes in Canada is 46.823 years. [Inferential]
5. In a 1991 LCDC General Social Survey of 472 Canadians, 68% of diabetics were diagnosed after age 40. [Descriptive]
2. Suppose you wish to study whether people in rural communities in Canada utilize family doctors (e.g., visits per year) more than people in urban communities.
1. What is the population in question? [All Canadians]
2. List the variables which need to be measured.
[How many visits per year to family doctor, and whether they are from rural or urban community]
3. For each variable, indicate its level of measurement and whether it is a predictor (independent variable) or outcome (dependent variable).
[Doctor visits: continuous/discrete/scale, outcome.
Community: categorical/dichotomous, predictor.]
4. State the research question precisely in words and in notation as we discussed in class.
["Canadians from rural communities visit family doctors more frequently than Canadians from urban communities." In notation: μ_R > μ_U, where μ_R and μ_U are the mean visits per year in the rural and urban populations (or something similar).]
5. Discuss how you might carry out the sampling process. What are some steps you could take to ensure random sampling? Address both criteria for random sampling. Ch8 of the textbook has further reading if you want more ideas.
[E.g., ask doctors' offices for patient data, place voluntary questionnaires in doctors' offices, ask BC-MSP for data, use an online survey, etc. There is a fair bit of flexibility here; the point is just to think about how data can practically be gathered, and about the limitations (privacy issues, etc.). The two criteria for random sampling are (1) everybody in the population has an equal chance of being selected, and (2) selections are independent, i.e., choosing one person doesn't bias us toward choosing their spouse/relative/boss/etc.]
3. Suppose you wish to study whether praying more (e.g., minutes per day) reduces white blood cell count (cells per microlitre) in patients with leukaemia.
1. What is the population in question? [Patients with leukaemia]
2. List the variables which need to be measured. [Minutes per day of prayer, white blood cell count]
3. For each variable, indicate its level of measurement and whether it is a predictor (independent variable) or outcome (dependent variable). [Prayer: continuous/scale, predictor. WBC: continuous/scale, outcome.]
4. Suppose that in a study 70% of the participants are married and 40% of the participants consider their jobs to be high-stress. Consider the probability that a participant in the study is married and also has a high-stress job.
1. What is the minimum possible value for this probability? Draw a Venn diagram illustrating this situation.
[10%. Everybody is either married or high-stress (or both), so the two groups together cover 100% and the overlap is 70% + 40% − 100% = 10%.]
[Venn diagram: the Married and Stress circles overlap, with the intersection labelled 10%.]
2. What is the maximum possible value for this probability? Draw a Venn diagram illustrating this situation.
[40%. High-stress is a subset of married (i.e., the circle representing high-stress lies completely within the circle representing married).]
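
The minimum and maximum in question 4 can be checked with a short calculation. The sketch below is only an illustration, not part of the original key: the function name `intersection_bounds` is made up here, and exact fractions are used so that rounding does not blur the answer. It assumes nothing beyond the two given percentages.

```python
from fractions import Fraction


def intersection_bounds(p_a, p_b):
    """Return the (minimum, maximum) possible P(A and B) given only P(A) and P(B).

    Hypothetical helper for illustration.
    Minimum: the two events together cannot cover more than 100%,
             so the overlap is at least P(A) + P(B) - 1 (and never below 0).
    Maximum: the overlap cannot exceed the smaller of the two events,
             which happens when one event is a subset of the other.
    """
    lower = max(Fraction(0), p_a + p_b - 1)
    upper = min(p_a, p_b)
    return lower, upper


# Values from question 4: 70% married, 40% high-stress.
p_married = Fraction(70, 100)
p_stress = Fraction(40, 100)

low, high = intersection_bounds(p_married, p_stress)
print(f"minimum overlap: {float(low):.0%}")   # prints "minimum overlap: 10%"
print(f"maximum overlap: {float(high):.0%}")  # prints "maximum overlap: 40%"
```

The lower bound corresponds to the first Venn diagram (everybody is in at least one circle) and the upper bound to the second (the high-stress circle lies inside the married circle).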

## HW3

1. Let the sample space S represent people born in Canada. Let the event A represent people born in Alberta and B represent people born in BC. Are the events A and B independent, in the statistical sense?
[No. P(A|B) = 0 ≠ P(A), so they are not independent. (They are mutually exclusive, though.)]
2. In a study of Canadian nurses, say that 60% of the nurses are in urban hospitals, and that one quarter of the nurses take prescription anti-depressants themselves. Out of all the nurses in the study, 20% are in urban hospitals and also on anti-depressants.
1. For each of the three probabilities given (60%, 25%, 20%), express the probability in notation (e.g., P(A)) and draw a Venn diagram, shading in the relevant region (draw three separate Venn diagrams).
[P(urban) = 60%, P(antidep) = 25%, P(urban and antidep) = 20%. The last one is the intersection.]
2. In this study, what is the chance that an urban-hospital nurse is on anti-depressants?
[P(antidep|urban) = P(urban and antidep) / P(urban) = 20%/60% ≈ 33%.]
3. In this study, is being in an urban hospital independent of being on anti-depressants? Why or why not?
[No. P(antidep|urban) ≈ 33% ≠ 25% = P(antidep), so they are not independent.]
3. Come up with your own example (not one we have learned so far) of two events which are independent. Be sure to specify what the entire sample space is.
[The examples we did in class were female vs. brunette (females are not more likely to be brunette) and first flip of coin vs. second flip (outcome of first flip does not affect outcome of second flip).]
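
The conditional-probability and independence checks in questions 2 and 3 can be reproduced in a few lines. This is a minimal sketch for illustration only: the helper names `conditional` and `is_independent` are invented here, and the coin-flip probabilities assume a fair coin, which the key does not state explicitly.

```python
from fractions import Fraction


def conditional(p_joint, p_given):
    """P(A | B) = P(A and B) / P(B). Hypothetical helper for illustration."""
    return p_joint / p_given


def is_independent(p_a, p_b, p_joint):
    """A and B are independent exactly when P(A and B) = P(A) * P(B).

    For P(B) > 0 this is equivalent to the check P(A | B) = P(A) used in the key.
    """
    return p_joint == p_a * p_b


# Question 2: values given in the nurse study.
p_urban = Fraction(60, 100)
p_antidep = Fraction(25, 100)
p_both = Fraction(20, 100)

p_antidep_given_urban = conditional(p_both, p_urban)
print(f"P(antidep | urban) = {float(p_antidep_given_urban):.1%}")  # prints 33.3%
print("independent?", is_independent(p_urban, p_antidep, p_both))  # prints False

# Question 3: two flips of a coin, ASSUMING the coin is fair.
p_heads_first = Fraction(1, 2)
p_heads_second = Fraction(1, 2)
p_heads_both = Fraction(1, 4)  # one of the four equally likely outcomes HH, HT, TH, TT
print("independent?", is_independent(p_heads_first, p_heads_second, p_heads_both))  # prints True
```

For the nurses, P(urban) × P(antidep) = 15%, which differs from the observed 20%, so the product check agrees with the conditional-probability argument in the answer above.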