Practice Test 3
8.1: Confidence Interval, Single Population Mean, Population Standard Deviation Known, Normal
Use the following information to answer the next seven exercises. You draw a sample of size 30 from a normally distributed population with a standard deviation of four.
What is the standard error of the sample mean in this scenario, rounded to two decimal places? $\frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{30}} = 0.73$
What is the distribution of the sample mean? normal
If you want to construct a two-sided 95% confidence interval, how much probability will be in each tail of the distribution? 0.025 or 2.5%. A 95% confidence interval contains 95% of the probability and excludes five percent, and the five percent excluded is split evenly between the upper and lower tails of the distribution.
What is the appropriate z-score and error bound or margin of error (EBM) for a 95% confidence interval for this data? z-score = 1.96; $EBM = z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) = (1.96)(0.73) = 1.4308$
Rounding to two decimal places, what is the 95% confidence interval if the sample mean is 41? 41 ± 1.43 = (39.57, 42.43). The calculator function ZInterval gives the same interval, (39.57, 42.43).
What is the 90% confidence interval if the sample mean is 41? Round to two decimal places. The z-value for a 90% confidence interval is 1.645, so EBM = 1.645(0.73) = 1.20085. The 90% confidence interval is 41 ± 1.20 = (39.80, 42.20). The calculator function ZInterval gives the same interval, (39.80, 42.20).
Suppose the sample size in this study had been 50, rather than 30. What would the 95% confidence interval be if the sample mean is 41? Round your answer to two decimal places. The standard error of the mean is $\frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{50}} = 0.57$, so $EBM = z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) = (1.96)(0.57) = 1.12$. The 95% confidence interval is 41 ± 1.12 = (39.88, 42.12). The calculator function ZInterval answer is (39.89, 42.11). Answers differ slightly because the hand calculation rounds the standard error.
For any given data set and sampling situation, which would you expect to be wider: a 95% confidence interval or a 99% confidence interval? The 99% confidence interval, because it includes all but one percent of the distribution. The 95% confidence interval will be narrower, because it excludes five percent of the distribution.
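The interval calculations in the 8.1 exercises can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `z_interval` is an illustrative choice, and the critical value comes from `statistics.NormalDist` instead of a table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence):
    """Two-sided confidence interval for a mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    ebm = z * sigma / sqrt(n)                # error bound for the mean
    return sample_mean - ebm, sample_mean + ebm


# The three scenarios from the exercises: sample mean 41, sigma 4.
for confidence, n in [(0.95, 30), (0.90, 30), (0.95, 50)]:
    low, high = z_interval(41, 4, n, confidence)
    print(f"{confidence:.0%} CI, n = {n}: ({low:.2f}, {high:.2f})")
```

Because the script keeps the unrounded standard error and critical value, its endpoints can differ from a hand calculation in the second decimal place.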
8.2: Confidence Interval, Single Population Mean, Standard Deviation Unknown, Student’s t
Comparing graphs of the standard normal distribution (z-distribution) and a t-distribution with 15 degrees of freedom (df), how do they differ? The t-distribution will have more probability in its tails (“thicker tails”) and less probability near the mean of the distribution (“shorter in the center”).
Comparing graphs of the standard normal distribution (z-distribution) and a t-distribution with 15 degrees of freedom (df), how are they similar? Both distributions are symmetrical and centered at zero.
Use the following information to answer the next five exercises. Body temperature is known to be distributed normally among healthy adults. Because you do not know the population standard deviation, you use the t-distribution to study body temperature. You collect data from a random sample of 20 healthy adults and find that your sample temperatures have a mean of 98.4 and a sample standard deviation of 0.3 (both in degrees Fahrenheit).
What are the degrees of freedom (df) for this study? df = n – 1 = 20 – 1 = 19
For a two-tailed 95% confidence interval, what is the appropriate t-value to use in the formula? You can get the t-value from a probability table or a calculator. In this case, for a t-distribution with 19 degrees of freedom and a 95% two-sided confidence interval, the value is 2.093, i.e., $t_{\alpha/2} = 2.093$. The calculator function is invT(0.975, 19).
What is the 95% confidence interval? $EBM = t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) = (2.093)\left(\frac{0.3}{\sqrt{20}}\right) = 0.140$. 98.4 ± 0.14 = (98.26, 98.54). The calculator function Tinterval answer is (98.26, 98.54).
Suppose your sample size had been 30 rather than 20. What would the 95% confidence interval be then? Round to two decimal places. df = n – 1 = 30 – 1 = 29. $t_{\alpha/2} = 2.045$. $EBM = t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) = (2.045)\left(\frac{0.3}{\sqrt{30}}\right) = 0.112$. 98.4 ± 0.11 = (98.29, 98.51). The calculator function Tinterval answer is (98.29, 98.51).
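The two body-temperature intervals can be reproduced with a few lines of Python. This sketch assumes the critical values are looked up separately, as in the exercises (2.093 for df = 19 and 2.045 for df = 29), because the standard library has no inverse t-distribution; `t_interval` is a hypothetical helper name.

```python
from math import sqrt


def t_interval(sample_mean, s, n, t_critical):
    """Two-sided confidence interval for a mean when sigma is unknown.

    t_critical is t_{alpha/2} for n - 1 degrees of freedom, supplied by the
    caller (for example from a table or from invT on a calculator).
    """
    ebm = t_critical * s / sqrt(n)  # error bound for the mean
    return sample_mean - ebm, sample_mean + ebm


# Critical values are the ones quoted in the exercises.
for n, t_critical in [(20, 2.093), (30, 2.045)]:
    low, high = t_interval(98.4, 0.3, n, t_critical)
    print(f"n = {n}: ({low:.2f}, {high:.2f})")
```

The larger sample narrows the interval twice over: the critical value shrinks and so does the standard error.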
8.3: Confidence Interval for a Population Proportion
Use this information to answer the next four exercises. You conduct a poll of 500 randomly selected city residents, asking them if they own an automobile. 280 say they do own an automobile, and 220 say they do not.
Find the sample proportion and sample standard deviation for this data. $p = \frac{280}{500} = 0.56$; $q = 1 - p = 1 - 0.56 = 0.44$; $s = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.56(0.44)}{500}} = 0.0222$
What is the 95% two-sided confidence interval? Round to four decimal places. Because you are using the normal approximation to the binomial, $z_{\alpha/2} = 1.96$. Calculate the error bound for the population (EBP): $EBP = z_{\alpha/2}\sqrt{\frac{pq}{n}} = 1.96(0.0222) = 0.0435$. Calculate the 95% confidence interval: 0.56 ± 0.0435 = (0.5165, 0.6035). The calculator function 1-PropZint answer is (0.5165, 0.6035).
Calculate the 90% confidence interval. Round to four decimal places. $z_{\alpha/2} = 1.64$; $EBP = z_{\alpha/2}\sqrt{\frac{pq}{n}} = 1.64(0.0222) = 0.0364$. 0.56 ± 0.0364 = (0.5236, 0.5964). The calculator function 1-PropZint answer is (0.5235, 0.5965).
Calculate the 99% confidence interval. Round to four decimal places. $z_{\alpha/2} = 2.58$; $EBP = z_{\alpha/2}\sqrt{\frac{pq}{n}} = 2.58(0.0222) = 0.0573$. 0.56 ± 0.0573 = (0.5027, 0.6173). The calculator function 1-PropZint answer is (0.5028, 0.6172).
Use the following information to answer the next three exercises. You are planning to conduct a poll of community members age 65 and older, to determine how many own mobile phones. You want to produce an estimate whose 95% confidence interval will be within four percentage points (plus or minus) of the true population proportion. Use an estimated population proportion of 0.5.
What sample size do you need? EBP = 0.04 (because 4% = 0.04); $z_{\alpha/2} = 1.96$ for a 95% confidence interval. $n = \frac{z^2 pq}{EBP^2} = \frac{1.96^2(0.5)(0.5)}{0.04^2} = \frac{0.9604}{0.0016} = 600.25$. You need 601 subjects (rounding upward from 600.25).
Suppose you knew from prior research that the population proportion was 0.6. What sample size would you need? $n = \frac{z^2 pq}{EBP^2} = \frac{1.96^2(0.6)(0.4)}{0.04^2} = \frac{0.9220}{0.0016} = 576.24$. You need 577 subjects (rounding upward from 576.24).
Suppose you wanted a 95% confidence interval within three percentage points of the population. Assume the population proportion is 0.5. What sample size do you need? $n = \frac{z^2 pq}{EBP^2} = \frac{1.96^2(0.5)(0.5)}{0.03^2} = \frac{0.9604}{0.0009} = 1067.11$. You need 1,068 subjects (rounding upward from 1,067.11).
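Both kinds of calculation in the 8.3 exercises, the interval for a proportion and the required sample size, follow directly from the formulas above. The sketch below is one way to script them; the function names are illustrative, the interval uses the exact normal critical value (as a calculator would), and the sample-size helper defaults to z = 1.96 to match the exercises.

```python
from math import ceil, sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    ebp = z * sqrt(p * (1 - p) / n)  # error bound for the proportion
    return p - ebp, p + ebp


def sample_size(p, ebp, z=1.96):
    """Smallest n whose error bound is at most ebp, rounding upward."""
    return ceil(z ** 2 * p * (1 - p) / ebp ** 2)


for confidence in (0.95, 0.90, 0.99):
    low, high = proportion_interval(280, 500, confidence)
    print(f"{confidence:.0%}: ({low:.4f}, {high:.4f})")

print(sample_size(0.5, 0.04), sample_size(0.6, 0.04), sample_size(0.5, 0.03))
```

The sample size is largest when p = 0.5, which is why 0.5 is the conservative choice when no prior estimate is available.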
9.1: Null and Alternate Hypotheses
In your state, 58 percent of registered voters in a community are registered as Republicans. You want to conduct a study to see if this also holds up in your community. State the null and alternative hypotheses to test this. H0: p = 0.58; Ha: p ≠ 0.58
You believe that at least 58 percent of registered voters in a community are registered as Republicans. State the null and alternative hypotheses to test this. H0: p ≥ 0.58; Ha: p < 0.58
The mean household value in a city is $268,000. You believe that the mean household value in a particular neighborhood is lower than the city average. Write the null and alternative hypotheses to test this. H0: μ ≥ $268,000; Ha: μ < $268,000
State the appropriate alternative hypothesis to this null hypothesis: H0: μ = 107. Ha: μ ≠ 107
State the appropriate alternative hypothesis to this null hypothesis: H0: p < 0.25. Ha: p ≥ 0.25
9.2: Outcomes and the Type I and Type II Errors
If you reject H0 when H0 is correct, what type of error is this? a Type I error
If you fail to reject H0 when H0 is false, what type of error is this? a Type II error
What is the relationship between the Type II error and the power of a test? Power = 1 – β = 1 – P(Type II error).
A new blood test is being developed to screen patients for cancer. Positive results are followed up by a more accurate (and expensive) test. It is assumed that the patient does not have cancer. Describe the null hypothesis, the Type I and Type II errors for this situation, and explain which type of error is more serious. The null hypothesis is that the patient does not have cancer. A Type I error would be detecting cancer when it is not present. A Type II error would be not detecting cancer when it is present. A Type II error is more serious, because failure to detect cancer could keep a patient from receiving appropriate treatment.
Explain in words what it means that a screening test for TB has an α level of 0.10. The null hypothesis is that the patient does not have TB. The screening test has a ten percent probability of a Type I error, meaning that ten percent of the time, it will detect TB when it is not present.
Explain in words what it means that a screening test for TB has a β level of 0.20. The null hypothesis is that the patient does not have TB. The screening test has a 20 percent probability of a Type II error, meaning that 20 percent of the time, it will fail to detect TB when it is in fact present.
Explain in words what it means that a screening test for TB has a power of 0.80. Eighty percent of the time, the screening test will detect TB when it is actually present.
9.3: Distribution Needed for Hypothesis Testing
If you are conducting a hypothesis test of a single population mean, and you do not know the population variance, what test will you use if the sample size is 10 and the population is normal? The Student’s t-test.
If you are conducting a hypothesis test of a single population mean, and you know the population variance, what test will you use? The normal distribution or z-test.
If you are conducting a hypothesis test of a single population proportion, with np and nq greater than or equal to five, what test will you use, and with what parameters? The normal distribution with μ = p and $\sigma = \sqrt{\frac{pq}{n}}$
Published information indicates that, on average, college students spend less than 20 hours studying per week. You draw a sample of 25 students from your college, and find the sample mean to be 18.5 hours, with a standard deviation of 1.5 hours. What distribution will you use to test whether study habits at your college are the same as the national average, and why? $t_{24}$. You use the t-distribution because you don’t know the population standard deviation, and the degrees of freedom are 24 because df = n – 1.
A published study says that 95 percent of American children are vaccinated against measles, with a standard deviation of 1.5 percent. You draw a sample of 100 children from your community and check their vaccination records, to see if the vaccination rate in your community is the same as the national average. What distribution will you use for this test, and why? $\bar{X} \sim N\left(0.95, \frac{0.015}{\sqrt{100}}\right)$. Because you know the population standard deviation, and have a large sample, you can use the normal distribution.
9.4: Rare Events, the Sample, Decision, and Conclusion
You are conducting a study with an α level of 0.05. If you get a result with a p-value of 0.07, what will be your decision? Fail to reject the null hypothesis, because the p-value is greater than α (0.07 > 0.05).
You are conducting a study with α = 0.01. If you get a result with a p-value of 0.006, what will be your decision? Reject the null hypothesis, because the p-value is less than α (0.006 < 0.01).
Use the following information to answer the next five exercises. According to the World Health Organization, the average height of a one-year-old child is 29”. You believe children with a particular disease are smaller than average, so you draw a sample of 20 children with this disease and find a mean height of 27.5” and a sample standard deviation of 1.5”.
What are the null and alternative hypotheses for this study? H0: μ ≥ 29.0”; Ha: μ < 29.0”
What distribution will you use to test your hypothesis, and why? $t_{19}$. Because you do not know the population standard deviation, use the t-distribution. The degrees of freedom are 19, because df = n – 1.
What is the test statistic and the p-value? The test statistic is –4.4721 and the p-value is 0.00013 using the calculator function TTEST.
Based on your sample results, what is your decision? With α = 0.05, reject the null hypothesis.
Suppose the mean for your sample was 25.0. Redo the calculations and describe what your decision would be. With α = 0.05, the p-value is almost zero using the calculator function TTEST, so reject the null hypothesis.
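The test statistic quoted for the height study can be verified by hand or with a short script. The sketch below computes only the statistic and degrees of freedom; the p-value needs the t-distribution's cumulative distribution function, which the Python standard library does not provide, so that step is left to a calculator or statistics package. The function name `one_sample_t` is an illustrative choice.

```python
from math import sqrt


def one_sample_t(sample_mean, hypothesized_mean, s, n):
    """Return the t statistic and degrees of freedom for a one-sample t-test."""
    standard_error = s / sqrt(n)
    t = (sample_mean - hypothesized_mean) / standard_error
    return t, n - 1


t, df = one_sample_t(27.5, 29.0, 1.5, 20)
print(f"t = {t:.4f}, df = {df}")

# Same test with the alternative sample mean of 25.0 from the last exercise.
t_alt, _ = one_sample_t(25.0, 29.0, 1.5, 20)
print(f"t = {t_alt:.4f}")
```

The statistic for a sample mean of 25.0 lies much further into the lower tail than the one for 27.5, which is consistent with a p-value that is almost zero.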
9.5: Additional Information and Full Hypothesis Test Examples
The level of significance is five percent.
You conduct a study, based on a sample drawn from a normally distributed population with a known variance, with the following hypotheses: H0: μ = 35.5; Ha: μ ≠ 35.5. Will you conduct a one-tailed or two-tailed test? two-tailed
You conduct a study, based on a sample drawn from a normally distributed population with a known variance, with the following hypotheses: H0: μ ≥ 35.5; Ha: μ < 35.5. Will you conduct a one-tailed or two-tailed test? one-tailed test
Use the following information to answer the next three exercises. Nationally, 80 percent of adults own an automobile. You are interested in whether the same proportion in your community own cars. You draw a sample of 100 and find that 75 percent own cars.
What are the null and alternative hypotheses for this study? H0: p = 0.8; Ha: p ≠ 0.8
What test will you use, and why? You will use the normal test for a single population proportion because np and nq are both greater than five.
10.1: Comparing Two Independent Population Means with Unknown Population Standard Deviations
You conduct a poll of political opinions, interviewing both members of 50 married couples. Are the groups in this study independent or matched? They are matched (paired), because you interviewed married couples.
You are testing a new drug to treat insomnia. You randomly assign 80 volunteer subjects to either the experimental (new drug) or control (standard treatment) conditions. Are the groups in this study independent or matched? They are independent, because participants were assigned at random to the groups.
You are investigating the effectiveness of a new math textbook for high school students. You administer a pretest to a group of students at the beginning of the semester, and a posttest at the end of a year’s instruction using this textbook, and compare the results. Are the groups in this study independent or matched? They are matched (paired), because you collected data twice from each individual.
Use the following information to answer the next two exercises. You are conducting a study of the difference in time at two colleges for undergraduate degree completion. At College A, students take an average of 4.8 years to complete an undergraduate degree, while at College B, they take an average of 4.2 years. The pooled standard deviation for this data is 1.6 years.
Calculate Cohen’s d and interpret it. $d = \frac{\bar{x}_1 - \bar{x}_2}{s_{pooled}} = \frac{4.8 - 4.2}{1.6} = 0.375$. This is a small effect size, because 0.375 falls between Cohen’s small (0.2) and medium (0.5) effect sizes.
Suppose the mean time to earn an undergraduate degree at College A was 5.2 years. Calculate the effect size and interpret it. $d = \frac{\bar{x}_1 - \bar{x}_2}{s_{pooled}} = \frac{5.2 - 4.2}{1.6} = 0.625$. The effect size is 0.625. By Cohen’s standard, this is a medium effect size, because it falls between the medium (0.5) and large (0.8) effect sizes.
You conduct an independent-samples t-test with sample size ten in each of two groups. If you are conducting a two-tailed hypothesis test with α = 0.01, what p-values will cause you to reject the null hypothesis? p-value < 0.01.
You conduct an independent samples t-test with sample size 15 in each group, with the following hypotheses: H0: μ ≥ 110; Ha: μ < 110. If α = 0.05, what t-values will cause you to reject the null hypothesis? You will only reject the null hypothesis if you get a value significantly below the hypothesized mean of 110.
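The Cohen's d calculations for the two colleges reduce to one division and a comparison against the benchmarks 0.2, 0.5, and 0.8. The following sketch labels an effect by the largest benchmark it has reached, which is how the answers above interpret 0.375 and 0.625; the function names and the "below small" label are illustrative assumptions.

```python
def cohens_d(mean_1, mean_2, pooled_sd):
    """Standardized difference between two group means."""
    return (mean_1 - mean_2) / pooled_sd


def effect_size_label(d):
    """Label |d| by the largest of Cohen's benchmarks it has reached."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


for mean_a in (4.8, 5.2):
    d = cohens_d(mean_a, 4.2, 1.6)
    print(f"College A mean {mean_a}: d = {d:.3f} ({effect_size_label(d)})")
```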
10.2: Comparing Two Independent Population Means with Known Population Standard Deviations
Use the following information to answer the next six exercises. College students in the sciences often complain that they must spend more on textbooks each semester than students in the humanities. To test this, you draw random samples of 50 science and 50 humanities students from your college, and record how much each spent last semester on textbooks. Consider the science students to be group one, and the humanities students to be group two.
What is the random variable for this study? $\bar{X}_1 - \bar{X}_2$, i.e., the mean difference in amount spent on textbooks for the two groups.
What are the null and alternative hypotheses for this study? H0: $\bar{X}_1 - \bar{X}_2 \le 0$; Ha: $\bar{X}_1 - \bar{X}_2 > 0$. This could also be written as: H0: $\bar{X}_1 \le \bar{X}_2$; Ha: $\bar{X}_1 > \bar{X}_2$
If the 50 science students spent an average of $530 with a sample standard deviation of $20 and the 50 humanities students spent an average of $380 with a sample standard deviation of $15, would you not reject or reject the null hypothesis? Use an alpha level of 0.05. What is your conclusion? Using the calculator function 2-SampTtest, reject the null hypothesis. At the 5% significance level, there is sufficient evidence to conclude that the science students spend more on textbooks than the humanities students.
What would be your decision, if you were using α = 0.01? Using the calculator function 2-SampTtest, reject the null hypothesis. At the 1% significance level, there is sufficient evidence to conclude that the science students spend more on textbooks than the humanities students.
10.3: Comparing Two Independent Population Proportions
Use the information to answer the next six exercises. You want to know if the proportion of homes with cable television service differs between Community A and Community B. To test this, you draw a random sample of 100 for each and record whether they have cable service.
What are the null and alternative hypotheses for this study? H0: $p_A = p_B$; Ha: $p_A \ne p_B$
If 65 households in Community A have cable service, and 78 households in Community B, what is the pooled proportion? $p_c = \frac{x_A + x_B}{n_A + n_B} = \frac{65 + 78}{100 + 100} = 0.715$
At α = 0.03, will you reject the null hypothesis? What is your conclusion? 65 households in Community A have cable service, and 78 households in Community B. 100 households in each community were surveyed. Using the calculator function 2-PropZTest, the p-value = 0.0417. Do not reject the null hypothesis, because the p-value is greater than α = 0.03. At the 3% significance level, there is insufficient evidence to conclude that there is a difference between the proportions of households in the two communities that have cable service.
Using an alpha value of 0.01, would you reject the null hypothesis? What is your conclusion? 65 households in Community A have cable service, and 78 households in Community B. 100 households in each community were surveyed. Using the calculator function 2-PropZTest, the p-value = 0.0417. Do not reject the null hypothesis. At the 1% significance level, there is insufficient evidence to conclude that there is a difference between the proportions of households in the two communities that have cable service.
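The pooled proportion and the p-value for the cable-service comparison can be reproduced without a calculator. This is a sketch of the standard pooled two-proportion z-test, written with the Python standard library; the function name is hypothetical and the decision rule shown is simply "reject when the p-value is below α".

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x_a, n_a, x_b, n_b):
    """Two-sided z-test of H0: p_A = p_B using the pooled proportion."""
    p_a, p_b = x_a / n_a, x_b / n_b
    pooled = (x_a + x_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    p_value = 2 * NormalDist().cdf(-abs(z))  # probability in both tails
    return pooled, z, p_value


pooled, z, p_value = two_proportion_z_test(65, 100, 78, 100)
print(f"pooled = {pooled:.3f}, z = {z:.4f}, p-value = {p_value:.4f}")
for alpha in (0.03, 0.01):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha}: {decision}")
```

A p-value of about 0.0417 is larger than both 0.03 and 0.01, so the rule gives "do not reject" at either level; it would lead to rejection only at a level above roughly 0.042, such as 0.05.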
10.4: Matched or Paired Samples
Use the following information to answer the next five exercises. You are interested in whether a particular exercise program helps people lose weight. You conduct a study in which you weigh the participants at the start of the study, and again at the conclusion, after they have participated in the exercise program for six months. You compare the results using a matched-pairs t-test, in which the data is {weight at conclusion – weight at start}. You believe that, on average, the participants will have lost weight after six months on the exercise program.
What are the null and alternative hypotheses for this study? H0: $\bar{x}_d \ge 0$; Ha: $\bar{x}_d < 0$
Calculate the test statistic, assuming that $\bar{x}_d$ = –5, $s_d$ = 6, and n = 30 (pairs). $t = \frac{\bar{x}_d - 0}{s_d / \sqrt{n}} = \frac{-5}{6 / \sqrt{30}} = -4.5644$
What are the degrees of freedom for this statistic? df = 30 – 1 = 29.
Using α = 0.05, what is your decision regarding the effectiveness of this program in causing weight loss? What is the conclusion? Using the calculator function TTEST, the p-value = 0.00004, so reject the null hypothesis. At the 5% level, there is sufficient evidence to conclude that the participants lost weight, on average.
What would it mean if the t-statistic had been 4.56, and what would have been your decision in that case? A positive t-statistic would mean that participants, on average, gained weight over the six months. Because the alternative hypothesis is that the mean difference is less than zero, a positive statistic provides no evidence for it, so you would not reject the null hypothesis.
11.1: Facts About the Chi-Square Distribution
What is the mean and standard deviation for a chi-square distribution with 20 degrees of freedom? μ = df = 20; $\sigma = \sqrt{2(df)} = \sqrt{40} = 6.32$
11.2: Goodness-of-Fit Test
Use the following information to answer the next four exercises. Nationally, about 66 percent of high school graduates enroll in higher education. You perform a chi-square goodness of fit test to see if this same proportion applies to your high school’s most recent graduating class of 200. Your null hypothesis is that the national distribution also applies to your high school.
What are the expected numbers of students from your high school graduating class enrolled and not enrolled in higher education? Enrolled = 200(0.66) = 132. Not enrolled = 200(0.34) = 68
Fill out the rest of this table.
Category | Observed (O) | Expected (E) | O – E | (O – E)² | (O – E)²/E
Enrolled | 145 | | | |
Not enrolled | 55 | | | |
Completed table:
Category | Observed (O) | Expected (E) | O – E | (O – E)² | (O – E)²/E
Enrolled | 145 | 132 | 145 – 132 = 13 | 169 | $\frac{169}{132} = 1.280$
Not enrolled | 55 | 68 | 55 – 68 = –13 | 169 | $\frac{169}{68} = 2.485$
What are the degrees of freedom for this chi-square test? df = number of categories – 1 = 2 – 1 = 1.
What are the chi-square test statistic and the p-value? At the 5% significance level, what do you conclude? Using the calculator function Chi-square GOF – Test (in STAT TESTS), the test statistic is 3.7656 and the p-value is 0.0523. Do not reject the null hypothesis. At the 5% significance level, there is insufficient evidence to conclude that the distribution of enrolled and not enrolled students in the high school’s most recent graduating class does not fit the national distribution.
For a chi-square distribution with 92 degrees of freedom, the curve _____________. approximates the normal
For a chi-square distribution with five degrees of freedom, the curve is ______________. skewed right
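The goodness-of-fit statistic and p-value for the enrollment data can be checked with the standard library alone. The sketch below relies on a shortcut that holds only for one degree of freedom: a chi-square variable with df = 1 is a squared standard normal, so its right-tail probability can be written with the complementary error function. The function names are illustrative.

```python
from math import erfc, sqrt


def goodness_of_fit(observed, expected):
    """Chi-square statistic: the sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_p_value_df1(statistic):
    """Right-tail probability for a chi-square variable with df = 1 only."""
    # P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for a standard normal Z.
    return erfc(sqrt(statistic / 2))


n = 200
expected = [n * 0.66, n * 0.34]  # enrolled, not enrolled
observed = [145, 55]
statistic = goodness_of_fit(observed, expected)
p_value = chi2_p_value_df1(statistic)
print(f"chi-square = {statistic:.4f}, p-value = {p_value:.4f}")
```

The p-value lands just above 0.05, which is why the conclusion at the 5% level is to not reject the null hypothesis even though the observed counts look noticeably different from the expected ones.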
11.3: Test of Independence
Use the following information to answer the next four exercises. You are considering conducting a chi-square test of independence for the data in this table, which displays data about cell phone ownership for freshmen and seniors at a high school. Your null hypothesis is that cell phone ownership is independent of class standing.
Compute the expected values for the cells.
Class standing | Cell = Yes | Cell = No
Freshman | 100 | 150
Senior | 200 | 50
Expected values:
Class standing | Cell = Yes | Cell = No | Total
Freshman | $\frac{250(300)}{500} = 150$ | $\frac{250(200)}{500} = 100$ | 250
Senior | $\frac{250(300)}{500} = 150$ | $\frac{250(200)}{500} = 100$ | 250
Total | 300 | 200 | 500
Compute $\frac{(O - E)^2}{E}$ for each cell, where O = observed and E = expected. $\frac{(100 - 150)^2}{150} = 16.67$; $\frac{(150 - 100)^2}{100} = 25$; $\frac{(200 - 150)^2}{150} = 16.67$; $\frac{(50 - 100)^2}{100} = 25$
What is the chi-square statistic and degrees of freedom for this study? Chi-square = 16.67 + 25 + 16.67 + 25 = 83.34. df = (r – 1)(c – 1) = 1
At the α = 0.05 significance level, what is your decision regarding the null hypothesis? p-value = $P(\chi^2 > 83.34) \approx 0$. Reject the null hypothesis. You could also use the calculator function STAT TESTS Chi-Square – Test.
11.4: Test of Homogeneity
You conduct a chi-square test of homogeneity for data in a five by two table. What are the degrees of freedom for this test? The table has five rows and two columns. df = (r – 1)(c – 1) = (4)(1) = 4.
11.5: Comparison Summary of the Chi-Square Tests: Goodness-of-Fit, Independence and Homogeneity
A 2013 poll in the State of California surveyed people about taxing sugar-sweetened beverages. The results are presented in the following table, and are classified by ethnic group and response type. Are the poll responses independent of the participants’ ethnic group? Conduct a hypothesis test at the 5% significance level.
Ethnic Group \ Response Type | Favor | Oppose | No Opinion | Row Total
White / Non-Hispanic | 234 | 433 | 43 | 710
Latino | 147 | 106 | 19 | 272
African American | 24 | 41 | 6 | 71
Asian American | 54 | 48 | 16 | 118
Column Total | 459 | 628 | 84 | 1171
Using the calculator function (STAT TESTS) Chi-square Test, the p-value = 0. Reject the null hypothesis. At the 5% significance level, there is sufficient evidence to conclude that the poll responses are not independent of the participants’ ethnic group.
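The answer to the poll exercise reports only a p-value, so it can help to see where that value comes from. The sketch below computes expected counts as (row total)(column total)/(grand total), sums (O – E)²/E over the cells, and then uses a closed-form right-tail probability that is valid only when the degrees of freedom are even, which is the case here with df = (4 – 1)(3 – 1) = 6. Function names are illustrative, and a statistics package would normally be used for general df.

```python
from math import exp


def chi_square_independence(table):
    """Return (statistic, df) for a chi-square test of independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


def chi2_p_value_even_df(statistic, df):
    """Right-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = statistic / 2
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k!
        total += term
    return exp(-half) * total


poll = [
    [234, 433, 43],  # White / Non-Hispanic
    [147, 106, 19],  # Latino
    [24, 41, 6],     # African American
    [54, 48, 16],    # Asian American
]
statistic, df = chi_square_independence(poll)
p_value = chi2_p_value_even_df(statistic, df)
print(f"chi-square = {statistic:.2f}, df = {df}, p-value = {p_value:.2e}")
```

The statistic comes out at about 54 on 6 degrees of freedom, and the resulting p-value is far below any conventional significance level, which a calculator displays as 0. Calling `chi_square_independence([[100, 150], [200, 50]])` on the cell phone table from the test of independence exercises gives a statistic of about 83.33 with df = 1.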
In a test of homogeneity, what must be true about the expected value of each cell? The expected value of each cell must be at least five.
Stated in general terms, what are the null and alternative hypotheses for the chi-square test of independence? H0: The variables are independent. Ha: The variables are not independent.
Stated in general terms, what are the null and alternative hypotheses for the chi-square test of homogeneity? H0: The populations have the same distribution. Ha: The populations do not have the same distribution.
11.6: Test of a Single Variance
A lab test claims to have a variance of no more than five. You believe the variance is greater. What are the null and alternative hypotheses to test this? H0: $\sigma^2 \le 5$; Ha: $\sigma^2 > 5$