Textbook

Introductory Statistics

Mathematics and Statistics

Terminology

Author: OpenStaxCollege

Probability is a measure that is associated with how certain we are of outcomes of a particular experiment or activity. An experiment is a planned operation carried out under controlled conditions. If the result is not predetermined, then the experiment is said to be a chance experiment. Flipping one fair coin twice is an example of an experiment.

A result of an experiment is called an outcome. The sample space of an experiment is the set of all possible outcomes. Three ways to represent a sample space are: to list the possible outcomes, to create a tree diagram, or to create a Venn diagram. The uppercase letter S is used to denote the sample space. For example, if you flip one fair coin, S = {H, T} where H = heads and T = tails are the outcomes.

An event is any combination of outcomes. Upper case letters like A and B represent events. For example, if the experiment is to flip one fair coin, event A might be getting at most one head. The probability of an event A is written P(A).

The probability of any outcome is the long-term relative frequency of that outcome. Probabilities are between zero and one, inclusive (that is, zero and one and all numbers between these values). P(A) = 0 means the event A can never happen. P(A) = 1 means the event A always happens. P(A) = 0.5 means the event A is equally likely to occur or not to occur. For example, if you flip one fair coin repeatedly (from 20 to 2,000 to 20,000 times) the relative frequency of heads approaches 0.5 (the probability of heads).

Equally likely means that each outcome of an experiment occurs with equal probability. For example, if you toss a fair, six-sided die, each face (1, 2, 3, 4, 5, or 6) is as likely to occur as any other face. If you toss a fair coin, a Head (H) and a Tail (T) are equally likely to occur. If you randomly guess the answer to a true/false question on an exam, you are equally likely to select a correct answer or an incorrect answer.

To calculate the probability of an event A when all outcomes in the sample space are equally likely, count the number of outcomes for event A and divide by the total number of outcomes in the sample space. For example, if you toss a fair dime and a fair nickel, the sample space is {HH, TH, HT, TT} where T = tails and H = heads. The sample space has four outcomes. A = getting one head. There are two outcomes that meet this condition {HT, TH}, so P(A) = $\frac{2}{4}$ = 0.5.
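The counting rule above can be checked by listing the sample space and counting. The following is a minimal Python sketch, not part of the original text: the helper name `classical_probability` is hypothetical, and it assumes each outcome is written as a two-letter string with the dime first and the nickel second.

```python
from fractions import Fraction
from itertools import product


def classical_probability(sample_space, event):
    """Return P(event) when every outcome in sample_space is equally likely.

    `event` is a function that returns True for outcomes that belong to the event.
    """
    outcomes = list(sample_space)
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))


# Sample space for tossing a dime and a nickel (assumed order: dime, nickel)
two_coins = ["".join(pair) for pair in product("HT", repeat=2)]

# Event A = getting exactly one head
p_one_head = classical_probability(two_coins, lambda o: o.count("H") == 1)

print(two_coins)   # ['HH', 'HT', 'TH', 'TT']
print(p_one_head)  # 1/2
```

Using `Fraction` keeps the result exact, so $\frac{2}{4}$ is reported in lowest terms as 1/2 rather than as a rounded decimal.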

Suppose you roll one fair six-sided die, with the numbers {1, 2, 3, 4, 5, 6} on its faces. Let event E = rolling a number that is at least five. There are two outcomes {5, 6}. P(E) = $\frac{2}{6}$. If you were to roll the die only a few times, you would not be surprised if your observed results did not match the probability. If you were to roll the die a very large number of times, you would expect that, overall, $\frac{2}{6}$ of the rolls would result in an outcome of "at least five". You would not expect exactly $\frac{2}{6}$. The long-term relative frequency of obtaining this result would approach the theoretical probability of $\frac{2}{6}$ as the number of repetitions grows larger and larger.

This important characteristic of probability experiments is known as the law of large numbers, which states that as the number of repetitions of an experiment increases, the relative frequency obtained in the experiment tends to get closer and closer to the theoretical probability. Even though the outcomes do not happen according to any set pattern or order, the long-term observed relative frequency will approach the theoretical probability. (The word empirical is often used instead of the word observed.)
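The law of large numbers can be illustrated with a small simulation. The sketch below is an illustration only, under stated assumptions: the function name `relative_frequency_at_least_five` is hypothetical, Python's pseudorandom generator stands in for a fair die, and a fixed seed is used so that a run can be repeated.

```python
import random


def relative_frequency_at_least_five(num_rolls, seed=0):
    """Simulate num_rolls rolls of a fair die; return the fraction that are 5 or 6."""
    rng = random.Random(seed)  # fixed seed (an assumption) so results are repeatable
    hits = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) >= 5)
    return hits / num_rolls


theoretical = 2 / 6
for n in (10, 100, 10_000, 200_000):
    observed = relative_frequency_at_least_five(n)
    print(f"{n:>7} rolls: observed {observed:.4f}, theoretical {theoretical:.4f}")
```

For small numbers of rolls the observed value can be far from $\frac{2}{6}$; for large numbers it should settle near it, although it will rarely match exactly.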

It is important to realize that in many situations, the outcomes are not equally likely. A coin or die may be unfair, or biased. Two math professors in Europe had their statistics students test the Belgian one-euro coin and discovered that in 250 trials, a head was obtained 56% of the time and a tail was obtained 44% of the time. The data seem to show that the coin is not a fair coin; more repetitions would be helpful to draw a more accurate conclusion about such bias. Some dice may be biased. Look at the dice in a game you have at home; the spots on each face are usually small holes carved out and then painted to make the spots visible. Your dice may or may not be biased; it is possible that the outcomes may be affected by the slight weight differences due to the different numbers of holes in the faces. Casinos make a lot of money depending on outcomes from rolling dice, so casino dice are made differently to eliminate bias. Casino dice have flat faces; the holes are completely filled with paint having the same density as the material that the dice are made out of so that each face is equally likely to occur. Later we will learn techniques for working with probabilities for events that are not equally likely.

"OR" Event: An outcome is in the event A OR B if the outcome is in A or is in B or is in both A and B. For example, let A = {1, 2, 3, 4, 5} and B = {4, 5, 6, 7, 8}. A OR B = {1, 2, 3, 4, 5, 6, 7, 8}. Notice that 4 and 5 are NOT listed twice.

"AND" Event: An outcome is in the event A AND B if the outcome is in both A and B at the same time. For example, let A and B be {1, 2, 3, 4, 5} and {4, 5, 6, 7, 8}, respectively. Then A AND B = {4, 5}.

The complement of event A is denoted A′ (read "A prime"). A′ consists of all outcomes that are NOT in A. Notice that P(A) + P(A′) = 1. For example, let S = {1, 2, 3, 4, 5, 6} and let A = {1, 2, 3, 4}. Then, A′ = {5, 6}. P(A) = $\frac{4}{6}$, P(A′) = $\frac{2}{6}$, and P(A) + P(A′) = $\frac{4}{6}+\frac{2}{6}$ = 1.
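The OR, AND, and complement events above correspond to set union, intersection, and difference. The following Python sketch reuses the numbers from the examples; the variable names are illustrative choices, and the complement example uses the name `E` (an assumption made here) so it does not clash with the set `A` from the OR and AND examples.

```python
from fractions import Fraction

# Events from the "OR" and "AND" examples
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

a_or_b = A | B   # union: outcomes in A, in B, or in both
a_and_b = A & B  # intersection: outcomes in both A and B

# Complement example: six equally likely outcomes, E = {1, 2, 3, 4}
S = {1, 2, 3, 4, 5, 6}
E = {1, 2, 3, 4}
e_complement = S - E  # outcomes in S that are NOT in E

p_e = Fraction(len(E), len(S))
p_e_complement = Fraction(len(e_complement), len(S))

print(sorted(a_or_b))        # [1, 2, 3, 4, 5, 6, 7, 8]
print(sorted(a_and_b))       # [4, 5]
print(sorted(e_complement))  # [5, 6]
print(p_e + p_e_complement)  # 1
```

Because sets never store duplicates, 4 and 5 appear only once in the union, matching the note in the "OR" example.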

The conditional probability of A given B is written P(A|B). P(A|B) is the probability that event A will occur given that the event B has already occurred. A conditional reduces the sample space. We calculate the probability of A from the reduced sample space B. The formula to calculate P(A|B) is P(A|B) = $\frac{P(A \text{ AND } B)}{P(B)}$ where P(B) is greater than zero.

For example, suppose we toss one fair, six-sided die. The sample space S = {1, 2, 3, 4, 5, 6}. Let A = face is 2 or 3 and B = face is even (2, 4, 6). To calculate P(A|B), we count the outcomes in the reduced sample space B = {2, 4, 6} that are 2 or 3; there is one, namely 2. Then we divide by the number of outcomes in B (rather than S), which is three, so P(A|B) = $\frac{1}{3}$.

We get the same result by using the formula. Remember that S has six outcomes.

P(A|B) = $\frac{P(A \text{ AND } B)}{P(B)} = \frac{\frac{(\text{the number of outcomes that are 2 or 3 and even in } S)}{6}}{\frac{(\text{the number of outcomes that are even in } S)}{6}} = \frac{\frac{1}{6}}{\frac{3}{6}} = \frac{1}{3}$
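The two routes to P(A|B) described above, counting inside the reduced sample space B and applying the formula, can be compared directly. This is a minimal sketch assuming equally likely outcomes; the helper names `probability` and `conditional_probability` are hypothetical.

```python
from fractions import Fraction


def probability(event, sample_space):
    """P(event) for equally likely outcomes: favorable count / total count."""
    return Fraction(len(event & sample_space), len(sample_space))


def conditional_probability(a, b, sample_space):
    """P(A|B) = P(A AND B) / P(B); requires P(B) > 0."""
    p_b = probability(b, sample_space)
    if p_b == 0:
        raise ValueError("P(B) must be greater than zero")
    return probability(a & b, sample_space) / p_b


S = set(range(1, 7))  # one fair, six-sided die
A = {2, 3}            # face is 2 or 3
B = {2, 4, 6}         # face is even

by_formula = conditional_probability(A, B, S)
by_reduced_space = Fraction(len(A & B), len(B))  # count within B only

print(by_formula, by_reduced_space)  # 1/3 1/3
```

The guard on P(B) mirrors the condition in the formula that P(B) must be greater than zero.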

Understanding Terminology and Symbols

It is important to read each problem carefully to think about and understand what the events are. Understanding the wording is a very important first step in solving probability problems. Reread the problem several times if necessary. Clearly identify the event of interest. Determine whether there is a condition stated in the wording that would indicate that the probability is conditional; carefully identify the condition, if any.

The sample space S is the whole numbers starting at one and less than 20.

1. S = _____________________________

Let event A = the even numbers and event B = numbers greater than 13.

2. A = _____________________, B = _____________________
3. P(A) = _____________, P(B) = ________________
4. A AND B = ____________________, A OR B = ________________
5. P(A AND B) = _________, P(A OR B) = _____________
6. A′ = _____________, P(A′) = _____________
7. P(A) + P(A′) = ____________
8. P(A|B) = ___________, P(B|A) = _____________; are the probabilities equal?
1. S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19}
2. A = {2, 4, 6, 8, 10, 12, 14, 16, 18}, B = {14, 15, 16, 17, 18, 19}
3. P(A) = $\frac{9}{19}$, P(B) = $\frac{6}{19}$
4. A AND B = {14, 16, 18}, A OR B = {2, 4, 6, 8, 10, 12, 14, 15, 16, 17, 18, 19}
5. P(A AND B) = $\frac{3}{19}$, P(A OR B) = $\frac{12}{19}$
6. A′ = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19}; P(A′) = $\frac{10}{19}$
7. P(A) + P(A′) = 1 ($\frac{9}{19}$ + $\frac{10}{19}$ = 1)
8. P(A|B) = $\frac{P(A \text{ AND } B)}{P(B)}$ = $\frac{3}{6}$, P(B|A) = $\frac{P(A \text{ AND } B)}{P(A)}$ = $\frac{3}{9}$; no, the probabilities are not equal.

A fair, six-sided die is rolled. Describe the sample space S, identify each of the following events with a subset of S and compute its probability (an outcome is the number of dots that show up).

1. Event T = the outcome is two.
2. Event A = the outcome is an even number.
3. Event B = the outcome is less than four.
4. The complement of A.
5. A GIVEN B
6. B GIVEN A
7. A AND B
8. A OR B
9. A OR B′
10. Event N = the outcome is a prime number.
11. Event I = the outcome is seven.
1. T = {2}, P(T) = $\frac{1}{6}$
2. A = {2, 4, 6}, P(A) = $\frac{1}{2}$
3. B = {1, 2, 3}, P(B) = $\frac{1}{2}$
4. A′ = {1, 3, 5}, P(A′) = $\frac{1}{2}$
5. A|B = {2}, P(A|B) = $\frac{1}{3}$
6. B|A = {2}, P(B|A) = $\frac{1}{3}$
7. A AND B = {2}, P(A AND B) = $\frac{1}{6}$
8. A OR B = {1, 2, 3, 4, 6}, P(A OR B) = $\frac{5}{6}$
9. A OR B′ = {2, 4, 5, 6}, P(A OR B′) = $\frac{2}{3}$
10. N = {2, 3, 5}, P(N) = $\frac{1}{2}$
11. A six-sided die does not have seven dots. P(7) = 0.

The following table describes the distribution of a random sample S of 100 individuals, organized by gender and whether they are right- or left-handed.

| | Right-handed | Left-handed |
|---|---|---|
| Males | 43 | 9 |
| Females | 44 | 4 |

Let’s denote the events M = the subject is male, F = the subject is female, R = the subject is right-handed, L = the subject is left-handed. Compute the following probabilities:

1. P(M)
2. P(F)
3. P(R)
4. P(L)
5. P(M AND R)
6. P(F AND L)
7. P(M OR F)
8. P(M OR R)
9. P(F OR L)
10. P(M')
11. P(R|M)
12. P(F|L)
13. P(L|F)
1. P(M) = 0.52
2. P(F) = 0.48
3. P(R) = 0.87
4. P(L) = 0.13
5. P(M AND R) = 0.43
6. P(F AND L) = 0.04
7. P(M OR F) = 1
8. P(M OR R) = 0.96
9. P(F OR L) = 0.57
10. P(M') = 0.48
11. P(R|M) = 0.8269 (rounded to four decimal places)
12. P(F|L) = 0.3077 (rounded to four decimal places)
13. P(L|F) = 0.0833 (rounded to four decimal places)
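Several of the answers above can be reproduced from the four cell counts of the handedness table. The sketch below is one possible way to organize the calculation, not the textbook's method; the dictionary layout and the helper names `prob` and `cond_prob` are assumptions made for illustration.

```python
from fractions import Fraction

# Cell counts from the handedness table: (gender, hand) -> number of people
counts = {
    ("M", "R"): 43, ("M", "L"): 9,
    ("F", "R"): 44, ("F", "L"): 4,
}
total = sum(counts.values())  # 100 individuals


def prob(predicate):
    """P(event) = (people for whom predicate is True) / (all people)."""
    matching = sum(n for (g, h), n in counts.items() if predicate(g, h))
    return Fraction(matching, total)


def cond_prob(predicate, given):
    """P(event | given) = P(event AND given) / P(given)."""
    return prob(lambda g, h: predicate(g, h) and given(g, h)) / prob(given)


def is_male(g, h):
    return g == "M"


def is_female(g, h):
    return g == "F"


def is_right(g, h):
    return h == "R"


def is_left(g, h):
    return h == "L"


p_m_or_r = prob(lambda g, h: is_male(g, h) or is_right(g, h))
p_r_given_m = cond_prob(is_right, is_male)
p_l_given_f = cond_prob(is_left, is_female)

print(float(p_m_or_r))               # 0.96
print(round(float(p_r_given_m), 4))  # 0.8269
print(round(float(p_l_given_f), 4))  # 0.0833
```

Counting people who satisfy "male or right-handed" directly avoids counting the 43 right-handed males twice.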

References

“Countries List by Continent.” Worldatlas, 2013. Available online at http://www.worldatlas.com/cntycont.htm (accessed May 2, 2013).

Chapter Review

In this module we learned the basic terminology of probability. The set of all possible outcomes of an experiment is called the sample space. Events are subsets of the sample space, and they are assigned a probability that is a number between zero and one, inclusive.

Formula Review

A and B are events

P(S) = 1 where S is the sample space

0 ≤ P(A) ≤ 1

P(A|B) = $\frac{P(A \text{ AND } B)}{P(B)}$

In a particular college class, there are male and female students. Some students have long hair and some students have short hair. Write the symbols for the probabilities of the events for parts 1 through 10. (Note that you cannot find numerical answers here. You were not given enough information to find any probability values yet; concentrate on understanding the symbols.)

• Let F be the event that a student is female.
• Let M be the event that a student is male.
• Let S be the event that a student has short hair.
• Let L be the event that a student has long hair.
1. The probability that a student does not have long hair.
2. The probability that a student is male or has short hair.
3. The probability that a student is a female and has long hair.
4. The probability that a student is male, given that the student has long hair.
5. The probability that a student has long hair, given that the student is male.
6. Of all the female students, the probability that a student has short hair.
7. Of all students with long hair, the probability that a student is female.
8. The probability that a student is female or has long hair.
9. The probability that a randomly selected student is a male student with short hair.
10. The probability that a student is female.
1. P(L′) = P(S)
2. P(M OR S)
3. P(F AND L)
4. P(M|L)
5. P(L|M)
6. P(S|F)
7. P(F|L)
8. P(F OR L)
9. P(M AND S)
10. P(F)

Use the following information to answer the next four exercises. A box is filled with several party favors. It contains 12 hats, 15 noisemakers, ten finger traps, and five bags of confetti.
Let H = the event of getting a hat.
Let N = the event of getting a noisemaker.
Let F = the event of getting a finger trap.
Let C = the event of getting a bag of confetti.

Find P(H).

Find P(N).

P(N) = $\frac{15}{42}$ = $\frac{5}{14}$ = 0.36

Find P(F).

Find P(C).

P(C) = $\frac{5}{42}$ = 0.12

Use the following information to answer the next six exercises. A jar of 150 jelly beans contains 22 red jelly beans, 38 yellow, 20 green, 28 purple, 26 blue, and the rest are orange.
Let B = the event of getting a blue jelly bean.
Let G = the event of getting a green jelly bean.
Let O = the event of getting an orange jelly bean.
Let P = the event of getting a purple jelly bean.
Let R = the event of getting a red jelly bean.
Let Y = the event of getting a yellow jelly bean.

Find P(B).

Find P(G).

P(G) = $\frac{20}{150}$ = $\frac{2}{15}$ = 0.13

Find P(P).

Find P(R).

P(R) = $\frac{22}{150}$ = $\frac{11}{75}$ = 0.15

Find P(Y).

Find P(O).

P(O) = $\frac{150-22-38-20-28-26}{150}$ = $\frac{16}{150}$ = $\frac{8}{75}$ = 0.11

Use the following information to answer the next six exercises. There are 23 countries in North America, 12 countries in South America, 47 countries in Europe, 44 countries in Asia, 54 countries in Africa, and 14 in Oceania (Pacific Ocean region).
Let A = the event that a country is in Asia.
Let E = the event that a country is in Europe.
Let F = the event that a country is in Africa.
Let N = the event that a country is in North America.
Let O = the event that a country is in Oceania.
Let S = the event that a country is in South America.

Find P(A).

Find P(E).

P(E) = $\frac{47}{194}$ = 0.24

Find P(F).

Find P(N).

P(N) = $\frac{23}{194}$ = 0.12

Find P(O).

Find P(S).

P(S) = $\frac{12}{194}$ = $\frac{6}{97}$ = 0.06

What is the probability of drawing a red card in a standard deck of 52 cards?

What is the probability of drawing a club in a standard deck of 52 cards?

$\frac{13}{52}$ = $\frac{1}{4}$ = 0.25

What is the probability of rolling an even number of dots with a fair, six-sided die numbered one through six?

What is the probability of rolling a prime number of dots with a fair, six-sided die numbered one through six?

$\frac{3}{6}$ = $\frac{1}{2}$ = 0.5

Use the following information to answer the next two exercises. You see a game at a local fair. You have to throw a dart at a color wheel. Each section on the color wheel is equal in area.

Let B = the event of landing on blue.
Let R = the event of landing on red.
Let G = the event of landing on green.
Let Y = the event of landing on yellow.

If you land on Y, you get the biggest prize. Find P(Y).

If you land on red, you don’t get a prize. What is P(R)?

$P\left(R\right)=\frac{4}{8}=0.5$

Use the following information to answer the next ten exercises. On a baseball team, there are infielders and outfielders. Some players are great hitters, and some players are not great hitters.
Let I = the event that a player is an infielder.
Let O = the event that a player is an outfielder.
Let H = the event that a player is a great hitter.
Let N = the event that a player is not a great hitter.

Write the symbols for the probability that a player is not an outfielder.

Write the symbols for the probability that a player is an outfielder or is a great hitter.

P(O OR H)

Write the symbols for the probability that a player is an infielder and is not a great hitter.

Write the symbols for the probability that a player is a great hitter, given that the player is an infielder.

P(H|I)

Write the symbols for the probability that a player is an infielder, given that the player is a great hitter.

Write the symbols for the probability that, of all the outfielders, a player is not a great hitter.

P(N|O)

Write the symbols for the probability that, of all the great hitters, a player is an outfielder.

Write the symbols for the probability that a player is an infielder or is not a great hitter.

P(I OR N)

Write the symbols for the probability that a player is an outfielder and is a great hitter.

Write the symbols for the probability that a player is an infielder.

P(I)

What is the word for the set of all possible outcomes?

What is conditional probability?

The likelihood that an event will occur given that another event has already occurred.

A shelf holds 12 books. Eight are fiction and the rest are nonfiction. Each is a different book with a unique title. The fiction books are numbered one to eight. The nonfiction books are numbered one to four. Randomly select one book.
Let F = event that book is fiction.
Let N = event that book is nonfiction.
What is the sample space?

What is the sum of the probabilities of an event and its complement?

1

Use the following information to answer the next two exercises. You are rolling a fair, six-sided number cube. Let E = the event that it lands on an even number. Let M = the event that it lands on a multiple of three.

What does P(E|M) mean in words?

What does P(E OR M) mean in words?

The probability of landing on an even number or a multiple of three.

Homework

The graph in [link] displays the sample sizes and percentages of people in different age and gender groups who were polled concerning their approval of Mayor Ford’s actions in office. The total number in the sample of all the age groups is 1,045.

1. Define three events in the graph.
2. Describe in words what the entry 40 means.
3. Describe in words the complement of the entry in question 2.
4. Describe in words what the entry 30 means.
5. Out of the males and females, what percent are males?
6. Out of the females, what percent disapprove of Mayor Ford?
7. Out of all the age groups, what percent approve of Mayor Ford?
8. Find P(Approve|Male).
9. Out of the age groups, what percent are more than 44 years old?
10. Find P(Approve|Age < 35).

Explain what is wrong with the following statements. Use complete sentences.

1. If there is a 60% chance of rain on Saturday and a 70% chance of rain on Sunday, then there is a 130% chance of rain over the weekend.
2. The probability that a baseball player hits a home run is greater than the probability that he gets a successful hit.
1. You can't calculate the probability of rain over the weekend without knowing the probability that it rains on both days, which is not in the information given; the two probabilities cannot simply be added; and a probability is never greater than 100%.
2. A home run by definition is a successful hit, so he has to have at least as many successful hits as home runs.