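When I hear others explain certain mathematical problems, I often find them hard to understand, even incomprehensible. Then I wonder whether the problem could be made simpler. Often, once I finally figure it out, it turns out to be a simpler problem after all. -----Hilbert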

Chapter 1: Sample Space and Probability

Probability is common sense reduced to calculation. -------Laplace

Sec 1.1 Sets

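Keep in mind:

For any three events , prove:

Let the events have probabilities , respectively; find .

One is , the other is

, find .

, find .

This is easier to see with a Venn diagram.

Let the events be mutually exclusive, ; find .

The probability that exactly one of occurs is ; find the probability that both occur.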

Sec 1.3 Conditional Probability

Independence that is easy to overlook

Problem :

A conservative design team, call it C, and an innovative design team, call it N, are asked to separately design a new product within a month. From past experience we know that:

  1. The probability that team C is successful is 2/3.
  2. The probability that team N is successful is 1/2.
  3. The probability that at least one team is successful is 3/4.

Assuming that exactly one successful design is produced, what is the probability that it was designed by team N?

Solution:

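Note that the successes of team C and team N are not independent: inclusion-exclusion gives P(C ∩ N) = 2/3 + 1/2 - 3/4 = 5/12, whereas independence would require P(C)P(N) = 1/3.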
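The arithmetic behind this problem can be checked with exact fractions. The following is a minimal sketch; the variable names are chosen here for illustration and the only inputs are the three probabilities given in the problem statement.

```python
from fractions import Fraction

# Given in the problem statement.
p_c = Fraction(2, 3)         # P(C succeeds)
p_n = Fraction(1, 2)         # P(N succeeds)
p_union = Fraction(3, 4)     # P(at least one succeeds)

# Inclusion-exclusion gives the probability that both succeed.
p_both = p_c + p_n - p_union            # 5/12

# Independence would require P(C and N) == P(C) * P(N).
independent = (p_both == p_c * p_n)     # False: 5/12 != 1/3

# "Exactly one succeeds" splits into "only C" and "only N".
p_only_c = p_c - p_both                 # 1/4
p_only_n = p_n - p_both                 # 1/12
p_exactly_one = p_only_c + p_only_n     # 1/3

# Conditional probability that N is the single successful team.
answer = p_only_n / p_exactly_one       # 1/4
print(p_both, independent, answer)
```

Under these numbers the conditional probability that the single successful design came from team N works out to 1/4.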

Grouping / partition problems

Problem :

A class consisting of 4 graduate and 12 undergraduate students is randomly divided into 4 groups of 4. What is the probability that each group includes a graduate student? We interpret "randomly" to mean that given the assignment of some students to certain slots, any of the remaining students is equally likely to be assigned to any of the remaining slots.

Solution:

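Grouping problems can be analyzed from two angles: counting (permutations and combinations) and probability.

From the probability angle, dividing into four groups just has to reflect randomness. To visualize it, assume ordered positions: the people are lined up in some random order, and we only track the elements we care about, placing them one step at a time.

fx's explanation: in this problem it is enough to look at the teachers (the graduate students) and ignore the students. Suppose the 16 people occupy 16 positions arranged from left to right; the first 4 positions form group 1, the next 4 form group 2, and so on. Now teachers A, B, C, D each take a position. For teacher A, any of the 16 positions lands in some group, so the probability is 16/16. Teacher B then has 15 positions to choose from, but the positions belonging to A's group are excluded, so the probability of landing in a new group is 12/15. By the same reasoning, teacher C lands in a new group with probability 8/14 and teacher D with probability 4/13. Multiplying, the answer is 12/15 · 8/14 · 4/13 = 64/455.

Note that from the probability angle, randomness is all that matters; how exactly the arrangement is made, and the number of arrangements (that is, the 24-fold repetition), need not be considered.

The counting angle also works: count the total number of outcomes and the number of favorable ones to get the answer.

A brief explanation: the denominator is the number of ways to divide 16 people into 4 groups, which should then be divided by A44 (to remove the ordering of the groups); for neatness this factor has been moved to the numerator. The numerator arranges the remaining 12 people **(explained in detail in Problem 62)**.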
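Both angles can be checked against each other numerically. The sketch below treats the four groups as labeled; if the groups are instead treated as unlabeled, both the favorable count and the total are divided by 4!, so the ratio is unchanged. The helper `multinomial` is a name introduced here for illustration.

```python
from fractions import Fraction
from math import factorial

def multinomial(n, parts):
    """Number of ways to split n labeled items into labeled groups of the given sizes."""
    assert sum(parts) == n
    result = factorial(n)
    for size in parts:
        result //= factorial(size)
    return result

# Probability angle: place the graduate students one at a time into 16 slots.
p_sequential = Fraction(16, 16) * Fraction(12, 15) * Fraction(8, 14) * Fraction(4, 13)

# Counting angle with labeled groups.
total = multinomial(16, [4, 4, 4, 4])                     # all labeled partitions
favorable = factorial(4) * multinomial(12, [3, 3, 3, 3])  # one graduate per group
p_counting = Fraction(favorable, total)

print(p_sequential, p_counting)
```

Both computations give 64/455, roughly 0.14.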

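Summary: for some problems, start from a small, simple case and then scale up. For example, the grouping in this problem requires de-ordering, which I did not notice at first and only discovered through a simple experiment of dividing 4 people into two groups. Processing data in neural networks follows a similar line of thought, extending from a single batch of data to multi-batch matrices.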

Monty Hall & the prisoner problem

Problem:

The Monty Hall Problem. This is a much discussed puzzle, based on an old American game show. You are told that a prize is equally likely to be found behind any one of three closed doors in front of you. You point to one of the doors. A friend opens for you one of the remaining two doors, after making sure that the prize is not behind it. At this point, you can stick to your initial choice, or switch to the other unopened door. You win the prize if it lies behind your final choice of a door. Consider the following strategies:
(a) Stick to your initial choice.
(b) Switch to the other unopened door.
(c) You first point to door 1. If door 2 is opened, you do not switch. If door 3 is opened, you switch.
Which is the best strategy?

Solution:

To answer the question, let us calculate the probability of winning under each of the three strategies.
Under the strategy of no switching, your initial choice will determine whether you win or not, and the probability of winning is 1/3. This is because the prize is equally likely to be behind each door.

Under the strategy of switching, if the prize is behind the initially chosen door (probability 1/3), you do not win. If it is not (probability 2/3), and given that another door without a prize has been opened for you, you will get to the winning door once you switch. Thus, the probability of winning is now 2/3, so (b) is a better strategy than (a).

Consider now strategy (c). Under this strategy, there is insufficient information for determining the probability of winning. The answer depends on the way that your friend chooses which door to open. Let us consider two possibilities.

Suppose that if the prize is behind door 1, your friend always chooses to open door 2. (If the prize is behind door 2 or 3, your friend has no choice.) If the prize is behind door 1, your friend opens door 2, you do not switch, and you win. If the prize is behind door 2, your friend opens door 3, you switch, and you win. If the prize is behind door 3, your friend opens door 2, you do not switch, and you lose. Thus, the probability of winning is 2/3, so strategy (c) in this case is as good as strategy (b).

Suppose now that if the prize is behind door 1, your friend is equally likely to open either door 2 or 3. If the prize is behind door 1 (probability 1/3), and if your friend opens door 2 (probability 1/2), you do not switch and you win (probability 1/6). But if your friend opens door 3, you switch and you lose. If the prize is behind door 2, your friend opens door 3, you switch, and you win (probability 1/3). If the prize is behind door 3, your friend opens door 2, you do not switch, and you lose. Thus, the probability of winning is 1/6 + 1/3 = 1/2, so strategy (c) in this case is inferior to strategy (b).
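The case analysis above can be reproduced by enumerating the scenarios exactly. This is a minimal sketch, assuming you always point to door 1 first; the parameter `q` (the probability that the friend opens door 2 when the prize is behind door 1) and the function name are introduced here for illustration.

```python
from fractions import Fraction

def win_probabilities(q):
    """Exact win probabilities for strategies a, b, c when you first point to door 1.

    q is the assumed probability that the friend opens door 2 when the
    prize is behind door 1; otherwise the friend opens door 3.
    """
    third = Fraction(1, 3)
    # Each scenario: (probability, prize door, opened door).
    scenarios = [
        (third * q, 1, 2),
        (third * (1 - q), 1, 3),
        (third, 2, 3),   # prize behind 2: friend must open 3
        (third, 3, 2),   # prize behind 3: friend must open 2
    ]
    wins = {"a": Fraction(0), "b": Fraction(0), "c": Fraction(0)}
    for prob, prize, opened in scenarios:
        other = 6 - 1 - opened   # the unopened door that is not door 1
        final = {
            "a": 1,                             # stick
            "b": other,                         # switch
            "c": 1 if opened == 2 else other,   # stick on 2, switch on 3
        }
        for name, door in final.items():
            if door == prize:
                wins[name] += prob
    return wins

always_door_2 = win_probabilities(Fraction(1))
uniform_choice = win_probabilities(Fraction(1, 2))
print(always_door_2)
print(uniform_choice)
```

Strategies (a) and (b) give 1/3 and 2/3 regardless of `q`, while strategy (c) gives 2/3 for `q = 1` and 1/2 for `q = 1/2`, matching the two cases worked out in the solution.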

Problem 24:

The prisoner’s dilemma. The release of two out of three prisoners has been announced, but their identity is kept secret. One of the prisoners considers asking a friendly guard to tell him who is the prisoner other than himself that will be released, but hesitates based on the following rationale: at the prisoner’s present state of knowledge, the probability of being released is 2/3, but after he knows the answer, the probability of being released will become 1/2, since there will be two prisoners (including himself) whose fate is unknown and exactly one of the two will be released. What is wrong with this line of reasoning?

Solution:

This can be viewed as an inverse version of the Monty Hall problem.

It can also be explained as follows:
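One way to make the flaw concrete is to compute the posterior directly. The sketch below assumes the three possible released pairs are equally likely and that, when both other prisoners are to be released, the guard names one of them with probability `r`; this tie-breaking rule is not part of the problem statement and is labeled as an assumption. The prisoner names A (the asker), B, and C are also introduced here.

```python
from fractions import Fraction

def posterior_release(r=Fraction(1, 2)):
    """P(asker A is released | guard names B), for released pairs AB, AC, BC.

    r is an assumed tie-breaking rule: the probability that the guard names B
    when both B and C are released.
    """
    third = Fraction(1, 3)
    p_ab_says_b = third          # pair AB: guard must name B
    p_bc_says_b = third * r      # pair BC: guard names B with probability r
    # Pair AC never leads to the answer "B".
    return p_ab_says_b / (p_ab_says_b + p_bc_says_b)

print(posterior_release())
```

With a guard who breaks ties uniformly (`r = 1/2`), the posterior stays at 2/3, so the answer carries no information about the asker's own fate. The value 1/2 from the flawed argument only appears under the extreme assumption `r = 1`.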

Sec 1.4 Total Probability Theorem and Bayes’ Rule

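Bayes’ rule is often used for inference from effect to cause.

Two important models: aircraft radar (or a positive test for a disease) & searching drawers for a paper (or treasure hunting)

The radar problem

See p. 22 (Example 1.9) and p. 33 (Example 1.16); they are too long to reproduce here.

Finding the paper / treasure hunting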

Problem :

Alice searches for her term paper in her filing cabinet, which has several drawers. She knows that she left her term paper in drawer j with probability p_j > 0. The drawers are so messy that even if she correctly guesses that the term paper is in drawer i, the probability that she finds it is only d_i. Alice searches in a particular drawer, say drawer i, but the search is unsuccessful. Conditioned on this event, show that the probability that her paper is in drawer j is given by

$$\frac{p_j}{1 - p_i d_i} \quad \text{if } j \neq i, \qquad \frac{p_i (1 - d_i)}{1 - p_i d_i} \quad \text{if } j = i.$$

Solution:

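Solution to Problem 1.19. Let A be the event that Alice does not find her paper in drawer i. Since the paper is in drawer i with probability p_i, and her search is successful with probability d_i, the multiplication rule yields P(A^c) = p_i d_i, so that P(A) = 1 - p_i d_i. Let B be the event that the paper is in drawer j. If j ≠ i, then A ∩ B = B, P(A ∩ B) = P(B), and we have

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(B)}{P(A)} = \frac{p_j}{1 - p_i d_i}.$$

Similarly, if i = j, we have

$$P(B \mid A) = \frac{P(B)\,P(A \mid B)}{P(A)} = \frac{p_i (1 - d_i)}{1 - p_i d_i}.$$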
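The update can be tried on concrete numbers. By Bayes' rule, after a failed search of drawer i the posterior is p_j / (1 - p_i d_i) for j ≠ i and p_i (1 - d_i) / (1 - p_i d_i) for j = i. The sketch below applies this; the function name and the numerical priors and detection probabilities are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def posterior_after_failed_search(p, d, i):
    """Posterior P(paper in drawer j | search of drawer i failed), for every j.

    p[j] is the prior that the paper is in drawer j; d[j] is the probability
    of finding it when searching drawer j, given that it is there.
    """
    p_fail = 1 - p[i] * d[i]   # P(A) = 1 - p_i d_i
    return [
        (p[j] * (1 - d[j]) if j == i else p[j]) / p_fail
        for j in range(len(p))
    ]

# Hypothetical numbers, for illustration only.
priors = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
detect = [Fraction(1, 2), Fraction(3, 4), Fraction(9, 10)]
post = posterior_after_failed_search(priors, detect, i=0)
print(post)
```

The searched drawer becomes less likely and every other drawer becomes more likely, while the posterior still sums to 1.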

Problem : Let A and B be events with P(A) > 0 and P(B) > 0. We say that an event B suggests an event A if P(A|B) > P(A), and does not suggest event A if P(A|B) < P(A).
(a.) Show that B suggests A if and only if A suggests B.
(b.) Assume that P(B^c) > 0. Show that B suggests A if and only if B^c does not suggest A.
(c.) We know that a treasure is located in one of two places, with probabilities β and 1-β, respectively, where 0 < β < 1. We search the first place and if the treasure is there, we find it with probability p > 0. Show that the event of not finding the treasure in the first place suggests that the treasure is in the second place.
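A sketch of how parts (a) and (c) might be argued, using only the definition of conditional probability and Bayes' rule. The event names F (the treasure is in the first place) and N (the search of the first place fails) are introduced here for illustration.

```latex
% (a) Both conditions are equivalent to P(A \cap B) > P(A) P(B).
P(A \mid B) > P(A)
\iff \frac{P(A \cap B)}{P(B)} > P(A)
\iff P(A \cap B) > P(A)\,P(B)
\iff \frac{P(A \cap B)}{P(A)} > P(B)
\iff P(B \mid A) > P(B).

% (c) P(F) = \beta, \quad P(N \mid F) = 1 - p, \quad P(N \mid F^c) = 1.
P(N) = \beta (1 - p) + (1 - \beta) = 1 - \beta p,
\qquad
P(F^c \mid N) = \frac{P(F^c)\,P(N \mid F^c)}{P(N)}
             = \frac{1 - \beta}{1 - \beta p}
             > 1 - \beta = P(F^c).
```

The last inequality holds because 0 < βp < 1, so the denominator 1 - βp lies strictly between 0 and 1.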

Sec 1.6 Counting

Permutation & Combination

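These need not involve order or timing. In essence, the sample space consists of the sequences generated by all the equally likely possibilities (a tree), and it can be viewed as a discrete uniform model, so probabilities can be computed with the counting principle.

Picking up hats, drawing lots, and drawing balls are all essentially the same: in an outcome (that is, a sequence), every position has the same probability of hitting the target (getting one's own hat, winning the draw), because no extra information is available.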
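The claim that every position is equally likely to hit the target can be checked by brute-force enumeration on a small lottery. This is a minimal sketch; the function name and the choice of 5 tickets with 2 winners are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def win_probability_by_position(n_tickets, n_winning):
    """P(the k-th draw is a winning ticket), for each position k, by enumeration."""
    tickets = range(n_tickets)               # tickets 0..n_winning-1 are the winners
    orders = list(permutations(tickets))     # all equally likely draw sequences
    counts = [0] * n_tickets
    for order in orders:
        for position, ticket in enumerate(order):
            if ticket < n_winning:
                counts[position] += 1
    return [Fraction(c, len(orders)) for c in counts]

by_position = win_probability_by_position(5, 2)
print(by_position)
```

Every position has probability 2/5, so drawing first or last makes no difference as long as no information about earlier draws is revealed.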

Partition

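Note that when dividing into groups of equal size, the ordering must be removed. For example, when dividing into three groups A, B, C where A and B contain the same number of elements (call it r), a plain count of this kind produces ordered duplicates, namely the two orders AB and BA. This is easy to see with an intuitive sequence diagram such as [ ] [ ] [ ]; a brief analysis reveals the problem.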
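The de-ordering can be seen on the smallest case, four people split into two groups of two. A minimal sketch (the names `people`, `ordered`, and `unordered` are illustrative):

```python
from itertools import combinations
from math import comb, factorial

people = ["a", "b", "c", "d"]

# Ordered count: choose who goes into "group A"; the rest form "group B".
ordered = [(frozenset(g), frozenset(people) - frozenset(g))
           for g in combinations(people, 2)]

# Unordered count: the splits (A, B) and (B, A) are the same partition.
unordered = {frozenset(pair) for pair in ordered}

print(len(ordered), len(unordered))   # 6 3
```

The ordered count is C(4,2) = 6, and dividing by 2! for the two equal-sized groups gives the 3 distinct partitions.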

Sec 1.7 Chapter 1 Summary

A probability problem can usually be broken down into a few basic steps:

  1. The description of the sample space, that is, the set of possible outcomes of a given experiment.
  2. The (possibly indirect) specification of the probability law (the probability of each event).
  3. The calculation of probabilities and conditional probabilities of various events of interest.

The probabilities of events must satisfy the nonnegativity, additivity, and normalization axioms. In the important special case where the set of possible outcomes is finite, one can just specify the probability of each outcome and obtain the probability of any event by adding the probabilities of the elements of the event.
Given a probability law, we are often interested in conditional probabilities, which allow us to reason based on partial information about the outcome of the experiment. We can view conditional probabilities as probability laws of a special type, under which only outcomes contained in the conditioning event can have positive conditional probability. Conditional probabilities can be derived from the (unconditional) probability law using the definition P(A | B) = P(A ∩ B) / P(B). However, the reverse process is often convenient, that is, first specify some conditional probabilities that are natural for the real situation that we wish to model, and then use them to derive the (unconditional) probability law. We have illustrated through examples three methods for calculating probabilities:

  1. The counting method. This method applies to the case where the number of possible outcomes is finite, and all outcomes are equally likely. To calculate the probability of an event, we count the number of elements of the event and divide by the number of elements of the sample space.

  2. The sequential method. This method applies when the experiment has a sequential character, and suitable conditional probabilities are specified or calculated along the branches of the corresponding tree (perhaps using the counting method). The probabilities of various events are then obtained by multiplying conditional probabilities along the corresponding paths of the tree, using the multiplication rule.

  3. The divide-and-conquer method. Here, the probabilities P(B) of various events B are obtained from conditional probabilities P(B | A_i), where the A_i are suitable events that form a partition of the sample space and have known probabilities P(A_i). The probabilities P(B) are then obtained by using the total probability theorem.

Finally, we have focused on a few side topics that reinforce our main themes. We have discussed the use of Bayes’ rule in inference, which is an important application context. We have also discussed some basic principles of counting and combinatorics, which are helpful in applying the counting method.

Chapter 2: Discrete Random Variables

Sec 2.2 Probability Mass Function

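Geometric Random Variable

Point: a geometric random variable usually takes infinitely many values of k, but when k is restricted to a finite range, make sure the PMF still sums to 1.

e.g. p. 119, Problem 3: the classic match problem.

In addition, the geometric random variable has its characteristic memoryless property.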
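The memoryless property says P(X > m + n | X > m) = P(X > n). A minimal numerical check, assuming X counts the number of trials up to and including the first success, so that P(X > n) = (1 - p)^n; the value p = 1/4 and the function names are hypothetical.

```python
from fractions import Fraction

def geometric_tail(p, n):
    """P(X > n) for a geometric random variable with success probability p."""
    return (1 - p) ** n

def conditional_tail(p, m, n):
    """P(X > m + n | X > m), computed from the definition of conditional probability."""
    return geometric_tail(p, m + n) / geometric_tail(p, m)

p = Fraction(1, 4)   # hypothetical success probability
lhs = conditional_tail(p, m=3, n=2)
rhs = geometric_tail(p, 2)
print(lhs, rhs)
```

Both values equal 9/16: having already waited 3 trials does not change the distribution of the remaining waiting time.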

Binomial Random Variable

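Point: again, make sure the PMF sums to 1; sometimes an entry is the total probability of several merged outcomes.

e.g. p. 119, Problem 4: the classic modem problem.

Form of the binomial PMF: intervals where it is monotonically increasing/decreasing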
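The shape of the binomial PMF follows from the ratio of consecutive terms, p(k)/p(k-1) = (n-k+1)p / (k(1-p)), which is at least 1 exactly when k ≤ (n+1)p. A minimal sketch checking this on hypothetical parameters n = 10, p = 3/10:

```python
from fractions import Fraction
from math import comb, floor

def binomial_pmf(n, p):
    """List of P(X = k) for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

n, p = 10, Fraction(3, 10)   # hypothetical parameters
pmf = binomial_pmf(n, p)
mode = max(range(n + 1), key=lambda k: pmf[k])

# The PMF rises up to floor((n+1)p) and falls afterwards.
k_star = floor((n + 1) * p)
print(sum(pmf), mode, k_star)
```

Here the PMF sums to 1 and peaks at k = 3 = floor(3.3). When (n+1)p is an integer, the two values k = (n+1)p - 1 and k = (n+1)p tie for the maximum.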

Sec 2.6 Conditioning

Problem:

The Hat Problem. Suppose that n people throw their hats in a box and then each picks one hat at random. (Each hat can be picked by only one person, and each assignment of hats to persons is equally likely.) What is the expected value of X, the number of people that get back their own hat?

Wrong way of thinking:

Shouldn’t it depend on the value of i? What I mean is that for the first person it is 1/n, but for the second it is 1/(n-1) if the first one did not get his hat and 0 if the first one picked his hat.

Correct way of thinking:

The point is that the sequential argument leaves out the particular case in which the 1st person took the 2nd person’s hat, so the value 1/(n-1) is a conditional probability and is bigger than the right answer 1/n. The right way to think about it is to focus on one person and weight by the probability of what happened before.

Consider 3 people:
- The 1st has a 1/3 chance of getting his hat, because there are 3 choices he can make.
- The 2nd can get his hat out of the 2 remaining only if the 1st didn’t take it (conditional probability): (probability the 1st didn’t take his) · (probability of then getting his) = 2/3 · 1/2 = 1/3.
- The 3rd can’t choose anymore, but there is a 2/3 chance that the others took his hat, so he has a 1/3 chance of getting his.

All of them have a 1/3 chance. With n people:
- the 1st has a 1/n chance;
- the 2nd has (1 - 1/n) · 1/(n-1) = (n-1)/n · 1/(n-1) = 1/n;
- the 3rd has (1 - 1/n - 1/n) · 1/(n-2) = (n-2)/n · 1/(n-2) = 1/n;
- and so on …

There are n! ways of distributing the hats to the people, as you rightly pointed out, so that is the cardinality of the sample space Ω. Now let A be the event that the k-th person gets his own hat. Once that person has his hat, the remaining n-1 hats can be distributed freely, so the cardinality of A is (n-1)!. Thus the probability that the k-th person gets his own hat is (n-1)!/n! = 1/n.

In fact, whether the events are independent of one another has no necessary bearing on the final probability.

Solution:

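For the ith person, we introduce a random variable X_i that takes the value 1 if the person selects his/her own hat and takes the value 0 otherwise. Since P(X_i = 1) = 1/n and P(X_i = 0) = 1 - 1/n, the mean of X_i is

$$E[X_i] = 1 \cdot \frac{1}{n} + 0 \cdot \left(1 - \frac{1}{n}\right) = \frac{1}{n},$$

so, since X = X_1 + X_2 + ... + X_n, linearity of expectation gives

$$E[X] = E[X_1] + E[X_2] + \cdots + E[X_n] = n \cdot \frac{1}{n} = 1.$$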
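For small n, both the per-person probability 1/n and the expected number of matches can be confirmed by enumerating all n! assignments. A minimal sketch; the function name and the choice n = 4 are for illustration only.

```python
from fractions import Fraction
from itertools import permutations

def hat_statistics(n):
    """Enumerate all n! hat assignments.

    Returns (list of P(person k gets own hat) for each k, E[number of matches]).
    """
    assignments = list(permutations(range(n)))
    total = len(assignments)
    own_counts = [sum(1 for a in assignments if a[k] == k) for k in range(n)]
    p_own = [Fraction(c, total) for c in own_counts]
    expected_matches = Fraction(sum(own_counts), total)
    return p_own, expected_matches

p_own, mean = hat_statistics(4)
print(p_own, mean)
```

Each person gets their own hat with probability 1/4 and the expected number of matches is 1, even though the indicator variables are not independent.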

Sec 2.8 Chapter 2 Summary

Chapter 3

Sec 3.1 Continuous Random Variable and PDFs

Problem

Suppose all possible values of the continuous random variable lie within the interval ; then

Solution

Sec 3.3 Normal Random Variable

Problem

are two mutually independent random variables that follow normal distributions; find

Solution

Problem

are mutually independent and all follow the distribution ; prove that

Solution

Chapter 4: Further Topics on Random Variables

Sec 4.2 Covariance and Correlation

Problem

Let the two-dimensional random variable have the probability density

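where are both probability densities of bivariate normal distributions, and the correlation coefficients of their corresponding two-dimensional random variables are and $-{1\over3}$, respectively. The problem concerns the correlation coefficient $\rho$ of $X, Y$ and whether $X, Y$ are independent.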

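Thoughts

At first this problem confused me: as functions of , isn't their correlation coefficient just the correlation coefficient of ? How can the two differ?

This thought comes from not fully understanding random variables. Here are merely the values that the sample space is mapped to, that is, two real axes. On this , can take different ; the probability of taking a given value may be 0, but that does not prevent the value from being attainable. So for or , the correlation coefficient actually refers to other random variables, namely the two random variables, and it is derived from the distribution , not from

In other words, on a single domain, depending on the values of , different distribution functions can be defined on it, which correspond to different random variables with correspondingly different correlation coefficients.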

Solution

Only a hint is given for this problem.

Hint:
  1. Calculate the marginal PDFs of f(x,y), and calculate E[XY] using the correlations of phi_1 and phi_2.
  2. Since phi_1 and phi_2 are normal densities, we can write them out explicitly to get f(x,y) and check the independence of X and Y.

Chapter 5: Limit Theorems

Problem

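If the sequence of random variables satisfies the condition

then prove that $\{X_n\}$ obeys the law of large numbers.

Solution

Note that here are not necessarily ; start from the condition and the definition (convergence in probability).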

Chapter 9

https://en.wikipedia.org/wiki/Standard_error (parked here for now)