Random Variables
Definition
Random Variable
Given an experiment with sample space $S$, a random variable (r.v.) is a function from the sample space $S$ to the real numbers $\mathbb{R}$. It is common, but not required, to denote random variables by capital letters. Thus, a random variable $X$ assigns a numerical value $X(s)$ to each possible outcome $s$ of the experiment. The randomness comes from the fact that we have a random experiment (with probabilities described by the probability function $P$); the mapping itself is deterministic.
Example
Consider an experiment where we toss a fair coin twice. The sample space consists of four possible outcomes: $S = \{HH, HT, TH, TT\}$. Here are some random variables on this space (for practice, you can think up some of your own). Each r.v. is a numerical summary of some aspect of the experiment. Let $X$ be the number of Heads. This is a random variable with possible values $0$, $1$, and $2$. Viewed as a function, $X$ assigns the value $2$ to the outcome $HH$, $1$ to the outcomes $HT$ and $TH$, and $0$ to the outcome $TT$. That is,
$$X(HH) = 2, \quad X(HT) = X(TH) = 1, \quad X(TT) = 0.$$
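As an illustrative sketch (not from the text), the example above can be checked by brute-force enumeration in Python; the names `outcomes`, `X`, and `pmf` are our own.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of two fair coin tosses.
outcomes = ["".join(t) for t in product("HT", repeat=2)]  # HH, HT, TH, TT

# X is a deterministic function of the outcome: the number of Heads.
X = {s: s.count("H") for s in outcomes}

# Tally P(X = x) by summing the 1/4 probability of each outcome mapped to x.
pmf = Counter()
for s in outcomes:
    pmf[X[s]] += Fraction(1, 4)
```

Exact rational arithmetic via `Fraction` avoids float round-off; the tally recovers $P(X=0)=P(X=2)=1/4$ and $P(X=1)=1/2$.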
Distributions and Probability Mass Functions
Definition
Discrete Random Variable
A random variable $X$ is said to be discrete if there is a finite list of values $a_1, a_2, \ldots, a_n$ or an infinite list of values $a_1, a_2, \ldots$ such that $P(X = a_j \text{ for some } j) = 1$. If $X$ is a discrete r.v., then the finite or countably infinite set of values $x$ such that $P(X = x) > 0$ is called the support of $X$.
Definition
Probability Mass Function
The probability mass function (PMF) of a discrete r.v. $X$ is the function $p_X$ given by $p_X(x) = P(X = x)$. Note that this is positive if $x$ is in the support of $X$, and $0$ otherwise.
Note
In writing $P(X = x)$, we are using $X = x$ to denote an event, consisting of all outcomes $s$ to which $X$ assigns the number $x$. This event is also written as $\{X = x\}$; formally, $\{X = x\}$ is defined as $\{s \in S : X(s) = x\}$, but writing $\{X = x\}$ is shorter and more intuitive.
Theorem
Valid PMFs
Let $X$ be a discrete r.v. with support $x_1, x_2, \ldots$ (assume these values are distinct and, for notational simplicity, that the support is countably infinite; the analogous results hold if the support is finite). The PMF $p_X$ of $X$ must satisfy the following two criteria:
- Nonnegative: $p_X(x) > 0$ if $x = x_j$ for some $j$, and $p_X(x) = 0$ otherwise;
- Sums to $1$: $\sum_{j=1}^{\infty} p_X(x_j) = 1$.
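The two criteria can be verified mechanically. A minimal sketch, assuming a PMF stored as a dict from support points to probabilities (the helper `is_valid_pmf` is ours, not from the text):

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Check the two criteria: every value positive, and the values sum to 1.

    Exact arithmetic (Fractions) sidesteps float round-off in the sum.
    """
    return all(p > 0 for p in pmf.values()) and sum(pmf.values()) == 1

# PMF of the number of Heads in two fair coin tosses.
pmf_X = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
```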
Bernoulli and Binomial
Definition
Bernoulli Distribution
An r.v. $X$ is said to have the Bernoulli distribution with parameter $p$ if $P(X = 1) = p$ and $P(X = 0) = 1 - p$, where $0 < p < 1$. We write this as $X \sim \mathrm{Bern}(p)$. The symbol $\sim$ is read "is distributed as".
Story
Bernoulli Trial
An experiment that can result in either a "success" or a "failure" (but not both) is called a Bernoulli trial. A Bernoulli random variable can be thought of as the indicator of success in a Bernoulli trial: it equals $1$ if success occurs and $0$ if failure occurs in the trial.
Definition
Indicator Random Variable
The indicator random variable of an event $A$ is the r.v. which equals $1$ if $A$ occurs and $0$ otherwise. We will denote the indicator r.v. of $A$ by $I_A$ or $I(A)$. Note that $I_A \sim \mathrm{Bern}(p)$ with $p = P(A)$.
Story
Binomial Distribution
Suppose that $n$ independent Bernoulli trials are performed, each with the same success probability $p$. Let $X$ be the number of successes. The distribution of $X$ is called the Binomial distribution with parameters $n$ and $p$. We write $X \sim \mathrm{Bin}(n, p)$ to mean that $X$ has the Binomial distribution with parameters $n$ and $p$, where $n$ is a positive integer and $0 < p < 1$.
Note
It is clear that $\mathrm{Bern}(p)$ is the same distribution as $\mathrm{Bin}(1, p)$: the Bernoulli is a special case of the Binomial.
Theorem
Binomial PMF
If $X \sim \mathrm{Bin}(n, p)$, then the PMF of $X$ is
$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$
for $k = 0, 1, \ldots, n$ (and $P(X = k) = 0$ otherwise).
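As a sketch, the Binomial PMF translates directly into code with the standard library's `math.comb`; the helper name `binomial_pmf` is our own.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# The PMF sums to 1 over k = 0, ..., n (the binomial theorem).
total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))
```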
Theorem
Let $X \sim \mathrm{Bin}(n, p)$, and $q = 1 - p$ (we often use $q$ to denote the failure probability of a Bernoulli trial). Then $n - X \sim \mathrm{Bin}(n, q)$.
Corollary
Let $X \sim \mathrm{Bin}(n, 1/2)$, with $n$ even. Then the distribution of $X$ is symmetric about $n/2$, in the sense that $P(X = n/2 + j) = P(X = n/2 - j)$ for all nonnegative integers $j$.
Hypergeometric
Story
Hypergeometric Distribution
Consider an urn with $w$ white balls and $b$ black balls. We draw $n$ balls out of the urn at random without replacement, such that all $\binom{w+b}{n}$ samples are equally likely. Let $X$ be the number of white balls in the sample. Then $X$ is said to have the Hypergeometric distribution with parameters $w$, $b$, and $n$; we denote this by $X \sim \mathrm{HGeom}(w, b, n)$.
Theorem
Hypergeometric PMF
If $X \sim \mathrm{HGeom}(w, b, n)$, then the PMF of $X$ is
$$P(X = k) = \frac{\binom{w}{k}\binom{b}{n-k}}{\binom{w+b}{n}}$$
for integers $k$ satisfying $0 \le k \le w$ and $0 \le n - k \le b$, and $P(X = k) = 0$ otherwise.
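A minimal sketch of this PMF in Python (the helper name `hypergeom_pmf` is our own):

```python
from math import comb

def hypergeom_pmf(k, w, b, n):
    """P(X = k) for X ~ HGeom(w, b, n); 0 outside the valid range of k."""
    if k < 0 or k > w or n - k < 0 or n - k > b:
        return 0.0
    return comb(w, k) * comb(b, n - k) / comb(w + b, n)

# Example: probability of 3 white balls among 5 drawn from 6 white, 4 black.
p3 = hypergeom_pmf(3, 6, 4, 5)
```

Summing over all $k$ gives $1$, which is Vandermonde's identity $\sum_k \binom{w}{k}\binom{b}{n-k} = \binom{w+b}{n}$ in disguise.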
Theorem
The $\mathrm{HGeom}(w, b, n)$ and $\mathrm{HGeom}(n, w + b - n, w)$ distributions are identical. That is, if $X \sim \mathrm{HGeom}(w, b, n)$ and $Y \sim \mathrm{HGeom}(n, w + b - n, w)$, then $X$ and $Y$ have the same distribution.
Discrete Uniform
Story
Discrete Uniform Distribution
Let $C$ be a finite, nonempty set of numbers. Choose one of these numbers uniformly at random (i.e., all values in $C$ are equally likely). Call the chosen number $X$. Then $X$ is said to have the Discrete Uniform distribution with parameter $C$; we denote this by $X \sim \mathrm{DUnif}(C)$.
Theorem
The PMF of $X \sim \mathrm{DUnif}(C)$ is
$$P(X = x) = \frac{1}{|C|}$$
for $x \in C$ (and $0$ otherwise), since a PMF must sum to $1$. As with questions based on the naive definition of probability, questions based on a Discrete Uniform distribution reduce to counting problems. Specifically, for $X \sim \mathrm{DUnif}(C)$ and any $A \subseteq C$, we have
$$P(X \in A) = \frac{|A|}{|C|}.$$
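The counting-problem reduction is a one-liner in code. A sketch, assuming $A \subseteq C$ (the helper `dunif_prob` is hypothetical, not from the text):

```python
from fractions import Fraction

def dunif_prob(A, C):
    """P(X in A) for X ~ DUnif(C), assuming A is a subset of C."""
    return Fraction(len(A), len(C))

C = set(range(1, 7))  # a fair die roll: DUnif({1, ..., 6})
A = {2, 4, 6}         # the event "X is even"
p_even = dunif_prob(A, C)
```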
Cumulative Distribution Functions
Definition
The cumulative distribution function (CDF) of an r.v. $X$ is the function $F_X$ given by $F_X(x) = P(X \le x)$. When there is no risk of ambiguity, we sometimes drop the subscript and just write $F$ (or some other letter) for a CDF.
Theorem
Valid CDFs
Any CDF has the following properties.
- Increasing: If $x_1 \le x_2$, then $F(x_1) \le F(x_2)$.
- Right-continuous: The CDF is continuous except possibly for having some jumps. Wherever there is a jump, the CDF is continuous from the right. That is, for any $a$, we have
$$F(a) = \lim_{x \to a^+} F(x).$$
- Convergence to $0$ and $1$ in the limits:
$$\lim_{x \to -\infty} F(x) = 0, \quad \lim_{x \to +\infty} F(x) = 1.$$
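For a discrete r.v., the CDF is a step function obtained by accumulating the PMF. A sketch under that assumption (the helper `cdf_from_pmf` is ours):

```python
from fractions import Fraction

def cdf_from_pmf(pmf, x):
    """F(x) = P(X <= x): accumulate PMF mass at support points <= x."""
    return sum(p for val, p in pmf.items() if val <= x)

# CDF of the number of Heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
# The CDF jumps by the PMF value at each support point, is flat in
# between, and F(x) includes the jump at x itself (right-continuity).
```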
Functions of Random Variables
Definition
Function of an r.v.
For an experiment with sample space $S$, an r.v. $X$, and a function $g: \mathbb{R} \to \mathbb{R}$, $g(X)$ is the r.v. that maps $s$ to $g(X(s))$ for all $s \in S$.
Theorem
PMF of $g(X)$
Let $X$ be a discrete r.v. and $g: \mathbb{R} \to \mathbb{R}$. Then the support of $g(X)$ is the set of all $y$ such that $g(x) = y$ for at least one $x$ in the support of $X$, and the PMF of $g(X)$ is
$$P(g(X) = y) = \sum_{x:\, g(x) = y} P(X = x)$$
for all $y$ in the support of $g(X)$.
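The theorem amounts to grouping the PMF of $X$ by the value of $g$. A sketch (the helper `pmf_of_g` and the example are ours):

```python
from collections import defaultdict
from fractions import Fraction

def pmf_of_g(pmf_X, g):
    """PMF of g(X): for each y, sum P(X = x) over all x with g(x) = y."""
    pmf_Y = defaultdict(Fraction)  # Fraction() == 0
    for x, p in pmf_X.items():
        pmf_Y[g(x)] += p
    return dict(pmf_Y)

# X uniform on {-2, -1, 0, 1, 2}; Y = X^2 merges x and -x into one value.
pmf_X = {x: Fraction(1, 5) for x in range(-2, 3)}
pmf_Y = pmf_of_g(pmf_X, lambda x: x * x)
```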
Definition
Functions of two r.v.s
Given an experiment with sample space $S$, if $X$ and $Y$ are r.v.s that map $s \in S$ to $X(s)$ and $Y(s)$ respectively, then $g(X, Y)$ is the r.v. that maps $s$ to $g(X(s), Y(s))$.
Independence of r.v.s
Definition
Independence of two r.v.s
Random variables $X$ and $Y$ are said to be independent if
$$P(X \le x, Y \le y) = P(X \le x)P(Y \le y)$$
for all $x, y \in \mathbb{R}$. In the discrete case, this is equivalent to the condition
$$P(X = x, Y = y) = P(X = x)P(Y = y)$$
for all $x, y$, with $x$ in the support of $X$ and $y$ in the support of $Y$.
Definition
Independence of many r.v.s
Random variables $X_1, \ldots, X_n$ are independent if
$$P(X_1 \le x_1, \ldots, X_n \le x_n) = P(X_1 \le x_1) \cdots P(X_n \le x_n)$$
for all $x_1, \ldots, x_n \in \mathbb{R}$. For infinitely many r.v.s, we say that they are independent if every finite subset of the r.v.s is independent.
Theorem
Functions of independent r.v.s
If $X$ and $Y$ are independent r.v.s, then any function of $X$ is independent of any function of $Y$.
Definition
i.i.d.
We will often work with random variables that are independent and have the same distribution. We call such r.v.s independent and identically distributed, or i.i.d. for short.
Theorem
If $X \sim \mathrm{Bin}(n, p)$, viewed as the number of successes in $n$ independent Bernoulli trials with success probability $p$, then we can write $X = X_1 + \cdots + X_n$ where the $X_i$ are i.i.d. $\mathrm{Bern}(p)$.
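This representation gives a direct way to simulate a Binomial draw. A sketch using the standard library's `random` module (the helper `bern` is our own):

```python
import random

random.seed(0)  # reproducibility of the sketch

def bern(p):
    """One Bernoulli(p) trial: 1 with probability p, else 0."""
    return 1 if random.random() < p else 0

n, p = 10, 0.3
# One Bin(n, p) draw as a sum of n i.i.d. Bern(p) indicators.
X = sum(bern(p) for _ in range(n))
```

Averaging many such draws should land near $np$, the mean of the Binomial.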
Theorem
If $X \sim \mathrm{Bin}(n, p)$, $Y \sim \mathrm{Bin}(m, p)$, and $X$ is independent of $Y$, then $X + Y \sim \mathrm{Bin}(n + m, p)$.
Definition
Conditional independence of r.v.s
Random variables $X$ and $Y$ are conditionally independent given an r.v. $Z$ if for all $x, y \in \mathbb{R}$ and all $z$ in the support of $Z$,
$$P(X \le x, Y \le y \mid Z = z) = P(X \le x \mid Z = z)P(Y \le y \mid Z = z).$$
For discrete r.v.s, an equivalent definition is to require
$$P(X = x, Y = y \mid Z = z) = P(X = x \mid Z = z)P(Y = y \mid Z = z).$$
Definition
Conditional PMF
For any discrete r.v.s $X$ and $Z$, the function $P(X = x \mid Z = z)$, when considered as a function of $x$ for fixed $z$, is called the conditional PMF of $X$ given $Z = z$.
Connections between Binomial and Hypergeometric
Theorem
If $X \sim \mathrm{Bin}(n, p)$, $Y \sim \mathrm{Bin}(m, p)$, and $X$ is independent of $Y$, then the conditional distribution of $X$ given $X + Y = r$ is $\mathrm{HGeom}(n, m, r)$.
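This can be checked with exact arithmetic for small parameters; the names below (`bin_pmf`, `conditional`, `hgeom`) are ours, and the particular $p$ is arbitrary since the hypergeometric answer does not involve $p$.

```python
from fractions import Fraction
from math import comb

def bin_pmf(k, n, p):
    """Exact Bin(n, p) PMF with p given as a Fraction."""
    if not 0 <= k <= n:
        return Fraction(0)
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, m, r, p = 5, 7, 4, Fraction(1, 3)

# P(X = k | X + Y = r), computed from the two independent Binomials.
joint = {k: bin_pmf(k, n, p) * bin_pmf(r - k, m, p) for k in range(r + 1)}
total = sum(joint.values())
conditional = {k: q / total for k, q in joint.items()}

# The HGeom(n, m, r) PMF -- note that p has dropped out entirely.
hgeom = {k: Fraction(comb(n, k) * comb(m, r - k), comb(n + m, r))
         for k in range(r + 1)}
```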
Theorem
If $X \sim \mathrm{HGeom}(w, b, n)$ and $w + b \to \infty$ such that $p = w/(w+b)$ remains fixed, then the PMF of $X$ converges to the $\mathrm{Bin}(n, p)$ PMF.
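The convergence can be observed numerically by growing the urn while holding the white-ball fraction fixed. A sketch with parameter values of our choosing:

```python
from math import comb

def hgeom_pmf(k, w, b, n):
    """HGeom(w, b, n) PMF; 0 outside the valid range of k."""
    if k < 0 or k > w or n - k < 0 or n - k > b:
        return 0.0
    return comb(w, k) * comb(b, n - k) / comb(w + b, n)

def bin_pmf(k, n, p):
    """Bin(n, p) PMF."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p, k = 5, 0.3, 2

# Grow the urn with w / (w + b) = p held fixed; drawing without
# replacement then barely changes the urn, so the gap to Bin(n, p) shrinks.
gaps = []
for total in (10, 100, 10000):
    w = round(p * total)  # 3, 30, 3000: w / total = 0.3 exactly
    gaps.append(abs(hgeom_pmf(k, w, total - w, n) - bin_pmf(k, n, p)))
```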