Random Variables

Definition

Random Variable

Given an experiment with sample space $S$, a random variable (r.v.) is a function $X$ from the sample space $S$ to the real numbers $\mathbb{R}$. It is common, but not required, to denote random variables by capital letters. Thus, a random variable $X$ assigns a numerical value $X(s)$ to each possible outcome $s$ of the experiment. The randomness comes from the fact that we have a random experiment (with probabilities described by the probability function $P$); the mapping itself is deterministic.

Example

Consider an experiment where we toss a fair coin twice. The sample space consists of four possible outcomes: $S = \{HH, HT, TH, TT\}$. Here is a random variable on this space (for practice, you can think up some of your own). Each r.v. is a numerical summary of some aspect of the experiment. Let $X$ be the number of Heads. This is a random variable with possible values $0$, $1$, and $2$. Viewed as a function, $X$ assigns the value $2$ to the outcome $HH$, $1$ to the outcomes $HT$ and $TH$, and $0$ to the outcome $TT$. That is,

$$X(HH) = 2, \quad X(HT) = X(TH) = 1, \quad X(TT) = 0.$$
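
To make the "r.v. as a deterministic function" idea concrete, here is a short Python sketch (the names `sample_space` and `X` are ours, chosen for illustration):

```python
# Sample space for two tosses of a fair coin.
sample_space = ["HH", "HT", "TH", "TT"]

# X: the number of Heads, defined as an explicit function on the sample space.
def X(s):
    return s.count("H")

# The mapping itself is deterministic: each outcome gets a fixed value.
values = {s: X(s) for s in sample_space}
print(values)  # {'HH': 2, 'HT': 1, 'TH': 1, 'TT': 0}
```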

Distributions and Probability Mass Functions

Definition

Discrete Random Variable

A random variable $X$ is said to be discrete if there is a finite list of values $a_1, a_2, \dots, a_n$ or an infinite list of values $a_1, a_2, \dots$ such that $P(X = a_j \text{ for some } j) = 1$. If $X$ is a discrete r.v., then the finite or countably infinite set of values $x$ such that $P(X = x) > 0$ is called the support of $X$.

Definition

Probability Mass Function

The probability mass function (PMF) of a discrete r.v. $X$ is the function $p_X$ given by $p_X(x) = P(X = x)$. Note that this is positive if $x$ is in the support of $X$, and $0$ otherwise.

Note

In writing $P(X = x)$, we are using $X = x$ to denote an event, consisting of all outcomes $s$ to which $X$ assigns the number $x$. This event is also written as $\{X = x\}$; formally, $\{X = x\}$ is defined as $\{s \in S : X(s) = x\}$, but writing $\{X = x\}$ is shorter and more intuitive.

Theorem

Valid PMFs

Let $X$ be a discrete r.v. with support $x_1, x_2, \dots$ (assume these values are distinct and, for notational simplicity, that the support is countably infinite; the analogous results hold if the support is finite). The PMF $p_X$ of $X$ must satisfy the following two criteria:

  • Nonnegative: $p_X(x) > 0$ if $x = x_j$ for some $j$, and $p_X(x) = 0$ otherwise;
  • Sums to $1$: $\sum_{j=1}^{\infty} p_X(x_j) = 1$.
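
As a quick numerical check of these two criteria (a Python sketch using the coin-toss example above; exact fractions avoid floating-point issues):

```python
from fractions import Fraction

# PMF of X = number of Heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Criterion 1: positive mass at every point of the support.
nonnegative = all(p > 0 for p in pmf.values())
# Criterion 2: the masses sum to 1.
total = sum(pmf.values())
print(nonnegative, total)  # True 1
```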

Bernoulli and Binomial

Definition

Bernoulli Distribution

An r.v. $X$ is said to have the Bernoulli distribution with parameter $p$ if $P(X = 1) = p$ and $P(X = 0) = 1 - p$, where $0 < p < 1$. We write this as $X \sim \mathrm{Bern}(p)$. The symbol $\sim$ is read “is distributed as”.

Story

Bernoulli Trial

An experiment that can result in either a “success” or a “failure” (but not both) is called a Bernoulli trial. A Bernoulli random variable can be thought of as the indicator of success in a Bernoulli trial: it equals $1$ if success occurs and $0$ if failure occurs in the trial.

Definition

Indicator Random Variable

The indicator random variable of an event $A$ is the r.v. which equals $1$ if $A$ occurs and $0$ otherwise. We will denote the indicator r.v. of $A$ by $I_A$ or $I(A)$. Note that $I_A \sim \mathrm{Bern}(p)$ with $p = P(A)$.

Story

Binomial Distribution

Suppose that $n$ independent Bernoulli trials are performed, each with the same success probability $p$. Let $X$ be the number of successes. The distribution of $X$ is called the Binomial distribution with parameters $n$ and $p$. We write $X \sim \mathrm{Bin}(n, p)$ to mean that $X$ has the Binomial distribution with parameters $n$ and $p$, where $n$ is a positive integer and $0 < p < 1$.

Note

It is clear that $\mathrm{Bin}(1, p)$ is the same distribution as $\mathrm{Bern}(p)$: the Bernoulli is a special case of the Binomial.

Theorem

Binomial PMF

If $X \sim \mathrm{Bin}(n, p)$, then the PMF of $X$ is

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$

for $k = 0, 1, \dots, n$ (and $P(X = k) = 0$ otherwise).
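
The Binomial PMF is straightforward to compute directly from this formula; here is a minimal Python sketch (the function name `binom_pmf` is ours) that also checks that the PMF sums to $1$:

```python
from math import comb

def binom_pmf(k, n, p):
    """PMF of Bin(n, p) evaluated at k."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Example: Bin(4, 0.5) at k = 2 is C(4,2) * 0.5^2 * 0.5^2.
print(binom_pmf(2, 4, 0.5))  # 0.375

# The Bin(10, 0.5) PMF sums to 1 over k = 0, ..., 10.
total = sum(binom_pmf(k, 10, 0.5) for k in range(11))
```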

Theorem

Let $X \sim \mathrm{Bin}(n, p)$, and $q = 1 - p$ (we often use $q$ to denote the failure probability of a Bernoulli trial). Then $n - X \sim \mathrm{Bin}(n, q)$.

Corollary

Let $X \sim \mathrm{Bin}(n, p)$ with $p = 1/2$ and $n$ even. Then the distribution of $X$ is symmetric about $n/2$, in the sense that $P(X = n/2 + j) = P(X = n/2 - j)$ for all nonnegative integers $j$.
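
This symmetry is easy to verify numerically; a small Python sketch (with $n = 10$ chosen as an illustrative even value):

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k) if 0 <= k <= n else 0.0

# Check P(X = n/2 + j) = P(X = n/2 - j) for X ~ Bin(10, 1/2).
n = 10
symmetric = all(
    abs(binom_pmf(n // 2 + j, n, 0.5) - binom_pmf(n // 2 - j, n, 0.5)) < 1e-12
    for j in range(n // 2 + 1)
)
print(symmetric)  # True
```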

Hypergeometric

Story

Hypergeometric Distribution

Consider an urn with $w$ white balls and $b$ black balls. We draw $n$ balls out of the urn at random without replacement, such that all $\binom{w+b}{n}$ samples are equally likely. Let $X$ be the number of white balls in the sample. Then $X$ is said to have the Hypergeometric distribution with parameters $w$, $b$, and $n$; we denote this by $X \sim \mathrm{HGeom}(w, b, n)$.

Theorem

Hypergeometric PMF

If $X \sim \mathrm{HGeom}(w, b, n)$, then the PMF of $X$ is

$$P(X = k) = \frac{\binom{w}{k}\binom{b}{n-k}}{\binom{w+b}{n}}$$

for integers $k$ satisfying $0 \le k \le w$ and $0 \le n - k \le b$, and $P(X = k) = 0$ otherwise.
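
A direct Python sketch of this PMF (the name `hgeom_pmf` is ours), including the support condition, with a check that it sums to $1$:

```python
from math import comb

def hgeom_pmf(k, w, b, n):
    """PMF of HGeom(w, b, n) evaluated at k."""
    if not (0 <= k <= w and 0 <= n - k <= b):
        return 0.0  # k outside the support
    return comb(w, k) * comb(b, n - k) / comb(w + b, n)

# Example: 6 white, 4 black, draw 5; the PMF sums to 1 by Vandermonde's identity.
total = sum(hgeom_pmf(k, 6, 4, 5) for k in range(6))
print(abs(total - 1.0) < 1e-9)  # True
```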

Theorem

The $\mathrm{HGeom}(w, b, n)$ and $\mathrm{HGeom}(n, w + b - n, w)$ distributions are identical. That is, if $X \sim \mathrm{HGeom}(w, b, n)$ and $Y \sim \mathrm{HGeom}(n, w + b - n, w)$, then $X$ and $Y$ have the same distribution.
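
We can confirm this identity exactly (for one illustrative choice of parameters) using rational arithmetic:

```python
from fractions import Fraction
from math import comb

def hgeom_pmf(k, w, b, n):
    if not (0 <= k <= w and 0 <= n - k <= b):
        return Fraction(0)
    return Fraction(comb(w, k) * comb(b, n - k), comb(w + b, n))

# HGeom(w, b, n) and HGeom(n, w + b - n, w) agree at every k.
w, b, n = 6, 4, 5
same = all(hgeom_pmf(k, w, b, n) == hgeom_pmf(k, n, w + b - n, w)
           for k in range(max(w, n) + 1))
print(same)  # True
```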

Discrete Uniform

Story

Discrete Uniform Distribution

Let $C$ be a finite, nonempty set of numbers. Choose one of these numbers uniformly at random (i.e., all values in $C$ are equally likely). Call the chosen number $X$. Then $X$ is said to have the Discrete Uniform distribution with parameter $C$; we denote this by $X \sim \mathrm{DUnif}(C)$.

Theorem

The PMF of $X \sim \mathrm{DUnif}(C)$ is

$$P(X = x) = \frac{1}{|C|}$$

for $x \in C$ (and $0$ otherwise), since a PMF must sum to $1$. As with questions based on the naive definition of probability, questions based on a Discrete Uniform distribution reduce to counting problems. Specifically, for $X \sim \mathrm{DUnif}(C)$ and any $A \subseteq C$, we have

$$P(X \in A) = \frac{|A|}{|C|}.$$
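
For example, a fair die roll is $\mathrm{DUnif}(\{1, \dots, 6\})$; a brief Python sketch of the counting reduction (exact fractions, names ours):

```python
from fractions import Fraction

# DUnif(C) with C = {1, ..., 6}: a fair die roll.
C = set(range(1, 7))
pmf = {x: Fraction(1, len(C)) for x in C}

# P(X in A) = |A| / |C|; here A is the set of even faces.
A = {2, 4, 6}
prob_A = sum(pmf[x] for x in A)
print(prob_A)  # 1/2
```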

Cumulative Distribution Functions

Definition

The cumulative distribution function (CDF) of an r.v. $X$ is the function $F_X$ given by $F_X(x) = P(X \le x)$. When there is no risk of ambiguity, we sometimes drop the subscript and just write $F$ (or some other letter) for a CDF.

Theorem

Valid CDFs

Any CDF $F$ has the following properties.

  • Increasing: If $x_1 \le x_2$, then $F(x_1) \le F(x_2)$.
  • Right-continuous: The CDF is continuous except possibly for having some jumps. Wherever there is a jump, the CDF is continuous from the right. That is, for any $a$, we have $F(a) = \lim_{x \to a^+} F(x)$.
  • Convergence to $0$ and $1$ in the limits: $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$.

Functions of Random Variables

Definition

Function of an r.v.

For an experiment with sample space $S$, an r.v. $X$, and a function $g : \mathbb{R} \to \mathbb{R}$, $g(X)$ is the r.v. that maps $s$ to $g(X(s))$ for all $s \in S$.

Theorem

PMF of

Let $X$ be a discrete r.v. and $g : \mathbb{R} \to \mathbb{R}$. Then the support of $g(X)$ is the set of all $y$ such that $g(x) = y$ for at least one $x$ in the support of $X$, and the PMF of $g(X)$ is

$$P(g(X) = y) = \sum_{x : g(x) = y} P(X = x)$$

for all $y$ in the support of $g(X)$.
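
The summation over $\{x : g(x) = y\}$ translates directly into code; a Python sketch (the PMF values and the non-injective $g$ below are illustrative choices of ours):

```python
from collections import defaultdict

# PMF of X = number of Heads in two fair coin tosses.
pmf_X = {0: 0.25, 1: 0.5, 2: 0.25}

def g(x):
    return (x - 1) ** 2  # an illustrative non-injective function

# PMF of g(X): for each y, sum P(X = x) over all x with g(x) = y.
pmf_gX = defaultdict(float)
for x, p in pmf_X.items():
    pmf_gX[g(x)] += p

# g maps 0 and 2 to 1, and 1 to 0, so the masses at 0 and 2 merge.
print(dict(pmf_gX))
```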

Definition

Functions of two r.v.s

Given an experiment with sample space $S$, if $X$ and $Y$ are r.v.s that map $s \in S$ to $X(s)$ and $Y(s)$ respectively, then for $g : \mathbb{R}^2 \to \mathbb{R}$, $g(X, Y)$ is the r.v. that maps $s$ to $g(X(s), Y(s))$.

Independence of r.v.s

Definition

Independence of two r.v.s

Random variables $X$ and $Y$ are said to be independent if

$$P(X \le x, Y \le y) = P(X \le x)P(Y \le y)$$

for all $x, y \in \mathbb{R}$. In the discrete case, this is equivalent to the condition

$$P(X = x, Y = y) = P(X = x)P(Y = y)$$

for all $x, y$, with $x$ in the support of $X$ and $y$ in the support of $Y$.

Definition

Independence of many r.v.s

Random variables $X_1, \dots, X_n$ are independent if

$$P(X_1 \le x_1, \dots, X_n \le x_n) = P(X_1 \le x_1) \cdots P(X_n \le x_n)$$

for all $x_1, \dots, x_n \in \mathbb{R}$. For infinitely many r.v.s, we say that they are independent if every finite subset of the r.v.s is independent.

Theorem

Functions of independent r.v.s

If $X$ and $Y$ are independent r.v.s, then any function of $X$ is independent of any function of $Y$.

Definition

i.i.d.

We will often work with random variables that are independent and have the same distribution. We call such r.v.s independent and identically distributed, or i.i.d. for short.

Theorem

If $X \sim \mathrm{Bin}(n, p)$, viewed as the number of successes in $n$ independent Bernoulli trials with success probability $p$, then we can write $X = X_1 + \dots + X_n$ where the $X_i$ are i.i.d. $\mathrm{Bern}(p)$.
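
This decomposition gives a natural way to simulate a Binomial draw: add up $n$ i.i.d. Bernoulli indicators. A Python sketch (parameter values and names are ours; the loose tolerance reflects simulation noise):

```python
import random

random.seed(0)
n, p, trials = 10, 0.3, 10_000

# One Bin(n, p) draw as a sum of n i.i.d. Bern(p) indicators.
def draw_binomial(n, p):
    return sum(random.random() < p for _ in range(n))

draws = [draw_binomial(n, p) for _ in range(trials)]
mean = sum(draws) / trials
# The empirical mean should be close to n * p = 3.
print(abs(mean - n * p) < 0.1)  # True
```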

Theorem

If $X \sim \mathrm{Bin}(n, p)$, $Y \sim \mathrm{Bin}(m, p)$, and $X$ is independent of $Y$, then $X + Y \sim \mathrm{Bin}(n + m, p)$.
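
One way to see this numerically is to convolve the two PMFs and compare with the $\mathrm{Bin}(n + m, p)$ PMF; a Python sketch with illustrative parameters:

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k) if 0 <= k <= n else 0.0

# PMF of X + Y for independent X ~ Bin(3, 0.4), Y ~ Bin(4, 0.4), by convolution:
# P(X + Y = k) = sum over j of P(X = j) P(Y = k - j).
n, m, p = 3, 4, 0.4
conv = [sum(binom_pmf(j, n, p) * binom_pmf(k - j, m, p) for j in range(k + 1))
        for k in range(n + m + 1)]
matches = all(abs(conv[k] - binom_pmf(k, n + m, p)) < 1e-9
              for k in range(n + m + 1))
print(matches)  # True
```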

Definition

Conditional independence of r.v.s

Random variables $X$ and $Y$ are conditionally independent given an r.v. $Z$ if for all $x, y \in \mathbb{R}$ and all $z$ in the support of $Z$,

$$P(X \le x, Y \le y \mid Z = z) = P(X \le x \mid Z = z)P(Y \le y \mid Z = z).$$

For discrete r.v.s, an equivalent definition is to require

$$P(X = x, Y = y \mid Z = z) = P(X = x \mid Z = z)P(Y = y \mid Z = z).$$

Definition

Conditional PMF

For any discrete r.v.s $X$ and $Z$, the function $P(X = x \mid Z = z)$, when considered as a function of $x$ for fixed $z$, is called the conditional PMF of $X$ given $Z = z$.

Connections between Binomial and Hypergeometric

Theorem

If $X \sim \mathrm{Bin}(n, p)$, $Y \sim \mathrm{Bin}(m, p)$, and $X$ is independent of $Y$, then the conditional distribution of $X$ given $X + Y = r$ is $\mathrm{HGeom}(n, m, r)$.
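
A numerical check of this result: compute $P(X = k \mid X + Y = r) = P(X = k)P(Y = r - k)/P(X + Y = r)$ directly and compare with the $\mathrm{HGeom}(n, m, r)$ PMF (parameter values are illustrative; note the answer does not depend on $p$):

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k) if 0 <= k <= n else 0.0

def hgeom_pmf(k, w, b, n):
    if not (0 <= k <= w and 0 <= n - k <= b):
        return 0.0
    return comb(w, k) * comb(b, n - k) / comb(w + b, n)

n, m, p, r = 5, 7, 0.3, 4
denom = binom_pmf(r, n + m, p)  # P(X + Y = r), since X + Y ~ Bin(n + m, p)
ok = all(
    abs(binom_pmf(k, n, p) * binom_pmf(r - k, m, p) / denom
        - hgeom_pmf(k, n, m, r)) < 1e-9
    for k in range(r + 1)
)
print(ok)  # True
```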

Theorem

If $X \sim \mathrm{HGeom}(w, b, n)$ and $w + b \to \infty$ such that $p = w/(w + b)$ remains fixed, then the PMF of $X$ converges to the $\mathrm{Bin}(n, p)$ PMF.
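
Intuitively, when the urn is huge, sampling without replacement is nearly the same as sampling with replacement. A Python sketch of the convergence (urn sizes chosen for illustration):

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k) if 0 <= k <= n else 0.0

def hgeom_pmf(k, w, b, n):
    if not (0 <= k <= w and 0 <= n - k <= b):
        return 0.0
    return comb(w, k) * comb(b, n - k) / comb(w + b, n)

# Grow the urn with w / (w + b) = p fixed and track the worst-case PMF gap.
n, p = 5, 0.3
errors = []
for total in (10, 100, 10_000):
    w = int(total * p)
    b = total - w
    errors.append(max(abs(hgeom_pmf(k, w, b, n) - binom_pmf(k, n, p))
                      for k in range(n + 1)))
print(errors[0] > errors[-1])  # True: the gap shrinks as the urn grows
```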