In probability theory, the multinomial distribution is a generalization of the binomial distribution. The binomial distribution is the probability distribution of the number of "successes" in $n$ independent Bernoulli trials, with the same probability of "success" on each trial. Instead of each trial resulting in "success" or "failure", imagine that each trial results in one of some fixed finite number $k$ of possible outcomes, with probabilities $p_1, \ldots, p_k$, and there are $n$ independent trials. Let $X_i$ be the number of trials that result in outcome $i$. The multinomial distribution is the probability distribution of the random vector $(X_1, \ldots, X_k)$, which satisfies the constraint

$$X_1 + \cdots + X_k = n.$$

The probabilities are given by

$$P(X_1 = x_1, \ldots, X_k = x_k) = \frac{n!}{x_1! \cdots x_k!} \, p_1^{x_1} \cdots p_k^{x_k}$$

for non-negative integers $x_1, \ldots, x_k$ if $x_1 + \cdots + x_k = n$, and 0 otherwise.
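
As an illustration, the probability of a given vector of counts can be evaluated directly from the definition. The following is a minimal sketch in Python using only the standard library; the function name `multinomial_pmf` and its argument order are assumptions made for this example, not part of any particular library.

```python
from math import factorial, prod

def multinomial_pmf(counts, n, probs):
    """Probability that n independent trials yield exactly these counts.

    counts: non-negative integers (x_1, ..., x_k), one per outcome
    probs:  outcome probabilities (p_1, ..., p_k), assumed to sum to 1
    """
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    # The probability is 0 unless the counts are non-negative and sum to n.
    if any(x < 0 for x in counts) or sum(counts) != n:
        return 0.0
    # Multinomial coefficient n! / (x_1! ... x_k!), kept exact in integers.
    coeff = factorial(n)
    for x in counts:
        coeff //= factorial(x)
    return coeff * prod(p ** x for p, x in zip(probs, counts))

# Three trials, three outcomes, one trial landing on each outcome:
# 3!/(1! 1! 1!) * 0.5 * 0.3 * 0.2, i.e. approximately 0.18.
example = multinomial_pmf((1, 1, 1), 3, (0.5, 0.3, 0.2))
```

With two outcomes the same function reduces to the binomial probability; for instance, `multinomial_pmf((2, 1), 3, (0.5, 0.5))` is the probability of two successes in three fair trials.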

Each of the $k$ components $X_i$ of the random vector separately has a binomial distribution with parameters $n$ and $p_i$. Because the components must sum to $n$, they are negatively correlated: a larger count for one outcome leaves fewer trials for the others.
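
The binomial marginal can be checked numerically in a small case by summing the joint probabilities over every outcome that fixes the first component. The sketch below assumes three outcomes; the helper names `multinomial_pmf` and `marginal_first_component` are hypothetical and chosen only for this example.

```python
from math import comb, factorial

def multinomial_pmf(counts, n, probs):
    """Joint probability of the counts; 0 unless they sum to n."""
    if sum(counts) != n or any(x < 0 for x in counts):
        return 0.0
    coeff = factorial(n)
    value = 1.0
    for x, p in zip(counts, probs):
        coeff //= factorial(x)
        value *= p ** x
    return coeff * value

def marginal_first_component(x1, n, probs):
    """Sum the joint probabilities over all outcomes whose first count is x1.

    Written for exactly three outcomes to keep the enumeration short.
    """
    return sum(
        multinomial_pmf((x1, x2, n - x1 - x2), n, probs)
        for x2 in range(n - x1 + 1)
    )

n, probs = 5, (0.5, 0.3, 0.2)
for x1 in range(n + 1):
    # Binomial probability with parameters n and p_1.
    binomial = comb(n, x1) * probs[0] ** x1 * (1 - probs[0]) ** (n - x1)
    assert abs(marginal_first_component(x1, n, probs) - binomial) < 1e-12
```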

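The expected value of the $i$-th component $X_i$ is

$$\operatorname{E}(X_i) = n p_i.$$

The covariance matrix is as follows. Each diagonal entry is the variance of a binomially distributed random variable, and is therefore

$$\operatorname{var}(X_i) = n p_i (1 - p_i).$$

The off-diagonal entries are the covariances. These are

$$\operatorname{cov}(X_i, X_j) = -n p_i p_j$$

for $i$, $j$ distinct. This is a $k \times k$ nonnegative-definite matrix, of rank $k - 1$ when all the $p_i$ are positive.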
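
The mean vector and covariance matrix can be built directly from n and the outcome probabilities. The following Python sketch uses hypothetical function names and plain nested lists. It also checks that every row of the covariance matrix sums to zero. This follows from the components summing to the constant n, and it is why the rank cannot exceed k − 1.

```python
def multinomial_mean(n, probs):
    """Expected count for each outcome: n * p_i."""
    return [n * p for p in probs]

def multinomial_covariance(n, probs):
    """k-by-k covariance matrix as a list of rows.

    Diagonal entries are n * p_i * (1 - p_i);
    off-diagonal entries are -n * p_i * p_j.
    """
    k = len(probs)
    return [
        [
            n * probs[i] * (1 - probs[i]) if i == j else -n * probs[i] * probs[j]
            for j in range(k)
        ]
        for i in range(k)
    ]

n, probs = 10, (0.5, 0.3, 0.2)
mean = multinomial_mean(n, probs)
cov = multinomial_covariance(n, probs)

# Each row sums to n * p_i * (1 - (p_1 + ... + p_k)) = 0, so the all-ones
# vector lies in the null space and the matrix is singular.
row_sums = [sum(row) for row in cov]
assert all(abs(s) < 1e-9 for s in row_sums)
```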

The Dirichlet distribution is the conjugate prior of the multinomial in Bayesian statistics.
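
Conjugacy means that a Dirichlet prior with parameters α1, ..., αk, combined with observed multinomial counts x1, ..., xk, gives a Dirichlet posterior with parameters α1 + x1, ..., αk + xk. The sketch below illustrates that update in Python; the function names and the uniform prior in the usage lines are assumptions made for the example.

```python
def dirichlet_posterior(alpha, counts):
    """Posterior Dirichlet parameters after observing multinomial counts."""
    if len(alpha) != len(counts):
        raise ValueError("alpha and counts must have the same length")
    return [a + x for a, x in zip(alpha, counts)]

def dirichlet_mean(alpha):
    """Mean of a Dirichlet distribution: alpha_i divided by the sum of alpha."""
    total = sum(alpha)
    return [a / total for a in alpha]

# Uniform prior over three outcomes, then ten trials with counts 3, 5 and 2.
prior = [1, 1, 1]
posterior = dirichlet_posterior(prior, [3, 5, 2])   # [4, 6, 3]
estimate = dirichlet_mean(posterior)                # [4/13, 6/13, 3/13]
```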