In statistics, a population is a set of similar items or events which is of interest for some question or experiment.[1][2] A statistical population can be a group of existing objects (e.g. the set of all stars within the Milky Way galaxy) or a hypothetical and potentially infinite group of objects conceived as a generalization from experience (e.g. the set of all possible hands in a game of poker).[3] A population with finitely many values in the support[4] of the population distribution is a finite population with population size N. A population with infinitely many values in the support is called an infinite population.
A common aim of statistical analysis is to produce information about some chosen population.[5] In statistical inference, a subset of the population (a statistical sample) is chosen to represent the population in a statistical analysis.[6] For the resulting inferences to be valid, the sample must be unbiased and must accurately represent the population. The ratio of the size of the sample to the size of the population is called the sampling fraction. Population parameters can then be estimated using the appropriate sample statistics.[7]
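As an illustration of the sampling fraction and of estimating a population parameter from a sample statistic, the following is a minimal Python sketch. The population here is synthetic (10,000 values drawn from a normal distribution with a fixed seed), and the function name `sampling_fraction` is a hypothetical helper chosen for this example, not part of any library.

```python
import random
import statistics


def sampling_fraction(sample_size: int, population_size: int) -> float:
    """Return the ratio of the sample size to the population size."""
    if not 0 < sample_size <= population_size:
        raise ValueError("sample size must be between 1 and the population size")
    return sample_size / population_size


# Synthetic finite population (an assumption made purely for illustration).
rng = random.Random(0)
population = [rng.gauss(50.0, 10.0) for _ in range(10_000)]

# Simple random sample drawn without replacement.
sample = rng.sample(population, 500)

f = sampling_fraction(len(sample), len(population))

# The sample mean is the sample statistic used to estimate
# the population mean (the population parameter).
population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

print(f"sampling fraction: {f:.2%}")
print(f"population mean:   {population_mean:.3f}")
print(f"sample mean:       {sample_mean:.3f}")
```

In practice the population mean is usually unknown; it is computed here only so that the estimate can be compared against it.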
For finite populations, samples are typically drawn without replacement, so each sampled value is removed from the population. This violates the usual assumption that observations are independent and identically distributed (i.i.d.), so sampling from finite populations requires "finite population corrections" (which can be derived from the hypergeometric distribution). As a rough rule of thumb,[8] if the sampling fraction is below 10%, finite population corrections can be neglected to a good approximation.
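One commonly used form of the finite population correction applies to the standard error of the sample mean under simple random sampling without replacement: the usual standard error is multiplied by the factor sqrt((N − n) / (N − 1)), where N is the population size and n is the sample size. The sketch below assumes that form; the function names are hypothetical and chosen for this example.

```python
import math


def finite_population_correction(population_size: int, sample_size: int) -> float:
    """Return sqrt((N - n) / (N - 1)) for population size N and sample size n."""
    if population_size < 2 or not 0 < sample_size <= population_size:
        raise ValueError("need N >= 2 and 1 <= n <= N")
    return math.sqrt((population_size - sample_size) / (population_size - 1))


def standard_error_of_mean(sigma: float, population_size: int, sample_size: int) -> float:
    """Standard error of the sample mean when sampling without replacement.

    sigma is the population standard deviation (assumed known here).
    """
    uncorrected = sigma / math.sqrt(sample_size)
    return uncorrected * finite_population_correction(population_size, sample_size)


# Compare the correction factor at several sampling fractions for N = 1000.
for n in (10, 50, 100, 500, 1000):
    factor = finite_population_correction(1000, n)
    print(f"n = {n:4d}  sampling fraction = {n / 1000:6.1%}  correction = {factor:.4f}")
```

With a sampling fraction of 10% the factor is about 0.95, so ignoring the correction overstates the standard error by only about 5%, which is consistent with the rule of thumb above. When the whole population is sampled (n = N), the factor is zero, reflecting that the mean is then known exactly.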