From Surf Wiki (app.surf) — the open knowledge base
Effective population size
Ecological concept
The effective population size (N_e) is the size of an idealised population that would experience the same rate of genetic drift as the real population. Idealised populations are those where each locus evolves independently, following the assumptions of the neutral theory of molecular evolution. The effective population size is normally smaller than the census population size N. This can be due to chance events preventing some individuals from breeding, to occasional population bottlenecks, to background selection, and to genetic hitchhiking.
The same real population could have a different effective population size for different properties of interest, such as genetic drift (or more precisely, the speed of coalescence) over one generation vs. over many generations. Within a species, areas of the genome that have more genes and/or less genetic recombination tend to have lower effective population sizes, because of the effects of selection at linked sites. In a population with selection at many loci and abundant linkage disequilibrium, the coalescent effective population size may not reflect the census population size at all, or may reflect its logarithm.
The concept of effective population size was introduced in the field of population genetics in 1931 by the American geneticist Sewall Wright. Some versions of the effective population size are used in wildlife conservation.
Empirical measurements
In a rare experiment that directly measured genetic drift one generation at a time, in Drosophila populations of census size 16, the effective population size was 11.5. This measurement was achieved through studying changes in the frequency of a neutral allele from one generation to another in over 100 replicate populations.
More commonly, effective population size is estimated indirectly by comparing data on current within-species genetic diversity to theoretical expectations. According to the neutral theory of molecular evolution, an idealised diploid population will have a pairwise nucleotide diversity equal to 4\mu N_e, where \mu is the mutation rate. The effective population size can therefore be estimated empirically by dividing the nucleotide diversity by 4\mu. This captures the cumulative effects of genetic drift, genetic hitchhiking, and background selection over longer timescales. More advanced methods, permitting a changing effective population size over time, have also been developed.
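The division described above can be sketched in a few lines of Python. The function name and the numerical inputs are illustrative assumptions, not values taken from any study cited here.

```python
def ne_from_diversity(pi, mu):
    """Estimate a diploid coalescent N_e from pi = 4 * mu * N_e."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (4.0 * mu)


# Hypothetical inputs: diversity of 0.001 per site and a mutation
# rate of 1.25e-8 per site per generation.
estimate = ne_from_diversity(0.001, 1.25e-8)
```

The estimate inherits any error in the assumed mutation rate, since N_e scales inversely with \mu.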
The effective size measured to reflect these longer timescales may have little relationship to the number of individuals physically present in a population. Measured effective population sizes vary between genes in the same population, being low in genome areas of low recombination and high in genome areas of high recombination (Hahn, Matthew W. (2008). "Toward a selection theory of molecular evolution". Evolution 62: 255–265. doi:10.1111/j.1558-5646.2007.00308.x).
If the map of recombination frequencies along chromosomes is known, N_e can be inferred from r_P^2 = 1 / (1 + 4 N_e r), where r_P is the Pearson correlation coefficient between loci and r is the recombination frequency between them. This expression can be interpreted as the probability that two lineages coalesce before one allele on either lineage recombines onto some third lineage.
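Rearranging the expression above for N_e gives N_e = (1/r_P^2 − 1) / (4r). A minimal sketch of that rearrangement follows; the function and argument names are hypothetical, and real estimators typically also correct for sampling error in the correlation, which this sketch does not.

```python
def ne_from_ld(r_squared, recomb_rate):
    """Solve r_P^2 = 1 / (1 + 4 * N_e * r) for N_e."""
    if not 0.0 < r_squared <= 1.0:
        raise ValueError("r_squared must lie in (0, 1]")
    if recomb_rate <= 0:
        raise ValueError("recombination rate must be positive")
    return (1.0 / r_squared - 1.0) / (4.0 * recomb_rate)
```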
The population size might not be constant over time, and thus neither might the effective population size (defined as coalescence speed). With a constant population size, we expect larger pairwise Hamming distance between sequences to be rarer. Under population expansion, an intermediate Hamming distance is instead most common; this is seen for humans. A skyline plot more directly describes coalescence speed over time. The pairwise sequential Markovian coalescent and multiple sequential Markovian coalescent take the average of skyline plots over many loci. An alternative approach infers effective population size over time, together with migration among populations, using the allele frequency spectrum, describing how often alleles are rare versus common. Yet another approach exploits runs of homozygosity to incorporate information from recombination events.
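The distribution of pairwise Hamming distances mentioned above can be tabulated directly from aligned sequences. The sketch below assumes equal-length, already aligned sequences; the function names are illustrative.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def mismatch_distribution(seqs):
    """Count how often each pairwise Hamming distance occurs."""
    return Counter(hamming(a, b) for a, b in combinations(seqs, 2))
```

Plotting these counts against distance shows whether larger distances become steadily rarer or whether an intermediate distance is most common.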
A survey of publications on 102 mostly wildlife animal and plant species yielded 192 N_e/N ratios. Seven different estimation methods were used in the surveyed studies. Accordingly, the ratios ranged widely from 10^{-6} for Pacific oysters to 0.994 for humans, with an average of 0.34 across the examined species. Based on these data, the authors subsequently estimated more comprehensive ratios, accounting for fluctuations in population size, variance in family size and unequal sex-ratio. These ratios average to only 0.10-0.11.
A genealogical analysis of Inuit hunter-gatherers determined the effective-to-census population size ratio for haploid (mitochondrial DNA, Y chromosomal DNA), and diploid (autosomal DNA) loci separately: the ratio of the effective to the census population size was estimated as 0.6–0.7 for autosomal and X-chromosomal DNA, 0.7–0.9 for mitochondrial DNA and 0.5 for Y-chromosomal DNA.
Selection effective size
In an idealised Wright-Fisher model, the fate of an allele, beginning at an intermediate frequency, is largely determined by selection if the selection coefficient s ≫ 1/N, and largely determined by neutral genetic drift if s ≪ 1/N. In real populations, the cutoff value of s may depend instead on local recombination rates.
The ability of a species to differentiate between nearly neutral alleles can be measured by how codon bias differs from neutral expectations. The Ka/Ks ratio is also sometimes used as a proxy.
The drift-barrier hypothesis claims that populations with different selection effective population sizes are predicted to evolve profoundly different genome architectures.
History of theory
Ronald Fisher and Sewall Wright originally defined effective population size as "the number of breeding individuals in an idealised population that would show the same amount of dispersion of allele frequencies under random genetic drift or the same amount of inbreeding as the population under consideration". This implied two potentially different effective population sizes, based either on the one-generation increase in variance across replicate populations (variance effective population size), or on the one-generation change in the inbreeding coefficient (inbreeding effective population size). These two are closely linked, and derived from F-statistics, but they are not identical.
Today, the effective population size is usually estimated empirically with respect to the amount of within-species genetic diversity divided by the mutation rate, yielding a coalescent effective population size that reflects the cumulative effects of genetic drift, background selection, and genetic hitchhiking over longer time periods. Another important effective population size is the selection effective population size 1/s_critical, where s_critical is the critical value of the selection coefficient at which selection becomes more important than genetic drift.
Variance effective size
In the Wright-Fisher idealized population model, the conditional variance of the allele frequency p', given the allele frequency p in the previous generation, is
:\operatorname{var}(p' \mid p)= {p(1-p) \over 2N}.
Let \widehat{\operatorname{var}}(p'\mid p) denote the same, typically larger, variance in the actual population under consideration. The variance effective population size N_e^{(v)} is defined as the size of an idealized population with the same variance. This is found by substituting \widehat{\operatorname{var}}(p'\mid p) for \operatorname{var}(p'\mid p) and solving for N which gives
:N_e^{(v)} = {p(1-p) \over 2 \widehat{\operatorname{var}}(p' \mid p)}.
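As a rough illustration of this definition, the sketch below estimates the variance across replicate populations after one generation and substitutes it into the formula. The simulation uses plain binomial sampling of 2N gene copies as a stand-in for the idealised Wright-Fisher model; the function names, replicate count and seed are assumptions made for the example.

```python
import random


def variance_effective_size(p, next_freqs):
    """N_e^(v) = p(1-p) / (2 * var(p' | p)), variance taken across replicates."""
    n = len(next_freqs)
    mean = sum(next_freqs) / n
    var = sum((x - mean) ** 2 for x in next_freqs) / (n - 1)
    return p * (1.0 - p) / (2.0 * var)


def wright_fisher_step(p, n_diploid, rng):
    """Sample 2N gene copies with replacement and return the new frequency."""
    copies = sum(rng.random() < p for _ in range(2 * n_diploid))
    return copies / (2 * n_diploid)


def simulate_ne(p=0.5, n_diploid=16, replicates=20000, seed=1):
    """Estimate N_e^(v) from one generation of drift in many replicates."""
    rng = random.Random(seed)
    freqs = [wright_fisher_step(p, n_diploid, rng) for _ in range(replicates)]
    return variance_effective_size(p, freqs)
```

For an idealised population the estimate should fall close to the census size; a real population with extra variance would give a smaller value.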
In the following examples, one or more of the assumptions of a strictly idealised population are relaxed, while other assumptions are retained. The variance effective population size of the more relaxed population model is then calculated with respect to the strict model.
Variations in population size
Population size varies over time. Suppose there are t non-overlapping generations; the effective population size is then given by the harmonic mean of the population sizes:
:{1 \over N_e} = {1 \over t} \sum_{i=1}^t {1 \over N_i}
For example, say the population size was N = 10, 100, 50, 80, 20, 500 for six generations (t = 6). Then the effective population size is the harmonic mean of these, giving:
:\begin{align} {1 \over N_e} &= {\frac{1}{10} + \frac{1}{100} + \frac{1}{50} + \frac{1}{80} + \frac{1}{20} + \frac{1}{500} \over 6} \\ &= {0.1945 \over 6} \\ &= 0.032416667 \\ N_e &= 30.8 \end{align}
Note this is less than the arithmetic mean of the population size, which in this example is 126.7. The harmonic mean tends to be dominated by the smallest bottleneck that the population goes through.
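The same calculation can be checked with a short Python sketch; the function name is illustrative, and the population sizes are those of the example above.

```python
def harmonic_mean_ne(sizes):
    """Effective size over t generations: t / sum(1 / N_i)."""
    if not sizes or any(n <= 0 for n in sizes):
        raise ValueError("sizes must be a non-empty list of positive numbers")
    return len(sizes) / sum(1.0 / n for n in sizes)


sizes = [10, 100, 50, 80, 20, 500]
ne = harmonic_mean_ne(sizes)            # about 30.8
arithmetic = sum(sizes) / len(sizes)    # about 126.7
```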
Dioeciousness
If a population is dioecious, i.e. there is no self-fertilisation, then
:N_e = N + \frac{1}{2}
or more generally,
:N_e = N + \frac{D}{2}
where D represents dioeciousness and may take the value 0 (for not dioecious) or 1 (for dioecious).
When N is large, N_e approximately equals N, so this is usually trivial and often ignored:
:N_e = N + \frac{1}{2} \approx N
Variance in reproductive success
If population size is to remain constant, each individual must contribute on average two gametes to the next generation. An idealized population assumes that this follows a Poisson distribution, so that the variance of the number of gametes contributed, k, is equal to the mean number contributed, i.e. 2:
:\operatorname{var}(k) = \bar{k} = 2.
However, in natural populations the variance is often larger than this. The vast majority of individuals may have no offspring, and the next generation stems only from a small number of individuals, so
:\operatorname{var}(k) > 2.
The effective population size is then smaller, and given by:
:N_e^{(v)} = {4 N - 2D \over 2 + \operatorname{var}(k)}
Note that if the variance of k is less than 2, N_e is greater than N. In the extreme case of a population experiencing no variation in family size, for example a laboratory population in which the number of offspring is artificially controlled, V_k = \operatorname{var}(k) = 0 and N_e = 2N.
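A minimal sketch of this formula follows. The function name, the default D = 0 and the example gamete counts are assumptions made for illustration; the population variance of the counts is used for var(k).

```python
from statistics import pvariance


def ne_from_family_variance(n, var_k, d=0):
    """N_e^(v) = (4N - 2D) / (2 + var(k))."""
    return (4.0 * n - 2.0 * d) / (2.0 + var_k)


# Hypothetical case: four individuals, one of which contributes all
# eight gametes (mean k = 2, var(k) = 12).
counts = [0, 0, 0, 8]
skewed = ne_from_family_variance(len(counts), pvariance(counts))
```

With Poisson variance (var(k) = 2) and D = 0 the function returns N, and with var(k) = 0 it returns 2N, matching the limiting cases in the text.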
Non-Fisherian sex-ratios
When the sex ratio of a population varies from the Fisherian 1:1 ratio, effective population size is given by:
:N_e^{(v)} = N_e^{(F)} = {4 N_m N_f \over N_m + N_f}
Where N_m is the number of males and N_f the number of females. For example, with 80 males and 20 females (an absolute population size of 100):
:\begin{align} N_e &= {4 \times 80 \times 20 \over 80 + 20} \\ &= {6400 \over 100} \\ &= 64 \end{align}
Again, this results in N_e being less than N.
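The sex-ratio formula is simple enough to express as a one-line function; the name below is illustrative.

```python
def ne_sex_ratio(n_males, n_females):
    """N_e = 4 * N_m * N_f / (N_m + N_f)."""
    if n_males <= 0 or n_females <= 0:
        raise ValueError("both sexes must be present")
    return 4.0 * n_males * n_females / (n_males + n_females)


skewed = ne_sex_ratio(80, 20)    # the worked example: 64
balanced = ne_sex_ratio(50, 50)  # a 1:1 ratio recovers N = 100
```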
Inbreeding effective size
Alternatively, the effective population size may be defined by noting how the average inbreeding coefficient changes from one generation to the next, and then defining N_e as the size of the idealized population that has the same change in average inbreeding coefficient as the population under consideration. The presentation follows Kempthorne (1957).
For the idealized population, the inbreeding coefficients follow the recurrence equation
:F_t = \frac{1}{N}\left(\frac{1+F_{t-2}}{2}\right)+\left(1-\frac{1}{N}\right)F_{t-1}.
Using Panmictic Index (1 − F) instead of inbreeding coefficient, we get the approximate recurrence equation
:1-F_t = P_t = P_0\left(1-\frac{1}{2N}\right)^t.
The ratio between successive generations is
:\frac{P_{t+1}}{P_t} = 1-\frac{1}{2N}.
The inbreeding effective size can be found by solving
:\frac{P_{t+1}}{P_t} = 1-\frac{1}{2N_e^{(F)}}.
This is
:N_e^{(F)} = \frac{1}{2\left(1-\frac{P_{t+1}}{P_t}\right)} .
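As a sketch of how the last formula can be applied, the code below iterates the exact two-step recurrence for F_t, converts it to the panmictic index, and reads N_e^{(F)} off the ratio of successive values. Starting values F_0 = F_1 = 0 and the function names are assumptions made for the example.

```python
def panmictic_series(n, generations):
    """Iterate F_t = (1/N)(1 + F_{t-2})/2 + (1 - 1/N) F_{t-1}; return P_t = 1 - F_t."""
    f = [0.0, 0.0]  # assumed starting values F_0 = F_1 = 0
    for t in range(2, generations + 1):
        f.append((1.0 / n) * ((1.0 + f[t - 2]) / 2.0) + (1.0 - 1.0 / n) * f[t - 1])
    return [1.0 - x for x in f]


def inbreeding_effective_size(p_t, p_next):
    """N_e^(F) = 1 / (2 * (1 - P_{t+1} / P_t))."""
    return 1.0 / (2.0 * (1.0 - p_next / p_t))


p = panmictic_series(50, 40)
ne_f = inbreeding_effective_size(p[-2], p[-1])
```

Because the closed form for P_t is approximate, the value obtained from the exact recurrence differs slightly from N; for N = 50 it settles a little above 50.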
Theory of overlapping generations and age-structured populations
When organisms live longer than one breeding season, effective population sizes have to take into account the life tables for the species.
Haploid
Assume a haploid population with discrete age structure. An example might be an organism that can survive several discrete breeding seasons. Further, define the following age structure characteristics:
: v_i = Fisher's reproductive value for age i,
: \ell_i = The chance an individual will survive to age i, and
: N_0 = The number of newborn individuals per breeding season.
The generation time is calculated as
: T = \sum_{i=0}^\infty \ell_i v_i = average age of a reproducing individual
Then, the inbreeding effective population size is
:N_e^{(F)} = \frac{N_0T}{1 + \sum_i\ell_{i+1}^2v_{i+1}^2(\frac{1}{\ell_{i+1}}-\frac{1}{\ell_i})}.
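The haploid formula can be evaluated from a life table as in the sketch below. The function name is hypothetical, survivorship values are assumed to be strictly positive, and the two-age-class numbers in the usage lines are purely illustrative rather than a real life table.

```python
def haploid_age_structured_ne(n0, ell, v):
    """N_e^(F) = N_0 * T / (1 + sum_i l_{i+1}^2 v_{i+1}^2 (1/l_{i+1} - 1/l_i))."""
    if len(ell) != len(v):
        raise ValueError("ell and v must have the same length")
    # Generation time T = sum_i l_i * v_i
    t = sum(l * x for l, x in zip(ell, v))
    correction = sum(
        ell[i + 1] ** 2 * v[i + 1] ** 2 * (1.0 / ell[i + 1] - 1.0 / ell[i])
        for i in range(len(ell) - 1)
    )
    return n0 * t / (1.0 + correction)


# Illustrative values only: 100 newborns, two age classes.
single = haploid_age_structured_ne(100, [1.0], [1.0])
two_ages = haploid_age_structured_ne(100, [1.0, 0.5], [1.0, 1.0])
```

With a single age class the sum is empty and the function returns N_0, as expected for non-overlapping generations.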
Diploid
Similarly, the inbreeding effective number can be calculated for a diploid population with discrete age structure. This was first given by Johnson, but the notation more closely resembles Emigh and Pollak.
Assume the same basic parameters for the life table as given for the haploid case, but distinguishing between male and female, such as N_0^f and N_0^m for the number of newborn females and males, respectively (notice lower case f for females, compared to upper case F for inbreeding).
The inbreeding effective number is
: \begin{align} \frac{1}{N_e^{(F)}} = \frac{1}{4T}\left\{\frac{1}{N_0^f}+\frac{1}{N_0^m} + \sum_i\left(\ell_{i+1}^f\right)^2\left(v_{i+1}^f\right)^2\left(\frac{1}{\ell_{i+1}^f}-\frac{1}{\ell_i^f}\right)\right. & \\ \left. {} + \sum_i\left(\ell_{i+1}^m\right)^2\left(v_{i+1}^m\right)^2\left(\frac{1}{\ell_{i+1}^m}-\frac{1}{\ell_i^m}\right) \right\}. & \end{align}
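A sketch of the diploid expression follows. The generation time T is passed in as an argument because the text does not spell out how it is combined across the sexes; that choice, the function names, and the test values are assumptions.

```python
def _age_term(ell, v):
    """sum_i l_{i+1}^2 v_{i+1}^2 (1/l_{i+1} - 1/l_i) for one sex."""
    if len(ell) != len(v):
        raise ValueError("ell and v must have the same length")
    return sum(
        ell[i + 1] ** 2 * v[i + 1] ** 2 * (1.0 / ell[i + 1] - 1.0 / ell[i])
        for i in range(len(ell) - 1)
    )


def diploid_age_structured_ne(t, n0_f, n0_m, ell_f, v_f, ell_m, v_m):
    """Invert 1/N_e^(F) = (1/(4T)) * {1/N0f + 1/N0m + female term + male term}."""
    inverse = (1.0 / n0_f + 1.0 / n0_m
               + _age_term(ell_f, v_f) + _age_term(ell_m, v_m)) / (4.0 * t)
    return 1.0 / inverse
```

With one age class per sex and T = 1, the expression reduces to 4 N_0^f N_0^m / (N_0^f + N_0^m), the same form as the unequal sex-ratio formula.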
References
- "Effective population size". Blackwell Publishing.
- Wright S. (1931). "Evolution in Mendelian populations". Genetics.
- Wright S. (1938). "Size of population and breeding structure in relation to evolution". Science.
- Buri, P. (1956). "Gene frequency in small populations of mutant Drosophila". Evolution.
- (2023). "The foundations of population genetics". The MIT Press.
- Masel, Joanna. (2012). "Rethinking Hardy–Weinberg and genetic drift in undergraduate biology". BioEssays.
- (23 November 2013). "Genetic Draft, Selective Interference, and Population Genetics of Rapid Adaptation". Annual Review of Ecology, Evolution, and Systematics.
- (April 2007). "Recent human effective population size estimated from linkage disequilibrium". Genome Research.
- (May 1992). "Population growth makes waves in the distribution of pairwise genetic differences.". Molecular Biology and Evolution.
- (2011). "Skyline-plot methods for estimating demographic history from nucleotide sequences". Molecular Ecology Resources.
- (13 July 2011). "Inference of human population history from individual whole-genome sequences.". Nature.
- (August 2014). "Inferring human population size and separation history from multiple genome sequences.". Nature Genetics.
- (1 September 2016). "A Genealogical Look at Shared Ancestry on the X Chromosome". Genetics.
- (1995). "Effective population size/adult population size ratios in wildlife: a review". Genetics Research.
- (2008). "Generation time and effective population size in Polar Eskimos.". Proc Biol Sci.
- (6 September 2024). "The protein domains of vertebrate species in which selection is more effective have greater intrinsic structural disorder". eLife.
- (18 June 2025). "Effective population size does not explain long-term variation in genome size and transposable element content in animals". eLife.
- Lynch, Michael. (2007). "The Origins of Genome Architecture". Sinauer Associates.
- (2011). "Evolution of molecular error rates and the consequences for evolvability". PNAS.
- James F. Crow. (2010). "Wright and Fisher on Inbreeding and Random Drift". Genetics.
- Lynch, M.. (2003). "The origins of genome complexity". Science.
- (2011). "Genetic Draft and Quasi-Neutrality in Large Facultatively Sexual Populations". Genetics.
- Karlin, Samuel. (1968-09-01). "Rates of Approach to Homozygosity for Finite Stochastic Models with Variable Population Size". The American Naturalist.
- Kempthorne O. (1957). "An Introduction to Genetic Statistics". Iowa State University Press.
- Felsenstein J. (1971). "Inbreeding and variance effective numbers in populations with overlapping generations". Genetics.
- Johnson DL. (1977). "Inbreeding in populations with overlapping generations". Genetics.
- (1979). "Fixation probabilities and effective population numbers in diploid populations with overlapping generations". Theoretical Population Biology.
This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page.