Exponentially modified Gaussian distribution
Describes the sum of independent normal and exponential random variables
Figures: probability density function for the EMG distribution; cumulative distribution function for the EMG distribution.
Parameters: μ ∈ R (mean of Gaussian component), σ² > 0 (variance of Gaussian component), λ > 0 (rate of exponential component)
Support: x ∈ R
PDF:
:\frac{\lambda}{2} \exp \left[\frac{\lambda}{2} (2 \mu + \lambda \sigma^2 - 2 x)\right] \operatorname{erfc} \left(\frac{\mu + \lambda \sigma^2 - x}{ \sqrt{2} \sigma}\right)
CDF:
:\Phi(x,\mu,\sigma) - \frac{1}{2} \exp \left[\frac{\lambda}{2} (2\mu+\lambda\sigma^2 -2x)\right] \operatorname{erfc}\left(\frac{\mu+\lambda\sigma^2 -x}{\sqrt{2} \sigma}\right),
where \Phi(x, \mu, \sigma) is the CDF of a Gaussian distribution
Mean:
:\mu + 1/\lambda
Mode:
:x_m = \mu - \operatorname{sgn}\left(\tau\right)\sqrt{2}\sigma\operatorname{erfcxinv}\left(\frac{\tau}{\sigma}\sqrt{\frac{2}{\pi}}\right) + \frac{\sigma^2}{\tau}
:f(x_m)=h\exp \left( -\frac {1}{2} \left( \frac {\mu-x_m}{\sigma} \right)^2\right)
Variance:
:\sigma^2 + 1/\lambda^2
Skewness:
:\frac{2}{\sigma^3 \lambda^3} \left( 1 + \frac{1}{\sigma^2 \lambda^2} \right)^{-3/2}
Excess kurtosis:
:\frac{3 \left(1 + \frac{2}{\sigma^2 \lambda^2} + \frac{3}{\lambda^4 \sigma^4}\right)}{\left( 1 + \frac{1}{\lambda^2 \sigma^2} \right)^2 } - 3
MGF:
:\left(1 - \frac{t}{\lambda}\right)^{-1}\,\exp \left( \mu t + \frac{1}{2}\sigma^2 t^2 \right)
CF:
:\left(1 - \frac{it}{\lambda}\right)^{-1}\,\exp \left( i\mu t - \frac{1}{2}\sigma^2 t^2 \right)
In probability theory, an exponentially modified Gaussian distribution (EMG, also known as exGaussian distribution) describes the sum of independent normal and exponential random variables. An exGaussian random variable Z may be expressed as Z = X + Y, where X and Y are independent, X is Gaussian with mean μ and variance σ², and Y is exponential with rate λ. It has a characteristic positive skew from the exponential component.
It may also be regarded as a weighted function of a shifted exponential with the weight being a function of the normal distribution.
Definition
The probability density function (pdf) of the exponentially modified Gaussian distribution is
:f(x;\mu,\sigma,\lambda) = \frac{\lambda}{2} \exp \left[\frac{\lambda}{2} (2 \mu + \lambda \sigma^2 - 2 x)\right] \operatorname{erfc} \left(\frac{\mu + \lambda \sigma^2 - x}{ \sqrt{2} \sigma}\right),
where erfc is the complementary error function defined as
:\begin{align} \operatorname{erfc}(x) & = 1-\operatorname{erf}(x) \\ & = \frac{2}{\sqrt{\pi}} \int_x^\infty e^{-t^2}\,dt. \end{align}
This density function is derived via convolution of the normal and exponential probability density functions.
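As a rough numerical check of the convolution statement, the sketch below evaluates the closed-form density and compares it with a truncated trapezoidal approximation of the convolution integral. The function names, the truncation point `upper` and the step count are illustrative choices, not part of the source, and the closed form is evaluated directly without any overflow protection.

```python
import math

def emg_pdf(x, mu, sigma, lam):
    """Closed-form EMG density (direct evaluation, no overflow guard)."""
    exponent = 0.5 * lam * (2.0 * mu + lam * sigma**2 - 2.0 * x)
    arg = (mu + lam * sigma**2 - x) / (math.sqrt(2.0) * sigma)
    return 0.5 * lam * math.exp(exponent) * math.erfc(arg)

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def emg_pdf_by_convolution(x, mu, sigma, lam, upper=40.0, steps=40000):
    """Trapezoidal approximation of
    f(x) = integral over y >= 0 of lam*exp(-lam*y) * normal_pdf(x - y),
    truncated at y = upper (an assumed cut-off)."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        y = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * lam * math.exp(-lam * y) * normal_pdf(x - y, mu, sigma)
    return total * h

# Compare the two evaluations at a few points (illustrative parameters).
for x in (-1.0, 0.5, 3.0):
    print(x, emg_pdf(x, 0.0, 1.0, 0.5), emg_pdf_by_convolution(x, 0.0, 1.0, 0.5))
```

The two columns should agree to within the discretisation error of the trapezoidal rule; a production implementation would more likely rely on an established statistics library.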
Alternative forms for computation
An alternative but equivalent form of the EMG distribution is used to describe the shape of peaks in chromatography:
:f(x; h, \mu, \sigma, \tau ) = \frac{h\sigma}{\tau} \sqrt{\frac{\pi}{2}}\exp \left( \frac {1}{2} \left( \frac {\sigma}{\tau} \right)^2 - \frac {x-\mu}{\tau} \right) \operatorname{erfc} \left( \frac {1}{\sqrt{2}} \left( \frac {\sigma}{\tau} - \frac {x-\mu}{\sigma} \right) \right ), \qquad (1)
where
:h is the amplitude of the Gaussian,
:\tau=\frac{1}{\lambda} is the exponent relaxation time, and \tau^2 is the variance of the exponential probability density function.
This function cannot be calculated for some parameter values (for example, \tau=0) because of arithmetic overflow. An alternative but equivalent way of writing the function was proposed by Delley:
:f(x; h, \mu, \sigma, \tau )=h\exp \left( -\frac {1}{2} \left( \frac {x-\mu}{\sigma} \right)^2\right) \frac{\sigma}{\tau} \sqrt{\frac{\pi}{2}}\operatorname{erfcx} \left( \frac {1}{\sqrt{2}} \left( \frac {\sigma}{\tau} - \frac {x-\mu}{\sigma} \right) \right ), \qquad (2)
where \operatorname{erfcx} t = \exp (t^2) \cdot \operatorname{erfc} t is the scaled complementary error function.
For this formula, arithmetic overflow is also possible, but the region of overflow is different.
For very small τ, an asymptotic form of the second formula can be used, which allows evaluation for \tau=0:
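:f(x; h, \mu, \sigma, \tau )=\frac{h\exp \left( -\frac {1}{2} \left( \frac {x-\mu}{\sigma} \right)^2\right)} {1- \frac {\left(x-\mu\right)\tau}{\sigma^2}}. \qquad (3)
This follows from the second formula by replacing \operatorname{erfcx} z with its leading large-z term 1/(z\sqrt{\pi}).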
The choice of formula can be made on the basis of the parameter z = \frac{1}{\sqrt{2}}\left(\frac{\sigma}{\tau} - \frac{x - \mu}{\sigma}\right):
:for z < 0, the computation follows the first formula,
:for 0 ≤ z ≤ 6.71·10^7 (in the case of double-precision floating-point format), the second formula,
:and for z > 6.71·10^7, the asymptotic form of the second formula.
The mode (position of the apex, the most probable value) can be calculated from the derivative of the second formula, using the inverse of the scaled complementary error function, erfcxinv(). Approximate values are also proposed by Kalambet et al. Although the mode lies at a higher value than that of the original Gaussian, the apex is always located on the original (unmodified) Gaussian.
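The selection rule by z can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the standard library has no scaled complementary error function, so `erfcx` below is a hypothetical helper that uses the direct product for moderate z and a truncated asymptotic series beyond an assumed switch point of 25; τ ≥ 0 is assumed; and the large-z branch uses the leading term erfcx(z) ≈ 1/(z√π), which gives the denominator 1 − (x − μ)τ/σ².

```python
import math

SQRT_PI = math.sqrt(math.pi)
Z_MAX = 6.71e7  # double-precision threshold quoted in the text

def erfcx(z):
    """exp(z^2) * erfc(z) for z >= 0 (hypothetical helper, see lead-in)."""
    if z < 25.0:
        return math.exp(z * z) * math.erfc(z)
    inv = 1.0 / (2.0 * z * z)
    # Truncated asymptotic series: (1 - 1/(2z^2) + 3/(4z^4) - 15/(8z^6)) / (z*sqrt(pi))
    return (1.0 - inv + 3.0 * inv**2 - 15.0 * inv**3) / (z * SQRT_PI)

def emg_peak(x, h, mu, sigma, tau):
    """Chromatographic EMG peak, with the formula chosen by the value of z."""
    u = (x - mu) / sigma
    gauss = h * math.exp(-0.5 * u * u)
    if tau == 0.0:
        return gauss  # limit of the asymptotic form: the unmodified Gaussian
    ratio = sigma / tau
    z = (ratio - u) / math.sqrt(2.0)
    scale = ratio * math.sqrt(math.pi / 2.0)
    if z < 0.0:
        # First formula; the exponent equals (sigma/tau)^2/2 - (x - mu)/tau
        # and is negative whenever z < 0, so exp() cannot overflow here.
        return h * scale * math.exp(ratio * (0.5 * ratio - u)) * math.erfc(z)
    if z <= Z_MAX:
        # Second formula (scaled complementary error function).
        return gauss * scale * erfcx(z)
    # Asymptotic form of the second formula for very large z.
    return gauss / (1.0 - (x - mu) * tau / sigma**2)

print(emg_peak(0.5, 1.0, 0.0, 1.0, 2.0))   # moderate tau: second formula
print(emg_peak(0.5, 1.0, 0.0, 1.0, 1e-9))  # tiny tau: asymptotic form
```

With SciPy available, `scipy.special.erfcx` could replace the helper, which would remove the need for the assumed switch point.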
Parameter estimation
There are three parameters: the mean of the normal distribution (μ), the standard deviation of the normal distribution (σ) and the exponential decay parameter (τ = 1 / λ). The shape K = τ / σ is also sometimes used to characterise the distribution. Depending on the values of the parameters, the distribution may vary in shape from almost normal to almost exponential.
The parameters of the distribution can be estimated from the sample data with the method of moments as follows:
: m = \mu + \tau,
: s^2 = \sigma^2 + \tau^2,
: \gamma_1 = \frac{2\tau^3}{(\sigma^2 + \tau^2)^{3/2}},
where m is the sample mean, s is the sample standard deviation, and γ1 is the skewness.
Solving these for the parameters gives:
: \hat\mu = m - s \left( \frac{\gamma_1}{2} \right)^{1/3},
: \hat{\sigma^2} = s^2 \left[ 1 - \left( \frac{\gamma_1}{2} \right)^{2/3} \right],
: \hat\tau = s \left( \frac{\gamma_1}{2} \right)^{1/3}.
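The moment equations above translate directly into code. The sketch below is an illustration only: the function name, the use of the biased (divide-by-n) sample moments, and the decision to raise an error when the sample skewness falls outside (0, 2) are assumptions made here, since the formulas give a negative variance or a negative τ outside that range. The synthetic data are generated as a normal draw plus an exponential draw.

```python
import math
import random

def emg_method_of_moments(data):
    """Return (mu_hat, sigma_hat, tau_hat) from sample mean, sd and skewness."""
    n = len(data)
    m = sum(data) / n
    dev = [x - m for x in data]
    s = math.sqrt(sum(d * d for d in dev) / n)   # biased sample standard deviation
    gamma1 = (sum(d**3 for d in dev) / n) / s**3  # sample skewness
    if not 0.0 < gamma1 < 2.0:
        raise ValueError("sample skewness outside (0, 2); moment estimates undefined")
    k = (gamma1 / 2.0) ** (1.0 / 3.0)
    tau_hat = s * k
    mu_hat = m - tau_hat
    sigma_hat = s * math.sqrt(1.0 - k * k)
    return mu_hat, sigma_hat, tau_hat

# Synthetic check with assumed true values mu = 1, sigma = 0.5, tau = 1.
rng = random.Random(0)
true_mu, true_sigma, true_tau = 1.0, 0.5, 1.0
data = [rng.gauss(true_mu, true_sigma) + rng.expovariate(1.0 / true_tau)
        for _ in range(200_000)]
print(emg_method_of_moments(data))
```

Because the estimates depend on the third sample moment, they are noisy for small samples, which is why they are usually treated as starting values rather than final estimates.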
Recommendations
Ratcliff has suggested that the sample contain at least 100 data points before the parameter estimates are regarded as reliable. Vincent averaging may be used with smaller samples, as this procedure only modestly distorts the shape of the distribution. These point estimates may be used as initial values that can be refined with more powerful methods, including least-squares optimization, which has been shown to work for the Multimodal Exponentially Modified Gaussian (MEMG) case. A code implementation with analytical MEMG derivatives and an optional oscillation term for sound processing is released as part of an open-source project.
Confidence intervals
There are currently no published tables available for significance testing with this distribution. The distribution can be simulated by forming the sum of two random variables, one drawn from a normal distribution and the other from an exponential distribution.
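A minimal simulation along these lines might look as follows. The function name, seed, sample size and parameter values are illustrative assumptions; the empirical 2.5% and 97.5% quantiles are shown only as one way simulated draws could stand in for tabulated values.

```python
import random
import statistics

def sample_emg(mu, sigma, lam, n, seed=0):
    """Draw n EMG variates as a normal draw plus an exponential draw (rate lam)."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) + rng.expovariate(lam) for _ in range(n)]

draws = sample_emg(mu=2.0, sigma=0.5, lam=2.0, n=100_000)

sample_mean = statistics.fmean(draws)     # theoretical mean: mu + 1/lam = 2.5
sample_var = statistics.pvariance(draws)  # theoretical variance: sigma^2 + 1/lam^2 = 0.5

# 39 cut points at 2.5% steps; the first and last bound the central 95% of draws.
cuts = statistics.quantiles(draws, n=40)
lower, upper = cuts[0], cuts[-1]
print(sample_mean, sample_var, lower, upper)
```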
Skew
The value of the nonparametric skew
: \frac{\text{mean} - \text{median}}{\text{standard deviation}}
of this distribution lies between 0 and 0.31. The lower limit is approached when the normal component dominates, and the upper when the exponential component dominates.
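The two limits can be explored numerically. The sketch below evaluates the CDF expression given for the distribution, finds the median by bisection, and computes the nonparametric skew for a nearly exponential, an intermediate, and a nearly normal case. The function names, the bisection bracket and the parameter values are assumptions for illustration, and the CDF is evaluated directly, so very large values of λσ could overflow.

```python
import math

def emg_cdf(x, mu, sigma, lam):
    """EMG CDF: Gaussian CDF minus an exp * erfc correction (no overflow guard)."""
    phi = 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))
    exponent = 0.5 * lam * (2.0 * mu + lam * sigma**2 - 2.0 * x)
    arg = (mu + lam * sigma**2 - x) / (math.sqrt(2.0) * sigma)
    return phi - 0.5 * math.exp(exponent) * math.erfc(arg)

def emg_median(mu, sigma, lam, iterations=200):
    """Median by bisection on the CDF over an assumed bracket."""
    lo = mu - 10.0 * sigma
    hi = mu + 10.0 * sigma + 50.0 / lam
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if emg_cdf(mid, mu, sigma, lam) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def nonparametric_skew(mu, sigma, lam):
    """(mean - median) / standard deviation for the EMG distribution."""
    mean = mu + 1.0 / lam
    sd = math.sqrt(sigma**2 + 1.0 / lam**2)
    return (mean - emg_median(mu, sigma, lam)) / sd

print(nonparametric_skew(0.0, 0.01, 1.0))  # exponential component dominates
print(nonparametric_skew(0.0, 1.0, 1.0))   # intermediate case
print(nonparametric_skew(0.0, 1.0, 10.0))  # normal component dominates
```

For a pure exponential distribution the nonparametric skew is 1 − ln 2 ≈ 0.307, which is the source of the upper limit of about 0.31.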
Occurrence
The distribution is used as a theoretical model for the shape of chromatographic peaks. It has been proposed as a statistical model of intermitotic time in dividing cells. It is also used in modelling cluster ion beams. It is commonly used in psychology and other brain sciences in the study of response times. In a slight variant where the mean of the normal component is set to zero, it is also used in stochastic frontier analysis, as one of the distributional specifications for the composed error term that models inefficiency. In signal processing, EMGs have been extended to the multimodal case with an optional oscillation term to represent digitized sound signals.
References
- (1972). "Characterization of Exponentially Modified Gaussian Peaks in Chromatography". Analytical Chemistry.
- (2011). "Reconstruction of chromatographic peaks using the exponentially modified Gaussian function". Journal of Chemometrics.
- (1985). "Series for the Exponentially Modified Gaussian Peak Shape". Anal. Chem.
- Dyson, N. A. (1998). "Chromatographic Integration Methods". Royal Society of Chemistry, Information Services.
- Olivier J. and Norberg M. M. (2010) Positively skewed data: Revisiting the Box−Cox power transformation. Int. J. Psych. Res. 3 (1) 68−75.
- (1979). "Group reaction time distributions and an analysis of distribution statistics". Psychol. Bull.
- (1912). "The functions of the vibrissae in the behaviour of the white rat". Animal Behaviour Monographs.
- (2022). "2022 IEEE International Ultrasonics Symposium (IUS)".
- "MEMG on GitHub".
- (1996). "RTSYS: A DOS application for the analysis of reaction time data". Behavior Research Methods, Instruments, & Computers.
- (1994). "Effects of outlier exclusion on reaction time analysis". J. Exp. Psych.: General.
- (1969). "Computer-Assisted Gas-Liquid Chromatography". Anal. Chem.
- (2010). "Exponentially modified Gaussian (EMG) relevance to distributions related to cell proliferation and differentiation". Journal of Theoretical Biology.
- (2012). "Fractional proliferation: A method to deconvolve cell population dynamics from single-cell data". Nature Methods.
- (2006). "Multiparameter characterization of cluster ion beams". Journal of Vacuum Science & Technology B: Microelectronics and Nanometer Structures.
- (2011). "What are the shapes of response time distributions in visual search?". J Exp Psychol.
- (1994). "An analysis of latency and interresponse time in free recall". Memory & Cognition.
- (2021). "A Bayesian Mixture Modelling of Stop Signal Reaction Time Distributions: The Second Contextual Solution for the Problem of Aftereffects of Inhibition on SSRT Estimations". Brain Sciences.
- (2000). "Stochastic Frontier Analysis". Cambridge University Press.
- Peter Carr and Dilip B. Madan, Saddlepoint Methods for Option Pricing, The Journal of Computational Finance (49–61) Volume 13/Number 1, Fall 2009
This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page.