From Surf Wiki (app.surf) — the open knowledge base

Maximum spacing estimation

Method of estimating a statistical model's parameters

The maximum spacing method tries to find a distribution function such that the spacings, D_{(i)}, are all approximately of the same length. This is done by maximizing their geometric mean.

In statistics, maximum spacing estimation (MSE or MSP), or maximum product of spacing estimation (MPS), is a method for estimating the parameters of a univariate statistical model. The method requires maximization of the geometric mean of spacings in the data, which are the differences between the values of the cumulative distribution function at neighbouring data points.

The concept underlying the method is based on the probability integral transform, in that a set of independent random samples derived from any random variable should on average be uniformly distributed with respect to the cumulative distribution function of the random variable. The MPS method chooses the parameter values that make the observed data as uniform as possible, according to a specific quantitative measure of uniformity.

One of the most common methods for estimating the parameters of a distribution from data, the method of maximum likelihood (MLE), can break down in various cases, for example with certain mixtures of continuous distributions. In these cases, maximum spacing estimation may still succeed.

Apart from its use in pure mathematics and statistics, trial applications of the method have been reported using data from fields such as hydrology, econometrics, and magnetic resonance imaging.

History and usage

The MSE method was derived independently by Russel Cheng and Nik Amin at the University of Wales Institute of Science and Technology, and Bo Ranneby at the Swedish University of Agricultural Sciences. The authors explained that, due to the probability integral transform at the true parameter, the “spacing” between each observation should be uniformly distributed. This would imply that the differences between the values of the cumulative distribution function at consecutive observations should be equal. This is the case that maximizes the geometric mean of such spacings, so solving for the parameters that maximize the geometric mean would achieve the “best” fit as defined this way. Ranneby justified the method by demonstrating that it is an estimator of the Kullback–Leibler divergence, similar to maximum likelihood estimation, but with more robust properties for some classes of problems.

There are certain distributions, especially those with three or more parameters, whose likelihoods may become infinite along certain paths in the parameter space. Using maximum likelihood to estimate these parameters often breaks down, with one parameter tending to the specific value that causes the likelihood to be infinite, rendering the other parameters inconsistent. The method of maximum spacings, however, being dependent on the difference between points on the cumulative distribution function and not individual likelihood points, does not have this issue, and will return valid results over a much wider array of distributions.

The distributions that tend to have likelihood issues are often those used to model physical phenomena. Hall et al. (2004) seek to analyze flood alleviation methods, which require accurate models of river flood effects. The distributions that better model these effects are all three-parameter models, which suffer from the infinite likelihood issue described above, leading to Hall's investigation of the maximum spacing procedure. Another study, comparing the method to maximum likelihood, uses various data sets ranging from a set on the oldest ages at death in Sweden between 1905 and 1958 to a set containing annual maximum wind speeds.

Definition

Given an independent and identically distributed (iid) random sample \{x_1, \dots, x_n\} of size n from a univariate distribution with continuous cumulative distribution function F(x;\theta_0) , where \theta_0 \in \Theta is an unknown parameter to be estimated, let \{x_{(1)}, \dots, x_{(n)}\} be the corresponding ordered sample, that is, the result of sorting all observations from smallest to largest. Denote x_{(0)} = \inf S and x_{(n+1)} = \sup S , where S denotes the support of the distribution.

Define the spacings as the “gaps” between the values of the distribution function at adjacent ordered points: D_i(\theta) = F(x_{(i)};\,\theta) - F(x_{(i-1)};\,\theta), \quad i=1,\ldots,n+1.

Then the maximum spacing estimator of \theta_0 is defined as a value that maximizes the logarithm of the geometric mean of sample spacings: \hat{\theta} = \underset{\theta\in\Theta}{\operatorname{arg\,max}} \; S_n(\theta), \quad\text{where } S_n(\theta) = \ln\!\! \sqrt[n+1]{D_1D_2\cdots D_{n+1}} = \frac{1}{n+1}\sum_{i=1}^{n+1}\ln D_i(\theta).

Because the spacings sum to 1, the inequality of arithmetic and geometric means implies that the function S_n(\theta) is bounded from above by -\ln(n+1) , and thus the maximum has to exist at least in the supremum sense.

Note that some authors define the function S_n(\theta) somewhat differently. In particular, some multiply each D_i by a factor of (n+1) , whereas others omit the \frac{1}{n+1} factor in front of the sum and add the “−” sign in order to turn the maximization into minimization. As these are constants with respect to \theta , the modifications do not alter the location of the maximum of the function S_n .
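The definition above can be illustrated with a minimal Python sketch. The function names log_mean_spacing and exponential_cdf, and the rate 0.25, are illustrative assumptions rather than part of any library; the sketch pads the transformed sample with 0 and 1 for the end points of the support and then averages the log spacings.

```python
import math

def log_mean_spacing(sample, cdf):
    """Return S_n, the mean of the log spacings, for a fixed parameter value.

    `cdf` is any callable mapping x to F(x; theta). The end spacings use
    F = 0 and F = 1 at the edges of the support.
    """
    xs = sorted(sample)
    # F at the ordered points, padded for x_(0) and x_(n+1)
    u = [0.0] + [cdf(x) for x in xs] + [1.0]
    spacings = [u[i] - u[i - 1] for i in range(1, len(u))]
    if min(spacings) <= 0.0:
        # Ties (or points outside the support) give a zero spacing
        return float("-inf")
    return sum(math.log(d) for d in spacings) / len(spacings)

def exponential_cdf(rate):
    """CDF of an exponential distribution (hypothetical helper)."""
    return lambda x: 1.0 - math.exp(-rate * x)

data = [2.0, 4.0]
s_n = log_mean_spacing(data, exponential_cdf(0.25))
upper_bound = -math.log(len(data) + 1)  # the bound -ln(n+1)
```

For any parameter value, s_n should not exceed upper_bound; equality would require all spacings to equal 1/(n+1).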

Examples

This section presents two examples of calculating the maximum spacing estimator.

Example 1

Figure: plots of the log value against \lambda for the simplistic example under both likelihood and spacing estimation. The values for which likelihood and spacing are maximized, the maximum likelihood and maximum spacing estimates, are identified.

Suppose two values x_{(1)} = 2 , x_{(2)} = 4 were sampled from the exponential distribution F(x;\lambda) = 1 - e^{-x\lambda}, x \ge 0 with unknown parameter \lambda > 0 . In order to construct the MSE, one must first find the spacings:

i	F(x_{(i)})	F(x_{(i-1)})	D_i = F(x_{(i)}) - F(x_{(i-1)})
1	1 - e^{-2\lambda}	0	1 - e^{-2\lambda}
2	1 - e^{-4\lambda}	1 - e^{-2\lambda}	e^{-2\lambda} - e^{-4\lambda}
3	1	1 - e^{-4\lambda}	e^{-4\lambda}

The process continues by finding the \lambda that maximizes the geometric mean of the “difference” column. Using the convention that omits the (n+1) st root, this turns into the maximization of the following product: (1-e^{-2\lambda}) \cdot (e^{-2\lambda} - e^{-4\lambda}) \cdot e^{-4\lambda} . Letting \mu = e^{-2\lambda} , the product becomes (1-\mu) \cdot \mu(1-\mu) \cdot \mu^2 , so the problem is to find the maximum of \mu^5 - 2\mu^4 + \mu^3 . Differentiating, \mu has to satisfy 5\mu^4 - 8\mu^3 + 3\mu^2 = 0 . This equation has roots 0, 0.6, and 1. As \mu is actually e^{-2\lambda} , it has to be greater than zero but less than one. Therefore, the only acceptable solution is \mu=0.6 \quad \Rightarrow \quad \lambda_{\text{MSE}} = \frac{\ln 0.6}{-2} \approx 0.255, which corresponds to an exponential distribution with a mean of \frac{1}{\lambda} \approx 3.915 . For comparison, the maximum likelihood estimate of \lambda is the inverse of the sample mean, 3, so \lambda_\text{MLE} = \frac{1}{3} \approx 0.333 .
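The closed-form result of Example 1 can be cross-checked numerically. The following is a minimal sketch that maximizes the log of the spacing product with a hand-written golden-section search; the search routine, the bracket [0.01, 5], and the function names are illustrative choices, not part of the method itself.

```python
import math

def log_spacing_product(lam):
    """Log of D_1 * D_2 * D_3 for the sample {2, 4} under Exp(lam)."""
    d1 = 1.0 - math.exp(-2.0 * lam)
    d2 = math.exp(-2.0 * lam) - math.exp(-4.0 * lam)
    d3 = math.exp(-4.0 * lam)
    return math.log(d1) + math.log(d2) + math.log(d3)

def golden_section_max(f, lo, hi, tol=1e-9):
    """Maximize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            # The maximum lies in [a, d]
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            # The maximum lies in [c, b]
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2.0

lam_msp = golden_section_max(log_spacing_product, 0.01, 5.0)
lam_closed_form = -math.log(0.6) / 2.0  # from mu = 0.6
lam_mle = 1.0 / 3.0                     # inverse of the sample mean
```

Under these assumptions lam_msp should agree with lam_closed_form to within the search tolerance, and both lie below lam_mle.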

Example 2

Suppose \{x_{(1)}, \dots, x_{(n)}\} is the ordered sample from a uniform distribution U(a,b) with unknown endpoints a and b . The cumulative distribution function is F(x;a,b) = \frac{x-a}{b-a} when x \in [a,b] . Therefore, individual spacings are given by D_1 = \frac{x_{(1)}-a}{b-a}, \quad D_i = \frac{x_{(i)}-x_{(i-1)}}{b-a}\ \text{for } i = 2, \ldots, n, \quad D_{n+1} = \frac{b-x_{(n)}}{b-a}.

Calculating the geometric mean and then taking the logarithm, the statistic S_n is S_n(a,b) = \frac{\ln(x_{(1)}-a)}{n+1} + \frac{\sum_{i=2}^n \ln(x_{(i)}-x_{(i-1)})}{n+1} + \frac{\ln(b-x_{(n)})}{n+1} - \ln(b-a). Here, only three terms depend on the parameters a and b . Differentiating with respect to those parameters and solving the resulting linear system, the maximum spacing estimates are \hat{a} = \frac{nx_{(1)} - x_{(n)}}{n-1},\ \ \hat{b} = \frac{nx_{(n)}-x_{(1)}}{n-1}. These are known to be the uniformly minimum variance unbiased (UMVU) estimators for the continuous uniform distribution. In comparison, the maximum likelihood estimates for this problem, \hat{a}=x_{(1)} and \hat{b}=x_{(n)} , are biased and have higher mean-squared error.
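As a rough check of Example 2, the sketch below evaluates the closed-form estimates on a small made-up sample and compares S_n at the estimates with S_n at slightly perturbed endpoints. The data values and function names are hypothetical and chosen only for illustration.

```python
import math

def uniform_msp_estimates(sample):
    """Closed-form maximum spacing estimates of (a, b) for U(a, b)."""
    xs = sorted(sample)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    x_min, x_max = xs[0], xs[-1]
    a_hat = (n * x_min - x_max) / (n - 1)
    b_hat = (n * x_max - x_min) / (n - 1)
    return a_hat, b_hat

def uniform_log_mean_spacing(sample, a, b):
    """S_n(a, b) for the uniform model; requires a < min and b > max."""
    edges = [a] + sorted(sample) + [b]
    logs = [math.log((edges[i] - edges[i - 1]) / (b - a))
            for i in range(1, len(edges))]
    return sum(logs) / len(logs)

data = [2.1, 3.4, 5.0, 7.7, 8.3]  # made-up illustrative values
a_hat, b_hat = uniform_msp_estimates(data)
s_at_estimate = uniform_log_mean_spacing(data, a_hat, b_hat)
s_shifted_a = uniform_log_mean_spacing(data, a_hat - 0.1, b_hat)
s_shifted_b = uniform_log_mean_spacing(data, a_hat, b_hat + 0.1)
```

Both estimates lie outside the observed range, unlike the maximum likelihood estimates, and the two end gaps x_{(1)} - \hat{a} and \hat{b} - x_{(n)} each equal (\hat{b} - \hat{a})/(n+1).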

Properties

Consistency and efficiency

The maximum spacing estimator is a consistent estimator in that it converges in probability to the true value of the parameter, \theta_0 , as the sample size increases to infinity. The consistency of maximum spacing estimation holds under much more general conditions than for maximum likelihood estimators. In particular, in cases where the underlying distribution is J-shaped, maximum likelihood will fail where MSE succeeds. An example of a J-shaped density is the Weibull distribution, specifically a shifted Weibull, with a shape parameter less than 1. The density will tend to infinity as x approaches the location parameter, rendering estimates of the other parameters inconsistent.

Maximum spacing estimators are also at least as asymptotically efficient as maximum likelihood estimators, where the latter exist. However, MSEs may exist in cases where MLEs do not.

Sensitivity

Maximum spacing estimators are sensitive to closely spaced observations, and especially ties. Given X_{i+k} = X_{i+k-1}=\cdots=X_i , the corresponding spacings vanish: D_{i+k}(\theta) = D_{i+k-1}(\theta) = \cdots = D_{i+1}(\theta) = 0.

When the ties are due to multiple observations, the repeated spacings (those that would otherwise be zero) should be replaced by the corresponding likelihood. That is, one should substitute f_{i}(\theta) for D_i(\theta), as \lim_{x_i \to x_{i-1}}\frac{\int_{x_{i-1}}^{x_i}f(t;\theta)\,dt}{x_i-x_{i-1}} = f(x_{i-1},\theta) = f(x_{i},\theta), since x_{i} = x_{i-1}.

When ties are due to rounding error, Cheng and Stephens suggest another method to remove the effects. Given r tied observations from x_i to x_{i+r-1} , all equal to a common value x , let \delta represent the round-off error. All of the true values should then fall in the range x \pm \delta. The corresponding points on the distribution should now fall between y_L = F(x-\delta, \hat\theta) and y_U = F(x+\delta, \hat\theta). Cheng and Stephens suggest assuming that the rounded values are uniformly spaced in this interval, by defining D_j = \frac{y_U-y_L}{r-1} \quad (j=i+1,\ldots,i+r-1).
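The rounding adjustment can be sketched as follows. This is a minimal illustration, assuming a known round-off error delta and a fixed CDF; it replaces only the zero spacings inside each tied group, as in the formula above, and leaves the spacings adjacent to the group unchanged. The function name, data, and parameter values are hypothetical.

```python
import math

def spacings_with_rounding_ties(sample, cdf, delta):
    """Spacings D_1..D_{n+1} with zero spacings from ties replaced.

    For a group of r tied values equal to x, each of the r - 1 zero
    spacings becomes (F(x + delta) - F(x - delta)) / (r - 1).
    """
    xs = sorted(sample)
    n = len(xs)
    u = [0.0] + [cdf(x) for x in xs] + [1.0]
    # d[k] is the spacing between xs[k-1] and xs[k] (0-based), for 1 <= k <= n-1
    d = [u[k] - u[k - 1] for k in range(1, len(u))]
    i = 0
    while i < n:
        j = i
        while j + 1 < n and xs[j + 1] == xs[i]:
            j += 1  # extend the tied group xs[i..j]
        r = j - i + 1
        if r > 1:
            y_low = cdf(xs[i] - delta)
            y_up = cdf(xs[i] + delta)
            for k in range(i + 1, j + 1):
                d[k] = (y_up - y_low) / (r - 1)
        i = j + 1
    return d

cdf = lambda x: 1.0 - math.exp(-0.5 * x)   # exponential CDF, rate 0.5 (assumed)
data = [1.0, 2.0, 2.0, 2.0, 3.5]           # three values rounded to 2.0
adjusted = spacings_with_rounding_ties(data, cdf, delta=0.05)
```

After the adjustment every spacing is strictly positive, so the log spacings are finite; note that the adjusted spacings no longer sum exactly to 1.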

The MSE method is also sensitive to secondary clustering. One example of this phenomenon is when a set of observations is thought to come from a single normal distribution, but in fact comes from a mixture of normals with different means. A second example is when the data are thought to come from an exponential distribution, but actually come from a gamma distribution. In the latter case, smaller spacings may occur in the lower tail. A high value of the Moran statistic M(\theta) would indicate this secondary clustering effect, and suggest that a closer look at the data is required.

Moran test

The statistic S_n(\theta) is also a form of the Moran or Moran–Darling statistic, M(\theta) , which can be used to test goodness of fit. It has been shown that the statistic, when defined as S_n(\theta) = M_n(\theta)= -\sum_{j=1}^{n+1}\ln D_j(\theta), is asymptotically normal, and that a chi-squared approximation exists for small samples. When the true parameter \theta_0 is known, the statistic M_n(\theta_0) has been shown to have a normal distribution with mean and variance \begin{align} \mu_M & \approx (n+1)(\ln(n+1)+\gamma)-\frac{1}{2}-\frac{1}{12(n+1)}, \\ \sigma^2_M & \approx (n+1)\left ( \frac{\pi^2}{6} -1 \right ) -\frac{1}{2}-\frac{1}{6(n+1)}, \end{align} where \gamma is the Euler–Mascheroni constant, which is approximately 0.57722.

The distribution can also be approximated by that of A, where A = C_1 + C_2\chi^2_n , in which \begin{align} C_1 &= \mu_M - \sqrt{\frac{\sigma^2_M n}{2}}, \\ C_2 &= \sqrt{\frac{\sigma^2_M}{2n}}, \end{align} and where \chi^2_n follows a chi-squared distribution with n degrees of freedom. Therefore, to test the hypothesis H_0 that a random sample of n values comes from the distribution F(x,\theta), the statistic T(\theta)= \frac{M(\theta)-C_1}{C_2} can be calculated. Then H_0 should be rejected at significance level \alpha if the value is greater than the critical value of the appropriate chi-squared distribution.

When \theta_0 is estimated by \hat\theta, it has been shown that S_n(\hat\theta) = M_n(\hat\theta) has the same asymptotic mean and variance as in the known case. However, the test statistic to be used requires the addition of a bias correction term and is: T(\hat\theta) = \frac{M(\hat\theta)+\frac{k}{2}-C_1}{C_2}, where k is the number of parameters in the estimate.
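The quantities in this section can be assembled into a short Python sketch. The function names and the sample are hypothetical; the code computes \mu_M, \sigma^2_M, C_1, C_2 and the statistic T, with the k/2 correction applied when parameters are estimated. The chi-squared critical value itself is not computed here and would have to come from tables or a statistics library.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant

def moran_constants(n):
    """Approximate mean, variance, C_1 and C_2 of Moran's statistic."""
    m = n + 1
    mu = m * (math.log(m) + EULER_GAMMA) - 0.5 - 1.0 / (12.0 * m)
    var = m * (math.pi ** 2 / 6.0 - 1.0) - 0.5 - 1.0 / (6.0 * m)
    c1 = mu - math.sqrt(var * n / 2.0)
    c2 = math.sqrt(var / (2.0 * n))
    return mu, var, c1, c2

def moran_statistic(sample, cdf):
    """M_n = minus the sum of the log spacings (no ties assumed)."""
    xs = sorted(sample)
    u = [0.0] + [cdf(x) for x in xs] + [1.0]
    return -sum(math.log(u[i] - u[i - 1]) for i in range(1, len(u)))

def moran_test_statistic(sample, cdf, n_estimated_params=0):
    """T = (M + k/2 - C_1) / C_2, with k = 0 when the parameter is known."""
    _, _, c1, c2 = moran_constants(len(sample))
    m_n = moran_statistic(sample, cdf)
    return (m_n + n_estimated_params / 2.0 - c1) / c2

sample = [0.1, 0.4, 0.7, 1.2, 2.5]         # made-up data
cdf = lambda x: 1.0 - math.exp(-x)         # hypothesized Exp(1) model
t_known = moran_test_statistic(sample, cdf)
mu, var, c1, c2 = moran_constants(len(sample))
```

The constants are chosen so that C_1 + C_2 \chi^2_n matches the mean \mu_M and variance \sigma^2_M; t_known would be compared with the upper \alpha critical value of a chi-squared distribution with n degrees of freedom.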

Generalized maximum spacing

Alternate measures and spacings

The MSE method has been generalized to approximate other measures besides the Kullback–Leibler measure. It has been further expanded to investigate properties of estimators using higher order spacings, where an m -order spacing would be defined as F(X_{j+m}) - F(X_{j}) .
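A minimal sketch of m-order spacings is given below. It assumes the same padding with F = 0 and F = 1 at the ends of the support as in the first-order definition, which is an assumption made here for illustration; the function name is hypothetical.

```python
import math

def m_order_spacings(sample, cdf, m=1):
    """Overlapping m-order spacings F(x_(j+m)) - F(x_(j)), j = 0..n+1-m."""
    xs = sorted(sample)
    u = [0.0] + [cdf(x) for x in xs] + [1.0]
    if not 1 <= m < len(u):
        raise ValueError("m must satisfy 1 <= m <= n + 1")
    return [u[j + m] - u[j] for j in range(len(u) - m)]

cdf = lambda x: 1.0 - math.exp(-x)   # exponential CDF, rate 1 (assumed)
data = [0.3, 0.9, 1.4, 2.2]          # made-up values
first_order = m_order_spacings(data, cdf, m=1)
second_order = m_order_spacings(data, cdf, m=2)
```

With m = 1 this reproduces the simple spacings D_i; each second-order spacing is the sum of two adjacent first-order spacings.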

Multivariate distributions

Extended maximum spacing methods have also been discussed for the multivariate case. As there is no natural order for \mathbb{R}^k (k > 1), two alternative approaches have been considered: a geometric approach based on Dirichlet cells and a probabilistic approach based on a “nearest neighbor ball” metric.

References

Hall et al. (2004)
Anatolyev & Kosenok (2004)
Pieciak (2014)
Wong & Li (2006)

Wikipedia Source

This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page.
