Nevertheless, over fairly wide ranges, and to a fairly good approximation, many natural phenomena obey Zipf's law. In human languages, word frequencies have a very heavy-tailed distribution, and can therefore be modeled reasonably well by a Zipf distribution with an s close to 1.
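As a minimal sketch of what a Zipf distribution with s close to 1 looks like numerically, the snippet below builds the probabilities for a finite set of ranked items. The function name `zipf_pmf` and the choice of 1000 items are illustrative assumptions, not something taken from the text above.

```python
def zipf_pmf(n_items, s=1.0):
    """Return [p(1), ..., p(n_items)] with p(k) proportional to 1 / k**s."""
    weights = [1.0 / k ** s for k in range(1, n_items + 1)]
    total = sum(weights)  # normalizing sum, so that the probabilities add up to 1
    return [w / total for w in weights]


pmf = zipf_pmf(1000, s=1.0)

# With s = 1, the top-ranked item is twice as likely as the second
# and three times as likely as the third.
ratio_1_2 = pmf[0] / pmf[1]
ratio_1_3 = pmf[0] / pmf[2]
```

Raising s above 1 makes the head of the distribution heavier relative to the tail; lowering it flattens the curve.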
Wentian Li has shown that in a document in which each character has been chosen randomly from a uniform distribution of all letters (plus a space character), the "words" follow the general trend of Zipf's law, appearing approximately linear on a log-log plot. Belevitch took a large class of well-behaved statistical distributions (not only the normal distribution) and expressed them in terms of rank.
He then expanded each expression into a Taylor series. In every case Belevitch obtained the remarkable result that a first-order truncation of the series resulted in Zipf's law. Further, a second-order truncation of the Taylor series resulted in Mandelbrot's law. The principle of least effort is another possible explanation: Zipf himself proposed that neither speakers nor hearers using a given language want to work any harder than necessary to reach understanding, and the process that results in approximately equal distribution of effort leads to the observed Zipf distribution.
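The random-typing observation attributed to Wentian Li above can be explored with a small simulation. This is only a sketch under stated assumptions: a five-letter alphabet, a fixed seed, and a fit restricted to "words" seen at least `min_count` times (to keep the noisy tail of rare words out of the fit) are all choices made here for illustration, and the helper names are hypothetical.

```python
import math
import random
from collections import Counter


def random_typing_counts(n_chars, alphabet="abcde", seed=0):
    """Type n_chars symbols uniformly at random and count the resulting 'words'."""
    rng = random.Random(seed)
    symbols = alphabet + " "  # every letter and the space are equally likely
    text = "".join(rng.choice(symbols) for _ in range(n_chars))
    return Counter(text.split())  # split() drops empty words between repeated spaces


def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


counts = random_typing_counts(200_000)
min_count = 5  # assumption: ignore very rare words when fitting
freqs = sorted((c for c in counts.values() if c >= min_count), reverse=True)
slope = loglog_slope(freqs)
```

A clearly negative slope on the log-log scale is what the approximately linear trend described above would lead one to expect; the rank-frequency curve of such random text is step-like, so the fitted value depends on the alphabet size and on where the tail is cut off.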
Similarly, preferential attachment (intuitively, "the rich get richer" or "success breeds success"), which results in the Yule–Simon distribution, has been shown to fit word frequency versus rank in language [16] and population versus city rank [17] better than Zipf's law.
It was originally derived to explain population versus rank in species by Yule, and applied to cities by Simon. Indeed, Zipf's law is sometimes synonymous with "zeta distribution," since probability distributions are sometimes called "laws".
This distribution is sometimes called the Zipfian distribution. The "constant" is the reciprocal of the Hurwitz zeta function evaluated at s. In practice, as is easily observable in distribution plots for large corpora, the observed distribution can be modelled more accurately as a sum of separate distributions for different subsets or subtypes of words that follow different parameterizations of the Zipf–Mandelbrot distribution. In particular, the closed class of functional words exhibits s lower than 1, while open-ended vocabulary growth with document size and corpus size requires s greater than 1 for convergence of the generalized harmonic series.
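For reference, the Zipf–Mandelbrot distribution mentioned above is usually written in the following form. The symbols k (rank), N (number of ranked items) and q (a shift parameter) are the conventional notation and are assumptions here, since only s appears in the text.

```latex
f(k; N, q, s) = \frac{1}{H_{N,q,s}} \cdot \frac{1}{(k + q)^{s}},
\qquad
H_{N,q,s} = \sum_{i=1}^{N} \frac{1}{(i + q)^{s}}
% Setting q = 0 recovers the plain Zipf form 1 / (k^s H_{N,s}).
% For s > 1 the sum converges as N grows without bound:
\lim_{N \to \infty} H_{N,q,s} = \sum_{i=1}^{\infty} \frac{1}{(i + q)^{s}} = \zeta(s, q + 1)
```

Here ζ(s, a) denotes the Hurwitz zeta function in the convention ζ(s, a) = Σ_{n≥0} (n + a)^(−s), which is why the normalizing constant in the unbounded case is the reciprocal of a Hurwitz zeta value, and why s greater than 1 is needed for convergence.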
Zipfian distributions can be obtained from Pareto distributions by an exchange of variables. The Zipf distribution is sometimes called the discrete Pareto distribution [18] because it is analogous to the continuous Pareto distribution in the same way that the discrete uniform distribution is analogous to the continuous uniform distribution.
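The exchange of variables can be sketched as follows, assuming a Pareto tail with scale x_m and shape α and a sample of n items (these symbols are not in the text above and are introduced only for illustration). The rank of an item is roughly the number of items at least as large as it.

```latex
% Pareto tail: the probability that an item exceeds size x
P(X > x) = \left( \frac{x_m}{x} \right)^{\alpha}, \qquad x \ge x_m

% Among n items, the rank r of an item of size x is about n P(X > x):
r \approx n \left( \frac{x_m}{x} \right)^{\alpha}

% Solving for x as a function of rank gives a power law in r:
x \approx x_m \left( \frac{n}{r} \right)^{1/\alpha} \propto r^{-1/\alpha}
```

Under these assumptions, size falls off with rank as a power law with exponent s = 1/α, so a Pareto shape α close to 1 corresponds to a Zipf exponent close to 1.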
The tail frequencies of the Yule–Simon distribution are approximately f(k; ρ) ≈ [constant] / k^(ρ+1) for any choice of ρ > 0. In the parabolic fractal distribution, the logarithm of the frequency is a quadratic polynomial of the logarithm of the rank. This can markedly improve the fit over a simple power-law relationship. It has been argued that Benford's law is a special bounded case of Zipf's law,[19] with the connection between these two laws being explained by their both originating from scale-invariant functional relations from statistical physics and critical phenomena.
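The parabolic fractal idea described above, where log-frequency is a quadratic in log-rank, can be illustrated with an ordinary least-squares fit. The sketch below uses only the standard library; the helper names, the synthetic coefficients and the 200-item sample are assumptions made for the example, and a numerical library would normally be used instead of the hand-written solver.

```python
import math


def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - tail) / m[r][r]
    return x


def fit_parabolic_fractal(frequencies):
    """Fit log f = c0 + c1*log(rank) + c2*log(rank)**2 by least squares."""
    xs = [math.log(r) for r in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    power_sums = [sum(x ** p for x in xs) for p in range(5)]
    normal = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    return solve_3x3(normal, rhs)


# Synthetic frequencies generated from known (made-up) coefficients.
true_c0, true_c1, true_c2 = 10.0, -0.8, -0.05
data = [
    math.exp(true_c0 + true_c1 * math.log(r) + true_c2 * math.log(r) ** 2)
    for r in range(1, 201)
]
c0, c1, c2 = fit_parabolic_fractal(data)
```

A fitted c2 near zero would indicate that a simple power law is adequate; a clearly non-zero c2 captures the curvature that a straight line on a log-log plot misses.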
Hence, Zipf law for natural numbers:
Zipf's law has also been used to extract parallel fragments of text from comparable corpora.

Power-Law Distributions in Empirical Data. SIAM Review, 51(4).
Human Behavior and the Principle of Least Effort.
Univariate Discrete Distributions (second ed.).