vocabulary

Statistics of word occurrences in natural and random texts

We study experimentally the statistical distributions that describe the appearance of words in a number of natural texts, as well as in random texts derived from them. It is shown that the probability mass function of the intervals between words is practically the same for the natural and random texts and manifests a fat tail, which is inconsistent with a purely stochastic character of those intervals.
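
The comparison described above can be illustrated with a minimal Python sketch. It rests on two assumptions that the abstract does not confirm: that an "interval" is the distance, counted in tokens, between successive occurrences of the same word, and that a random text is obtained by shuffling the tokens of the natural text. The function names `intervals`, `pmf`, and `shuffled` are hypothetical, and the sample sentence is a toy input, not data from the study.

```python
import random
from collections import Counter


def intervals(tokens, word):
    """Distances (in tokens) between successive occurrences of `word`.

    Assumption: this is what "interval between words" means here.
    """
    positions = [i for i, t in enumerate(tokens) if t == word]
    return [b - a for a, b in zip(positions, positions[1:])]


def pmf(values):
    """Empirical probability mass function as a dict {value: probability}."""
    counts = Counter(values)
    total = sum(counts.values())
    return {k: c / total for k, c in sorted(counts.items())}


def shuffled(tokens, seed=0):
    """Random text built by permuting the tokens.

    Assumption: the source may derive its random texts differently.
    """
    rng = random.Random(seed)
    out = list(tokens)
    rng.shuffle(out)
    return out


# Toy input for illustration only.
text = "the cat saw the dog and the dog saw the cat near the old barn"
tokens = text.split()

natural_pmf = pmf(intervals(tokens, "the"))
random_pmf = pmf(intervals(shuffled(tokens), "the"))

print("natural:", natural_pmf)
print("random: ", random_pmf)
```

On a real corpus, the two distributions would typically be plotted on log-log axes so that any heavy tail is visible; a sentence this short cannot show one.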