character encoding

Technical Aspects of Computer Processing of Natural-Language Information

The article examines technical problems in the computer processing of natural-language information that arise from the coexistence of multiple character encoding standards and from users' non-compliance with spelling and punctuation rules. It substantiates the need for preliminary technical processing of such texts before they are used in scientific research or in information systems.
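
As a rough illustration of the encoding problem, the following minimal sketch decodes one byte sequence under several candidate encodings. The sample word, the helper name `decode_with`, and the choice of Windows-1251, KOI8-U, and UTF-8 are assumptions made for this example and are not taken from the article.

```python
# Illustrative sketch only: the sample word and the encodings compared
# are assumptions, not material from the article.
SAMPLE = "мова"  # Ukrainian for "language"


def decode_with(raw, encodings):
    """Decode the same bytes under each candidate encoding (hypothetical helper)."""
    results = {}
    for name in encodings:
        # errors="replace" substitutes U+FFFD for byte sequences that are
        # invalid in the given encoding instead of raising an exception.
        results[name] = raw.decode(name, errors="replace")
    return results


# Store the word as Windows-1251 bytes, then read it back three ways.
raw = SAMPLE.encode("cp1251")
readings = decode_with(raw, ["cp1251", "koi8_u", "utf-8"])

for name, text in readings.items():
    status = "matches original" if text == SAMPLE else "garbled"
    print(f"{name}: {text!r} ({status})")
```

Only the reading that uses the encoding the text was actually stored in recovers the original word. This is one reason a text collected from mixed sources may need its encoding identified and normalized before further processing.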