Institute of Sociology
of the Federal Center of Theoretical and Applied Sociology
of the Russian Academy of Sciences

Kornienko A.V. The national corpus of the Russian language as a source base for social and humanitarian research. St. Petersburg Sociology Today. 2023. No. 21. P. 58-71. DOI: 10.25990/socinstras.pss-21.8sqp-bb68
ISSN 2308-3166
DOI 10.25990/socinstras.pss-21.8sqp-bb68
RSCI (Russian Science Citation Index): https://elibrary.ru/contents.asp?id=55823513

Posted on site: 07.01.23

Text of the article on the journal website. URL: https://pitersociology.ru/ru/node/905 (accessed 07.01.2024)


Abstract

The title of the article directly reflects its purpose: to present the National Corpus of the Russian Language (NCRL, the Corpus) as an information base for social and humanitarian research. The Corpus currently contains over two billion word usages and is an information and reference system built on a collection of Russian-language texts created from the beginning of the 18th century to 2010, representing the Russian language within that period. The NCRL reflects the full variety of genres, styles, and social and territorial variants of the Russian language, covering fiction and scientific literature, essays, journalism, public speaking, etc. Texts from the second half of the 20th and the early 21st centuries are represented in the Corpus most fully and diversely. The article highlights two main areas of current research that use the NCRL both as a source of empirical information and as a research tool. The first has a distinctly linguistic orientation: the Corpus is used in teaching Russian and foreign languages, in the analysis of dialects and sociolects, and in the theory and practice of translation. The Corpus also makes it possible to track changes in the language norm and other language parameters over a given period. The second area comprises studies of a clearly social profile, concerned with diagnosing social consciousness and its transformations. The NCRL's large data arrays, which run to thousands and tens of thousands of units of analysis and are accompanied by the necessary statistics, make it possible to reliably verify conclusions obtained in sociological surveys, discourse studies, and associative experiments performed on fairly limited samples.
Standalone queries to the Corpus are also of indisputable value: they provide important information about the attitudes of bearers of Russian linguistic culture toward various social institutions, processes, phenomena and subjects. The article gives examples of such studies concerning a number of key political concepts, such as officials, law enforcement agencies, corruption, politicians, the elite and compatriots.