Corruption in Online Editorial Media

In recent years, members of our research group have published several studies on corruption in leading Hungarian and international journals. These studies were based on survey data. Turning to non-survey-based methods, our research team conducted two case studies in 2018-2019 that apply NLP methods to corruption research. The first case study uses the author-topic model. Using the corpus collected by K-Monitor, we identified 25 corruption topics and analysed how different websites thematized corruption at different times.

In the second case study, we focused on how corruption topics change over time, again on Hungarian online news sites. We applied a dynamic topic model to the K-Monitor corpus. Based on 26,000 articles, we analysed changes in the popularity and content of typical corruption topics over the period 2007-2018. The model yielded seven well-separated topics. Our study is currently under review in a leading Hungarian sociology journal.

Our previous studies are mainly descriptive and can serve as a basis for further research. In addition to the empirical analysis, we systematically address the question of what NLP methods can contribute to corruption research.

We examine the frameworks used to define corruption, the possibilities for automated processing of large volumes of text in corruption research, and the data analysis and data processing technologies built on them.

As part of the educational activities related to the project, K-Monitor also provided a corpus for a data hackathon for students, organized together with K-Monitor and Precognox. Students could use it to analyse the same data that our research team worked with.

Related Results

Eszter Katona, Zoltán Kmetty, Renáta Németh (2021): Applying natural language processing to analyse the representation of corruption in the Hungarian online media

2021.07.19. Publication

This paper presents a thematic analysis of the representation of corruption in the Hungarian online media, using a text mining tool called dynamic topic modeling. The text corpus was provided by K-Monitor and includes online articles on corruption and issues related to the misuse of public funds. Our study is [...]

Eszter Katona, Renáta Németh (2021): Automated text analytics in corruption research

2021.05.22. Publication

Our study examines the use and possible applicability of Natural Language Processing (NLP) in corruption research. In our review, we aim to collect and summarize corruption research based on automated text analytics and published after 2000. We focus on the prevalence and potential of NLP methods. We found significant differences in the textual [...]

Related publications

2019.07.01. Publication

Kostadinova, Tatiana; Kmetty, Zoltán: Corruption and Political Participation in Hungary: Testing Models of Civic Engagement. EAST EUROPEAN POLITICS AND SOCIETIES Online first p. 1 (2019)

Kmetty, Zoltán: Incumbent party support and perceptions of corruption – an experimental study. SZOCIOLÓGIAI SZEMLE 28 : 4 pp. 152-165., 14 p. (2018)

Kmetty, Zoltán: Korrupció percepciója, [...]

Previous publications

2019.07.01. Publication

Eszter Katona's presentation

Eszter Katona gave a presentation entitled ‘Natural Language Processing in Social Sciences’ on 10 May 2019 in Basel, at the Joint Annual Conference of the GPSA Methods of Political Science Section and the SPSA Empirical Methodology Working Group. She presented the results of the paper she wrote [...]
