Our study examines the use and potential applicability of Natural Language Processing (NLP) in corruption research. In this review, we collect and summarize corruption research published after 2000 that relies on automated text analytics, focusing on the prevalence and potential of NLP methods. We found significant differences in the textual data sources, the corruption measurement methods, and the analytical approaches used.
However, few studies were of a mixed type, that is, combined more than one data source, analytical method, or corruption measurement method. In addition to the classic works describing the volume of corruption or the attitudes and perceptions related to it, we found results that can be used to prevent corruption and may even be directly suitable for intervention. NLP has been used in only a few studies, and mostly for narrow technical tasks, so it is not yet widespread in this area. Nevertheless, the reviewed studies indicate that it is useful and could support traditional quantitative research as an alternative tool. The aim of our article is to inspire the use of NLP in the social sciences and to draw attention to how it can be embedded in existing scientific discourses.