Our members received ÚNKP research grants


We have just received the news that three junior members of our research group have been awarded research grants by the New National Excellence Program (ÚNKP). Eszter Katona (The hidden barriers of open competition – using text mining to explore barriers to competition in public procurement, Supervisors: Renáta Németh and Mihály Fazekas), Zsófi Rakovics (Examining the linguistic and political polarisation of parliamentary speeches, Supervisor: Domonkos Sik) and Emese Tímea Tóth (Narrative possibilities framed by political polarisation – sustainability in online media platforms, Supervisor: Renáta Németh) received funding for their work for the next academic year. The topics of Zsófi and Mesi are related to our OTKA research.

Congratulations, Eszti, Zsófi, Mesi!

Text. Machine. Society.

2022.07.22. Data Science in Social Research Digital Lens Discursive framing of depression in online health communities The layers of political public sphere in Hungary (2001–2020)

Capturing social relations through computational analysis of textual data


Our research group is organizing a workshop on 20 September. Over the past four years, we have been supported by several national and international grants, and the workshop aims to present our current research results.

The workshop also aims to showcase linguistically grounded social research from other fields, presented by an invited representative from Hungary. We hope that these presentations will inspire us to broaden the toolkit that comes from computer science, promote interdisciplinarity, and help produce substantively valid results.



The work of our PhD researcher, Árpád Knap, was selected among the best of the New National Excellence Programme (ÚNKP) 2020/21 cycle at ELTE, based on his final report and presentation. In his research entitled “Analysing emotions related to twentieth-century traumas using text analytics methods”, Árpád analysed newspaper articles related to two major events of the twentieth century, the Treaty of Trianon and the Holocaust, using natural language processing methods. A summary video of his research can be viewed here. Based on the research, two academic papers have been written: one in English, co-authored with Ildikó Barna, and one in Hungarian with Diána Bartha and Ildikó Barna. Both are expected to be published soon and will be announced on our website.

International Sociological Association’s RC33 conference


International Sociological Association’s RC33 conference, 7-10 September 2021, online via Teams.

Session 12: Natural Language Processing: a New Tool in the Methodological Tool-Box of Sociology.

Presenters and co-authors from our research group: Ildikó Barna, Eszter Katona, Árpád Knap, Renáta Németh, Márton Rakovics, Domonkos Sik. Session chair: Ildikó Barna & Renáta Németh.

Attendees need to be RC33 members in order to participate in the conference (see www.rc33.org on how to become a member). Other than that, the conference will be free of charge! All attendees need to send an email to Inga Gaizauskaite (inga.gaizauskaite@lstc.lt), expressing their willingness to participate in the online conference.

Our first Digital Lens kick-off meeting


We’ll be working with the http://degob.org database of testimonies from Hungarian Holocaust survivors from the immediate postwar period. Incredibly, some typed protocols have only recently been discovered… With our research, we will strive to find more of these hidden treasures. Meanwhile, we’ll be analyzing the available material with automated quantitative and qualitative text analytics methods.

We’ve won a Research Grant – Digital Lens


With Ildikó Barna as project leader, our project “Revisiting Early Testimonies of Hungarian Jewish Holocaust Survivors through a Digital Lens” has been awarded a research grant for the next three years. In our interdisciplinary project, we will use Natural Language Processing to analyze the digitized database of Hungarian Jewish Holocaust survivors’ immediate post-war testimonies, recorded by the National Committee for Attending Deportees (DEGOB).

Ildikó Barna and Árpád Knap at the 2021 ISCA conference

2020.12.30. Online Antisemitism

Good news. Ildikó Barna and Árpád Knap are among the speakers of the 2021 ISCA conference organized by Indiana University. The “Antisemitism in Today’s America: Manifestations, Causes, and Consequences” conference is one of the most important international forums for antisemitism researchers. Ildikó and Árpád will present a comparative analysis of Hungarian and American conspiracy theories related to György Soros, using NLP analysis of online discourses. It is a special pleasure for us to say that, according to the head of the organizing institute, “Your paper promises to make a particularly interesting contribution to our collective work on contemporary antisemitism.”



Three members of our research group (Renáta Németh, Eszter Rita Katona, Zoltán Kmetty) recently published an article, which aims to present the characteristics and possibilities of automated text analysis. Their goal is to inspire Hungarian social scientists by providing an insight into a less-institutionalized area, since they believe that at an international level, text mining will be a standard method for empirical social science research within a few years.

RC2S2 presentations at the conference “Sociology at the Dawn of a Successful Century”


The conference “Sociology at the Dawn of a Successful Century” will be held on October 8-9th at the Hungarian Academy of Sciences. The sociological applications of NLP will be presented in their own section, led by two co-leaders of our research group, Ildikó Barna and Renáta Németh (and Bence Ságvári, leader of the CSS-Recens research group of the Hungarian Academy of Sciences). Júlia Koltai, member of RC2S2, belongs to the organizing committee of the conference. From RC2S2, Ildikó Barna, Eszter Katona, Zoltán Kmetty, Árpád Knap, Júlia Koltai, Renáta Németh, Márton Rakovics and Domonkos Sik will take part as speakers or co-authors.

Two of our researchers won the grant of the New National Excellence Program


Both Eszter Katona and Árpád Knap won the scholarship of the New National Excellence Program (ÚNKP) for the next one-year period. The title of Eszter’s research is Corruption risk and prediction – Analysis of the texts of public procurement tenders using the tools of natural language processing. Her supervisor is Mihály Fazekas. According to the hypothesis of Eszter’s research, examining the wording of public procurement tenders can help uncover suspected cases of corruption. The title of Árpád’s research is Analysis of emotions related to twentieth-century traumas using Natural Language Processing methods. In his research, he will analyse emotions found in social media and the press related to the two most influential events of twentieth-century Hungarian history: the Treaty of Trianon and the Holocaust. Árpád’s supervisor is Ildikó Barna, the co-leader of our research group.



We are pleased to announce that our student, Bernadett Csala-Ferencz (MSc in Survey Statistics and Data Analytics), has won a scholarship from the New National Excellence Program for the new semester. The title of her research is: Exploratory clustering of online posts on depression. With the support of the program, Bernadett is analysing the posts of English-language online depression forums using natural language processing (NLP) methods within the Research Center for Computational Social Science research group. Her supervisor is Renáta Németh.



Our research group has received an outstanding opportunity – we have been awarded the 2020 “OTKA” (Hungarian Scientific Research Fund) research grant for 2020-2023 for our project titled “The layers of the political public sphere in Hungary (2001-2020) – a sociological analysis of the official, media-based and lay online public sphere using automated text analytics and critical discourse analysis”. The Principal Investigator is Renáta Németh.

Project members: Ildikó Barna, Domonkos Sik, Márton Rakovics, Eszter Katona, Árpád Knap (RC2S2, ELTE TáTK) and Péter Csigó (BME).

Research summary

Our research focuses on revealing language change in the political public sphere by applying NLP (Natural Language Processing) methods to a large digital text corpus within a critical discourse analysis framework, which treats language as a tool of ideology and power.

The research has two main stakes, both important for society and for sociology: (1) Understanding the inner workings of the political public sphere at its different levels, the dynamics and interaction between these levels, the exercise of power through language, the ideological polarization expressed in it, and the identification of discourses free of these tendencies. (2) The organic integration of NLP methods into empirical sociology.

Both aspects of the research have international relevance, since the studied phenomena of language polarization and diffusion of usage patterns are not specific to Hungary. The integration of NLP methods into empirical sociology is an emerging topic of great interest because it allows a new kind of analysis of the large digital corpora at hand, one that draws on sociological knowledge in the process.

All innovative methodological solutions for data collection and analysis will be made publicly available through digital repositories, scientific articles and conference talks to support international and domestic users of NLP. Senior members of the research group will provide opportunities to join the project by supervising Scientific Students’ Association (TDK) research, master’s theses and PhD topics for young researchers, and by offering research internships or thesis supervision for graduate students.

Significance of the research

The digital revolution is also the revolution of self-expression. Before the internet, textual documents mainly bore the narratives of the elite, but now almost everybody has the opportunity to express themselves online. The primary relevance of our research is that, by automatically processing this continuously forming flow of texts, we can examine and understand characteristics of the political public sphere that were previously available to the observer only in local fragments. Thus both the social and the scientific stakes of our research are high.

From a scientific standpoint, our research opens new perspectives by bringing observational data into quantitative research alongside the previously used self-reported survey data, and by combining qualitative discourse analysis with quantitative methods, employing new and innovative solutions from computational linguistics. This methodological blend is an international novelty. Using NLP on a large-scale text corpus covering different levels of digital communication has, to our knowledge, never been done in Hungary. While there is evidence for strong ideological polarization (e.g. Vegetti 2018), and polarization in the network structure of the political public sphere has been examined (Bene and Szabó 2019), language polarization has not been researched domestically, and there are only a few international examples, all on a much smaller scale than our own research (see e.g. Demszky et al., 2019, on Twitter data).

Our research has an important stake from a social standpoint as well: the public sphere is one of the cornerstones of modern democracy and serves an important role in preventing potential distortions and crises.

One of the strengths of the proposal is that it is backed by a young but highly successful research group, already with several international publications, doctoral research topics, and a consciously built domestic and international network of collaborations. Besides the compilation and utilization of innovative methods for sociology, the aim of the research group is to foster the institutionalization of the new and promising automated text analysis methods in social sciences.



A new paper entitled Sociologists using machine learning: Hermeneutic limitations of ‘big data’ text analytics if non-trivial concepts are taught has been published in the International Journal of Qualitative Methods (Q1, impact factor 3.6) by Renáta Németh, Domonkos Sik and Fanni Máté. We were pleased to read the reviews, e.g. “The article is the one of the fundamental researches in this area” and “I look forward to seeing future efforts as you proceed with this research”. We hope that we will be able to fulfil the latter, since we already have three international publications under review.

The paper can be accessed at the following URL: https://journals.sagepub.com/doi/full/10.1177/1609406920949338

Further results in this field: https://rc2s2.elte.hu/en/project/discursive-framing-of-depression-in-online-health-communities/

Our research group working on online antisemitism started a new project on coronavirus-related online antisemitism


Jews have been accused many times throughout history of deliberately spreading disease among non-Jews. With the outbreak of the coronavirus epidemic, conspiracy theories linking Jews to the virus appeared at once. The internet is of paramount importance in the distribution of these conspiracy theories. In our research, we examine a large text corpus of Hungarian online articles and comments/posts to answer the research question of whether coronavirus-related antisemitic discourses appear in the Hungarian online space, and if so, what their content is. Our corpus contains articles, comments, and posts written in Hungarian between December 1, 2019, and July 10, 2020, in which forms of the words Jew, Zionism, and Israel co-occur with forms of the word coronavirus. Fifteen students from the Sociology BA programme at the ELTE Faculty of Social Sciences are participating in the research as interns.
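The corpus selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stem patterns below are hypothetical stand-ins (the project's actual keyword list is not given here), and we assume that a match on any one of the Jew/Zionism/Israel terms is enough, provided a form of "coronavirus" appears in the same text.

```python
import re
from datetime import date

# Hypothetical Hungarian stem patterns; the real keyword list is not published in this post.
# Assumption: any ONE of the first-group terms suffices.
GROUP_TERMS = re.compile(r"\b(zsidó|cionis|cioniz|izrael)\w*", re.IGNORECASE)
VIRUS_TERMS = re.compile(r"\bkoronavírus\w*", re.IGNORECASE)

# Collection window stated in the post (inclusive on both ends, by assumption).
START, END = date(2019, 12, 1), date(2020, 7, 10)


def in_corpus(text: str, published: date) -> bool:
    """Return True if the text falls in the window and both term groups occur in it."""
    if not (START <= published <= END):
        return False
    return bool(GROUP_TERMS.search(text)) and bool(VIRUS_TERMS.search(text))
```

Matching on stems followed by `\w*` is a cheap way to catch inflected Hungarian forms; a lemmatizer would be more precise but is outside the scope of this sketch.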



We are pleased to announce that two theses supervised by members of our research group have been awarded the title of “Thesis of the Year”. Anna Farkas wrote the best thesis in the Sociology BA programme (supervisor: Renáta Németh), and Jakab Buda in the Survey Statistics and Data Analytics MSc programme (supervisor: Márton Rakovics). Congratulations! The theses can be found here, along with other theses in Computational Social Science supervised by members of our research team.

Business Cooperation


Inspira Group, a research company, in line with their “data for social good” principle, made it possible for Anna Farkas, a BA student in sociology, to add questions to their online omnibus research free of charge. Her thesis, supervised by Renáta Németh, is a case study that investigates gender bias in Google Translate and its translations of occupations from Hungarian (a gender-neutral language) to English (a gendered language) (the thesis can be accessed on this link). Using quantitative methods, the study aims to measure the extent of gender bias in machine translations. It examines the use of pronouns in the English translation of sentences such as “ő egy orvos” (“he/she is a doctor”). To measure the bias in the algorithm, the study compares Google Translate’s translations to the proportion of men and women in each occupation, and to society’s perception of those occupations. A survey was used to assess whether people find those occupations feminine or masculine. Inspira assisted in this research: as part of their online omnibus research, they carried out the survey on a representative sample about the perceptions of occupations, using questions provided by Anna Farkas. The study found that Google Translate mirrors people’s perception of occupations to a greater extent than the proportion of men and women in those occupations.
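The comparison described above can be illustrated with a small Python sketch. It is not the thesis's actual procedure: we assume, for illustration only, that each occupation is coded 1.0 when the translation uses "he" and 0.0 when it uses "she", and that agreement with each benchmark is summarised by a Pearson correlation. The function and parameter names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (both must have non-zero variance)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def compare_benchmarks(translated_male, share_men, perceived_male):
    """Correlate the translation's pronoun choice with two benchmarks, one value per occupation.

    translated_male: 1.0 if the translation used "he", 0.0 if "she" (assumed coding)
    share_men:       proportion of men working in the occupation
    perceived_male:  survey-based share of respondents who see the occupation as masculine
    """
    return {
        "labour_statistics": pearson(translated_male, share_men),
        "perception": pearson(translated_male, perceived_male),
    }
```

A higher value under `"perception"` than under `"labour_statistics"` would correspond to the pattern the study reports, namely that the translations track perceptions more closely than actual employment proportions.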



Anna Brecsok’s thesis (Survey Statistics and Data Analytics MSc), in which she conducted a survey experiment to investigate a problem she encountered at her workplace, was published in the Hungarian Statistical Review. Anna’s supervisor and co-author of the paper was Renáta Németh, co-leader of our research group.

Call for papers: Sociology at the Dawn of a Successful Century – conference with NLP section


Organizers are now accepting papers for the Sociology at the Dawn of a Successful Century conference, which will take place from 11 to 12 June 2020 at the Hungarian Academy of Sciences, Centre for Social Sciences. At the conference, the sociological applications of NLP will be presented in a separate section. The section was facilitated by the two co-leaders of our research group, Ildikó Barna and Renáta Németh, along with Bence Ságvári, head of the CSS-Recens research group. Registration for the conference is open until March 31st.

Update: The coronavirus situation has made it uncertain whether the conference can be held in June. Regardless, we encourage everyone to submit an abstract by the extended deadline of 15 April, as the conference will, at worst, be held at a later date.

Lecture by Ildikó Barna at Charles University Prague 


Ildikó Barna, co-leader of our research group, gave a presentation on contemporary Hungarian antisemitism at the Institute of Formal and Applied Linguistics of Charles University, Prague. The presentation was based on the Online Antisemitism project conducted with Árpád Knap. In addition to presenting the results of the research so far, she also discussed why sociological and domain knowledge is indispensable for interpreting the output of natural language processing.

Further information about the lecture is available on the university’s website. The video recording of the presentation can be accessed via this link.

Research and skills development for TDK students with the participation of our research team


Our research group is represented in the NTP-HHTDK grant won by the ELTE TÁTK TDK Workshop. The grant was established to support the Hungarian Scientific Students’ Associations and their events. As part of this, we will be launching a Python-based text analytics course for faculty students during the spring semester, after which they will be able to join our research team for internship positions.

NLP section at the Cyprus conference of ISA’s RC33


Our research group’s two leaders, Ildikó Barna and Renáta Németh, have successfully initiated an NLP-related section (Natural Language Processing: a New Tool in the Methodological Tool-Box of Sociology) at the International Sociological Association’s RC33 (Logic and Methodology in Sociology) conference, to be held on 8-11 September 2020. Abstract submissions are welcome until 30 January on the conference website.