Ildikó Barna and Árpád Knap at the 2021 ISCA conference

2020.12.30. Online Antisemitism

Good news. Ildikó Barna and Árpád Knap are among the speakers of the 2021 ISCA conference organized by Indiana University. The “Antisemitism in Today’s America: Manifestations, Causes, and Consequences” conference is one of the most important international forums for antisemitism researchers. Ildikó and Árpád will present a comparative analysis of Hungarian and American conspiracy theories related to György Soros, using NLP analysis of online discourses. We are especially pleased to report that, according to the head of the organizing institute, “Your paper promises to make a particularly interesting contribution to our collective work on contemporary antisemitism.”
https://isca.indiana.edu/

Publication

2020.09.30.

Three members of our research group (Renáta Németh, Eszter Rita Katona, Zoltán Kmetty) recently published an article, which aims to present the characteristics and possibilities of automated text analysis. Their goal is to inspire Hungarian social scientists by providing an insight into a less-institutionalized area, since they believe that at an international level, text mining will be a standard method for empirical social science research within a few years.

RC2S2 presentations at the conference “Sociology at the Dawn of a Successful Century”

2020.09.21.

The conference “Sociology at the Dawn of a Successful Century” will be held on October 8-9 at the Hungarian Academy of Sciences. The sociological applications of NLP will be presented in a dedicated section, led by two co-leaders of our research group, Ildikó Barna and Renáta Németh (and Bence Ságvári, leader of the CSS-Recens research group of the Hungarian Academy of Sciences). Júlia Koltai, member of RC2S2, belongs to the organizing committee of the conference. From RC2S2, Ildikó Barna, Eszter Katona, Zoltán Kmetty, Árpád Knap, Júlia Koltai, Renáta Németh, Márton Rakovics and Domonkos Sik will take part as speakers or co-authors.

Two of our researchers won the grant of the New National Excellence Program

2020.09.14.

Both Eszter Katona and Árpád Knap won the scholarship of the New National Excellence Program (ÚNKP) for the next one-year period. The title of Eszter’s research is “Corruption risk and prediction – Analysis of the texts of public procurement tenders using the tools of natural language processing”. Her supervisor is Mihály Fazekas. According to the hypothesis of Eszter’s research, examining the wording of public procurement tenders can help uncover suspected cases of corruption. The title of Árpád’s research is “Analysis of emotions related to twentieth-century traumas using Natural Language Processing methods”. In his research, he will analyse emotions found in social media and the press related to the two most influential events of twentieth-century Hungarian history: the Treaty of Trianon and the Holocaust. Árpád’s supervisor is Ildikó Barna, the co-leader of our research group.

Success

2020.09.08.

We are pleased to announce that our student, Bernadett Csala-Ferencz (MSc in Survey Statistics and Data Analytics), has won a scholarship from the New National Excellence Program for the new semester. The title of her research is “Exploratory clustering of online posts on depression”. With the support of the program, Bernadett is analysing the posts of English-language online depression forums using natural language processing (NLP) methods within the Research Center for Computational Social Science research group. Her supervisor is Renáta Németh.

Success

2020.08.30.

Our research group has received an outstanding opportunity – we have been awarded the 2020 “OTKA” (Hungarian Scientific Research Fund) research grant for 2020-2023 for our project titled “The layers of the political public sphere in Hungary (2001-2020) – a sociological analysis of the official, media-based and lay online public sphere using automated text analytics and critical discourse analysis”. The Principal Investigator is Renáta Németh.

Project members: Ildikó Barna, Domonkos Sik, Márton Rakovics, Eszter Katona, Árpád Knap (RC2S2, ELTE TáTK) and Péter Csigó (BME).

Research summary

Our research focuses on revealing language change in the political public sphere by applying NLP (Natural Language Processing) methods to a large digital text corpus within a critical discourse analysis framework, which treats language as a tool of ideology and power.

The two highlighted stakes–important for both society and sociology–of the research: (1) The inner workings of the political public sphere on its different levels, the dynamics and interaction between these levels, the exercise of power through language, the expressed ideological polarization, and identification of discourses free of these tendencies. (2) The organic integration of NLP methods into empirical sociology.

Both aspects of the research have international relevance, since the studied phenomena of language polarization and diffusion of usage patterns are not specific to Hungary. The integration of NLP methods into empirical sociology is an emerging topic of great interest because it allows for a new kind of analysis of the large digital corpora at hand, one that draws on sociological knowledge in the process.

All innovative methodological solutions of data collection and analysis will be made publicly available through digital repositories, scientific articles and conference talks to support the international and domestic users of NLP. Senior members of the research group will provide opportunities to join the project by supervising research projects of Scientific Students’ Associations (TDK), master’s theses and Ph.D. topics for young researchers, and research internships or thesis supervision for graduate students.

Significance of the research

The digital revolution is also the revolution of self-expression. Before the internet, textual documents mainly bore the narratives of the elite, but now almost everybody has the opportunity to express themselves online. The primary relevance of our research is that by the automated processing of this continuously forming flow of texts, even those characteristics of the political public sphere can be examined and understood that have previously been only available in local fragments by the observer. Thus both the social and the scientific stakes of our research are high.

From a scientific standpoint, our research opens new perspectives by involving observational data into quantitative research in addition to previously used self-reported survey data, and also by combining qualitative discourse analysis with quantitative methods, employing new and innovative solutions from computational linguistics. This methodological blend is an international novelty. Using NLP on a large-scale text corpus covering different levels of digital communication–to our knowledge–has never been done in Hungary. While there is evidence for strong ideological polarization (e.g. Vegetti 2018), and polarization in the network structure of the political public sphere has been examined (Bene and Szabó 2019), language polarization has not been researched domestically, and there are only a few examples internationally, which are on a much smaller scale of application compared to our own research (see e.g. Demszky et al., 2019, on Twitter data).

Our research has an important stake from a social standpoint as well: the public sphere is one of the cornerstones of modern democracy and serves an important role in preventing potential distortions and crises.

One of the strengths of the proposal is that it is backed by a young but highly successful research group, already with several international publications, doctoral research topics, and a consciously built domestic and international network of collaborations. Besides the compilation and utilization of innovative methods for sociology, the aim of the research group is to foster the institutionalization of the new and promising automated text analysis methods in social sciences.

Publication

2020.08.25.

A new paper entitled Sociologists using machine learning: Hermeneutic limitations of ‘big data’ text analytics if non-trivial concepts are taught has been published in the International Journal of Qualitative Methods (Q1, impact factor 3.6) by Renáta Németh, Domonkos Sik and Fanni Máté. We were pleased to read the reviews, e.g. “The article is the one of the fundamental researches in this area” and “I look forward to seeing future efforts as you proceed with this research”. We hope that we will be able to fulfil the latter, since we already have three international publications under review.

The paper can be accessed at the following URL: https://journals.sagepub.com/doi/full/10.1177/1609406920949338

Further results in this field: https://rc2s2.elte.hu/en/project/discursive-framing-of-depression-in-online-health-communities/

Our research group working on online antisemitism started a new project on coronavirus-related online antisemitism

2020.07.27.

Jews have been accused many times throughout history of deliberately spreading disease among non-Jews. As soon as the coronavirus epidemic broke out, conspiracy theories linking Jews to the virus appeared. The internet is of paramount importance in the spread of these conspiracy theories. In our research, we examine a large text corpus of Hungarian online articles and comments/posts to answer the research question of whether coronavirus-related antisemitic discourses appear in the Hungarian online space, and if so, what their content is. Our corpus contains articles, comments, and posts written in Hungarian between December 1, 2019, and July 10, 2020, in which the different forms of the words Jew, Zionism and Israel appear together with forms of the word coronavirus. Fifteen students from Sociology BA at ELTE University, Faculty of Social Sciences, are participating in the research as interns.
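The corpus selection rule described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual pipeline: the Hungarian word stems, the function name `in_corpus`, and the reading of the rule as "at least one of the group-related words AND a coronavirus word" are all assumptions made for the example.

```python
import re
from datetime import date

# Hypothetical stems; the project's real keyword lists are not given in the text.
GROUP_STEMS = ("zsidó", "cionis", "cioniz", "izrael")
VIRUS_STEMS = ("koronavírus",)

# Collection window stated in the text.
START, END = date(2019, 12, 1), date(2020, 7, 10)


def _stem_pattern(stems):
    # Match any word that begins with one of the stems, so inflected forms are caught.
    alternatives = "|".join(re.escape(stem) for stem in stems)
    return re.compile(r"\b(?:" + alternatives + r")\w*", re.IGNORECASE)


GROUP_RE = _stem_pattern(GROUP_STEMS)
VIRUS_RE = _stem_pattern(VIRUS_STEMS)


def in_corpus(text, published):
    """Return True if the text falls in the window and both keyword groups occur."""
    in_window = START <= published <= END
    return in_window and bool(GROUP_RE.search(text)) and bool(VIRUS_RE.search(text))
```

Matching on stems rather than exact words is a simple way to catch Hungarian inflected forms; a real pipeline would more likely rely on lemmatization.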

Success

2020.07.21.

We are pleased to announce that two theses supervised by members of our research group have been awarded the title of “Thesis of the Year”. Anna Farkas wrote the best dissertation at Sociology BA (supervisor: Renáta Németh), and Jakab Buda at Survey Statistics and Data Analysis MSc (supervisor: Márton Rakovics). Congratulations! The dissertations can be found here, along with other dissertations in Computational Social Science led by members of our research team.

Business Cooperation

2020.06.30.

Inspira Group, a research company, in line with their “data for social good” principle, made it possible for Anna Farkas, a BA student in sociology, to add questions to their online omnibus research, free of charge. The paper, supervised by Renáta Németh, is a case study that investigates gender bias in Google Translate and its translations of occupations from Hungarian (a gender-neutral language) to English (a gender-based language) (the thesis can be accessed via this link). Using quantitative methods, the study aims to measure the extent of gender bias in machine translations. It examines the use of pronouns in the English translation of sentences such as “ő egy orvos” (“he/she is a doctor”). To measure the bias in the algorithm, the study compares Google Translate’s translations to the proportion of men and women in each occupation, and to society’s perception of those occupations. To assess whether people find those occupations feminine or masculine, we used a survey. Inspira assisted in this research: as part of their online omnibus research, they carried out the survey on a representative sample about the perceptions of occupations, using questions provided by Anna Farkas. The study found that Google Translate mirrors people’s perception of occupations to a greater extent than the proportion of men and women in those occupations.
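One simple way to make the comparison described above concrete is an agreement rate: the share of occupations where the translated pronoun matches the gender that a reference measure marks as dominant. The sketch below is a hypothetical illustration, not the measure used in the thesis; the function name `agreement_rate`, the 0.5 threshold and the 0-to-1 "male share" scale are assumptions.

```python
def agreement_rate(pronouns, male_share, threshold=0.5):
    """Share of occupations where the translated pronoun matches the reference.

    pronouns:   dict mapping occupation -> "he" or "she" (pronoun chosen by the translator)
    male_share: dict mapping occupation -> value in [0, 1]; either the proportion of
                men in the occupation or a perceived-masculinity score rescaled to [0, 1]
    threshold:  assumed cut-off above which an occupation counts as male-dominated
    """
    if not pronouns:
        raise ValueError("no occupations given")
    hits = 0
    for occupation, pronoun in pronouns.items():
        reference_is_male = male_share[occupation] > threshold
        # Count a hit when the pronoun and the reference point to the same gender.
        hits += (pronoun == "he") == reference_is_male
    return hits / len(pronouns)
```

Calling the function once with labour statistics and once with survey-based perception scores gives two rates that can be compared directly; the higher one indicates which reference the translations follow more closely.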

Publication

2020.03.18.

Anna Brecsok’s thesis (Survey Statistics and Data Analytics MSc), in which she conducted a survey experiment to investigate a problem she encountered at her workplace, was published in the Hungarian Statistical Review. Anna’s supervisor and the co-author of the paper was Renáta Németh, co-leader of our research group.

Call for papers: Sociology at the Dawn of a Successful Century – conference with NLP section

2020.03.10.

Organizers are now accepting papers for the Sociology at the Dawn of a Successful Century conference, which will take place from 11 to 12 June 2020 at the Hungarian Academy of Sciences, Centre for Social Sciences. At the conference, the sociological applications of NLP will be presented in a separate section. The section was facilitated by the two co-leaders of our research group, Ildikó Barna and Renáta Németh, along with Bence Ságvári, head of the CSS-Recens research group. Registration for the conference is open until March 31st.

Update: The coronavirus situation has made it uncertain whether the conference can be held in June. Regardless, we encourage everyone to submit an abstract by the extended deadline of 15 April, as, at worst, the conference will be held at a later date.

Lecture by Ildikó Barna at Charles University Prague 

2020.02.20.

Ildikó Barna, co-leader of our research group, gave a presentation on contemporary Hungarian antisemitism at the Formal and Applied Linguistics Institute of Charles University Prague. The presentation was based on the Online Antisemitism project conducted with Árpád Knap. In addition to presenting the results of the research so far, in her lecture she also discussed why sociological and domain knowledge is indispensable for interpreting the output of natural language processing.

Further information on the lecture is available on the university’s website. The video recording of the presentation can be accessed via this link.

Research and skills development for TDK students with the participation of our research team

2020.02.19.

Our research group is represented in the NTP-HHTDK grant won by the ELTE TÁTK TDK Workshop. The grant was established to support the Hungarian Scientific Students’ Associations and their events. As part of this, we will be launching a Python-based text analytics course for faculty students during the spring semester, after which they will be able to join our research team for internship positions.

NLP section at the Cyprus conference of ISA’s RC33

2020.01.29.

Our research group’s two leaders, Ildikó Barna and Renáta Németh, have successfully initiated an NLP-related section (Natural Language Processing: a New Tool in the Methodological Tool-Box of Sociology) at the International Sociological Association’s RC33 (Logic and Methodology in Sociology) conference, to be held on 8-11 September 2020. Abstract submissions are welcome until 30 January on the conference website.

Full House and Great Success at our Workshop

2019.10.11.

Full house and great success at the workshop organized by our research group on the new ways of sociological research

The first section, chaired by Sára Simon, dealt with the social scientific application of automated text analysis. In the first lecture, Renáta Németh talked about the aim of the research group and the position of the new methods in the social sciences. Ildikó Barna and Árpád Knap detected various types of antisemitic narratives in antisemitic articles and comments, some of which are not measurable by surveys. Zoltán Kmetty and Júlia Koltai focused on word embedding NLP models; in addition to presenting the methodology, they demonstrated its potential applications for social scientists with concrete examples. Domonkos Sik and Fanni Máté talked about the results of their research on the framing of depression in online forums, including the benefits and difficulties of human annotation. Eszter Katona and Fanni Máté, in the last lecture of the section, detailed the methodology of the supervised learning models used in the same research.

In the second section, Gábor Palkó talked about the analytical possibilities of a semantic prosopographic (collective biography) data network they built, called HECEdata, which can be examined together with the data elements of WikiDATA. Then Balázs Indig argued for the scientific necessity of using web crawlers and introduced the process step by step for applications in the humanities and social sciences.

The third section was chaired by Kinga Szálkai and opened by Nikosz Fokasz. In his presentation, the professor spoke about Hungarian and Greek auto- and heterostereotypes, and networks, based on identities and newspaper articles. The second lecture of the section was given by Pál Susánszky and Márton Gerő on the analysis of the police database on protests in Hungary, which they supplemented. The last presentation of the section, and of the whole workshop, was given by Erzsébet Takács and Flóra Takács. They put the previous presentation into a larger context and discussed the traditions and problems of domestic movement research and the instruments of demonstration.

The presentations (in Hungarian) can be found here:

International Network Researching the European Memory Politics

2019.10.02.

The RC2S2 (Research Center for Computational Social Science founded by the Institute of Empirical Studies of ELTE Faculty of Social Sciences) contributes to an international research cooperation titled “European Memory Politics – Populism, Nationalism and the Challenges to a European Memory Culture” (EuMePo), led by the Canadian University of Victoria (UVic). Other partners are the Université de Strasbourg and the Adam Mickiewicz University (Poznań).

Internship

2019.09.05.

Starting this September, three Survey Statistics and Data Analytics MSc students will be working in conjunction with ELTE BTK Digital Humanities Research Center and Research Center for Computational Social Sciences. They will work under the supervision of Gábor Palkó and Balázs Indig, mainly on the “Webaratás” (Web-plowing) project. During the internship, participants will gain interesting and valuable programming experience. We wish them every success in their work!

Presentation

2019.08.21.

In August 2019, Ildikó Barna and Árpád Knap represented our online antisemitism research group and presented at the European Sociological Association (ESA) conference in Manchester. In their paper, they analysed online antisemitism by scrutinising articles and comments from the far-right Kuruc.info news portal using LDA topic modelling. In their earlier published paper, they dealt with the articles only. In their present paper, they supplemented it with the analysis of comments and also with the comparison of the two. Click here for the presentation.

Achievement

2019.07.01.

Zoltán Kmetty has been awarded the Young Researcher Scholarship by NKFI. The 36-month grant supports an experimental Facebook study that is primarily methodological and focuses on the usefulness of Facebook research.

The website of the research can be found here https://fbpilot.tatk.elte.hu

Business Cooperation

2019.07.01.

An agreement has been made between the Research Center for Computational Social Science and Clementine Hungary. From the autumn semester of 2018/2019, Clementine will hold text analytics courses and offer thesis topics in text analytics. Thesis topics will cover a wide range of issues, including characteristics of communication with robots, changing habits of online media consumption and the Facebook activity of political parties.

Business Cooperation

2019.07.01.

Our research group made an agreement with SentiOne, a social listening and online media monitoring corporation. SentiOne saves, searches and analyses publicly available online texts dating back three years. Our group uses these texts as corpora for several research projects.

Renewed Training Program

2019.07.01.

From the autumn semester of 2019/2020, the Survey Statistics MSc program is running under the name Survey Statistics and Data Analysis MSc. The structure of the course has changed significantly in the last few years with respect to the tools taught and the segments of the labour market targeted. Students are taught big data analytics, methods of network and text analytics, data mining, and the use of several statistical software packages and programming languages in scientific, administrative and business applications, as well as in social research.

Achievement

2019.06.15.

In 2019, 80 students applied for admission to the Survey Statistics and Data Analytics MSc, 50 of whom ranked the program first. The number of applicants is higher than last year, and this makes the program 10th among ELTE’s 163 MSc courses by number of first-ranked applications.

Publication

2019.06.15.

A paper entitled “Antisemitism in Contemporary Hungary: Exploring Topics of Antisemitism in the Far-Right Media Using Natural Language Processing” was published by Ildikó Barna and Árpád Knap in the Theo-Web journal. The paper discusses contemporary online antisemitism in Hungary by analysing articles of the far-right website Kuruc.info. The paper can be accessed by clicking here.

Presentation

2019.05.14.

Eszter Katona held a presentation entitled ‘Natural Language Processing in Social Sciences’ on 10 May 2019 in Basel, at the Joint Annual Conference of the GPSA Methods of Political Science Section and the SPSA Empirical Methodology Working Group (https://www.methodology-dvpw-svpw.com/). She presented the results of the paper she wrote together with Renáta Németh and Zoltán Kmetty. The study is currently under review in a Hungarian sociology journal.

Lecture

2019.05.09.

Zoltán Kmetty and Júlia Koltai gave a presentation entitled ‘Understanding Cultural Choices with NLP’ on 9 May 2019 at the Budapest Data Science Meetup, discussing the role of natural language processing (NLP), including neural network-based word embedding models, in better understanding human behaviour and culture. The abstract of the presentation is available here: https://www.meetup.com/budapest_data_science/events/260865653/