In line with its “data for social good” principle, the research company Inspira Group made it possible for Anna Farkas, a BA student in sociology, to add questions to its online omnibus survey free of charge. The paper, supervised by Renáta Németh, is a case study that investigates gender bias in Google Translate’s translations of occupations from Hungarian (a language without grammatical gender) to English (a language with gendered pronouns). The thesis can be accessed on this link. Using quantitative methods, the study aims to measure the extent of gender bias in machine translations. It examines which pronoun appears in the English translation of sentences such as “ő egy orvos” (“he/she is a doctor”). To measure the bias in the algorithm, the study compares Google Translate’s translations to the proportion of men and women in each occupation, and to society’s perception of those occupations. A survey was used to assess whether people find those occupations feminine or masculine. Inspira assisted in this research: as part of its online omnibus research, it carried out the survey on perceptions of occupations on a representative sample, using questions provided by Anna Farkas. The study found that Google Translate mirrors people’s perception of occupations to a greater extent than the proportion of men and women in those occupations.
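The comparison described above can be illustrated with a minimal sketch. The source does not say which statistic the study used, so the agreement rate below is an assumption, and every occupation name and number is a made-up placeholder rather than a result from the thesis or the survey.

```python
from typing import NamedTuple


class Occupation(NamedTuple):
    name: str                  # hypothetical occupation label
    pronoun: str               # "he" or "she", as chosen by the translator
    share_women: float         # share of women working in the occupation
    perceived_feminine: float  # share of respondents rating it feminine


def agreement_rate(rows, attribute):
    """Share of occupations where the translated pronoun matches the
    majority gender implied by `attribute` (threshold 0.5, an assumption)."""
    hits = sum(
        (row.pronoun == "she") == (getattr(row, attribute) > 0.5)
        for row in rows
    )
    return hits / len(rows)


# Placeholder values for illustration only; not data from the study.
SAMPLE = [
    Occupation("occupation_a", "he", 0.60, 0.20),
    Occupation("occupation_b", "she", 0.70, 0.80),
    Occupation("occupation_c", "he", 0.30, 0.25),
    Occupation("occupation_d", "she", 0.45, 0.65),
]

print("agreement with employment shares:", agreement_rate(SAMPLE, "share_women"))
print("agreement with perceptions:", agreement_rate(SAMPLE, "perceived_feminine"))
```

With these placeholder rows the pronouns agree with perceptions more often than with employment shares, which is the kind of pattern the study reports. A real analysis would use the observed translations, labour statistics and survey responses.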
Publication
2020.03.18. Anna Brecsok’s thesis (Survey Statistics and Data Analytics MSc), in which she conducted a survey experiment to investigate a problem she encountered at her workplace, was published in the Hungarian Statistical Review. Anna’s supervisor and the co-author of the paper was Renáta Németh, co-leader of our research group.
Call for papers: Sociology at the Dawn of a Successful Century – conference with NLP section
2020.03.10. Organizers are now accepting papers for the Sociology at the Dawn of a Successful Century conference, which will take place from 11 to 12 June 2020 at the Hungarian Academy of Sciences, Centre for Social Sciences. At the conference, the sociological applications of NLP will be presented in a separate section. The section was facilitated by the two co-leaders of our research group, Ildikó Barna and Renáta Németh, along with Bence Ságvári, head of the CSS-Recens research group. Registration for the conference is open until March 31st.
Update: The situation with the coronavirus has made it uncertain whether the conference can be held in June. Regardless, we encourage everyone to submit an abstract by the extended deadline of 15 April, as at worst the conference will be held at a later date.
Lecture by Ildikó Barna at Charles University Prague
2020.02.20. Ildikó Barna, co-leader of our research group, gave a presentation on contemporary Hungarian antisemitism at the Formal and Applied Linguistics Institute of Charles University Prague. The presentation was based on the Online Antisemitism project conducted with Árpád Knap. In addition to presenting the results of the research so far, she also discussed in her lecture why sociological and domain knowledge is indispensable for interpreting the output of natural language processing.
Further information on the lecture is available on the university’s website. The video recording of the presentation can be accessed on this link.
Research and skills development for TDK students with the participation of our research team
2020.02.19. Our research group is represented in the NTP-HHTDK grant won by the ELTE TÁTK TDK Workshop. The grant was established to support the Hungarian Scientific Students’ Associations and their events. As part of this, we will be launching a Python-based text analytics course for students of the faculty during the spring semester, after which they will be able to join our research team for internship positions.
NLP section at the Cyprus conference of ISA’s RC33
2020.01.29. Our research group’s two leaders, Ildikó Barna and Renáta Németh, have successfully initiated an NLP-related section (Natural Language Processing: a New Tool in the Methodological Tool-Box of Sociology) at the International Sociological Association’s RC33 (Logic and Methodology in Sociology) conference, to be held on 8-11 September 2020. Abstract submissions are welcome until 30 January on the conference website.
Flóra Bolonyai placed first at the faculty’s TDK conference
2020.01.03. Flóra Bolonyai, a second-year Survey Statistics and Data Analytics MSc student, took first place with her presentation entitled “Text Analytics Models for Profiling Authors’ Gender” at the faculty’s TDK (Scientific Students’ Associations) conference on December 13th. Flóra’s supervisor was Eszter Katona, a member of our research group.
Our research group received the support of Ariosz Ltd.
2020.01.03. In 2019, our research group’s student program received the support of Ariosz Ltd. The donation aims to contribute to the transformation in the field of quantitative social research.
Full House and Great Success at our Workshop
2019.10.11. Full house and great success at the workshop organized by our research group on the new ways of sociological research.
The first section, chaired by Sára Simon, dealt with the social scientific application of automated text analysis. In the first lecture, Renáta Németh talked about the aim of the research group and the place of the new methods in the social sciences. Ildikó Barna and Árpád Knap detected various types of antisemitic narratives in antisemitic articles and comments, some of which are not measurable by surveys. Zoltán Kmetty and Júlia Koltai focused on word embedding NLP models; in addition to presenting the methodology, they demonstrated its potential applications for social scientists with concrete examples. Domonkos Sik and Fanni Máté talked about the results of their research on the framing of depression in online forums, including the benefits and difficulties of human annotation. Eszter Katona and Fanni Máté, in the last lecture of the section, detailed the methodology of the supervised learning models used in the same research.
In the second section, Gábor Palkó talked about the analytical possibilities of a semantic prosopographic (collective biography) data network they built, called HECEdata, which can be examined together with the data elements of Wikidata. Then Balázs Indig argued for the scientific necessity of using web crawlers and introduced the process step by step for applications in the humanities and social sciences.
The third section was chaired by Kinga Szálkai and opened by Nikosz Fokasz. In his presentation, the professor spoke about Hungarian and Greek auto- and heterostereotypes, and networks, based on identities and newspaper articles. The second lecture of the section was given by Pál Susánszky and Márton Gerő on the analysis of the police database on protests in Hungary, which they supplemented. The last presentation of the section, and of the whole workshop, was given by Erzsébet Takács and Flóra Takács. They put the previous presentation into a larger context and outlined the traditions and problems of domestic movement research and the instruments of demonstration.
The presentations (in Hungarian) can be found here:
International Network Researching the European Memory Politics
2019.10.02. The RC2S2 (Research Center for Computational Social Science, founded by the Institute of Empirical Studies of the ELTE Faculty of Social Sciences) contributes to the international research cooperation titled “European Memory Politics – Populism, Nationalism and the Challenges to a European Memory Culture” (EuMePo), led by the Canadian University of Victoria (UVic). Other partners are the Université de Strasbourg and the Adam Mickiewicz University (Poznań).
Internship
2019.09.05.
Presentation
2019.08.21. In August 2019, Ildikó Barna and Árpád Knap represented our online antisemitism research group and presented at the European Sociological Association (ESA) conference in Manchester. In their paper, they analysed online antisemitism by scrutinising articles and comments from the far-right Kuruc.info news portal using LDA topic modelling. Their earlier published paper dealt with the articles only. The present paper adds an analysis of the comments and a comparison of the two. Click here for the presentation.
Presentation
2019.07.24. Renáta Németh and Fanni Máté represented our depression research group in July 2019 at the International Conference on Computational Social Science. They gave a poster presentation entitled “Bio, psycho or social – Discursive framing of depression in online health communities”, with Domonkos Sik and Eszter Katona as co-authors. The poster can be downloaded here.
Achievement
2019.07.01. Zoltán Kmetty earned the Young Researcher Scholarship of NKFI. The 36-month grant supports an experimental Facebook research project that is primarily methodological and focuses on the usefulness of Facebook research.
The website of the research can be found here https://fbpilot.tatk.elte.hu
Business Cooperation
2019.07.01. An agreement has been made between the Research Center for Computational Social Science and Clementine Hungary. From the autumn semester of 2018/2019, text analytics courses will be held and thesis topics on text analytics will be offered by Clementine. Thesis topics will cover a wide range of issues, including characteristics of communication with robots, changing habits of online media consumption and the Facebook activity of political parties.
Business Cooperation
2019.07.01. Our research group made an agreement with SentiOne, a social listening and online media monitoring corporation. SentiOne saves, searches and analyses publicly available online texts dating back up to three years. Our group uses these texts as corpora in several research projects.
Renewed Training Program
2019.07.01. From the autumn semester of 2019/2020, the Survey Statistics MSc program is running under the name Survey Statistics and Data Analytics MSc. The structure of the course has changed significantly in the last few years with respect to the tools taught and the segments of the labour market targeted. Students are taught big data analytics, network and text analytics methods, data mining, and the use of several statistical software packages and programming languages in scientific, administrative and business applications as well as in social research.
Achievement
2019.06.15. In 2019, 80 students applied for admission to the Survey Statistics and Data Analytics MSc, 50 of whom ranked it as their first choice. The number of applicants is higher than last year, and by number of first-choice applications the program ranks 10th among the 163 MSc courses of ELTE.
Publication
2019.06.15. A paper entitled “Antisemitism in Contemporary Hungary: Exploring Topics of Antisemitism in the Far-Right Media Using Natural Language Processing” was published by Ildikó Barna and Árpád Knap in the Theo-Web journal. The paper discusses contemporary online antisemitism in Hungary by analysing articles of the far-right website Kuruc.info. The paper can be accessed by clicking here.
Presentation
2019.05.14. Eszter Katona gave a presentation entitled ‘Natural Language Processing in Social Sciences’ on 10 May 2019 in Basel, at the Joint Annual Conference of the GPSA Methods of Political Science Section and the SPSA Empirical Methodology Working Group (https://www.methodology-dvpw-
Lecture
2019.05.09. Zoltán Kmetty and Júlia Koltai gave a presentation entitled ‘Understanding Cultural Choices with NLP’ on 9 May 2019 at the Budapest Data Science Meetup, discussing the role of natural language processing (NLP), including neural network-based word embedding models, in better understanding human behaviour and culture. The abstract of the presentation is available here: https://www.meetup.com/budapest_data_science/events/260865653/
Publication
2019.05.01. A paper entitled ‘Collapse of an online social network: Burning social capital to create it?’ was published in the D1 qualified (2.53 impact factor) Social Networks journal with Júlia Koltai as a co-author. The paper discusses how the iWIW social media site collapsed from the perspective of connection networks. The paper is open access, so it can be accessed for free at the following site: https://www.sciencedirect.com/science/article/pii/S0378873317301399
Conference Organizing
2019.03.05. Júlia Koltai was among the organisers of the ‘Women in Data Science Budapest’ conference held on 4 March 2019 at the Central European University. The conference was a partner event of a conference of the same title organised at Stanford University. The event attracted more than 350 participants. The official site of the event and further information are available here: https://www.facebook.com/events/2134883203508900/
The video made at the conference is available here:
Education
2019.03.01. Júlia Koltai taught a course entitled ‘Structural Equation Modelling (SEM) with R’ from 22 February 2019 to 1 March 2019 at the European Consortium for Political Research Winter School in Methods and Techniques, University of Bamberg. The schedule of the course is available here: https://ecpr.eu/Events/PanelDetails.aspx?PanelID=8356&EventID=127
Lecture
2019.02.23. Ildikó Barna gave a presentation entitled “Overt and Subtle Antisemitism in Hungary” at the “Antisemitism, Anti-Zionism, Israel, and the Holocaust” workshop in Salzburg on 23 February 2019, discussing the methods and challenges of measuring antisemitism. Besides presenting the results of surveys on antisemitism in Hungary, she talked about new research opportunities offered by Natural Language Processing (NLP) methods, and also about the work of the research group.
Lecture
2018.10.28. Zoltán Kmetty, Júlia Koltai and Károly Bozsonyi presented their joint research on a poster at the BigSurv18 conference in Barcelona. Their work focuses on depression research in social media (Twitter, Instagram).
Lecture
2018.10.19. A keynote speech entitled ‘Data science and statistics’ was given by Renáta Németh on 19 October 2018 at the meeting of the Clinical Biostatistical Society, discussing the paradigm of ‘big data’ methodology and its relation to ‘classical’ statistics. The discussion focused mainly on the biostatistical relevance of the subject.
Achievement
2018.08.01. Zoltán Kmetty earned a postdoctoral scholarship offered by MTA. The focus of the scholarship is the application of text analytics in the social sciences. As part of the scholarship program, Zoltán Kmetty is joining the MTA TK CSS-RECENS research team.
Hackathon
2018.04.21. Our faculty jointly organised a hackathon for students with the Precognox data mining corporation and the K-Monitor civil organisation in April 2018. The aim of the competition was to create analyses and presentations using open-source data visualisation software, to help understand the problems hiding behind data supplied by civil initiatives, all within a 12-hour time limit.