Anna Farkas – Social biases in machine learning: A case study of Google Translate

2020, Sociology BA. Supervisor: Renáta Németh, PhD

In recent years, several studies have shown that machine learning algorithms are prone to reinforcing or amplifying human biases. This paper is a case study of gender bias in Google Translate, examining its translations of occupation terms from Hungarian (a gender-neutral language) into English (a gendered language). Using quantitative methods, the study measures the extent of gender bias in these machine translations by examining the pronouns produced in the English translations of sentences such as “ő egy orvos” (“he/she is a doctor”).
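To make the measurement concrete, below is a minimal sketch of how such a pronoun tally could be automated. The `translate_hu_to_en` function is a placeholder rather than the thesis's actual pipeline (the thesis used Google Translate, which could be queried here via any client), and the occupation list is illustrative:

```python
import re

# Hungarian occupation terms used in template sentences of the form
# "ő egy <occupation>" ("he/she is a <occupation>").
OCCUPATIONS = ["orvos", "ápoló", "mérnök", "tanár"]  # doctor, nurse, engineer, teacher

def translate_hu_to_en(text: str) -> str:
    """Placeholder for a Hungarian-to-English machine translation call.

    The thesis used Google Translate; any translation client could be
    substituted here. Left unimplemented in this sketch.
    """
    raise NotImplementedError

def extract_pronoun(english_sentence: str) -> str | None:
    """Return the gendered pronoun chosen by the translator, if any."""
    match = re.search(r"\b(he|she)\b", english_sentence, flags=re.IGNORECASE)
    return match.group(1).lower() if match else None

def pronoun_choices(occupations):
    """Map each occupation to the pronoun appearing in its translation."""
    results = {}
    for occ in occupations:
        translation = translate_hu_to_en(f"ő egy {occ}")
        results[occ] = extract_pronoun(translation)
    return results
```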

To measure the bias in the algorithm, the study compares Google Translate’s translations with both the actual proportion of men and women in each occupation and society’s perception of those occupations. To assess whether people perceive an occupation as feminine or masculine, an omnibus survey was conducted with the Inspira Group research company. The study found that Google Translate mirrors people’s perceptions of occupations more closely than it mirrors the actual proportions of men and women in them.
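As an illustration of this comparison, the sketch below scores how often the translated pronoun agrees with each of the two baselines: the statistically more common gender in the occupation, and the gender survey respondents perceived as typical. All figures here are invented for demonstration and are not the thesis's data:

```python
# Hypothetical example data: translated pronouns, labor-statistics shares
# of men, and the modal survey answer per occupation.
pronouns  = {"doctor": "he", "nurse": "she", "engineer": "he"}
share_men = {"doctor": 0.48, "nurse": 0.08, "engineer": 0.85}   # made-up shares
perceived = {"doctor": "he", "nurse": "she", "engineer": "he"}  # made-up modal answers

def agreement_with_statistics(pronouns, share_men):
    """Fraction of occupations where the pronoun matches the majority gender."""
    hits = sum((p == "he") == (share_men[occ] > 0.5) for occ, p in pronouns.items())
    return hits / len(pronouns)

def agreement_with_perception(pronouns, perceived):
    """Fraction of occupations where the pronoun matches perceived gender."""
    hits = sum(p == perceived[occ] for occ, p in pronouns.items())
    return hits / len(pronouns)
```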

The paper also examines how adding adjectives such as “good”, “very good”, “bad”, and “very bad” to the sentences modifies the translated pronouns.
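A sketch of how that manipulation could be scripted, reusing the placeholder helpers from the first sketch; the Hungarian adjective forms are direct translations of the adjectives named above:

```python
# Adjective variants inserted into the template, e.g. "ő egy nagyon jó orvos"
# ("he/she is a very good doctor"). Reuses translate_hu_to_en / extract_pronoun.
ADJECTIVES = {"good": "jó", "very good": "nagyon jó",
              "bad": "rossz", "very bad": "nagyon rossz"}

def pronoun_by_adjective(occupation: str):
    """Record which pronoun each adjective variant yields for one occupation."""
    return {label: extract_pronoun(translate_hu_to_en(f"ő egy {hu} {occupation}"))
            for label, hu in ADJECTIVES.items()}
```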
