Renáta Németh gave a presentation titled “Who knows it better? The task of detecting discrimination using human coding vs. text mining” at the EuMePo (European Memory Politics) Jean Monnet Network conference on 15 June 2023 in Budapest. The co-authors were Jakab Buda and Bori Simonovits. In their study, they assess the responsiveness of Hungarian local governments to requests for information from Roma and non-Roma clients, relying on a nationwide email study with a randomized controlled trial design. Two methods were used in parallel to evaluate the response emails: traditional qualitative coding and machine learning (ML). Both methods provided evidence of attention discrimination. The ML models performed significantly better than random classification, confirming the differential treatment of Roma clients. The most important predictors showed that the answers sent to ostensibly Roma clients are not only shorter but also less polite and more reserved in tone, supporting the idea of attention discrimination. A higher level of attention discrimination is detectable against male senders and in smaller settlements.
The study showed that it is possible to detect discrimination in textual data in an automated way without human coding, and that ML may detect linguistic features of discrimination that human coders may not recognize. To the best of the authors’ knowledge, the study is the first attempt to assess discrimination using ML techniques.
Renáta Németh’s and Jakab Buda’s work was supported by the K-134428 NKFIH grant, and Bori Simonovits’s work was supported by the FK-12798 NKFIH grant.