An Italian lexical resource for incivility detection in online discourses

Tontodimamma, Alice; Fontanella, Lara; Anzani, Stefano; Basile, Valerio

doi:10.1007/s11135-022-01494-7

The exponential growth of social media has brought an increasing propagation of hostile online communication and vitriolic discourse, and social media have become fertile ground for heated discussions that frequently result in the use of insulting and offensive language. Lexical resources containing specific negative words have been widely employed to detect uncivil communication. This paper describes the development and implementation of an innovative resource, namely the Revised HurtLex Lexicon, in which every headword is annotated with an offensiveness level score. The starting point is HurtLex, a multilingual lexicon of hate words. Concentrating on the Italian entries, we revised the terms in HurtLex and derived an offensiveness score for each lexical item by applying an Item Response Theory model to the ratings provided by a large number of annotators. This resource can be used as part of a lexicon-based approach to track offensive and hateful content. Our work comprises an evaluation of the Revised HurtLex lexicon.
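
As a rough illustration of the lexicon-based approach mentioned in the abstract, the following Python sketch looks up the tokens of a text in a small lexicon of offensiveness scores. The lexicon entries, the score values, and the function name `score_text` are hypothetical placeholders and are not taken from the Revised HurtLex Lexicon. The tokenization and the aggregation (maximum and mean of the matched scores) are likewise assumptions made for this example, not the authors' method.

```python
import re

# Hypothetical toy lexicon: headword -> offensiveness score.
# Illustrative values only, NOT taken from the Revised HurtLex Lexicon.
TOY_LEXICON = {
    "idiota": 0.8,
    "stupido": 0.6,
    "buffone": 0.4,
}


def score_text(text, lexicon):
    """Return (matched_terms, max_score, mean_score) for a text.

    Tokens are lowercased word characters; a token counts as a match
    only if it appears verbatim as a key in the lexicon (no lemmatization).
    """
    tokens = re.findall(r"\w+", text.lower())
    matched = [t for t in tokens if t in lexicon]
    if not matched:
        # No lexicon term found: report zero scores.
        return [], 0.0, 0.0
    scores = [lexicon[t] for t in matched]
    return matched, max(scores), sum(scores) / len(scores)


matched, max_score, mean_score = score_text("Sei un idiota e un buffone!", TOY_LEXICON)
print(matched, max_score, round(mean_score, 2))
```

A graded score per headword, as opposed to a binary word list, lets a downstream user pick a threshold or weight matches by severity. A realistic pipeline would also need lemmatization and multiword handling, which this sketch omits.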