Detection of Hate Speech in Marathi Using Language Specific Pre-Processing

Authors

  • Ankur Sarode, WB Tech, ING Bank, Netherlands
  • Nailya Sultanova, Kazan Federal University, Russia

DOI:

https://doi.org/10.69511/ijdsaa.v6i6.230

Keywords:

Marathi language, pre-processing, social media, hate speech, detection

Abstract

The increasing accessibility of social media platforms has sharply increased the amount of textual content on the internet. Alongside this growth, hate speech on these platforms has also risen rapidly. Offensive and hateful speech is increasingly a cause of self-harm, depression and even suicidal tendencies, motivating social media platforms to invest in strategies that can make social communities safer. Most research on hate speech detection has focused on languages such as English, Spanish and French. A good amount of work also exists for Hindi, but apart from a few notable exceptions there is little significant work on Marathi. Marathi has 83 million speakers, many of whom consume Marathi social media content, which makes hate speech detection in Marathi a worthwhile problem to explore. This research explored how various pre-processing methods affect hate speech detection in Marathi. Furthermore, transfer learning was used to analyse the efficiency of multilingual hate speech detection models for the Marathi language. Finally, the research explored how hate speech detection in long texts can differ from that in short text data.

Published

2024-06-10

How to Cite

Sarode, A., & Sultanova, N. (2024). Detection of Hate Speech in Marathi Using Language Specific Pre-Processing. International Journal of Data Science and Advanced Analytics, 6(1), 297–301. https://doi.org/10.69511/ijdsaa.v6i6.230

Section

Articles