
Affinity Workshop: Women in Machine Learning

Categorizing Online Harassment on Twitter using Graph Convolutional Networks

Mozhgan Saeidi


Online platforms and social media are places where people increasingly express themselves freely. Twitter is one such platform, and it attracts a growing number of daily users. When users can express themselves freely, their posts take on a wide range of tones, and harassment is one of the consequences. Text categorization and classification aim to address this problem. Many studies have applied classical machine learning methods and, more recently, deep neural networks to categorize text. However, only a few studies have explored graph convolutional networks alongside classical approaches to categorize harassment in tweets. First, we propose using Graph Convolutional Networks (GCNs) for the tweet categorization task. Second, we explore the same task using classical machine learning approaches and compare the results with the GCN model. Third, we show the effectiveness of the GCN model by feeding only half of the dataset to the model and still obtaining good performance, above 91%, for categorizing all the different harassment types. In addition, we used different embedding approaches to find the best representation of the dataset for each model. The classical machine learning approaches include Logistic Regression, Gaussian Naive Bayes, Decision Trees, Random Forests, Linear Support Vector Machines, Gaussian SVM, Polynomial SVM, Multi-Layer Perceptron, and AdaBoost. We use a collection of English tweets as our dataset when running the experiments, with TF-IDF vectors and Word2Vec embeddings as features for the classical approaches. In our experiments with the classical approaches, we achieved accuracy above 0.80 for detecting the sexism and sexual harassment types in the data.
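
The abstract does not describe the GCN architecture or how the graph over tweets is built, so the following is only a minimal sketch of one graph convolution step, assuming the widely used propagation rule H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W). The toy graph, the feature matrix, the weight values, and the function names (`normalize_adjacency`, `gcn_layer`) are all hypothetical and chosen for illustration; they are not taken from this work.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps its own features when aggregating.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Degree of each node in the self-looped graph (always >= 1).
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj_norm, features, weights):
    """One graph convolution: aggregate neighbors, project, apply ReLU."""
    hidden = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, value) for value in row] for row in hidden]


# Hypothetical toy graph: three nodes connected in a path 0 - 1 - 2.
toy_adj = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
# One-hot node features (an assumption; any node representation could be used).
toy_features = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
# Arbitrary illustrative weights mapping 3 input features to 2 hidden units.
toy_weights = [
    [0.5, -0.2],
    [0.1, 0.3],
    [-0.4, 0.6],
]

toy_adj_norm = normalize_adjacency(toy_adj)
toy_hidden = gcn_layer(toy_adj_norm, toy_features, toy_weights)
```

In a full model, a second layer of this kind followed by a softmax would typically produce per-node class scores, and the weights would be learned from the labeled nodes rather than fixed by hand.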
