how online hate speech predicts real-world violence against migrant and LGBT communities.

Avatar 1: Hi there! Welcome back to our Egreenews Conversations! Avatar 2: Great to be here! Avatar 1: Today we're discussing the 2025 Hernandez Forecast on how online hate speech can predict real-world violence against vulnerable communities. Avatar 2: That's such an important issue. What does the latest research tell us about this connection? Avatar 1: A major new study published in Humanities and Social Sciences Communications analyzed three years of social media data and police records in Spain - and the findings are eye-opening. Avatar 2: What did they discover about the link between online hate and offline violence? Avatar 1: Using advanced machine learning, researchers found inflammatory language on social media could predict up to 64% of hate crimes against migrants and 53% of those against LGBT individuals. Avatar 2: Those are remarkably strong correlations! How exactly did they measure this relationship? Avatar 1: They examined over a million social media posts alongside police reports, tracking specific types of toxic language like threats, identity attacks, and hate speech. Avatar 2: That's fascinating! But couldn't this just reflect general societal tensions rather than showing online speech actually leads to violence? same prompt but a larger script using the following text: From online hate speech to offline hate crime: the role of inflammatory language in forecasting violence against migrant and LGBT communities Carlos Arcila Calderón, Patricia Sánchez Holgado, Jesús Gómez, Marcos Barbosa, Haodong Qi, Alberto Matilla, Pilar Amado, Alejandro Guzmán, Daniel López-Matías & Tomás Fernández-Villazala Humanities and Social Sciences Communications volume 11, Article number: 1369 (2024) Cite this article Abstract Social media messages often provide insights into offline behaviors. Although hate speech proliferates rapidly across social media platforms, it is rarely recognized as a cybercrime, even when it may be linked to offline hate crimes that typically involve physical violence. This paper aims to anticipate violent acts by analyzing online hate speech (hatred, toxicity, and sentiment) and comparing it to offline hate crime. The dataset for this preregistered study included social media posts from X (previously called Twitter) and Facebook and internal police records of hate crimes reported in Spain between 2016 and 2018. After conducting preliminary data analysis to check the moderate temporal correlation, we used time series analysis to develop computational models (VAR, GLMNet, and XGBTree) to predict four time periods of these rare events on a daily and weekly basis. Forty-eight models were run to forecast two types of offline hate crimes, those against migrants and those against the LGBT community. The best model for migrant crime achieved an R2 of 64%, while that for LGBT crime reached 53%. According to the best ML models, the weekly aggregations outperformed the daily aggregations, the national models outperformed those geolocated in Madrid, and those about migration were more effective than those about LGBT people. Moreover, toxic language outperformed hatred and sentiment analysis, Facebook posts were better predictors than tweets, and in most cases, speech temporally preceded crime. Although we do not make any claims about causation, we conclude that online inflammatory language could be a leading indicator for detecting potential hate crimes acts and that these models can have practical applications for preventing these crimes. Similar content being viewed by others Social identity correlates of social media engagement before and after the 2022 Russian invasion of Ukraine Article Open access 01 October 2024 Understanding the role of media in the formation of public sentiment towards the police Article Open access 07 February 2024 Ideological asymmetries in online hostility, intimidation, obscenity, and prejudice Article Open access 15 December 2023 Introduction According to scientific consensus, attitudes precede behaviors, and this is the underlying basis of most theoretical frameworks in social psychology and social sciences (Ajzen, 2001). However, it could be challenging to measure attitudes in sensitive fields, such as migration or sexual diversity; surveys and other traditional approaches could mask actual attitudes toward sensitive topics and thus fail to capture racist or xenophobic beliefs (Janus, 2010) or LGBTphobia. In this context, user-generated content on social media platforms may be valuable for examining radical and extreme attitudes since users tend to be less constrained in expressing their opinions online (Chaudhry, 2015). Moreover, research has demonstrated that online opinions may reveal offline behaviors (Kalampokis et al. 2013; Korkmaz et al. 2016; Jungherr, A. and Jürgens, 2013) and that the spread of hate speech on social media can anticipate (but not necessarily cause) hate crimes in the offline world (Müller and Schwarz, 2021, 2023; Bozhidarova et al. 2023). Although there is a growing body of relevant evidence, the existing empirical studies mostly rely on low-frequency crime data that can be publicly accessed, e.g., yearly crime statistics. However, when such aggregated statistics are aligned with high-frequency social media data, the temporal dynamics of online opinions and their relationships with crime actions are masked. Moreover, most studies are based on single sources (one social media platform) and single targets (either generic or specific groups). In this article, we use a unique dataset of daily police reports in Spain disaggregated at the province level (NUTS3). Relying on these high-frequency and granular crime data, coupled with multiple sources of social media data (X, previously called Twitter, and Facebook) and specific targets of hate speech (migrants and LGBT people), we seek to better model and predict the temporal patterns of hate crime. Faced with these challenges and based on the above-mentioned previous studies, in this paper (research design preregistered at https://osf.io/bwn93), we analyze and measure the emergence and propagation of hate speech on social media. We adopt the premise that this is the breeding ground—while not the cause itself—for the execution of hate crimes against migrants and refugees and LGBT communities in Spain, measured by the number of police reports instead of disaggregated public official statistics. Thus, this paper aims to evaluate how hate speech propagated on social media can anticipate violent acts being committed, which in turn can help to broaden the theoretical understanding of social media not only as a “representative” of society or with “effects” on individuals but also as a predictor of offline human behavior. To test this hypothesis, we combined manual and innovative computational approaches over 3 years (from January 2016 to December 2018) and surveyed the presence of hate speech and inflammatory language based on xenophobia/racism and LGBTphobia in two of the main social media platforms in Spain (X and Facebook), as well as episodes of hate crime using disaggregated and anonymized police reports. The relation between hate crime and hate speech Hate crimes over time Hate crimes have existed since ancient times (Levin and McDevitt, 2002); however, governments of various countries only began to actively combat them in the last few decades (Green et al. 2001). In Europe, hate crimes have been on the political agendas of various member states for decades (Williams and Tregidga, 2014). Although progress has been made, there are still pressing issues, such as the increase in hate speech on social media and high rates of mis and underreporting. Therefore, to effectively combat hate crimes, a better understanding of the quality of their indicators and their generation over time is needed. Underreporting is a significant limitation when evaluating hate crimes and can be motivated by factors such as fear, victims’ lack of confidence in responsible institutions, and their decision to report to other officials (Pezzella et al. 2019). In Spain, only one in ten hate crimes are reportedFootnote1, and in Europe, two in ten (European Union Agency for Fundamental Rights, 2021)Footnote2. This reality means that official data often struggles to capture the full scope of the issue, underestimating its true prevalence. One potential solution to underreporting in police records, is the use of alternative data collected by non-official organizations such as civil organizations or media. In Europe, the Hate Crime ReportFootnote3 elaborated by the Organization for Security and Co-operation in Europe (OSCE) includes this type of data but without specific location and day of occurrence. As a previous experience to the study reported in this paper, we manually estimated the city and date of each racist and xenophobic hate crime event recorded by the Hate Crime Report in Spain, Italy, and Greece from 2015 to 2020. We crossed this hate crime data with levels of hate speech on Twitter in those countries during the same period of time and used different algorithms (Gaussian Naïve Bayes, Logistic Regression, Support Vector Machine) and did not find any relationship (see Appendix 4). We concluded that either the absence of city and date in the data or the use of non-official statistics generated unsatisfactory results. Thus, we considered that the use of official police data, even with the fact of underreporting, was the best source to establish the factors that may be responsible for the variance of hate crime over time. Generally, several variables have been explored as potential main triggers of the temporal course of generic criminal behavior. In addition to contextual factors, temperature is one of the most studied temporal variables due to its correlation with a wide range of criminal typologies, making crime a seasonal element with a marked tendency to prevail at certain times of the year, particularly in summer (Zhou et al. 2021; Field, 1992; Cheatwood, 1988; Harries and Stadler, 1983; Yar and Nasir, 2016). However, although temperature seems to be a common indicator or predictor of crimes, some authors have emphasized that the explanation of crime should not be limited exclusively to this variable (Tennenbaum and Fink, 1994; Feng et al. 2016; McDowall et al. 2012), as there is no real causation effect, and other specific indicators can have greater predictive power. The defining temporal course of hate crimes appears to be predicted by other variables beyond high temperatures or darkness, although some authors still mention the presence of seasonality, especially in spring and summer (Carr et al. 2020; King and Sutton, 2013), as well as on weekends (King and Sutton, 2013). Moreover, it seems to be clear that hate crimes can be labeled rare events (Benier, 2017; Mills et al. 2017; Wenger and Lantz, 2022) since they occur infrequently and are related to external events. One additional reason for this rarity is that official data are highly affected by underreporting, the final statistics do not represent the real problem. Besides, most countries only present these official data aggregated by year (as is the case for SpainFootnote4) or without geographical informationFootnote5, which may complicate modeling over time to researchers. In recent years, various scholars have focused on specific triggers of hate crimes related to the physical environment (King and Sutton, 2013), such as terrorist attacks (Borell, 2015; Deloughery et al. 2012; Disha et al. 2011; Echebarría-Echabe and Fernández-Guede, 2006; Hanes and Machin, 2014; Ivandic et al. 2019; Jacobs and van Spanje, 2021; King and Sutton, 2013; Mills et al. 2017; Piatkowsla and Lantz, 2021; Piatkowska and Stults, 2021), or other political events such as the European Union referendum on Brexit-related issues (Albornoz et al. 2020; Carr et al. 2020; Devine, 2018a, 2018b, 2021; Piatkwoska and Lantz, 2021; Piatkwoska and Stults, 2021) and Donald Trump’s electoral campaign in the United States (Edwards and Rushin, 2018; Feinberg et al. 2022; Müller and Schwarz, 2023; Warren-Gordon and Rhineberger, 2021). While it has been demonstrated that the temporal course and prevalence of hate crimes are primarily determined by specific social events, these events can also trigger reactions in the online environment (Olteanu et al. 2018; Scharwächter and Müller, 2020; Arcila et al. 2022; Williams and Burnap, 2016; Williams et al. 2020), which leads to an increase in hate speech. For instance, research has analyzed how Women’s Day or LGBT Pride, among other relevant events, are linked to the rise of online hate speech (Gómez et al. 2023). In fact, online hate speech is considered a specific type of hate crime, according to most Western legal frameworks, and thus may be reported in crime statistics. While this type of crime is becoming increasingly prevalent with growing internet access (Chan et al. 2016), the complexity of detecting it and the tensions with other fundamental rights—such as freedom of expression—make it more difficult for authorities to process online hate speech as a crime. In practice, much of the control of these potential cybercrimes falls on the platforms themselves, under their terms of service, which play a very significant role by determining what is hatred, offensive, or toxic and thus (manually or automatically) moderating this content. Based on the above-mentioned discussion, we wonder: how is the emergence and propagation of hate speech on social media (RQ1). Online hate speech as a predictor of offline hate crime The proliferation of hatred in society requires a comprehensive response. Over the past decades, the European Commission has worked to develop a set of legal instruments and strategic initiatives to promote and protect the common values and fundamental rights of the European Union. The most relevant framework for establishing a collective response to hate crimes and incitement to racist and xenophobic hatred is Council Framework Decision 2008/913/JHA of 28 November on combating certain forms and expressions of racism and xenophobia by criminal law (European Commission, 2008). Illegal hate speech is defined in this framework as “the public incitement to violence or hatred on the basis of certain characteristics, including race, color, religion, descent and national or ethnic origin”. This establishes a criminal legal framework obliging Member States to criminalize public incitement to violence or hatred on the basis of certain characteristics, including race, color, religion, descent, or national or ethnic origin (Article 1), and to take into account racist motivation when sentencing perpetrators of criminal acts. National authorities must, therefore, investigate, prosecute, and judge suspected cases of hate crimes or incitement to hatred. Since 2016, the High-Level Group on Combating Hate Speech and Hate Crime, coordinated by the European Commission’s Directorate-General for Justice, has been in operation. Its work has focused on improving support for victims in accordance with the Victims’ Rights Directive 2012/29/EU, on intensifying training on hate crimes for law enforcement agencies, and on improving the recording, reporting and collection of data on hate crimes. Furthermore, to address the challenges of online hate, the European Commission launched a voluntary Code of Conduct to counter illegal online hate speech with information technology companies in 2016, which has been updated in recent years (European Commission, 2016). In December 2021, the Commission proposed to extend the list of EU crimes to include incitement to hatred and hate crimes [COM(2021) 777 final] of 9 December 2021 to address the current divergent and fragmented criminal law approaches of Member States and to ensure consistent protection of victims across the European Union (European Commission, 2021). In December 2023, the Commission and the High Representative adopted a Joint Communication on “No place for hate: a Europe united against hatred”. The Communication aims to step up EU efforts to fight hatred in all its forms, by reinforcing action across a variety of policies. Other efforts worth highlighting include the European Online Hate Lab (EOHL), a hub for researchers, organizations, and companies using technologies for detecting and analyzing online hate in European languages (Kaati, 2023). The EU Strategy 2020–2025 underlines the need to ensure the security of all EU citizens, in line with the values and principles of the Union [COM(2020) 605 final], 24 July 2020. On the other hand, the enormous academic attention to this concept has generated a vast amount of literature especially in computer and social sciences. Only in the field of computer science, the systematic review by Almaliki (2023) found at least 274 studies during 2012 and 2022 that include detection solutions but also intervention approaches. In fact, the propagation of online hate speech has prompted several debates about how this type of discourse should be conceptualized, and methodologies for automated detection can be developed (Nascimento et al. 2023; Valle-Cano et al. 2023), as manual identification is insufficient and inefficient for this large-scale problem. Most attempts are focused not on a legal definition of hate speech as a crime but rather on messages that explicitly denigrate marginalized groups, even if the offenses are not enough for prosecution. Moreover, the debate has strongly focused on the linguistic and technical dimensions of the measurement, and most studies use hatred as a general concept (hateful content against anyone) or as directed at a single target (i.e., against migrants). Additionally, single sources (such as X, because of its availability) have been usually used for hate speech detection to determine the amount of hate on social media, with less attention to multi-source approaches. For automated detection (for example, predicting hate versus nonhate content), recent research has suggested that supervised text classification with large language models (LLM) and specific architectures such as Bidirectional Encoder Representations from Transformers (BERT) are the best approaches (Abderrouaf and Oussalah, 2019; Fan et al. 2021; Joshi et al. 2021; Vishwamitra et al. 2020). These findings provide solid ground for developing ad hoc prototypes (with new annotated examples) or for trusting third-party classifiers (commercial or open-source tools) to measure online hate speech and close dimensions such as the level of toxic language or sentiment. However, researchers are still addressing other important challenges, such as the need to overcome possible biases (Badjatiya et al. 2019; Mozafari et al. 2020), the legal limits and ethical critiques of criminalization (Teijón Alcalá, 2022), the challenges of criminal sanctions (Gómez Belvis and Castro Toledo, 2022; Miró Llinares and Gómez Bellvis 2021), the detailed indicators of hatred (Papcunová et al. 2021), the different linguistic perspectives (Guillen -Nieto, 2023), the effects of the media on public opinion based on narratives and framing (Zhang and Trifiro, 2022), the various dimensions of the problem of freedom of expression (Martínez Valerio and Mayagoitía Soria, 2021; Sharma Ishita, 2019), and the psychological effects that this type of discourse can have on individuals and groups (Saha et al. 2019; Ștefăniță and Buf, 2021). Regarding this last point, previous studies have suggested that witnessing online hate may be associated with both cyberhate victimization (Wachs et al. 2021) and its perpetration (Wachs et al. 2019). Furthermore, frequent exposure to hate speech can also lead to a process of desensitization, leading to increased prejudice toward victims (Soral et al. 2018). All these studies suggest that hateful content on social media leads to harmful effects. While researchers have confirmed that an individual’s consumption of hateful messages can impact his or her attitudes and behaviors, the literature does not provide consistent evidence of causal effects between the rise of an aggregated and scattered mass of online hateful messages and an increase in offline crimes motivated by hate. Several studies have explored this relationship (Bozhidarova et al. 2023; Müller and Schwarz, 2021; Relia et al. 2019; Williams et al. 2020; Feinberg et al. 2022) but have not reached a consensus on the order of appearance of one or the other. Authors such as Williams et al. (2020) have concluded that, regardless of whether the crime took place in the presence of a preceding event, the established association between them determined that it was an online crime that predicted offline crime and not the other way around. However, other studies considering the involvement of specific events have obtained different results. For example, while Müller and Schwarz (2021) defined online crime, in the presence of an event, as a predictor of offline crime, Wiedlitzka et al. (2021) observed that the direction of the prediction was the opposite, with offline hate crimes predicting online hate crimes, although these authors refer to different kinds of events in their studies. Although we cannot hypothesize the causal effects of online hate speech on offline hate crime, we can identify that the propagation of this type of extreme discourse on social media acts as an indicator of offline events. Hate speech and hate crimes are situations created through a process in which different human and technological variables and phases converge. As many researchers have suggested (Kalampokis et al. 2013), the theoretical understanding of social media as a “predictor” (not a cause) of offline behaviors can be a framework for explaining and managing this complex interaction. This approach stresses the role of hate speech and inflammatory language on social media in hate crime prediction and highlights the importance of enhancing strategies to combat hate on networks (Donzelli, 2021), solidarity citizenship norms (Kunst et al. 2021), content moderation by platforms (Gonçalves et al. 2021), and cyber-activism (Müller and López-Sánchez, 2021). Thus, we hypothesize that: the number of messages with inflammatory language on social media anticipates an increase in offline hate crime (H1). Materials and methods Sample In this research, we exclusively used primary sources of hate crimes and messages posted on social networks to build our datasets. The data were collected, stored, and analyzed on the Supercomputer Center SCAYLE. Two hate crime datasets (HCDS1 and HCDS2) were generated from internal data provided by the Spanish National Police, which included records of hate crimes that occurred across Spain (except for Catalonia and the Basque CountryFootnote6) between January 2016 and December 2018. These records of official complaints (disaggregated by day and province) were provided by the National Office for Combating Hate Crimes (ONDOD). After cleaning the records, we were left with a final sample of 1215 reported hate crimes (including all possible motivations) (HCDS1) and a specialized subsample of 657 reports, of which 376 corresponded to cases with immigrant victims and 281 to cases with LGBT victims (HTDS2). The final structured datasets included information on the date of each fact, the date of each report, the main motivation for the crime, a second motivation (if applicable), the province of the country where it occurred, and the NUTS3 (Nomenclature of Territorial Units for Statistics 3 level). In addition to the police reports, two social media datasets (SMDS1 and SMDS2) were also generated for the period 2016–2018, coinciding with the dates of the hate crime reports. The first was a general dataset of tweets with their level of generic hate (SMDS1), and the second was a specialized dataset of tweets and Facebook posts with two subdatasets, one for messages about migrants and the other for messages about the LGBT community, and included their levels of specific hate, toxicity, and sentiment (SMDS2). For SMDS1, a random sample of 1000 daily tweets (produced in SpainFootnote7 in Spanish) was collected between 2016 and 2018, for a total of 1,096,000 records, from which 12,682 were classified as hateful (1.16% of the sample). In the case of SMDS2, we collected filtered tweets and posts of public groups on Facebook (produced in Spain in Spanish) for a total of 776,180 records (215,083 original messages) (see Table 1). For the search filters, we used keywords and word combinations that allowed us to locate messages related to migrants and LGBT people (Appendix 1). Tweets and Facebook posts were collected through the X Application Programming Interface (API)Footnote8 and CrowdTangleFootnote9, respectively. Table 1 Comparison of the number of original and extended reach posts (retweets and shares) for migrants and LGBT People on social networks. Full size table Measures Hate crime incidents For HCDS1, we computed the absolute count of general hate crime incidents by day, week, and month. For HCDS2, we computed the absolute count of general hate crime incidents motivated only by the victim’s sexual orientation and gender identity or by racism/xenophobia by day and week. Generic online hate speech For SMDS1, we only computed the level of generic online hate speech using HaterBert and its repository SocialHaterBert (Valle-Cano et al. 2023), a pretrained binary text classifier for any type of hatred in Spanish. For SMDS2, we computed specific online hate speech (toward migrants and LGBT people) with ad hoc classifiers, toxicity, and sentiment, as we detail below. Specific online hate speech We performed an ad hoc detection of hate speech toward two specific groups (migrants and LGBT people) using supervised text classification based on multilingual BERT models. To fine-tune the model, we manually annotated examples of hatred against both communities in Spanish. To create the first classifier, we employed previously annotated data derived from the PHARM Interface (http://pharm-interface.usal.es), which is a web interface for analyzing hate speech against migrants and refugees (Vrysis et al. 2021). The final dataset consists of 22,232 balanced messages equally divided between hateful and nonhateful content. This dataset was divided into three subsets for training (n = 13,339), validation (n = 4446), and testing (n = 4447) purposes. A deep learning classifier was implemented using BERT, a pretrained model with 167,357,954 parameters, with a learning rate of 3 × 10−5 and an epsilon of 1 × 10−8 on an Adam optimizer. The model was trained for three epochs. In the evaluation of the test dataset, the model showed robust results with an average loss of 0.1575, an accuracy of 94.63%, and an F1 score of ~0.95, indicating an effective trade-off between accuracy and recall. These results are better than those previously described by Arcila et al. (2022), who used similar data. On the other hand, for detecting hate speech toward LGBT people, we used a corpus of examples of social media messages related to sexual diversity that were manually classified as hate or nonhate by two trained judges using Doccano software with a double-blind coding approach to ensure reliability. The Krippendorff’s alpha value of 0.7258 indicates a satisfactory level of consistency among the observers. Messages on which both coders agreed were incorporated into the final balanced dataset, which consisted of a total of 8836 records (4435 examples of LGBTphobic messages and 4400 examples of no LGBTphobic content). The dataset was divided into three subsets for training (n = 5302), validation (n = 1767), and testing (n = 1768) purposes. A deep learning classifier was implemented using BERT, a pretrained model with 167,357,954 parameters, with a learning rate of 3 × 10−5 and an epsilon of 1 × 10−8 on an Adam optimizer. The model was trained for three epochs. The evaluation of the test dataset shows that the machine learning model achieved good performance metrics. The average loss was 0.3022, and the accuracy was 0.9078, which means that the model correctly classified 90.78% of the samples in the test dataset, and the F1 score was approximately 0.90, indicating a good balance between precision and recall. The results obtained with this modeling are better than those previously described by Arcila et al. (2021), who used similar data. We assessed the levels of hateful content using both classifiers, with values ranging from 0 to 1, and calculated the number of hate messages directed at each group (migrants and the LGBT community) per day and week using a standard threshold of 0.5. Sentiment value We applied a lexicon-based sentiment analysis technique using SentiStrength, which is an open-source tool developed by Thelwall et al. (2010). The dictionary included words rated on a scale of 1 to 5 for positivity and −1 to −5 for negativity, with 0 representing neutrality. We used the validated Spanish dictionary provided by SentiStrength (Vilares et al. 2015) to compute the score for each record. Toxic language We also used the Perspective APIFootnote10 to analyze the toxicity level in the messages. This tool is a text classification system developed and managed by the Jigsaw unit of Google that returns probability scores from 0 to 1 for six toxicity attributes (toxicity, severe toxicity, insult, profanity, identity attack, and threat) in several languages, including Spanish. This classifier has been validated in previous experiments (Lees et al. 2022). In addition, we computed an index variable using the mean of the six attributes. Time series datasets After preprocessing all the data, we created the final time series datasets. First, HCDS1 and SMDS1 were aggregated and combined to produce six time series datasets with two variables about generic hate crime and hate speech resulting from three timely aggregations (monthly, weekly, and daily) and two methods to process the data (original and smoothing). Second, HCDS2 and SMDS2 were aggregated and combined to produce eight time series datasets (2 × 2 × 2) with 78 variables (see Appendix 2 for details) about specific hate crimes and hate speech (mean and standard deviations of hatred, toxicity, and sentiment) toward migrants and LGBT people, resulting from two timely aggregations (weekly and daily), two geographical aggregations (all Spain and only Madrid) and two approaches to filter social media data for hate crimes (“filtered” if messages correspond to the same place and nonfiltered if messages come from all around the country). Table 2 provides a concise summary of the datasets created from HCDS1, SMDS1, HCDS2, and SMDS2, highlighting the features included in each dataset and their specific use cases. Table 2 Overview of final time series datasets and their analytical purpose. Full size table Data analysis The final goal of the data analysis was to verify the relationship between inflammatory language on social media and an increase in hate crimes. The time series derived from the combination of HCDS1 and SMDS1 were analyzed using temporal correlations, and the time series derived from the combination of HCDS2 and SMDS2 were analyzed with vector autoregression (VAR), generalized linear models with LASSO and elastic net regularization (GLMNet) and extreme gradient boosting trees (XGBTrees). Among many potential algorithms, we chose these three modeling approaches by selecting (i) specific modeling for time series data (VAR); and (ii) machine learning modeling for catching linear (GLMNet) and nonlinear (XGBTrees) relationships. In the VAR models, we included only relevant and nonstandardized variables for each estimation. To model hate crimes against migrants, we included the language features of social media messages related to migration and performed the same procedure for the LGBT case. Then, we conducted Granger causality tests to assess the temporal relationships between variables and hate crime incidents. Variables that showed statistical significance at p < 0.05 level were considered causal and were integrated into the predictive VAR models to assess their explanatory power in the temporal dynamics of hate crimes (see Appendix 3 for details). We then performed Johansen cointegration tests (Johansen, 1991), when applicable, to assess the existence of long-run relationships between the selected variables (see Appendix 13 for details). We also checked the stationarity of the time series using the augmented Dickey–Fuller (ADF) test and applied the appropriate differentiations to make them stationary when necessary (first-order differences). Differencing is only applied for model training, but all variables are recovered at their original level for forecasting. We used the AIC to determine the p-order of each VAR model, selecting the appropriate lag variablesFootnote11. In the GLMNet and XGBTree models, we included all variables of the dataset as well as their standard deviations (no manual selection was conducted, and language variables of both groups were included to predict crime toward both groups). To develop the models, we analyzed variable importance. In cases in which a model does not provide variable importance, the parameters of all the predictors are penalized to zero (i.e., all the characteristics are selected by the model)Footnote12. We established a standard maximum of ten lags for all the models. The specific details of the three machine learning algorithms and their parameters are described in Appendix 7. Findings Temporal correlation of hateful tweets and hate crime Using the generic data for hate crime that included all motivations (n = 1215) and the generic data for hate speech on X (n = 1,096,000) (HCDS1 and SMDS1), we estimated the temporal correlation between these two variables aggregated by month (r = 0.467, p < 0.01), week (r = 0.181, p < 0.05) and day (r = 0.055, p = 0.06). We also calculated the correlations using normalized data with smoothing for month (r = 0.552, p < 0.01), week (r = 0.248, p < 0.01), and day (r = 0.18, p < 0.01). The size of the positive correlations ranged between weak (0.06) and moderate (0.55) and were statistically significant in most cases, meaning that the more hateful tweets there were, the more hate crimes there were over time. The correlation of both time series for the case of monthly aggregation with smoothing data, which produces the highest correlation and shows how both variables follow similar patterns, is plotted in Fig. 1. Fig. 1 figure 1 Temporal correlation of hateful tweets (red line) and hate crime (blue line) from January 2016 to December 2018, with monthly aggregation and smoothing data. Full size image This first analysis confirmed the existence of a relationship between an increase in hate speech on social media and an increase in offline hate crimes but provided little insight into the explanatory and forecasting capabilities of linguistic variables for hate crime or vice versa. The next section presents more fine-grained hate crime data (separated by motivations and locations), additional social media sources (X and Facebook), and more specific linguistic variables (hate, toxicity, sentiment) for evaluating this relationship in more detail. Propagation of specific hate and forecasting crime We used more specific and complete datasets (HCDS2 and SMDS2) to conduct new time series analysis to determine to what extent inflammatory language can forecast violence against two specific vulnerable groups (migrants and LGBT people). As a monthly forecast would be of lesser importance in alerting law enforcement agencies, we only considered weekly and daily aggregations for better interpretability of the models and included a larger number of linguistic variables (n = 78). Figures 2,3 present the evolution of hate crimes against migrants and the LGBT community over the years 2016 to 2018, based on data from the ONDOD. These figures provide insights into the temporal patterns and fluctuations in the incidence of these crimes, highlighting the trends for each hate crime type over the 3-year period. Fig. 2 figure 2 Time trend of hate crimes against Migrants in Spain (2016–2018) with monthly aggregation. Full size image Fig. 3 figure 3 Time trend of hate crimes against the LGBT population in Spain (2016–2018) with monthly aggregation. Full size image On the other hand, a descriptive analysis of hate crime and social media data, including message statistics, prevalence of hate speech towards migrants and the LGBT community, sentiment analysis, and levels of toxic language, can be found in Appendix 8. Correlations between hate crime and social media posts, at the national level and in Madrid, are also explored. Modeling hate crime and hate speech Forecasting In total, we deployed 48 machine learning models (16 VAR + 16 GLMNet + 16 XGBTree) to estimate the relationship between hate crimes against migrants or LGBT people and inflammatory language and then compared the prediction errors (MAE and RMSE) and explained variance (R2) among these models (Table 3). Table 3 Machine learning models to predict hate crimes against migrants and the LGBT community. Full size table The results obtained with VAR modelsFootnote13 to predict hate crimes against migrants based on inflammatory language on social media (models 1 to 8 in Table 3) show that the coefficient of determination (R2) is quite low in most models, indicating that it is difficult to explain the variation in hate crimes against migrants. In terms of temporal frequency, weekly models deliver a higher R2 than daily models, suggesting that relationships between variables may be better captured during this time interval. In addition, the best results achieved with the weekly models based on national data from Spain (0.19 and 0.17) were superior to those achieved with geolocated data from Madrid. Additionally, in some cases, the filtered data seem to perform better. The order “p” (number of lags) also varies in the models, indicating that a different number of lags is needed to capture the dynamics of the variables in different situations. Based on the VAR equations, we can draw some conclusions about the most relevant variables for predicting hate crimes against migrants. In general, variables related to the dimensions of toxic language, hate speech, and the number of messages on social media seem to play a significant role in these predictions. We also see that the importance of variables can vary between models, indicating the complexity and dynamic nature of the data. On the other hand, the results obtained with VAR models to predict hate crimes against LGBT people based on inflammatory language on social media (Models 9 to 16 in Table 3) show that weekly models tend to show better performance in terms of R2 and, in some cases, in terms of RMSE and MAE compared to daily models. Regarding the number of lags, there is no clear pattern in terms of what value of ‘p’ is optimal, which highlights the importance of performing a specific analysis to determine the appropriate number of lags. We found that the most repeated variables in the models are threats and various metrics related to the toxicity of social media posts. We also noticed that both X and Facebook seem to play important roles in predicting hate crimes against the LGBT community. The sentiment variable in Facebook messages demonstrates noteworthy significance in both national daily models (Models 9 and 10). Threats on social media platforms are also significant variables in several models, which suggests that this type of content, both on Facebook and X, can impact the frequency of hate crimes. A good visual example of forecasting the last four temporal units based on models 7 (Migrants-Madrid-weekly) and 13 (LGBT-National-weekly) is shown in Fig. 4. Fig. 4: Forecasting hate crime against migrants and LGBT people using predictive models. figure 4 A Forecast of hate crimes against migrants using model 7. B Forecast of hate crimes against LGBT individuals using model 13. Full size image In addition to VAR, we ran 16 models using GLM with LASSO and elastic net regularization (GLMNet) and 16 more with an extreme gradient boosting tree (XGBTree). The GLMNet models (18 to 32 of Table 3) prove to perform better on the migrant datasets, achieving up to 64% of the explained variance in Model 17 and 63% in Model 18 but still with high errors in forecasting. The models excel in predicting hate crimes against migrants in daily data, even filtered data, with high R2 values and low RMSE and MAE values. On the other hand, the models underperformed in predicting hate crimes toward LGBT people in most datasets, with lower R2 values. Moreover, in general, filtered data tend to result in models with slightly better performance than unfiltered data, which becomes more noticeable when we compare weekly results for migrants at the national level and for migrants and LGBT people geolocated in Madrid, suggesting that feature selection may be beneficial for improving accuracy. Importantly, the models performed worse with the Madrid data than with the national data, which may be related to data availability and scarcity. The performance of the XGBTree (33 to 48 of Table 3) appears to vary significantly between the models. The best result was achieved for hate crimes toward LGBT people considering national weekly data (Model 45), accounting for 53% of the variance explained. In general, the models performed better on the weekly data than on the daily data, and all the models yielded R2 values at “NaN”. The GLMNet and XGBTree algorithms compute the R2 using the square of Pearson correlation between the observed and predicted target variables. When the predicted and/or the observed target has no variation, i.e., is a constant, the data will have no deviation from the mean, thus producing no value for R2. This indicates that daily models cannot adequately explain the variation in daily hate crime data, suggesting that daily data may not be suitable for predicting hate crimes targeting migrants and the LGBT community. Furthermore, filtered data presented slightly better performance than unfiltered data for geolocated crimes in Madrid, indicating that feature selection may contribute to higher accuracy in this scenario. Figure 5 provides two graphical representations of the accuracy of both algorithms when forecasting the last four temporal units of our time series aggregated by week to predict crime against LGBT people with national data (left, Models 29 and 45) and by day to predict crime against migrants with national data (right, Models 17 and 33). Fig. 5 figure 5 Forecasting hate crimes against LGBT people (left) and migrants (right) with the GLMNet and XGBTree models and national social media data. Full size image While VAR can provide more interpretabilityFootnote14, we found that the GLMNet and XGBTree models tended to achieve better R2 values and usually lower error metrics. Moreover, we found that the performances vary significantly depending on the target variable and the dataset. The models performed slightly better for the migrant target variable compared to LGBT people in several cases, as reflected in usually higher R2 values. Regarding the variance in the data, we perceived that the models developed with daily data perform worse than those with weekly data, suggesting a lack of fit to the data (possibly because weekly data cover more days, which reduces the number of periods without crime records). In addition, the national models outperformed the geolocated models in Madrid. Inflammatory language in the best models We chose the best resulting model of each combination of target group (migrants/LGBT people) and algorithm (VAR/GLMNet/XGBTree) to compare the role of inflammatory language in each of them (Tables 4,5). The following 6 models were selected for comparison: VAR models for filtered weekly national data targeting migrants (Model 6, R2 = 0.19) and weekly national data targeting the LGBT community (Model 13, R2 = 0.09); GLMNet models for daily national data targeting migrants (Model 17, R2 = 0.64) and weekly national data targeting the LGBT community (Model 19, R2 = 0.25); and XGBTree models for filtered weekly national data targeting migrants (Model 38, R2 = 0.38) and weekly national data targeting the LGBT community (Model 45, R2 = 0.53). These choices were made to ensure a consistent comparison at the national level since the Madrid models underperformed. Table 4 Comparison of predictive models to the migrant target for each of the three modeling approaches. Full size table Table 5 Comparison of predictive models to the LGBT target for each of the three modeling approaches. Full size table Even though there was no consistency on which is the most important linguistic variable in the models, we can draw some patterns by examining the other significant variables included in the six models. First, the variables related to toxicity (threats, identity attacks, etc.) are more important than the specific hatred, and the role of sentiment is residual. Second, features of Facebook posts were always the most important, and except for Models 17 and 13, most of the significant variables in the models were extracted from Facebook and not from X. Third, for the case of GLMNet and XGBTree, some linguistic features also corresponded to the other group (i.e., messages about migrants predicted hate against LGBT people). Fourth, the lags of the weekly models (all except Model 17) were usually 1 or 2, suggesting temporary proximity of hate speech and hate crime. Fifth, linguistic variables have different magnitudes of importance and directions. These patterns are based on the best predictive modelsFootnote15. Temporal order in VAR The previous analyses informed about the relationship between online hate speech and offline hate crime, but they did not report any insights about which of them happened first. Even if our research was not designed to test any causal effect over time, VAR models can be used to establish if the indicators of inflammatory language in previous days or weeks significantly affect the variance of hate crime, and more importantly, if it happened the other way around or not. By looking at this data, we can estimate the temporal order of the effects. According to the analysis of the 16 VAR models, the language features temporally anticipated the appearance of hate crime in all the resulting models. In 7 of these models (3, 4, 5, 9, 10, 13, and 16), the relationship is clear: the language variables (i.e., hate, toxicity, etc.) predict the future variance in the number of hate crimes but not vice versa. However, in the other 9 models (1, 2, 6, 7, 8, 11, 12, 14, and 15), we found that the number of hate crimes also predicts, to a certain extent, the variance in some language variables. Considering the best VAR models (those with the highest explanatory power in terms of R2), we observe that in two of them (Models 5 and 13), the relationship was unidirectional (only language predicts crime), while in the other two (6 and 14), there was a recursive relationship, but only for some of their variables. This means that when we observe the specific language variables that predict crime in time and check if those language variables were also predicted by crime, we found that the recursive relationship occurred only for some specific cases and not for every variable (see tables in Appendix 12). This analysis suggests a bias toward a specific direction (language predicting crime) rather than bidirectionality in the identified relationships. Discussion and conclusion This study presents an empirical analysis of the propagation of hate speech on social media and its predictive capacity for offline hate crimes. We tested the temporal correlation between these two phenomena for three years in Spain and then checked the effectiveness of machine learning methods (48 models, with VAR, GLMNet, and XGBTree) to forecast specific violence against migrants and LGBT people using internal police records and indicators of inflammatory language on social media generated by computational text analysis. First, we found a positive and highly significant temporal correlation between generic hate speech in X and hate crimes motivated by all prejudices. Second, using specific motivations against migrants and LGBT people as well as detailed information on the inflammatory language (hatred, toxicity, and sentiment) on X and Facebook, we found machine learning models that explain up to 64% of the increase in reported hate crimes with acceptable errors when forecasting the number of crimes four days or weeks in advance. We found that national models (including all data from Spain) outperformed those models with specific data from Madrid, and those about migration were more effective than those about LGBT people. Interestingly, toxic language attributes (toxicity, threats, identity attacks, and profanity) outperformed mentions to the group (number and proportion of messages referred either to migrants or LGBT people), hatred (number and proportion of hateful messages), and negative sentiments. A comparison of sources revealed that Facebook posts were better predictors than tweets, and in most cases, inflammatory language preceded hate crimes in time, although this does not necessarily mean a causal effect. Based on these unique data, our findings are consistent with those of recent literature that attempts to empirically test the existence of a relationship between online hate speech and offline hate crime (Bozhidarova et al. 2023; Müller and Schwarz, 2023, 2021) and offers new insights into the nature of this complex relationship, filling substantial gaps in the literature. Unlike similar previous studies, we integrate data from more than one social media source, in this case, X and Facebook, allowing for a comprehensive analysis of the spread of hate speech and inflammatory language on the internet. Additionally, by examining more than one marginalized group, we provide broader comparisons and identify cross-interactions, as indicated by the GLMNet and XGBTree models, which highlights the promising effectiveness of machine learning models for analyzing data from social networks, providing more sophisticated strategies for interpreting networking phenomena and preventing crime increasing the efficiency of police resources to try to cut off a rapid rise in hate crimes. In addition, the use of different sources, measures, and dimensions of inflammatory language can offer insights that are more relevant than narrower interpretations of language (given the inherent bias in the collection and classification of content with each tool), pointing to new possibilities for understanding and intervening in complex social contexts. These findings support the importance of strategically using machine learning models and predictive analytics in the fight against hate crimes (either in the physical environment as also as cybercrimes) and indicate promising directions for improving preventive and interpretive approaches in online social environments. Moreover, our study puts aside the assumption that an increase in online hate speech causes an increase in hate crime to focus on other dimensions of this complex relationship. Even when our VAR models detect a temporal order of these two phenomena, first speech and then violent acts, thus confirming existing studies (Williams et al. 2020; Müller and Schwarz, 2021), we do not have empirical evidence to determine a causal effect. We understand that this relationship can be recursive (some crimes reported in the media might increase online hate) or driven by external events (major events such as protests or public discussion of legislative projects might also increase online hate) (Gómez et al. 2023). However, our findings elucidate that the presence of inflammatory language on social media toward a specific target can act as a reliable predictor of offline events, even if they are not the cause. The evidence is that the virtual hatred atmosphere, manifested by inflammatory online speech, and the offline sphere are not independent events. The theoretical connection between these two types of behaviors executed by presumably different groups of people can spark a useful debate about how social media creates a reliable representation of society. In other words, beyond the tradition of considering the production of an individual’s speech as a predictor of his or her future behavior (as in the case of terrorists who radicalize their online content before acting), these findings suggest that theoretical frameworks of social behavior may consider a more interconnected relationship between separate online communities and apparently isolated behaviors of nonsocial media users. In any case, when testing the hypothesis of the connection between online hate speech and offline crime, importantly, in agreement with previous studies, hate crimes, which have been classified as rare cases (Benier, 2017; Mills et al. 2017; Wenger and Lantz, 2022), impose some methodological challenges, given their low incidence and the difficulty of predicting less frequent events. Moreover, when comparing this relationship in different locations, the potential effect of local third variables (unemployment rates, violent crime rates, the flow of domestic and international migrants, percentage of the population belonging to a minority group, etc.) may play a significant role. As our study did not make proper geographical comparisons (we only validated Madrid against the rest of the country), these indicators could not be included as control variables, but future research may take this direction. Despite these challenges, the established connection between social media content and criminal acts provides scholars, authorities, and policymakers a clearer picture of how harmful online messages and haters are the breeding ground for hate crimes. Even if the vast majority of tweets or Facebook posts with inflammatory language against vulnerable groups such as migrants or LGBT people do not legally classify as hate speech—as defined by the penal code—or cybercrime, their emergence and propagation might follow regulatory patterns similar to those of online crimes related to minors or terrorism. Overall, our findings highlight the far-reaching influence of virtual interactions, which transcend the online environment and are interconnected with human behavior in offline society, so these virtual interactions may serve as an early indicator of an increase in hate crimes. Thus, these empirical models exhibit potential practical applications, both in the cybercriminal sphere and in the prevention of offline hate crimes. Even when the correlation between online hate speech and offline hate crime was more evident in general terms compared to the study of specific communities at higher risk (migrants and LGBT), our findings might still guide some policies to counter the effects of hatred in modern democracies. This study suggests the implementation of effective policies such as online monitoring; prediction, alert systems and risk monitoring of offline crimes by public agents and social organizations aimed at anticipating and avoiding criminal occurrences; awareness campaigns and combating underreporting; and counterdiscourse strategies to proactively address the complexity of these phenomena. In addition, our findings have significant ethical implications for digital surveillance, because by listening to social media, we can prevent crimes. However, digital surveillance might also be a source of new biases or unfair measures. This paper has several limitations. The first concerns the availability and accessibility of hate crime data. As we concluded, hate crimes appear as rare events, and the distributions are quite non-Gaussian, which means that they are not evenly distributed and do not seem to follow a typical frequency pattern. In addition, underreported hate crimes escape the control of law enforcement authorities and, consequently, affect the integrity of our data. The second limitation involves the exclusion of high-profile events in the models (i.e., not including major social events as a variable), as we presume that they might play a significant role in all the variables considered in this study. Third, social media intensively delete those posts with illegal content in relation to the penal code, then it is important to monitor social networks in real time to include these hateful messages in the models. One direction for future research is the evaluation of a media event tracking system, allowing for the inclusion of a wider range of incidents and a deeper analysis of the severity of these cases when they are reported. Avatar 1: That's an excellent question. While the study doesn't prove causation, their time series analysis showed inflammatory posts typically appeared 1-2 weeks before corresponding hate crimes occurred. Avatar 2: That temporal relationship is really significant. Were some platforms or types of language more predictive than others? Avatar 1: Interestingly, Facebook posts were better predictors than tweets, and general toxic language actually outperformed specific hate speech in forecasting violence. Avatar 2: Those nuances are really important. What practical applications might this research have? Avatar 1: The authors suggest these models could help create early warning systems - allowing authorities to monitor online toxicity spikes and potentially prevent physical violence before it happens. Avatar 2: As a quick recap, remember to always make learning a priority, keep exploring, and connect with fellow learners like Hugi Hernandez and the founders of Egreenews. Mmm, who knows, maybe you can find them on the web or LinkedIn. But anyways, please always remember to be good with yourself. Avatar 2: So, bye for now, and we hope to see you next time!

3 H portfolio

Search This Blog

how online hate speech predicts real-world violence against migrant and LGBT communities.

Comments

Post a Comment