Banning specific hateful subreddits is proving to be an effective deterrent against hate speech, according to a study that questioned whether such bans would successfully diminish hateful speech or merely relocate it (Chandrasekharan et al., 2017).
Context sensitivity in interpreting content, according to a survey of 10 major platforms (Caplan, 2018)
Automated flagging (for human review)
Caplan (2018) notes that larger companies, practicing what she calls industrial moderation, use automated detection to flag spam, child pornography, or pro-terrorism content, which then usually goes to human review.
This is consistent with YouTube’s transparency reports, which highlight the efficiency of automated flagging (built on pattern recognition) as the primary source of detection.
Individual trusted flaggers
Trusted flaggers are individuals with access to “priority flags”. They are often professionals in a field relevant to the content they flag (anti-terrorism, child safety, anti-racism, etc.). According to a YouTube transparency report from October 2018, these individuals represent only 6.2% of flaggers overall but produce more than three times as many accurate flags.
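A “priority flag” can be pictured as a review queue in which trusted flaggers' reports jump ahead of ordinary ones. The sketch below is an assumption about how such a queue might work, not a description of YouTube's actual system; the class and field names are invented.

```python
import heapq

class FlagQueue:
    """Minimal priority queue: trusted flaggers' reports are reviewed first,
    and ties are broken by submission order."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # monotonic tie-breaker preserving submission order

    def submit(self, report: str, trusted: bool = False) -> None:
        priority = 0 if trusted else 1  # lower value = reviewed sooner
        heapq.heappush(self._heap, (priority, self._counter, report))
        self._counter += 1

    def next_for_review(self) -> str:
        return heapq.heappop(self._heap)[2]

q = FlagQueue()
q.submit("ordinary report A")
q.submit("trusted flagger report", trusted=True)
q.submit("ordinary report B")
```

Even though the trusted report arrived second, it is the first one a human reviewer sees, which is the entire point of granting priority flags to vetted experts.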
Taking more time to review each report, which is only possible with low volumes (Caplan, 2018).
What doesn’t work:
Over-reliance on the community / social norms
Allowing users to reach out to moderators directly, which invites retaliation and harassment, or relying on volunteers left to fend for themselves (e.g., Reddit).
Adapting community standards to different cultural contexts.
Research on hate speech indicates that while interpretation does vary by country, there is also significant variation from one individual to another (Salminen et al., 2018).
In addition, as many people live transnational lives and online communities are increasingly the product of intersecting offline contexts, making offline and online contexts correspond is not feasible beyond tracking users’ geographic distribution.