Using AI to monitor the internet for terrorist content is inevitable, but also fraught with pitfalls

By | February 7, 2024


Millions of social media posts, photos and videos flood the internet every minute. On average, Facebook users share 694,000 stories, X (formerly Twitter) users publish 360,000 posts, Snapchat users send 2.7 million snaps, and YouTube users upload more than 500 hours of video.

This vast ocean of online material needs to be constantly monitored for harmful or illegal content, such as content that promotes terrorism and violence.

The sheer volume of content means it is not possible for people to review and check it all manually. That’s why automated tools, including artificial intelligence (AI), are vital. However, such tools also have limitations.

Intensive efforts in recent years to develop tools that identify and remove terrorist content online have been fueled in part by the emergence of new laws and regulations. These include the EU’s regulation on terrorist content online, which requires hosting service providers to remove terrorist content from their platforms within one hour of receiving a removal order from a competent national authority.

Behavior and content-based tools

Broadly speaking, there are two types of tools used to root out terrorist content. The first looks at certain account and message behaviors. This includes how old the account is, the use of trending or unrelated hashtags, and abnormal posting volumes.

In many ways this is similar to spam detection: it pays no attention to the content itself, and it is valuable for detecting the rapid spread of large volumes of content, which is often bot-driven.
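To make this concrete, here is a minimal sketch in Python of the kind of behavioral scoring such tools perform. The thresholds and input fields are hypothetical, chosen purely for illustration; real systems tune these signals on large volumes of labeled platform data.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical thresholds, for illustration only; real systems tune
# these on labeled platform data.
MIN_ACCOUNT_AGE_DAYS = 7
MAX_POSTS_PER_HOUR = 30
MAX_TRENDING_HASHTAGS = 5

def behavior_flags(account_created: datetime, posts_last_hour: int,
                   trending_hashtags_in_post: int) -> int:
    """Count how many behavioral red flags an account/post pair raises."""
    flags = 0
    age = datetime.now(timezone.utc) - account_created
    if age < timedelta(days=MIN_ACCOUNT_AGE_DAYS):          # very new account
        flags += 1
    if posts_last_hour > MAX_POSTS_PER_HOUR:                # abnormal posting volume
        flags += 1
    if trending_hashtags_in_post > MAX_TRENDING_HASHTAGS:   # hashtag stuffing
        flags += 1
    return flags

# Example: a day-old account posting 80 times an hour with 8 trending hashtags.
created = datetime.now(timezone.utc) - timedelta(days=1)
print(behavior_flags(created, posts_last_hour=80, trending_hashtags_in_post=8))  # 3
```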

The second type of tool is content-based. It focuses on linguistic features, word usage, images and web addresses. Automated content-based tools use one of two approaches.

1. Matching

The first approach compares new images or videos against an existing database of images and videos that have previously been identified as terrorist content. One challenge here is that terrorist groups are known to try to evade such methods by producing slightly modified versions of the same content.

For example, after the Christchurch terrorist attack in New Zealand in 2019, hundreds of visually distinct versions of the livestream video of the atrocity were in circulation.

To combat this, matching-based tools generally use perceptual hashing rather than cryptographic hashing. Hashes are a bit like digital fingerprints. A cryptographic hash acts as a secure, unique identification tag: changing even a single pixel in an image drastically alters its fingerprint, which prevents false matches but also means that slightly altered copies go undetected.

Perceptual hashing, on the other hand, focuses on similarity. It overlooks minor changes, such as pixel color adjustments, and identifies images with the same basic content. This makes perceptual hashes more robust to small changes in a piece of content. But it also means the hashes are not entirely random, and they could therefore potentially be used to try to recreate the original image.
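The snippet below sketches the difference, using Python’s standard hashlib for the cryptographic case and the third-party Pillow and ImageHash packages for the perceptual case. The file names, the one-item “database” and the match threshold are placeholders for illustration, not a production setup.

```python
import hashlib
from PIL import Image     # pip install Pillow
import imagehash          # pip install ImageHash

original = Image.open("original.jpg")   # placeholder file names
modified = Image.open("modified.jpg")   # same image with a few pixels changed

# Cryptographic hashing: any change to the pixel data yields a completely different digest.
sha_original = hashlib.sha256(original.tobytes()).hexdigest()
sha_modified = hashlib.sha256(modified.tobytes()).hexdigest()
print(sha_original == sha_modified)     # almost certainly False

# Perceptual hashing: visually similar images produce similar hashes.
p_original = imagehash.phash(original)
p_modified = imagehash.phash(modified)

# Matching compares distances against a database of previously identified hashes.
known_hashes = [p_original]                              # stand-in for a shared hash database
distance = min(p_modified - h for h in known_hashes)     # Hamming distance between hashes
print(distance <= 8)                    # illustrative threshold for calling it a match
```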


2. Classification

The second approach is based on classifying content. It uses machine learning and other forms of artificial intelligence, such as natural language processing. To achieve this, the AI needs many examples, such as texts that human content moderators have labeled as terrorist content and texts that they have not. By analyzing these examples, the AI learns which features distinguish the different types of content, allowing it to categorize new content on its own.

Once trained, algorithms can predict whether a new content item belongs to one of the specified categories. These items can then be removed or flagged for human review.
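A minimal sketch of this classification approach is shown below, using the scikit-learn library. The handful of placeholder training strings stand in for the large, human-labeled corpora described above; they are not real data, and a production classifier would need far more examples and careful evaluation.

```python
# Minimal text-classification sketch (pip install scikit-learn).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder texts standing in for a large corpus labeled by human moderators.
texts = [
    "placeholder post labeled as terrorist content by moderators",
    "another placeholder post labeled as terrorist content",
    "ordinary post about sport labeled as benign",
    "ordinary post about cooking labeled as benign",
]
labels = [1, 1, 0, 0]   # 1 = flagged category, 0 = benign

# Convert texts to TF-IDF features, then fit a simple linear classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# A new item gets a probability; high-scoring items can be removed or
# routed to human reviewers, as described above.
new_post = ["placeholder post resembling the flagged category"]
print(model.predict_proba(new_post)[0][1])
```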

But this approach also faces challenges. Collecting and preparing a large dataset of terrorist content to train algorithms is time-consuming and resource-intensive.

Training data can also quickly become outdated as terrorists use new terms and discuss new world events and current affairs. Algorithms also have difficulty understanding context, including subtlety and irony. They also lack cultural sensitivity, including differences in dialect and language use among different groups.

These limitations can have significant offline effects. There have been documented failures to eliminate hate speech in countries such as Ethiopia and Romania, while free speech activists in countries such as Egypt, Syria and Tunisia have reported their content being removed.

We still need human moderators

So, despite advances in artificial intelligence, human input remains vital: for maintaining databases and datasets, assessing content flagged for review, and operating appeals processes for when decisions are challenged.

But it’s a demanding and exhausting job, and there have been damning reports about the working conditions of moderators, with many tech companies like Meta outsourcing this work to third-party vendors.

To address this issue, we recommend developing a set of minimum standards for those who employ content moderators, including mental health services. There is also the potential to develop AI tools to protect the well-being of moderators. This would work, for example, by blurring some areas of the images so that moderators can make a decision without directly viewing the offending content.
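As a rough illustration of the blurring idea, the sketch below uses the Pillow imaging library. The bounding box is a hypothetical placeholder; in practice, the regions to obscure would come from an upstream detection model.

```python
# Minimal sketch of wellbeing-oriented blurring (pip install Pillow).
from PIL import Image, ImageFilter

def blur_region(image_path: str, box: tuple[int, int, int, int]) -> Image.Image:
    """Return a copy of the image with the (left, top, right, bottom) box blurred."""
    img = Image.open(image_path).convert("RGB")
    region = img.crop(box)                                   # cut out the flagged area
    region = region.filter(ImageFilter.GaussianBlur(radius=25))  # heavily blur it
    img.paste(region, box)                                   # paste the blurred area back
    return img

# Example: blur a hypothetical region before showing the image to a moderator.
preview = blur_region("flagged_image.jpg", (100, 100, 400, 400))
preview.save("flagged_image_preview.jpg")
```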

But at the same time, very few platforms have the resources to develop automated content moderation tools and employ a sufficient number of human reviewers with the necessary expertise.

Many platforms have turned to ready-made products. The content moderation solutions market is estimated to be worth $32 billion by 2031.

However, caution is needed here. Third-party providers are not currently subject to the same level of oversight as the technology platforms themselves. They may rely disproportionately on automated tools, with insufficient human input and a lack of transparency about the datasets used to train their algorithms.

That’s why collaborative initiatives between governments and the private sector are vital. For example, the EU-funded Tech Against Terrorism Europe project has developed valuable resources for technology companies. There are also openly available automated content moderation tools, such as Meta’s Hasher-Matcher-Actioner, that companies can use to build their own databases of hashed terrorist content.

International organizations, governments and technology platforms should prioritize the development of such collaborative resources. Without this, it will be difficult to effectively respond to terrorist content online.

This article is republished from The Conversation under a Creative Commons license. Read the original article.


Stuart Macdonald receives funding from the EU Internal Security Fund for the Tech Against Terrorism Europe project (ISF-2021-AG-TCO-101080101).

Ashley A. Mattheis receives funding from the EU Internal Security Fund for the Tech Against Terrorism Europe project (ISF-2021-AG-TCO-101080101).

David Wells receives funding from the Council of Europe to conduct analysis of emerging patterns in the misuse of technology by terrorist actors (ongoing).
