On social media, it is no longer only people who broadcast their opinions; increasingly, bots and trolls acting on behalf of parties, organizations and even foreign governments do so as well. A team of researchers has now tested whether and how content-based machine-learning algorithms can recognize posts from such externally controlled influence campaigns. It turned out that troll posts do have characteristic features, which the researchers' AI system recognized even across platforms. However, because the trolls keep changing their strategies, the AI has to be retrained again and again.
Nowhere do statements and opinions spread as quickly as on social media. Facebook, Twitter and the like have long since become the mouthpiece and mood barometer of modern society. But that also harbors dangers. As early as the 2016 US presidential election campaign, it became clear that it was not only the politicians involved and their supporters who were campaigning for or against the candidates on social media: Russian trolls and bots also interfered, as studies have shown. “The features that make social media so useful for activists – low barriers to entry, scalability, easy division of labor and the ability to post media from almost anywhere in a country – also make the networks vulnerable to industrialized manipulation campaigns by their own or foreign governments,” explains Meysam Alizadeh of Princeton University and his colleagues. Between 2013 and 2018 alone, there were at least 53 such large-scale influence attempts in 24 countries.
Learning algorithm for troll hunting
Because of the sheer volume of troll and bot posts and the targeted manipulation through false statements, the operators of social media platforms can hardly find, flag or delete all suspicious posts. Machine-learning algorithms are already being used to find and filter such messages, but so far with only limited success. “The key question is how industrialized information campaigns can be distinguished from organic, normal activity,” the researchers say. It is also important to recognize characteristics across platforms, because the campaigns are usually not confined to a single social network. To find out whether that is possible, Alizadeh and his colleagues trained a content-based machine-learning algorithm and then confronted it with a range of test situations.
The study focused on a specific, particularly common type of social media post: a short text combined with a link. As training material, the researchers used data sets from platforms such as Twitter, Reddit and Facebook comprising a total of 7.2 million posts, from trolls as well as from normal users. In each test, the AI system was given one month of data to learn from, with the troll posts in this data set labeled. It then had to recognize posts by the same or other trolls in the data from the following month or year, based on the characteristics it had learned. In a complementary experiment, the AI was first trained on Twitter and then set to find trolls on Reddit, and vice versa. The researchers ran the tests in the English-language areas of the platforms, looking specifically for influence campaigns from Russian, Chinese and Venezuelan sources.
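How such a month-to-month test can look in code is sketched below. This is a minimal illustration and not the authors' actual pipeline: the feature choice (word n-grams of the post text plus the domain of the linked URL) and the logistic-regression model are assumptions made here for the sake of example, since the article does not specify the study's exact method.

```python
# Minimal sketch of a content-based troll classifier: train on one month of
# labeled posts, evaluate on the next month. The features (word n-grams plus
# the linked URL's domain) and the model are illustrative assumptions, not
# the study's actual pipeline.
from urllib.parse import urlparse

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline


def post_to_text(post):
    """Combine a post's short text with the domain of its linked URL."""
    domain = urlparse(post["url"]).netloc if post.get("url") else ""
    return f"{post['text']} {domain}"


def train_and_evaluate(train_posts, train_labels, test_posts, test_labels):
    """Fit on month t, score on month t+1 (labels: 1 = influence campaign, 0 = organic)."""
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=2),
        LogisticRegression(max_iter=1000),
    )
    model.fit([post_to_text(p) for p in train_posts], train_labels)
    predictions = model.predict([post_to_text(p) for p in test_posts])
    return f1_score(test_labels, predictions)


# Cross-platform transfer works analogously: fit on Twitter posts from one
# period, then call the fitted model's predict() on Reddit posts.
```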
Recognizable across accounts and platforms
The tests showed that in almost all variants, the algorithm was able to recognize which posts were part of an externally controlled influence campaign and which were not, the scientists report. Detection succeeded even when the algorithm had been trained on other trolls or other campaigns and therefore had to transfer what it had learned. “The industrialized campaigns leave a characteristic signal in the content that allows these influence operations to be followed from month to month and across different accounts,” the researchers report. The posts often gave themselves away through their links: they pointed to websites that were also being promoted by countless other trolls, or the combination of the linked site with the political content and context of the post was itself telltale. Some, for instance, contained URLs to local pages but mentioned people in the post text who did not fit that local focus. Overall, the Venezuelan trolls were the easiest to spot; the Chinese and Russian campaigns were more subtle.
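One of the signals described above – links to websites that many other known troll accounts are also pushing – can be approximated with a simple heuristic. The sketch below is an illustrative assumption, not the study's actual metric; the post fields (`url`, `account`) and the threshold are hypothetical.

```python
# Illustrative heuristic, not the study's metric: flag posts whose linked
# domain is also promoted by many distinct known troll accounts.
from collections import defaultdict
from urllib.parse import urlparse


def domains_pushed_by_trolls(known_troll_posts, min_accounts=10):
    """Return domains linked by at least `min_accounts` distinct troll accounts."""
    accounts_per_domain = defaultdict(set)
    for post in known_troll_posts:
        if post.get("url"):
            accounts_per_domain[urlparse(post["url"]).netloc].add(post["account"])
    return {d for d, accs in accounts_per_domain.items() if len(accs) >= min_accounts}


def link_looks_suspicious(post, troll_domains):
    """True if the post links to a domain heavily promoted by known trolls."""
    return bool(post.get("url")) and urlparse(post["url"]).netloc in troll_domains
```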
According to the scientists, such content-based search algorithms offer a chance to counter the flood of externally controlled influence campaigns – and to do so across platforms. “You can use it to estimate in real time how many of these trolls are out there and what they are talking about,” says co-author Jacob Shapiro of Princeton University. “The detection is not perfect, but it could force the actors to become more creative or even abandon their campaigns.” However, the tests also showed that the influence campaigns have learned over time and become more sophisticated. If the trolls change their strategy and their characteristic features, the algorithm can only recognize them again once it has received enough new training material. That is why this AI-based troll hunt is no panacea, Alizadeh and his colleagues emphasize. But given adequate funding and the will of the platform operators, it can help in the fight against trolls.
Source: Meysam Alizadeh (Princeton University, USA) et al., Science Advances, doi: 10.1126/sciadv.abb5824