The maths of the #KeepPrisonsSingleSex exploit
How to hack the trending algorithm
This explainer unpacks research that is almost entirely the work of David Allsopp (aka @doublehelix on Twitter). It is based on looking at the data and on conversations with him and Meryl Links on the topic, with a great deal of appreciation for the important work of Logically.AI, who have done some excellent follow-up work.
We did it. I would estimate maybe 40 to 60 of us hacked Twitter’s trending algorithm, doing a community proof of concept exploit. THANK YOU YOU BEAUTIFUL PEOPLE!
The “trending” algorithm on Twitter seems to be designed to detect buzz about a topic based on there being a significant number of tweets relating to chunks of one or two words. So far as we can tell, the algorithm counts hits within a certain previous time period for a given hashtag. Hashtags in particular are a big part of this and make detection easier, for obvious reasons: a hashtag is all one continuous unambiguous string of letters with no spaces and a clear start and finish, making statistical analysis easier.
A tweet with a hashtag is a hit for the trending algorithm. What surprised us was that a retweet of a tweet containing a hashtag was also a hit for that hashtag. This has implications for potential manipulation by organised groups targeting trending algorithms. We inferred this by using the API to pull tweets from the last 24 hours for a given hashtag and noting that the count did not match the number of original tweets alone.
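As a rough illustration of that counting rule, here is a minimal Python sketch. It is not Twitter's implementation: the `Tweet` record, the `hashtags` and `count_hits` helpers, and the whitespace-based hashtag matching are all hypothetical, and the 24-hour window is borrowed from the API pull described above rather than from any known property of the algorithm.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Tweet:
    """Hypothetical record; not the shape of any real Twitter API object."""
    author: str
    text: str
    posted_at: datetime
    retweet_of: Optional["Tweet"] = None  # set when this record is a retweet


def hashtags(text):
    """Lower-cased hashtags in a text (naive: splits on whitespace only)."""
    return {w.lower() for w in text.split() if w.startswith("#") and len(w) > 1}


def count_hits(tweets, hashtag, now, window=timedelta(hours=24)):
    """Count tweets AND retweets carrying `hashtag` posted within `window` of `now`."""
    tag = hashtag.lower()
    hits = 0
    for t in tweets:
        if not (now - window <= t.posted_at <= now):
            continue  # outside the counting window
        # A retweet is assumed to inherit the hashtags of the tweet it repeats.
        source = t.retweet_of if t.retweet_of is not None else t
        if tag in hashtags(source.text):
            hits += 1
    return hits


now = datetime(2021, 1, 1, 12, 0)
original = Tweet("alice", "Example #DemoTag", now - timedelta(hours=1))
retweet = Tweet("bob", "", now - timedelta(minutes=30), retweet_of=original)
stale = Tweet("carol", "Old #DemoTag", now - timedelta(hours=30))
demo_hits = count_hits([original, retweet, stale], "#DemoTag", now)
print(demo_hits)  # prints 2: the original and its retweet; the stale tweet is ignored
```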
We are going to assume that users are only able to retweet a given tweet once. This isn’t entirely certain, but you do need to abuse interfaces in order to retweet a tweet more than once without removing a previous retweet. There is also an open question as to whether a quote tweet of a hashtagged tweet may count as an additional hit, but we will leave this out of our model for now. Under those assumptions:
2 users who each make 10 tweets and retweet each other’s tweets once result in:
(2 × 10) × 2 = 40 hits
2 people accounting for 40 hits on the trending algorithm is weird behaviour but not a big deal in the bigger scheme of things.
20 users who each make 10 tweets and retweet each other’s tweets once result in:
(20 people × 10 posts each) × 20 hits for each post (the original plus 19 retweets) = 4000 hits
This is a considerable increase for recruiting a handful of people to donate their accounts to this activity. For 10 times as many people you get 100 times as many trending points. A fairly large group of people trying to trend a hashtag through everyone tweeting once or even a few times on the hashtag would struggle to keep up with that.
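In general there are N² × P hits, where N is the size of the clique mutually retweeting each other and P is the number of posts per user: each of the N × P posts earns one hit for the original tweet plus N − 1 hits for the retweets, which is N hits per post.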
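The N² × P count can be checked by brute force. The sketch below is only a model of the assumptions stated earlier (every clique member retweets every other member's posts exactly once, and quote tweets are ignored); the function names are made up for illustration.

```python
def clique_hits_formula(n_users, posts_per_user):
    """Closed form: N squared times P."""
    return n_users ** 2 * posts_per_user


def clique_hits_simulated(n_users, posts_per_user):
    """Enumerate every hit explicitly instead of using the formula."""
    hits = 0
    for author in range(n_users):
        for _ in range(posts_per_user):
            hits += 1  # the original tweet
            for other in range(n_users):
                if other != author:
                    hits += 1  # one retweet from each other clique member
    return hits


for n, p in [(2, 10), (20, 10)]:
    print(n, p, clique_hits_formula(n, p), clique_hits_simulated(n, p))
```

For the two worked examples, both functions give 40 and 4000 hits respectively.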
To demonstrate how much faster this N-squared curve grows than the count from users simply tweeting their own contributions to the hashtag: the red line on the graph below shows one post for each user, with every user retweeting all the other users in a clique. The blue line shows every user just tweeting their own contribution to the hashtag. The difference is striking: it is impossible for normal organic human behaviour to compete with a small cell deliberately retweeting each other in a cluster to try and hack the trending algorithm.
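The two curves can also be tabulated directly. This is a small sketch under the same model, with one post per user; the function names and the group sizes chosen are arbitrary illustrations, not measured data.

```python
def organic_hits(n_users):
    """Every user tweets once on the hashtag and nobody retweets."""
    return n_users


def clique_hits(n_users):
    """Every user tweets once and retweets every other clique member once."""
    return n_users * n_users


print(f"{'users':>6} {'organic':>8} {'clique':>8}")
for n in (1, 5, 10, 20, 50, 100):
    print(f"{n:>6} {organic_hits(n):>8} {clique_hits(n):>8}")
```

Read the other way round, a clique of N users matches N² organic tweeters under this model: 50 coordinated accounts produce as many hits as 2500 people each tweeting once.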
This is an easy exploit behaviour pattern to detect and conforms closely to the signal detected by David Allsopp previously, and, so far as I understand it, to the #KeepPrisonsSingleSex trend observed more recently by Logically.AI.
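One way such a pattern might show up in data is as an unusually dense retweet network. The sketch below is an assumption of ours, not the method used by David Allsopp or Logically.AI: it measures what fraction of all possible "A retweeted B" pairs among the participants actually occurred. The `retweet_density` name and the example data are hypothetical.

```python
def retweet_density(edges):
    """Fraction of possible ordered (retweeter, author) pairs that occur.

    `edges` is an iterable of (retweeter, author) pairs for one hashtag.
    A fully mutual clique scores 1.0; a crowd retweeting one account scores low.
    """
    pairs = {(r, a) for r, a in edges if r != a}
    users = {u for pair in pairs for u in pair}
    n = len(users)
    if n < 2:
        return 0.0
    return len(pairs) / (n * (n - 1))


# Five accounts that all retweet each other.
clique_edges = [(r, a) for r in range(5) for a in range(5) if r != a]
# Ten unrelated accounts that each retweet one popular author.
organic_edges = [(f"fan{i}", "author") for i in range(10)]

clique_density = retweet_density(clique_edges)
organic_density = retweet_density(organic_edges)
print(clique_density, organic_density)
```

Any threshold separating "suspicious" from "normal" density would have to be tuned on real data, which this sketch does not attempt.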
Without a clear view of the technical constraints Twitter is under, it is hard to be sure how best to fix this. One potential solution would be to clamp the number of points any given user is allowed to contribute to a hashtag’s trending score to a given maximum. There would still be a valid metric for mild, moderate and enthusiastic engagement, but anything beyond the clamping limit would not drive the trending rate up any further.
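A minimal sketch of what clamping could look like, assuming each hit (tweet or retweet) is attributed to the user who made it. The cap of 5 is an arbitrary illustration, and `raw_score` and `clamped_score` are hypothetical names, not anything Twitter is known to use.

```python
from collections import Counter


def raw_score(hits):
    """Unclamped score: every hit counts."""
    return len(hits)


def clamped_score(hits, max_per_user):
    """Score where each user contributes at most `max_per_user` hits."""
    per_user = Counter(hits)
    return sum(min(count, max_per_user) for count in per_user.values())


# Clique of 20 users, 10 posts each; every post is tweeted by its author
# and retweeted by the other 19, so all 20 users get one hit per post.
clique = [user for _author in range(20) for _post in range(10) for user in range(20)]

# 500 unrelated users who each tweet once on the hashtag.
organic = [f"organic{i}" for i in range(500)]

print(raw_score(clique), clamped_score(clique, 5))    # prints 4000 100
print(raw_score(organic), clamped_score(organic, 5))  # prints 500 500
```

Under this model the clique's 4000 hits collapse to 100 while the organic crowd's score is unchanged, which is the effect the clamp is meant to have.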
Another solution could be to allow flagging of trends to trigger detection in special cases so as to root out synthetic/malicious attempts to manipulate the algorithm.
Both of those carry significant computational costs, so a cheaper alternative would be to simply stop counting retweets.