An Absurdly Basic Bug Let Anyone Grab All of Parler’s Data
The social media platform Parler rose to prominence as an outlet for free speech. In practice, it became a haven for disinformation, hate speech, and calls for violence, the sort of content generally blocked on more mainstream platforms like Twitter and Facebook. It’s fair to say, though, that by “free speech” the site’s creators didn’t mean that anyone could freely download every message, photo, and video posted to the site, including sensitive geolocation data. But a very basic bug in Parler’s architecture nonetheless seems to have made it all too easy to do just that.
Late Sunday night, Parler went offline after Amazon Web Services cut off hosting for the social media outlet, a decision that followed the site’s use as a tool to plan and coordinate an insurrectionist, pro-Trump mob’s invasion of the US Capitol building last week. In the days and hours before that shutdown, a group of hackers scrambled to download and archive the site, uploading dozens of terabytes of Parler data to the Internet Archive. One pseudonymous hacker who led the effort and goes only by the Twitter handle @donk_enby told Gizmodo that the group had successfully archived “99 percent” of the site’s public contents, which she said includes a trove of “very incriminating” evidence of who participated in the Capitol raid and how.
By Monday, rumors were circulating on Reddit and across social media that the mass disemboweling of Parler’s data had been carried out by exploiting a security vulnerability in the site’s two-factor authentication that allowed hackers to create “millions of accounts” with administrator privileges. The truth was far simpler: Parler lacked the most basic security measures that would have prevented the automated scraping of the site’s data. It even ordered its posts by number in the site’s URLs, so that anyone could have easily, programmatically downloaded the site’s millions of posts.
Parler’s cardinal security sin is known as an insecure direct object reference, says Kenneth White, codirector of the Open Crypto Audit Project, who looked at the code of the download tool @donk_enby posted online. An IDOR occurs when a hacker can simply guess the pattern an application uses to refer to its stored data. In this case, the posts on Parler were simply listed in chronological order: Increase a value in a Parler post URL by one, and you’d get the next post that appeared on the site. Parler also didn’t require authentication to view public posts and didn’t use any sort of “rate limiting” that would cut off anyone accessing too many posts too quickly. Together with the IDOR issue, that meant that any hacker could write a simple script to reach out to Parler’s web server and enumerate and download every message, photo, and video in the order they were posted.
“It’s just a straight sequence, which is mind-numbing to me,” says White. “This is like a Computer Science 101 bad homework assignment, the kind of stuff that you would do when you’re first learning how web servers work. I wouldn’t even call it a rookie mistake because, as a professional, you would never write something like this.”
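To make the "straight sequence" White describes concrete, here is a minimal Python sketch of why sequential identifiers are trivially enumerable. It runs entirely against a made-up in-memory dictionary; the names POSTS, fetch_post, and enumerate_posts are hypothetical and do not reflect Parler's actual code or the archiving tool.

```python
from typing import List, Optional

# Hypothetical stand-in for a post store keyed by sequential integer IDs.
POSTS = {1: "first post", 2: "second post", 3: "third post"}


def fetch_post(post_id: int) -> Optional[str]:
    """Simulate a lookup that requires no authentication and has no rate limit."""
    return POSTS.get(post_id)


def enumerate_posts(start: int = 1) -> List[str]:
    """Walk IDs upward from `start`, collecting posts until a lookup misses."""
    collected = []
    post_id = start
    while True:
        post = fetch_post(post_id)
        if post is None:
            break
        collected.append(post)
        post_id += 1  # the whole "attack": add one to the previous ID
    return collected


print(enumerate_posts())
```

Because each ID reveals the next, a client needs to know nothing about the data beyond where the count starts.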
Services like Twitter, by contrast, randomize the URLs of posts so they can’t be guessed. And while they offer APIs that give developers access to tweets en masse, they carefully restrict access to those APIs. Parler, however, had no authentication for an API that offered access to all its public contents, says Josh Rickard, a security engineer for security firm Swimlane. “Honestly it seemed like an oversight, or just laziness,” says Rickard, who says he analyzed Parler’s security architecture in a personal capacity. “They didn’t think about how big they were going to get, so they didn’t do this properly.”
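Two of the safeguards mentioned in this piece, unguessable post identifiers and rate limiting, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how Twitter or any other service actually does it: new_post_id and SlidingWindowLimiter are hypothetical names, and the limits chosen are arbitrary.

```python
import secrets
import time
from collections import defaultdict, deque
from typing import Optional


def new_post_id() -> str:
    """Return a random, URL-safe identifier that reveals nothing about its neighbors."""
    return secrets.token_urlsafe(16)  # 16 random bytes, about 22 characters


class SlidingWindowLimiter:
    """Hypothetical per-client limiter: at most `max_requests` per `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client -> timestamps of recent requests

    def allow(self, client: str, now: Optional[float] = None) -> bool:
        """Record a request and report whether it falls within the limit."""
        if now is None:
            now = time.monotonic()
        hits = self._hits[client]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # too many recent requests: refuse this one
        hits.append(now)
        return True
```

A server would typically call allow() with something like the client's IP address or API key before serving each request, and hand out identifiers from new_post_id() instead of a counter. Neither measure replaces authentication on a bulk API, but either one makes a simple add-one script far less effective.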