Court Rules That ‘Scraping’ Public Website Data Isn’t Hacking

The Ninth Circuit Court of Appeals shot down LinkedIn's claim that a company using its public-facing data was violating the Computer Fraud and Abuse Act.

Scraping public data from a website doesn’t constitute “hacking,” according to a new court ruling that could dramatically limit abuse of the United States’ primary hacking law.

The ruling comes after a lengthy battle between data analytics firm HiQ Labs and Microsoft-owned LinkedIn, which have been at each other’s throats for several years over HiQ Labs’ practice of scraping the business social networking website’s public-facing data, then selling it (fused with other datasets) to a laundry list of employers. In its ruling, the Ninth Circuit Court of Appeals shot down LinkedIn’s claim that access to this public data violated the Computer Fraud and Abuse Act (CFAA). The court held that to violate the CFAA, somebody would need to actually "circumvent [a] computer's generally applicable rules regarding access permissions, such as username and password requirements," meaning it’s not really hacking if you’re not bypassing some kind of meaningful authorization system.

The CFAA, enacted in 1986, makes it a crime to access a computer “without authorization.” In the wake of the internet’s creation, however, there’s been ample debate about what “authorization” really means, with previous court rulings taking different and sometimes conflicting positions on the subject. As a result, the law is often abused by creative prosecutors to bring charges against targets that may not actually have much to do with computer hacking. That was the case in the prosecution of noted internet activist Aaron Swartz, who committed suicide in early 2013 while awaiting trial for what amounted to the violation of a terms of service agreement. Corporations have also taken advantage of the law’s poor wording for financial gain. Facebook, for example, has used the law to suggest users are violating the CFAA if they ignore a cease and desist letter from the data’s owner. Craigslist has also historically used the CFAA to hamstring companies that use publicly available data to build competing and potentially superior products.

HiQ Labs makes its money by scraping information on LinkedIn profiles that LinkedIn users have set to be viewable to the broader internet. It packages that data with public data gleaned from other websites, then sells it to employers looking for more insight into the employment pool. Researchers also sometimes scrape data for public interest purposes. Wanting monetization of this data all to itself, LinkedIn began sending cease-and-desist letters to HiQ and other companies in 2016, threatening to sue. HiQ sued first, demanding an injunction and a declaratory ruling that the practice was legal. A district court then issued a preliminary injunction, bringing us to this week’s ruling by the Ninth Circuit.

“LinkedIn has no protected property interest in the data contributed by its users, as the users retain ownership over their profiles,” the court ruled. “And as to the publicly available profiles, the users quite evidently intend them to be accessed by others, including for commercial purposes.”

This latest decision finally puts many questions to bed, pending appeal. Electronic Frontier Foundation Senior Staff Attorney Andrew Crocker told Motherboard that the ruling was by and large a good thing.

“This sort of scraping is a commonplace technique that supports research in the public interest, among other beneficial uses,” Crocker said. “As the court recognized, access to publicly available websites is not access ‘without authorization’ under the CFAA, nor does sending a cease and desist letter make such access unauthorized.” Crocker was quick to note that the CFAA has long been used to “chill speech and paint benign and even competitive uses of technology as malicious,” and he added that “courts and Congress should continue to ensure that the law is limited to its intended purpose.”

Dylan Gilbert, a privacy expert at consumer group Public Knowledge, also applauded the ruling, but told Motherboard that the United States still needs a cohesive privacy law giving consumers not only transparency into the scope of datasets being collected, but control over how this data is used.

“A bad actor could, for example, build a comprehensive profile on an individual that includes public information and then sell such information to the highest bidder, leading to a host of harms like lost opportunity, predatory lending, and unfair discrimination in housing, credit or education,” Gilbert said. “We need comprehensive federal privacy legislation that goes beyond notice and consent to restrict harmful data uses,” he added.

LinkedIn is likely to file an appeal, and given that the circuit courts remain split on the scope of the CFAA, a Supreme Court ruling will likely have to clarify things down the road.