Web Scraping Legal Battle: LinkedIn vs. hiQ Labs and What It Means for Data Collection
Web scraping exists in a legal gray area that many find confusing. Is automatically collecting publicly available information legal, or does it cross into unauthorized territory? A landmark case between LinkedIn and hiQ Labs has helped shape the current understanding of web scraping legality in the United States.
What is Web Scraping?
At its core, web scraping is simply using automated tools to collect information from websites. Rather than manually copying and pasting data, a scraper program can extract information quickly and efficiently. For example, a scraper could collect all product prices from an e-commerce site in seconds, a job that would take hours by hand.
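To make the idea concrete, here is a minimal sketch of the price example using only Python's standard library. The HTML snippet, the "price" class name, and the PriceParser and extract_prices names are all assumptions made up for illustration; a real site would have its own markup, and a real scraper would download the page rather than read a hard-coded string.

```python
from html.parser import HTMLParser

# Hypothetical markup: a real site will use its own tags and class names.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Kettle</span><span class="price">$24.99</span></li>
  <li class="product"><span class="name">Toaster</span><span class="price">$39.50</span></li>
</ul>
"""


class PriceParser(HTMLParser):
    """Collects the text of every <span class="price"> element as a float."""

    def __init__(self):
        super().__init__()
        self.in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "span" and ("class", "price") in attrs:
            self.in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_price = False

    def handle_data(self, data):
        # Only text found inside a price span is recorded.
        if self.in_price:
            self.prices.append(float(data.strip().lstrip("$")))


def extract_prices(html):
    parser = PriceParser()
    parser.feed(html)
    return parser.prices


print(extract_prices(SAMPLE_HTML))  # prints [24.99, 39.5]
```

The same loop that handles two products here would handle two thousand, which is where the speed advantage over manual copying comes from.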
The David vs. Goliath Battle
hiQ Labs was a small startup that developed tools to predict when employees might leave their jobs. Their methodology involved scraping publicly available LinkedIn profiles—information anyone could view without logging in. No passwords were cracked, no private areas accessed.
LinkedIn, however, took strong exception to this practice. It sent hiQ a cease and desist letter, blocked hiQ’s IP addresses, and implemented technical barriers to prevent scraping of its platform.
Instead of backing down, hiQ Labs took an unexpected approach—they sued LinkedIn, arguing that the tech giant was attempting to monopolize public information.
The Legal Arguments
LinkedIn’s primary legal argument rested on the Computer Fraud and Abuse Act (CFAA), a U.S. law designed to combat hacking. They claimed hiQ’s scraping constituted “unauthorized access” to their systems.
hiQ countered that viewing publicly available web pages cannot, by definition, constitute unauthorized access, since those pages are designed to be viewed by anyone.
The Court Rulings
In 2019, the Ninth Circuit Court of Appeals sided with hiQ Labs in a significant victory. The court upheld a preliminary injunction that barred LinkedIn from blocking hiQ, reasoning that scraping publicly available data likely did not violate the CFAA, because a page open to everyone has no authorization requirement to bypass.
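LinkedIn petitioned the Supreme Court to hear the case. In June 2021, the Court decided another case, Van Buren v. United States, which narrowed how the CFAA’s “exceeds authorized access” language is interpreted. It then vacated the Ninth Circuit’s decision and sent LinkedIn vs. hiQ back down for reconsideration in light of Van Buren. In April 2022, the Ninth Circuit reaffirmed its earlier ruling in hiQ’s favor.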
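The victory was not total, though. In late 2022, the district court found that hiQ had breached LinkedIn’s user agreement, and the two companies settled shortly afterward, ending more than five years of litigation.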
The Legacy and Current State of Web Scraping Law
The case established an important precedent, binding within the Ninth Circuit and influential elsewhere: scraping public data is not automatically illegal under U.S. federal hacking laws like the CFAA. However, this doesn’t mean scraping is always permissible.
Companies can still:
- Block scrapers through technical means
- Sue under state laws, such as those covering trespass to chattels or unfair competition
- Enforce their terms of service
The reality is that while scraping public data may be legally defensible, companies with resources can make it extremely difficult in practice through technical barriers and prolonged legal challenges.
Best Practices for Ethical Web Scraping
For those considering web scraping, here are some guidelines to minimize risk:
- Only scrape public, non-personal data
- Be respectful of server resources—don’t overload websites
- Carefully review terms of service
- Never scrape private areas requiring authentication
- Consider ceasing operations if directly asked to stop
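Two of these guidelines translate fairly directly into code: staying out of areas a site has marked off-limits and pacing requests so the server is not overloaded. The sketch below is one possible approach, not a legal safeguard. It uses Python's standard urllib.robotparser to read a robots.txt file and a small Throttle class to space out requests. The bot name, the two-second delay, the sample robots.txt rules, and the example.com URLs are all hypothetical, and the rules are parsed from an inline string so the example runs without touching the network.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical values for illustration only.
USER_AGENT = "example-research-bot"
MIN_DELAY_SECONDS = 2.0

# Inline robots.txt so the sketch runs offline. A real scraper would load the
# site's own file with RobotFileParser.set_url(...) followed by .read().
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
Disallow: /login
"""


def build_rules(robots_txt):
    """Parses robots.txt text into a RobotFileParser."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def allowed_urls(rules, urls, user_agent=USER_AGENT):
    """Keeps only the URLs that robots.txt permits for this user agent."""
    return [url for url in urls if rules.can_fetch(user_agent, url)]


class Throttle:
    """Enforces a minimum gap between consecutive requests."""

    def __init__(self, min_delay):
        self.min_delay = min_delay
        self.last_request = None

    def wait(self):
        now = time.monotonic()
        if self.last_request is not None:
            remaining = self.min_delay - (now - self.last_request)
            if remaining > 0:
                time.sleep(remaining)
        self.last_request = time.monotonic()


rules = build_rules(SAMPLE_ROBOTS_TXT)
candidates = [
    "https://example.com/products",
    "https://example.com/account/settings",
    "https://example.com/login",
]
throttle = Throttle(MIN_DELAY_SECONDS)
for url in allowed_urls(rules, candidates):
    throttle.wait()  # no pause before the first request, then at least 2 s apart
    print("would fetch:", url)  # the actual HTTP request would go here
```

Keep in mind that robots.txt is a voluntary convention rather than a legal boundary, so following it does not replace reading the terms of service or honoring a direct request to stop.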
The key takeaway is that while the law may technically be on your side in some circumstances, practical considerations often make aggressive scraping against a company’s wishes problematic.
Web scraping remains a powerful tool for data collection, but one that should be deployed with careful consideration of both legal and ethical boundaries.