The CFAA: Shield or Anti-Competitive Sword in the World of Data Scraping?

Fish & Richardson

Most people think of the Computer Fraud and Abuse Act (CFAA), 18 U.S.C. § 1030, as the federal criminal statute addressing computer hacking and other cybercrime. But as more and more businesses vest their enterprise value in data, and as data goes digital, the civil provisions of the CFAA are getting more use as a litigation weapon. One hot area in which civil litigants have used the CFAA is in so-called data scraping cases. Companies whose business models are built on aggregating, managing, analyzing, and/or displaying data have tried to wield the CFAA against upstarts and competitors who collect—or “scrape”—the data available on their websites.  These cases have had varying results.

On August 14, 2017, a decision in hiQ Labs, Inc. v. LinkedIn Corp., 2017 U.S. Dist. LEXIS 129088, Case No. 17-cv-03301-EMC (N.D. Cal. Aug. 14, 2017) raised fresh questions about the applicability of the CFAA to this practice. Before we dive in, a refresher may be in order.

The CFAA in a Nutshell

The CFAA, enacted in 1986 and amended several times since, creates civil and criminal liability for any person who “intentionally accesses a computer without authorization or exceeds authorized access, and thereby obtains . . . information from any protected computer.” 18 U.S.C. § 1030(a)(2)(C). Two things of note: First, a “protected computer” includes one that “is used in or affecting interstate or foreign commerce or communication,” which means pretty much any ordinary computer. Second, there are two possible ways to violate the CFAA: (1) accessing a protected computer without authorization, or (2) exceeding authorized access to one. See Musacchio v. United States, 136 S. Ct. 709, 713 (2016).

Both the “without authorization” and “exceeds authorized access” prongs have given the courts plenty to chew on, especially with the advent of the Internet. A standalone computer is easy. But who is an authorized user when it comes to accessing a website hosted on a “protected computer”? HBO’s blessing aside, is borrowing another’s credentials to access a website, with that user’s permission, accessing a computer “without authorization” or “exceeding authorized access”? What if one violates a website’s Terms of Service, even if one is an authorized user?

Some of these questions have resulted in a dizzying array of circuit splits, and the Supreme Court will consider petitions for writs of certiorari in two cases from the Ninth Circuit, Power Ventures v. Facebook and Nosal v. United States, that tackle the meaning of an authorized user under the CFAA. The hiQ case we’ll take a look at shortly highlights the urgent need for clarity and perhaps even reform in this area.

Before We Go On…What Is Data Scraping?

Data scraping, also known as web scraping, screen scraping, or report mining, is essentially the harvesting or extraction of data that another website or program presents to human users. It is most often done by a script, or a bot, that is built to accomplish this efficiently and automatically. For example, a data management company may wish to scrape data from its competitor’s website on behalf of a customer who is looking to transition over to its services. Or a data analytics company may wish to scrape data from an aggregator such as Facebook, LinkedIn, or Twitter, and offer a secondary service to its customers (e.g., putting together a calendar of all your friends’ birthdays). Frequently, scraping is not desired by the website owner, and website terms of service often seek to prohibit it. Every time a website is called up in a browser, a computer (the hosting server) is technically accessed, so data scraping squarely raises a CFAA question: is it permissible under the statute?
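For readers who have not seen a scraper up close, the sketch below illustrates the basic mechanics described above: a short script reads a page's markup and pulls out structured data. It is a minimal illustration only, not any party's actual software. The sample page, the `tour`, `name`, and `price` class names, and the `TourScraper` and `scrape` helpers are all hypothetical, and the page is an inline string rather than a live website, so no server is accessed.

```python
from html.parser import HTMLParser

# Hypothetical page markup (an assumption for illustration); a real
# scraper would retrieve markup like this from a website's server.
SAMPLE_PAGE = """
<ul>
  <li class="tour"><span class="name">Paris Weekend</span><span class="price">$899</span></li>
  <li class="tour"><span class="name">Rome Explorer</span><span class="price">$1,049</span></li>
</ul>
"""


class TourScraper(HTMLParser):
    """Collects the text of <span class="name"> and <span class="price"> elements."""

    def __init__(self):
        super().__init__()
        self.current_field = None  # field whose text we are currently inside
        self.rows = []             # one dict per tour listing

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "tour":
            # A new listing begins: start a fresh record.
            self.rows.append({})
        elif tag == "span" and css_class in ("name", "price"):
            self.current_field = css_class

    def handle_data(self, data):
        # Store text only while inside a field we care about.
        if self.current_field and self.rows:
            self.rows[-1][self.current_field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self.current_field = None


def scrape(html):
    """Return a list of {"name": ..., "price": ...} dicts found in the markup."""
    parser = TourScraper()
    parser.feed(html)
    return parser.rows


rows = scrape(SAMPLE_PAGE)
for row in rows:
    print(row["name"], row["price"])
```

The point of the sketch is that the script reads the same markup a browser would display to a human visitor; what changes is the speed and scale at which the information can be collected, which is where the legal questions begin.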

Data Scraping and the CFAA

Unsurprisingly, it depends. Courts that have faced this question have made decisions that are heavily fact-dependent, turning on things such as:

  • whether the information is behind a paywall or password protected;
  • whether there are governing Terms of Service that apply to the scraper;
  • whether the website owner has made its disapproval of the scraping conduct known;
  • whether the website owner had technical measures in place to block the scraping activity, which the scraper circumvented;
  • whether the scraper’s use of the data supplants the need for the scrapee’s website; and
  • who the scraped data belongs to.

An interesting pair of cases involving competing tour companies gives some insight into how one court, at least, wrestled with this issue. EF Cultural Travel sued a number of its competitors, accusing them of using scraper software to systematically collect information regarding its tour offerings and pricing (all of which was publicly available, but coded and scattered across a number of pages) and thereby undercut its prices.

In EF Cultural Travel BV v. Explorica, Inc., 274 F.3d 577 (1st Cir. 2001), the First Circuit upheld a preliminary injunction against Explorica, holding that Explorica may very well have “exceeded authorized access” within the meaning of the CFAA through its use of scraping software. The Court relied heavily on the fact that the Explorica executive who commissioned the scraping software was a former EF employee, who disclosed EF’s proprietary information to the software engineer in violation of his broad confidentiality agreement with EF. Based on this information, the scraper was able to translate EF’s proprietary codes to correspond to EF’s offerings, giving the defendants the information they needed to compete on price.

Two years later, in EF Cultural Travel BV v. Zefer Corp., 318 F.3d 58 (1st Cir. 2003), the First Circuit considered the same issue against a different defendant, Zefer, who had signed no such confidentiality agreement. Here, the Court held that while the lack of authorization under the CFAA may be implicit rather than explicit (such as via password protection), the district court’s standard of gauging the “reasonable expectations” of the website owner was too nebulous and “litigation-spawning” to be the correct standard. “If EF wants to ban scrapers, let it say so on the webpage or a link clearly marked as containing restrictions.” The Court reasoned that basing CFAA liability on the unhappiness of a website owner at having competitors compete by looking at its website “would raise serious public policy concerns.”

Another court went further in the context of publicly-available information. In Cvent, Inc. v. Eventbrite, 739 F. Supp. 2d 927 (E.D. Va. 2010), the court held that no CFAA liability could lie where the data that was scraped was publicly available, “without requiring any login, password, or other individualized grant of access.” Even though the plaintiff’s Terms of Use stated that access by competitors to its site or information was unauthorized, the court gave weight to the fact that the site took no affirmative steps to screen out competitors, concluding that its website was “not protected in any meaningful fashion by its Terms of Use or otherwise.”

Two other cases are worth briefly discussing to set the stage. In Craigslist Inc. v. 3Taps Inc., 964 F. Supp. 2d 1178 (N.D. Cal. 2013), 3Taps was accused of scraping ads that were posted on Craigslist and republishing them on its own site. After Craigslist sent 3Taps a cease and desist letter, revoked its authorization to access its website, and blocked 3Taps’s IP address, 3Taps found its way around these measures by using different IP addresses and proxy servers to hide its identity and continue scraping. The Court denied 3Taps’s motion to dismiss, holding that 3Taps’s continued access in the face of Craigslist’s revocation of authorization and IP blocking measures met the “without authorization” prong of the CFAA. In another case, Facebook v. Power Ventures, 844 F.3d 1058 (9th Cir. 2016), the Ninth Circuit affirmed that Power violated the CFAA when it scraped Facebook’s proprietary material after Facebook users provided Power with their login credentials, giving heavy weight to the fact that Facebook had rescinded authorization for Power to access its website when it sent Power a cease and desist letter and blocked its IP address.

hiQ v. LinkedIn: Data Scraper Turns the Tables

Which brings us to the recent decision in hiQ v. LinkedIn. hiQ Labs’ business relies on scraping and analyzing data about LinkedIn users that is presented publicly by LinkedIn, the leading professional networking website on the Internet. hiQ sells products designed to help employers assess their workers’ skills, as well as predict which of them are at the greatest risk of being recruited away. In May 2017, LinkedIn sent hiQ a cease and desist letter, noting that its User Agreement prohibited data collection from its website and making clear that further access to LinkedIn’s site “of any kind” would be “without permission and without authorization from LinkedIn.” LinkedIn further implemented technical measures to block hiQ from accessing and scraping its data.

Blocked from accessing even the public data on LinkedIn’s site, hiQ brought a declaratory judgment action and sought a preliminary injunction from the Court to compel LinkedIn to allow hiQ to regain access. The key and threshold legal question was whether hiQ’s conduct—“accessing public LinkedIn profiles after LinkedIn has explicitly revoked permission to do so”—violated the CFAA. LinkedIn argued that the Ninth Circuit’s decision in Facebook v. Power Ventures, affirming a CFAA violation where the data scraper had continued to access a website despite a cease and desist letter and technical blocks, should defeat hiQ’s motion for a preliminary injunction.

The Court disagreed, finding “serious questions as to the applicability of the CFAA” to hiQ’s conduct. While acknowledging that the act of viewing a website was a literal “access” to the computer hosting the website, the Court was troubled by the notion of handing the decision of whether CFAA liability attached—including criminal liability—entirely to the website owner, who could unilaterally choose to revoke access of its otherwise-public site to any user, at any time. It distinguished Facebook by observing that the data that hiQ had sought to access was data that LinkedIn had itself chosen to make public, unlike the user profiles in Facebook that were password protected. And it went on to examine and apply the principles of trespass to the digital realm based on the work of Professor Orin Kerr, likening the Internet in general and social networking sites in particular to “speaking and listening in the modern public square.” In this paradigm, the dividing line for the Court in assessing authorization under the CFAA was the use of authentication systems, like passwords, which serve as analogues to gates and barriers in physical space.

Importantly, the Court specifically addressed a user’s deployment of data scraping programs or bots to access and scrape a website. Having concluded that one’s access of publicly-available data on a website is likely not an access “without authorization” under the CFAA, the Court noted that the method by which this underlying authorized access occurs, whether or not the bots circumvent technological measures put in place to prevent their use, would similarly not violate the CFAA.

After concluding that serious questions had been raised as to whether the CFAA would apply to hiQ’s scraping conduct, the Court went on to analyze the state law claims and also found that hiQ had raised serious questions as to whether LinkedIn’s blocking of hiQ’s access constituted unfair competition under California law. Again, the fact that the scraped data was publicly available factored heavily into the Court’s analysis. It ultimately granted hiQ’s preliminary injunction motion, enjoining LinkedIn from preventing hiQ’s access, copying, or use of public profiles on LinkedIn’s website, including through the use of legal or technical blocks. While citing 3Taps at one point in the decision, the Court in hiQ seemed to arrive at a different conclusion despite parallels in the two fact patterns: both involved cease and desist letters, the website owner’s revocation of the scraper’s authorization to access, and the owner’s use of technical measures to block the scraper from collecting data that was otherwise available to the public. LinkedIn has appealed the decision.

Conclusion: Important Decisions Ahead?

The hiQ decision is yet another data point, and not an insignificant one, for companies that scrape data as part of their business model. The landscape is far from settled. The caselaw so far raises interesting questions about statutory interpretation, the ownership of data (especially in the social media context), and the applicability of ancient legal principles to the Internet age. Applying the decades-old CFAA to the Internet throws up myriad other implications, including First Amendment concerns, the meaning of statutory language in a dual-use (i.e., civil and criminal) statute, and the elusive balance between privacy and accessibility in a lightning-fast industry that is built on leveraging users’ data. These issues, some of which may be tackled by the Supreme Court in Nosal and Facebook, warrant more analysis as the law in this area develops. As always, stay tuned.


DISCLAIMER: Because of the generality of this update, the information provided herein may not be applicable in all situations and should not be acted upon without specific legal advice based on particular situations. Attorney Advertising.

© Fish & Richardson
