Key Takeaways:
- In the closely watched Thomson Reuters v. ROSS Intelligence case, a federal judge has ruled that an AI developer’s use of copyrighted material was not fair use as a matter of law.
- While the AI in question differed from the generative AI models that have led to extensive litigation around copyright and fair use, and the case involved direct competitors building similar products, the decision offers some insight into how courts will evaluate AI training data going forward.
- The court concluded that depriving a copyright owner of the ability to license their work as AI training data undercuts the fair use defense, which may have significant implications in other cases poised for decisions this year.
In a ruling with potential implications for other pending generative artificial intelligence (“AI”) copyright cases, the United States District Court for the District of Delaware in Thomson Reuters Enterprise Centre GmbH & West Publishing Corp. v. ROSS Intelligence Inc. has granted summary judgment for Thomson Reuters on direct copyright infringement and related defenses, as well as fair use.
While the AI technology utilized by ROSS in this case was not generative AI (i.e., it does not create new content based upon user prompts but rather returns search results), the opinion provides a helpful early barometer for how future courts will evaluate copyright infringement claims, and the fair use defense, in the context of AI technologies.
Summary of the Case
The Thomson Reuters case was one of the first to allege AI copyright infringement, though its facts differ from the many copyright cases challenging generative AI that followed in its wake.
Thomson Reuters offers one of the leading legal search engines on the market, Westlaw, which contains not only copies of cases, statutes and other legal materials but also content created by Westlaw itself to assist in legal research. ROSS is a legal research startup that sought to offer a search engine for lawyers that relied on AI (though not generative AI) to produce results. Historically, many of the leading legal research tools (including Westlaw) used standard text-based search methods to identify relevant cases or statutes. ROSS sought to train an AI to provide better results, but it needed training data for that AI.
ROSS first sought to license Westlaw materials directly, but Thomson Reuters refused to give its data to a competitor. ROSS then found another source of data—“Bulk Memos” created by a third-party company using Westlaw resources as a guide. As the court explained, “Ross built its competing product using Bulk Memos, which in turn were built from Westlaw headnotes.”
Thomson Reuters filed suit in 2020, alleging that ROSS had used its copyrighted Westlaw headnotes and other materials as training data for ROSS’s competing legal AI product. Thomson Reuters claimed that thousands of ROSS’s legal memos were infringing copies of its Westlaw headnotes.
After discovery, Thomson Reuters sought summary judgment for direct copyright infringement on a subset of 2,830 headnotes. Both parties also moved for summary judgment on ROSS’ defense that it was making fair use of the Westlaw materials. In its first summary judgment ruling, issued on September 25, 2023, the court denied both parties’ motions for summary judgment on copyright infringement and fair use. The parties then filed renewed motions for summary judgment in October 2024.
Tuesday’s decision substantially reversed the court’s initial summary judgment ruling in September 2023. Interestingly, the change in outcome was not due to a change in law or fact—rather, as the case proceeded after the initial summary judgment ruling, the court (according to the new opinion) “studied the case materials more closely” and “invited the parties to renew their summary-judgment briefing” rather than proceed immediately to trial.
Reconsidering the issue of direct copyright infringement, the court held that ROSS had directly infringed 2,243 of the Westlaw headnotes at issue. In its prior ruling, the court declined to find direct infringement, concluding instead that a jury would need to determine whether the Westlaw headnotes were sufficiently original to merit copyright protection in the first place. In its updated opinion, the court decided that the headnotes and Key Number System were in fact original, citing the landmark Feist Publications case, oft cited for the proposition that copyright’s “originality threshold is ‘extremely low,’ requiring only ‘some minimal degree of creativity …. some creative spark.’” Feist Publications, Inc. v. Rural Tel. Serv. Co., 499 U.S. 340 (1991). Likening the process of crafting headnotes to that of a sculptor chiseling through marble, Judge Bibas wrote that “a sculptor creates a sculpture by choosing what to cut away and what to leave in place…. So too, even a headnote taken verbatim from an opinion is a carefully chosen fraction of the whole.” Appreciating the limits of this reasoning, however, the court declined to grant summary judgment as to headnotes that were verbatim copies of the underlying opinion.
Fair Use
The most noteworthy aspect of the court’s ruling pertained to fair use, which has been at the forefront of many copyright cases involving generative AI. Reversing its earlier position that issues of fact precluded summary judgment on fair use for either party, the court held that ROSS’s activities were not fair use as a matter of law.
When examining the first fair use factor, the purpose and character of the use, the court found the factor favored Thomson Reuters. The court relied on the Supreme Court’s recent landmark decision in Andy Warhol Foundation v. Goldsmith, focusing on the undisputedly commercial nature of ROSS’s use, as well as the fact that the use was not transformative. In the court’s eyes, the use did “not have a ‘further purpose or different character’ from Thomson Reuters’s. … Ross was using Thomson Reuters’s headnotes as AI data to create a legal research tool to compete with Westlaw.”
The court next found that the second fair use factor, the nature of the work, favored ROSS because the Westlaw headnotes were “not that creative” when compared to “that of a novelist or artist drafting a work from scratch.” However, the court added that “this factor ‘has rarely played a significant role in the determination of a fair use dispute.’” As the court pointed out, the more creative a work is, the more protection it receives.
The court then found that the third fair use factor, the amount and substantiality of the work taken, favored ROSS because ROSS did not make the copied material available to the public and instead only used it for internal training purposes. As the court pointed out, what matters is not the amount and substantiality used in making a copy but what is made accessible to the public that may serve as a competing substitute. Thus, because the Westlaw headnotes were never accessible by users of ROSS’s system, this factor weighed in favor of ROSS.
Finally, the court found that the last (and most important) factor, the effect on the value or potential market for the copyrighted work, favored Thomson Reuters because ROSS’s product was designed to compete with Westlaw as a market substitute (i.e., as a legal research platform) and could impact Thomson Reuters’ ability to market the Westlaw headnotes as AI training data to other parties.
Though two factors favored Thomson Reuters (the first and the fourth) and two favored ROSS (the second and third), the court ultimately found that the balance of the factors weighed in Thomson Reuters’ favor in this case because “factor two matters less than the others, and factor four matters more.”
Key Takeaways for Future Cases
While it may be tempting to read the tea leaves of the Thomson Reuters decision, it is important to emphasize that the case does not involve generative AI and that a fair use analysis in a case involving generative AI is likely to play out very differently than it did here.
That said, the decision is an interesting first signal as to how courts are likely to analyze the facts in each case when thinking about AI in the fair use context.
- Defendants could have more success on the first fair use factor, purpose and character, in the context of a generative AI model. Generative AI defendants have argued that their models have very different characters and purposes than the data on which they’re trained. Generative AI developers are likely to emphasize the transformative nature of their technology and the necessity of the intermediate copying in order for the developers to innovate.
- The second fair use factor, the nature of the copyrighted work, could weigh more in plaintiffs’ favor in generative AI cases where the works at issue are novels, paintings and other creative works that fall more squarely within the ambit of copyright than Thomson Reuters’ headnotes did.
- The third fair use factor, the amount taken, depends on the extent to which the material taken is shown or otherwise made available to the public, and this factor will likely come out differently in other AI cases depending on how the copyrighted works are shared or disseminated. Some generative AI applications have attempted to prevent the exact regurgitation of copyrighted training data, but in some cases plaintiffs have put forward evidence of verbatim or near-verbatim copies of their works in generative AI outputs.
- The fourth factor, focusing on the market effect, is likely to be a major battleground in pending generative AI cases and also may play out differently in the generative AI context depending on the extent to which the parties are direct competitors. Plaintiffs will likely stress generative AI’s ability to undercut their services in the market and, in some cases, wholly supplant them. Defendants, on the other hand, will argue that a generative AI chatbot is not a substitute for a novel, song or film. That said, the Thomson Reuters court’s recognition of the plaintiff’s ability to license their creative work as part of the relevant market is likely to be highly relevant for those plaintiffs whose intellectual property assets could serve as high-quality AI training data.
It is also worth noting that Judge Bibas’ order dispensed with not only ROSS’s fair use argument but also ROSS’s other copyright defenses (innocent infringement, copyright misuse, merger and the scènes à faire doctrine) with little effort. “None of ROSS’s possible defenses holds water,” the court wrote. “I reject them all.” While these defenses have received less attention in the generative AI context than fair use, the court’s succinct ruling on these defenses is notable in that it cuts off several avenues that defendants in this and other cases have attempted to use to escape liability.
Though this case does not involve generative AI, we expect to see plaintiffs in ongoing litigation involving generative AI heralding the ruling as a step in the right direction, citing it as persuasive authority in other cases and urging other courts to follow suit. If courts begin to reach the same conclusion on the viability of the fair use defense in generative AI copyright cases, defendants may be more inclined to pursue settlements going forward.
We will continue to follow how these legal and factual questions are resolved as litigation progresses. To stay up to date, subscribe to the Debevoise Data Blog here.
This publication is for general information purposes only. It is not intended to provide, nor is it to be used as, a substitute for legal advice. In some jurisdictions it may be considered attorney advertising.