Overview
On May 9, 2025, the US Copyright Office (“USCO”) released a highly anticipated pre-publication version of the third and likely final installment of its Report on Copyright and Artificial Intelligence – Part 3: Generative AI Training (the “Report”).[1]
The very next day, President Trump dismissed the Register of Copyrights and Director of the USCO, Shira Perlmutter – two days after dismissing Carla Hayden, the Librarian of Congress.[2] Perlmutter has since sued President Trump and the acting Librarian of Congress, seeking an injunction blocking her removal.[3]
It is unclear whether Trump’s dismissals (1) were related to the content of the Report (which was critical of some of the arguments advanced by those who favor free use of copyrighted material to train AI models), (2) prompted the unusual pre-publication of the Report or (3) will affect the issuance or content of the final Report.
Regardless, the Report addresses an unsettled question underlying dozens of pending copyright AI lawsuits: does the use of copyrighted works without permission to develop and deploy generative AI (“GenAI”) models qualify as “fair use”?[4] This is the billion-dollar question facing GenAI companies whose business model is predicated on such use.
For private fund managers and others who are increasingly looking to harness the power of GenAI and navigate the attendant legal risk, the Report is particularly timely.
While the Report notes that any fair use analysis must be context-specific, it does offer some helpful general guidelines in assessing fair use for GenAI. According to the Report, transformative use and market effects will be the most significant fair use factors that judges will assess in ruling on GenAI companies’ use of copyrighted material. As detailed below, these factors tend to weigh for or against fair use depending on the circumstances – and one federal judge appears poised to rule broadly in favor of fair use for GenAI.
Ultimately, the Report does not recommend immediate government intervention on fair use or compulsory licensing issues related to GenAI. Instead, it advises allowing the nascent licensing market for GenAI training data to evolve organically. To address potential gaps in data offerings and market inefficiencies, the Report explores alternative mechanisms, including extended collective licensing schemes, which could provide broader and more efficient licensing solutions by aggregating rights on behalf of multiple copyright holders.[5]
Fair Use Factors
For private fund managers contemplating using GenAI in their investment process, the Report’s discussion of the fair use factors is worth examining.
In addressing the first fair use factor (the purpose and character of the use), the Report notes that determining whether an AI output is transformative is context-dependent. Some use cases are clearly permitted; others clearly are not.[6] Of note to private fund managers, using GenAI for noncommercial research or analysis where portions of the copyrighted works are not reproduced in the outputs is, according to the Report, likely to constitute fair use.[7] However, using unlawfully accessed material (via pirated works or by circumventing paywalls) to train a GenAI model that produces unrestricted competing content would not constitute fair use.[8]
Also of note for private fund managers, the Report highlights that when retrieval-augmented generation (“RAG”) searches[9] summarize the retrieved copyrighted works rather than providing hyperlinks to the original source, such outputs are less likely to be considered transformative and, therefore, may not qualify as fair use.[10]
In addressing the second factor (the nature of the copyrighted work), the Report notes that any analysis must be context-specific but a fair use finding is less likely if the material used to train the GenAI is “more expressive” or “previously unpublished.”[11]
In addressing the third factor (the amount and substantiality of the use), the Report concludes that under certain circumstances, use of an entire work may not in fact weigh against fair use.[12]
In addressing the fourth factor (market effects), which the Supreme Court has designated as “undoubtedly the single most important element of fair use,”[13] the Report identifies several potential harms, including lost sales, lost licensing opportunities, RAG-related substitution and market dilution. Here, the USCO wades into “uncharted territory,”[14] as no court has yet applied a market dilution theory where AI-generated content competes with human-created works.[15] However, the Report argues that while many GenAI applications promise great public benefits, the sheer, unprecedented volume of such applications could pose significant harm to the market for copyrighted works.[16] If courts apply this theory of market dilution, rightsholders may be able to block any use that might have a general effect on the market for copyrighted works, even if it doesn’t specifically impact the rightsholder. Further, the Report emphasizes that where licensing options already exist – or are reasonably likely to develop – the loss of licensing opportunities will disfavor fair use.[17]
Ultimately, the USCO concludes that fair use analysis of GenAI applications must remain on a case-by-case basis, but the first and fourth factors will carry “considerable weight.”[18]
Licensing
The Report outlines four general licensing options: voluntary direct licensing, voluntary collective licensing, extended collective licensing (“ECL”) and compulsory licensing.
- Voluntary direct licenses are negotiated on a case-by-case basis between individual rightsholders and AI developers.
- Voluntary collective licensing agreements typically involve collective management organizations (“CMOs”) that are authorized by multiple rightsholders to negotiate licenses and administer royalty collection and distribution on their behalf.[19]
- ECL builds on the voluntary collective agreement model to cover the works of all relevant rightsholders in a particular category – even those who haven’t actually joined the CMO – while providing an opt-out mechanism for non-participant rightsholders to negotiate separately.[20]
- Compulsory licensing is a statutory framework that permits use of copyrighted material without the rightsholder’s direct consent, subject to government oversight and often complex rate-setting procedures.[21]
Voluntary direct and collective licensing markets for GenAI have already emerged, with others in development.[22] Licensing at scale, however, raises several practical concerns including cost structure, impact on model quality and antitrust issues. Licensing large volumes of copyrighted works at market rates could be prohibitively expensive, particularly given the vast datasets required to train modern AI models. Moreover, if models can only be trained on licensed works, the resulting models may be “tainted by bias and inaccuracy.”[23] There are also antitrust concerns that big tech companies could crowd out smaller developers who might not be able to afford to negotiate broad data licenses.[24] The USCO argues these concerns shouldn’t factor into the fair use analysis, and defers to the Department of Justice for guidance (including a possible antitrust exemption) and the Federal Trade Commission for enforcement.[25]
The Report ultimately advocates for the growth of voluntary licensing regimes for copyrighted works, which can facilitate AI innovation while protecting rightsholders. To support this approach, the Report further argues that ECL could address market inefficiencies without the market risks from “premature” statutory approaches such as compulsory licensing, which may stifle innovation and distort market incentives.[26]
Litigation
The Report is already impacting pending copyright AI lawsuits. While the USCO defers to the courts to “weigh the statutory factors together” and calls it impossible to prejudge litigation outcomes,[27] federal courts can defer, and have deferred, to the legal interpretations of agencies such as the USCO, depending on their thoroughness, validity and persuasiveness.[28] As no definitive case law exists on the use of copyrighted material for training GenAI,[29] content owners have already jumped on the Report, citing it as supplemental authority in a detailed counter to the fair use defense in two pending cases.[30] Interestingly, in a May 22, 2025 hearing in a federal case against Anthropic PBC in California, Judge William Alsup said he was leaning “toward finding Anthropic PBC violated copyright law when it made initial copies of pirated books, but that its subsequent uses to train their GenAI models qualify as fair use.”[31] Alsup appeared sympathetic to Anthropic’s argument that its use is “transformative in the extreme” but also suggested he might make Anthropic pay for its initial use, noting: “I have a hard time seeing that you can commit what is ordinarily a crime, but get exonerated because you end up using it for a transformative use.”[32] Alsup could be the first judge in the nation to rule on fair use in the GenAI context. And if his reasoning on the fair use factors survives appeal and is adopted by other courts, it could augur well for developers of GenAI, even if the Report itself provides litigation ammunition for content owners.
Takeaways
Overall, the Report provides some instructive – if not legally binding – guidance for AI companies, copyright owners and private fund managers.
For AI companies and downstream users, the Report suggests that implementing effective guardrails to prevent infringing outputs will weigh in favor of fair use and recommends leveraging existing and emerging data licensing frameworks to train AI models. The Report also flags for AI companies that knowingly training AI models on pirated datasets would almost certainly exceed the boundaries of fair use.[33] In such cases, and possibly others, courts may be less inclined to accept arguments about transformative use or net societal benefits of GenAI – particularly when such use poses foreseeable market harm to content owners.[34]
For copyright owners, the Report encourages creators to pursue organized approaches to collective licensing via CMOs while recognizing market dilution as a potential harm from unrestrained AI training. The Report also notes that copyright owners ideally shouldn’t be required to opt out of the use of their material for training AI models.
For private fund managers, the Report offers some guidance that certain noncommercial research uses of GenAI may constitute fair use but that other uses (such as RAG searches that summarize retrieved works) may not, and therefore carry greater risk. Given the volatility at the Copyright Office and the rapid technological and legal developments in the AI space, private fund managers who use GenAI should continue to pay close attention to this area.
Authored by and Steven Appel.
If you have any questions concerning this Alert, please contact your attorney or one of the authors.