Gathering enough data to build useful training datasets for generative artificial intelligence requires scraping most public websites. The scraping is conducted by pieces of code (scraping bots) that make copies of website pages. Today, website owners have only a few ways to effectively block these bots from scraping content. One method, prohibiting scraping in the website’s terms of service, is loosely enforced because it is not always clear when the terms are enforceable. This Essay aims to clear up the confusion by describing what scraping is, how entities do it, what makes website terms of service enforceable, and what damages website owners may claim as a result of being scraped. The novel argument of the Essay is that when (1) a website’s terms of service or terms of use prohibit scraping or using website content to train AI and (2) a bot scrapes pages on the website, including the pages containing those terms, the bot’s deployer has actual notice of the terms and those terms are therefore legally enforceable, meaning the website owner can claim a breach of contract. This Essay also details the legal and substantive arguments favoring this position while cautioning that nonprofits with a primarily scientific research focus should be exempt from such strict enforcement.
Author
Assistant Professor of Instruction, Business, Government and Society Department, McCombs School of Business, University of Texas, Austin. A special thanks to Saskia Reford, UT-Austin juris doctor candidate, for her invaluable contributions.
Copyright 2025 by David Atkinson
Cite as: David Atkinson, Putting GenAI on Notice: GenAI Exceptionalism and Contract Law, 120 Nw. U. L. Rev. Online 27 (2025), https://scholarlycommons.law.northwestern.edu/cgi/viewcontent.cgi?article=1357&context=nulr_online.