“The kind of books I fear losing are the kinds that make us think and understand each other.” So says Mary Rasenberger, the CEO of the Authors Guild and the Authors Guild Foundation. As artificial intelligence becomes more prevalent, new problems arise even as others are solved. AI applications have entered a wide variety of industries, including healthcare, finance, retail, and manufacturing. Now the technology has extended its reach into book publishing.
Part of this began with “scraping,” in which tech companies use data from published books found on the internet to train AI—without consent from, or compensation for, the authors. Published works by authors including John Grisham, best known for legal thrillers such as The Firm; Michael Connelly, author of The Lincoln Lawyer; and George R.R. Martin, who wrote the series behind Game of Thrones, were downloaded from a pirate ebook website and used to train machine learning models. In September 2023, a total of 17 authors, including those mentioned above, joined the Authors Guild in filing a lawsuit against OpenAI and Microsoft, alleging copyright infringement. The authors’ copyrighted works had been used to train AI models on the patterns found in those works, thereby teaching the models how to generate new content—all built on unpermitted use and a lack of acknowledgment. The tech companies compared training their AI on these books to a person reading in order to improve their own writing, and claimed that ChatGPT was the equivalent of “a teacher who has learned from lots of prior study.”
However, this is only one aspect of AI’s expansion in the industry. Self-publishing has grown because of its easy access, with an estimated 2.5 million books self-published in the U.S. in 2023, compared with the 500,000 to 1 million books from traditional publishing. Led primarily by Amazon’s Kindle Direct Publishing platform, self-publishing—in which the author acts as writer, editor, publisher, marketer, and publicist—has made it easier for authors to put out their work. But it has also made it easier for scams and knock-offs to infiltrate the market. This issue has persisted since 2019, despite Amazon’s stated commitment to taking “proactive steps to drive counterfeits” within its marketplace “to zero.” In an attempt to curb the flood of AI-generated material, Amazon set a limit of three published books per day per person—a limit that implicitly treats three books a day as a normal publishing rate, even though most authors take years to complete a single book. Jane Friedman, an author and founder of the publishing industry newsletter The Hot Sheet, has written about her experience of finding AI-generated books published under her name with content closely mimicking her own work. When she first reported the issue to Amazon, the company said nothing could be done because she had not trademarked her name. The titles were eventually taken down only after her story spread through national publications.
As AI-generated material proliferates, the idea that AI can be used to improve writing has become more accepted. Sudowrite, created by James Yu and Amit Gupta—both writers who had been experimenting with GPT-3—was inspired by the notion that AI could give writers feedback to help them improve. For paid users, Sudowrite assists with brainstorming, drafting, and revising written material, mainly novels. The most requested feature, which the creators have begun to trial, is a model tuned precisely to the unique voice of the user, so that writers can receive feedback aligned with their individual style. Yu believes that AI assistance for writers will eventually become the equivalent of Photoshop for visual artists. Yet this complicates a legal question that hasn’t quite been settled: copyright. While AI-generated work cannot be copyrighted, AI-assisted work can be, within limits. The distinction between AI-generated and AI-assisted is left to each company’s judgment, because no clear, universal definition exists. Though Amazon has begun to label AI-generated works, it does not require disclosure of whether a book is AI-assisted, a term the company defines as using AI after the initial draft. Traditional publishers do not disclose the use of AI in their material at all, assisted or generated, making the two impossible to distinguish as AI’s hold on the industry expands.
Meanwhile, traditional publishers have signaled the industry’s merging with AI mostly through loosely defined plans. Nihar Malaviya, CEO of Penguin Random House, hopes that the technology will make publishing more efficient without the need to hire more workers. HarperCollins announced a partnership with an AI studio to create audiobook translations. Simon & Schuster plans to use AI for marketing and AI-generated audio narration while expressing open-mindedness toward other uses. At the same time, publishers have taken recent steps to mitigate the copyright issues that AI has created. In October 2024, Penguin Random House explicitly refused to allow its authors’ works to be used for AI training, a stance formalized in new standard copyright-page language: “No part of this book may be used or reproduced in any manner to train artificial intelligence technologies or systems.” Soon after, in November, HarperCollins entered an AI licensing deal allowing tech companies to use its materials to train AI models—with the authors’ agreement and appropriate compensation.
Above all, the concerns that persist are about the kinds of books that will make it into the world—books meant to express creativity and define an industry. Where should we enforce the divide between AI-generated and AI-assisted? Should AI assistance become a standard within the industry? Are we at risk of losing what we value in literature? How far is too far?