Database of 16,000 Artists Used to Train Midjourney AI, Including 6-Year-Old Child, Garners Criticism
For many, a new year includes resolutions to do better and build better habits. For Midjourney, the start of 2024 meant having to deal with a circulating list of artists whose work the company used to train its generative artificial intelligence program.
During the New Year’s weekend, artists linked to a Google Sheet on the social media platforms X (formerly known as Twitter) and Bluesky, alleging that it showed how Midjourney developed a database of time periods, styles, genres, movements, mediums, techniques, and thousands of artists to train its AI text-to-image generator. Jon Lam, a senior storyboard artist at Riot Games, also posted several screenshots of Midjourney software developers discussing the creation of a database of artists for its AI image generator to emulate.
The 24-page list of artists’ names used by Midjourney as the training foundation for its AI image generator (Exhibit J) includes modern and contemporary blue-chip names, as well as commercially successful illustrators for companies like Hasbro and Nintendo. Notable artists include Cy Twombly, Andy Warhol, Anish Kapoor, Yayoi Kusama, Gerhard Richter, Frida Kahlo, Ellsworth Kelly, Damien Hirst, Amedeo Modigliani, Pablo Picasso, Paul Signac, Norman Rockwell, Paul Cézanne, Banksy, Walt Disney, and Vincent van Gogh.
Midjourney’s dataset also includes artists who contributed art to the popular trading card game Magic: The Gathering, including Hyan Tran, a six-year-old child and one-time art contributor who participated in a fundraiser for the Seattle Children’s Hospital in 2021.
Phil Foglio encouraged other artists to search the list to see if their names were included and to seek legal representation if they did not already have a lawyer.
Access to the Google file was soon restricted, but a version has been uploaded to the Internet Archive.
The list of 16,000 artists was included in an amendment to a class-action complaint against Stability AI, Midjourney, and DeviantArt, filed on November 29 last year along with 455 pages of supplementary evidence.
The amendment was filed after a judge in California federal court dismissed several claims brought by a group of artists against Midjourney and DeviantArt on October 30.
The class-action copyright lawsuit was first filed almost a year ago in the United States District Court for the Northern District of California.
Last September, the US Copyright Review Board decided that an image generated using Midjourney’s software could not be copyrighted due to how it was produced. Jason M. Allen’s image had garnered the $750 top prize in the digital category for art at the Colorado State Fair in 2022. The win went viral online but prompted intense worry and anxiety among artists about the future of their careers.
Concern about artworks being scraped without permission and used to train AI image generators also prompted researchers from the University of Chicago to create a digital tool for artists to help “poison” massive image sets and destabilize text-to-image outputs.
At publication time, Midjourney had not responded to requests for comment from ARTnews.