Does a Chinese AI copyright case change artists' futures?
In December 2023, another Chinese lower court followed suit in a copyright infringement decision that could become a significant guideline for a slew of upcoming trials in other countries, whose verdicts may rattle the photographic, graphic arts, marketing, and visual arts markets.
While decisions in similar cases in the United States have refused to grant copyright eligibility to users of AI image generators like Stable Diffusion, Midjourney, and Dall-E, the favorable decisions for AI “artists” in China and India are noteworthy and may very well have set much of the language and criteria for deciding future cases on a case-by-case basis.
Even though the penalty in the Chinese Internet Court for Copyright Infringement was equivalent to less than USD 80, including court costs (a far cry from the penalties on the table in other copyright cases currently going through the legal system), Judge Zhu Ge ruled that the plaintiff AI user, Mr Li, did indeed make sufficient intellectual investment and enough aesthetic choices to warrant copyright protection. The ruling rested on his significant input into the image-generating prompt: the words chosen, the order of the words, the selection and rejection of images during the multi-step development of the image, the infusion of personal expression, and the setting of appropriate parameters to guide the engine to his desired image, thereby meeting the requirement of intellectual achievement.
Here is an approximate translation of the prompt "created" by Mr Li:
“ultra-photorealistic: 1.3, extremely high-quality high-detail RAW color photo, in locations, Japan idol, highly detailed symmetrical attractive face, angular symmetrical face, perfect skin, skin pores, dreamy black eyes, reddish-brown straight hair, uniform, long legs, thigh-high socks, soft focus, (film grain, vivid colors, Film emulation, Kodak gold portra 100, 35mm, canon50 f1,2), Lens Flare, Golden Hour, HD, Cinematic, Beautiful Dynamic Lighting”
Despite what the Chinese judge concluded, these words placed alone in a prompt will produce wildly different results for someone else; such is the random nature of the engine. There is plenty in this prompt that is repeatable, and none of it is reserved solely for the user, Mr Li. The “intellectual” input and “aesthetic” choices seem dubious at this stage. However, Mr Li also used further specifics to arrive at the final desired images.
Two add-on modules designed specifically for the generation of young Asian females were utilized:
These alone add a significant number of automatic variables and pre-sets to the images during generation. It’s also important to note that these models can be trained on a database made up entirely of dark-haired, fair-skinned, dreamy-looking pics of young Asian women similar to these images. My point is that it’s entirely possible that the prompt above plus the add-on modules alone could produce something relatively similar.
It is unknown whether Mr Li fed the image generator copies of actual images of similar-looking young women alongside the prompt. Mixing someone else’s actual photographs or photos of their artwork with a user prompt would face a much stricter test of copyright eligibility, but no such inputs are mentioned.
Mr Li continued his process even further. A lengthy negative prompt of around 120 words was also incorporated. Negative prompts steer the image generator away from certain visual attributes, including the mistakes, known as hallucinations, that it is prone to make if granted carte blanche without guidance.
The negative prompts translated from the Chinese original were the following:
3d, render, cg, painting, drawing, cartoon, anime, comic:1,2, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, signature, watermark, username, blurry, artist name, (long body), bad anatomy, liquid body, malformed, mutated, bad proportions, uncoordinated body, unnatural body, disfigured, ugly, gross proportions, mutation, disfigured, deformed, (mutation), (child:1,2), b&w, fat, extra nipples, minimalistic, nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, disfigured, kitsch, ugly, oversaturated, grain, low-res, Deformed, disfigured, poorly drawn face, mutation, mutated, extra limb, ugly, poorly drawn hands, missing limb, floating limbs, Disconnected limbs, malformed hands, blur, out of focus, long neck, long body, ugly, disgusting, poorly drawn, childish, mutilated, mangled, old, surreal, text, b&w, monochrome, conjoined twins, multiple heads, extra legs, extra arms, meme, elongated, twisted, fingers, strabismus, heterochromia, closed eyes, blurred, watermark, wedding, group, dark skin, dark-skinned female, tattoos, nude, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry.
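How a negative prompt steers generation can be sketched in a few lines. In Stable Diffusion-style tools, each denoising step commonly combines two noise predictions using what is called classifier-free guidance, and the negative prompt's prediction takes the place of the usual empty-prompt one. The function name `guided_prediction` and the toy three-number vectors below are assumptions for illustration only; real predictions are large tensors produced by the model, and none of this is code from the case.

```python
def guided_prediction(positive, negative, cfg_scale):
    """Combine two noise predictions with classifier-free guidance.

    positive:  prediction conditioned on the prompt
    negative:  prediction conditioned on the negative prompt
    cfg_scale: how hard to push toward the prompt and away from the negative
    """
    return [n + cfg_scale * (p - n) for p, n in zip(positive, negative)]


# Toy values (hypothetical); real predictions are tensors from the model.
positive = [0.5, 0.25, 0.0]
negative = [0.25, 0.25, 1.0]

# With a scale of 1, the result is simply the positive prediction.
print(guided_prediction(positive, negative, 1.0))

# With a higher scale, the result moves further away from the negative one.
print(guided_prediction(positive, negative, 9.0))
```

In this toy example, wherever the two predictions agree the scale changes nothing, and wherever they differ a larger scale widens the gap. That is the sense in which a long negative prompt pushes the output away from the listed attributes.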
Mr Li’s recorded settings and subsequent adjustments included the following:
- Set the Height as 768 (pixels, I assume)
- Set the CFG Scale as 9
- Set the Seed as 2692150200 (a seed fixes the random starting noise, so reusing it with the same prompt and settings can reproduce a previous generation)
- Set the weight for model “land-hanfugirl-v1-5.safetensors” in “Additional-Networks”
- Modify the Seed as 2692150199
- Add several keywords in the Prompt: “shy, elegant, cute, lust, cool pose, teen, viewing at the camera, masterpiece, best quality”
In the final bullet point, the author of the prompt used both “lust” and “teen”. Had he used either Midjourney or Dall-E for generation, it’s quite likely that from this point forward he would have tripped the NSFW filters as well as the child porn filters, but for some reason, the version of Stable Diffusion used did not flag them.
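The role of the seed in the settings above can be illustrated with a minimal sketch. It uses Python's standard `random` module as a stand-in for the image generator's noise source, which is an assumption: the real tool draws its starting noise differently, and `starting_noise` is a hypothetical helper. The principle is the same, though. The same seed gives the same starting noise, while a seed that differs by one gives unrelated noise.

```python
import random


def starting_noise(seed, size=4):
    """Return a short list of pseudo-random values for a given seed.

    Stand-in (assumption) for the noise image a diffusion model starts from.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


first = starting_noise(2692150200)     # the seed from the settings list
repeat = starting_noise(2692150200)    # same seed again
neighbor = starting_noise(2692150199)  # the modified seed, one lower

print(first == repeat)    # same seed, same noise
print(first == neighbor)  # adjacent seed, different noise
```

This is why a prompt on its own gives someone else a different picture: without the seed, the model files, and the remaining settings, the starting point of the generation is not the same.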
Before I continue, I must thank Frederick Xie for the following legal analysis of the court’s finding. His full explanation can be read at https://www.lexology.com/library/detail.aspx?g=d630296f-d3ca-4771-ad82-1af977f93f5e
According to Xie, the court was tasked with applying a four-step test in ruling on the copyright eligibility of the work. The court’s answers are in parentheses.
1. Whether the work is within the scope of literature, art and science (Yes)
2. Whether the work possesses originality (Yes)
3. Whether the work has a certain form of expression (Yes)
4. Whether the work is a result of intellectual achievement (Yes)
In #1, I’d argue that photographs or AI-generated images clearly are NOT literature, which is obvious. But I’d also question whether this type of snapshot portrait can be considered art. Perhaps in the broadest legal terms, I suppose so since it is a visual image, but as a general rule, I don’t consider most photo snapshots to be art. Would yearbook photos be considered “art” for legal purposes? As for science, the achievement there in my view would be credited to the AI generative tool creators, not the person using it. That’s my opinion, but the judge decided otherwise.
After the first condition is accepted, the question therefore centers on the level of intellectual achievement and whether it can be applied to the use of an AI image generator.
I’m copying Xie’s explanation of the intellectual achievement, originality, and expression below as he has summarized the decision expertly. I edited mildly for clarity.
“Intellectual Achievement”
The court determines that the image is not an existing image returned by a search engine, nor is it a combination of various elements preset by the software designer. In simple terms, the role or function of the model is similar to that of humans who have acquired certain abilities and skills through learning and accumulation. It can generate corresponding images based on human input in the form of text descriptions, thereby substituting the process of drawing lines and coloring by humans. These present human creativity and ideas in tangible ways.
The plaintiff entered prompt words describing the subject, details of the figure, environment, poses, and style. He then adjusted the parameters and added several prompt words based on the photos initially generated, and finally selected a photo that he was satisfied with. Viewing the process, the plaintiff made a certain amount of intellectual investment, such as designing the presentation of characters, selecting prompt words, arranging the order of prompt words, setting relevant parameters, selecting which image meets expectations, etc. The photo at issue reflected the plaintiff's intellectual investment, therefore it shall be considered a result of intellectual achievement.
“Originality”
The court determined that in general, the more diverse the requirements and the clearer and more specific the descriptions of image elements, layout, and composition, the more personalized the resulting expression. The plaintiff designed visual elements, such as characters and their presentation, through prompt words and set parameters for screen layout and composition, reflecting the plaintiff's selection and arrangement. The plaintiff obtained the first image by inputting a word prompt and setting relevant parameters. He then continued to add prompt words and modify parameters, continuously adjusting and correcting them, and finally obtained the photo at issue. This adjustment and correction process also reflects the plaintiff's aesthetic choices and personal decisions. The court therefore ruled that the photo at issue possesses originality.
The court also made a policy concern analysis, as Xie continues in the words of the judge:
Generative artificial intelligence technology has brought about changes in people's creative methods, which is similar to the impact of many technological advancements in history. The process of technological development is the gradual outsourcing of human work to machines. Before the emergence of cameras, people needed to use advanced painting skills to reproduce images, and the emergence of cameras made it easier to record objective images. Now, the photography function of smartphones is becoming more and more powerful, and they are becoming simpler to use. As long as the photos taken with smartphones reflect the photographer's originality and intellectual investment, they will be considered works of photography and therefore protected by copyright law. Therefore, the more advanced the technology and more intelligent the tools are, the less investment by humans that is needed.
(JP note: In essence, the judge is saying that the camera has never taken the pictures; rather, the photographer took the pictures, and the act of using new technology replaces old methods with easier processes. It is still image capturing controlled by the user even though the user is not actively using a camera to take pictures. Stated like this, it is more difficult to dispute, even though the major difference is that a camera can capture only images that it can actually see. All of this can be debated after the advent of in-camera filters that modify images at the point of capture in the camera or smartphone.
The question should revert to the copyrights of the works used to train the AI image generator to recognize the prompt words and build a database of images that resemble the desired effects. This would be a related but different lawsuit and was not dealt with at all in this case.)
In essence, the judge granted copyright protection for an image that was created by using an image generator that was trained to understand the visual values of words in a prompt by the use of hundreds of thousands of images that received ZERO copyright protection. And people wonder why so many artists and photographers are upset?
Xie continued:
The court made an analogy between using AI to generate the photo and entrusting an artist to paint a drawing, considering them similar. A big difference is that an artist has his own free will in selection and judgment and therefore can be considered a creator of the work while AI does not. The court therefore determined that using AI to generate a photo shall still be considered as a human using tools for creation, meaning that the human is the entity making the intellectual investment instead of AI.
(JP note: That point is significant. Other copyright cases have determined that if any protection existed it would be held by the image generator and its creators. In this case, the judge has said the opposite on the grounds that it is the user who has made the intellectual investment and made a series of artistic decisions, using a tool to realize his intent.)
Xie:
The court believed that encouraging more people to use the latest tools for creation would benefit the creation of work and the development of AI technology, and therefore, as long as the AI-generated photos possess original intellectual investment, they can be considered copyrightable work by the AI artist.
Two factors are important to take from this court decision. The first is that it was handed down by a lower court, and because China does not use a common law system like much of the Western world, it can be disregarded by other Chinese courts, even by different judges at similar levels. Higher-level courts have also not taken a stance on this particular issue, so consider this ruling “pending” and not a new precedent.
The second factor is that some have speculated that the judge’s decision may be tied in some way to China’s major push to become the world’s leader in artificial intelligence systems. It’s possible that Chinese courts are unwilling to put roadblocks in the way of that push and will allow AI users to claim copyright protection to clear a path for unhindered growth.