GEMA vs OpenAI: the ruling that may reshape how AI uses copyrighted works

24 Nov

The relationship between artificial intelligence and copyright law has become a central topic of global debate. Yet until recently, European courts had never taken a clear position on one fundamental question: can AI platforms freely use copyrighted works to train their models?

With the recent decision of the Munich Regional Court in GEMA vs OpenAI, that question has finally been answered — and not in the way the big tech companies hoped.

The lawsuit began after GEMA, Germany’s main music rights management society, accused OpenAI of using thousands of copyright-protected song lyrics to train its models without authorization. The issue does not concern the outputs generated by the AI, but rather the most fundamental stage of the process: the training phase. Like many other companies in the sector, OpenAI has long argued that this phase is purely technical and therefore falls outside the traditional scope of copyright law. For years this position was largely tolerated, because the technology was new, complex, and difficult to regulate.

The German court has now overturned this paradigm in a decisive way. The ruling states that training a language model is, in fact, an act of reproduction of the work, and therefore falls squarely within the exclusive rights of the author. The fact that the operation is carried out by an algorithm, automatically and on a massive scale, does not change the legal nature of the activity. If a text is copied, stored, analyzed or processed within a dataset, such use requires authorization.

The court went further, clarifying that the European exceptions for text and data mining do not apply here. The EU Directive on Copyright in the Digital Single Market (DSM Directive) permits text and data mining for scientific research and, more generally, for other purposes including commercial ones, but in the latter case only if the rightholder has not expressly reserved their rights. Many of the lyrics represented by GEMA did carry such a reservation. According to the court, OpenAI should have taken note of it and obtained the necessary licenses.

This decision is likely to mark a turning point. For years, AI models have been trained using enormous quantities of data taken from the internet, often without distinguishing between free content and protected content. This practice was largely justified by the technical complexity of machine learning systems and by the absence of clear regulation. Today, however, that gap is beginning to close, and the Munich ruling leaves little room for flexible interpretations: using copyrighted works requires a license, just as any other reproduction would.

The potential consequences for the industry are significant. Large companies may now be required to negotiate massive licensing agreements with national and international collecting societies, opening the door to an entirely new market of “AI royalties.” At the same time, far greater transparency will likely be required regarding the datasets used for training — an area that has traditionally remained opaque. In the future, platforms may be obliged to disclose which works were used, under which licenses, and with what limitations.

For authors and publishers, this is clearly an important victory. For the first time, a court has recognized that the value of creative works does not end with their public consumption but persists even in the context of AI training. This may create opportunities for new forms of remuneration and for stronger control over the use of creative content in the digital world.

The GEMA vs OpenAI case also arrives at a time when various European courts are examining similar issues and when the Court of Justice of the European Union is expected to address related questions soon. It is increasingly evident that we are entering a new phase of copyright law, one in which the very concept of “reproduction” will need to be reconsidered through the lens of machine learning technologies.

One thing, however, is already certain: the era in which AI systems could train freely on anything available online is over. The Munich court’s decision brings copyright law into the heart of AI development. From now on, those who wish to build increasingly powerful models will have to engage with authors, publishers and collecting societies. It is no longer just a technical matter — it is a legal, economic, and cultural issue that will define how human creativity interacts with artificial creativity.


Gianpaolo Todisco
