GEMA vs OpenAI: the ruling that may reshape how AI uses copyrighted works

Gianpaolo Todisco - Partner

The relationship between artificial intelligence and copyright law has become a central topic of global debate. Yet until recently, European courts had never taken a clear position on one fundamental question: can AI platforms freely use copyrighted works to train their models?

With the recent decision of the Munich Regional Court in GEMA vs OpenAI, that question has finally been answered — and not in the way the big tech companies hoped.

The lawsuit began after GEMA, Germany’s main music rights management society, accused OpenAI of using thousands of song lyrics protected by copyright to train its models without securing any authorization. The issue does not concern the outputs generated by the AI, but rather the most fundamental stage of the process: the training phase. Like many other companies in the sector, OpenAI has long argued that this phase is purely technical and therefore falls outside the traditional scope of copyright law. For years this position benefited from a certain degree of tolerance, largely because the technology was new, complex, and difficult to regulate.

The German court has now overturned this paradigm in a decisive way. The ruling states that training a language model is, in fact, an act of reproduction of the work, and therefore falls squarely within the exclusive rights of the author. The fact that the operation is carried out by an algorithm, automatically and on a massive scale, does not change the legal nature of the activity. If a text is copied, stored, analyzed or processed within a dataset, such use requires authorization.

The court went further, clarifying that the European exceptions on text and data mining do not apply here. The EU DSM Directive allows certain data-analysis activities, particularly for scientific research and, in some cases, for commercial purposes — but only if the author has not expressly reserved their rights. Many of the lyrics represented by GEMA did contain such a reservation. According to the court, OpenAI should have noticed this and obtained the necessary licenses.

This decision is likely to mark a turning point. For years, AI models have been trained using enormous quantities of data taken from the internet, often without distinguishing between free content and protected content. This practice was largely justified by the technical complexity of machine learning systems and by the absence of clear regulation. Today, however, that gap is beginning to close, and the Munich ruling leaves little room for flexible interpretations: using copyrighted works requires a license, just as any other reproduction would.

The potential consequences for the industry are significant. Large companies may now be required to negotiate massive licensing agreements with national and international collecting societies, opening the door to an entirely new market of “AI royalties.” At the same time, far greater transparency will likely be required regarding the datasets used for training — an area that has traditionally remained opaque. In the future, platforms may be obliged to disclose which works were used, under which licenses, and with what limitations.

For authors and publishers, this is clearly an important victory. For the first time, a court has recognized that the value of creative works does not end with their public consumption but persists even in the context of AI training. This may create opportunities for new forms of remuneration and for stronger control over the use of creative content in the digital world.

The GEMA vs OpenAI case also arrives at a time when various European courts are examining similar issues and when the Court of Justice of the European Union is expected to address related questions soon. It is increasingly evident that we are entering a new phase of copyright law, one in which the very concept of “reproduction” will need to be reconsidered through the lens of machine learning technologies.

One thing, however, is already certain: the era in which AI systems could train freely on anything available online is over. The Munich court’s decision brings copyright law into the heart of AI development. From now on, those who wish to build increasingly powerful models will have to engage with authors, publishers and collecting societies. It is no longer just a technical matter — it is a legal, economic, and cultural issue that will define how human creativity interacts with artificial creativity.

AI Act: New scenarios in the regulation of artificial intelligence

The AI Act, the European Regulation on Artificial Intelligence, was approved by the European Parliament on June 14 and will now be submitted for consideration by the EU member states in the Council, with the aim of becoming law by the end of 2023. The proposed AI Act takes a risk-based approach and provides for penalties of up to €30,000,000 or up to 6 percent of the previous year's total annual worldwide turnover in the event of infringement.

The proposed EU Regulation on Artificial Intelligence aims to create a reliable legal framework for AI, based on the EU’s fundamental values and rights, with the goal of ensuring the safe use of AI and preventing risks and negative consequences for people and society.

The proposal establishes harmonized rules for the development, marketing, and use of AI systems in the EU through a risk-based approach, with different compliance obligations depending on the level of risk that software and applications may pose to people's fundamental rights: the higher the risk, the greater the compliance requirements and responsibilities of developers.

In particular, the AI Act proposes a fundamental distinction between:

-          "Prohibited Artificial Intelligence Practices", which create an unacceptable risk, for example by violating EU fundamental rights. This category includes systems that:

o   Use subliminal techniques that act without a person's knowledge, or that exploit physical or mental vulnerabilities, in a way liable to cause physical or psychological harm;

o   Are used by public authorities for purposes such as social scoring, real-time remote biometric identification in public spaces, predictive policing based on indiscriminate data collection, and facial recognition, unless there is a specific need or judicial authorization.

-          "High-Risk AI Systems", which pose a high risk to the health, safety or fundamental rights of individuals, such as systems that enable biometric identification and categorization of individuals, determine access to educational and vocational training institutions, score admission tests, carry out personnel selection, or are used in political elections. The placing on the market and use of such systems is therefore not prohibited, but it requires compliance with specific requirements and the performance of prior conformity assessments.

In particular, these systems must comply with a number of specific rules, including:

-          Establishment and maintenance of a risk management system: an active risk management system must be established and kept up to date for each AI system.

-          Quality criteria for data and models: AI systems must be developed according to specific qualitative criteria for the data used and the models implemented to ensure the reliability and accuracy of the results produced.

-          Documentation of development and operation: adequate documentation of the development of a given AI system and of its operation is required, including evidence of the system's compliance with applicable regulations.

-          Transparency to users: users must be provided with clear and understandable information on how AI systems work, making them aware of how their data are used and how results are generated.

-          Human oversight: AI systems must be designed so that they can be supervised by human beings.

-          Accuracy, robustness and cybersecurity: it is imperative to ensure that AI systems are reliable, accurate and secure. This includes taking steps to prevent errors or malfunctions that could cause harm or undesirable outcomes.

In some cases, conformity assessment can be carried out independently by the manufacturer of AI systems, while in other cases it may be necessary to involve an external conformity assessment body.

-          "Limited Risk AI Systems", which do not pose significant risks and for which there are general requirements for information and transparency towards the user. For example, systems that interact with humans (e.g., virtual assistants), that are used to detect emotions, or that generate or manipulate content (e.g., ChatGPT) must adequately disclose the use of automated systems, including for the purpose of enabling users to make informed choices or to opt out of certain solutions.

The Regulation is structured in a flexible way so that it can be applied or adapted to different cases that may arise as a result of technological developments. The Regulation also takes into account and ensures the application of complementary rules, such as those on data protection, consumer protection and the Internet of Things (IoT).

As mentioned above, the text approved by the European Parliament will be submitted to the Council for consideration, with the aim of being adopted by the end of 2023. If so, it will be the first legislation in the world to address in such a comprehensive and detailed manner the potential issues arising from placing AI systems on the market.

We will provide updates on future regulatory developments.

For details and information, please contact David Ottolenghi of Clovers.