Legal Aspects

Article 4 of Directive (EU) 2019/790 on copyright in the Digital Single Market (DSM Directive, also referred to here as the EU Copyright Directive) requires EU Member States to provide an "exception [...] of this Directive for reproductions and extractions of lawfully accessible works and other subject matter for the purposes of text and data mining" (TDM). Article 2 (2) of the Directive defines TDM as "any automated analytical technique aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations".

The DSM Directive distinguishes between TDM for scientific purposes and TDM for other purposes. Article 3 covers TDM for scientific research, while Article 4 regulates all other purposes. Although Article 4 does not name any specific non-scientific purpose, training AI models is generally assumed to involve acts of TDM as an essential element. Legal scholars have debated whether the exception applies to AI training, but since the European AI Act explicitly refers to TDM in the context of AI training, most consider the question settled.

Article 4 (3) further specifies that "the exception or limitation provided for in paragraph 1 shall apply on condition that the use of works and other subject matter referred to in that paragraph has not been expressly reserved by their rightholders in an appropriate manner, such as machine-readable means in the case of content made publicly available online."

The AI Act and the obligation to respect TDM opt-out declarations raise complex questions about their territorial applicability to cross-border uses. If, for instance, an AI provider in the US trains an AI model using servers and content located in the US, it is questionable whether this provider must respect TDM opt-out declarations. The AI Act assumes that the provider must respect the declaration at least when marketing these models on the EU market, in order “to ensure a level playing field among providers of general-purpose AI models where no provider should be able to gain a competitive advantage in the Union market by applying lower copyright standards than those provided in the Union” (Recital 106 of the European AI Act). Although this assumption appears only in a non-binding recital, it expresses the general objective of the Act, and it is confirmed by obligations that bind providers irrespective of their seat, such as those contained in Art. 53 of the AI Act.

Traditionally, copyright laws are territorial: reproductions made during AI model training are governed by the copyright laws of the country where the training occurs. In the example above, this would primarily point to US law. The EU Copyright Directive, including its opt-out provision for TDM, is EU legislation. As such, it primarily applies to EU Member States and to acts of using copyrighted works on EU territory.

However, the scope of the AI Act, which refers to the copyright provisions of the EU Copyright Directive, goes beyond that of the Directive. The AI Act imposes a very specific compliance duty on any provider placing general-purpose AI models on the EU market, regardless of the provider's location: to comply with opt-out declarations, including “through state-of-the-art technologies” (Art. 53 (1) lit. c) AI Act). This means that even if the AI training occurs outside the EU, the model must comply with the AI Act when the model, or an application based on it, is offered on the EU market. The Act's goal is to ensure fair competition and protect individual rights within the territorial scope of EU law. Therefore, an AI model that has not been developed in accordance with the AI Act cannot be offered in the EU, regardless of whether the content used for training was created by a rightsholder based in the EU or in the US.

Alternatively, Peukert and other legal scholars suggest that the application of the AI Act could depend on where another act of use takes place, namely on whether the AI provider has crawled data from websites hosted in the EU. This approach fits the specific subject matter of the AI Act's requirements, such as automated crawling and the obligation to comply with machine-readable rights reservations. On this view, if content is hosted in the EU, AI providers that crawl this content during or as part of AI training should comply with the applicable EU rules, even if the subsequent training takes place elsewhere.

In summary, while traditional copyright principles may in effect limit territorial applicability, the AI Act seeks to ensure compliance with EU standards for AI models offered within the EU, potentially extending its reach beyond traditional territorial boundaries.

Applicability to Rightsholders based outside the European Union

Rightsholders from the US or from other countries outside the EU may benefit from the EU Copyright Directive if their content is used within the EU.

If the content of a US rightsholder is lawfully accessible within the EU and is affected by uses on EU territory, and if AI companies market their products in the EU and EU-based users use them, these AI companies must comply with the EU Copyright Directive.

Therefore, US and other third country rightsholders may want to consider the instruments provided for their protection by the EU Copyright Directive if they want to control the use of their content within the EU. They can implement the opt-out provision to prevent their content from being used for TDM by AI providers marketing their products in the EU.

Implementation Considerations for Machine-Readable Opt-Out

US and other third country rightsholders can use machine-readable means, such as the International Standard Content Code (ISCC), to identify their works and to bind opt-out reservations to them. Such reservations express, in a form that automated systems can process, that the works should not be used for TDM. This would help ensure that EU-based entities can respect their opt-out.
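
As a rough illustration of what binding a reservation to an identifier could look like, the sketch below builds a small JSON record that pairs a content identifier with a TDM reservation flag. The record layout, the field names (iscc, tdm_reserved, rightsholder), the helper functions, and the identifier value are all assumptions made for illustration; they are not taken from the ISCC specification or from any official opt-out schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class OptOutDeclaration:
    """Hypothetical record pairing a content identifier with a TDM reservation."""
    iscc: str            # content identifier (placeholder below, not a real ISCC)
    tdm_reserved: bool   # True means the rightsholder reserves TDM rights
    rightsholder: str    # free-text name of the declaring party


def to_machine_readable(declaration: OptOutDeclaration) -> str:
    """Serialize the declaration to JSON so automated tools can parse it."""
    return json.dumps(asdict(declaration), sort_keys=True)


def is_tdm_allowed(serialized: str) -> bool:
    """Read a serialized declaration and report whether TDM is permitted."""
    record = json.loads(serialized)
    return not record["tdm_reserved"]


# Placeholder values only; the identifier is deliberately not a valid ISCC.
declaration = OptOutDeclaration(
    iscc="ISCC:PLACEHOLDER-NOT-A-REAL-CODE",
    tdm_reserved=True,
    rightsholder="Example Publisher",
)
payload = to_machine_readable(declaration)
print(payload)
print(is_tdm_allowed(payload))
```

A real deployment would need an agreed schema and a place where crawlers can look the record up; the sketch only shows that a reservation tied to an identifier can be written and read without human interpretation.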

Practical Steps – US and other third country rightsholders should:

  1. Assess if their content is accessible within the EU.

  2. Implement a machine-readable opt-out if they wish to prevent TDM of content that is accessible and marketed within the EU.

  3. Monitor compliance and take necessary actions if their opt-out is not respected.
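
Step 3 could, under some assumptions, be partly automated. The sketch below assumes that a rightsholder keeps a set of identifiers for opted-out works and has obtained a list of identifiers from some dataset manifest. Both inputs, the function name find_unrespected_opt_outs, and all identifier values are hypothetical; whether such a manifest is available at all depends on the AI provider.

```python
from __future__ import annotations


def find_unrespected_opt_outs(opted_out: set[str], manifest: list[str]) -> list[str]:
    """Return opted-out identifiers that also appear in the manifest, sorted."""
    return sorted(opted_out.intersection(manifest))


# Hypothetical identifiers; none of these are real ISCC codes.
my_opted_out_works = {"ISCC:PLACEHOLDER-A", "ISCC:PLACEHOLDER-B"}
dataset_manifest = ["ISCC:PLACEHOLDER-B", "ISCC:PLACEHOLDER-C"]

hits = find_unrespected_opt_outs(my_opted_out_works, dataset_manifest)
print(hits)  # ['ISCC:PLACEHOLDER-B']
```

A match here would only be a starting point for further review, not proof of an infringement.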

In summary, the EU Copyright Directive's opt-out provision can be applicable if an AI model or the AI application is marketed within the European Union. Implementing a machine-readable opt-out, using the ISCC as an identifier that can help to point to rights and opt-out reservations, can provide an effective and enforceable instrument to US and other third country rightsholders to manage their content's use.

Sources

Peukert, Copyright in the Artificial Intelligence Act – A Primer, GRUR Int. 2024, p. 497.

Rendas/Hartmann, From Brussels to Brasília: How the EU AI Act Could Inspire Brazil’s Generative AI Copyright Policy, GRUR Int. 2024, p. 495.

hiQ Labs, Inc. v. LinkedIn Corp., 31 F.4th 1180, 1187 n. 3 (9th Cir. 2022), as referred to by Peukert, GRUR Int. 2024, p. 497.
