This work addresses a significant challenge in the development of artificial intelligence: the use of copyrighted data to train AI systems. The issue is often described as a gap in the exceptions and limitations of copyright law, where the lack of clear rules creates legal uncertainty and thereby affects AI development. The situation arises not only from the absence of formalities and the low threshold for originality, which requires only that a work show some minimal creativity, but also from a technological context that turns common activities into acts of creation. Access to advanced digital tools allows almost anyone to create content protected by copyright, generating a proliferation of less complex and original creative works. In the course of this work, we classify machine learning applications according to the nature of the training data used and identify the main categories. Currently, copyright law regulates the making of private copies and uses that compete in the market, but these constitute only a fraction of AI applications, leaving out many socially harmful uses of protected materials. The typical punitive measures of copyright law prove inadequate to deal with these uses, being more appropriate for situations that exceed the normative scope. We discuss various solutions to these challenges, highlighting the need to use works to train AI while ensuring fair compensation for creators, thus promoting technological evolution and market competitiveness. We conclude that the Text and Data Mining exception in the European Union Directive on Copyright in the Digital Single Market is a significant advance. This provision, although formulated as an exception, can be seen as a formality that obliges rights holders to act to exclude their materials from training datasets, directly addressing one of the fundamental causes of the AI dilemma.
Although the work analyzes the issue from the perspective of Portuguese law, which reflects European Union regulation, it also compares different legal systems in order to identify the locations most favorable to investment in this technology, owing to their greater ease of and openness to its development. Given the current relevance of the topic, analyzing concrete cases between large technology companies and the copyright holders whose works are used to train software is essential to answering the central question of the work: in which situations and locations the use of works is lawful.
| Date of Award | 23 Sept 2024 |
|---|---|
| Original language | Portuguese |
| Awarding Institution | Universidade Católica Portuguesa |
| Supervisor | Nuno Sousa e Silva (Supervisor) |
- Artificial intelligence (AI)
- Copyright
- Text and data mining (TDM)
- Machine learning
- Deep learning
- Training models
- European legislation
- Copyright exception
- Fair use
- Intellectual property
- Legal regulation
- Generative AI
- Temporary copies
- Digital single market directive
- Databases
- Mestrado em Direito e Gestão (Master's in Law and Management)
IA e obras: uma parceria ilícita?: a ilicitude da utilização de dados no treino de IA [AI and works: an illicit partnership? The unlawfulness of using data in AI training]
Ferreira, M. A. (Student). 23 Sept 2024
Student thesis: Master's Thesis