Protecting Your Digital Gold: Large Language Models from a Legal Perspective

Like most executives, I started dabbling with large language models to see how I might use them to support our students and to apply them in other areas. With a strong technological background, it did not take me long to conclude that custom-trained models (or at least fine-tuned ones) are the future, with businesses creating corporate LLMs/AIs that speak "their" language.

Some of these models might be used only for internal purposes, but many companies may find that the value of their AI model matches or even exceeds the value of their core business. Imagine a medical company that operates global discussion groups about depression. An AI with vast knowledge of these discussions, trained to help people find the right professional, could far exceed the value of the company's medications.

However, as businesses increasingly use these models to automate processes, enhance existing data, or generate large amounts of new content, understanding the associated legal implications is paramount.

Many LLM providers, OpenAI with its GPT models among them, have terms and conditions stipulating that users may not use their AI to help create a competing language model. This restriction may seem innocuous on the surface, but it could have significant implications if a company later decides to build its own language model using data augmented, corrected, or otherwise created by an LLM.

Let's look at this hypothetical scenario: A company uses an LLM to manage its customer service chat platform, automate data entry, or create content for marketing materials. As a result, a significant part of the company's data is now generated or influenced by the LLM. 

A few years down the line, the company decides to create its own language model, for internal use or for commercial purposes. The most valuable resource for training such a model would be the company's own data. However, any data that has been influenced by the LLM is potentially tainted under the terms of the LLM license agreement. Consequently, the company finds itself in a legal conundrum: using its own data to train its model could constitute a breach of the license agreement.

This scenario raises an essential question for businesses planning to deploy any kind of AI solution: should you avoid using AI unless you are certain you will never want to create your own models?

The answer isn't necessarily a categorical 'yes' or 'no'. It is essential to recognize that the utilization of AI offers many benefits, including increased efficiency, cost reduction, and improved customer service. Completely avoiding the use of AI for fear of potential future complications may limit a company's growth and competitiveness.

Instead, companies could consider the following steps to mitigate potential legal risks:

1. Understand the Terms and Conditions: Before deploying an LLM, understand the provider's terms and conditions. Consult with legal experts to comprehend the implications fully.

2. Separate AI-Generated and Human-Generated Data: If possible, maintain separate databases for human-generated and AI-generated or AI-influenced data. This segregation could help in the event of a future decision to train your own language model; a minimal sketch of what such provenance tagging could look like follows this list.

3. Negotiate License Terms: If planning to create your own language model, consider negotiating the license terms with the LLM provider. They may allow usage of their AI for this purpose under specific circumstances or for an additional fee.

4. Use Open Source or Less Restrictive AI: Consider using open-source language models or those with less restrictive terms, which would allow you to use the AI-augmented data to train your own models. Companies like my own Plumeria.ai consult on the use of different models and can provide valuable guidance.

5. Data Anonymization and Aggregation: Techniques like data anonymization and aggregation could help minimize the extent to which your data is considered 'tainted' by the LLM, although this is a legally complex area and should be approached with caution; the sketch below includes only a toy example.
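
To make steps 2 and 5 a little more concrete, here is a minimal Python sketch, built on assumptions of my own, of what provenance tagging and basic anonymization could look like. The Record and Provenance types, the build_training_set and anonymize helpers, and the e-mail-hashing rule are illustrative placeholders rather than the API of any particular product, and a regex-plus-hash pass is far from a legally sufficient anonymization scheme.

```python
import hashlib
import re
from dataclasses import dataclass
from enum import Enum


class Provenance(str, Enum):
    HUMAN = "human"              # written entirely by a person
    AI_GENERATED = "ai"          # produced by a third-party LLM
    AI_ASSISTED = "ai_assisted"  # human text edited or augmented by an LLM


@dataclass
class Record:
    text: str
    provenance: Provenance
    source_model: str | None = None  # e.g. which LLM (and version) touched it


EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def anonymize(text: str) -> str:
    """Replace e-mail addresses with a stable, non-reversible token (toy example)."""
    return EMAIL_RE.sub(
        lambda m: "user_" + hashlib.sha256(m.group().encode()).hexdigest()[:8], text
    )


def build_training_set(records: list[Record]) -> list[str]:
    """Keep only human-written records, anonymized, for a future training run."""
    return [anonymize(r.text) for r in records if r.provenance is Provenance.HUMAN]


if __name__ == "__main__":
    corpus = [
        Record("Please reach me at jane@example.com about my order.", Provenance.HUMAN),
        Record("Certainly! Here is a summary of your ticket...", Provenance.AI_GENERATED, "gpt-4"),
    ]
    print(build_training_set(corpus))  # only the human record survives, e-mail tokenized
```

The design point is simply that provenance gets recorded at the moment data is created rather than reconstructed years later; whether a filter like this actually keeps your data "clean" under a given license is a question for your lawyers, not your codebase.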

In conclusion, rather than shying away from using AI, businesses should approach this powerful technology with an informed understanding of the potential legal implications. With careful planning and the right strategies, companies can reap the benefits of AI while safeguarding their ability to mine their own "digital gold" in the future.

Nikolai Manek
