A non-disclosure agreement (NDA) is a useful legal tool that allows someone to share otherwise confidential information with others, safe in the knowledge that the confidentiality of that information is protected by the NDA. Older NDAs were drafted for the realities of their time. For example, it was assumed that information was delivered in a physical format and should be returned or destroyed upon completion of the agreement. Later NDAs had to adapt to modern realities. For example, electronic information cannot be “returned” in the usual sense, so NDAs now explicitly require recipients to delete any electronic copies of the information from their systems.
Technology continues to change. The latest change in how we process and generate information is the increased use of Artificial Intelligence and Machine Learning (AI/ML). To stay ahead of the curve, NDAs must explicitly address the challenges posed by AI/ML.
The use of AI/ML is challenging some of our old definitions. For example, NDAs often require recipients to destroy copies of the confidential information in their possession when the term of the NDA expires. This has traditionally presented a binary state regarding compliance. Even in the case of electronic data, it can be easily shown whether it has been retained or deleted. But what if the recipient of the confidential information uses it as training data for an AI/ML model? Some specialized AI/ML models may store portions of the information as presented, which would violate most current NDAs. But many AI/ML models do not retain verbatim copies. Instead, the models process the data and use it to tune their parameters. A copy of the confidential information has not been kept, and the recipient is arguably complying with the NDA. But should this be allowed?
Current AI chatbots have proven remarkably capable of accurately inferring personal information about users from subtle clues. Could such models also infer the confidential information in the future, despite not retaining copies in the usual sense? Even if they cannot, some AI/ML models trained on the confidential information would then be capable of producing results not possible in the absence of that information. The recipient retains the value of the information without having to retain the information itself.
Don’t put yourself in the position of having to argue after the fact that your NDA should be interpreted to cover these scenarios. Prevent this use of confidential information in the first place. Unless there is a good reason to allow it, an NDA should include language specifically forbidding the use of the disclosed confidential information to train AI/ML models.
A related area of concern is the use of public-facing AI/ML models. It's generally understood that feeding confidential information to a public-facing AI/ML tool would be considered disclosing that confidential information to a third party, and thus forbidden by most existing NDA language. However, there are larger concerns. With a traditional third-party disclosure, it may be possible to limit the scope of any damage or claw back the information. But with a public AI/ML system, anything typed into it may be used to train the system to respond to future queries by members of the public. If your confidential information is fed to a public-facing AI/ML model, that information could become public. There is no way of knowing where the information may go or who may be able to use it, and no way of undoing that damage. The stakes are too high to rely on a recipient understanding that “third party” includes AI/ML models. It is best practice for an NDA to explicitly forbid feeding confidential information into public or private AI/ML models.
Of course, the requirements for an NDA can vary with the type of information being shared and its intended purpose. If the use of AI/ML cannot be avoided, or if the underlying transaction for the NDA requires AI/ML, then the NDA should still be crafted to take that into account. In those cases, the NDA should at least specify that processing of the information by AI/ML is allowed only if the recipient ensures that the AI/ML model is not trained on the information and that the model is not accessible to anyone who is not otherwise allowed access to the information.