Mend.io this week added MendAI, a tool that identifies code generated by an artificial intelligence (AI) model, to its application security portfolio.
In addition, the Mend.io software composition analysis (SCA) tool has been expanded to surface detailed versioning and update information for every AI model used, including any outdated dependencies.
Mend.io has indexed more than 35,000 publicly available, pre-trained large language models (LLMs) to identify which AI models are being employed.
Jeffery Martin, vice president of product for Mend.io, said that capability makes it simpler for organizations to navigate licensing, compatibility and compliance issues within the context of a software bill of materials (SBOM). That is critical as organizations look to apply governance policies to code generated by AI platforms, he added.
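As a rough illustration of the general idea, rather than Mend.io's actual implementation, the sketch below shows how a list of declared AI model dependencies might be checked against an index of known pre-trained models to flag outdated versions and licenses that need review. The model names, index entries, manifest format and helper function are all hypothetical.

```python
# Hypothetical sketch: cross-referencing declared AI model dependencies against
# a small index of known pre-trained models to flag version and license issues.
# The index entries, manifest format and field names are illustrative only --
# they are not Mend.io's data model or API.

KNOWN_MODELS = {
    "example-llm-7b": {"latest_version": "2.1", "license": "apache-2.0"},
    "example-coder-13b": {"latest_version": "1.4", "license": "research-only"},
}

ALLOWED_LICENSES = {"apache-2.0", "mit"}


def audit_models(manifest: list[dict]) -> list[str]:
    """Return human-readable findings for each AI model declared in a manifest."""
    findings = []
    for entry in manifest:
        name, version = entry["name"], entry["version"]
        known = KNOWN_MODELS.get(name)
        if known is None:
            findings.append(f"{name}: not in index; provenance and licensing unknown")
            continue
        if version != known["latest_version"]:
            findings.append(f"{name}: version {version} is outdated "
                            f"(latest is {known['latest_version']})")
        if known["license"] not in ALLOWED_LICENSES:
            findings.append(f"{name}: license '{known['license']}' requires review")
    return findings


if __name__ == "__main__":
    # In practice the manifest would be derived from build artifacts or an SBOM document.
    sbom_models = [
        {"name": "example-llm-7b", "version": "1.9"},
        {"name": "example-coder-13b", "version": "1.4"},
    ]
    for finding in audit_models(sbom_models):
        print(finding)
```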
It’s still early days for using AI to generate code. However, it’s clear data science teams relying on machine learning operations (MLOps) workflows to build models will need access to the same types of SCA and SBOM tools that many developers routinely use today. Less clear is to what degree DevSecOps teams will meld workflows to unify the management of application and AI model security, he added.
Unfortunately, data science teams typically don’t have a lot of cybersecurity expertise. That’s problematic because when AI applications are built with tools and platforms that have known vulnerabilities, it becomes relatively trivial for cybercriminals to exploit them. As such, DevSecOps teams that have deployed AI applications will need tools that enable them to identify where potentially vulnerable AI-generated code is running in an application environment.
Cybercriminals Increasingly Target AI Models
Unfortunately, cybercriminals are now aware of these vulnerabilities, so it’s becoming increasingly common to see malware campaigns specifically targeting AI models. The goal of those attacks can range from simply exfiltrating data to poisoning the pool of data used to train AI models, which are rapidly becoming the most valuable software asset an organization can own. The challenge is that remediating those vulnerabilities may require data science teams to replace an entire AI model, because many AI models can’t be patched as easily as other software artifacts.
It’s not apparent yet who within organizations will ultimately assume responsibility for AI application security. The challenge is that there is already a general shortage of cybersecurity expertise, and the number of existing cybersecurity professionals who also have AI expertise is extremely limited. Undoubtedly, organizations will eventually need to meld MLOps and cybersecurity workflows to define a set of MLSecOps best practices.
In the meantime, DevOps teams should expect the amount of AI-generated code running in their application environments to increase substantially. With or without permission, many developers are reusing code generated by AI platforms, most of which have been trained using code of uneven quality collected from across the internet. As a result, the code an AI model generates might be just as flawed as the code used to train it.
Regardless of how that code was created, however, one thing is certain: A DevSecOps team will be expected to address the application security issues that almost inevitably arise.