How Generative AI Can Enhance Software Supply Chain Security
Generative AI is the technology of the moment, hyped as delivering transformational benefits for years to come, including in software supply chain security.
This subset of AI uses machine learning algorithms to generate new data and content. Given the increasing importance of the software supply chain, it is critical to use every measure possible to secure it. But securing it is no easy feat: 96% of scanned codebases contain open source.
And as with any transformative technology, the adoption of generative AI also comes with a set of challenges, particularly in the area of security. Rezilion recently conducted research to test security concerns within generative AI, not only related to large language models (LLMs) but also the broader security considerations accompanying its adoption. These include examining whether the appropriate controls are being put in place and whether proper risk assessments are being conducted.
Attackers Are Leveraging Generative AI
Attackers, of course, are already seizing on the opportunities generative AI presents. For example, a fake GPT project recently claimed to provide a better AI tool than ChatGPT. Upon installation, it opens the Chrome browser to the real OpenAI website, leading victims to believe they have installed the genuine ChatGPT.
Some of the security risks Rezilion has identified are not new, but they often don’t receive enough attention when LLMs are used. These include confidentiality, integrity, availability, and basic security best practices.
Mitigating the Risk of Generative AI
Just as security guardrails have been erected around common risks in the software development domain, AI models and their encompassing ecosystem must develop the same compensating controls and best practices, the Rezilion research advises.
For example, to prevent SQL injections, we know we must apply strategies such as input validation and sanitization; the same techniques must be adapted to address prompt injection risk.
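To illustrate how input-validation thinking carries over to prompts, here is a minimal Python sketch. The pattern list and function names are hypothetical; a real defense would need layered, context-aware controls rather than a simple denylist, but the structure mirrors classic input sanitization:

```python
import re

# Hypothetical denylist of phrases common in prompt-injection attempts.
# Real systems need broader, context-aware defenses; this is illustrative only.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"disregard your system prompt",
    r"you are now in developer mode",
]

def sanitize_prompt(user_input: str) -> str:
    """Reject input matching known injection patterns, analogous to
    input validation used against SQL injection, then strip control
    characters that could smuggle hidden instructions."""
    lowered = user_input.lower()
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, lowered):
            raise ValueError("potential prompt injection detected")
    return re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", user_input)
```

Just as with SQL injection, the point is to treat user-supplied text as untrusted before it is combined with privileged instructions.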
Reducing data management risk starts with sourcing training data from reliable and verifiable sources. It is also critical to employ robust data sanitization and preprocessing techniques to eliminate vulnerabilities and biases, ensuring the data is reliable and fair.
To address data leakage risks, we recommend strict output filtering and context-aware mechanisms to safeguard against the inadvertent disclosure of sensitive information by the LLM.
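A minimal Python sketch of such output filtering follows. The redaction rules and names are assumptions chosen for illustration (an email pattern and an AWS-style access key pattern); a production filter would need far broader coverage and context-aware policies:

```python
import re

# Illustrative redaction rules applied to LLM output before it reaches users.
# Patterns here are examples only; real filters need much broader coverage.
REDACTION_RULES = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[REDACTED_EMAIL]"),
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED_AWS_KEY]"),
]

def filter_output(llm_response: str) -> str:
    """Apply each redaction rule to the model's response so sensitive
    strings are masked before the text leaves the system."""
    for pattern, replacement in REDACTION_RULES:
        llm_response = pattern.sub(replacement, llm_response)
    return llm_response
```

Running the filter as a post-processing step keeps the safeguard independent of the model itself, so it applies regardless of which LLM produced the output.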
Generative AI’s Benefits in Software Supply Chain Security
With all that said, generative AI tools can pack a punch in software supply chain security, enhancing vulnerability detection and management as well as real-time patching and intelligence gathering. AI can also detect vulnerabilities in open source code more quickly and accurately.
For example, developers can use automated code analysis not only to flag potential vulnerabilities but also to receive suggestions for improvements and fixes. Automation can also help in other areas, such as speeding up testing to quickly flag issues in production.
Patching vulnerabilities in open source code is another area where AI can make strides. Using techniques such as natural language processing (NLP) pattern matching or the k-nearest neighbors (KNN) algorithm on code embeddings, it can automate the process of identifying and applying patches. This can result in significant time and resource savings.
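To make the KNN idea concrete, here is a toy Python sketch that ranks known fixes by cosine similarity between code embeddings. The embedding vectors and patch database are fabricated for illustration; a real system would derive embeddings from a trained code model and hold far more candidates:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "patch database": embeddings of known vulnerable snippets mapped to
# their fixes. Values are made up; real embeddings come from a code model.
PATCH_DB = {
    "use parameterized queries": [0.9, 0.1, 0.0],
    "escape HTML output": [0.1, 0.8, 0.2],
    "validate file paths": [0.0, 0.2, 0.9],
}

def suggest_patch(code_embedding, k=1):
    """Return the k nearest known patches by cosine similarity (KNN)."""
    ranked = sorted(
        PATCH_DB.items(),
        key=lambda item: cosine_similarity(code_embedding, item[1]),
        reverse=True,
    )
    return [patch for patch, _ in ranked[:k]]
```

Given the embedding of a newly flagged snippet, the nearest neighbors in the database point to fixes that worked for similar code, which is the intuition behind KNN-based patch suggestion.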
How to Proceed
Significantly, AI and generative AI can be a developer’s new best friend, educating them about security best practices in real time to help them write more secure code and identify and fix vulnerabilities up front. This reduces the need to flag and fix vulnerabilities later through automated pull/merge requests (PRs/MRs).
Expect to see great potential from generative AI to protect the software supply chain. But a word of caution: Remember that ChatGPT and any language-based generative model have limitations, such as an inherent tendency to produce false or fabricated information, typically referred to as “hallucination.”
Be sure you understand these models’ capabilities and limitations, and as with any new technology, implement fundamental security measures to mitigate the risk they present.
About the author: Esther Shein is a longtime freelance tech and business writer and editor whose work has appeared in several publications, including CIO.com, TechRepublic, VentureBeat, ZDNet, TechTarget, The Boston Globe and Inc. She has also written thought leadership whitepapers, ebooks, case studies and marketing materials.