The National Institute of Standards and Technology (NIST), the U.S. Commerce Department agency that develops and tests technology for the government, private companies, and the public, has re-released a tool for measuring how malicious attacks degrade the performance of AI systems. The tool focuses on “poisoning” attacks, in which an adversary corrupts an AI model’s training data so that the resulting model’s performance suffers.
Named Dioptra, after the classical astronomical and surveying instrument, the modular, open-source, web-based tool was first launched in 2022. It aims to help companies that train AI models, as well as the people who use those models, assess, analyze, and track AI risks. NIST says Dioptra can be used to benchmark and research models, and to provide a common platform for exposing models to simulated threats in a “red-teaming” environment.
“Testing the effects of adversarial attacks on machine learning models is one of the goals of Dioptra,” NIST wrote in a press release. “The open-source software, available for free download, could help the community, including government agencies and small to medium-sized businesses, conduct evaluations to assess AI developers’ claims about their systems’ performance.”
Dioptra’s re-release comes with additional documentation from NIST and its newly established AI Safety Institute. These documents outline strategies to mitigate some of the dangers posed by AI, such as its potential abuse in generating non-consensual pornography. The release follows the launch of the U.K. AI Safety Institute’s Inspect toolset, which similarly aims to assess AI model capabilities and overall safety. The U.S. and U.K. continue to collaborate on advanced AI model testing, a partnership announced at the U.K.’s AI Safety Summit at Bletchley Park in November 2023.
Dioptra is also a product of President Joe Biden’s executive order on AI, which mandates, among other things, that NIST help with AI system testing. The executive order also establishes standards for AI safety and security, including requirements that companies developing models (e.g., Apple) notify the federal government and share the results of all safety tests before public deployment.
As we’ve previously discussed, AI benchmarks are notoriously challenging to establish, particularly because the most advanced AI models today are opaque: the companies that create them keep crucial infrastructure, training data, and other key details confidential. A recent report from the Ada Lovelace Institute, a U.K.-based nonprofit that researches AI, found that evaluations alone are insufficient to determine an AI model’s real-world safety, partly because current policies allow AI vendors to choose which evaluations to conduct.
While NIST does not claim that Dioptra can completely eliminate risks associated with AI models, the agency suggests that Dioptra can illuminate which types of attacks might impair an AI system’s performance and quantify this impact.
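To make that kind of measurement concrete, here is a minimal, illustrative sketch of a data-poisoning experiment. It does not use Dioptra’s own API or workflow; it simply trains a toy scikit-learn classifier on clean data, then on data with a fraction of the labels flipped, and compares test accuracy. The dataset, model, and 20% poisoning rate are arbitrary choices made for the demonstration.

```python
# Illustrative sketch only: a toy label-flipping poisoning experiment,
# not Dioptra's API. It compares test accuracy before and after poisoning.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic binary classification data (arbitrary parameters for the demo).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

def train_and_score(labels):
    """Train on the training features with the given labels, score on the clean test set."""
    model = LogisticRegression(max_iter=1000).fit(X_train, labels)
    return accuracy_score(y_test, model.predict(X_test))

clean_acc = train_and_score(y_train)

# "Poison" the training set by flipping 20% of its labels.
poison_rate = 0.20
y_poisoned = y_train.copy()
flip = rng.choice(len(y_poisoned), size=int(poison_rate * len(y_poisoned)), replace=False)
y_poisoned[flip] = 1 - y_poisoned[flip]

poisoned_acc = train_and_score(y_poisoned)

print(f"clean accuracy:    {clean_acc:.3f}")
print(f"poisoned accuracy: {poisoned_acc:.3f}")
```

The gap between the two accuracy figures is a crude version of the kind of impact measurement a testbed like Dioptra is meant to perform systematically, across many attack types and at much larger scale.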
One notable limitation: Dioptra only works out of the box on models that can be downloaded and run locally, such as Meta’s expanding Llama family. Models available only through an API, like OpenAI’s GPT-4, are not supported, at least for now.
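For a sense of what “downloadable and usable locally” means in practice, here is a rough sketch of loading an open-weights model with the Hugging Face transformers library. The specific model identifier is only an example (Llama weights are gated behind Meta’s license on the Hugging Face Hub), and nothing here is part of Dioptra itself.

```python
# Illustrative sketch only: running an open-weights model locally, the kind of
# setup a testbed can probe directly. Requires transformers and PyTorch; the
# model ID is an example and is gated behind Meta's license on the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B"  # example; any locally runnable model works

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Because the weights are local, the model can be inspected, fine-tuned on
# (possibly poisoned) data, and re-evaluated. An API-only model exposes none of this.
inputs = tokenizer("Data poisoning attacks work by", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```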
Still, by giving developers and researchers a common platform for testing how AI systems fail under attack, Dioptra is a step toward more secure and trustworthy AI.