We have now reached the 180-day mark since the White House Executive Order (EO) on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, and we are seeing a flurry of mandated actions being completed. See here for a summary of recent actions. One of the mandated actions was for the National Institute of Standards and Technology (NIST) to update its January 2023 AI Risk Management Framework (AI RMF 1.0), which it has now done in draft form. To this end, NIST released four draft publications intended to help improve the safety, security, and trustworthiness of artificial intelligence (AI) systems and launched a challenge series to support the development of methods to distinguish between content produced by humans and content produced by AI.
The NIST guidance is a voluntary framework to assist companies with the responsible development of AI. It does not have the force of law, but it may still carry legal significance. For example, I recently covered proposed state legislation that seeks to impose obligations on developers and deployers of AI systems and provides an affirmative defense to those that have implemented and maintained a program complying with a nationally or internationally recognized risk management framework for artificial intelligence systems. NIST’s AI RMF is one of the most frequently cited risk management frameworks in the US.
While much of the information in these documents is technical, in-house counsel should factor this guidance into their companies’ policies on the development and deployment of AI. For more information on AI policies, see our prior publication on Why Companies Need AI Legal Training and Must Develop AI Policies. Additionally, companies should factor this guidance into their checklist of issues to address when conducting vendor diligence on AI tools the company is considering acquiring.
These publications are initial drafts, which NIST has released to solicit public feedback before issuing final versions later this year. Unless extended, the deadline for comments is June 2, 2024.
The following are additional details on the four documents.
Of the four draft publications, two provide guidance to help manage the risks of generative AI and are designed to be companion resources to the AI RMF and the Secure Software Development Framework (SSDF), respectively. The third offers approaches for promoting transparency in digital content, which AI can generate or alter. The fourth proposes a plan for developing global AI standards.
Mitigating the Risks of Generative AI – The first publication is the AI RMF Generative AI Profile (NIST AI 600-1). It identifies risks that are novel to or exacerbated by the use of generative AI (GAI) and provides a set of actions to help organizations govern, map, measure, and manage these risks.
The identified risks include:
- CBRN Information: Lowered barriers to entry or eased access to materially nefarious information related to chemical, biological, radiological, or nuclear (CBRN) weapons, or other dangerous biological materials.
- Confabulation: The production of confidently stated but erroneous or false content (known colloquially as “hallucinations” or “fabrications”).
- Dangerous or Violent Recommendations: Eased production of and access to violent, inciting, radicalizing, or threatening content as well as recommendations to carry out self-harm or conduct criminal or otherwise illegal activities.
- Data Privacy: Leakage and unauthorized disclosure or de-anonymization of biometric, health, location, personally identifiable, or other sensitive data.
- Environmental: Impacts due to high resource utilization in training GAI models, and related outcomes that may result in damage to ecosystems.
- Human-AI Configuration: Arrangement or interaction of humans and AI systems which can result in algorithmic aversion, automation bias or over-reliance, misalignment or mis-specification of goals and/or desired outcomes, deceptive or obfuscating behaviors by AI systems based on programming or anticipated human validation, anthropomorphization, or emotional entanglement between humans and GAI systems; or abuse, misuse, and unsafe repurposing by humans.
- Information Integrity: Lowered barrier to entry to generate and support the exchange and consumption of content which may not be vetted, may not distinguish fact from opinion or acknowledge uncertainties, or could be leveraged for large-scale dis-information and mis-information campaigns.
- Information Security: Lowered barriers for offensive cyber capabilities, including ease of security attacks, hacking, malware, phishing, and offensive cyber operations through accelerated automated discovery and exploitation of vulnerabilities; increased available attack surface for targeted cyber attacks, which may compromise the confidentiality and integrity of model weights, code, training data, and outputs.
- Intellectual Property: Eased production of alleged copyrighted, trademarked, or licensed content used without authorization and/or in an infringing manner; eased exposure to trade secrets; or plagiarism or replication with related economic or ethical impacts.
- Obscene, Degrading, and/or Abusive Content: Eased production of and access to obscene, degrading, and/or abusive imagery, including synthetic child sexual abuse material (CSAM), and non-consensual intimate images (NCII) of adults.
- Toxicity, Bias, and Homogenization: Difficulty controlling public exposure to toxic or hate speech, disparaging or stereotyping content; reduced performance for certain sub-groups or languages other than English due to non-representative inputs; undesired homogeneity in data inputs and outputs resulting in degraded quality of outputs.
- Value Chain and Component Integration: Non-transparent or untraceable integration of upstream third-party components, including data that has been improperly obtained or not cleaned due to increased automation from GAI; improper supplier vetting across the AI lifecycle; or other issues that diminish transparency or accountability for downstream users.
Reducing Threats to the Data Used to Train AI Systems – The second publication is Secure Software Development Practices for Generative AI and Dual-Use Foundation Models (NIST Special Publication (SP) 800-218A). It is designed to augment the SSDF (SP 800-218) by adding practices, tasks, recommendations, considerations, notes, and informative references that are specific to AI model development throughout the software development life cycle. Among other things, it focuses on sourcing data for, designing, training, fine-tuning, and evaluating AI models, as well as incorporating and integrating AI models into other software. While the SSDF is broadly concerned with securing the software’s lines of code, this companion resource expands the SSDF to help address concerns that malicious training data could adversely affect generative AI systems.
The document is designed to address these issues from the perspectives of:
- AI model producers: Organizations that are developing their own generative AI and dual-use foundation models.
- AI system producers: Organizations that are developing software that leverages a generative AI or dual-use foundation model.
- AI system acquirers: Organizations that are acquiring a product or service that utilizes one or more AI systems.
Reducing Synthetic Content Risks – The third publication is Reducing Risks Posed by Synthetic Content (NIST AI 100-4). It provides technical approaches for promoting transparency in digital content based on use case and context, and it identifies methods for detecting, authenticating, and labeling synthetic content, including digital watermarking and metadata recording, in which information indicating the origin or history of content, such as an image or sound recording, is embedded in the content to assist in verifying its authenticity. The publication supplements a separate report on provenance and detection of synthetic content that AI EO Section 4.5(a) tasks NIST with providing to the White House.
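To illustrate the metadata recording concept at a very high level, the sketch below embeds a simple provenance record in an image file’s metadata and reads it back. This is not the approach specified in NIST AI 100-4 or any particular standard; the record fields, file names, and tooling (Python with the Pillow library) are illustrative assumptions only.

```python
# Minimal sketch of "metadata recording" for content provenance:
# embed a provenance record in a PNG's text chunks, then read it back.
# Illustrative only; real provenance schemes add cryptographic signing
# and tamper evidence. All field names below are hypothetical.
import json

from PIL import Image
from PIL.PngImagePlugin import PngInfo

# Hypothetical record describing how the content was produced.
provenance = {
    "generator": "example-image-model-v1",  # assumed tool name
    "created": "2024-05-01T12:00:00Z",
    "synthetic": True,
}

# Stand-in for an AI-generated image.
image = Image.new("RGB", (64, 64), color="white")

# Record the provenance information alongside the pixels.
metadata = PngInfo()
metadata.add_text("provenance", json.dumps(provenance))
image.save("labeled_output.png", pnginfo=metadata)

# A downstream consumer can recover the record to check origin claims.
with Image.open("labeled_output.png") as img:
    record = json.loads(img.text["provenance"])
    print(record["generator"], record["synthetic"])
```

Production provenance schemes go much further: they bind the record to the content with digital signatures so tampering can be detected, and watermarking approaches embed the signal in the content itself so that a label can survive simple metadata stripping.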
The report offers approaches to help manage and reduce risks related to synthetic content in four ways:
- Attesting that a particular system produced a piece of content.
- Asserting ownership of content.
- Providing tools to label and identify AI-generated content.
- Mitigating the production and dissemination of AI-generated child sexual abuse material and non-consensual intimate imagery of real individuals.
Global Engagement on AI Standards – The fourth publication is the Plan for Global Engagement on AI Standards (NIST AI 100-5). It is designed to promote the worldwide development and implementation of AI-related consensus standards, cooperation and coordination, and information sharing, and it addresses topics such as: (i) mechanisms for enhancing awareness of the origin of digital content, whether authentic or synthetic; and (ii) shared practices for testing, evaluation, verification, and validation of AI systems.
The plan furthers the policies and principles in the White House Executive Order, which instructs the Federal government to “promote responsible AI safety and security principles and actions with other nations, including our competitors, while leading key global conversations and collaborations to ensure that AI benefits the whole world, rather than exacerbating inequities, threatening human rights, and causing other harms.”
NIST GenAI – NIST also announced NIST GenAI, a new platform for evaluating and measuring generative AI technologies developed by the research community around the world, which will inform the work of the U.S. AI Safety Institute at NIST. Under the program, NIST will issue a series of challenge problems designed to evaluate and measure the capabilities and limitations of generative AI technologies. These evaluations will be used to identify strategies to promote information integrity and to guide the safe and responsible use of digital content. More information about the challenge and how to register can be found on the NIST GenAI website.
The objectives of the NIST GenAI evaluation include:
- Evolving benchmark dataset creation,
- Facilitating the development of content authenticity detection technologies for different modalities (text, audio, image, video, code),
- Conducting a comparative analysis using relevant metrics, and
- Promoting the development of technologies for identifying the source of fake or misleading information.
The NIST GenAI pilot study aims to measure and understand system behavior in discriminating between synthetic and human-generated content in the text-to-text (T2T) and text-to-image (T2I) modalities. The pilot addresses how human-produced content differs from synthetic content and how the evaluation findings can guide users in differentiating between the two. The generator task asks systems to create high-quality outputs, while the discriminator task asks systems to detect whether a target output was generated by an AI model or by a human.
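NIST has not published the evaluation tooling for the pilot, but a rough intuition for the discriminator task can be conveyed with a toy text classifier. The sketch below (Python with scikit-learn, tiny made-up examples; all data and names are assumptions for illustration) trains a simple model to estimate whether a passage is human-written or AI-generated. Real detection systems, and the NIST GenAI evaluation itself, are far more sophisticated.

```python
# Toy sketch of a "discriminator" for the text-to-text (T2T) task:
# classify whether a passage is human-written or AI-generated.
# Illustrative only -- tiny invented data, not NIST's evaluation pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled examples (0 = human-written, 1 = AI-generated).
texts = [
    "honestly the meeting ran long and we never got to the budget",
    "i think the draft is fine but the intro needs another pass",
    "In conclusion, there are several key factors to consider carefully.",
    "Overall, this approach offers numerous benefits and potential advantages.",
]
labels = [0, 0, 1, 1]

# Character n-grams capture stylistic cues; a linear model scores them.
discriminator = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
discriminator.fit(texts, labels)

candidate = "In summary, it is important to note the following considerations."
prob_ai = discriminator.predict_proba([candidate])[0][1]
print(f"Estimated probability the text is AI-generated: {prob_ai:.2f}")
```

Even a toy setup like this highlights why the pilot matters: stylistic cues alone are weak and can be evaded, so shared benchmarks and rigorous evaluation are needed before anyone relies on automated detection of synthetic content.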
Conclusion
AI technologies and applications continue to develop at a dizzying pace. Given the immense potential and concomitant risks, the US government and its agencies are working to develop guidance and regulations at a speed not previously seen with other groundbreaking technologies. It is imperative that companies keep up with these developments and take them into account when drafting or updating their corporate policies on the development and deployment of AI.