Security & Trust Center
Your data security is our top priority
AI Security
Best practices for mitigating the risks associated with AI models
AI security refers to the practices, measures and strategies implemented to protect artificial intelligence systems, models and data from unauthorized access, manipulation or malicious activities. Organizations must implement robust security protocols, encryption methods, access controls and monitoring mechanisms to safeguard AI assets and mitigate potential risks associated with their use.
The Databricks Security team works with our customers to deploy AI and machine learning on Databricks securely, using the features that meet their architecture requirements. We also work with dozens of experts within Databricks and across the broader ML and GenAI community to identify security risks to AI systems and define the controls necessary to mitigate those risks.
Understanding AI Systems
What components make up an AI system and how do they work together?
AI systems are composed of data, code and models. A typical end-to-end AI system has 12 foundational architecture components, broadly categorized into four major stages:
- Data operations include ingesting and transforming data and ensuring data security and governance. Good ML models depend on reliable data pipelines and secure infrastructure.
- Model operations include building custom models, acquiring models from a model marketplace or using SaaS large language models (LLMs), such as those from OpenAI. Developing a model requires a series of experiments and a way to track and compare the conditions and results of those experiments; a minimal tracking sketch follows the figure below.
- Model deployment and serving consists of securely building model images, isolating and securely serving models, automated scaling, rate limiting, and monitoring deployed models.
- Operations and platform include platform vulnerability management and patching, model isolation, system-level controls, and authorized access to models with security built into the architecture. This stage also covers operational tooling for CI/CD and keeps the distinct execution environments (development, staging and production) secure, so the complete MLOps lifecycle meets the required standards.
The below image highlights the 12 components and how they interact across an AI system.
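To make the model operations stage concrete, here is a minimal experiment-tracking sketch using MLflow, which ships with Databricks runtimes. The experiment path, dataset and hyperparameter values are illustrative assumptions, not prescriptions from this page.

```python
# A minimal sketch of the "model operations" stage: tracking experiment
# conditions and results so runs can be compared and reproduced.
# Requires mlflow and scikit-learn; all names below are illustrative.
import mlflow
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

mlflow.set_experiment("/Shared/ai-security-demo")  # hypothetical experiment path

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

for C in (0.1, 1.0, 10.0):  # compare runs under different conditions
    with mlflow.start_run():
        model = LogisticRegression(C=C, max_iter=200).fit(X_train, y_train)
        mlflow.log_param("C", C)  # record the experiment condition
        mlflow.log_metric("accuracy", model.score(X_test, y_test))  # and its result
```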
Understanding AI Security Risks
What are the security threats that may arise when adopting AI?
In our analysis of AI systems, we identified 55 technical security risks across the 12 foundational architecture components. The table below outlines these components, which align with steps in any AI system, and shows how many security risks we identified at each stage:
| System stage | System components | Potential security risks |
|---|---|---|
| Data operations | 1. Raw data 2. Data preparation 3. Datasets 4. Catalog and governance | 19 specific risks |
| Model operations | 5. ML algorithm 6. Evaluation 7. Model build 8. Model management | 14 specific risks |
| Model deployment and serving | 9. Model serving: inference requests 10. Model serving: inference responses | 15 specific risks |
| Operations and platform | 11. ML operations 12. ML platform | 7 specific risks |
What controls are available for mitigating AI security risks?
We build security into every layer of the Databricks Data Intelligence Platform, drawing on over a decade of experience working with thousands of customers to securely deploy AI systems. Our team has compiled a list of 53 prescriptive controls for mitigating the AI security risks identified above. The first set consists of typical cybersecurity best practices, such as single sign-on, encryption techniques, library and source code controls, and network access controls, applied with a defense-in-depth approach to managing risk. The second set is specific to data and AI governance, such as data classification, data lineage, data versioning, model tracking, data and model asset permissions, and model governance. The last set is AI specific, including model serving isolation, prompt tools, model auditing and monitoring, MLOps and LLMOps, centralized LLM management, and fine-tuning and pretraining your models.
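As one illustration of the governance controls above, the following sketch applies data and model asset permissions with Unity Catalog GRANT statements. It assumes a Databricks notebook where a `spark` session is available; the catalog, schema, table, model and group names are hypothetical.

```python
# An illustrative sketch of data and model asset permissions using
# Unity Catalog GRANT statements. Assumes a Databricks notebook where
# a `spark` session exists; all object and group names are hypothetical.
grants = [
    # Data operations: least-privilege read access to curated data
    "GRANT SELECT ON TABLE main.curated.transactions TO `data-scientists`",
    # Model operations: allow a group to run, but not modify, a registered model
    "GRANT EXECUTE ON MODEL main.ml.fraud_classifier TO `fraud-analysts`",
]
for stmt in grants:
    spark.sql(stmt)

# Auditing: review which principals hold privileges on the table
spark.sql("SHOW GRANTS ON TABLE main.curated.transactions").show()
```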
If you are interested in getting an in-depth overview of the security risks associated with AI systems and what controls should be implemented for each risk, we invite you to download our Databricks AI Security Whitepaper.
Best Practices for Securing AI and ML Models
Data and security teams must collaborate closely to improve the security of AI systems. Whether you are implementing traditional machine learning solutions or LLM-driven applications, Databricks recommends the following steps, as outlined in the Databricks AI Security Whitepaper.
Identify the AI business use case
Always remember your business goals. Ensure there is a well-defined use case agreed upon with your stakeholders, whether it is already implemented or still in the planning phase. We recommend leveraging Databricks Solution Accelerators, purpose-built guides that speed up results across your most common and high-impact AI and ML use cases.
Determine the AI deployment model
Choose an appropriate deployment model, such as a traditional custom tabular model, a SaaS LLM, retrieval augmented generation (RAG), a fine-tuned model or an external model. Each deployment model carries a different shared responsibility split across the 12 AI system components among your organization, the Databricks Data Intelligence Platform and any partners involved.
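To illustrate one of these deployment models, here is a toy RAG sketch in pure Python. The keyword-overlap retriever and the `call_llm` stub are hypothetical stand-ins for a real vector index and a governed model serving endpoint.

```python
# A toy sketch of the retrieval augmented generation (RAG) deployment model:
# retrieve relevant documents, then ground the LLM prompt in them.
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Toy relevance score: count of shared words between query and document
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def call_llm(prompt: str) -> str:
    # Placeholder: in production this would call a governed model endpoint
    return f"[LLM response grounded in a prompt of {len(prompt)} chars]"

docs = [
    "Unity Catalog governs access to tables and models.",
    "Model serving endpoints support rate limiting and monitoring.",
    "Raw data should be classified before ingestion.",
]
context = "\n".join(retrieve("how is model serving monitored?", docs))
answer = call_llm(f"Answer using only this context:\n{context}\n\nQuestion: ...")
print(answer)
```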
Select the most pertinent risks
From the list of 55 security risks documented above, pinpoint those most relevant to your organization based on the deployment model you have chosen.
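As a sketch of this step, the snippet below filters a risk register by deployment model. The risk entries and tags are hypothetical examples, not the actual identifiers from our documented list of 55 risks.

```python
# A minimal sketch of narrowing documented risks to those relevant to your
# deployment model. Register entries and tags are hypothetical examples.
risk_register = [
    {"risk": "Training data poisoning", "stage": "Data operations",
     "applies_to": {"custom", "fine-tuned"}},
    {"risk": "Prompt injection", "stage": "Model serving",
     "applies_to": {"saas-llm", "rag", "fine-tuned", "external"}},
    {"risk": "Model theft via inference", "stage": "Model serving",
     "applies_to": {"custom", "fine-tuned"}},
]

deployment_model = "rag"  # chosen in the previous step
relevant = [r for r in risk_register if deployment_model in r["applies_to"]]
for r in relevant:
    print(f"{r['stage']}: {r['risk']}")
```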
Enumerate threats for each risk
Identify the specific threats linked to each risk, along with the AI component each threat targets.
Choose and implement controls
Select controls that align with your organization’s risk appetite. Responsibility for implementing these controls may be shared among your organization, your cloud provider, and your data and AI vendor(s).
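The sketch below illustrates one way to record the outcome of this step: each selected risk mapped to its threats, the chosen control and the responsible parties. All entries are hypothetical; real mappings should come from your own risk assessment and the Databricks AI Security Whitepaper.

```python
# An illustrative control plan: for each selected risk, record the threats,
# the chosen control and who implements it. All entries are hypothetical.
control_plan = {
    "Prompt injection": {
        "threats": ["jailbreak via user input", "indirect injection via retrieved docs"],
        "control": "input filtering and prompt tools on the serving endpoint",
        "responsible": "your organization + data/AI vendor",
    },
    "Training data poisoning": {
        "threats": ["malicious records in upstream feeds"],
        "control": "data lineage, classification and pipeline access controls",
        "responsible": "your organization + cloud provider",
    },
}
for risk, plan in control_plan.items():
    print(f"{risk}: {plan['control']} ({plan['responsible']})")
```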
AI Security Resources
Databricks AI Security Framework (DASF)
The Databricks Security team developed the Databricks AI Security Framework (DASF) to raise awareness of unique and evolving vulnerabilities as the global community incorporates AI and ML into more systems. The DASF takes a holistic approach to mitigating the security risks of AI systems instead of focusing only on the security of models or model endpoints.
AI Security Workshop
The Databricks Security team regularly hosts AI Security workshops at industry conferences or by request. These workshops are designed for security leaders to understand how AI systems work and their associated risk factors, and to facilitate a discussion-based approach to mitigating these risks.
AI Security Blogs
The Databricks Security team regularly co-authors posts on AI security with machine learning experts on the Databricks blog.
AI Security Events and Webinars
Our security leaders at Databricks are regularly invited to lead and participate in roundtables, workshops, virtual events, and speaking engagements with thought leaders, enterprises, public sector agencies, security vendors, or industry groups to share their expertise.