How Zero Knowledge Proofs Make AI Secure, Compliant, and Privacy Preserving
The full benefits of AI cannot be achieved until AI can be fully trusted: secure, compliant, and privacy preserving. Trusted AI requires integrating privacy-preserving verification software throughout the AI tech stack.
The best approach for making AI secure, compliant, and privacy preserving is zero knowledge proofs (ZKPs).
ZKPs are privacy-preserving cryptographic protocols that let one party prove to another that a statement is true without revealing any information beyond the validity of the statement.
ZKPs can enable you to prove that AI-related computations are correct without revealing the actual data or the details of the computation. As a result, ZKPs enable AI properties to be verified and models to be trained without revealing confidential information.
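To make the idea concrete, here is a minimal sketch of one classic ZKP, a Schnorr proof made non-interactive with the Fiat-Shamir transform. The prover shows that it knows a secret exponent behind a public value without revealing that secret. This is an illustration only, not the tooling ICME builds: the group parameters are toy-sized assumptions chosen for readability, and the function names are hypothetical.

```python
import hashlib
import secrets
from typing import Tuple

# Toy group parameters (assumption: far too small for real-world security).
P = 2039  # safe prime, P = 2 * Q + 1
Q = 1019  # prime order of the subgroup generated by G
G = 4     # generator of the order-Q subgroup of the integers mod P


def _challenge(*values: int) -> int:
    """Fiat-Shamir challenge: hash the public transcript into an exponent mod Q."""
    data = b"|".join(str(v).encode() for v in values)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(secret: int) -> Tuple[int, int, int]:
    """Return (public value, commitment, response) proving knowledge of `secret`."""
    public = pow(G, secret, P)
    nonce = secrets.randbelow(Q - 1) + 1      # fresh random nonce in [1, Q-1]
    commitment = pow(G, nonce, P)
    c = _challenge(G, public, commitment)
    response = (nonce + c * secret) % Q       # the random nonce masks the secret
    return public, commitment, response


def verify(public: int, commitment: int, response: int) -> bool:
    """Check G^response == commitment * public^c (mod P) without seeing the secret."""
    c = _challenge(G, public, commitment)
    return pow(G, response, P) == (commitment * pow(public, c, P)) % P


if __name__ == "__main__":
    public, commitment, response = prove(secret=777)
    print(verify(public, commitment, response))
```

The verifier only ever sees the public value, the commitment, and the response. Proofs about AI computations use far more general proof systems, but the roles of prover, verifier, and hidden witness are the same.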
ZKPs for AI are the major focus of our work at ICME. Here’s how ZKPs can be integrated across a typical generative AI tech stack using OpenAI as an example:
1. Hardware Infrastructure
GPUs: For the GPU-intensive tasks required in AI, ZKPs can add a layer of security when using GPUs from third-party providers. This prevents the providers from gaining insights into the data or algorithms used.
For example, when OpenAI rents cloud GPU services, ZKPs can validate the integrity of the training process conducted on these external GPUs.
In addition, when ZKPs are combined with techniques for computing on protected data, OpenAI can transform its data into a form that can be processed by the cloud GPUs without revealing the actual information within the data.
CPU & Memory: In shared computing environments, which are common in cloud computing and in multi-tenant AI systems (imagine an office space shared by different companies), CPUs and memory are used by multiple users and applications.
In a multi-tenant scenario, ZKPs can ensure that each tenant's operations are isolated and secure. They can demonstrate that a process is only accessing the memory allocated to it, not someone else's.
Storage: For storage, ZKPs can be used to demonstrate that encrypted data hasn't been tampered with or that deduplication processes haven't affected the integrity of the stored data. This can be done without revealing the actual contents of the data.
2. Software Frameworks
TensorFlow & PyTorch: These are the toolkits that developers use to build and train AI models. With ZKPs, developers can prove that training was carried out correctly without revealing their methods or the data they used.
When integrated within these frameworks, ZKPs provide a verification mechanism that lets different companies contribute to a model without sharing their sensitive data. For example, a consortium of hospitals can collaboratively train a model without sharing patient data.
GANs, VAEs, & Transformer Models: These are different methods of creating AI that can generate new, original content, from pictures to text.
ZKPs would allow creators to prove that generated content complies with applicable requirements without exposing the underlying data. For example, a ZKP could verify that a GAN-generated medical image doesn't contain any identifiable patient information.
3. Data Management
Datasets: These are the collections of data AI learns from. ZKPs can ensure that the data is used correctly according to privacy rules and regulations such as GDPR or those contained in President Biden's October 2023 Executive Order.
For example, if a privacy rule states that an algorithm should only access certain anonymized parts of a dataset, a ZKP can verify that the algorithm did not access any other data.
ZKPs can also be used to train AI systems in a privacy-preserving manner by allowing the model to learn from the data without ever seeing the data itself. This is done by converting the data into a format that can be used with ZKPs, and then using the proofs to demonstrate to the model that the data is valid and meets certain criteria.
For example, ZKPs can be used to train a credit scoring model without revealing the personal information of the individuals in the training dataset.
ZKPs can facilitate the creation of audit trails that prove that data handling complies with privacy regulations without making the actual data available to the auditors.
Data Preprocessing & Augmentation: These are the tools that clean and expand data so AI can learn better. ZKPs can verify the data was properly prepared without revealing personal or sensitive information.
Specifically, ZKPs can prove that certain data cleansing or augmentation steps were performed correctly without exposing the raw data, like verifying the anonymization of a dataset without revealing the identities within it.
ZKPs enable the verification of how data is used and processed without exposing the underlying data.
4. Development & Deployment
Jupyter Notebooks, MLflow, and TensorBoard: These are like digital labs where AI is experimented with and developed. ZKPs can keep the experiments secure, letting others check the results without accessing the underlying data or techniques, similar to a sealed test-taking environment.
By using ZKPs, these tools can provide verifiable and tamper-proof metrics for model performance and experiment outcomes, which is essential when results need to be shared with external stakeholders without revealing proprietary information.
Kubernetes, Docker, & Cloud Services: Think of these as the methods to get the AI out to users, much like publishing a book. ZKPs can ensure the AI hasn't been tampered with from the time it was created to when the user interacts with it, like a safety seal on a food product.
ZKPs can verify the integrity and identity of containers and services in the deployment pipeline, ensuring that the model being served is the one that was intended, and no unauthorized changes have been made.
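The public statement such a proof is anchored to is typically a digest of the model artifact. The sketch below shows only that anchoring step, using an ordinary SHA-256 hash rather than a ZKP: a deployment check compares the served artifact against a digest pinned at build time. The function names are hypothetical, and how the digest is pinned and distributed is an assumption left to the pipeline.

```python
import hashlib
import hmac


def artifact_digest(artifact: bytes) -> str:
    """SHA-256 digest of a model artifact (weights file, container layer, etc.)."""
    return hashlib.sha256(artifact).hexdigest()


def is_intended_model(served_artifact: bytes, pinned_digest: str) -> bool:
    """True if the artifact being served matches the digest pinned at build time."""
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(artifact_digest(served_artifact), pinned_digest)


if __name__ == "__main__":
    pinned = artifact_digest(b"model weights v1")       # recorded when the model is built
    print(is_intended_model(b"model weights v1", pinned))
    print(is_intended_model(b"model weights v1-modified", pinned))
```

A ZKP-based pipeline would add a proof that the outputs users receive were computed by the model with this digest, without disclosing the weights themselves.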
5. Monitoring, Maintenance, and Compliance
Logging, Monitoring, Model Versioning, and CI/CD Tools: These are the systems that keep the AI running smoothly, like a car's dashboard and maintenance logs. ZKPs can create secure logs that show the AI is functioning correctly without giving away proprietary information.
Specifically, ZKPs can create non-repudiable logs that verify an action was performed at a certain time by a certain entity, without revealing the action's specifics, which is vital in regulated industries.
ZKPs can verify system status and model version integrity without revealing underlying configurations, crucial for secure, compliant, and auditable system maintenance.
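One way to picture such a log is a hash chain whose entries record who acted and when, but store only a salted commitment to what was done. The sketch below is a simplified stand-in under stated assumptions: it uses hash commitments instead of ZKPs, it provides tamper evidence but not signatures (so true non-repudiation would need a signing step on top), and the field names are hypothetical.

```python
import hashlib
import json
import secrets
from typing import Dict, List

GENESIS = "0" * 64
FIELDS = ("entity", "timestamp", "commitment", "prev")


def _entry_hash(body: Dict[str, str]) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def _commit(salt: bytes, action: str) -> str:
    # The salted hash hides the action but binds the logger to it.
    return hashlib.sha256(salt + action.encode()).hexdigest()


def append_entry(log: List[Dict[str, str]], entity: str, timestamp: str, action: str) -> bytes:
    """Append an entry that hides `action`; returns the salt needed to open it later."""
    salt = secrets.token_bytes(16)
    body = {
        "entity": entity,
        "timestamp": timestamp,
        "commitment": _commit(salt, action),
        "prev": log[-1]["entry_hash"] if log else GENESIS,
    }
    log.append({**body, "entry_hash": _entry_hash(body)})
    return salt


def verify_chain(log: List[Dict[str, str]]) -> bool:
    """Check that no entry was altered, removed, or reordered."""
    prev = GENESIS
    for entry in log:
        body = {k: entry[k] for k in FIELDS}
        if entry["prev"] != prev or _entry_hash(body) != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True


def open_commitment(entry: Dict[str, str], salt: bytes, action: str) -> bool:
    """Selectively disclose one action to an auditor who holds the log."""
    return _commit(salt, action) == entry["commitment"]
```

An auditor can run verify_chain on the whole log without learning any action, and the operator can later open individual entries on request. Replacing the opening step with a ZKP would let the operator prove a property of the action (for example, that it was an approved operation) without disclosing it at all.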
Data Encryption and Access Control Systems: This is like the security system for your digital house. ZKPs add another layer, letting you prove who you are without giving away your password or key.
ZKPs enhance these elements by allowing systems to confirm that a user has the right to access data or perform an action without needing to see the user's credentials or the data itself, which is critical in maintaining stringent data privacy.
ZKPs can demonstrate compliance and support access control by privately validating claims about data, helping systems meet security standards and regulations.
6. Support and Education
Online Communities, Vendor Support: ZKPs can facilitate secure sharing of troubleshooting data within communities or with vendors without revealing sensitive information, fostering a secure collaborative environment.
Online Courses, Conferences & Workshops: ZKPs can be used in training platforms to verify learner submissions without revealing the solutions, supporting secure, fair educational environments.
******
ZKPs have the ability to ensure that AI is secure, compliant, and privacy preserving at every step and in every layer. Get in touch to learn more about the ZKP tools for AI that ICME is building for this purpose.