zkGPU: On the State of the Art for Verifiable Compute of GPU Execution
*This post is about verifying computations run on GPUs, not about using GPUs to accelerate ZKPs!
GPUs are having a long moment in the spotlight. Often celebrated for their prowess in rendering lifelike graphics, GPUs now sit at the heart of some of the most demanding computational tasks in modern technology. From powering cutting-edge AI models to accelerating cryptographic operations, their role extends far beyond gaming and visual effects. They drive innovations in fields like scientific simulation, cryptocurrency mining, and blockchain systems such as Ethereum, where they have powered proof-of-work mining and other compute-heavy processes.
The reason for this ubiquity lies in their unparalleled, pun intended, ability to process massive volumes of data in parallel, making them indispensable for high-performance computing tasks. Yet, with this surge in GPU usage comes a crucial question: how do we verify that GPU computations are accurate?
For example, training a deep learning model on a GPU cluster involves intensive and opaque computations. How can we ensure the model was trained correctly without rerunning the entire process? Similarly, as on-device AI and small language models gain traction, verifying their execution remotely — while preserving privacy — presents a significant challenge.
Take Apple's approach, for instance. They have multiple systems in place to help preserve privacy in Apple Intelligence and iPhone features. Their setup allows for on-device processing for inference, with fully homomorphic encryption (FHE)-backed Private Information Retrieval (PIR) on cloud servers. The shortfall is that we cannot convince third parties about what we ran on our local device, we must trust that the FHE/PIR setup is correct, and, lastly, we must trust that the server hardware is running correctly. These are the kinds of problems where a zkGPU can help with verifiability and privacy in a mainstream setting.
Another example is large-scale scientific simulations for climate modeling or drug discovery, where the accuracy of GPU computations directly impacts decision-making that can cost millions of dollars — or even lives. How can we guarantee the integrity of these results without reproducing the simulation?
In blockchain and decentralized systems, verifiable compute is a cornerstone of trust. GPUs are used to generate cryptographic proofs, validate transactions, and even execute smart contracts. Errors or malicious manipulation of these computations can compromise the entire system. Similarly, in financial applications — like high-frequency trading or risk modeling — faulty or corrupted GPU calculations could lead to disastrous outcomes if left unchecked.
In this blog post, we will explore the cutting edge of verifiable compute for GPU execution, diving into the intersection of hardware, cryptography, and zero-knowledge proofs (ZKPs). We'll see how modern ZKPs are enabling a future where we can trust GPU computations without having to repeat them, opening up new possibilities for scalable, secure, and transparent systems.
Verifiable Compute with TEE (Trusted Execution Environments)
Trusted Execution Environments (TEEs) are specialized secure areas within a processor that ensure code and data loaded within them are protected in terms of confidentiality and integrity. They are a hardware-based solution for ensuring that computations were run correctly on the CPU. Recently the concept has been extended to the world of GPUs, with TEE-GPU variants from a few large hardware vendors, NVIDIA being a good example.
NVIDIA has advanced the field of verifiable GPU computation through its Confidential Computing technology, notably integrated into the H100 Tensor Core GPUs. This innovation ensures the confidentiality and integrity of data and algorithms during processing, addressing critical security concerns in AI and high-performance computing.
The Gist On How It Works:
The H100 GPUs incorporate hardware-based Trusted Execution Environments (TEEs), creating isolated regions within the GPU where sensitive computations occur securely. This isolation is anchored by an on-die hardware root of trust, establishing a secure boot process and enabling hardware protections for code and data. The system supports cryptographic attestation, allowing verification that computations were executed as intended without tampering. The TEE-GPU also needs to be connected to a TEE-CPU so that the full flow is attestable; this is what Intel and NVIDIA have done recently.
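To make the attestation flow concrete, here is a minimal, hypothetical sketch of the challenge-response handshake in Python. Everything below is illustrative: the function names are made up, and an HMAC stands in for the asymmetric signature that real hardware would produce with a key chained to a vendor root certificate (the actual NVIDIA/Intel flows go through vendor attestation reports and services).

```python
import hashlib, hmac, secrets

def device_side(nonce: bytes, device_key: bytes, code_image: bytes) -> dict:
    # The hardware root of trust measures the loaded code/firmware...
    measurement = hashlib.sha256(code_image).digest()
    # ...and binds it to the verifier's nonce. A real device would sign this
    # report with a key fused into the hardware; an HMAC stands in here.
    tag = hmac.new(device_key, measurement + nonce, hashlib.sha256).digest()
    return {"measurement": measurement, "nonce": nonce, "tag": tag}

def verifier_check(report: dict, nonce: bytes, device_key: bytes,
                   expected_measurement: bytes) -> bool:
    # Freshness: the report must answer *this* challenge, not a replayed one.
    fresh = hmac.compare_digest(report["nonce"], nonce)
    # Authenticity: only the genuine hardware key could have produced the tag.
    expected_tag = hmac.new(device_key, report["measurement"] + report["nonce"],
                            hashlib.sha256).digest()
    authentic = hmac.compare_digest(report["tag"], expected_tag)
    # Integrity: the measured code matches what we expected to run.
    untampered = hmac.compare_digest(report["measurement"], expected_measurement)
    return fresh and authentic and untampered

# Usage: the verifier issues a nonce, the device answers, the verifier checks.
device_key = secrets.token_bytes(32)
code_image = b"gpu firmware + enclave code"
nonce = secrets.token_bytes(32)
report = device_side(nonce, device_key, code_image)
assert verifier_check(report, nonce, device_key, hashlib.sha256(code_image).digest())
```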
Read more at NVIDIA Developer, as they have a lot of docs on this!
Cost Considerations:
The integration of Confidential Computing features adds to the cost of NVIDIA's H100 GPUs. As of early 2024, the H100 GPUs were priced between $25,000 and $30,000 each. Market demand and supply constraints have led to higher prices in secondary markets, with individual units selling for over $40,000. So these things are quite expensive and out of reach for many.
Verifiability:
Proofs from TEEs are normally not easy to verify in constrained settings. For example, DCAP-based attestations are usually too expensive to verify on Ethereum, so some teams (e.g. Phala) have used zero-knowledge proofs in a hybrid fashion to allow anyone to cheaply verify the attestation proofs from TEE systems.
People are also not keen on trusting large hardware vendors with mission-critical applications, where a single backdoor could cost millions.
Software-Based Attestation of GPUs
There have been a few approaches to software-based attestation of GPUs, such as SAGE (Software-based Attestation for GPU Execution). These methods are critical in situations where hardware-based options (TEE-GPUs) are unavailable or infeasible, though ironically the early work still relies on a TEE-CPU.
How Software-Based GPU Attestation Works
Software-based attestation mechanisms rely on creating a verification function (VF) that executes on the GPU.
This function:
- Computes a checksum of its own code and data, ensuring its integrity.
- Measures execution time to detect tampering or irregularities.
- Establishes a dynamic root of trust (RoT) between the verifier (e.g., running on the CPU) and the GPU.
The RoT is set up between a verifier (itself a TEE) and the GPU, which in the best case is a trade-off some are willing to take; others will not want to rely on the hardware-based attestation this requires. Another limitation is that we would need access to the specifications of the specific hardware in order to create an appropriate VF for SAGE.
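As a rough, toy illustration of the checksum-plus-timing idea (not SAGE's actual construction, which is a carefully hand-tuned GPU kernel with a known optimal runtime on specific hardware), a verifier-side sketch might look like this:

```python
import hashlib, os, time

def verification_function(vf_code_and_data: bytes, challenge: bytes) -> bytes:
    # Self-checksum: the challenge is mixed in so a tampered VF cannot simply
    # replay a precomputed answer. In SAGE this runs on the GPU itself.
    return hashlib.sha256(challenge + vf_code_and_data).digest()

def attest(vf_code_and_data: bytes, known_good_copy: bytes,
           max_runtime_s: float) -> bool:
    challenge = os.urandom(32)
    start = time.perf_counter()
    checksum = verification_function(vf_code_and_data, challenge)
    elapsed = time.perf_counter() - start
    # Timing check: an emulated, instrumented, or relocated VF would take
    # measurably longer than the genuine one on the expected hardware.
    if elapsed > max_runtime_s:
        return False
    # Integrity check: recompute the checksum over a known-good reference copy.
    return checksum == hashlib.sha256(challenge + known_good_copy).digest()

# Usage: attest the VF image against a reference copy with a generous time budget.
image = b"verification-function code and data"
print(attest(image, image, max_runtime_s=0.01))
```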
zkSAGE
For certain public programs we can achieve a purely software-based approach to verifiable GPU compute by mixing protocols from SAGE with our memory-efficient zero-knowledge virtual machine (NovaNet's zkEngine). With this, you can ensure that the program running on the CPU (without a TEE) was executed correctly by proving its public program in the zkVM; after the proof is generated, anyone can verify it succinctly.
The GPU portion is attested in the same way as before, but only for a limited set of programs for which we have accurate expectations of GPU performance and that will not blow up the constraint count in the zkVM.
*This is a good area for open-source collaboration. Reach out if interested!
The next generation: zkGPU
Most zero-knowledge virtual machines today are based on widely used ISAs (Instruction Set Architectures) like RISC-V or WASM, because their execution traces, often represented as sequences of opcodes, can readily be turned into zk-circuits and lookup arguments. In the land of GPUs there is still no widely accepted intermediate representation. Each vendor, whether NVIDIA, AMD, or Intel, has its own proprietary instruction sets, memory hierarchies, and optimizations. While there are intermediate representations (IRs) like SPIR-V or PTX, translating these into a universal ZKP-friendly format remains a daunting task. The absence of a widely accepted, vendor-neutral IR for GPU execution complicates efforts to standardize a zkGPU framework around a common IR.
GPUs achieve their computational prowess through massive parallelism, often running thousands of threads simultaneously. While this is great for performance, it poses significant challenges for ZKP systems, which must trace and verify each thread's execution. Capturing the intricate dependencies and synchronization patterns between threads in a succinct, zk-friendly format is non-trivial.
In spite of these challenges, I think it is becoming more practical to develop a system for a zero-knowledge-proof-based zkGPU. To start, we would capture the execution trace and memory trace from the GPU and run them through our VM (whatever ISA or IR that ends up targeting). This could be done either on the CPU or inline on the GPU; the latter has far less tooling and fewer libraries, but could piggyback on GPU acceleration libraries for ZKP.
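To sketch the shape of that first step, here is a toy, purely hypothetical trace-replay example: a recorded list of opcodes plus a memory log, re-executed step by step. In a real zkGPU each step would be arithmetized into constraints and the memory log handled by an offline memory-checking argument rather than replayed in plain Python.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    op: str        # e.g. "add", "load", "store"
    args: tuple

@dataclass
class VM:
    regs: dict = field(default_factory=dict)
    mem: dict = field(default_factory=dict)

    def execute(self, step: Step):
        if step.op == "add":
            dst, a, b = step.args
            self.regs[dst] = self.regs.get(a, 0) + self.regs.get(b, 0)
        elif step.op == "store":
            addr, src = step.args
            self.mem[addr] = self.regs.get(src, 0)
        elif step.op == "load":
            dst, addr = step.args
            self.regs[dst] = self.mem.get(addr, 0)
        else:
            raise ValueError(f"unsupported opcode {step.op}")

# A captured (illustrative) trace: in a zkGPU, each step would become one
# folded circuit instance and each load/store one entry in a memory argument.
trace = [Step("add", ("r1", "r0", "r0")),
         Step("store", (0x10, "r1")),
         Step("load", ("r2", 0x10))]
vm = VM(regs={"r0": 3})
for step in trace:
    vm.execute(step)
assert vm.regs["r2"] == 6   # replay reproduces the claimed result
```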
We would likely need incrementally verifiable computation (IVC, via folding) and memory consistency checks that work with the parallel, multi-threaded nature of GPUs. Multiple IVC instances may run at once and be finalized against other instances, as in the recent Nebula paper, and these could later be folded into one instance with multi-folding techniques. In this way repetitive computations could be aggregated into a single proof for final verification.
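To give a flavour of why folding helps with this kind of aggregation, below is a toy random-linear-combination fold over many per-thread claims. Real folding schemes (Nova, HyperNova, and the Nebula approach mentioned above) fold committed relaxed R1CS/CCS instances and handle non-linear cross terms; this sketch only folds linear claims of the form A·z = b, which is enough to show the shape.

```python
import secrets

P = 2**61 - 1          # a toy prime field

def matvec(A, z):
    return [sum(a * x for a, x in zip(row, z)) % P for row in A]

def fold(claims):
    """Fold many claims (z_i, b_i) for the same matrix A into one claim
    (z*, b*) via powers of a random challenge r; the folded claim holds
    (with high probability) only if every individual claim does."""
    r = secrets.randbelow(P)
    z_acc = [0] * len(claims[0][0])
    b_acc = [0] * len(claims[0][1])
    coeff = 1
    for z, b in claims:
        z_acc = [(za + coeff * zi) % P for za, zi in zip(z_acc, z)]
        b_acc = [(ba + coeff * bi) % P for ba, bi in zip(b_acc, b)]
        coeff = (coeff * r) % P
    return z_acc, b_acc

# Example: 1000 "GPU threads" each claim one step of the same linear relation.
A = [[1, 2, 3], [0, 1, 4]]
claims = []
for _ in range(1000):
    z = [secrets.randbelow(P) for _ in range(3)]
    claims.append((z, matvec(A, z)))
z_star, b_star = fold(claims)
assert matvec(A, z_star) == b_star   # one check now stands in for 1000
```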
JOLT-like lookups would be essential as well: we would need to make the many floating-point operations work in ZK-land, and lookup tables are ideal for this as they avoid the constraint blowup of bit decompositions.
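As a toy illustration of the lookup idea (not any particular JOLT table layout), the snippet below materializes a multiplication table for a tiny made-up 1-4-3 minifloat format. In-circuit, the prover would supply a triple (a, b, c) and a lookup argument would check that it is a row of such a table; real systems chunk wider operands into smaller sub-tables rather than materializing anything close to 2^32 entries.

```python
def f8_to_float(x: int) -> float:
    """Interpret x as a toy 1-4-3 minifloat: 1 sign, 4 exponent, 3 mantissa bits."""
    sign = -1.0 if (x >> 7) & 1 else 1.0
    exp = (x >> 3) & 0xF
    man = x & 0x7
    if exp == 0:                                   # subnormal numbers
        return sign * (man / 8.0) * 2.0 ** (-6)
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)

# Precompute the multiplication table once: 65,536 rows for 8-bit operands.
mul_table = {
    (a, b): f8_to_float(a) * f8_to_float(b)
    for a in range(256) for b in range(256)
}

# The prover's claim "a * b = c" becomes a single table membership check,
# with no bit decomposition of the multiplication inside the circuit.
a, b = 0x45, 0x3A                  # 3.25 and 1.25 in the toy format
c = mul_table[(a, b)]
assert c == f8_to_float(a) * f8_to_float(b)
```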
The journey to a fully realized zkGPU is as ambitious as it is transformative. By bridging the gap between GPU execution and zero-knowledge proofs, zkGPU could revolutionize fields ranging from AI and scientific research to decentralized systems and financial modeling.
While challenges abound — proprietary architectures, massive parallelism, and high verification costs — ongoing innovations in IR standardization, zkVM design, and hybrid verification models offer promising avenues.
The development of zkGPU represents not just a technical milestone but a paradigm shift toward transparent, scalable, and verifiable computation. For developers, researchers, and visionaries, this frontier invites collaboration. If you're passionate about open-source innovation in zkGPU, let’s connect and shape the future of verifiable compute!