Making smart contracts smarter: A deep dive into the ZKML track

Proof of machine learning (ML) model inference via zkSNARKs is poised to be one of the most significant advances in smart contracts this decade. This development opens up an exciting design space, allowing applications and infrastructure to evolve into more complex and intelligent systems.

By adding ML capabilities, smart contracts can become more autonomous and dynamic, making decisions based on real-time on-chain data rather than static rules. Smart contracts will be flexible and adaptable to a variety of scenarios, including those that may not have been anticipated when the contract was first created. In short, ML capabilities will expand the automation, accuracy, efficiency, and flexibility of any smart contract we deploy on-chain.

ML is widely used in applications outside of web3 but has seen almost no adoption in smart contracts. This is mainly due to the high computational cost of running these models on-chain. For example, FastBERT, a computationally optimized language model, requires about 1,800 MFLOPs (millions of floating-point operations), far beyond what can be run directly on the EVM.

On-chain ML applications focus primarily on the inference stage: applying a model to real-world data to make predictions. For smart contracts to operate at ML scale, they must be able to ingest such predictions, but as mentioned above, running models directly on the EVM is not feasible. zkSNARKs provide a solution: anyone can run a model off-chain and generate a succinct, verifiable proof that the expected model did indeed produce a specific result. This proof can be published on-chain and ingested by smart contracts to enhance their intelligence.
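To make this flow concrete, here is a minimal Python sketch of the pattern just described. Everything in it is a deliberately simplified stand-in (the "model" is a single linear neuron and the "proof" is a mock object); it only shows where off-chain proving and on-chain verification sit in the pipeline, while a real system would use an actual zkSNARK prover like those covered later in this article.

```python
def run_model(weights, x):
    # Stand-in model: a single linear neuron. Real systems run a full network.
    return sum(w * xi for w, xi in zip(weights, x))

def prove(weights, x, y):
    # Placeholder for a zkSNARK prover. This mock is neither succinct nor
    # zero-knowledge; it only marks where the proof would be produced.
    return {"claim": (x, y), "witness_ok": run_model(weights, x) == y}

def contract_verify(proof):
    # On-chain side: the contract checks a cheap proof instead of
    # re-running the model on the EVM.
    return proof["witness_ok"]

weights, x = [0.5, -1.0, 2.0], [1.0, 2.0, 3.0]
y = run_model(weights, x)          # 1. run the model off-chain
proof = prove(weights, x, y)       # 2. generate a verifiable proof
assert contract_verify(proof)      # 3. a smart contract ingests the proof
```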

In this article, we will:

  • Explore potential applications and use cases for on-chain ML
  • Examine the emerging projects and infrastructure building around zkML core
  • Discuss some of the challenges with existing implementations and what the future of zkML might look like

Introduction to Machine Learning (ML)

Machine learning (ML) is a subfield of artificial intelligence (AI) that focuses on developing algorithms and statistical models that enable computers to learn from data and make predictions or decisions. ML models typically have three main components:

  • Training data: a set of input data used to train a machine learning algorithm for prediction or classification of new data. Training data can take various forms such as images, text, audio, numeric data, or combinations thereof.
  • Model architecture: the overall structure or design of a machine learning model. It defines the hierarchical structure, activation functions, and types and number of connections between nodes or neurons. The choice of architecture depends on the specific problem and data being used.
  • Model parameters: the values or weights learned by the model during the training process and used for prediction. These values are iteratively adjusted through optimization algorithms to minimize the error between predicted and actual results.

Model generation and deployment are divided into two stages:

  • Training stage: in the training stage, the model is exposed to a labeled dataset and its parameters are adjusted to minimize the error between predicted and actual results. The training process typically involves multiple iterations or epochs, and the model’s accuracy is evaluated on a separate validation set.
  • Inference stage: the inference stage is the phase where a trained machine learning model is used to predict new unseen data. The model receives input data and applies learned parameters to generate output such as classification or regression predictions.

Currently, zkML primarily focuses on the inference stage of machine learning models rather than the training stage, mainly due to the computational complexity of training circuits. However, zkML’s focus on inference is not a limitation: we anticipate some very interesting use cases and applications.

Verified Inference Scenarios

There are four possible scenarios for verified inference:

  • Private input, public model. A model consumer (MC) may want to keep their inputs confidential from the model provider (MP). For example, MC may want to prove credit score model results to a lender without disclosing their personal financial information. This can be accomplished using pre-commitment schemes and running the model locally.
  • Public input, private model. A common issue with ML-as-a-Service is that the MP may want to hide its parameters or weights to protect its IP, while the MC wants to verify that a given inference really comes from the specified model in an adversarial setting. Think of it this way: the MP has an incentive to run a lighter-weight model to save costs when serving inferences to the MC. Using on-chain commitments to the model weights, the MC can audit the private model at any time (see the sketch after this list).
  • Private input, private model. This arises when the data used for inference is highly sensitive or confidential and the model itself is hidden to protect IP. An example of this might include using private patient information to audit healthcare models. Combinatorial techniques in zk or variants of multi-party computation (MPC) or FHE can be used to serve this scenario.
  • Public input, public model. zkML addresses a different use case when all aspects of the model can be made public: compressing and verifying off-chain computation to an on-chain environment. For larger models, a succinct zk proof of verified inference is more cost effective than re-running the model oneself.
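As a concrete illustration of the weight-commitment idea in the public-input/private-model scenario, the sketch below uses a salted SHA-256 hash as a stand-in for whatever commitment scheme a real protocol would choose; a production system would more likely use a polynomial or Merkle commitment that the zk proof can open inside the circuit.

```python
import hashlib
import json
import os

def commit_weights(weights, salt):
    # Hiding commitment: the digest reveals nothing about the weights
    # without the salt, but binds the provider to exactly these values.
    payload = json.dumps(weights).encode() + salt
    return hashlib.sha256(payload).hexdigest()

# Model provider (MP) commits on-chain before serving any inferences.
weights = [0.12, -0.7, 3.4]
salt = os.urandom(16)
onchain_commitment = commit_weights(weights, salt)

# Later, during an audit, MP opens the commitment (or, better, proves in zk
# that the committed weights were used), so the model consumer (MC) knows
# no lighter-weight model was silently swapped in.
assert commit_weights(weights, salt) == onchain_commitment
```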

Verified ML inference opens up new design space for smart contracts. Some crypto-native applications include:

1. DeFi

Verifiable off-chain ML oracles. The growing use of generative AI may push the industry toward signature schemes for content (e.g., news publications signing their articles or images). Signed data is then ready for zero-knowledge proofs, making it composable and trustworthy. ML models can process this signed data off-chain to make predictions and classifications (e.g., categorizing election outcomes or weather events). These off-chain ML oracles could trustlessly settle real-world prediction markets, insurance protocol contracts, and more by verifying the inference and publishing the proof on-chain.
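A minimal sketch of the signing half of that pipeline, using Ed25519 from Python's `cryptography` package; the choice of signature scheme is an assumption (the article does not prescribe one), and the downstream zk proof over the authenticated data is left out.

```python
# Requires: pip install cryptography
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# A publisher signs an article once at publication time.
publisher_key = Ed25519PrivateKey.generate()
article = b"Storm system expected to make landfall Tuesday"
signature = publisher_key.sign(article)

# An off-chain ML oracle checks provenance before classifying the data;
# the zk proof of inference would then be generated over this
# authenticated input and posted on-chain.
publisher_key.public_key().verify(signature, article)  # raises if forged
print("signature valid; data is safe to feed to the model")
```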

DeFi apps based on ML parameters. Many aspects of DeFi can be more automated. For example, lending protocols can use ML models to update parameters in real-time. Currently, lending protocols rely mainly on off-chain models run by organizations to determine collateral factors, loan-to-value ratios, liquidation thresholds, etc., but a better option may be community-trained open-source models that anyone can run and verify.

Automated trading strategies. A common way to show the return characteristics of a financial model strategy is for the MP to provide various backtesting data to investors. However, there is no way to verify that a strategist is following the model when executing trades – investors must trust that the strategist is indeed following the model. zkML provides a solution where the MP can provide a proof of financial model inference when deployed to a specific position. This may be particularly useful for DeFi-managed insurance vaults.

2. Security

Fraud monitoring for smart contracts. Instead of letting slow human governance or centralized participants control the ability to pause a contract, ML models can be used to detect potential malicious behavior and pause the contract.

3. Traditional ML

Decentralized, trustless Kaggle implementations. A protocol or market can be created that allows MC or other interested parties to verify the accuracy of a model without the MP disclosing the model’s weights. This is useful for selling models, competitions around model accuracy, etc.

Decentralized prompt markets for generative AI. Prompt engineering for generative AI has become a sophisticated craft, with the best prompts often stacking multiple modifiers. External parties may be willing to buy these complex prompts from their creators. zkML can help in two ways: 1) verifying a prompt's output, reassuring potential buyers that the prompt really produces the desired image; and 2) allowing a prompt owner to retain ownership after purchase, keeping the prompt opaque to the buyer while still generating verified images for them.

4. Identity

Replace private keys with privacy-preserving biometric authentication. Private key management remains one of the biggest barriers to the web3 user experience. Abstracting private keys behind facial recognition or other unique factors is one possible zkML solution.

Fair airdrops and contributor rewards. ML models can build detailed user personas to determine airdrop allocations or contributor rewards based on multiple factors. This could be particularly useful when combined with identity solutions. In this case, one possibility is to have users run an open-source model that assesses their in-app activity as well as higher-level participation, such as governance forum posts, to infer their allocation, then submit the proof to the contract to receive the corresponding token allocation.

5. Web3 Social

Filtering for web3 social media. The decentralized nature of web3 social apps will lead to more spam and malicious content. Ideally, a social media platform could use an open-source ML model agreed upon by community consensus and publish a proof of the model's inference whenever it chooses to filter a post. Case study: zkML analysis of the Twitter algorithm.

Advertising/recommendations. As a social media user, I may be willing to see personalized ads but want to keep my preferences and interests private from advertisers. I could choose to run a model over my interests locally and feed it into the media app so that it serves content to me. In this scenario, advertisers might be willing to pay end-users for this, though such models would likely be far less sophisticated than the targeted-ad models in production today.

6. Creator Economy/Game

Game economy rebalancing. ML models can be used to dynamically adjust token issuance, supply, burn, voting thresholds, etc. One possible model is an incentive contract that rebalances the in-game economy if a certain rebalancing threshold is met and proof of inference is verified.

New types of on-chain games. Cooperative human-vs-AI games and other innovative on-chain games can be created where trust-minimized AI models act as non-playable characters. Each action an NPC takes is published with a proof that anyone can verify to confirm the correct model is running. In Modulus Labs' Leela vs. the World, verifiers want to be sure the stated 1900-ELO AI is choosing the chess moves, not Magnus Carlsen. Another example is AI Arena, a Super Smash Bros.-style AI fighting game: in high-stakes competitive environments, players want assurance that the models they trained aren't tampered with and can't cheat.

Emerging Projects and Infrastructure

The zkML ecosystem can be broadly divided into four main categories:

  • Model-to-proof compilers: infrastructure that compiles models in existing formats (PyTorch, ONNX, etc.) into verifiable computation circuits.
  • General-purpose proof systems: proof systems built to verify arbitrary computation traces.
  • zkML-specific proof systems: proof systems built specifically to verify the computation traces of ML models.
  • Applications: projects dedicated to unique zkML use cases.

01. Model-to-Proof Compilers

In the zkML ecosystem, most of the focus is on creating model-to-proof compilers. Typically, these compilers take high-level ML models written in PyTorch, TensorFlow, etc. and transform them into zk circuits.

EZKL is a library and command-line tool for performing inference over deep learning models in zk-SNARKs. With EZKL, you can define a computation graph in PyTorch or TensorFlow, export it as an ONNX file along with some example inputs in a JSON file, and then point EZKL at these files to generate a zkSNARK circuit. With the latest round of performance improvements, EZKL can now prove an MNIST-sized model in about 6 seconds using 1.1 GB of RAM. EZKL has already seen significant early adoption, serving as infrastructure for various hackathon projects.
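A minimal sketch of the export step EZKL consumes is below. The model is a toy, and the exact input-JSON schema and the subsequent `ezkl` CLI or Python-binding invocations vary by version, so treat the file layout as an assumption.

```python
import json
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return torch.relu(self.fc(x))

model = TinyNet().eval()
dummy = torch.randn(1, 4)

# 1. Export the computation graph to ONNX.
torch.onnx.export(model, dummy, "network.onnx")

# 2. Provide example inputs as JSON (key name is version-dependent).
with open("input.json", "w") as f:
    json.dump({"input_data": dummy.detach().numpy().tolist()}, f)

# 3. Point EZKL at network.onnx + input.json to generate, prove, and
#    verify a zkSNARK circuit for this model.
```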

Cathie So’s circomlib-ml library contains various ML circuit templates for Circom. The circuits include some of the most common ML functions. Keras2circom, developed by Cathie, is a Python tool that converts Keras models to Circom circuits using the underlying circomlib-ml library.

LinearA has developed two frameworks for zkML: Tachikoma and Uchikoma. Tachikoma converts neural networks into an integer-only form and generates the computation trace. Uchikoma converts TVM's intermediate representation into programming languages that lack floating-point support. LinearA plans to support Circom with field arithmetic and Solidity with signed and unsigned integer arithmetic.

Daniel Kang’s zkml is a framework for proving ML model execution, based on his work in the paper “Scaling up Trustless DNN Inference with Zero-Knowledge Proofs”. At the time of writing, it can prove an MNIST circuit using about 5 GB of memory and around 16 seconds of runtime.

In the broader model-to-proof compiler space, there are Nil Foundation and Risc Zero. Nil Foundation’s zkLLVM is an LLVM-based circuit compiler that can verify computational models written in popular programming languages such as C++, Rust, and JavaScript/TypeScript. It is general-purpose infrastructure compared with the other model-to-proof compilers mentioned here, but it is still well suited to complex computations such as zkML. Combined with their proof marketplace, this can be especially powerful.

Risc Zero builds a general-purpose zkVM targeting the open RISC-V instruction set, and therefore supports existing mature languages such as C++ and Rust, as well as the LLVM toolchain. This allows seamless integration between host and guest zkVM code, similar to Nvidia’s CUDA C++ toolchain, but with a ZKP engine in place of GPUs. Like Nil Foundation’s tooling, Risc Zero can be used to verify the computation traces of ML models.

02. General-purpose Proof Systems

Improvements to proof systems have been the main driver of zkML’s progress, particularly the introduction of custom gates and lookup tables. This is mainly because ML depends on non-linearities. In short, non-linearity is introduced through activation functions (such as ReLU, sigmoid, and tanh) applied to the outputs of linear transformations in a neural network. Because zk circuits are limited to arithmetic gates, these non-linearities are challenging to implement. Bitwise decomposition and lookup tables help by precomputing the possible results of a non-linearity into a table, which, interestingly, turns out to be much more efficient in zk.
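To see why lookups help, consider a toy fixed-point ReLU: rather than building the comparison out of arithmetic gates, the circuit simply checks that each (input, output) pair appears in a precomputed table. Here is a minimal Python sketch of that table; the 8-bit signed range is an arbitrary choice for illustration.

```python
# Precompute a lookup table for ReLU over an 8-bit signed fixed-point range.
# In a Plonkish circuit, proving y = relu(x) then reduces to one table
# lookup instead of a chain of comparison/selection gates.
TABLE = {x: max(x, 0) for x in range(-128, 128)}

def relu_via_lookup(x: int) -> int:
    # Circuit-level analogue: assert that (x, y) is a row of TABLE.
    return TABLE[x]

assert relu_via_lookup(-5) == 0
assert relu_via_lookup(17) == 17
```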

For this reason, Plonkish proof systems tend to be the most popular backends for zkML. Halo2 and Plonky2, with their Plonkish arithmetization, can handle neural-network non-linearities well through lookup arguments. In addition, Halo2 has a vibrant developer-tooling ecosystem and great flexibility, making it the practical backend for many projects, including EZKL.

Other proof systems have their own advantages. R1CS-based proof systems include Groth16, known for its small proof sizes, and Gemini, known for handling very large circuits with a linear-time prover. STARK-based systems, such as the Winterfell prover/verifier library, are particularly useful for handling the complex, circuit-unfriendly operations found in advanced ML models; this is especially true when using Giza’s tooling, which takes the trace of a Cairo program as input and uses Winterfell to generate a STARK proof of the output’s correctness.

03. zkML-Specific Proof Systems

Some progress has been made in designing efficient proof systems that can handle the complex, circuit-unfriendly operations of advanced ML models. Systems such as zkCNN, based on the GKR proof system, and Zator, based on recursive composition, often outperform general-purpose proof systems, as shown in Modulus Labs’ benchmark report.

zkCNN is a method for proving the correctness of convolutional neural networks with zero-knowledge proofs. It uses sumcheck protocols to prove fast Fourier transforms and convolutions with a prover time that is linear in the circuit size, asymptotically faster than computing the result directly. It also introduces several improved and generalized interactive proofs, covering convolutional layers, ReLU activations, and max pooling. According to Modulus Labs’ benchmark report, zkCNN stands out for beating other general-purpose proof systems on both proof generation speed and RAM consumption.
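The FFT connection zkCNN exploits is the standard convolution theorem: convolution in the input domain equals pointwise multiplication in the Fourier domain, which sumcheck-style protocols can verify cheaply. A quick numpy check of the identity (using circular convolution for simplicity):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
w = np.array([0.5, -1.0, 0.0, 2.0])

# Circular convolution computed directly...
direct = np.array(
    [sum(x[k] * w[(n - k) % 4] for k in range(4)) for n in range(4)]
)

# ...equals pointwise multiplication in the Fourier domain.
via_fft = np.fft.ifft(np.fft.fft(x) * np.fft.fft(w)).real

assert np.allclose(direct, via_fft)
```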

Zator is a project exploring the use of recursive SNARKs to verify deep neural networks. The limiting factor in verifying deeper models today is fitting the entire computation trace into a single circuit; Zator proposes verifying one layer at a time with recursive SNARKs, incrementally verifying N steps of repeated computation. It uses Nova to fold N instances of the computation into a single instance that can be verified in one step. With this approach, Zator was able to SNARK a network with 512 layers, which is as deep as many production AI models today. Zator’s proof generation and verification times are still too long for mainstream use cases, but its composition technique is nonetheless very interesting.
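Conceptually, the recursion verifies one layer per step while folding each step into a single running instance. The sketch below imitates only that control flow, with a running hash standing in for Nova's folded instance; it is an analogy for the shape of the computation, not an implementation of folding.

```python
import hashlib

def layer(state: bytes, i: int) -> bytes:
    # Stand-in for one neural-network layer's computation.
    return hashlib.sha256(state + i.to_bytes(4, "big")).digest()

def fold_step(accumulator: bytes, state: bytes) -> bytes:
    # Stand-in for folding the i-th step into the running instance;
    # Nova folds relaxed R1CS instances here, not hash digests.
    return hashlib.sha256(accumulator + state).digest()

state, acc = b"input", b"genesis"
for i in range(512):                 # 512 layers, as in Zator's experiment
    state = layer(state, i)
    acc = fold_step(acc, state)
# A verifier checks only the final folded instance, not all 512 steps.
print(acc.hex())
```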

04. Applications

Given that zkML is in its early stages, its focus is primarily on the infrastructure listed above. However, there are also some projects currently dedicated to application development.

Modulus Labs is one of the most diverse projects in the zkML field, working on both example applications and related research. On the application side, Modulus Labs has demonstrated zkML use cases with RockyBot (an on-chain trading bot) and Leela vs. the World (a chess game in which humans play against a verified, on-chain Leela chess engine). The team has also branched into research, writing “The Cost of Intelligence”, which benchmarks the speed and efficiency of various proof systems across model sizes.

Worldcoin is attempting to use zkML to build a privacy-preserving proof-of-personhood protocol. Worldcoin uses custom hardware to process high-resolution iris scans, which are inserted into its Semaphore implementation. Useful operations such as membership proofs and voting can then be performed with the system. It currently uses trusted execution environments with secure enclaves to verify camera-signed iris scans, but the ultimate goal is to use zero-knowledge proofs to attest to the correct inference of the neural network, providing cryptographic-level security guarantees.

Giza is a protocol for deploying AI models on-chain in a fully trustless way. Its technology stack includes the ONNX format for representing ML models, the Giza Transpiler for converting these models into the Cairo program format, the ONNX Cairo Runtime for executing the models in a verifiable and deterministic way, and the Giza Model smart contract for deploying and executing models on-chain. While Giza could also be classified as a model-to-proof compiler, its positioning as a marketplace for ML models makes it one of the most interesting applications today.

Gensyn is a distributed hardware supply network for training ML models. Specifically, they are developing a gradient-based probabilistic auditing system and using model checkpoints so that a distributed GPU network can provide training services for full-scale models. While their zkML application is very specific to their use case (ensuring the integrity of model updates when a node downloads and trains a slice of the model), it demonstrates the powerful combination of zk and ML.

ZKaptcha focuses on the bot problem in web3, providing captcha services for smart contracts. Its current implementation generates a proof of human work when an end user completes the captcha; the proof is verified by its on-chain validator and made accessible to smart contracts with a few lines of code. Today it relies mainly on zk, but it plans to implement zkML in the future, much like web2 captcha services that analyze mouse movements and other behaviors to determine whether a user is human.

Given how early the zkML market is, many applications have only been tested at the hackathon level. Projects include AI Coliseum, an on-chain AI competition that uses ZK proofs to verify machine learning outputs; Hunter z Hunter, a photo scavenger hunt that uses the EZKL library to verify the outputs of an image classification model with halo2 circuits; and zk Section 9, which converts AI image generation models into circuits for minting and verifying AI art.

Challenges for zkML

Despite rapid progress in improving and optimizing, the zkML field still faces some core challenges. These challenges involve both technical and practical aspects, including:

  • Quantization with minimal precision loss
  • Circuit size, especially when networks consist of multiple layers
  • Efficient proof of matrix multiplication
  • Adversarial attacks

Quantization is the process of representing floating-point numbers, which most ML models use for their parameters and activations, as fixed-point numbers, which zk circuits require. The impact of quantization on model accuracy depends on the precision level used. In general, lower precision (i.e., fewer bits) reduces accuracy by introducing rounding and approximation errors. However, several techniques can minimize this impact, such as fine-tuning the model after quantization and quantization-aware training. In addition, Zero Gravity, a hackathon project at zkSummit 9, showed that weightless neural networks, an alternative architecture developed for edge devices, can sidestep quantization issues in circuits entirely.
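A minimal sketch of symmetric fixed-point quantization in numpy (the 8-bit width is an arbitrary choice), showing where the rounding error discussed above comes from:

```python
import numpy as np

def quantize(x: np.ndarray, bits: int = 8):
    # Symmetric fixed-point: map floats onto integers in
    # [-(2^(b-1) - 1), 2^(b-1) - 1].
    scale = (2 ** (bits - 1) - 1) / np.abs(x).max()
    q = np.round(x * scale).astype(np.int32)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q / scale

weights = np.array([0.021, -0.73, 0.402, 1.28])
q, scale = quantize(weights)
error = np.abs(weights - dequantize(q, scale)).max()
print(q, f"max rounding error = {error:.5f}")  # precision lost to rounding
```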

Beyond quantization, hardware is another key challenge. Once an ML model is correctly represented as a circuit, verifying proofs of its inference becomes cheap and fast thanks to zk’s succinctness. The challenge lies not with the verifier but with the prover: RAM consumption and proof generation time grow rapidly with model size. Certain proof systems (such as GKR-based systems using the sumcheck protocol and layered arithmetic circuits) and composition techniques (such as pairing Plonky2, which proves quickly but produces large proofs for big models, with Groth16, whose proof size does not grow with model complexity) are better suited to these constraints, but managing the trade-offs is a core challenge for zkML projects.

On adversarial attacks, there is still work to be done. First, if a trustless protocol or DAO chooses to deploy a model, there remains a risk of adversarial attacks during the training phase (e.g., training a model to exhibit specific behavior when it sees specific inputs, which could be used to manipulate later inferences). Federated learning techniques, and zkML applied to the training phase, may be ways to shrink this attack surface.

Another core challenge is the risk of model theft when a model is kept private. While a model’s weights can be obfuscated, it is theoretically possible to reverse-engineer them given enough input-output pairs. This is mainly a risk for small models, but the risk exists nonetheless.
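For a purely linear model, the extraction risk is just linear algebra: with enough (input, output) pairs, least squares recovers the hidden weights exactly, as the numpy illustration below shows. Real networks are non-linear, which makes extraction harder but, with enough queries, still feasible in principle.

```python
import numpy as np

rng = np.random.default_rng(0)
W_secret = rng.normal(size=(3, 5))   # "private" weights of a linear model

# An attacker queries the model on enough inputs and records the outputs.
X = rng.normal(size=(100, 5))
Y = X @ W_secret.T

# Least squares recovers the weights from the (input, output) pairs alone.
W_stolen = np.linalg.lstsq(X, Y, rcond=None)[0].T
assert np.allclose(W_stolen, W_secret)
```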

Scalability of Smart Contracts

Although there are challenges in optimizing these models to run under zk’s constraints, improvements are compounding quickly, and some expect that with further hardware acceleration we will soon catch up with the broader machine learning field. To highlight the pace: in 2021, 0xPARC demonstrated how to execute a small-scale MNIST image classification model in a verifiable circuit, and less than a year later, Daniel Kang did the same for ImageNet-scale models in his paper. In April 2023, the accuracy of this ImageNet-scale model was improved from 79% to 92%, and large models like GPT-2 are expected to become feasible in the near future, though proof times today remain long.

We believe zkML is a rich and evolving ecosystem aimed at expanding the capabilities of blockchain and smart contracts to make them more flexible, adaptive, and intelligent.

Although zkML is still in its early development stage, it has already begun to show promising results. As technology matures and develops, we can expect to see more innovative zkML use cases on-chain.
