Zero-Knowledge Machine Learning (ZKML): Combining the two hottest trends in technology, what kind of potential will it unleash?

Author | Callum@Web3CN.Pro

ZK has been hot since 2022, and the technology has made great progress. At the same time, with the popularity of Machine Learning (ML), many companies have begun to build, train, and deploy machine learning models. However, a major problem facing machine learning is how to establish trust in models and data that remain opaque. This is where ZKML comes in: it allows people using machine learning to verify what a model has done without revealing information about the model itself.

1. What is ZKML

To understand what ZKML is, we can break it down into two parts. ZK (Zero-Knowledge Proof) is a cryptographic protocol in which a prover can convince a verifier that a given statement is true without revealing any other information. In other words, the verifier can accept the result without seeing how it was obtained.

ZK has two main features: first, it proves a statement while revealing nothing to the verifier beyond the fact that the statement is true; second, generating a proof is computationally expensive, but verifying one is cheap.
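To make the first feature concrete, here is a minimal sketch of one classic interactive proof of knowledge, the Schnorr identification protocol, in which a prover shows that it knows a secret exponent x behind a public value y without sending x. The group parameters and function names below are assumptions chosen for illustration; the parameters are far too small to be secure, and ZKML systems rely on much more general proof systems than this.

```python
import secrets

# Toy parameters (illustrative only, not secure): G has prime order Q modulo P.
P, Q, G = 23, 11, 2

def commit():
    """Prover picks a random nonce r and sends t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)

def respond(secret_x, r, challenge):
    """Prover answers the verifier's challenge c with s = r + c*x mod Q."""
    return (r + challenge * secret_x) % Q

def verify(public_y, t, challenge, s):
    """Verifier checks G^s == t * y^c mod P without ever seeing x."""
    return pow(G, s, P) == (t * pow(public_y, challenge, P)) % P

secret_x = 7                       # known only to the prover
public_y = pow(G, secret_x, P)     # published value y = G^x mod P

r, t = commit()                    # step 1: prover -> verifier
challenge = secrets.randbelow(Q)   # step 2: verifier -> prover
s = respond(secret_x, r, challenge)  # step 3: prover -> verifier
print(verify(public_y, t, challenge, s))
```

The verifier only ever sees t, the challenge, and s. Because r is random and used once, s on its own does not expose x, yet the check only passes reliably for a prover who knows x.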

Based on these two features, ZK has developed several use cases: Layer 2 scaling, privacy-focused public chains, decentralized storage, identity authentication, and machine learning, among others. This article will focus on ZKML (Zero-Knowledge Machine Learning).

What is ML (Machine Learning)? Machine learning is a branch of artificial intelligence concerned with developing and applying algorithms that let computers learn from and adapt to data, improving their performance through an iterative process without being explicitly programmed. It uses algorithms and models to find patterns in data and fit model parameters, which are then used to make predictions or decisions.

Machine learning has already been applied successfully in various fields. As models become more sophisticated, they are asked to perform more and more tasks. To ensure that the results of these models can be trusted, ZK technology is needed: using public models to verify private data or using public data to verify private models.

The ZKML discussed in this article refers to zero-knowledge proofs of the inference step of ML models, not of model training.
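As a rough sketch of what "proving the inference step" means, the statement being proven can be written as a relation between public values (a commitment to the model, the input, the output) and a private witness (the model parameters). The code below only spells out that relation in plain Python. It is not a zero-knowledge proof: in a real system the check would be compiled into a circuit and the verifier would never see the weights. The tiny linear model, the SHA-256 commitment, and all names are assumptions for illustration.

```python
import hashlib

def commit(weights, bias):
    """Hash commitment to the model parameters (simplified: a real scheme
    would also mix in randomness so the commitment hides the parameters)."""
    data = ",".join(str(v) for v in [*weights, bias]).encode()
    return hashlib.sha256(data).hexdigest()

def infer(weights, bias, x):
    """A tiny integer linear classifier standing in for the ML model."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0

def relation_holds(commitment, x, y, weights, bias):
    """The statement a ZKML inference proof attests to:
    'the model behind this commitment maps input x to output y'."""
    return commit(weights, bias) == commitment and infer(weights, bias, x) == y

weights, bias = [3, -2, 1], -4       # private witness, held by the prover
commitment = commit(weights, bias)   # public
x = [2, 1, 5]                        # public input
y = infer(weights, bias, x)          # public output claimed by the prover

print(relation_holds(commitment, x, y, weights, bias))
```

Swapping which values are public and which are private gives the other setting mentioned above: a public model applied to private input data.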

2. Why do we need ZKML

With the advancement of artificial intelligence technology, it has become increasingly difficult to distinguish AI-generated content from human-generated content. Zero-knowledge proofs can help solve this problem, allowing us to determine whether specific content was generated by applying a specific model to a given input without revealing any other information about the model or input.

Traditional machine learning platforms often require developers to submit their own model architecture for performance validation on the host. This can lead to several problems:

  • Intellectual property loss: Publicizing the entire model architecture can expose valuable trade secrets or innovative technologies that developers wish to keep confidential.
  • Lack of transparency: The evaluation process may be opaque, and participants may be unable to verify how their models rank against other models.
  • Data privacy issues: Shared models trained on sensitive data may inadvertently leak information about the underlying data, violating privacy norms and regulations.

These challenges have spurred a need for solutions that can protect the privacy of machine learning models and their training data.

ZK offers a promising approach to addressing the challenges faced by traditional ML platforms. By harnessing the power of ZK, ZKML provides a privacy-preserving solution with the following advantages:

  • Model privacy: Developers can participate in validation without disclosing the entire model architecture, thereby protecting their intellectual property.
  • Transparent validation: ZK can verify model performance without leaking internal model details, thus promoting transparency and trustless evaluation processes.
  • Data privacy: ZK can be used to verify private data using public models or private models using public data, ensuring sensitive information is not leaked.

Integrating ZK into the ML process provides a secure and privacy-preserving platform that overcomes the limitations of traditional ML. This not only promotes the adoption of machine learning in privacy-sensitive industries but also attracts experienced Web2 developers to explore possibilities within the Web3 ecosystem.

3. ZKML Applications and Opportunities

As cryptography, zero-knowledge proof technology, and hardware facilities continue to mature, more and more projects are beginning to explore the use of ZKML. The ecosystem of ZKML can be roughly divided into the following four categories:

  • Model verification compilers: Infrastructure that compiles models from existing formats (e.g., PyTorch, ONNX) into verifiable computation circuits.
  • Generalized proof systems: Proof systems built to verify arbitrary computation traces.
  • ZKML-specific proof systems: Proof systems built specifically to verify the computation traces of ML models.
  • Applications: Projects that handle ZKML use cases.

Based on these ecosystem categories, some of the current ZKML projects can be classified as follows:

Image source: @bastian_wetzel

ZKML is still a nascent technology and its market is at an early stage, with many applications so far only tried out at hackathons. Even so, ZKML opens up a new design space for smart contracts:

DeFi

DeFi applications parameterized with ML can be more automated. For example, a lending protocol can use an ML model to update parameters in real time. Currently, lending protocols mainly trust off-chain models run by organizations to determine collateral, LTV, liquidation thresholds, etc., but a better alternative may be community-trained open-source models that anyone can run and verify. With verifiable off-chain ML oracles, ML models can perform prediction and classification on signed data off-chain. These oracles can then publish proofs of inference on-chain for verification, so that real-world prediction markets, lending protocols, and similar applications can be settled trustlessly.

Web3 Social

Filtering content on Web3 social media. The decentralized nature of Web3 social applications is likely to lead to more spam and malicious content. Ideally, social media platforms can use open-source ML models agreed upon by the community and publish proofs of model inference when they filter posts. Separately, as a social media user, you may be willing to view personalized ads but want to keep your preferences and interests private from advertisers. Users can therefore choose to run a model locally on their own preferences and pass its output to the media application, which then serves content accordingly.

GameFi

ZKML can be applied to new on-chain games, including collaborative human-AI games, in which AI models act as NPCs. Each action an NPC takes is published on-chain with a proof that anyone can check to confirm that the correct model is being run. ML models can also be used to dynamically adjust token issuance, supply, burn, voting thresholds, etc. For example, an incentive contract can be designed to rebalance the in-game economy once a certain rebalancing threshold is reached and the accompanying proof is verified.

Identity Verification

Replacing private keys with privacy-preserving biometric authentication. Private key management remains one of the biggest pain points in Web3. Deriving private keys from facial recognition or other unique factors may be a possible application of ZKML.

4. Challenges of ZKML

Although ZKML is constantly improving and optimizing, the field is still in its early stages of development and there are still some technical and practical challenges:

  • Quantization with the least amount of precision loss
  • Circuit size, especially when a network consists of multiple layers
  • Efficient proof of matrix multiplication
  • Adversarial attacks

These challenges will affect the accuracy of machine learning models, their cost and proof speed, and the risk of model stealing attacks.
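On the cost side, matrix multiplication dominates most neural network inference, so proving it efficiently matters. One classical illustration of why checking a product can be cheaper than recomputing it is Freivalds' algorithm, sketched below. It is a probabilistic check rather than a zero-knowledge proof, and it is shown here only as an analogy; it is not a claim about how any particular ZKML system handles matrix multiplication.

```python
import random

def matmul(a, b):
    """Naive matrix product: O(n^3) multiplications for n x n matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def matvec(m, v):
    """Matrix-vector product: O(n^2) multiplications."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def freivalds_check(a, b, c, rounds=20):
    """Probabilistically check that a @ b == c using only matrix-vector products.
    A wrong c survives one round with probability at most 1/2."""
    n_cols = len(c[0])
    for _ in range(rounds):
        r = [random.randint(0, 1) for _ in range(n_cols)]
        if matvec(a, matvec(b, r)) != matvec(c, r):
            return False
    return True

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = matmul(a, b)                   # the "prover" does the expensive work
print(freivalds_check(a, b, c))    # the "verifier" runs the cheaper check
```

A correct product always passes, while an incorrect one is rejected except with probability at most 2^-rounds.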

Work on these issues is under way. In the 2021 ZK-MNIST demo, @0xBlockingRC showed how to run a small-scale MNIST image classification model in a verifiable circuit; Daniel Kang has done the same for ImageNet-scale models, whose accuracy has now been improved to 92%. Further hardware acceleration is expected to extend the approach to a broader range of ML applications.

ZKML is still in its early stages of development, but it has already shown many achievements, and more on-chain innovative applications of ZKML can be expected. As ZKML continues to evolve, we can anticipate that privacy-protected machine learning will become the norm in the future.
