By Nalin. This was joint work with Jonathan Wang, Yi Sun, Vincent Huang -- the circom-pairing team.
In this blog post, we present circuits for recursive groth16 SNARK verification in Circom, building off of previous work by the circom-pairing team. More interestingly, we introduce properties uniquely enabled by recursive zkSNARKs, and share some of our own projects that utilise these properties.
If you'd prefer a video summary of this blog post, check out my talk from the Applied ZK Workshop at the Science of Blockchain Conference!
Recursive zkSNARKs unlock a powerful new dimension for zkSNARKs as a cryptographic tool. But first off, what does recursion even mean in the context of a zkSNARK?
Recursing inside of a SNARK means verifying a SNARK proof inside of another SNARK proof--generating SNARK proofs of statements like "I know a zkSNARK proof which a SNARK verification algorithm returns 'true' on." Recursion allows a zkSNARK prover to squeeze more knowledge into their proofs while maintaining the succinctness properties of a SNARK: verification of a recursive zkSNARK is not significantly slower than ordinary verification.
Why would you want to make recursive SNARK proofs? Abstractly, recursive SNARKs unlock two novel application-level properties: compression and composability.
Recursive zkSNARKs allow provers to put “more knowledge” into proofs, while ensuring that these proofs can still be verified in constant or polylogarithmic time by a verifier. Using recursive zkSNARKs as a “rollup” for information is therefore a natural and very mechanical use case. Recursive SNARKs enable compression of information by leveraging the succinctness properties of ZK proofs.
These rollup compressions all tend to share a common pattern. Suppose you have a sequence of items whose validity you want to prove, and suppose you've already generated a SNARK proof of the first N items in the list. To extend this proof to N + 1 items, you can simply generate another proof that proves two things: the validity of your previous proof of the first N items, and the validity of the (N+1)th item. The alternative, non-recursive approach would have been to verify all N + 1 items from scratch.
In the naive model, proving cost (proving time, constraint count, trusted setup size etc.) scales linearly with the number of items being proven. On the other hand, the recursive approach bounds the marginal proving cost to the cost of one recursive proof verification plus the cost of checking one item inside the SNARK. This difference is also particularly notable for cases where these items are from streaming data. This general pattern is typically referred to as Incrementally Verifiable Computation.
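The incremental pattern above can be sketched in a few lines of Python, with the SNARK prover and verifier mocked out as hashes. The `mock_prove`, `mock_verify`, and `extend` names are illustrative stand-ins, not a real proving API:

```python
# Minimal sketch of the incremental (IVC-style) pattern. A real prover
# would output a succinct proof; here we just tag the statement with a hash.
import hashlib

def mock_prove(statement: bytes) -> bytes:
    # Stand-in for SNARK proof generation.
    return hashlib.sha256(b"proof:" + statement).digest()

def mock_verify(statement: bytes, proof: bytes) -> bool:
    # Stand-in for SNARK proof verification.
    return proof == mock_prove(statement)

def extend(prev_statement: bytes, prev_proof: bytes, item: bytes):
    # Marginal cost per step: verify one previous proof + check one new
    # item, instead of re-checking all N items from scratch.
    assert mock_verify(prev_statement, prev_proof)
    new_statement = hashlib.sha256(prev_statement + item).digest()
    return new_statement, mock_prove(new_statement)

statement, proof = b"genesis", mock_prove(b"genesis")
for item in [b"tx1", b"tx2", b"tx3"]:
    statement, proof = extend(statement, proof, item)
assert mock_verify(statement, proof)
```

In the real construction, `extend` is itself a SNARK circuit that contains a verifier for its own proofs, which is exactly what the groth16 verifier circuit described later enables.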
Of course, rollups for monetary transactions (and EVMs) are hot in blockchain land these days, but there are other applications where compression is useful. Here are some of them:
Notably, the compression property also allows us to circumvent practical circuit size limits. Most proving systems require memory on the order of the size of the circuit, so the computation we can do in ZK circuits is often bounded by the amount of hardware RAM we can acquire2. By using recursive zkSNARKs for compression, we can instead create a smaller circuit which, with the overhead of recursion, lets us roll up more computation than even the largest (non-recursive) circuits can handle independently.
Outside of just rollup-like applications, there’s another neat application of compression: more expressive circuits! For one, you can write true conditional statements with a fixed cost using recursive checks. Without recursion, the only way to write a dynamic selector (an expression that selects the value of the Nth element of a list, where N as well as the list are dynamic inputs) is to loop through the list and compare every individual element, incurring O(length of list) constraints. With recursion, however, you can prove the execution of such a selector inside a SNARK, and roll it up inside a different circuit for the cost of one SNARK verification -- which is constant!3 This is quite interesting: with a good enough DSL, compression could make SNARK circuit writing feel much closer to programming Turing-complete machines, with more freedom over looping, conditionals etc., rather than the finite state machines circuits feel like right now. In fact, this is also how the recently announced Lurk Language uses recursive zkSNARKs.
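For concreteness, here is the shape of the linear-cost selector that non-recursive circuits are forced into -- a Python model of what a multiplexer-style gadget (e.g. circomlib's Multiplexer) does, not actual Circom:

```python
# Circuits can't branch on a witness value, so a dynamic selector is
# expressed as: compare every index against n, then sum the masked values.
# This costs O(len(values)) constraints in a real circuit.
def selector(values, n):
    out = 0
    for i, v in enumerate(values):
        is_match = 1 if i == n else 0  # in a circuit: one equality gadget per index
        out += is_match * v            # one multiplication per element
    return out

assert selector([10, 20, 30, 40], 2) == 30
```

A recursive proof replaces this whole loop with a single constant-size verification of a proof that the selection was done correctly.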
Succinctness properties of vanilla zkSNARKs scale up into the compression property for recursive zkSNARKs, so it is worth asking the same question for the zero-knowledge properties of vanilla zkSNARKs: How does ZK "scale up" for recursive zkSNARKs? We've been calling this scaled up property composability:
Typically, we think of zkSNARKs in the context of "a prover shows knowledge of a fact to a verifier, without revealing the fact". There is a somewhat implicit assumption that the prover is well aware of the fact itself.
Recursive zkSNARKs, however, change this. A prover can now show knowledge of a fact to a verifier, without knowing the fact themselves.
What does this mean? Using recursive zkSNARKs, a chain of proofs can be created, where at each step the proof is passed along to a new participant, who adds on some claims of knowledge of their own. At the end, you can create a combined proof of knowledge such that no individual participant in this chain holds the entirety of the knowledge that underlies the proof. This might also be reminiscent of how Trusted Setup ceremonies work.
I first thought of this property in the context of the game of Telephone/Chinese whispers. No one learns the series of changes that led to the current word, but they are assured of its correctness by a zkSNARK.
Where can we use this property? A natural fit seems to be partial information games: We could use Recursive zkSNARKs to instrument party games like Mafia and Telephone between adversarial players.
But this also goes beyond games. Recently, we released ETHdos, which demonstrates a clever use case for this new property: hiding social graphs while making trustless claims about proximity!
In ETHdos, I can prove I am degree 4 from Vitalik by showing that I am friends with someone who is degree 3 (and they repeat this themselves recursively). I can show there exists a valid path of 4 people between me and Vitalik, without knowing anything about the first 3 nodes in the path.
This is something new entirely. Not only are ZK proofs hiding information from external verifiers, but also from the chain of provers themselves. This suggests that applications of Recursive zkSNARKs must be richer than that of vanilla zkSNARKs.
Another interesting application that composability somewhat uniquely enables is trustless, private liquidity channels (or even proper state channels, by throwing in a zkVM), revealing none of the intermediate transactions to anyone except the participants themselves. Additionally, validity proofs would allow for minimal wait periods.4
Now that we've described some examples of composability, you'll notice a couple common threads all these constructions seem to share. Let's solidify these:
Firstly, there are always multiple parties involved in these constructions: this property of recursive zkSNARKs doesn't quite make sense for an individual prover. There must necessarily be some collaborative component to the proof generation process -- otherwise there is no secondary prover to hide the first proof's inputs from.
Secondly, all of these constructions use knowledge of signatures as the transitive hidden witness5. This isn't a coincidence: signatures are the easiest way to instantiate existential unforgeability. Only the owner of a keypair can create its signatures, so they make a great collaborative surface to build on top of. Perhaps, then, one way to find interesting applications of this property is to ask: what else can people tuck into these collaborative cryptography boxes?
There are some close parallels between this property and how academia generally models Proof Carrying Data, which extends Incrementally Verifiable Computation to graph structures. By and large, though, this seems to be one of the properties of recursive zkSNARKs most overlooked by practitioners. We're hoping these notes will help change this and bring about cooler applications for recursive zkSNARKs.
Beyond these application-level properties unlocked by recursion, another very useful unlock on the technical side worth mentioning is the ability to mix-and-match different proof systems and arithmetisations (and with that, utilise the best features of each one). For instance, STARKs, with their AIR arithmetisation, are particularly well suited for state machine execution proofs, but are unfortunately expensive to verify. With recursion, you could generate a STARK proof for a state machine's execution, but then verify the STARK proof in a PLONK SNARK, obtaining a cheap-to-verify SNARK proof as output. Notably, this is the architecture that Polygon Hermez's zkEVM (using an EVM as the state machine) uses.
Now that we've established why recursive zkSNARKs are interesting, let's zoom in to how we implemented recursive zkSNARKs in groth16, and how you can use our work to build your own circuits:
While there has been a ton of research on building zkSNARK proving systems that are efficient at recursion (e.g. Halo 2, Nova), most of this work has not yet permeated into software libraries developers can play with. However, groth16, one of the very first general-purpose zkSNARK schemes (albeit one that's very vanilla and has been superseded by other constructions on almost all dimensions), has a thriving developer tooling ecosystem around it in the form of Circom, snarkjs, rapidsnark, and other related libraries.
So, while groth16 is not particularly well suited for recursion from a theoretical perspective, enabling recursion for the groth16 paradigm is a surefire way to enable explorations with recursive SNARKs on the application layer today!
Typically, there are two methods for implementing fully-recursive SNARKs. The more popular one uses cycles of pairing-friendly elliptic curves: efficient recursion is enabled by finding two pairing-friendly curves such that the group order of one equals the field size of the other, and vice versa. This cycle allows the recursive zkSNARK protocol to alternate field operations between the two curves, requiring only a small amount of wrong-field math/conversions. The second way is to brute-force your way through and simply implement elliptic curve operations for a single pairing-friendly curve in your proving system itself (by doing a bunch of ugly, "wrong-field" bigint math).
We take the latter approach. With the circom-pairing team's work, we already had all the necessary primitives (pairing checks and EC operations) required to build out a groth16 proof verifier in Circom. So, earlier this summer, Jonathan Wang and I ported the pairing circuits to the BN254 curve and then put together a groth16 verifier in Circom - open sourced here!
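To give a flavour of the "wrong-field" bigint math involved, here is a Python sketch of representing BN254 base-field elements as limbs of a smaller bit width. The 86-bit, 3-limb split is purely illustrative (circom-pairing's actual parameters may differ), and a real circuit additionally needs carry handling and range checks on every limb:

```python
# "Wrong-field" arithmetic sketch: a 254-bit field element is stored as
# several small limbs so that each limb fits in the proving system's
# native field. Limb width here is illustrative, not circom-pairing's.
P = 21888242871839275222246405745257275088696311157297823662689037894645226208583  # BN254 base field modulus
LIMB_BITS = 86
N_LIMBS = 3

def to_limbs(x: int) -> list:
    mask = (1 << LIMB_BITS) - 1
    return [(x >> (LIMB_BITS * i)) & mask for i in range(N_LIMBS)]

def from_limbs(limbs) -> int:
    return sum(l << (LIMB_BITS * i) for i, l in enumerate(limbs))

def mul_mod(a, b):
    # In-circuit, this product must be expressed limb-by-limb with carries
    # and a verified modular reduction; here we just compute it directly.
    return to_limbs((from_limbs(a) * from_limbs(b)) % P)

a, b = to_limbs(123456789), to_limbs(987654321)
assert from_limbs(mul_mod(a, b)) == (123456789 * 987654321) % P
```

Every EC and pairing operation in the verifier circuit bottoms out in limb arithmetic like this, which is the main reason the constraint counts below are so large.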
The Optimisoor of a groth16 verification inside SNARKs
Using a number of neat optimisation tricks (lightly detailed in the screenshot above for those more familiar with the verification scheme), we were able to get the groth16 verification circuit down to ~20M constraints! Unfortunately, 20M constraints is still quite a lot and far from the reach of client-side proving on laptops/desktops6. Ultimately, we are able to generate recursive SNARK proofs in about 250 seconds on a beefy server (32-core 3.1GHz, 256G RAM). This can be further optimised to about 180 seconds with some devops tricks detailed in ETHdos's blog post. While this is certainly limiting in its current form, we think unlocking the possibility of an entirely new class of applications is nonetheless very exciting!
While as an application developer you can ignore most of the details of this verification, it is important to be aware of two limitations: first, each additional public input of the verified proof makes the verifier circuit more expensive, so the number of public inputs must be kept small; second, every distinct circuit whose proofs you want to verify requires its own trusted setup.
For the first limitation, we introduce a trick to compress many public inputs into one: all the inputs that are meant to be public are instead passed as private inputs into the circuit, and the only public input exposed is a hash of these private inputs. In addition to its usual checks, the circuit verifies that the hash of the private (or more aptly, semi-public) inputs matches the passed public input hash. Assuming the hash is collision resistant, this is secure, as you would essentially need to find another "useful" preimage for the publicly passed hash. This trick has use cases beyond just recursion: in other verification environments, you can use it to reduce verification compute cost (for instance, saving gas for an on-chain groth16 proof verifier).
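The trick can be sketched as follows, using SHA-256 from Python's standard library as a stand-in (a real circuit would typically use a SNARK-friendly hash like Poseidon; the function names are illustrative):

```python
# Public-input compression: all intended-public values enter the circuit
# as private inputs; the single exposed public input is their hash, which
# the circuit re-computes and checks internally.
import hashlib

def hash_inputs(semi_public_inputs) -> bytes:
    h = hashlib.sha256()
    for x in semi_public_inputs:
        h.update(x.to_bytes(32, "big"))
    return h.digest()

def circuit(semi_public_inputs, public_input_hash: bytes) -> bool:
    # Inside the SNARK: one extra check on top of the circuit's usual logic.
    return hash_inputs(semi_public_inputs) == public_input_hash

inputs = [7, 42, 1337]
commitment = hash_inputs(inputs)  # the one and only public input
assert circuit(inputs, commitment)
```

A verifier now handles one public input regardless of how many semi-public values the application exposes; the values themselves can be published alongside the proof and checked against the hash off-circuit.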
Regarding the second limitation, you'll notice that perhaps the applications most suitable for recursive groth16 SNARKs are ones that recurse into themselves, i.e. proofs verified in the circuit are proofs of the same circuit itself. This would mean we only require one trusted setup. We'll share our mental model for thinking about making applications with self-recursive SNARKs in this context:
How to think about self-recursive SNARKs
At each step i, you have a circuit that proves the validity of a computation sᵢ (with perhaps an associated public input to the SNARK), and within the i-th such proof, you verify another proof showing the validity of computations s₁ through sᵢ₋₁ alongside step i's validity. At each recursion, your SNARK circuit remains the same. In the case of Isokratia, for example, each of the sᵢ is an ECDSA signature verification.
In the next post, we'll release and detail the construction of Isokratia and expand on how to use these two tricks and mental models in practice. In the meantime, check out our previous blog post on ETHdos.
Thanks to Jonathan Wang, Yi Sun, Vivek Bhupatiraju, Sampriti Panda, Adhyyan Sekhsaria, Ying Tong, Barry Whitehat, Lakshman Sankar, Amir Bolous and others in the 0xPARC community for helping consolidate these ideas and reviewing drafts of this post.
Typically, blockchain light clients have the ability to provide proofs of individual state values on-chain. zkSNARKs allow compressing many such proofs, along with SQL-style aggregation functions, to generate succinct proofs of historical data and trends of state values on-chain. ↩
Or, in the case of groth16 and other trusted-setup-based SNARKs, the maximum size of the circuit is bounded first by a different practical limit: the size of the largest trusted setups we have available. ↩
In the case of Circom-based recursive SNARKs, this constant cost is very high, making it impractical, but the general strategy is interesting to keep in mind. ↩
Here’s a full description of the liquidity channel protocol using this: two parties escrow into a contract and do a bunch of transactions off-chain between each other, revealing none of the intermediate transactions to anyone except each other. At all times, they maintain a rollup ZK proof that has 3 public inputs: both parties' balances and the number of transactions between them (this will be useful later for indexing). Each time I want to make a transaction, I generate a signature attesting to a transaction at a particular state, roll it up into a new ZK proof, and give it to my counterparty. My counterparty does the exact same thing back. At no point are signatures revealed to each other; only the balance sheet is revealed. At any point, if one of them decides to settle, they can go on-chain and show the rolled-up ZK proof to the contract, revealing only final balances. To complete the construction, you would need an escalation-game-style mechanism in case any single party tries to go to chain and censor some suffix of transactions. In such a case, the counterparty would use the "index" public input to show a "larger" (by number of transactions) ZK proof and guarantee correct settlement. Two open questions on this protocol: Can you layer this protocol on top of an anonymity pool to hide transactors? Can you build Lightning Network-style channels with multi-hop payments using this? ↩
This can be slightly generalised to be problems in NP, which is really what Digital Signature Algorithms (and ECC) unlock for us on-demand. Notably, factorisation is another such problem in NP which might be interesting to think more deeply about. ↩
Based on 0xPARC's previous work pushing in-browser proving to its limits with heyanon and stealthdrop, 20M constraints is over the limit by a factor of ~10x for what's possible in-browser today. ↩