Penumbra is a fully private proof-of-stake network and decentralized exchange for the Cosmos ecosystem.

Penumbra integrates privacy with proof-of-stake through a novel private delegation mechanism that provides staking derivatives, tax-efficient staking, and on-chain governance with private voting. Penumbra connects to the Cosmos ecosystem via IBC, acting as an ecosystem-wide shielded pool and allowing private transactions in any IBC-compatible asset. Users can also swap these assets using ZSwap, a private decentralized exchange supporting sealed-bid batch auctions and Uniswap-v3-style concentrated liquidity. Sealed-bid batch auctions prevent frontrunning, provide better execution, and reveal only the net flow across a pair of assets in each block, and liquidity positions are created anonymously, allowing traders to approximate their desired trading function without revealing their individual beliefs about prices.

This website renders the work-in-progress protocol specification for Penumbra.

Press s or use the magnifying glass icon for full-text search.

If you’re interested in technical discussion about the project, why not

Private Transactions

Penumbra records all value in a single multi-asset shielded pool based on the Zcash Sapling design, but allows private transactions in any kind of IBC asset. Inbound IBC transfers shield value as it moves into the zone, while outbound IBC transfers unshield value.

Unlike Zcash, Penumbra has no notion of transparent transactions or a transparent value pool; instead, inbound IBC transfers are analogous to t2z Zcash transactions, outbound IBC transfers are analogous to z2t Zcash transactions, and the entire Cosmos ecosystem functions analogously to Zcash’s transparent pool.

Unlike the Cosmos Hub or many other chains built on the Cosmos SDK, Penumbra has no notion of accounts. Only validators have any kind of long-term identity, and this identity is only used (in the context of transactions) for spending the validator’s commission.

Private Staking

In a proof-of-stake system like the Cosmos Hub, stakeholders delegate staking tokens by bonding them to validators. Validators participate in Tendermint consensus with voting power determined by their delegation size, and delegators receive staking rewards in exchange for taking on the risk of being penalized for validator misbehavior (slashing).

Integrating privacy and proof of stake poses significant challenges. If delegations are public, holders of the staking token must choose between privacy on the one hand and staking rewards and participation in consensus on the other hand. Because the majority of the stake will be bonded to validators, privacy becomes an uncommon, opt-in case. But if delegations are private, issuing staking rewards becomes very difficult, because the chain no longer knows the amount and duration of each address’ delegations.

Penumbra sidesteps this problem using a new mechanism that eliminates staking rewards entirely, treating unbonded and bonded stake as separate assets, with an epoch-varying exchange rate that prices in what would be a staking reward in other systems. This mechanism ensures that all delegations to a particular validator are fungible, and can be represented by a single delegation token representing a share of that validator’s delegation pool, in effect a first-class staking derivative. Although delegation fungibility is key to enabling privacy, as a side effect, delegators do not realize any income while their stake is bonded, only a capital gain (or loss) on unbonding.

The total amount of stake bonded to each validator is part of the public chain state and determines consensus weight, but the delegation tokens themselves are each just another token to be recorded in a multi-asset shielded pool. This provides accountability for validators and privacy and flexibility for delegators, who can trade and transact with their delegation tokens just like they can with any other token.

It also provides an alternate perspective on the debate between fixed-supply and inflation-based rewards. Choosing the unbonded token as the numéraire, delegators are rewarded by inflation for taking on the risk of validator misbehavior, and the token supply grows over time. Choosing (a basket of) delegation tokens as the numéraire, non-delegators are punished by depreciation for not taking on any risk of misbehavior, and the token supply is fixed.

Private Governance

Like the Cosmos Hub, Penumbra supports on-chain governance with delegated voting. Unlike the Cosmos Hub, Penumbra’s governance mechanism supports secret ballots. Penumbra users can anonymously propose votes by escrowing a deposit of bonded stake. Stakeholders vote by proving ownership of their bonded stake prior to the beginning of the voting period and encrypting their votes to a threshold key controlled by the validator set. Validators sum encrypted votes and decrypt only the per-epoch totals.

Private DEX

Penumbra provides private, sealed-bid batch swaps using ZSwap. ZSwap allows users to privately swap between any pair of assets. Individual swaps do not reveal trade amounts. Instead, all swaps in each block are executed in a single batch. Only the total amount in each batch is revealed, and only after the batch has been finalized. This prevents front-running and provides better execution, but also provides long-term privacy for individual swaps. Users can also provide liquidity by anonymously creating concentrated liquidity positions. These positions reveal the amount of liquidity and the bounds in which it is concentrated, but are not otherwise linked to any identity, so that (with some care) users can privately approximate arbitrary trading functions without revealing their specific views about prices.

Concepts and Mechanisms

This section provides an overview of the concepts involved in Penumbra’s mechanism design.

Validators

Validators in Penumbra undergo various transitions depending on chain activity.

                                 ┌────────────────────────────────────────────────────────────────────────────┐
                                 │                                                                            │
                                 │            ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─                                           │
                                 │              Genesis Validator  │                                          │
                                 │            │                             ┏━━━━━━━━━━━━━━━━━━━━━━━┓         │
                                 │             ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┘        ┃      Tombstoned       ┃         │
                                 │                       │          ┌──────▶┃     (Misbehavior)     ┃         │
                                 │                       │          │       ┗━━━━━━━━━━━━━━━━━━━━━━━┛         │
                                 │                       │          │                                         │
┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─            │                       ▼          │                                         ▼
      Validator      │      ┏━━━━━━━━━┓              ╔══════╗       │                                   ┏━━━━━━━━━━━┓
│     Definition     ──────▶┃Inactive ┃─────────────▶║Active║───────┼────────────────────────────────┬─▶┃ Disabled  ┃
   (in transaction)  │      ┗━━━━━━━━━┛              ╚══════╝       │                                │  ┗━━━━━━━━━━━┛
└ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─            ▲                                  │       ┏━━━━━━━━━━━━━━━━━━━━━┓  │        │
                                 │                                  └──────▶┃ Jailed (Inactivity) ┃──┘        │
                                 │                                          ┗━━━━━━━━━━━━━━━━━━━━━┛           │
                                 │                                                     │                      │
                                 └─────────────────────────────────────────────────────┴──────────────────────┘

Single lines represent unbonded stake, and double lines represent bonded stake.

Validators become known to the chain either at genesis, or by means of a transaction with a ValidatorDefinition action in them. Validators transition through five states:

  • Inactive: a validator whose delegation pool is too small to participate in consensus set
  • Active: a validator whose delegation pool is large enough to participate in consensus and must meet uptime requirements
  • Jailed: a validator that has been slashed and removed from consensus for downtime, that may return later
  • Tombstoned: a validator that has been permanently slashed and removed from consensus for byzantine misbehavior and may not return
  • Disabled: a validator that has been manually disabled by the operator, that may return to Inactive later

Validators specified in the genesis config begin in the active state, with whatever stake was allocated to their delegation pool at genesis. Otherwise, new validators begin in the inactive state, with no stake in their delegation pool. At this point, the validator is known to the chain, and stake can be contributed to its delegation pool. Stake contributed to an inactive validator’s delegation pool does not earn rewards (the validator’s rates are held constant), but it is also not bonded, so undelegations are effective immediately, with no unbonding period and no output quarantine.

The chain chooses a validator limit N as a consensus parameter. When a validator’s delegation pool (a) has a nonzero balance and (b) its (voting-power-adjusted) size is in the top N validators, it moves into the active state during the next epoch transition. Active validators participate in consensus, and are communicated to Tendermint. Stake contributed to an active validator’s delegation pool earns rewards (the validator’s rates are updated at each epoch to track the rewards accruing to the pool). That stake is bonded, so undelegations have an unbonding period and an output quarantine. An active validator can exit the consensus set in four ways.

First, the validator could be jailed and slashed for inactivity. This can happen in any block, triggering an unscheduled epoch transition. Jailed validators are immediately removed from the consensus set. The validator’s rates are updated to price in the slashing penalty, and are then held constant. Validators jailed for inactivity are not permanently prohibited from participation in consensus, and their operators can re-activate them by re-uploading the validator definition. Stake cannot be delegated to a slashed validator. Stake already contributed to a slashed validator’s delegation pool will enter an unbonding period to hold the validator accountable for any byzantine behavior during the unbonding period. Re-delegations may occur after the validator enters the “Inactive” state.

Second, the validator could be tombstoned and slashed for byzantine misbehavior. This can happen in any block, triggering an unscheduled epoch transition. Tombstoned validators are immediately removed from the consensus set. Any pending undelegations from a slashed validator are cancelled: the quarantined output notes are deleted, and the quarantined nullifiers are removed from the nullifier set. The validator’s rates are updated to price in the slashing penalty, and are then held constant. Tombstoned validators are permanently prohibited from participation in consensus (though their operators can create new identity keys, if they’d like to). Stake cannot be delegated to a tombstoned validator. Stake already contributed to a tombstoned validator’s delegation pool is not bonded (the validator has already been slashed and tombstoned), so undelegations are effective immediately, with no unbonding period and no quarantine.

Third, the validator could be manually disabled by the operator. The validator is then in the disabled state. It does not participate in consensus, and the stake in its delegation pool does not earn rewards (the validator’s rates are held constant). The stake in its delegation pool will enter an unbonding period at the time the validator becomes disabled. The only valid state a disabled validator may enter into is “inactive”, if the operator re-activates it by updating the validator definition.

Fourth, the validator could be displaced from the validator set by another validator with more stake in its delegation pool. The validator is then in the inactive state. It does not participate in consensus, and the stake in its delegation pool does not earn rewards (the validator’s rates are held constant). The stake in its delegation pool will enter an unbonding period at the time the validator becomes inactive. Inactive validators have three possible state transitions:

  1. they can become active again, if new delegations boost its weight back into the top N;
  2. they can be tombstoned, if evidence of misbehavior arises during the unbonding period;
  3. they can be disabled, if the operator chooses.

If (2) occurs, the same state transitions as in regular tombstoning occur: all pending undelegations are cancelled, etc. If (3) occurs, the unbonding period continues and the validator enters the disabled state. If (1) occurs, the validator stops unbonding, and all delegations become bonded again.

Batching Flows

Penumbra’s ledger records value as it moves between different economic roles – for instance, movement between unbonded stake and delegation tokens, movement between different assets as they are traded, etc. This creates a tension between the need to reveal the total amount of value in each role as part of the public chain state, and the desire to shield value amounts in individual transactions.

To address this tension, Penumbra provides a mechanism to aggregate value flows across a batch of transactions, revealing only the total amount and not the value contributed by each individual transaction. This mechanism is built using an integer-valued homomorphic encryption scheme that supports threshold decryption, so that the network’s validators can jointly control a decryption key.

Transactions that contribute value to a batch contain an encryption of the amount. To flush the batch, the validators sum the ciphertexts from all relevant transactions to compute an encryption of the batch total, then jointly decrypt it and commit it to the chain.

This mechanism doesn’t require any coordination between the users whose transactions are batched, but it does require that the validators create and publish a threshold decryption key. To allow batching across block boundaries, Penumbra organizes blocks into epochs, and applies changes to the validator set only at epoch boundaries. Decryption keys live for the duration of the epoch, allowing value flows to be batched over any time interval from 1 block up to the length of an epoch. We propose epoch boundaries on the order of 1-3 days.

At the beginning of each epoch, the validator set performs distributed key generation to produce a decryption key jointly controlled by the validators (on an approximately stake-weighted basis) and includes the encryption key in the first block of the epoch.

Because this key is only available after the first block of each epoch, some transactions cannot occur in the first block itself. Assuming a block interval similar to the Cosmos Hub, this implies an ~8-second processing delay once per day, a reasonable tradeoff against the complexity of phased setup procedures.

Addresses and Keys

Value transferred on Penumbra is sent to shielded payment addresses; these addresses are derived from spending keys through a sequence of intermediate keys that represent different levels of attenuated capability:

flowchart BT
    subgraph Address
    end;
    subgraph DTK[Detection Key]
    end;
    subgraph IVK[Incoming Viewing Key]
    end;
    subgraph OVK[Outgoing Viewing Key]
    end;
    subgraph FVK[Full Viewing Key]
    end;
    subgraph SK[Spending Key]
    end;
    subgraph SeedPhrase[Seed Phrase]
    end;

    index(address index);
    div{ };

    SeedPhrase --> SK;
    SK --> FVK;
    FVK --> OVK;
    FVK --> IVK;

    index --> div;
    IVK ----> div;
    div --> Address & DTK;
    DTK --> Address;

From bottom to top:

  • the seed phrase is the root key material. Multiple accounts - each with separate spend authority - can be derived from this root seed phrase.
  • the spending key is the capability representing spending authority for a given account;
  • the full viewing key represents the capability to view all transactions related to the account;
  • the outgoing viewing key represents the capability to view only outgoing transactions, and is used to recover information about previously sent transactions;
  • the incoming viewing key represents the capability to view only incoming transactions, and is used to scan the block chain for incoming transactions.

Penumbra allows the same account to present multiple, publicly unlinkable addresses, keyed by an 16-byte address index. Each choice of address index gives a distinct shielded payment address. Because these addresses share a common incoming viewing key, the cost of scanning the blockchain does not increase with the number of addresses in use.

Finally, Penumbra also allows outsourcing probabilistic transaction detection to third parties using fuzzy message detection. Each address has a detection key; a third party can use this key to detect transactions that might be relevant to that key. Like a Bloom filter, this detection has false positives but no false negatives, so detection will find all relevant transactions, as well as some amount of unrelated cover traffic. Unlike incoming viewing keys, detection keys are not shared between diversified addresses, allowing fine-grained control of delegation.

This diagram shows only the user-visible parts of the key hierarchy. Internally, each of these keys has different components, described in detail in the Addresses and Keys chapter.

Assets and Amounts

Penumbra’s shielded pool can record arbitrary assets. To be precise, we define:

  • an amount to be an untyped quantity of some asset;
  • an asset type to be an ADR001-style denomination trace uniquely identifying a cross-chain asset and its provenance, such as:
    • denom (native chain A asset)
    • transfer/channelToA/denom (chain B representation of chain A asset)
    • transfer/channelToB/transfer/channelToA/denom (chain C representation of chain B representation of chain A asset)
  • an asset ID to be a fixed-size hash of an asset type;
  • a value to be a typed quantity, i.e., an amount and an asset id.

Penumbra deviates slightly from ADR001 in the definition of the asset ID. While ADR001 defines the IBC asset ID as the SHA-256 hash of the denomination trace, Penumbra hashes to a field element, so that asset IDs can be more easily used inside of a zk-SNARK circuit.

Notes, Nullifiers, and Trees

Transparent blockchains operate as follows: all participants maintain a copy of the (consensus-determined) application state. Transactions modify the application state directly, and participants check that the state changes are allowed by the application rules before coming to consensus on them.

On a shielded blockchain like Penumbra, however, the state is fragmented across all users of the application, as each user has a view only of their “local” portion of the application state recording their funds. Transactions update a user’s state fragments privately, and use a zero-knowledge proof to prove to all other participants that the update was allowed by the application rules.

Penumbra’s transaction model is derived from the Zcash shielded transaction design, with modifications to support multiple asset types, and several extensions to support additional functionality. The Zcash model is in turn derived from Bitcoin’s unspent transaction output (UTXO) model, in which value is recorded in transaction outputs that record the conditions under which they can be spent.

In Penumbra, value is recorded in notes, which function similarly to UTXOs. Each note specifies (either directly or indirectly) a type of value, an amount of value of that type, a spending key that authorizes spending the note’s value, and a unique nullifier derived from the note’s contents.

However, unlike UTXOs, notes are not recorded as part of the public chain state. Instead, the chain contains a state commitment tree, an incremental Merkle tree containing (public) commitments to (private) notes. Creating a note involves creating a new note commitment, and proving that it commits to a valid note. Spending a note involves proving that the spent note was previously included in the state commitment tree, using the spending key to demonstrate spend authorization, and revealing the nullifier, which prevents the same note from being spent twice.

Transactions

Transactions describe an atomic collection of changes to the ledger state. Each transaction consists of a sequence of descriptions for various actions1. Each description adds or subtracts (typed) value from the transaction’s value balance, which must net to zero.

Penumbra adapts Sapling’s Spend, which spends a note and adds to the transaction’s value balance, and Output, which creates a new note and subtracts from the transaction’s value balance. Penumbra also adds many new descriptions to support additional functionality:

Transfers

  • Spend descriptions spend an existing note, adding its value to the transaction’s value balance;

  • Output descriptions create a new note, subtracting its value from the transaction’s value balance;

  • Transfer descriptions transfer value out of Penumbra by IBC, consuming value from the transaction’s value balance, and producing an ICS20 FungibleTokenPacketData for the counterparty chain;

Staking

Governance

  • CreateProposal descriptions are used to propose measures for on-chain governance and supply a deposit, consuming bonded stake from the transaction’s value balance and producing a new note that holds the deposit in escrow;

  • WithdrawProposal descriptions redeem an escrowed proposal deposit, returning it to the transaction’s value balance and immediately withdrawing the proposal.

  • Vote descriptions perform private voting for on-chain governance and declare a vote. This description leaves the value balance unchanged.

Trading

  • Swap descriptions perform the first phase of a swap, consuming tokens of one type from a transaction’s value balance, burning them, and producing a swap commitment for use in the second stage;

  • SwapClaim descriptions perform the second phase of a swap, allowing a user who burned tokens of one type to mint tokens of the other type at the chain-specified clearing price, and adding the new tokens to a transaction’s value balance;

Market-making

  • OpenPosition descriptions open concentrated liquidity positions, consuming value of the traded types from the transaction’s value balance and adding the specified position to the AMM state;

  • ClosePosition descriptions close concentrated liquidity positions, removing the specified position to the AMM state and adding the value of the position, plus any accumulated fees or liquidity rewards, to the transaction’s value balance.

Each transaction also contains a fee specification, which is always transparently encoded. The value balance of all of a transactions actions, together with the fees, must net to zero.

1

Note that like Zcash Orchard, we use the term “action” to refer to one of a number of possible state updates; unlike Orchard, we do not attempt to conceal which types of state updates are performed, so our Action is an enum.

Governance

Penumbra supports on-chain governance with delegated voting. Validators’ votes are public and act as default votes for their entire delegation pool, while delegators’ votes are private, and override the default vote provided by their validator.

Votes are similar to Cosmos Hub: Yes, No, and Abstain have the same meanings. However, Penumbra does not have NoWithVeto; instead, a sufficiently high threshold of “no” votes (determined by a chain parameter) will slash a proposal, which causes its deposit to be burned. This functions as a spam deterrent.

Proposals

Penumbra users can propose votes by escrowing a minimum amount of PEN. They do this by creating a transaction with a ProposalSubmit description, which consumes some amount of PEN from the transaction’s balance, and creates a proposal_N_voting NFT, which can be redeemed for the proposal deposit in the case that the voting concludes without slashing the proposal (both failed and passed proposals can reclaim their deposit).

Proposals can either be normal or emergency proposals. In either case, the voting period begins immediately, in the next block after the proposal has been committed to the chain. Normal proposals have a fixed-length voting period, while emergency proposals are accepted as soon as a 2/3 majority of the stake is reached.

Because validators provide default votes for their delegation pool, an emergency proposal can in principle be accepted immediately, without any input from delegators. This allows time-critical resolution of emergencies (e.g., deploying an 0day hotfix); the 2/3 majority of the stake required is already sufficient to arbitrarily rewrite the chain state.

Proposals can also be withdrawn by their proposer prior to the end of the voting period. This is done by creating a transaction with a ProposalWithdraw description, and allows the community to iterate on proposals as the (social) governance process occurs. For instance, a chain upgrade proposal can be withdrawn and re-proposed with a different source hash if a bug is discovered while upgrade voting is underway. Withdrawn proposals cannot be accepted, even if the vote would have passed, but they can be slashed.1

Voting

Stakeholder votes are of the form , representing the weights for yes, no, and abstain, respectively. Most stakeholders would presumably set all but one weight to . Stakeholders vote by proving ownership of some amount of bonded stake (their voting power) prior to the beginning of the voting period.

To do this, they create a transaction with a Vote description. This description identifies the validator and the proposal, proves spend authority over a note recording dPEN(v), and reveals the note’s nullifier. Finally, it proves vote consistency , produces a new note with dPEN(v), and includes , an encryption of the vote weights to the validators’ decryption key.

The proof statements in a Vote description establishing spend authority over the note are almost identical to those in a Spend description. However, there are two key differences. First, rather than checking the note’s nullifier against the global nullifier set and marking it as spent, the nullifier is checked against a snapshot of the nullifier set at the time that voting began (establishing that it was unspent then), as well as against a per-proposal nullifier set (establishing that it has not already been used for voting). In other words, instead of marking that the note has been spent in general, we only mark it as having been spent in the context of voting on a specific proposal. Second, the ZK proof additionally proves that the note was created before the proposal started, and the stateful checks ensure that it was not spent after the proposal started.

This change allows multiple proposals to be voted on concurrently, at the cost of linkability. While the same note can be used to vote on multiple proposals, those votes, as well as the subsequent spend of the note, will have the same nullifier and thus be linkable to each other. However, the Vote descriptions are shielded, so an observer only learns that two opaque votes were related to each other.

We suggest that wallets roll over the note the first time it is used for voting by creating a transaction with Vote, Spend, and Output descriptions. This mitigates linkability between Vote and Spend descriptions, and means that votes on any proposals created after the first vote are unlinkable from prior votes.

Counting Votes

At the end of each epoch, validators collect the encrypted votes from each delegation pool, aggregate the encrypted votes into encrypted tallies and decrypt the tallies. These intermediate tallies are revealed, because it is not possible to batch value flows over time intervals longer than one epoch. In practice, this provides a similar dynamic as existing (transparent) on-chain governance schemes, where tallies are public while voting is ongoing.

At the end of the voting period, the per-epoch tallies are summed. For each validator , the votes for each option are summed to determine the portion of the delegation pool that voted; the validator’s vote acts as the default vote for the rest of the delegation pool. Finally, these per-validator subtotals are multiplied by the voting power adjustment function to obtain the final vote totals.

If the vote was not slashed, the proposal_N_voting NFT created during proposal submission can be redeemed for the original proposal deposit (if not slashed) and a commemorative proposal_N_passed, proposal_N_failed, or proposal_N_slashed NFT.

1

If withdrawing a proposal halted on-chain voting immediately, the escrow mechanism would not be effective at deterring spam, since the proposer could yank their proposal at the last minute prior to losing their deposit. However, at the UX level, withdrawn proposals can be presented as though voting were closed, since validators’ default votes are probably sufficient for spam deterrence.

Primitives

Penumbra uses the following cryptographic primitives, described in the following sections:

  • The Proof System section describes the choice of proving curve (BLS12-377) and proof system (Groth16, and potentially PLONK in the future);

  • The decaf377 section describes decaf377, a parameterization of the Decaf construction defined over the BLS12-377 scalar field, providing a prime-order group that can be used inside or outside of a circuit;

  • The Poseidon for BLS12-377 section describes parameter selection for an instantiation of Poseidon, a SNARK-friendly sponge construction, over the BLS12-377 scalar field;

  • The Fuzzy Message Detection section describes a construction that allows users to outsource a probabilistic detection capability, allowing a third party to scan and filter the chain on their behalf, without revealing precisely which transactions are theirs.

  • The Homomorphic Threshold Decryption section describes the construction used to batch flows of value across transactions.

  • The Randomizable Signatures section describes decaf377-rdsa, a variant of the Zcash RedDSA construction instantiated over decaf377, used for binding and spend authorization signatures.

  • The Key Agreement section describes decaf377-ka, an instantiation of Diffie-Hellman key agreement over decaf377.

Proving Considerations

Penumbra needs SNARK proofs. Because the choice of proving system and proving curve can’t really be cleanly separated from the rest of the system choices (e.g., the native field of the proving system informs what embedded curve is available, and how circuit programming is done), large parts of the rest of the system design block on making a choice of proving system.

Goals

  1. Near-term implementation availability. We’d like to ship a useful product first, and iterate and expand it later.

  2. High performance for fixed functionality. Penumbra intends to support fixed functionality initially; programmability is a good future goal but isn’t a near-term objective. The fixed functionality should have as high performance as possible.

  3. Longer-term flexibility. The choice should ideally not preclude many future choices for later functionality. More precisely, it should not impose high switching costs on future choices.

  4. Recursion capability. Penumbra doesn’t currently make use of recursion, but there are a lot of interesting applications it could be used for.

Setup ceremonies are beneficial to avoid for operational reasons, but not for security reasons. A decentralized setup procedure is sufficient for security.

Options

Proof systems:

  • Groth16:
    • Pros: high performance, very small proofs, mature system
    • Cons: requires a setup for each proof statement
  • PLONK:
    • Pros: universal setup, still fairly compact proofs, seems to be a point of convergence with useful extensions (plookup, SHPLONK, etc)
    • Cons: bigger proofs, worse constants than Groth16
  • Halo 2
    • Pros: no setup, arbitrary depth recursion
    • Cons: bigger proof sizes, primary implementation for the Pallas/Vesta curves which don’t support pairings

Curve choices:

  • BLS12-381:

    • Pros: very mature, used by Sapling already
    • Cons: no easy recursion
  • BLS12-377:

    • Pros: constructed as part of Zexe to support depth 1 recursion using a bigger parent curve, deployed in Celo, to be deployed in Zexe
    • Cons: ?
  • Pallas/Vesta:

    • Pros: none other than support for Halo 2’s arbitrary recursion
    • Cons: no pairings mean they cannot be used for any pairing-based SNARK

Considerations

Although the choice of proof system (Groth16, Plonk, Halo, Pickles, …) is not completely separable from the choice of proving curve (e.g., pairing-based SNARKs require pairing-friendly curves), to the extent that it is, the choice of the proof system is relatively less important than the choice of proving curve, because it is easier to encapsulate.

The choice of proving curve determines the scalar field of the arithmetic circuit, which determines which curves are efficient to implement in the circuit, which determines which cryptographic constructions can be performed in the circuit, which determines what kind of key material the system uses, which propagates all the way upwards to user-visible details like the address format. While swapping out a proof system using the same proving curve can be encapsulated within an update to a client library, swapping out the proving curve is extremely disruptive and essentially requires all users to generate new addresses and migrate funds.

This means that, in terms of proof system flexibility, the Pallas/Vesta curves are relatively disadvantaged compared to pairing-friendly curves like BLS12-381 or BLS12-377, because they cannot be used with any pairing-based SNARK, or any other pairing-based construction. Realistically, choosing them is committing to using Halo 2.

Choosing BLS12-377 instead of BLS12-381 opens the possibility to do depth-1 recursion later, without meaningfully restricting the near-term proving choices. For this reason, BLS12-377 seems like the best choice of proving curve.

Penumbra’s approach is to first create a useful set of fixed functionality, and generalize to custom, programmable functionality only later. Compared to Sapling, there is more functionality (not just Spend and Output but Delegate, Undelegate, Vote, …), meaning that there are more proof statements. Using Groth16 means that each of these statements needs to have its own proving and verification key, generated through a decentralized setup.

So the advantage of a universal setup (as in PLONK) over per-statement setup (as in Groth16) would be:

  1. The setup can be used for additional fixed functionality later;
  2. Client software does not need to maintain distinct proving/verification keys for each statement.

(2) is a definite downside, but the impact is a little unclear. As a point of reference, the Sapling spend and output parameters are 48MB and 3.5MB respectively. The size of the spend circuit could be improved using a snark-friendly hash function.

With regard to (1), if functionality were being developed in many independent pieces, doing many setups would impose a large operational cost. But doing a decentralized setup for a dozen proof statements simultaneously does not seem substantially worse than doing a decentralized setup for a single proof statement. So the operational concern is related to the frequency of groups of new statements, not the number of statements in a group. Adding a later group of functionality is easy if the first group used a universal setup. But if it didn’t, the choice of per-statement setup initially doesn’t prevent the use of a universal setup later, as long as the new proof system can be implemented using the same curve.

Because Penumbra plans to have an initial set of fixed functionality, and performance is a concern, Groth16 seems like a good choice, and leaves the door open for a future universal SNARK. Using BLS12-377 opens the door to future recursion, albeit only of depth 1.

The decaf377 group

Penumbra, like many other zero-knowledge protocols, requires a cryptographic group that can be used inside of an arithmetic circuit. This is accomplished by defining an “embedded” elliptic curve whose base field is the scalar field of the proving curve used by the proof system.

The Zexe paper, which defined BLS12-377, also defined (called in Figure 16 of the paper) a cofactor-4 Edwards curve defined over the BLS12-377 scalar field for exactly this purpose. However, non-prime-order groups are a leaky abstraction, forcing all downstream constructions to pay attention to correct handling of the cofactor. Although it is usually possible to do so safely, it requires additional care, and the optimal technique for handling the cofactor is different inside and outside of a circuit.

Instead, applying the Decaf construction to this curve gives decaf377, a clean abstraction that provides a prime-order group complete with hash-to-group functionality and whose encoding and decoding functions integrate validation. Although it imposes a modest additional cost in the circuit context, as discussed in Costs and Alternatives, the construction works the same way inside and outside of a circuit and imposes no costs for lightweight, software-only applications, making it a good choice for general-purpose applications.

Curve Parameters

The cofactor-4 Edwards curve defined over the BLS12-377 scalar field has the following parameters:

  • Base field: Integers mod prime
  • Elliptic curve equation: with and
  • Curve order: where

We use a conventional generator basepoint selected to have a convenient hex encoding:

0x0800000000000000000000000000000000000000000000000000000000000000

In affine coordinates this generator point has coordinates:

Implementation

An implementation of decaf377 can be found here.

Costs and Alternatives

Arithmetic circuits have a different cost model than software. In the software cost model, software executes machine instructions, but in the circuit cost model, relations are certified by constraints. Unfortunately, while Decaf is a clearly superior choice in the software context, in the circuit context it imposes some additional costs, which must be weighed against its benefits.

At a high level, Decaf implements a prime-order group using a non-prime-order curve by constructing a group quotient. Internally, group elements are represented by curve points, with a custom equality check so that equivalent representatives are considered equal, an encoding function that encodes equivalent representatives as identical bitstrings, and a decoding function that only accepts canonical encodings of valid representatives.

To do this, the construction defines a canonical encoding on a Jacobi quartic curve mod its 2-torsion (a subgroup of size 4) by making two independent sign choices. Then, it uses an isogeny to transport this encoding from the Jacobi quartic to a target curve that will be used to actually implement the group operations. This target curve can be an Edwards curve or a Montgomery curve. The isogenies are only used for deriving the construction. In implementations, all of these steps are collapsed into a single set of formulas that perform encoding and decoding on the target curve.

In other words, one way to think about the Decaf construction is as some machinery that transforms two sign choices into selection of a canonical representative. Ristretto adds extra machinery to handle cofactor 8 by making an additional sign choice.

In the software cost model, where software executes machine instructions, this construction is essentially free, because the cost of both the Decaf and conventional Edwards encodings are dominated by the cost of computing an inverse or an inverse square root, and the cost of the sign checks is insignificant.

However, in the circuit cost model, where relations are certified by various constraints, this is no longer the case. On the one hand, certifying a square root or an inverse just requires checking that or that , which is much cheaper than actually computing or . On the other hand, performing a sign check involves bit-constraining a field element, requiring hundreds of constraints.

Sign checks

The definition of which finite field elements are considered nonnegative is essentially arbitrary. The Decaf paper suggests three possibilities:

  • using the least significant bit, defining to be nonnegative if the least absolute residue for is even;

  • using the most significant bit, defining to be nonnegative if the least absolute residue for is in the range ;

  • for fields where , using the Legendre symbol, which distinguishes between square and nonsquare elements.

Using the Legendre symbol is very appealing in the circuit context, since it has an algebraic definition and, at least in the case of square elements, very efficient certification. For instance, if square elements are chosen to be nonnegative, then certifying that is nonnegative requires only one constraint, . However, the reason for the restriction to fields is that and should have different signs, which can only be the case if is nonsquare. Unfortunately, many SNARK-friendly curves, including BLS12-377, are specifically chosen so that for as large a power as possible (e.g., in the case of BLS12-377).

This leaves us with either the LSB or MSB choices. The least significant bit is potentially simpler for implementations, since it is actually the low bit of the encoding of , while the most significant bit isn’t, because it measures from , not a bit position , so it seems to require a comparison or range check to evaluate. However, these choices are basically equivalent, in the following sense:

Lemma.1

The most significant bit of is if and only if the least significant bit of is .

Proof.

The MSB of is if and only if , but this means that , which is even, is the least absolute residue, so the LSB of is also . On the other hand, the MSB of is if and only if , i.e., if , i.e., if . This means that the least absolute residue of is ; since is even and is odd, this is odd and has LSB .

This means that transforming an LSB check to an MSB check or vice versa requires multiplication by or , which costs at most one constraint.

Checking the MSB requires checking whether a value is in the range . Using Daira Hopwood’s optimized range constraints, the range check costs 2. However, the input to the range check is a bit-constrained unpacking of a field element, not a field element itself. This unpacking costs .

Checking the LSB is no less expensive, because although the check only examines one bit, the circuit must certify that the bit-encoding is canonical. This requires checking that the value is in the range , which also costs , and as before, the unpacking costs .

In other words, checking the sign of a field element costs , or in the case where the field element is already bit-encoded for other reasons. These checks are the dominant cost for encoding and decoding, which both require two sign checks. Decoding from bits costs c. , decoding from a field element costs c. , and encoding costs c. regardless of whether the output is encoded as bits or as a field element.

For decaf377, we choose the LSB test for sign checks.

Alternative approaches to handling cofactors

Decaf constructs a prime-order group whose encoding and decoding methods perform validation. A more conventional alternative approach is to use the underlying elliptic curve directly, restrict to its prime-order subgroup, and do subgroup validation separately from encoding and decoding. If this validation is done correctly, it provides a prime-order group. However, because validation is an additional step, rather than an integrated part of the encoding and decoding methods, this approach is necessarily more brittle, because each implementation must be sure to do both steps.

In the software cost model, there is no reason to use subgroup validation, because it is both more expensive and more brittle than Decaf or Ristretto. However, in the circuit cost model, there are cheaper alternatives, previously analyzed by Daira Hopwood in the context of Ristretto for JubJub (1, 2).

Multiplication by the group order.

The first validation method is to do a scalar multiplication and check that . Because the prime order is fixed, this scalar multiplication can be performed more efficiently using a hardcoded sequence of additions and doublings.

Cofactor preimage.

The second validation method provides a preimage in affine coordinates . Because the image of is the prime-order subgroup, checking that satisfies the curve equation and that checks that is in the prime-order subgroup.

In the software context, computing and computing cost about the same, although both are an order of magnitude more expensive than decoding. But in the circuit context, the prover can compute outside of the circuit and use only a few constraints to check the curve equation and two doublings. These costs round to zero compared to sign checks, so the validation is almost free.

The standard “compressed Edwards y” format encodes a point by the -coordinate and a sign bit indicating whether is nonnegative. In software, the cost of encoding and decoding are about the same, and dominated by taking an inverse square root. In circuits, the costs of encoding and decoding are also about the same, but they are instead dominated by a sign check that matches the sign of the recovered -coordinate with the supplied sign bit. This costs c. as above.

Comparison and discussion

This table considers only approximate costs.

Operationdecaf377Compressed Ed + Preimage
Decode (from bits)400C400C
Decode (from )750C325C
Encode (to bits)750C750C
Encode (to )750C325C

When decoding from or encoding to field elements, the marginal cost of Decaf compared to the compressed Edwards + cofactor preimage is an extra bit-unpacking and range check. While this effectively doubles the number of constraints, the marginal cost of c. is still small relative to other operations like a scalar multiplication, which at 6 constraints per bit is approximately .

When decoding from or encoding to bits, the marginal cost of Decaf disappears. When the input is already bit-constrained, Decaf’s first sign check can reuse the bit constraints, saving c. , but the compressed Edwards encoding must range-check the bits (which Decaf already does), costing c. extra. Similarly, in encoding, Decaf’s second sign check produces bit-constrained variables for free, while the compressed Edwards encoding spends c. bit-constraining and range-checking them.

However, in the software context, the prime-order validation check costs approximately 10x more than the cost of either encoding. Many applications require use of the embedded group both inside and outside of the circuit, and uses outside of the circuit may have additional resource constraints (for instance, a hardware token creating a signature authorizing delegated proving, etc.).

Performing validation as an additional, optional step also poses additional risks. While a specification may require it to be performed, implementations that skip the check will appear to work fine, and the history of invalid-point attacks (where implementations should, but don’t, check that point coordinates satisfy the curve equation) suggests that structuring validation as an integral part of encoding and decoding is a safer design. This may not be a concern for a specific application with a single, high-quality implementation that doesn’t make mistakes, but it’s less desirable for a general-purpose construction.

In summary, Decaf provides a construction that works the same way inside and outside of a circuit and integrates validation with the encoding, imposing only a modest cost for use in circuits and no costs for lightweight, software-only applications, making it a good choice for general-purpose constructions.

1

I have no idea whether this is common knowledge; I learned of this fact from its use in Mike Hamburg’s Ed448-Goldilocks implementation.

2

The value 73 is computed as:

from itertools import groupby

def cost(k):
  return min(k-1, 2)

def constraints(bound):
  costs = [cost(len(list(g))+1) for (c, g) in groupby(bound.bits()[:-1]) if c == 1]
  return sum(costs)

constraints(ZZ((q-1)/2))

as here.

Inverse Square Roots

As in the internet-draft, the decaf377 functions are defined in terms of the following function, which computes the square root of a ratio of field elements, with the special behavior that if the input is nonsquare, it returns the square root of a related field element, to allow reuse of the computation in the hash-to-group setting.

Define as a non-square in the field and sqrt_ratio_zeta(N,D) as:

  • (True, ) if and are nonzero, and is square;
  • (True, ) if is zero;
  • (False, ) if is zero and is non-zero;
  • (False, ) if and are nonzero, and is nonsquare.

Since is nonsquare, if is nonsquare, is square. Note that unlike the similar function in the ristretto255/decaf448 internet-draft, this function does not make any claims about the sign of its output.

To compute sqrt_ratio_zeta we use a table-based method adapted from Sarkar 2020 and zcash-pasta, which is described in the remainder of this section.

Constants

We set (the 2-adicity of the field) and odd such that . For the BLS12-377 scalar field, and .

We define where is a non-square as described above. We fix as 2841681278031794617739547238867782961338435681360110683443920362658525667816.

We then define a and such that . We also define a parameter where . For decaf377 we choose:

Precomputation

Lookup tables are needed which can be precomputed as they depend only on the 2-adicity and the choice of above.

lookup table:

We compute for and , indexed on and :

This table lets us look up powers of .

lookup table:

We compute for , indexed on :

We use this table in the procedure that follows to find (they are the values) in order to compute .

Procedure

In the following procedure, let . We use the following relations from Sarkar 2020:

  • Equation 1: and for and
  • Lemma 3: for
  • Equation 2:

In these expressions, and are field elements. are unsigned -bit integers. At each , the algorithm first computes , then and (from the previous step’s and ), then finally and , in each case such that they satisfy the above expressions. Note that in the algorithm .

Step 1: Compute

We compute . This corresponds to line 2 of the findSqRoot Algorithm 1 in Sarkar 2020.

Substituting :

Applying Fermat’s Little Theorem (i.e. ):

Substituting and rearranging:

We compute using a quantity we define as:

We also define:

And substitute and into which gives us:

We now can use in the computation for and :

Step 2: Compute

Compute using and as calculated in the prior step. This corresponds to line 4 of the findSqRoot Algorithm 1 in Sarkar 2020.

Step 3: Compute

We next compute for . This corresponds to line 5 of the findSqRoot Algorithm 1 in Sarkar 2020. This gives us the following components:

Step 4: Compute and

Next, we loop over . This corresponds to lines 6-9 of the findSqRoot Algorithm 1 in Sarkar 2020.

For

Using Lemma 3:

Substituting the definition of from equation 1:

Rearranging and substituting in (initial condition):

Substituting in equation 2:

This is almost in a form where we can look up in our s lookup table to get and thus . If we define we get:

Which we can use with our s lookup table to get . Expressing in terms of , we get .

For

First we compute using equation 1:

Next, similar to the first iteration, we use lemma 3 and substitute in and to yield:

In this expression we can compute the quantities on the left hand side, and the right hand side is in the form we expect for the s lookup table, yielding us . Note that here too we define such that the s lookup table can be used. Expressing in terms of , we get .

For

The remaining iterations proceed similarly, yielding the following expressions:

Note for and the remaining iterations we do not require a trick (i.e. where ) to get in a form where it can be used with the s lookup table. In the following expressions for , is always even, and so the high bit of each value is unchanged when adding .

At the end of this step, we have found and for .

Step 5: Return result

Next, we can use equation 1 to compute using and from the previous step:

This matches the expression from Lemma 4 in Sarkar 2020.

Next, to compute , we lookup entries in the g lookup table. To do so, we can decompose into:

then is computed as:

Multiplying in from step 1, we compute:

This corresponds to line 10 of the findSqRoot Algorithm 1 in Sarkar 2020.

In the non-square case, will be odd, and will be odd. We will have computed and multiply by a correction to get our desired output.

We can use the result of this computation to determine whether or not the square exists, recalling from Step 1 that :

  • If is square, then , and
  • If is non-square, then and .

Decoding

Decoding to a point works as follows where and are the curve parameters as described here.

  1. Decode s_bytes to a field element . We interpret these bytes as unsigned little-endian bytes. We check if the length has 32 bytes, where the top 3 bits of the last byte are 0. The 32 bytes are verified to be canonical, and rejected if not (if the input is already a field element in the circuit case, skip this step).

  2. Check that is nonnegative, or reject (sign check 1).

  3. .

  4. .

  5. (was_square, v) = sqrt_ratio_zeta(1, u_2 * u_1^2), rejecting if was_square is false.

  6. if is negative (sign check 2)1.

  7. .

The resulting coordinates are the affine Edwards coordinates of an internal representative of the group element.

1

Note this differs from the Decaf paper in Appendix A.2, but implementations of decaf377 should follow the convention described here.

Encoding

Given a representative in extended coordinates , encoding works as follows where and are the curve parameters as described here.

  1. .

  2. (_ignored, v) = sqrt_ratio_zeta(1, u_1 * (a - d) * X^2).

  3. (sign check 1).

  4. .

  5. .

  6. Set s_bytes to be the canonical unsigned little-endian encoding of , which is an integer mod . s_bytes has extra 0x00 bytes appended to reach a length of 32 bytes.

Group Hash

Elligator can be applied to map a field element to a curve point. The map can be applied once to derive a curve point suitable for use with computational Diffie-Hellman (CDH) challenges, and twice to derive a curve point indistinguishable from random.

In the following section, and are the curve parameters as described here. is a constant and sqrt_ratio_zeta(v_1,v_2) is a function, both defined in the Inverse Square Roots section.

The Elligator map is applied as follows to a field element :

  1. .

  2. .

  3. .

  4. sqrt_ratio_zeta where is a boolean indicating whether or not a square root exists for the provided input.

  5. If a square root for does not exist, then and . Else, and is unchanged.

  6. .

  7. .

  8. If ( and is true) or ( and is false) then .

The Jacobi quartic representation of the resulting point is given by . The resulting point can be converted from its Jacobi quartic representation to extended projective coordinates via:

For single-width hash-to-group (encode_to_curve), we apply the above map once. For double-width (hash_to_curve) we apply the map to two field elements and add the resulting curve points.

Test Vectors

Small generator multiples

This table has hex-encodings of small multiples of the generator :

ElementHex encoding
0000000000000000000000000000000000000000000000000000000000000000
0800000000000000000000000000000000000000000000000000000000000000
b2ecf9b9082d6306538be73b0d6ee741141f3222152da78685d6596efc8c1506
2ebd42dd3a2307083c834e79fb9e787e352dd33e0d719f86ae4adb02fe382409
6acd327d70f9588fac373d165f4d9d5300510274dffdfdf2bf0955acd78da50d
460f913e516441c286d95dd30b0a2d2bf14264f325528b06455d7cb93ba13a0b
ec8798bcbb3bf29329549d769f89cf7993e15e2c68ec7aa2a956edf5ec62ae07
48b01e513dd37d94c3b48940dc133b92ccba7f546e99d3fc2e602d284f609f00
a4e85dddd19c80ecf5ef10b9d27b6626ac1a4f90bd10d263c717ecce4da6570a
1a8fea8cbfbc91236d8c7924e3e7e617f9dd544b710ee83827737fe8dc63ae00
0a0f86eaac0c1af30eb138467c49381edb2808904c81a4b81d2b02a2d7816006
588125a8f4e2bab8d16affc4ca60c5f64b50d38d2bb053148021631f72e99b06
f43f4cefbe7326eaab1584722b1b4860de554b23a14490a03f3fd63a089add0b
76c739a33ffd15cf6554a8e705dc573f26490b64de0c5bd4e4ac75ed5af8e60b
200136952d18d3f6c70347032ba3fef4f60c240d706be2950b4f42f1a7087705
bcb0f922df1c7aa9579394020187a2e19e2d8073452c6ab9b0c4b052aa50f505

Hash-to-group

This table has input field elements along with the affine coordinates of the output point after applying the elligator map once:

Input field element output point
2873166235834220037104482467644394559952202754715866736878534498814378075613(1267955849280145133999011095767946180059440909377398529682813961428156596086, 5356565093348124788258444273601808083900527100008973995409157974880178412098)
7664634080946480262422274939177258683377350652451958930279692300451854076695(1502379126429822955521756759528876454108853047288874182661923263559139887582, 7074060208122316523843780248565740332109149189893811936352820920606931717751)
707087697291448463178823336344479808196630248514167087002061771344499604401(2943006201157313879823661217587757631000260143892726691725524748591717287835, 4988568968545687084099497807398918406354768651099165603393269329811556860241)
4040687156656275865790182426684295234932961916167736272791705576788972921292(2893226299356126359042735859950249532894422276065676168505232431940642875576, 5540423804567408742733533031617546054084724133604190833318816134173899774745)
6012393175004325154204026250961812614679561282637871416475605431319079196219(2950911977149336430054248283274523588551527495862004038190631992225597951816, 4487595759841081228081250163499667279979722963517149877172642608282938805393)
7255180635786717958849398836099816771666363291918359850790043721721417277258(3318574188155535806336376903248065799756521242795466350457330678746659358665, 7706453242502782485686954136003233626318476373744684895503194201695334921001)
6609366864829739556945402594963920739176902000316365292959221199804402230199(3753408652523927772367064460787503971543824818235418436841486337042861871179, 2820605049615187268236268737743168629279853653807906481532750947771625104256)
6875465950337820928985371259904709015074922314668494500948688901607284806973(7803875556376973796629423752730968724982795310878526731231718944925551226171, 7033839813997913565841973681083930410776455889380940679209912201081069572111)

Randomizable Signatures

Penumbra’s signatures are provided by decaf377-rdsa, a variant of the Zcash RedDSA construction instantiated using the decaf377 group. These are Schnorr signatures, with two additional properties relevant to Penumbra:

  1. They support randomization of signing and verification keys. Spending a note requires use of the signing key that controls its spend authorization, but if the same spend verification key were included in multiple transactions, they would be linkable. Instead, both the signing and verification keys are kept secret1, and each spend description includes a randomization of the verification key, together with a proof that the randomized verification key was derived from the correct spend verification key.

  2. They support addition and subtraction of signing and verification keys. This property is used for binding signatures, which bind zero-knowledge proofs to the transaction they were intended for and enforce conservation of value.

decaf377-rdsa

Let be the decaf377 group of prime order . Keys and signatures are parameterized by a domain . Each domain has an associated generator . Currently, there are two defined domains: and . The hash function is instantiated by using blake2b with the personalization string decaf377-rdsa---, treating the 64-byte output as the little-endian encoding of an integer, and reducing that integer modulo .

A signing key is a scalar . The corresponding verification key is the group element .

Sign

On input message m with signing key , verification key , and domain :

  1. Generate 80 random bytes.
  2. Compute the nonce as .
  3. Commit to the nonce as .
  4. Compute the challenge as .
  5. Compute the response as .
  6. Output the signature R_bytes || s_bytes.
Verify

On input message m, verification key A_bytes, and signature sig_bytes:

  1. Parse from A_bytes, or fail if A_bytes is not a valid (hence canonical) decaf377 encoding.
  2. Parse sig_bytes as R_bytes || s_bytes and the components as and , or fail if they are not valid (hence canonical) encodings.
  3. Recompute the challenge as .
  4. Check the verification equation , rejecting the signature if it is not satisfied.

SpendAuth signatures

The first signature domain used in Penumbra is for spend authorization signatures. The basepoint is the conventional decaf377 basepoint 0x0800000....

Spend authorization signatures support randomization:

Randomize.SigningKey

Given a randomizer , the randomized signing key is .

Randomize.VerificationKey

Given a randomizer , the randomized verification key is .

Randomizing a signing key and then deriving the verification key associated to the randomized signing key gives the same result as randomizing the original verification key (with the same randomizer).

Binding signatures

The second signature domain used in Penumbra is for binding signatures. The basepoint is the result of converting blake2b(b"decaf377-rdsa-binding") to an element and applying decaf377’s CDH encode-to-curve method.

Since the verification key corresponding to the signing key is , adding and subtracting signing and verification keys commutes with derivation of the verification key, as desired.

1

This situation is a good example of why it’s better to avoid the terms “public key” and “private key”, and prefer more precise terminology that names keys according to the cryptographic capability they represent, rather than an attribute of how they’re commonly used. In this example, the verification key should not be public, since it could link different transactions.

Simple example: Binding signature

Let’s say we have two actions in a transaction: one spend (indicated with subscript ) and one output (indicated with subscript ).

The balance commitments for those actions are:

where and are generators, are the blinding factors, and are the values.

When the signer is computing the binding signature, they have the blinding factors for all commitments.

They derive the signing key by adding up the blinding factors based on that action’s contribution to the balance:

The signer compute the binding signature using this key .

When the verifier is checking the signature, they add up the balance commitments to derive the verification key based on their contribution to the balance:

If the transaction is valid, then the first term on the LHS () is zero since for Penumbra all transactions should have zero value balance.

This leaves the verifier with the verification key:

If the value balance is not zero, the verifier will not be able to compute the verification key with the data in the transaction.

Key Agreement

Poseidon for BLS12-377

The Poseidon hash function is a cryptographic hash function that operates natively over prime fields. This allows the hash function to be used efficiently in the context of a SNARK. In the sections that follow we describe our instantiation of Poseidon over BLS12-377.

Overview of the Poseidon Permutation

This section describes the Poseidon permutation. It consists of rounds, where each round has the following steps:

  • AddRoundConstants: where constants (denoted by arc in the code) are added to the internal state,
  • SubWords: where the S-box is applied to the internal state,
  • MixLayer: where a matrix is multiplied with the internal state.

The total number of rounds we denote by . There are two types of round in the Poseidon construction, partial and full. We denote the number of partial and full rounds by and respectively.

In a full round in the SubWords layer the S-box is applied to each element of the internal state, as shown in the diagram below:

  ┌───────────────────────────────────────────────────────────┐
  │                                                           │
  │                     AddRoundConstants                     │
  │                                                           │
  └────────────┬──────────┬──────────┬──────────┬─────────────┘
               │          │          │          │              
             ┌─▼─┐      ┌─▼─┐      ┌─▼─┐      ┌─▼─┐            
             │ S │      │ S │      │ S │      │ S │            
             └─┬─┘      └─┬─┘      └─┬─┘      └─┬─┘            
               │          │          │          │              
  ┌────────────▼──────────▼──────────▼──────────▼─────────────┐
  │                                                           │
  │                         MixLayer                          │
  │                                                           │
  └────────────┬──────────┬──────────┬──────────┬─────────────┘
               │          │          │          │              
               ▼          ▼          ▼          ▼              

In a partial round, in the SubWords layer we apply the S-box only to one element of the internal state, as shown in the diagram below:

               │          │          │          │              
               │          │          │          │              
  ┌────────────▼──────────▼──────────▼──────────▼─────────────┐
  │                                                           │
  │                     AddRoundConstants                     │
  │                                                           │
  └────────────┬──────────────────────────────────────────────┘
               │                                               
             ┌─▼─┐                                             
             │ S │                                             
             └─┬─┘                                             
               │                                               
  ┌────────────▼──────────────────────────────────────────────┐
  │                                                           │
  │                         MixLayer                          │
  │                                                           │
  └────────────┬──────────┬──────────┬──────────┬─────────────┘
               │          │          │          │              
               ▼          ▼          ▼          ▼              

We apply half the full rounds () first, then we apply the partial rounds, then the rest of the full rounds. This is called the HADES design strategy in the literature.

Poseidon Parameter Generation

The problem of Poseidon parameter generation is to pick secure choices for the parameters used in the permutation given the field, desired security level in bits, as well as the width of the hash function one wants to instantiate (i.e. 1:1 hash, 2:1 hash, etc.).

Poseidon parameters consist of:

  • Choice of S-box: choosing the exponent for the S-box layer where ,
  • Round numbers: the numbers of partial and full rounds,
  • Round constants: the constants to be added in the AddRoundConstants step,
  • MDS Matrix: generating a Maximum Distance Separable (MDS) matrix to use in the linear layer, where we multiply this matrix by the internal state.

Appendix B of the Poseidon paper provides sample implementations of both the Poseidon permutation as well as parameter generation. There is a Python script called calc_round_numbers.py which provides the round numbers given the security level , the width of the hash function , as well as the choice of used in the S-box step. There is also a Sage script, which generates the round numbers, constants, and MDS matrix, given the security level , the width of the hash function , as well as the choice of used in the S-box step.

Since the publication of the Poseidon paper, others have edited these scripts, resulting in a number of implementations being in use derived from these initial scripts. We elected to implement Poseidon parameter generation in Rust from the paper, checking each step, and additionally automating the S-box parameter selection step such that one can provide only the modulus of a prime field and the best will be selected.

Below we describe where we deviate from the parameter selection procedure described in the text of the Poseidon paper.

Choice of S-Box

The Poseidon paper focuses on the cases where , as well as BLS12-381 and BN254. For a choice of positive , it must satisfy , where is the prime modulus.

For our use of Poseidon on BLS12-377, we wanted to generate a procedure for selecting the optimal for a general curve, which we describe below.

We prefer small, positive for software (non-circuit) performance, only using when we are unable to find an appropriate positive . For positive , the number of constraints in an arithmetic circuit per S-Box is equal to its depth in a tree of shortest addition chains. For a given number of constraints (i.e. at a given depth in the tree), we pick the largest at that level that meets the GCD requirement. Since larger provides more security, choosing the largest at a given depth reduces the round requirement.

The procedure in detail:

Shortest addition chains

We proceed down the tree from depth 2 to depth 5 (where depth 0 is the root of 1):

  1. At a given depth, proceed from largest number to smaller numbers.
  2. For a given element, check if is satisfied. If yes, we choose it, else continue.

If we get through these checks to depth of 5 without finding a positive exponent for , then we pick , which is well-studied in the original Poseidon paper.

For decaf377, following this procedure we end up with .

Round Numbers

We implement the round numbers as described in the original paper. These are the number of rounds necessary to resist known attacks in the literature, plus a security margin of +2 full rounds, and +7.5% partial rounds.

We test our round number calculations with tests from Appendices G and H from the paper which contain concrete instantiations of Poseidon for and their round numbers.

Round Constants

We do not use the Grain LFSR for generating pseudorandom numbers as described in Appendix F of the original paper. Instead, we use a Merlin transcript to enable parameter generation to be fully deterministic and easily reproducible.

We first append the message "poseidon-paramgen" to the transcript using label dom-sep.

We then bind this transcript to the input parameter choices:

  • the width of the hash function using label t,
  • the security level using label M, and
  • the modulus of the prime field using label p.

We then also bind the transcript to the specific instance, as done with the Grain LFSR in Appendix F, so we bind to:

  • the number of full rounds using label r_F,
  • the number of partial rounds using label r_P, and
  • the choice of S-box exponent using label alpha.

We generate random field elements from hashes of this complete transcript of all the input parameters and the derived parameters , , and .

Each round constant is generated by obtaining challenge bytes from the Merlin transcript, derived using label round-constant. We obtain bytes where is the field size in bits. These bytes are then interpreted as an unsigned little-endian integer reduced modulo the field.

MDS Matrix

We generate MDS matrices using the Cauchy method. However instead of randomly sampling the field as described in the text, we deterministically generate vectors and as:

Each element of the matrix is then constructed as:

where .

This deterministic matrix generation method has been verified to be safe over the base field of decaf377, using algorithms 1-3 described in Grassi, Rechberger and Schofnegger 2020 over the t range 1-100.

The parameters that are used for poseidon377 are located in the params.rs module of the source code. The parameters were generated on an Apple M1.

Test Vectors

The following are test vectors for the poseidon377 Poseidon instantiation.

Each section is for a given rate of a fixed-width hash function, where capacity is 1. Inputs and output are elements. The domain separator used in each case are the bytes "Penumbra_TestVec" decoded to an element, where we interpret these bytes in little-endian order.

Rate 1

Input element:

  • 7553885614632219548127688026174585776320152166623257619763178041781456016062

Output element:

  • 2337838243217876174544784248400816541933405738836087430664765452605435675740

Rate 2

Input elements:

  • 7553885614632219548127688026174585776320152166623257619763178041781456016062
  • 2337838243217876174544784248400816541933405738836087430664765452605435675740

Output element:

  • 4318449279293553393006719276941638490334729643330833590842693275258805886300

Rate 3

Input elements:

  • 7553885614632219548127688026174585776320152166623257619763178041781456016062
  • 2337838243217876174544784248400816541933405738836087430664765452605435675740
  • 4318449279293553393006719276941638490334729643330833590842693275258805886300

Output element:

  • 2884734248868891876687246055367204388444877057000108043377667455104051576315

Rate 4

Input elements:

  • 7553885614632219548127688026174585776320152166623257619763178041781456016062
  • 2337838243217876174544784248400816541933405738836087430664765452605435675740
  • 4318449279293553393006719276941638490334729643330833590842693275258805886300
  • 2884734248868891876687246055367204388444877057000108043377667455104051576315

Output element:

  • 5235431038142849831913898188189800916077016298531443239266169457588889298166

Rate 5

Input elements:

  • 7553885614632219548127688026174585776320152166623257619763178041781456016062
  • 2337838243217876174544784248400816541933405738836087430664765452605435675740
  • 4318449279293553393006719276941638490334729643330833590842693275258805886300
  • 2884734248868891876687246055367204388444877057000108043377667455104051576315
  • 5235431038142849831913898188189800916077016298531443239266169457588889298166

Output element:

  • 66948599770858083122195578203282720327054804952637730715402418442993895152

Rate 6

Input elements:

  • 7553885614632219548127688026174585776320152166623257619763178041781456016062
  • 2337838243217876174544784248400816541933405738836087430664765452605435675740
  • 4318449279293553393006719276941638490334729643330833590842693275258805886300
  • 2884734248868891876687246055367204388444877057000108043377667455104051576315
  • 5235431038142849831913898188189800916077016298531443239266169457588889298166
  • 66948599770858083122195578203282720327054804952637730715402418442993895152

Output element:

  • 6797655301930638258044003960605211404784492298673033525596396177265014216269

Fuzzy Message Detection

By design, privacy-preserving blockchains like Penumbra don’t reveal metadata about the sender or receiver of a transaction. However, this means that users must scan the entire chain to determine which transactions relate to their addresses. This imposes large bandwidth and latency costs on users who do not maintain online replicas of the chain state, as they must “catch up” each time they come online by scanning all transactions that have occurred since their last activity.

Alternatively, users could delegate scanning to a third party, who would monitor updates to the chain state on their behalf and forward only a subset of those updates to the user. This is possible using viewing keys, as in Zcash, but viewing keys represent the capability to view all activity related to a particular address, so they can only be delegated to trusted third parties.

Instead, it would be useful to be able to delegate only a probabilistic detection capability. Analogous to a Bloom filter, this would allow a detector to identify all transactions related to a particular address (no false negatives), while also identifying unrelated transactions with some false positive probability. Unlike viewing capability, detection capability would not include the ability to view the details of a transaction, only a probabilistic association with a particular address. This is the problem of fuzzy message detection (FMD), analyzed by Beck, Len, Miers, and Green in their paper Fuzzy Message Detection, which proposes a cryptographic definition of fuzzy message detection and three potential constructions.

This section explores how Penumbra could make use of fuzzy message detection:

  • In Sender and Receiver FMD, we propose a generalization of the original definition where the false positive probability is set by the sender instead of the receiver, and discusses why this is useful.

  • In Constructing S-FMD, we realize the new definition using a variant of one of the original FMD constructions, and extend it in two ways:

    1. to support arbitrarily precise detection with compact, constant-size keys;
    2. to support diversified detection, allowing multiple, publicly unlinkable addresses to be scanned by a single detection key.

Unfortunately, these extensions are not mutually compatible, so we only use the first one, and record the second for posterity.

Acknowledgements

Thanks to George Tankersley and Sarah Jamie Lewis for discussions on this topic (and each independently suggesting the modifications to realize S-FMD), to Gabrielle Beck for discussions about the paper and ideas about statistical attacks, and to Guillermo Angeris for pointers on analyzing information disclosure.

Sender and Receiver FMD

The goal of detection capability is to be able to filter the global stream of state updates into a local stream of state updates that includes all updates related to a particular address, without identifying precisely which updates those are. The rate of updates on this filtered stream should be bounded below, to ensure that there is a large enough anonymity set, and bounded above, so that users processing the stream have a constant and manageable amount of work to process it and catch up with the current chain state. This means that the detection precision must be adaptive to the global message rates: if the false positive rate is too low, the filtered stream will have too few messages, and if it is too high, it will have too many messages.

However, this isn’t possible using the paper’s original definition of fuzzy message detection, because the false positive probability is chosen by the receiver, who has no way to know in advance what probability will produce a correctly sized stream of messages. One way to address this problem is to rename the original FMD definition as Receiver FMD (R-FMD), and tweak it to obtain Sender FMD (S-FMD), in which the sender chooses the detection probability.

Receiver FMD

The paper’s original definition of fuzzy message detection is as a tuple of algorithms (KeyGen, CreateClue, Extract, Examine).1 The receiver uses KeyGen to generate a root key and a clue key. A sender uses the receiver’s clue key as input to CreateClue to produce a clue. The Extract algorithm takes the root key and a false positive rate (chosen from some set of supported rates), and produces a detection key. The Examine algorithm uses a detection key to examine a clue and produce a detection result.

This scheme should satisfy certain properties, formalizations of which can be found in the paper:

Correctness.

Valid matches must always be detected by Examine; i.e., there are no false negatives.

Fuzziness.

Invalid matches should produce false positives with probability approximately , as long as the clues and detection keys were honestly generated.

Detection Ambiguity.

An adversarial detector must be unable to distinguish between a true positive and a false positive, as long as the clues and detection keys were honestly generated.

In this original definition, the receiver has control over how much detection precision they delegate to a third party, because they choose the false positive probability when they extract a detection key from their root key. This fits with the idea of attenuating credentials, and intuitively, it seems correct that the receiver should control how much information they reveal to their detector. But the information revealed to their detector is determined by both the false positive probability and the amount of other messages that can function as cover traffic. Without knowing the extent of other activity on the system, the receiver has no way to make a principled choice of the detection precision to delegate.

Sender FMD

To address this problem, we generalize the original definition (now Receiver FMD) to Sender FMD, in which the false positive probability is chosen by the sender.

S-FMD consists of a tuple of algorithms (KeyGen, CreateClue, Examine). Like R-FMD, CreateClue creates a clue and Examine takes a detection key and a clue and produces a detection result. As discussed in the next section, S-FMD can be realized using tweaks to either of the R-FMD constructions in the original paper.

Unlike R-FMD, the false positive rate is set by the sender, so CreateClue takes both the false positive rate and the receiver’s clue key. Because the false positive rate is set by the sender, there is no separation of capability between the root key and a detection key, so KeyGen outputs a clue key and a detection key, and Extract disappears.

In R-FMD, flag ciphertexts are universal with respect to the false positive rate, which is applied to the detection key; in S-FMD, the false positive rate is applied to the flag ciphertext and the detection key is universal.

Unlike R-FMD, S-FMD allows detection precision to be adaptive, by having senders use a (consensus-determined) false positive parameter. This parameter should vary as the global message rates vary, so that filtered message streams have a bounded rate, and it should be the same for all users, so that messages cannot be distinguished by their false positive rate.

1

We change terminology from the FMD paper; the paper calls detection and clue keys the secret and public keys respectively, but we avoid this in favor of capability-based terminology that names keys according to the precise capability they allow. The “clue” terminology is adopted from the Oblivious Message Retrieval paper of Zeyu Liu and Eran Tromer; we CreateClue and Examine clues rather than Flag and Test flag ciphertexts.

Constructing S-FMD

The original FMD paper provides three constructions of R-FMD. The first two realize functionality for restricted false-positive probabilities of the form ; the third supports arbitrary fractional probabilities using a much more complex and expensive construction.

The first two schemes, R-FMD1 and R-FMD2, are constructed similarly to a Bloom filter: the CreateClue procedure encrypts a number of 1 bits, and the Examine procedure uses information in a detection key to check whether some subset of them are 1, returning true if so and false otherwise. The false positive probability is controlled by extracting only a subset of the information in the root key into the detection key, so that it can only check a subset of the bits encoded in the clue. We focus on R-FMD2, which provides more compact ciphertexts and chosen-ciphertext security.

In this section, we:

  • recall the construction of R-FMD2, changing notation from the paper;
  • adapt R-FMD2 to S-FMD2, which provides sender FMD instead of receiver FMD;
  • extend S-FMD2 to use deterministic derivation, allowing 32-byte flag and detection keys to support arbitrary-precision false positive probabilities;
  • extend S-FMD2 to support diversified detection, allowing multiple, unlinkable flag keys to be detected by a single detection key (though, as this extension conflicts with the preceding one, we do not use it);
  • summarize the resulting construction and describe how it can be integrated with a Sapling- or Orchard-style key hierarchy.

The R-FMD2 construction

First, we recall the paper’s construction of R-FMD2, changing from multiplicative to additive notation and adjusting variable names to be consistent with future extensions.

The construction supports restricted false positive values for , a global parameter determining the minimum false positive rate. is a group of prime order and and are hash functions.

R-FMD2/KeyGen

Choose a generator . For , choose and compute . Return the root key and the clue key .

R-FMD2/Extract

On input and root key , parse and return as the detection key.

R-FMD2/Flag

On input , first parse , then proceed as follows:

  1. Choose and compute .
  2. Choose and compute .
  3. For each , compute
    1. a key bit ;
    2. a ciphertext bit .
  4. Compute .
  5. Compute .

Return the clue .

R-FMD2/Examine

On input , , first parse , , then proceed as follows:

  1. Compute .
  2. Recompute as .
  3. For each , compute
    1. a key bit ;
    2. a plaintext bit .

If all plaintext bits , return (match); otherwise, return .

To see that the Test procedure detects true positives, first note that since and , so the detector will recompute the used by the sender; then, since , the detector’s is the same as the sender’s .

On the other hand, if the clue was created with a different clue key, the resulting will be random bits, giving the desired false positive probability.

One way to understand the components of the construction is as follows: the message consists of bits of a Bloom filter, each encrypted using hashed ElGamal with a common nonce and nonce commitment . The points are included in the ElGamal hash, ensuring that any change to either will result in a random bit on decryption. The pair act as a public key and signature for a one-time signature on the ciphertext as follows: the sender constructs the point as the output of a chameleon hash with basis on input , then uses the hash trapdoor (i.e., their knowledge of the discrete log relation ) to compute a collision involving , a hash of the rest of the ciphertext. The collision acts as a one-time signature with public key .

The fact that the generator was selected at random in each key generation is not used by the construction and doesn’t seem to play a role in the security analysis; in fact, the security proof in Appendix F has the security game operate with a common generator. In what follows, we instead assume that is a global parameter.

From R-FMD2 to S-FMD2

Changing this construction to the S-FMD model is fairly straightforward: rather than having the sender encrypt bits of the Bloom filter and only allow the detector to decrypt of them, we have the sender only encrypt bits of the Bloom filter and allow the detector to decrypt all potential bits. As noted in the previous section, this means that in the S-FMD context, there is no separation of capability between the root key and the detection key. For this reason, we skip the Extract algorithm and (conceptually) merge the root key into the detection key.

The S-FMD2 construction then works as follows:

S-FMD2/KeyGen

For , choose and compute . Return the detection key and the clue key .

S-FMD2/CreateClue

On input clue key , first parse , then proceed as follows:

  1. Choose and compute .
  2. Choose and compute .
  3. For each , compute
    1. a key bit ;
    2. a ciphertext bit .
  4. Compute .
  5. Compute .

Return the clue . (We include explicitly rather than have it specified implicitly by to reduce the risk of implementation confusion).

S-FMD2/Examine

On input detection key and clue , first parse and , then proceed as follows:

  1. Compute .
  2. Recompute as .
  3. For each , compute
    1. a key bit ;
    2. a plaintext bit .

If all plaintext bits , return (match); otherwise, return .

The value of is included in the ciphertext hash to ensure that the encoding of the ciphertext bits is non-malleable. Otherwise, the construction is identical.

Compact clue and detection keys

One obstacle to FMD integration is the size of the clue keys. Clue keys are required to create clues, so senders who wish to create clues need to obtain the receiver’s clue key. Ideally, the clue keys would be bundled with other address data, so that clues can be included with all messages, rather than being an opt-in mechanism with a much more limited anonymity set.

However, the size of clue keys makes this difficult. A clue key supporting false positive probabilities down to requires group elements; for instance, supporting probabilities down to with a 256-bit group requires 448-byte flag keys. It would be much more convenient to have smaller keys.

One way to do this is to use a deterministic key derivation mechanism similar to BIP32 to derive a sequence of child keypairs from a single parent keypair, and use that sequence as the components of a flag / detection keypair. To do this, we define a hash function . Then given a parent keypair , child keys can be derived as with . Unlike BIP32, there is only one sequence of keypairs, so there is no need for a chain code to namespace different levels of derivation, although of course the hash function should be domain-separated.

Applying this to S-FMD2, the scalar becomes the detection key, and the point becomes the clue key. The former detection key is renamed the expanded detection key , and the former clue key is renamed the expanded clue key ; these are derived from the detection and clue keys respectively.

In addition, it is no longer necessary to select the minimum false positive probability as a global parameter prior to key generation, as the compact keys can be extended to arbitrary length (and thus arbitrary false positive probabilities).

Deterministic derivation is only possible in the S-FMD setting, not the R-FMD setting, because S-FMD does not require any separation of capability between the component keys used for each bit of the Bloom filter. Public deterministic derivation does not allow separation of capability: given and any child secret , it’s possible to recover the parent secret and therefore the entire subtree of secret keys. Thus, it cannot be used for R-FMD, where the detection key is created by disclosing only some of the bits of the root key.

Diversified detection: from S-FMD2 to S-FMD2-d

The Sapling design provides viewing keys that support diversified addresses: a collection of publicly unlinkable addresses whose activity can be scanned by a common viewing key. For integration with Sapling, as well as other applications, it would be useful to support diversified detection: a collection of publicly unlinkable diversified clue keys that share a common detection key. This collection is indexed by data called a diversifier; each choice of diversifier gives a different diversified clue key corresponding to the same detection key.

In the Sapling design, diversified addresses are implemented by selecting the basepoint of the key agreement protocol as the output of a group-valued hash function whose input is the diversifier. Because the key agreement mechanism only makes use of the secret key (the incoming viewing key) and the counterparty’s ephemeral public key, decryption is independent of the basepoint, and hence the choice of diversifier, allowing a common viewing key to decrypt messages sent to any member of its family of diversified transmission (public) keys.

A similar approach can be applied to S-FMD2, but some modifications to the tag mechanism are required to avoid use of the basepoint in the Test procedure. Unfortunately, this extension is not compatible with compact clue and detection keys, so although it is presented here for posterity, it is not used in Penumbra.

Let be the set of diversifiers, and let be a group-valued hash function. At a high level, our goal is to replace use of a common basepoint with a diversified basepoint for some .

Our goal is to have diversified clue keys, so we first replace by in KeyGen, so that and the components of the clue key are parameterized by the diversifier . The sender will compute the key bits as and the receiver computes them as If instead of , then and the middle component will match.

But then the sender needs to construct as in order to have a known discrete log relation between and . This is a problem, because is recomputed by the receiver, but if the receiver must recompute it as , detection is bound to the choice of diversifier, and we need detection to be independent of the choice of diversifier.

The root of this problem is that the chameleon hash uses basis , so the receiver’s computation involves the basepoint . To avoid it, we can use a basis , where . is used in place of , but is computed as , i.e., as the chameleon hash . The sender’s collision then becomes , allowing the detector to recompute as The point must be included in the clue, increasing its size, but detection no longer involves use of a basepoint (diversified or otherwise).

Concretely, this mechanism works as follows, splitting out generation of clue keys (diversifier-dependent) from generation of detection keys:

S-FMD2-d/KeyGen

For , choose , and return the detection key .

S-FMD2-d/Diversify

On input detection key and diversifier , first parse . Then compute the diversified base , and use it to compute . Return the diversified clue key .

S-FMD2-d/CreateClue

On input diversified clue key and diversifier , first parse , then proceed as follows:

  1. Compute the diversified base .
  2. Choose and compute .
  3. Choose and compute .
  4. Choose and compute .
  5. For each , compute
    1. a key bit ;
    2. a ciphertext bit .
  6. Compute .
  7. Compute .

Return the clue .

S-FMD2-d/Examine

On input detection key and clue , first parse and , then proceed as follows:

  1. Compute .
  2. Recompute as .
  3. For each , compute
    1. a key bit ;
    2. a plaintext bit .

If all plaintext bits , return (match); otherwise, return .

Unfortunately, this extension does not seem to be possible to combine with the compact clue and detection keys extension described above. The detection key components must be derived from the parent secret and (the hash of) public data, but diversified detection requires that different public data (clue keys) have the same detection key components.

Of the two extensions, compact clue keys are much more important, because they allow clue keys to be included in an address, and this allows message detection to be a default behavior of the system, rather than an opt-in behavior of a select few users.

S-FMD Threat Model

Open vs Closed World Setting

The previous literature on FMD (e.g. Beck et al. 2021, Seres et al. 2021) focuses on a model we will call the closed world setting where:

  • A single untrusted server performs FMD on behalf of users. This server has all detection keys.
  • All users in the messaging system use FMD.

This is appropriate for a messaging application using a single centralized server where FMD is a requirement at the protocol-level.

However, Penumbra operates in what we will call the open world setting in which:

  • Multiple untrusted servers perform FMD on behalf of users, i.e. there is no single centralized detection server with all detection keys.
  • Not all users use FMD: in Penumbra FMD is opt-in. A fraction of Penumbra users will download all messages, and never provide detection keys to a third party.

A further difference in Penumbra is that the total number of distinct users is unknown. Each user can create multiple addresses, and they choose whether or not to derive a detection key for a given address.

Assumptions

We assume the FMD construction is secure and the properties of correctness, fuzziness (false positives are detected with rate ), and detection ambiguity (the server cannot distinguish between false and true positives) hold as described in the previous section.

All parties malicious or otherwise can access the public chain state, e.g. the current and historical global network activity rates and the current and historical network false positive rates.

Each detection server also observes which messages are detected, but only for the detection keys they have been given. No detection server has detection keys for all users, since only a subset of users opt-in to FMD.

A malicious detection server can passively compare the detected sets between users. They can also perform active attacks, sending additional messages to artificially increase global network volume, or to the user to increase the number of true positives. We consider these attacks in the next section. In the open world setting, multiple malicious detection servers may be attempting to boost traffic globally, or may target the same user. Malicious detection servers may collude, sharing the sets of detection keys they have in order to jointly maximize the information they learn about network activity.

We assume no detection servers have access to sender metadata, as would be the case if participants routed their traffic through a network privacy layer such as Tor.

A passive eavesdropper can observe the network traffic between recipient and detection server, and attempt to infer the number of messages they have downloaded. We assume the connection between the detection server and recipient is secured using HTTPS.

S-FMD in Penumbra

In the secions that follow we describe how S-FMD clue keys, detection keys, and clues are integrated into the Penumbra system.

Clue Keys

Each Penumbra diversified address includes as part of the encoded address an S-FMD clue key. This key can be used to derive a clue for a given output.

See the Addresses section for more details.

Detection Keys

Each Penumbra address has an associated S-FMD detection key. This key is what the user can optionally disclose to a third-party service for detection. Recall the detection key is used to examine each clue and return if there is a detection or not.

Clues

Each Penumbra transaction can have multiple outputs, to the same or different recipients. If a transaction contains outputs for the same recipient, then the FMD false positive rate will be if the detector uses the detection key that does not correspond to that recipient. To ensure that transactions are detected with false positive rate , we attach clues to transactions such that there is a single clue per recipient clue key per transaction.

In order not to leak the number of distinct recipients to a passive observer through the number of clues in a transaction, we add dummy clues to the transaction until there are an equal number of clues and outputs. A consensus rule verifies that all transactions have an equal number of clues and outputs.

A consensus rule verifies that clues in transactions have been generated using the appropriate precision, within a grace period of 10 blocks1. Transactions with clues generated using the incorrect precision are rejected by the network.

1

This imposes an bound on the latency of signing workflows.

Parameter Considerations

  • How should the false positive rate be determined? In some epoch, let be the false positive rate, be the total number of messages, be the number of true positives for some detection key, and be the number of detections for that detection key. Then and ideally should be chosen so that:

    1. is bounded above;
    2. When is within the range of “normal use”, is close enough to that it’s difficult for a detector to distinguish (what does this mean exactly?);
  • The notion of detection ambiguity only requires that true and false positives be ambiguous in isolation. In practice, however, a detector has additional context: the total number of messages, the number of detected messages, and the false positive probability. What’s the right notion in this context?

  • What happens when an adversary manipulates (diluting the global message stream) or (by sending extra messages to a target address)? There is some analogy here to flashlight attacks, although with the critical difference that flashlight attacks on decoy systems degrade privacy of the transactions themselves, whereas here the scope is limited to transaction detection.

Flow Encryption

NOTE: Flow Encryption will not ship in Penumbra V1, because ABCI 2.0 was delayed. Instead, flow contributions will be in the clear until a future network upgrade.

These notes are a work-in-progress and do not reflect the current protocol.

In Penumbra, we need some scheme to achieve privacy in a multi-party context, where knowledge from multiple participants is required. Traditional zero-knowledge proofs are not sufficient for this, since there is a global state that no individual participant can prove knowledge of if the global state is private. This is described in the Batching Flows section at length. To implement flow encryption, we need a few constructions:

Ideal Functionality

Flow encryption is ultimately “just” a form of additively homomorphic threshold encryption. However, usefully integrating that primitive into a ledger and byzantine consensus system requires a number of additional properties, and it’s useful to have a single name that refers to that complete bundle of properties.

In this section, we sketch what those properties are, and the algorithms that a flow encryption construction should provide, separately from our instantiation of flow encryption, eddy, described in the next section.

Participants

There are three participants in flow encryption:

  • the ledger, which is assumed to be a BFT broadcast mechanism that accepts messages based on some application-specific rules;
  • users, who encrypt their individual contributions to some flow of value and submit them to the ledger;
  • decryptors, who aggregate encryptions posted to the ledger and decrypt the batched flow.

In this description, we refer to decryptors, rather than validators, in order to more precisely name their role in flow encryption. In Penumbra, the decryptors are also the validators who perform BFT consensus to agree on the ledger state. However, this is not a requirement of the construction; one could also imagine a set of decryptors who jointly decrypted data posted to a common, external ledger such as Juno or Ethereum.

We include the ledger as a participant to highlight that a BFT broadcast mechanism is assumed to be available, and to explicitly define when communication between other participants happens via the ledger, and is thus already BFT, and when communication happens out-of-band, and must be made BFT.

Importantly, users are not required to have any online interactions or perform any protocols with any other participant. They must be able to encrypt and submit their contribution noninteractively. This is essential for the intended role of flow encryption in allowing private interaction with public shared state: the decryption protocol acts as a form of specialized, lightweight MPC, and the encryption algorithm acts as a way for users to delegate participation in that MPC to a set of decryptors, without imposing wider coordination requirements on the users of the system.

Algorithms and Protocols

FlowEnc/DistKeyGen

This is a multi-round protocol performed by the decryptors. On input a set of decryptors and a decryption threshold , this protocol outputs a common threshold encryption key , a private key share for each participant , and a public set of commitments to decryption shares . The exact details of the DKG, its number of rounds, etc., are left to the specific instantiation.

FlowEnc/Encrypt

This is an algorithm run by users. On input an encryption key and the opening to a Pedersen commitment , this algorithm outputs a ciphertext and a proof which establishes that is well-formed and is consistent, in the sense that it encrypts the same value committed to by .

We assume that all ciphertexts are submitted to the ledger, which verifies along with any other application-specific validity rules. These rules need to check, among other things, that is the correct value to encrypt, and the Pedersen commitment provides a flexible way to do so. The consistency property established by allows application-specific proof statements about (a commitment to) to extend to the ciphertext .

FlowEnc/Aggregate

This algorithm describes how to add ciphertexts to output the encryption of their sum.

We also assume that, because ciphertexts are posted to the ledger, all decryptors have a consistent view of available ciphertexts, and of the application-specific rules concerning which contributions should be batched together, over what time period, etc., so that they also have a consistent view of the aggregated ciphertext to decrypt.

In the case of same-block decryption, this assumption requires some care to integrate with the process for coming to consensus on the block containing transactions to batch and decrypt, but this is out-of-scope for flow encryption itself. See Flow Encryption and Consensus for details on this aspect in Penumbra specifically.

FlowEnc/PreDecrypt

On input ciphertext and key share , this algorithm outputs a decryption share and a decryption share integrity proof .

FlowEnc/Decrypt

On input a ciphertext and any set of decryption shares and proofs with , output if at least of decryption shares were valid.

Properties

We require the following properties of these algorithms:

Additive Homomorphism

Ciphertexts must be additively homomorphic:

Value Integrity

The flow encryption must ensure conservation of value, from the individual users’ contributions all the way to the decrypted batch total. Establishing value integrity proceeds in a number of steps:

  1. Application-specific logic proves that each user’s contribution value conserves value according to the application rules, by proving statements about the commitment to .
  2. The encryption proof extends integrity from to .
  3. Integrity extends to the aggregation automatically, since the aggregation can be publicly recomputed by anyone with access to the ledger.
  4. The decryption share integrity proofs extend integrity from to . This requires that, if is the result of decryption using valid decryption shares, than .
  5. Publication of (any) decryption transcript allows any participant to check the end-to-end property that for (application-)valid .
Decryption Robustness

The decryption process must succeed after receiving any valid decryption shares, so that any decryptor who can receive messages from honest decryptors must be able to compute the correct plaintext.

Unlike the DKG, where we do not impose a constraint on the number of rounds, we require that decryption succeed with only one round of communication. This allows us to integrate the decryption process with the consensus process, so that in the case where decryptors are validators, they can jointly learn the batched flow at the same time as they finalize a block and commit its state changes. This property is important to avoid requiring a pipelined execution model. For more details, see Flow Encryption and Consensus.

Note that we do not require that any specific subset of decryption shares is used to get the (unique) decryption result in FlowEnc/Decrypt. This permits a streaming implementation where all decryptors participate, but decryption completes for each participant as soon as they receive valid shares, ensuring that decryption is not bottlenecked on the slowest participant.

DKG Verifiability

The DKG must be verifiable: participants (decryptors) must be able to verify that counterparty participants (other decryptors) are contributing to the DKG honestly, without the use of a trusted dealer. This can be achieved using something similar to Feldman’s Verifiable Secret Sharing protocol, where each participant shares a commitment to their share which is visible to all other participants. In addition, our DKG must be able to tolerate rogue-key attacks: that is, it must tolerate the instance where a validator maliciously chooses their share based on the value of the other validator’s shares in order to cancel out other validator’s keyshares and gain unilateral control over the resulting DKG key. One way this can be prevented is by each validator producing a proof of knowledge of their secret share.

DKG Robustness

The DKG must have robustness. The DKG should be able to tolerate a byzantine threshold of decryptors intentionally refusing to participate in the DKG round, or intentionally contributing malformed shares during DKG execution, without requiring a full restart of the DKG protocol. This is due to DoS concerns: with a naive, non-robust implementation, a single malicious decryptor could potentially indefinitely delay the beginning of an epoch by refusing to participate in DKG or by contributing invalid shares.

The eddy construction

Distributed Key Generation

A prerequisite for flow encryption in Penumbra is some distributed key generation algorithm. Our threshold encryption scheme uses ElGamal and operates over the decaf377 group, thus we must choose a DKG which operates over decaf377 outputs a decaf377 public key.

Desired Properties

Minimal Round Complexity

The DKG must be run by all validators at the start of every epoch. Ideally, the DKG should be able to run in over a single block, such that there is no delay between the start of the epoch and when users can delegate in the staking system or trade on ZSwap. As such, for optimal UX and simplicity of implementation, the optimal DKG for our scheme should have minimal round complexity. Each round of communication can occur in the ABCI++ vote extension phase. Since Tendermint votes already impose an O(n^2) communication complexity for each block, minimal communication complexity is not important for our DKG.

Verifiability and Security Under Rogue-Key Attacks

In addition to minimal round complexity, our DKG must be verifiable: participants (validators) must be able to verify that counterparty participants (other validators) are contributing to the DKG honestly, without the use of a trusted dealer. This can be achieved using something similar to Feldman’s Verifiable Secret Sharing protocol, where each participant shares a commitment to their share which is visible to all other participants. In addition, our DKG must be able to tolerate rogue-key attacks: that is, it must tolerate the instance where a validator maliciously chooses their share based on the value of the other validator’s shares in order to cancel out other validator’s keyshares and gain unilateral control over the resulting DKG key. One way this can be prevented is by each validator producing a proof of knowledge of their secret share.

Robustness

An additional property that our DKG should have is that of robustness. The DKG should be able to tolerate a byzantine threshold of validators intentionally refusing to participate in the DKG round, or intentionally contributing malformed shares during DKG execution, without requiring a full restart of the DKG protocol. This is due to DoS concerns: with a naive, non-robust implementation, a single malicious validator could potentially indefinitely delay the beginning of an epoch by refusing to participate in DKG or by contributing invalid shares.

A Survey of Existing DKGs

Gurkan, Jovanovic, Maller, Meiklejohn, Stern, and Tomescu present a survey of DKG instantiations with their communication complexity, public verifiablity, round complexity, prover, and verifier costs:

Note that PV, public verifiability, is slightly different than our requirement of verifiability: public verifiability means that any non-participating observer must be able to verify the correct execution of the DKG.

Of the schemes listed, Fouque-Stern has the lowest round complexity, however the scheme uses Pallier which is not compatible with our encryption scheme. The scheme described in that paper uses pairings, and thus is also not compatible with our scheme. In addition, O(n*log*n) communication complexity is not important for our scheme, since our vote extensions already require n^2 communication.

In addition to these schemes, ETHDKG is in use by Anoma and Osmosis in Ferveo. However, this scheme is also not applicable to our threshold encryption scheme as it is based on pairings rather than a discrete log DKG.

FROST

Komlo and Goldberg introduced FROST: Flexible Round-Optimized Schnorr Threshold Signatures. FROST contains a DKG which is a slightly modified version of Pedersen’s DKG, modified to protect against rogue-key attacks. FROST DKG operates on any prime order group where Decisional Diffie-Hellman is difficult, and thus is compatible with decaf377.

Verifiability

FROST DKG fulfills the requirement of verifiability and security under rogue-key attacks. FROST DKG achieves this by slightly modifying Pedersen’s DKG to include a proof of knowledge of each participant’s secret in the first round of communication.

Robustness

FROST DKG trades off simplicity and efficiency in favor of robustness. A single participant can cause the DKG to abort by contributing an invalid share or by refusing to produce a key share. Gennaro, Jarecki, Krawczyk, and Tal Rabin presented an alternative instantiation of Pedersen’s DKG, which can tolerate up to n-t invalid or missing shares. This is accomplished by adding an additional complaint round, where participants can publish evidence of a participant’s invalid contribution and disqualify them from the DKG. A similar strategy could be added to FROST DKG to adapt it to be robust.

Round Complexity

By including an additional round for relaying complaints from each participant to each counterparty, the round complexity of our DKG rises to 3 rounds.

ABCI++ Implementation

A prerequisite for implementing any DKG scheme in Penumbra is ABCI++, the extension to Tendermint which adds additional hooks to the Tendermint state machine interface, most importantly vote extensions.

Implementing FROST DKG with ABCI++ can be done naively by placing each round of communication in the Vote Extensions phase of ABCI++. This gives an upper bound on the delay between the start of a new epoch and the completion of the DKG as 3 blocks: 1 block per vote extension communication round.

This could potentially be pipelined by including each participant’s commitments for the next epoch (the commitments from FROST) in the block data at one of the last blocks of the previous epoch. Round 2 of FROST could occur during the vote extensions phase of the epoch transition block.

NOTE: FROST requires private communication channels for round 2, thus we must assign validators encryption keys, perform key agreement, and encrypt round 2 communications before placing them in the vote extension. This can be accomplished using ECDH with decaf377, and an AEAD cipher (though authentication is not strictly required, since vote extensions are signed).

TODO:

[] concrete robust FROST implementation

Homomorphic Threshold Encryption

The core of the flow encryption system requires a partially homomorphic encryption scheme, which allows for users to publish transactions that contain encrypted values. These encrypted values are then aggregated, using the homomorphism, by validators. The aggregate value (the “batched flow”) is then decrypted using the threshold decryption scheme described here.

Desired Properties

For our threshold encryption scheme, we require three important properties:

  • Homomorphism: we must be able to operate over ciphertexts, by combining balance commitments from many participants into a batched value.
  • Verifiability: we must be able to verify that a given value was encrypted correctly to a given ciphertext
  • Robustness: up to validators must be permitted to either fail to provide a decryption share or provide in invalid decryption share.

Concrete Instantiation: Homomorphic ElGamal

Setup

Compute a lookup table for every by setting where is the basepoint of decaf377. Store for later use in value decryption.

Value Encryption

                                  ┌───────────────────┐                           
                                  │DKG Public Key (D) │                           
                                  │                   │                           
                                  └───────────────────┘                           
                                            │                                     
                                  ┌─────────▼─────────┐    ┌──────────────┐       
                                  │                   │    │     v_e      │       
                                  │                   │    │  (encrypted  │       
                                  │                   │    │    value)    │       
                  ┌──────────┐    │ElGamal Encryption │    │ ┌──────────┐ │       
               ┌─▶│v_0 (u16) │────▶   D: DKG Pubkey   ├────┼▶│   c_0    │ │       
               │  └──────────┘    │     M = v_i*G     │    │ └──────────┘ │       
               │  ┌──────────┐    │     e <- F_q      │    │ ┌──────────┐ │       
┌────────────┐ ├─▶│v_1 (u16) │────▶ c_i = (e*G, M+eD) ├────┼▶│   c_1    │ │       
│            │ │  └──────────┘    │                   │    │ └──────────┘ │       
│v [0, 2^64) │─┤  ┌──────────┐    │                   │    │ ┌──────────┐ │       
│            │ ├─▶│v_2 (u16) │────▶ Correctness Proof ├────┼▶│   c_2    │ │       
└────────────┘ │  └──────────┘    │         σ         │    │ └──────────┘ │       
               │  ┌──────────┐    │                   │    │ ┌──────────┐ │       
               └─▶│v_3 (u16) │────▶                   ├────┼▶│   c_3    │ │       
                  └──────────┘    │                   │    │ └──────────┘ │       
                                  │                   │    │              │       
                                  │                   │    │              │       
                                  └───────────────────┘    └──────────────┘       
                                            │                                     
                                            │           ┌────────────────────────┐
                                            └──────────▶│proofs σ_ci = (r, s, t) │
                                                        └────────────────────────┘

A value is given by an unsigned 64-bit integer . Split into four 16-bit limbs

with .

Then, perform ElGamal encryption to form the ciphertext by taking (for each )

Where is the basepoint generator for decaf377, is the large prime-order scalar field for decaf377, and is the public key output from DKG.

Next, compute a proof of correctness of the ElGamal encryption by executing the following sigma protocol:

The proof is then . The encryption of value is given as .

Upon receiving an encrypted value with proofs , a validator or validating full node should verify each proof by checking

Considering the value invalid if the proof fails to verify.

This protocol proves, in NIZK, the relation

Showing that the ciphertext is an actual encryption of for the DKG pubkey , and using the hash transcript to bind the proof of knowledge of to each .

Each ciphertext is two group elements, accompanied by a proof which is three scalars. decaf377 group elements and scalars are encoded as 32-byte values, thus every encrypted value combined with its proof is = 640 bytes.

Value Aggregation

                          n (batch size)                         
┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐
│┌──────────────┐   ┌──────────────┐            ┌──────────────┐│
 │              │   │              │            │              │ 
 │   v_{e0}     │   │    v_{e1}    │            │    v_{ek}    │ 
 │              │   │              │            │              │ 
 │ ┌──────────┐ │   │ ┌──────────┐ │            │ ┌──────────┐ │ 
 │ │   c_0    │ │   │ │   c_0    │ │            │ │   c_0    │ │ 
 │ └──────────┘ │   │ └──────────┘ │            │ └──────────┘ │ 
 │ ┌──────────┐ │   │ ┌──────────┐ │            │ ┌──────────┐ │ 
 │ │   c_1    │ │   │ │   c_1    │ │            │ │   c_1    │ │ 
 │ └──────────┘ │   │ └──────────┘ │            │ └──────────┘ │ 
 │ ┌──────────┐ │   │ ┌──────────┐ │    ...     │ ┌──────────┐ │ 
 │ │   c_2    │ │   │ │   c_2    │ │            │ │   c_2    │ │ 
 │ └──────────┘ │   │ └──────────┘ │            │ └──────────┘ │ 
 │ ┌──────────┐ │   │ ┌──────────┐ │            │ ┌──────────┐ │ 
 │ │   c_3    │ │   │ │   c_3    │ │            │ │   c_3    │ │ 
 │ └──────────┘ │   │ └──────────┘ │            │ └──────────┘ │ 
 │              │   │              │            │              │ 
 │              │   │              │            │              │ 
 └──────────────┘   └──────────────┘            └──────────────┘ 
         │                  │             │             │        
         │                  │             │             │        
         └──────────────────┴─────┬───────┴─────────────┘        
                                  │             ┌──────────────┐ 
                                  ▼             │              │ 
                                ┌───┐           │   v_{agg}    │ 
                                │ + │───────────▶              │ 
                                └───┘           │ ┌──────────┐ │ 
                                                │ │   c_0    │ │ 
                                                │ └──────────┘ │ 
                                                │ ┌──────────┐ │ 
                                                │ │   c_1    │ │ 
                                                │ └──────────┘ │ 
                                                │ ┌──────────┐ │ 
                                                │ │   c_2    │ │ 
                                                │ └──────────┘ │ 
                                                │ ┌──────────┐ │ 
                                                │ │   c_3    │ │ 
                                                │ └──────────┘ │ 
                                                │              │ 
                                                │              │ 
                                                └──────────────┘  

To batch flows, we must use the homomorphic property of ElGamal ciphertexts. Aggregation should be done component-wise, that is, on each limb of the ciphertext (). To aggregate a given , simply add each limb: