Penumbra is a fully private proofofstake network and decentralized exchange for the Cosmos ecosystem.
Penumbra integrates privacy with proofofstake through a novel private delegation mechanism that provides staking derivatives, taxefficient staking, and onchain governance with private voting. Penumbra connects to the Cosmos ecosystem via IBC, acting as an ecosystemwide shielded pool and allowing private transactions in any IBCcompatible asset. Users can also swap these assets using ZSwap, a private decentralized exchange supporting sealedbid batch auctions and Uniswapv3style concentrated liquidity. Sealedbid batch auctions prevent frontrunning, provide better execution, and reveal only the net flow across a pair of assets in each block, and liquidity positions are created anonymously, allowing traders to approximate their desired trading function without revealing their individual beliefs about prices.
This website renders the workinprogress protocol specification for Penumbra.
Press s
or use the magnifying glass icon for fulltext search.
If you’re interested in technical discussion about the project, why not
 join the discord,
 check out the repo and issue tracker,
 view the roadmap goals,
 or follow the project on Twitter for updates.
Private Transactions
Penumbra records all value in a single multiasset shielded pool based on the Zcash Sapling design, but allows private transactions in any kind of IBC asset. Inbound IBC transfers shield value as it moves into the zone, while outbound IBC transfers unshield value.
Unlike Zcash, Penumbra has no notion of transparent transactions or a
transparent value pool; instead, inbound IBC transfers are analogous to t2z
Zcash transactions, outbound IBC transfers are analogous to z2t
Zcash
transactions, and the entire Cosmos ecosystem functions analogously to
Zcash’s transparent pool.
Unlike the Cosmos Hub or many other chains built on the Cosmos SDK, Penumbra has no notion of accounts. Only validators have any kind of longterm identity, and this identity is only used (in the context of transactions) for spending the validator’s commission.
Private Staking
In a proofofstake system like the Cosmos Hub, stakeholders delegate staking tokens by bonding them to validators. Validators participate in Tendermint consensus with voting power determined by their delegation size, and delegators receive staking rewards in exchange for taking on the risk of being penalized for validator misbehavior (slashing).
Integrating privacy and proof of stake poses significant challenges. If delegations are public, holders of the staking token must choose between privacy on the one hand and staking rewards and participation in consensus on the other hand. Because the majority of the stake will be bonded to validators, privacy becomes an uncommon, optin case. But if delegations are private, issuing staking rewards becomes very difficult, because the chain no longer knows the amount and duration of each address’ delegations.
Penumbra sidesteps this problem using a new mechanism that eliminates staking rewards entirely, treating unbonded and bonded stake as separate assets, with an epochvarying exchange rate that prices in what would be a staking reward in other systems. This mechanism ensures that all delegations to a particular validator are fungible, and can be represented by a single delegation token representing a share of that validator’s delegation pool, in effect a firstclass staking derivative. Although delegation fungibility is key to enabling privacy, as a side effect, delegators do not realize any income while their stake is bonded, only a capital gain (or loss) on unbonding.
The total amount of stake bonded to each validator is part of the public chain state and determines consensus weight, but the delegation tokens themselves are each just another token to be recorded in a multiasset shielded pool. This provides accountability for validators and privacy and flexibility for delegators, who can trade and transact with their delegation tokens just like they can with any other token.
It also provides an alternate perspective on the debate between fixedsupply and inflationbased rewards. Choosing the unbonded token as the numéraire, delegators are rewarded by inflation for taking on the risk of validator misbehavior, and the token supply grows over time. Choosing (a basket of) delegation tokens as the numéraire, nondelegators are punished by depreciation for not taking on any risk of misbehavior, and the token supply is fixed.
Private Governance
Like the Cosmos Hub, Penumbra supports onchain governance with delegated voting. Unlike the Cosmos Hub, Penumbra’s governance mechanism supports secret ballots. Penumbra users can anonymously propose votes by escrowing a deposit of bonded stake. Stakeholders vote by proving ownership of their bonded stake prior to the beginning of the voting period and encrypting their votes to a threshold key controlled by the validator set. Validators sum encrypted votes and decrypt only the perepoch totals.
Private DEX
Penumbra provides private, sealedbid batch swaps using ZSwap. ZSwap allows users to privately swap between any pair of assets. Individual swaps do not reveal trade amounts. Instead, all swaps in each block are executed in a single batch. Only the total amount in each batch is revealed, and only after the batch has been finalized. This prevents frontrunning and provides better execution, but also provides longterm privacy for individual swaps. Users can also provide liquidity by anonymously creating concentrated liquidity positions. These positions reveal the amount of liquidity and the bounds in which it is concentrated, but are not otherwise linked to any identity, so that (with some care) users can privately approximate arbitrary trading functions without revealing their specific views about prices.
Concepts and Mechanisms
This section provides an overview of the concepts involved in Penumbra’s mechanism design.
Validators
Validators in Penumbra undergo various transitions depending on chain activity.
┌ ─ ─ ─ ─ ┐
Genesis
│Validator│
─ ─ ─ ─ ─
│
│
┌ ─ ─ ─ ─ ─ ─ ─ ─ ▼
Validator │ ┏━━━━━━━━┓ ╔══════╗ ┏━━━━━━━┓
│ Definition ─────▶┃Inactive┃─────▶║Active║─────▶┃Slashed┃
(in transaction)│ ┗━━━━━━━━┛ ╚══════╝ ┗━━━━━━━┛
└ ─ ─ ─ ─ ─ ─ ─ ─ ▲ ▲ ▲
│ │ │
│ ▼ │
│ ╔═════════╗ │
└─────────║Unbonding║────────┘
╚═════════╝
Single lines represent unbonded stake, and double lines represent bonded stake.
Validators become known to the chain either at genesis, or by means of a transaction with a ValidatorDefinition
action in them. Validators transition through four states:
 Inactive, where the validator is not part of the consensus set, the stake in the validator’s delegation pool is not bonded;
 Active, where the validator is part of the consensus set, and the stake in the validator’s delegation pool is bonded;
 Unbonding, where the validator is not part of the consensus set, but the stake in the validator’s delegation pool is still bonded;
 Slashed, where the validator is not part of the consensus set, and the stake in the validator’s delegation pool is not bonded.
Validators specified in the genesis config begin in the active state, with whatever stake was allocated to their delegation pool at genesis. Otherwise, validators begin in the inactive state, with no stake in their delegation pool. At this point, the validator is known to the chain, and stake can be contributed to its delegation pool. Stake contributed to an inactive validator’s delegation pool does not earn rewards (the validator’s rates are held constant), but it is also not bonded, so undelegations are effective immediately, with no unbonding period and no output quarantine.
The chain chooses a validator limit N as a consensus parameter. When a validator’s delegation pool (a) has a nonzero balance and (b) its (votingpoweradjusted) size is in the top N validators, it moves into the active state during the next epoch transition. Active validators participate in consensus, and are communicated to Tendermint. Stake contributed to an active validator’s delegation pool earns rewards (the validator’s rates are updated at each epoch to track the rewards accruing to the pool). That stake is bonded, so undelegations have an unbonding period and an output quarantine. An active validator can exit the consensus set in two ways.
First, the validator could be slashed. This can happen in any block, not just at an epoch transition. Slashed validators are immediately removed from the consensus set. Any pending undelegations from a slashed validator are cancelled: the quarantined output notes are deleted, and the quarantined nullifiers are removed from the nullifier set. The validator’s rates are updated to price in the slashing penalty, and are then held constant. Slashed validators are jailed, and permanently prohibited from participation in consensus (though their operators can create new identity keys, if they’d like to). Stake cannot be delegated to a slashed validator. Stake already contributed to a slashed validator’s delegation pool is not bonded (the validator has already been slashed and jailed), so undelegations are effective immediately, with no unbonding period and no quarantine.
Second, the validator could be displaced from the validator set by another validator with more stake in its delegation pool. The validator is then in the unbonding state. It does not participate in consensus, and the stake in its delegation pool does not earn rewards (the validator’s rates are held constant). However, the stake in its delegation pool is still bonded. Undelegations from an unbonding validator are quarantined with an unbonding period that starts when the undelegation was performed, not when the validator began unbonding. Unbonding validators have three possible state transitions:
 they can become active again, if new delegations boost its weight back into the top N;
 they can be slashed, if evidence of misbehavior arises during the unbonding period;
 they can become inactive, if neither (1) nor (2) occurs before the unbonding period passes.
If (2) occurs, the same state transitions as in regular slashing occur: all pending undelegations are cancelled, etc. If (3) occurs, all pending undelegations are immediately removed from quarantine, shortcircuiting the unbonding period that began when the undelegation was performed. If (1) occurs, the validator stops unbonding, but this has no effect on pending undelegations, since they were quarantined with an unbonding period that started when the undelegation was performed (i.e., as if they were undelegations from an active validator).
Batching Flows
Penumbra’s ledger records value as it moves between different economic roles – for instance, movement between unbonded stake and delegation tokens, movement between different assets as they are traded, etc. This creates a tension between the need to reveal the total amount of value in each role as part of the public chain state, and the desire to shield value amounts in individual transactions.
To address this tension, Penumbra provides a mechanism to aggregate value flows across a batch of transactions, revealing the only the total amount and not the value contributed by each individual transaction. This mechanism is built using an integervalued homomorphic encryption scheme that supports threshold decryption, so that the network’s validators can jointly control a decryption key.
Transactions that contribute value to a batch contain an encryption of the amount. To flush the batch, the validators sum the ciphertexts from all relevant transactions to compute an encryption of the batch total, then jointly decrypt it and commit it to the chain.
This mechanism doesn’t require any coordination between the users whose transactions are batched, but it does require that the validators create and publish a threshold decryption key. To allow batching across block boundaries, Penumbra organizes blocks into epochs, and applies changes to the validator set only at epoch boundaries. Decryption keys live for the duration of the epoch, allowing value flows to be batched over any time interval from 1 block up to the length of an epoch. We propose epoch boundaries on the order of 13 days.
At the beginning of each epoch, the validator set performs distributed key generation for to produce a decryption key jointly controlled by the validators (on an approximately stakeweighted basis) and includes the encryption key in the first block of the epoch.
Because this key is only available after the first block of each epoch, some transactions cannot occur in the first block itself. Assuming a block interval similar to the Cosmos Hub, this implies an ~8second processing delay once per day, a reasonable tradeoff against the complexity of phased setup procedures.
Addresses and Keys
Value transferred on Penumbra is sent to shielded payment addresses; these addresses are derived from spending keys through a sequence of intermediate keys that represent different levels of attenuated capability:
flowchart BT subgraph Address end; subgraph DTK[Detection Key] end; subgraph IVK[Incoming Viewing Key] end; subgraph OVK[Outgoing Viewing Key] end; subgraph FVK[Full Viewing Key] end; subgraph SK[Spending Key] end; subgraph SeedPhrase[Seed Phrase] end; index(address index); div{ }; SeedPhrase > SK; SK > FVK; FVK > OVK; FVK > IVK; index > div; IVK > div; div > Address & DTK; DTK > Address;
From bottom to top:
 the seed phrase is the root key material. Multiple accounts  each with separate spend authority  can be derived from this root seed phrase.
 the spending key is the capability representing spending authority for a given account;
 the full viewing key represents the capability to view all transactions related to the account;
 the outgoing viewing key represents the capability to view only outgoing transactions, and is used to recover information about previously sent transactions;
 the incoming viewing key represents the capability to view only incoming transactions, and is used to scan the block chain for incoming transactions.
Penumbra allows the same account to present multiple, publicly unlinkable addresses, keyed by an 16byte address index. Each choice of address index gives a distinct shielded payment address. Because these addresses share a common incoming viewing key, the cost of scanning the blockchain does not increase with the number of addresses in use.
Finally, Penumbra also allows outsourcing probabilistic transaction detection to third parties using fuzzy message detection. Each address has a detection key; a third party can use this key to detect transactions that might be relevant to that key. Like a Bloom filter, this detection has false positives but no false negatives, so detection will find all relevant transactions, as well as some amount of unrelated cover traffic. Unlike incoming viewing keys, detection keys are not shared between diversified addresses, allowing finegrained control of delegation.
This diagram shows only the uservisible parts of the key hierarchy. Internally, each of these keys has different components, described in detail in the Addresses and Keys section of the Cryptographic Protocol chapter.
Assets and Amounts
Penumbra’s shielded pool can record arbitrary assets. To be precise, we define:
 an amount to be an untyped quantity of some asset;
 an asset type to be an ADR001style denomination trace uniquely identifying a crosschain asset and its provenance, such as:
denom
(native chain A asset)transfer/channelToA/denom
(chain B representation of chain A asset)transfer/channelToB/transfer/channelToA/denom
(chain C representation of chain B representation of chain A asset)
 an asset ID to be a fixedsize hash of an asset type;
 a value to be a typed quantity, i.e., an amount and an asset id.
Penumbra deviates slightly from ADR001 in the definition of the asset ID. While ADR001 defines the IBC asset ID as the SHA256 hash of the denomination trace, Penumbra hashes to a field element, so that asset IDs can be more easily used inside of a zkSNARK circuit.
Notes, Nullifiers, and Trees
Transparent blockchains operate as follows: all participants maintain a copy of the (consensusdetermined) application state. Transactions modify the application state directly, and participants check that the state changes are allowed by the application rules before coming to consensus on them.
On a shielded blockchain like Penumbra, however, the state is fragmented across all users of the application, as each user has a view only of their “local” portion of the application state recording their funds. Transactions update a user’s state fragments privately, and use a zeroknowledge proof to prove to all other participants that the update was allowed by the application rules.
Penumbra’s transaction model is derived from the Zcash shielded transaction design, with modifications to support multiple asset types, and several extensions to support additional functionality. The Zcash model is in turn derived from Bitcoin’s unspent transaction output (UTXO) model, in which value is recorded in transaction outputs that record the conditions under which they can be spent.
In Penumbra, value is recorded in notes, which function similarly to UTXOs. Each note specifies (either directly or indirectly) a type of value, an amount of value of that type, a spending key that authorizes spending the note’s value, and a unique nullifier derived from the note’s contents.
However, unlike UTXOs, notes are not recorded as part of the public chain state. Instead, the chain contains a state commitment tree, an incremental Merkle tree containing (public) commitments to (private) notes. Creating a note involves creating a new note commitment, and proving that it commits to a valid note. Spending a note involves proving that the spent note was previously included in the state commitment tree, using the spending key to demonstrate spend authorization, and revealing the nullifier, which prevents the same note from being spent twice.
Transactions
Transactions describe an atomic collection of changes to the ledger state. Each transaction consist of a sequence of descriptions for various actions^{1}. Each description adds or subtracts (typed) value from the transaction’s value balance, which must net to zero. Penumbra adapts the Spend and Output actions from Sapling, and adds many new descriptions to support additional functionality:
Penumbra adapts Sapling’s Spend, which spends a note and adds to the transaction’s value balance, and Output, which creates a new note and subtracts from the transaction’s value balance, and adds many new descriptions to support additional functionality:
Transfers

Spend descriptions spend an existing note, adding its value to the transaction’s value balance;

Output descriptions create a new note, subtracting its value from the transaction’s value balance;

Transfer descriptions transfer value out of Penumbra by IBC, consuming value from the transaction’s value balance, and producing an ICS20
FungibleTokenPacketData
for the counterparty chain;
Staking

Delegate descriptions deposit unbonded stake into a validator’s delegation pool, consuming unbonded stake from the transaction’s value balance and producing new notes recording delegation tokens representing the appropriate share of the validator’s delegation pool;

Undelegate descriptions withdraw from a validator’s delegation pool, consuming delegation tokens from the transaction’s value balance and producing new notes recording the appropriate amount of unbonded stake;

Commission descriptions are used by validators to sweep commission on staking rewards into shielded notes, adding unbonded stake to the transaction’s value balance;
Governance

CreateProposal descriptions are used to propose measures for onchain governance and supply a deposit, consuming bonded stake from the transaction’s value balance and producing a new note that holds the deposit in escrow;

WithdrawProposal descriptions redeem an escrowed proposal deposit, returning it to the transaction’s value balance and immediately withdrawing the proposal.

Vote descriptions perform private voting for onchain governance and declare a vote. This description leaves the value balance unchanged.
Trading

Swap descriptions perform the first phase of ZSwap, consuming tokens of one type from a transaction’s value balance, burning them, and producing a swap commitment for use in the second stage;

Sweep descriptions perform the second phase of ZSwap, allowing a user who burned tokens of one type to mint tokens of the other type at the chainspecified clearing price, and adding the new tokens to a transaction’s value balance;
Marketmaking

OpenPosition descriptions open concentrated liquidity positions, consuming value of the traded types from the transaction’s value balance and adding the specified position to the AMM state;

ClosePosition descriptions close concentrated liquidity positions, removing the specified position to the AMM state and adding the value of the position, plus any accumulated fees or liquidity rewards, to the transaction’s value balance.
Each transaction also contains a fee specification, which is always transparently encoded. The value balance of all of a transactions actions, together with the fees, must net to zero.
Note that like Zcash Orchard, we use the term “action” to refer to one of a number of possible state updates; unlike Orchard, we do not attempt to conceal which types of state updates are performed, so our Action is an enum.
Governance
Penumbra supports onchain governance with delegated voting. Validators’ votes are public and act as default votes for their entire delegation pool, while delegators’ votes are private, and override the default vote provided by their validator.
Votes are the same as on the Cosmos Hub: Yes
, No
, NoWithVeto
, and
Abstain
. NoWithVeto
is the same as No
but also votes that the proposer
should lose their deposit. The intended cultural norm is that No
should be
used to indicate disagreement with goodfaith proposals and NoWithVeto
should be used to deter spam proposals.
Proposals
Penumbra users can propose votes by escrowing a minimum amount of PEN
. They
do this by creating a transaction with a CreateProposal
description, which
consumes some amount of PEN
from the transaction’s balance, and creates a new
escrow note with the same amount. The note is escrowed in the sense that it is
recorded seperately and is not included in the state commitment tree until voting
completes.
Proposals can either be normal or emergency proposals. In either case, the voting period begins immediately, in the next block after the proposal has been committed to the chain. Normal proposals have a fixedlength voting period, while emergency proposals are accepted as soon as a 2/3 majority of the stake is reached.
Because validators provide default votes for their delegation pool, an emergency proposal can in principle be accepted immediately, without any input from delegators. This allows timecritical resolution of emergencies (e.g., deploying an 0day hotfix); the 2/3 majority of the stake required is already sufficient to arbitrarily rewrite the chain state.
Proposals can also be withdrawn by their proposer prior to the end of the voting
period. This is done by creating a transaction with a WithdrawProposal
description, and allows the community to iterate on proposals as the (social)
governance process occurs. For instance, a chain upgrade proposal can be
withdrawn and reproposed with a different source hash if a bug is discovered
while upgrade voting is underway. Withdrawn proposals cannot be accepted, even
if the vote would have passed, but they can be vetoed.^{1}
Voting
Stakeholder votes are of the form $(x_{y},x_{n},x_{a},x_{v})$, representing the weights for yes, no, abstain, and veto respectively. Most stakeholders would presumably set all but one weight to $0$. Stakeholders vote by proving ownership of some amount of bonded stake (their voting power) prior to the beginning of the voting period.
To do this, they create a transaction with a Vote
description. This
description identifies the validator $v$ and the proposal, proves spend
authority over a note recording $y$ dPEN(v)
, and reveals the note’s nullifier.
Finally, it proves vote consistency $y=x_{y}+x_{n}+x_{a}+x_{v}$, produces a new
note with $y$ dPEN(v)
, and includes $Enc_{D}(x_{i})$, an encryption
of the vote weights to the validators’ decryption key.
The proof statements in a Vote
description establishing spend authority over
the note are almost identical to those in a Spend
description. However, there
are two key differences. First, rather than proving that the note was included
in a recent state commitment tree state, it always uses the root of the note
commitment tree at the time that voting began, establishing that the note was
not created after voting began. Second, rather than checking the note’s
nullifier against the global nullifier set and marking it as spent, the
nullifier is checked against a snapshot of the nullifier set at the time that
voting began (establishing that it was unspent then), as well as against a
perproposal nullifier set (establishing that it has not already been used for
voting). In other words, instead of marking that the note has been spent in
general, we only mark it as having been spent in the context of voting on a
specific proposal.
This change allows multiple proposals to be voted on concurrently, at the cost
of linkability. While the same note can be used to vote on multiple proposals,
those votes, as well as the subsequent spend of the note, will have the same
nullifier and thus be linkable to each other. However, the Vote
descriptions
are shielded, so an observer only learns that two opaque votes were related to
each other.
We suggest that wallets roll over the note the first time it is used for voting
by creating a transaction with Vote
, Spend
, and Output
descriptions. This
mitigates linkability between Vote
and Spend
descriptions, and means that
votes on any proposals created after the first vote are unlinkable from prior
votes.
Counting Votes
At the end of each epoch, validators collect the encrypted votes from each delegation pool, aggregate the encrypted votes into encrypted tallies and decrypt the tallies. These intermediate tallies are revealed, because it is not possible to batch value flows over time intervals longer than one epoch. In practice, this provides a similar dynamic as existing (transparent) onchain governance schemes, where tallies are public while voting is ongoing.
At the end of the voting period, the perepoch tallies are summed. For each validator $v$, the votes for each option are summed to determine the portion of the delegation pool that voted; the validator’s vote acts as the default vote for the rest of the delegation pool. Finally, these pervalidator subtotals are multiplied by the voting power adjustment function $θ_{v}(e)$ to obtain the final vote totals.
If the vote was not vetoed, the escrowed note from the Proposal
description
is included in the state commitment tree, so that it can be spent by the
proposer. Otherwise, it is not, and the funds are burned.
If withdrawing a proposal halted onchain voting immediately, the escrow mechanism would not be effective at deterring spam, since the proposer could yank their proposal at the last minute prior to losing their deposit. However, at the UX level, withdrawn proposals can be presented as though voting were closed, since validators’ default votes are probably sufficient for spam deterrence.
Staking and Delegation
As described in the overview, integrating privacy with proofofstake poses significant challenges. Penumbra sidesteps these challenges using a novel mechanism that eliminates staking rewards entirely, treating unbonded and bonded stake as separate assets, with an epochvarying exchange rate that prices in what would be a staking reward in other systems. This mechanism ensures that all delegations to a particular validator are fungible, and can be represented by a single delegation token representing a share of that validator’s delegation pool, in effect a firstclass staking derivative.
This section describes the staking and delegation mechanism in detail:
 Staking Tokens describes the different staking tokens;
 Validator Rewards and Fees describes mechanics around validator commissions and transaction fees;
 Voting Power describes how validators’ voting power is calculated;
 Delegation describes how users bond stake to validators;
 Undelegation describes how users unbond stake from validators;
 Example Staking Dynamics contains a worked example illustrating the mechanism design.
 Arithmetic contains the specification for computing staking rates with fixedpoint arithmetic
Staking Tokens
Penumbra’s staking token, denoted PEN
, represents units of unbonded stake.
Rather than treat bonded and unbonded stake as being two states of the same
token, Penumbra records stake bonded with a particular validator with a
delegation token, denoted dPEN
. Conversion between PEN
and dPEN
occurs with an epochvarying exchange rate that prices in what would
be a staking reward in other systems. This ensures that all delegations to a
particular validator are fungible, and can be represented by a single token
representing an ownership share of that validator’s delegation pool.
Stake bonded with different validators is not fungible, as different
validators may have different commission rates and different risk of
misbehavior. Hence dPEN
is a shorthand for a class of assets (one per
validator), rather than a single asset. dPEN
bonded to a specific
validator $v$ can be denoted dPEN(v)
when it is necessary to be precise.
Each flavor of dPEN
is its own firstclass token, and like any other token
can be transferred between addresses, traded, sent over IBC, etc. Penumbra
itself does not attempt to pool risk across validators, but nothing prevents
third parties from building stake pools composed of these assets.
The base reward rate for bonded stake is a parameter $r_{e}$ indexed by epoch. This parameter can be thought of as a “Layer 1 Base Operating Rate”, or “L1BOR”, in that it serves as a reference rate for the entire chain. Its value is set on a perepoch basis by a formula involving the ratio of bonded and unbonded stake, increasing when there is relatively less bonded stake and decreasing when there is relatively more. This formula should be decided and adjusted by governance.
Each validator declares a set of funding streams, which comprise both the destinations of their commission and the total commission rate $c_{v,e}∈[0,1]$. $cv,e$ is subtracted from the base reward rate to get a validatorspecific reward rate $r_{v,e}=(1−c_{v,e})r_{e}.$.
The base exchange rate between PEN
and dPEN
is given by the function
$ψ(e)=0≤i<e∏ (1+r_{i}),$ which measures the cumulative
depreciation of stake PEN
relative to the delegation token dPEN
from
genesis up to epoch $e$. However, because dPEN
is not a single asset but a
family of pervalidator assets, this is only a base rate.
The actual exchange rate between stake PEN
and validator $v$’s delegation
token dPEN(v)
accounts for commissions by substituting the validatorspecific
rate $r_{v,e}$ in place of the base rate $r$ to get $ψ_{v}(e)=0≤i<e∏ (1+r_{v,i}).$
Delegating $x$ PEN
to validator $v$ at epoch $e_{1}$ results in $x/ψ_{v}(e_{1})$ dPEN
. Undelegating $y$ dPEN(v)
from validator $v$ at
epoch $e_{2}$ results in $yψ_{v}(e_{2})$ PEN
. Thus, delegating at epoch
$e_{1}$ and undelegating at epoch $e_{2}$ results in a return of $ψ_{v}(e_{2})/ψ_{v}(e_{1})=e_{1}≤e<e_{2}∏ (1+r_{v,e}),$ i.e., the staking
reward compounded only over the period during which the stake was bonded.
Discounting newly bonded stake by the cumulative depreciation of unbonded stake since genesis means that all bonded stake can be treated as if it had been bonded since genesis, which allows newly unbonded stake to always be inflated by the cumulative appreciation since genesis. This mechanism avoids the need to track the age of any particular delegation to compute its rewards, and makes all shares of each validator’s delegation pool fungible.
Validator Rewards and Fees
Validators declare a set of funding streams that comprise the destination of all of their staking rewards. Each funding stream contains a rate $ri∈[0,1]$ and a destination address $ai$. The validator’s total commission rate is defined as $c_{v,e}=∑ri$, the sum of the rate of each funding stream. $c_{v,e}$ cannot exceed 1.
The spread between the base reward rate $r_{e}$ and the reward rate for their delegators is determined by the validator’s total commission $r_{v,e}=(1−c_{v,e})r_{e}$, or equivalently $r_{e}=r_{v,e}+c_{v,e}r_{e}$.
Validator rewards are distributed in the first block of each epoch. In epoch
$e$, a validator $v$ whose delegation pool has size $y_{v}$ dPEN
receives a
commission of size $y_{v}c_{v,e}r_{e}ψ(e−1)$ PEN
, issued to the
validator’s address.
To see why this is the reward amount, suppose validator $v$ has a delegation
pool of size $y_{v}$ dPEN
. In epoch $e−1$, the value of the pool is $y_{v}ψ_{v}(e−1)$ PEN
. In epoch $e$, the base reward rate $r_{e}$ causes the value
of the pool to increase to
$(1+r_{e})y_{v}ψ_{v}(e−1).$
Splitting $r_{e}$ as $r_{e}=r_{v,e}+c_{v,e}r_{e}$, this becomes
$y_{v}(1+r_{v,e})ψ_{v}(e−1)+c_{v,e}r_{e}y_{v}ψ_{v}(e−1).$
The value in the first term, $y_{v}(1+r_{v,e})ψ_{v}(e−1)$,
corresponds to the $r_{v,e}$ portion, and accrues to the delegators. Since $(1+r_{v,e})ψ_{v}(e−1)=ψ_{v}(e)$, this is exactly $y_{v}ψ_{v}(e)$, the new PEN
denominated value of the delegation pool.
The value in the second term, $c_{v,e}r_{e}y_{v}ψ_{v}(e−1)$, corresponds to the $c_{v,e}r_{e}$ portion, and accrues to the validator as commission. Validators can selfdelegate the resulting PEN
or use it to fund their operating expenses.
Transaction fees are denominated in PEN
and are burned, so that the value of the fees accrues equally to all stake holders.
TODO

allow transaction fees in
dPEN
with appropriate price adjustment, but only in transactions (e.g., undelegations, voting) that otherwise reveal the flavor ofdPEN
, so that there is no additional distinguisher?
Voting Power
The size of each validator’s delegation pool (i.e., the amount of dPEN
of that
validator’s flavor) is public information, and determines the validator’s voting
power. However, these sizes cannot be used directly, because they are based on
validatorspecific conversion rates $ψ_{v}$ and are therefore incommensurate.
Voting power is calculated using the adjustment function $θ_{v}(e)=ψ_{v}(e)/ψ(e)$, so that a validator $v$ whose delegation pool has $y_{v}$
dPEN
in epoch $e$ has voting power $y_{v}θ_{v}(e)$.
The validatorspecific reward rate $r_{v,e}=r_{e}−c_{v,i}r_{e}$ adjusts the base reward rate to account for the validator’s commission. Since $ψ_{v}(e)=0≤i<e∏ (1+r_{v,i}),$ and $ψ(e)=0≤i<e∏ (1+r_{i}),$ the adjustment function $θ_{v}(e)=ψ(e)ψ_{v}(e) =0≤i<e∏ 1+r_{i}1+r_{v,i} $ accounts for the compounded effect of the validator’s commission on the size of the delegation pool.
Delegation
The delegation process bonds stake to a validator, converting stake PEN
to
delegation tokens dPEN
. Delegations may be performed in any block, but only
take effect in the next epoch.
Delegations are accomplished by creating a transaction with a
Delegate
description. This specifies a validator $v$,
consumes $x$ PEN
from the transaction’s balance, produces a new shielded note
with $y=x/ψ_{v}(e)$ dPEN
associated to that validator, and includes an
encryption $Enc_{D}(y)$ of the delegation amount to the validators’
shared decryption key $D$. Here $e$ is the index of the next epoch, when the
delegation will take effect.
In the last block of epoch $e−1$, the validators sum the encrypted bonding amounts $Enc_{D}(y_{v})$ from all delegate descriptions for validator $v$ in the entire epoch to obtain an encryption of the total delegation amount $Enc_{D}(∑_{i}y_{v})$ and decrypt to obtain the total delegation amount $y_{v}=∑_{i}y_{v}$ without revealing any individual transaction’s delegation amount $y_{v}$. These total amounts are used to update the size of each validator’s delegation pool for the next epoch.
Revealing only the total inflows to the delegation pool in each epoch helps avoid linkability. For instance, if the size of each individual transaction’s delegation were revealed, a delegation of size $a$ followed by an undelegation of size $b$ could be correlated if an observer notices that there are some epochs $e_{1},e_{2}$ so that $aψ_{v}(e_{1})ψ_{v}(e_{2}) =b.$
This risk is still present when only the total amount – the minimum disclosure required for consensus – is revealed, because there may be few (or no) other delegations to the same validator in the same epoch. Some care should be taken in client implementations and user behavior to mitigate the effects of this information disclosure, e.g., by splitting delegations into multiple transactions in different epochs involving randomized subportions of the stake. However, the best mitigation would simply be to have many users.
Undelegation
The undelegation process unbonds stake from a validator, converting delegation
tokens dPEN
to stake PEN
. Undelegations may be performed in any block, but
only settle after the undelegation has exited the unbonding queue.
The unbonding queue is a FIFO queue allowing only a limited amount of stake to be unbonded in each epoch, according to an unbonding rate selected by governance. Undelegations are inserted into the unbonding queue in FIFO order. Unlike delegations, where only the total amount of newly bonded stake is revealed, undelegations reveal the precise amount of newly unbonded stake, allowing the unbonding queue to function.
Undelegations are accomplished by creating a transaction with a
Undelegate
description. This description has different behaviour
depending on whether or not the validator was slashed.
In the unslashed case, the undelegate description spends a note with value
$y$ dPEN
, reveals $y$, and produces $yψ_{v}(e)$ PEN
for the transaction’s
balance, where $e$ is the index of the current epoch. However, the nullifiers
revealed by undelegate descriptions are not immediately included in the
nullifier set, and new notes created by a transaction containing an undelegate
description are not immediately included in the state commitment tree. Instead,
the transaction is placed into the unbonding queue to be applied later. In the
first block of each epoch, transactions are applied if the corresponding
validator remains unslashed, until the unbonding limit is reached.
If a validator is slashed, any undelegate transactions currently in the unbonding queue are discarded. Because the nullifiers for the notes those transactions spent were not included in the nullifier set, the notes remain spendable, allowing a user to create a new undelegation description.
Undelegations from a slashed validator are settled immediately. The
undelegate description spends a note with value $y$ dPEN
and produces
$syψ_{v}(e_{s})$ PEN
, where $1−s$ is the slashing penalty and
$e_{s}$ is the epoch at which the validator was slashed. The remaining value,
$(1−s)yψ_{v}(e_{s})$, is burned.
Because pending undelegations from a slashed validator are discarded without applying their nullifiers, those notes can be spent again in a postslashing undelegation description. This causes linkability between the discarded undelegations and the postslashing undelegations, but this is not a concern because slashing is a rare and unplanned event which already imposes worse losses on delegators.
Example Staking Dynamics
To illustrate the dynamics of this system, consider a toy scenario with three delegators, Alice, Bob, and Charlie, and two validators, Victoria and William. Tendermint consensus requires at least four validators and no one party controlling more than $1/3$ of the stake, but this example uses only a few parties just to illustrate the dynamics.
For simplicity, the the base reward rates and commission rates
are fixed over all epochs at $r=0.0006$ and $c_{v}=0$, $c_{w}=0.1$.
The PEN
and dPEN
holdings of participant $a,b,c,…$ are
denoted by $x_{a}$, $y_{a}$, etc., respectively.
Alice starts with $y_{a}=10000$ dPEN
of Victoria’s delegation pool, Bob starts
with $y_{b}=10000$ dPEN
of William’s delegation pool, and Charlie starts with
$x_{c}=20000$ unbonded PEN
.

At genesis, Alice, Bob, and Charlie respectively have fractions $25%$, $25%$, and $50$ of the total stake, and fractions $50%$, $50%$, $0%$ of the total voting power.

At epoch $e=1$, Alice, Bob, and Charlie’s holdings remain unchanged, but their unrealized notional values have changed.
 Victoria charges zero commission, so $ψ_{v}(1)=ψ(1)=1.0006$. Alice’s $y_{a}=10000$
dPEN(v)
is now worth $10006$PEN
.  William charges $10%$ commission, so $ψ_{w}(1)=1.00054$. Bob’s $y_{b}=10000$
dPEN(w)
is now worth $10005.4$, and William receives $0.6$PEN
.  William can use the commission to cover expenses, or selfdelegate. In this example, we assume that validators selfdelegate their entire commission, to illustrate the staking dynamics.
 William selfdelegates $0.6$
PEN
, to get $0.6/ψ_{w}(2)=0.6/1.00054_{2}=0.59935…$dPEN
in the next epoch, epoch $2$.
 Victoria charges zero commission, so $ψ_{v}(1)=ψ(1)=1.0006$. Alice’s $y_{a}=10000$

At epoch $e=90$:
 Alice’s $y_{a}=10000$
dPEN(v)
is now worth $10554.67$PEN
.  Bob’s $y_{b}=10000$
dPEN(w)
is now worth $10497.86$PEN
.  William’s selfdelegation of accumulated commission has resulted in $y_{w}=53.483$
dPEN(w)
.  Victoria’s delegation pool remains at size $10000$
dPEN(v)
. William’s delegation pool has increased to $10053.483$dPEN(w)
. However, their respective adjustment factors are now $θ_{v}(90)=1$ and $θ_{w}(90)=0.99462$, so the voting powers of their delegation pools are respectively $10000$ and $9999.37$. The slight loss of voting power for William’s delegation pool occurs because William selfdelegates rewards with a one epoch delay, thus missing one epoch of compounding.
 Charlie’s unbonded $x_{c}=20000$
PEN
remains unchanged, but its value relative to Alice and Bob’s bonded stake has declined.  William’s commission transfers stake from Bob, whose voting power has slightly declined relative to Alice’s.
 The distribution of stake between Alice, Bob, Charlie, and William is now $25.67%$, $25.54%$, $48.65%$, $0.14%$ respectively. The distribution of voting power is $50%$, $49.74%$, $0%$, $0.27%$ respectively.
 Charlie decides to bond his stake, split evenly between Victoria and William, to get $10000/ψ_{v}(91)=9485.85$
dPEN(v)
and $10000/ψ_{w}(91)=9536$dPEN(w)
.
 Alice’s $y_{a}=10000$

At epoch $e=91$:
 Charlie now has $9468.80$
dPEN(v)
and $9520.60$dPEN(w)
, worth $20000$PEN
.  For the same amount of unbonded stake, Charlie gets more
dPEN(w)
thandPEN(v)
, because the exchange rate $ψ_{w}$ prices in the cumulative effect of commission since genesis, but Charlie isn’t charged for commission during the time he didn’t delegate to William.  William’s commission for this epoch is now $1.233$
PEN
, up from $0.633$PEN
in the previous epoch.  The distribution of stake between Alice, Bob, Charlie, and William is now $25.68%$, $25.54%$, $48.64%$, $0.14%$ respectively. Because all stake is now bonded, except William’s commission for this epoch, which is insignificant, the distribution of voting power is identical to the distribution of stake.
 Charlie now has $9468.80$

At epoch $e=180$:
 Alice’s $y_{a}=10000$
dPEN(v)
is now worth $11140.12$PEN
.  Bob’s $y_{b}=10000$
dPEN(w)
is now worth $11020.52$PEN
.  Charlies’s $y_{c,v}=9468.80$
dPEN(v)
is now worth $10548.37$PEN
, and his $y_{c,w}=9520.60$dPEN(w)
is now worth $10492.20$PEN
.  William’s selfdelegation of accumulated commission has resulted in $y_{w}=158.77$
dPEN(w)
, worth $176.30$PEN
.  The distribution of stake and voting power between Alice, Bob, Charlie, and William is now $25.68%$, $25.41%$, $48.51%$, $0.40%$ respectively.
 Alice’s $y_{a}=10000$
This scenario was generated with a model in this Google Sheet.
FixedPoint Arithmetic for Rate Computation
To compute base reward rate, base exchange rate, validatorspecific exchange rates, and total validator voting power, we need to carefully perform arithmetic to avoid issues with precision and rounding. We use explicitly specified fixedprecision arithmetic for this, with a precision of 8 digits. This allows outputs to fit in a u64, with all products fitting in the output and plenty of headroom for additions.
All integer values should be interpreted as unsigned 64bit integers, with the
exception of the validator’s commission rate, which is a u16
specified in
terms of basis points (one onehundredth of a percent, or in other words, an
implicit denominator of $10_{4}$). All integer values, with the exception of the
validator’s commission rate, have an implicit denominator of $10_{8}$.
Throughout this spec representations are referred to as $x$, where $x=x⋅10_{8}$, and the value represented by representations is $x=x⋅10_{−8}$.
As an example, let’s take a starting value $x$ represented in our scheme (so, $x⋅10_{8}$) and compute its product with $y$, also represented by our fixed point scheme, so $y⋅10_{8}$. The product $xy$ is computed as
$⌊10_{8}(x10_{8})(y10_{8}) ⌋=⌊10_{8}xy ⌋$
Since both $x$ and $y$ are both representations and both have a factor of
$10_{8}$, computing their product includes an extra factor of 10^8 which we
divide out. All representations are u64
in our scheme, and any fixedpoint
number or product of fixed point numbers with 8 digits fits well within 64
bits.
Summary of Notation
 $r_{e}$: the fixedpoint representation of the base rate at epoch $e$.
 $psi(e)$: the fixedpoint representation of the base exchange rate at epoch $e$.
 $s_{i,e}$: the funding rate of the validator’s funding stream at index $i$ and epoch $e$, in basis points.
 $c_{v,e}$: the sum of the validator’s commission rates, in basis points.
 $r_{v,e}$: the fixedpoint representation of the validatorspecific reward rate for validator $v$ at epoch $e$.
 $psi_{v}(e)$: the fixedpoint representation of the validatorspecific exchange rate for validator $v$ at epoch $e$.
 $y_{v}$: the sum of the tokens in the validator’s delegation pool.
 $iota_{v}(e)$: the validator’s voting power for validator $v$ at epoch $e$.
Base Reward Rate
The base reward is an input to the protocol, and the exact details of how this base rate $r_{e}$ is determined is not yet decided. For now, we can assume it is derived from the block header.
Base Exchange Rate
The base exchange rate, $ψ(e)$, can be safely computed as follows:
$psi(e)=⌊10_{8}psi(e−1)⋅(10_{8}+r_{e}) ⌋$
Commission Rate from Funding Streams
To compute the validator’s commission rate from its set of funding streams, compute $c_{v,e}=i∑ s_{i,e}$ where $s_{i,e}$ is the rate of validator $v$’s $i$th funding stream at epoch $e$.
Validator Reward Rate
To compute the validator’s base reward rate, we compute the following:
$r_{v,e}=⌊10_{8}(10_{8}−(c_{v,e}⋅10_{4}))r_{e} ⌋$
Validator Exchange Rate
To compute the validator’s exchange rate, we use the same formula as for the base exchange rate, but substitute the validatorspecific reward rate in place of the base reward rate:
$psi_{v}(e)=⌊10_{8}psi_{v}(e−1)⋅(10_{8}+r_{v,e}) ⌋$
Validator Voting Power
Finally, to compute the validator’s voting power, take:
$iota_{v}(e)=⌊y_{v}⋅psi(e)psi_{v}(e) ⌋$
IBC Protocol Implementation
Penumbra supports the IBC protocol for interoperating with other counterparty blockchains. Unlike most blockchains that currently deploy IBC, Penumbra is not based on the Cosmos SDK. IBC as a protocol supports replication of data between two communicating blockchains. It provides basic building blocks for building higherlevel cross chain applications, as well as a protocol specification for the most commonly used IBC applications, the ICS20 transfer protocol.
Penumbra implements the core IBC protocol building blocks: ICS23 compatible state inclusion proofs, connections as well as channels and packets.
IBC Actions
In order to support the IBC protocol, Penumbra adds a single additional Action
IBCAction
. an IBCAction can contain any of the IBC datagrams:
ICS003 Connections
ConnOpenInit
ConnOpenTry
ConnOpenAck
ConnOpenConfirm
ICS004 Channels and Packets
ChanOpenInit
ChanOpenTry
ChanOpenAck
ChanOpenConfirm
ChanCloseInit
ChanCloseConfirm
RecvPacket
Timeout
Acknowledgement
These datagrams are implemented as protocol buffers, with the enclosing
IBCAction
type using profobuf’s OneOf
directive to encapsulate all possible
IBC datagram types.
Handling Bridged Assets
Penumbra’s native state model uses notes, which contain an amount of a
particular asset. Amounts in Penumbra are 128bit unsigned integers, in order
to support assets which have potentially large base denoms (such as Ethereum).
When receiving an IBC transfer, if the amount being transferred is greater than
u128
, we return an error.
Transfers into Penumbra
IBC transfer mechanics are specified in ICS20. The
FungibleTokenPacketData
packet describes the transfer:
FungibleTokenPacketData {
denomination: string,
amount: uint256,
sender: string,
receiver: string,
}
The sender
and receiver
fields are used to specify the sending account on
the source chain and the receiving account on the destination chain. However,
for inbound transfers, the destination chain is Penumbra, which has no
accounts. Instead, token transfers into Penumbra create an
OutputDescription
describing a new shielded note with the given amount and
denomination, and insert an encoding of the description itself into the
receiver
field.
ZSwap
Penumbra provides private, sealedbid batch swaps using ZSwap. ZSwap allows users to privately swap between any pair of assets. Individual swaps do not reveal trade amounts. Instead, all swaps in each block are executed in a single batch. Only the total amount in each batch is revealed, and only after the batch has been finalized. This prevents frontrunning and provides better execution, but also provides longterm privacy for individual swaps. Users can also provide liquidity by anonymously creating concentrated liquidity positions. These positions reveal the amount of liquidity and the bounds in which it is concentrated, but are not otherwise linked to any identity, so that (with some care) users can privately approximate arbitrary trading functions without revealing their specific views about prices.
Frequent batch swaps
Budish, Cramton, and Shim (2015) analyze trading in traditional financial markets using the predominant continuoustime limit order book market design, and find that highfrequency trading arises as a response to mechanical arbitrage opportunities created by flawed market design:
These findings suggest that while there is an arms race in speed, the arms race does not actually affect the size of the arbitrage prize; rather, it just continually raises the bar for how fast one has to be to capture a piece of the prize… Overall, our analysis suggests that the mechanical arbitrage opportunities and resulting arms race should be thought of as a constant of the market design, rather than as an inefficiency that is competed away over time.
— Eric Budish, Peter Cramton, John Shim, The HighFrequency Trading Arms Race: Frequent Batch Auctions as a Market Design Response
Because these mechanical arbitrage opportunities arise from the market design even in the presence of symmetrically observed public information, they do not improve prices or produce value, but create arbitrage rents that increase the cost of liquidity provision^{1}. Instead, they suggest changing from a continuoustime model to a discretetime model and performing frequent batch auctions, executing all orders that arrive in the same discrete time step in a single batch with a uniform price.
In the blockchain context, while projects like Uniswap have demonstrated the power and success of CFMMs for decentralized liquidity provision, they have also highlighted the mechanical arbitrage opportunities created by the mismatch between continuoustime market designs and the state update model of the underlying blockchain, which operates in discrete batches (blocks). Each prospective state update is broadcast to all participants to be queued in the mempool, but only committed as part of the next block, and while it is queued for inclusion in the consensus state, other participants can bid to manipulate its ordering relative to other state updates (for instance, frontrunning a trade).
This manipulation is enabled by two features:

Although trades are always committed in a batch (a block), they are performed individually, making them dependent on miners’ choices of the ordering of trades within a block;

Because trades disclose the trade amount in advance of execution, all other participants have the information necessary to manipulate them.
ZSwap addresses the first problem by executing all swaps in each block in a single batch, first aggregating the amounts in each swap and then executing it against the CFMM as a single trade.
ZSwap addresses the second problem by having users encrypt their swap amounts using a flow encryption key controlled by the validators, who aggregate the encrypted amounts and decrypt only the batch trade. This prevents frontrunning prior to block inclusion, and provides privacy for individual trades (up to the size of the batch) afterwards.
Users do not experience additional trading latency from the batch swap design, because the batch swaps occur in every block, which is already the minimum latency for finalized state updates. A longer batch latency could help privacy for markettakers by increasing the number of swaps in each batch, but would impair other trading and impose a worse user experience.
Private, sealedbid batch swaps
A key challenge in the design of any private swap mechanism is that zeroknowledge proofs only allow privacy for userspecific state, not for global state, because they don’t let you prove statements about things that you don’t know. While users can prove that their userspecific state was updated correctly without revealing it, they cannot do so for other users’ state.
Instead of solving this problem, ZSwap sidesteps the need for users to do so. At a high level, swaps work as follows: users privately burn funds of one kind in a coordinated way that allows the chain to compute a perblock clearing price, and mint or burn liquidity pool reserves. Later, users privately mint funds of the other kind, proving that they previously burned funds and that the minted amount is consistent with the computed price and the burned amount. No interaction or transfer of funds between users or the liquidity pool reserves is required. Users do not transact with each other. Instead, the chain permits them to transmute one asset type to another, provably updating their private state without interacting with any other users’ private state.
This mechanism is described in more detail in the SealedBid Batch Swaps section.
Concentrated Liquidity
ZSwap executes trades against concentrated liquidity positions.
Uniswap v3’s insight was that that existing constantproduct market makers like Uniswap v2 allocate liquidity inefficiently, spreading it uniformly over the entire range of possible prices for a trading pair. Instead, allowing liquidity providers (LPs) to restrict their liquidity to a price range of their choosing provides a mechanism for market allocation of liquidity, concentrating it into the range of prices that the assets in the pair actually trade.
Liquidity providers create positions that tie a quantity of liquidity to a specific price range. Within that price range, the position acts as a constantproduct market maker with larger “virtual” reserves. At each price, the pool aggregates liquidity from all positions that contain that price, and tracks which positions remain (or become) active as the price moves. By creating multiple positions, LPs can approximate arbitrary trading functions.
However, the concentrated liquidity mechanism in Uniswap v3 has a number of limitations:

Approximating an arbitrary trading function using a set of concentrated liquidity positions is cumbersome and difficult, because each position is a scaled and translated copy of the (nonlinear) constantproduct trading function;

Liquidity providers cannot compete on trading fees, because positions must be created with one of a limited number of fee tiers, and users cannot natively route trades across different fee tiers;

All liquidity positions are publicly linked to the account that created them, so that any successful LP strategy can be immediately cloned by other parties.
Zswap solves all of these problems:

Using linear (constantprice) concentrated liquidity positions makes approximating an arbitrary trading function much easier, and makes the DEX implementation much more efficient. Users can recover a constantproduct (or any other) trading function by creating multiple positions, pushing the choice of curve out of consensus and into client software.

Each position includes its own fee setting. This fragments liquidity into many (potentially tens of thousands) of distinct AMMs, each with a specific fee, but this is not a problem, because Zswap can optimally route batched swaps across all of them.

While positions themselves are public, they are opened and closed anonymously. This means that while the aggregate of all LPs’ trading functions is public, the trading function of each individual LP is not, so LPs can approximate their desired trading functions without revealing their specific views about prices, as long as they take care to avoid linking their positions (e.g., by timing or amount).
Handling of concentrated liquidity is described in more detail in the Concentrated Liquidity section.
Liquidity Compensation
Systems that support both staking and liquidity provision need a mechanism to manage the competitive equilibrium between these two uses of the staking token. If staking rewards are too high relative to LP returns, LPs will choose to stake instead, causing liquidity to dry up. On the other hand, if LP returns are too high relative to staking rewards, stakers will choose to provide liquidity instead, weakening the chain’s economic security.
Penumbra’s staking design, which provides native delegation tokens for each validator, poses additional challenges. LPs are disincentivized to provide liquidity in the staking token, if they could otherwise provide liquidity for a delegation token and get staking rewards as well as LP returns. This is undesirable for a number of reasons: it fragments liquidity across many different delegation tokens, it undermines the security model to some extent (by making it easier to unload delegation tokens before misbehavior is detected), and it drives staking centralization (since larger validators would presumably have deeper liquidity in their delegation token, making it a more attractive asset).
To address these challenges, and allow the protocol to stabilize competition between staking and liquidity provision, ZSwap uses a mechanism called liquidity compensation, so named because it compensates LPs for the opportunity cost of not staking. Instead of issuing staking rewards only to delegators, the total issuance in each epoch is split between delegators and liquidity providers. To determine the split, the chain sets a target ratio of bonded stake to stake used for liquidity provision, and compares the actual ratio to the target ratio, similarly to the way that other systems adjust staking rewards based on the proportion of tokens staked.
At the end of each epoch, the share of issuance used for liquidity compensation is allocated to each eligible liquidity position active during that epoch, pro rata to liquidity provided. Eligible liquidity positions are any positions in trading pairs where one asset is the staking token, and the other asset is not a delegation token.
This mechanism has several nice properties:

Although the amount of liquidity compensation is determined on the basis of value locked, it’s allocated on the basis of liquidity provided. This incentivizes LPs to efficiently deploy their capital, since LPs who can create finer liquidity positions will receive disproportionate liquidity compensation rewards.

Because markets in delegation tokens are not eligible for liquidity compensation, marketmakers providing liquidity between bonded and unbonded forms of the staking token must take on the opportunity cost of not staking the unbonded side. Therefore, fees in those markets must be sufficient to cover that opportunity cost, so that there’s no free, instant withdrawal from a delegation pool.

The mechanism is credibly neutral, applying to any trading pair involving the staking token, rather than artificially subsidizing certain tokens via governance proposals.
This mechanism is described in more detail in the Liquidity Compensation section.
on HFT, N.B. their footnote 5:
A point of clarification: our claim is not that markets are less liquid today than before the rise of electronic trading and HFT; the empirical record is clear that trading costs are lower today than in the preHFT era, though most of the benefits appear to have been realized in the late 1990s and early 2000s… Rather, our claim is that markets are less liquid today than they would be under an alternative market design that eliminated sniping.
SealedBid Batch Swaps
ZSwap’s sealedbid batch swaps conceptually decompose into two parts: the DEX and AMM mechanism itself, and the batching procedure, which allows multiple users’ swaps to be batched into a single trade executed against the trading function. This section focuses on the batching procedure, leaving the mechanics of the trading function for later.
A key challenge in the design of any private swap mechanism is that zeroknowledge proofs only allow privacy for userspecific state, not for global state, because they don’t let you prove statements about things that you don’t know. While users can prove that their userspecific state was updated correctly without revealing it, they cannot do so for other users’ state.
Instead of solving this problem, ZSwap sidesteps the need for users to do so. Rather than have users transact with each other, the chain permits them to transmute one asset type to another, provably updating their private state without interacting with any other users’ private state. To do this, they privately burn their input assets and encrypt the amounts to the validators. The validators aggregate the encrypted amounts and decrypt the batch total, then compute the effective (inclusive of fees) clearing prices and commit them to the chain state. In any later block, users can privately mint output funds of the new type, proving consistency with the inputs they burned.
Swap
actions
First, users create transactions with Swap
actions that privately burn their
input assets and encrypt the amounts to the validators. This action identifies
the trading pair $(t_{1},t_{2})$ by asset id, consumes $Δ=(Δ_{1},Δ_{2})$ of types
$(t_{1},t_{2})$ from the transaction balance, and contains an encryption of the
trade inputs
$Enc_{D}(Δ)=(Enc_{D}(Δ_{1}),Enc_{D}(Δ_{2})).$
Rational traders will choose either
$Δ_{1}=0$ or $Δ_{2}=0$, i.e., trade one asset type for the other type,
but the description provides two inputs so that different swap directions cannot
be distinguished.
The Swap
action also consumes $f$ fee tokens from the transaction’s value balance, which are saved for use as a prepaid transaction fee when claiming the swap output.
To record the user’s contribution for later, the action mints a swap NFT.
Penumbra assets are recorded
as a pair of an amount (u64
) and an asset id ($F_{q}$). Usually, the
asset id is the hash of a denomination string. For a swap NFT, however, the
asset id is computed as
$a_{NFT}=π(t_{1},t_{2},f,Δ_{1},Δ_{2},B_{d},pk_{d}),$
where:
 $π$ is a Poseidon hash function;
 $(Δ_{1},Δ_{2})$ are the input amounts of types $t_{1}$ and $t_{2}$ respectively;
 $f$ is a prepaid fee amount that will be used for the swap claim;
 $B_{d}$ and $pk_{d}$ are the diversified basepoint and diversified transmission key of one of the user’s addresses, used to preauthorize the swap claim.
The swap NFT is recorded like any other asset in the shielded pool. The Swap
action includes a NotePayload
with an encryption of a new note with the swap
NFT, rather than using a separate Output
action, in order to combine proof
statements and skip a fixedsize memo field.
Batching and Execution
In this description, which focuses on the state model, we treat the execution itself as a black box and focus only on the public data used for swap outputs.
Validators sum the encrypted amounts $Enc_{D}(Δ_{(i)})$ of all swaps in the batch to obtain an encryption of the combined inputs $Enc_{D}(∑_{i}Δ_{(i)})$, then decrypt to obtain the batch input $Δ=∑_{i}Δ_{(i)}$ without revealing any individual transaction’s input $Δ_{(i)}$. Then they execute $Δ$ against the trading pool, updating the pool state and obtaining the effective (inclusive of fees) clearing prices $p_{t_{1},t_{2}}$ ($t_{1}$ in terms of $t_{2}$) and $p_{t_{2},t_{1}}$ ($t_{2}$ in terms of $t_{1}$). Alternatively, the swap could fail, for instance, because there is insufficient liquidity, so the public state recording the swap results also includes a success bit $b_{t_{1},t_{2}}$ that is $1$ on success and $0$ on failure.
Each user’s output amounts can be computed as $Λ_{1}=b_{t_{1},t_{2}}(p_{t_{1},t_{2}}Δ_{2})+(1−b_{t_{1},t_{2}})Δ_{1}Λ_{2}=b_{t_{1},t_{2}}(p_{t_{2},t_{1}}Δ_{1})+(1−b_{t_{1},t_{2}})Δ_{2}$ which simplifies to $Λ_{1}=p_{t_{1},t_{2}}Δ_{2}Λ_{2}=p_{t_{2},t_{1}}Δ_{1}$ when the batch succeeds and $b_{t_{1},t_{2}}=1$, or to $Λ_{1}=Δ_{1}Λ_{2}=Δ_{2}$ when the batch fails and $b_{t_{1},t_{2}}=0$.
Claiming Swap Outputs
In a future block, users who created transactions with Swap
actions obtain
assets of the new types by creating a transaction with SwapClaim
actions.
This action privately mints new tokens of the output type, and proves
consistency with the user’s input contribution (via the swap NFT) and with the
effective clearing prices (which are part of the public chain state). The
SwapClaim
action is carefully designed so that it is selfauthenticating,
and does not require any spend authorization. Any entity in posession of the
full viewing key can create and submit a swap claim transaction, without further
authorization by the user. This means that wallet software can automatically
claim swap outputs as soon as it sees confirmation that the swap occurred.
Like a Spend
action, the SwapClaim
action spends a shielded note, revealing its nullifier and witnessing an authentication path from it to a recent anchor. However, it differs in several important respects:

Rather than unlocking value from an arbitrary note, it proves that the spent note records $1$ unit of a swap NFT whose asset ID is $a_{NFT}=π(t_{1},t_{2},f,Δ_{1},Δ_{2},B_{d},pk_{d}),$ so that the input state is available to other proof statements;

Rather than witnessing the full authentication path from the note commitment up to a recent anchor, it reveals the block height and only witnesses the authentication path up to the blocklevel root, proving that the note was included in a particular block^{1}, and allowing reference to the effective clearing prices $p_{t_{1},t_{2}}$ and $p_{t_{2},t_{1}}$;

Rather than contributing to the transaction’s value balance, it constructs two output notes itself, one for each of $Λ_{1}=b_{t_{1},t_{2}}(p_{t_{1},t_{2}}Δ_{2})+(1−b_{t_{1},t_{2}})Δ_{1}Λ_{2}=b_{t_{1},t_{2}}(p_{t_{2},t_{1}}Δ_{1})+(1−b_{t_{1},t_{2}})Δ_{2}$ and proves that the notes are sent to the address committed to by the $B_{d}$ and $pk_{d}$ in the swap NFT;
The SwapClaim
does not include a spend authorization signature, because it is only capable of consuming a swap NFT, not arbitrary notes, and only capable of sending the trade outputs to the address specified during construction of the original Swap
action, which is signed.
Finally, the SwapClaim
releases $f$ units of the fee token to the
transaction’s value balance, allowing it to pay fees without an additional
Spend
action. The transaction claiming the swap outputs can therefore consist
of a single SwapClaim
action, and that action can be prepared using only a
full viewing key. This design means that wallet software can automatically
submit the swap claim without any explicit user intervention, even if the
user’s custody setup (e.g., a hardware wallet) would otherwise require it.
Although sweep descriptions do not reveal the amounts, or which swap’s outputs they claim, they do reveal the block and trading pair, so their anonymity set is considerably smaller than an ordinary shielded value transfer. For this reason, client software should create and submit a transaction with a sweep description immediately after observing that its transaction with a swap description was included in a block, rather than waiting for some future use of the new assets. This ensures that future shielded transactions involving the new assets are not trivially linkable to the swap.
Privacy for MarketTakers
This design reveals only the net flow across a trading pair in each batch, not the amounts of any individual swap. However, this provides no protection if the batch contains a single swap, and limited protection when there are only a few other swaps. This is likely to be an especially severe problem until the protocol has a significant base of active users, so it is worth examining the impact of amount disclosure and potential mitigations.

TODO: on the client side, allow a “time preference” slider (immediate vs long duration), which spreads execution of randomized subamounts across multiple blocks at randomized intervals within some time horizon

TODO: extract below into separate section about privacy on penumbra
Assuming that all amounts are disclosed, an attacker could attempt to deanonymize parts of the transaction graph by tracing amounts, using strategies similar to those in An Empirical Analysis of Anonymity in Zcash. That research attempted to deanonymize transactions by analysing movement between Zcash’s transparent and shielded pools, with some notable successes (e.g., identifying transactions associated to the sale of stolen NSA documents). Unlike Zcash, where optin privacy means the bulk of the transaction graph is exposed, Penumbra does not have a transparent pool, and the bulk of the transaction graph is hidden, but there are several potential places to try to correlate amounts:
 IBC transfers into Penumbra are analogous to
t2z
transactions and disclose types and amounts and accounts on the source chain;  IBC transfers out of Penumbra are analogous to
z2t
transactions and disclose types and amounts and accounts on the destination chain;  Each unbonding discloses the precise amount of newly unbonded stake and the validator;
 Each epoch discloses the net amount of newly bonded stake for each validator;
 Liquidity pool deposits disclose the precise type and amount of newly deposited reserves;
 Liquidity pool deposits disclose the precise type and amount of newly withdrawn reserves;
The existence of the swap mechanism potentially makes correlation by amount more difficult, by expanding the search space from all amounts of one type to all combinations of all amounts of any tradeable type and all historic clearing prices. However, assuming rational trades may cut this search space considerably.^{2}
Thanks to Guillermo Angeris for this observation.
Technically, this is not quite true: by itself, all that revealing the blocklevel root on the authentication path proves is that the note was included in a block with that root, not the block with that root, since the block root binds all of the new note commitments produced in that block but does not explicitly bind the block height. To fix this, we can have the chain insert a dummy note whose note commitment is bound to the block height (e.g., by computing the note’s blinding factor as a hash of the height). This prevents a possible attack where an attacker who could control the exact set of note commitments included in two different blocks at heights $h_{1}$ and $h_{2}$, both with swaps, could submit the exact same input amounts in $h_{1}$ and $h_{2}$ (without change), and then claim both outputs at whichever executed with a higher price.
Concentrated Liquidity
Liquidity Mining
LPNFTs
Primitives
Penumbra uses the following cryptographic primitives, described in the following sections:

The Proof System section describes the choice of proving curve (BLS12377) and proof system (Groth16, and potentially PLONK in the future);

The
decaf377
section describesdecaf377
, a parameterization of the Decaf construction defined over the BLS12377 scalar field, providing a primeorder group that can be used inside or outside of a circuit; 
The Poseidon for BLS12377 section describes parameter selection for an instantiation of Poseidon, a SNARKfriendly sponge construction, over the BLS12377 scalar field;

The Fuzzy Message Detection section describes a construction that allows users to outsource a probabalistic detection capability, allowing a third party to scan and filter the chain on their behalf, without revealing precisely which transactions are theirs.

The Homomorphic Threshold Decryption section describes the construction used to batch flows of value across transactions.

The Randomizable Signatures section describes
decaf377rdsa
, a variant of the Zcash RedDSA construction instantiated overdecaf377
, used for binding and spend authorization signatures. 
The Key Agreement section describes
decaf377ka
, an instantiation of DiffieHellman key agreement overdecaf377
.
Proving Considerations
Penumbra needs SNARK proofs. Because the choice of proving system and proving curve can’t really be cleanly separated from the rest of the system choices (e.g., the native field of the proving system informs what embedded curve is available, and how circuit programming is done), large parts of the rest of the system design block on making a choice of proving system.
Goals

Nearterm implementation availability. We’d like to ship a useful product first, and iterate and expand it later.

High performance for fixed functionality. Penumbra intends to support fixed functionality initially; programmability is a good future goal but isn’t a nearterm objective. The fixed functionality should have as high performance as possible.

Longerterm flexibility. The choice should ideally not preclude many future choices for later functionality. More precisely, it should not impose high switching costs on future choices.

Recursion capability. Penumbra doesn’t currently make use of recursion, but there are a lot of interesting applications it could be used for.
Setup ceremonies are beneficial to avoid for operational reasons, but not for security reasons. A decentralized setup procedure is sufficient for security.
Options
Proof systems:
 Groth16:
 Pros: high performance, very small proofs, mature system
 Cons: requires a setup for each proof statement
 PLONK:
 Pros: universal setup, still fairly compact proofs, seems to be a point of convergence with useful extensions (plookup, SHPLONK, etc)
 Cons: bigger proofs, worse constants than Groth16
 Halo 2
 Pros: no setup, arbitrary depth recursion
 Cons: bigger proof sizes, primary implementation for the Pallas/Vesta curves which don’t support pairings
Curve choices:

BLS12381:
 Pros: very mature, used by Sapling already
 Cons: no easy recursion

BLS12377:
 Pros: constructed as part of Zexe to support depth 1 recursion using a bigger parent curve, deployed in Celo, to be deployed in Zexe
 Cons: ?

Pallas/Vesta:
 Pros: none other than support for Halo 2’s arbitrary recursion
 Cons: no pairings mean they cannot be used for any pairingbased SNARK
Considerations
Although the choice of proof system (Groth16, Plonk, Halo, Pickles, …) is not completely separable from the choice of proving curve (e.g., pairingbased SNARKs require pairingfriendly curves), to the extent that it is, the choice of the proof system is relatively less important than the choice of proving curve, because it is easier to encapsulate.
The choice of proving curve determines the scalar field of the arithmetic circuit, which determines which curves are efficient to implement in the circuit, which determines which cryptographic constructions can be performed in the circuit, which determines what kind of key material the system uses, which propagates all the way upwards to uservisible details like the address format. While swapping out a proof system using the same proving curve can be encapsulated within an update to a client library, swapping out the proving curve is extremely disruptive and essentially requires all users to generate new addresses and migrate funds.
This means that, in terms of proof system flexibility, the Pallas/Vesta curves are relatively disadvantaged compared to pairingfriendly curves like BLS12381 or BLS12377, because they cannot be used with any pairingbased SNARK, or any other pairingbased construction. Realistically, choosing them is committing to using Halo 2.
Choosing BLS12377 instead of BLS12381 opens the possibility to do depth1 recursion later, without meaningfully restricting the nearterm proving choices. For this reason, BLS12377 seems like the best choice of proving curve.
Penumbra’s approach is to first create a useful set of fixed functionality, and
generalize to custom, programmable functionality only later. Compared to
Sapling, there is more functionality (not just Spend
and Output
but
Delegate
, Undelegate
, Vote
, …), meaning that there are more proof
statements. Using Groth16 means that each of these statements needs to have its
own proving and verification key, generated through a decentralized setup.
So the advantage of a universal setup (as in PLONK) over perstatement setup (as in Groth16) would be:
 The setup can be used for additional fixed functionality later;
 Client software does not need to maintain distinct proving/verification keys for each statement.
(2) is a definite downside, but the impact is a little unclear. As a point of reference, the Sapling spend and output parameters are 48MB and 3.5MB respectively. The size of the spend circuit could be improved using a snarkfriendly hash function.
With regard to (1), if functionality were being developed in many independent pieces, doing many setups would impose a large operational cost. But doing a decentralized setup for a dozen proof statements simultaneously does not seem substantially worse than doing a decentralized setup for a single proof statement. So the operational concern is related to the frequency of groups of new statements, not the number of statements in a group. Adding a later group of functionality is easy if the first group used a universal setup. But if it didn’t, the choice of perstatement setup initially doesn’t prevent the use of a universal setup later, as long as the new proof system can be implemented using the same curve.
Because Penumbra plans to have an initial set of fixed functionality, and performance is a concern, Groth16 seems like a good choice, and leaves the door open for a future universal SNARK. Using BLS12377 opens the door to future recursion, albeit only of depth 1.
The decaf377
group
Penumbra, like many other zeroknowledge protocols, requires a cryptographic group that can be used inside of an arithmetic circuit. This is accomplished by defining an “embedded” elliptic curve whose base field is the scalar field of the proving curve used by the proof system.
The Zexe paper, which defined BLS12377, also defined (called $E_{Ed/BLS}$ in Figure 16 of the paper) a cofactor4 Edwards curve defined over the BLS12377 scalar field for exactly this purpose. However, nonprimeorder groups are a leaky abstraction, forcing all downstream constructions to pay attention to correct handling of the cofactor. Although it is usually possible to do so safely, it requires additional care, and the optimal technique for handling the cofactor is different inside and outside of a circuit.
Instead, applying the Decaf construction to this curve gives decaf377
, a
clean abstraction that provides a primeorder group complete with hashtogroup
functionality and whose encoding and decoding functions integrate validation.
Although it imposes a modest additional cost in the circuit context, as
discussed in Costs and Alternatives, the
construction works the same way inside and outside of a circuit and imposes no
costs for lightweight, softwareonly applications, making it a good choice for
generalpurpose applications.
Curve Parameters
The cofactor4 Edwards curve defined over the BLS12377 scalar field has the following parameters:
 Base field: Integers mod prime $q=8444461749428370424248824938781546531375899335154063827935233455917409239041$
 Elliptic curve equation: $ax_{2}+y_{2}=1+dx_{2}y_{2}$ with $a=−1$ and $d=3021$
 Curve order: $4r$ where $r=2111115437357092606062206234695386632838870926408408195193685246394721360383$
We use a conventional generator basepoint selected to have a convenient hex encoding:
0x0800000000000000000000000000000000000000000000000000000000000000
In affine coordinates this generator point has coordinates:
 $x=4959445789346820725352484487855828915252512307947624787834978378872129235627$
 $y=6060471950081851567114691557659790004756535011754163002297540472747064943288$
Implementation
An implementation of decaf377
can be found here.
Costs and Alternatives
Arithmetic circuits have a different cost model than software. In the software cost model, software executes machine instructions, but in the circuit cost model, relations are certified by constraints. Unfortunately, while Decaf is a clearly superior choice in the software context, in the circuit context it imposes some additional costs, which must be weighed against its benefits.
At a high level, Decaf implements a primeorder group using a nonprimeorder curve by constructing a group quotient. Internally, group elements are represented by curve points, with a custom equality check so that equivalent representatives are considered equal, an encoding function that encodes equivalent representatives as identical bitstrings, and a decoding function that only accepts canonical encodings of valid representatives.
To do this, the construction defines a canonical encoding on a Jacobi quartic curve mod its 2torsion (a subgroup of size 4) by making two independent sign choices. Then, it uses an isogeny to transport this encoding from the Jacobi quartic to a target curve that will be used to actually implement the group operations. This target curve can be an Edwards curve or a Montgomery curve. The isogenies are only used for deriving the construction. In implementations, all of these steps are collapsed into a single set of formulas that perform encoding and decoding on the target curve.
In other words, one way to think about the Decaf construction is as some machinery that transforms two sign choices into selection of a canonical representative. Ristretto adds extra machinery to handle cofactor 8 by making an additional sign choice.
In the software cost model, where software executes machine instructions, this construction is essentially free, because the cost of both the Decaf and conventional Edwards encodings are dominated by the cost of computing an inverse or an inverse square root, and the cost of the sign checks is insignificant.
However, in the circuit cost model, where relations are certified by various constraints, this is no longer the case. On the one hand, certifying a square root or an inverse just requires checking that $y_{2}=x$ or that $xy=1$, which is much cheaper than actually computing $y=x $ or $y=1/x$. On the other hand, performing a sign check involves bitconstraining a field element, requiring hundreds of constraints.
Sign checks
The definition of which finite field elements are considered nonnegative is essentially arbitrary. The Decaf paper suggests three possibilities:

using the least significant bit, defining $x$ to be nonnegative if the least absolute residue for $x$ is even;

using the most significant bit, defining $x$ to be nonnegative if the least absolute residue for $x$ is in the range $0≤x≤(q−1)/2$;

for fields where $q≡3(mod4)$, using the Legendre symbol, which distinguishes between square and nonsquare elements.
Using the Legendre symbol is very appealing in the circuit context, since it has an algebraic definition and, at least in the case of square elements, very efficient certification. For instance, if square elements are chosen to be nonnegative, then certifying that $x$ is nonnegative requires only one constraint, $x=y_{2}$. However, the reason for the restriction to $3(mod4)$ fields is that $1$ and $−1$ should have different signs, which can only be the case if $−1$ is nonsquare. Unfortunately, many SNARKfriendly curves, including BLS12377, are specifically chosen so that $q≡1(mod2_{α})$ for as large a power $α$ as possible (e.g., $α=47$ in the case of BLS12377).
This leaves us with either the LSB or MSB choices. The least significant bit is potentially simpler for implementations, since it is actually the low bit of the encoding of $x$, while the most significant bit isn’t, because it measures from $(q−1)/2$, not a bit position $2_{k}$, so it seems to require a comparison or range check to evaluate. However, these choices are basically equivalent, in the following sense:
Lemma.^{}1
The most significant bit of $x$ is $0$ if and only if the least significant bit of $2x$ is $0$.
Proof.
The MSB of $x$ is $0$ if and only if $2x≤q−1$, but this means that $2x$, which is even, is the least absolute residue, so the LSB of $2x$ is also $0$. On the other hand, the MSB of $x$ is $1$ if and only if $x>(q−1)/2$, i.e., if $x≥(q−1)/2+1$, i.e., if $2x≥q−1+2=q+1$. This means that the least absolute residue of $2x$ is $2x−q$; since $2x$ is even and $q$ is odd, this is odd and has LSB $1$. $□$
This means that transforming an LSB check to an MSB check or vice versa requires multiplication by $2$ or $1/2$, which costs at most one constraint.
Checking the MSB requires checking whether a value is in the range $[0,(q−1)/2]$. Using Daira Hopwood’s optimized range constraints, the range check costs $73C$^{2}. However, the input to the range check is a bitconstrained unpacking of a field element, not a field element itself. This unpacking costs $253C$.
Checking the LSB is no less expensive, because although the check only examines one bit, the circuit must certify that the bitencoding is canonical. This requires checking that the value is in the range $[0,q−1]$, which also costs $73C$, and as before, the unpacking costs $253C$.
In other words, checking the sign of a field element costs $253C+73C=326C$, or $76C$ in the case where the field element is already bitencoded for other reasons. These checks are the dominant cost for encoding and decoding, which both require two sign checks. Decoding from bits costs c. $73C+326C≅400C$, decoding from a field element costs c. $326C+326C≅750C$, and encoding costs c. $750C$ regardless of whether the output is encoded as bits or as a field element.
For decaf377
, we choose the LSB test for sign checks.
Alternative approaches to handling cofactors
Decaf constructs a primeorder group whose encoding and decoding methods perform validation. A more conventional alternative approach is to use the underlying elliptic curve directly, restrict to its primeorder subgroup, and do subgroup validation separately from encoding and decoding. If this validation is done correctly, it provides a primeorder group. However, because validation is an additional step, rather than an integrated part of the encoding and decoding methods, this approach is necessarily more brittle, because each implementation must be sure to do both steps.
In the software cost model, there is no reason to use subgroup validation, because it is both more expensive and more brittle than Decaf or Ristretto. However, in the circuit cost model, there are cheaper alternatives, previously analyzed by Daira Hopwood in the context of Ristretto for JubJub (1, 2).
Multiplication by the group order.
The first validation method is to do a scalar multiplication and check that $[q]P=1$. Because the prime order is fixed, this scalar multiplication can be performed more efficiently using a hardcoded sequence of additions and doublings.
Cofactor preimage.
The second validation method provides a preimage $Q=[1/4]P$ in affine coordinates $(x,y)$. Because the image of $[4]:E→E$ is the primeorder subgroup, checking that $Q$ satisfies the curve equation and that $P=[4]Q$ checks that $P$ is in the primeorder subgroup.
In the software context, computing $[1/4]P$ and computing $[q]P$ cost about the same, although both are an order of magnitude more expensive than decoding. But in the circuit context, the prover can compute $Q=[1/4]P$ outside of the circuit and use only a few constraints to check the curve equation and two doublings. These costs round to zero compared to sign checks, so the validation is almost free.
The standard “compressed Edwards y” format encodes a point $(x,y)$ by the $y$coordinate and a sign bit indicating whether $x$ is nonnegative. In software, the cost of encoding and decoding are about the same, and dominated by taking an inverse square root. In circuits, the costs of encoding and decoding are also about the same, but they are instead dominated by a sign check that matches the sign of the recovered $x$coordinate with the supplied sign bit. This costs c. $325C$ as above.
Comparison and discussion
This table considers only approximate costs.
Operation  decaf377  Compressed Ed + Preimage 

Decode (from bits)  400C  400C 
Decode (from $F_{q}$)  750C  325C 
Encode (to bits)  750C  750C 
Encode (to $F_{q}$)  750C  325C 
When decoding from or encoding to field elements, the marginal cost of Decaf compared to the compressed Edwards + cofactor preimage is an extra bitunpacking and range check. While this effectively doubles the number of constraints, the marginal cost of c. $325C$ is still small relative to other operations like a scalar multiplication, which at 6 constraints per bit is approximately $1500C$.
When decoding from or encoding to bits, the marginal cost of Decaf disappears. When the input is already bitconstrained, Decaf’s first sign check can reuse the bit constraints, saving c. $250C$, but the compressed Edwards encoding must rangecheck the bits (which Decaf already does), costing c. $75C$ extra. Similarly, in encoding, Decaf’s second sign check produces bitconstrained variables for free, while the compressed Edwards encoding spends c. $250C+75C$ bitconstraining and rangechecking them.
However, in the software context, the primeorder validation check costs approximately 10x more than the cost of either encoding. Many applications require use of the embedded group both inside and outside of the circuit, and uses outside of the circuit may have additional resource constraints (for instance, a hardware token creating a signature authorizing delegated proving, etc.).
Performing validation as an additional, optional step also poses additional risks. While a specification may require it to be performed, implementations that skip the check will appear to work fine, and the history of invalidpoint attacks (where implementations should, but don’t, check that point coordinates satisfy the curve equation) suggests that structuring validation as an integral part of encoding and decoding is a safer design. This may not be a concern for a specific application with a single, highquality implementation that doesn’t make mistakes, but it’s less desirable for a generalpurpose construction.
In summary, Decaf provides a construction that works the same way inside and outside of a circuit and integrates validation with the encoding, imposing only a modest cost for use in circuits and no costs for lightweight, softwareonly applications, making it a good choice for generalpurpose constructions.
I have no idea whether this is common knowledge; I learned of this fact from its use in Mike Hamburg’s Ed448Goldilocks implementation.
The value 73 is computed as:
from itertools import groupby
def cost(k):
return min(k1, 2)
def constraints(bound):
costs = [cost(len(list(g))+1) for (c, g) in groupby(bound.bits()[:1]) if c == 1]
return sum(costs)
constraints(ZZ((q1)/2))
as here.
Inverse Square Roots
As in the internetdraft, the decaf377
functions are defined in terms of the
following function, which computes the square root of a ratio of field elements,
with the special behavior that if the input is nonsquare, it returns the square
root of a related field element, to allow reuse of the computation in the
hashtogroup setting.
Define $ζ$ as a nonsquare in the field and sqrt_ratio_zeta(N,D)
as:
 (True, $DN $) if $N$ and $D$ are nonzero, and $DN $ is square;
 (True, $0$) if $N$ is zero;
 (False, $0$) if $D$ is zero and $N$ is nonzero;
 (False, $ζDN $) if $N$ and $D$ are nonzero, and $DN $ is nonsquare.
Since $ζ$ is nonsquare, if $DN $ is nonsquare, $ζDN $ is
square. Note that unlike the similar function in the
ristretto255
/decaf448
internetdraft, this function does not make any
claims about the sign of its output.
To compute sqrt_ratio_zeta
we use a tablebased method adapted from Sarkar 2020 and zcashpasta, which is described in the remainder of this section.
Constants
We set $n>1$ (the 2adicity of the field) and $m$ odd such that $p−1=2_{n}m$. For the BLS12377 scalar field, $n=47$ and $m=60001509534603559531609739528203892656505753216962260608619555$.
We define $g=ζ_{m}$ where $ζ$ is a nonsquare as described above. We fix $ζ$ as 2841681278031794617739547238867782961338435681360110683443920362658525667816.
We then define a $k≥1$ and $ℓ_{0},…,ℓ_{k−1}>0$ such that $ℓ_{0}+…+ℓ_{k−1}=n−1$. We also define a parameter $w$ where $1≤w≤max(ℓ_{0},ℓ_{1},…,ℓ_{k−1})$. For decaf377 we choose:
 $k=6$
 $ℓ_{0,1}=7$
 $ℓ_{2,3,4,5}=8$
 $w=8$
Precomputation
Lookup tables are needed which can be precomputed as they depend only on the 2adicity $n$ and the choice of $ℓ_{i}$ above.
$g$ lookup table: $g_{ν2_{iw}}$
We compute $g_{ν2_{iw}}$ for $ν∈1,…,2_{w}−1$ and $i∈0,…,k−1$, indexed on $x$ and $ν$:
$ gg_{2_{8}}⋮g_{2_{40}} g_{2}g_{2_{9}}⋮g_{2_{41}} ……⋱… g_{2_{8}−1}(g_{2_{8}})_{2_{8}−1}⋮(g_{2_{40}})_{2_{8}−1} $
This table lets us look up powers of $g$.
$s$ lookup table: $g_{−ν(2_{n−w})}$
We compute $g_{−ν(2_{n−w})}$ for $ν∈0,…,2_{w}−1$, indexed on $g_{−ν(2_{n−w})}$:
$(g_{0} g_{−2_{39}} (g_{−2_{39}})_{2} … (g_{−2_{39}})_{2_{8}−1} )$
We use this table in the procedure that follows to find $q_{i}$ (they are the $ν$ values) in order to compute $s_{i}$.
Procedure
In the following procedure, let $u=N/D$. We use the following relations from Sarkar 2020:
 Equation 1: $α_{i}=x_{i}g_{t_{i}}$ and $t_{i}=(t_{i−1}+s_{i−1})/2_{ℓ_{i}}$ for $i∈0,…,k−1$ and $t_{k}=t_{k−1}+s_{k−1}$
 Lemma 3: $α_{i}g_{s_{i}}=1$ for $i∈0,…,k−1$
 Equation 2: $s_{i}=q_{i}2_{n−l_{i}}$
In these expressions, $x_{i}$ and $α_{i}$ are field elements. $q_{i}$ are unsigned $ℓ_{i}$bit integers. At each $i$, the algorithm first computes $x_{i}$, then $t_{i}$ and $α_{i}$ (from the previous step’s $t_{i−1}$ and $s_{i−1}$), then finally $s_{i}$ and $q_{i}$, in each case such that they satisfy the above expressions. Note that in the algorithm $t_{0}=0$.
Step 1: Compute $v=u_{2m−1}$
We compute $v=u_{2m−1}$. This corresponds to line 2 of the findSqRoot
Algorithm 1 in Sarkar 2020.
Substituting $u=N/D$:
$v=(DN )_{2m−1}=N_{2m−1}D_{−2m−1}$
Applying Fermat’s Little Theorem (i.e. $D_{p−1}=1modp$):
$v=N_{2m−1}D_{p−1−2m−1}$
Substituting $p−1=2_{n}m$ and rearranging:
$v=N_{2m−1}D_{2_{n}m−2m−1}$
$v=N_{2m−1}D_{21(2_{n+1}m−m−1)}$
$v=N_{2m−1}D_{21(2_{n+1}m−m−1−2_{n+1}+2_{n+1})}$
$v=N_{2m−1}D_{21(2_{n+1}−1)(m−1)+2_{n}}$
$v=N_{2m−1}D_{21(2_{n+1}−1)(m−1)}D_{2_{n}}$
We compute $v$ using a quantity $w$ we define as:
$w=(ND_{2_{n+1}−1})_{(m−1)/2}D_{2_{n}−1}$
We also define:
$s=D_{2_{n}−1}$
$t=D_{2_{n+1}−1}=Ds_{2}$
And substitute $s$ and $t$ into $w$ which gives us:
$w=s(Nt)_{2m−1}$
We now can use $w$ in the computation for $v$ and $uv$:
$v=wD$
$uv=wN$
Step 2: Compute $x$
Compute $x=(uv)v$ using $v$ and $uv$ as calculated in the prior step. This corresponds to line 4 of the findSqRoot
Algorithm 1 in Sarkar 2020.
Step 3: Compute $x_{i}$
We next compute $x_{i}=x_{2_{n−1−(ℓ+…+ℓ)}}$ for $i=0,..,k−1$. This corresponds to line 5 of the findSqRoot
Algorithm 1 in Sarkar 2020. This gives us the following components:
$x_{0}=x_{2_{n−1−ℓ}}=x_{2_{39}}=x_{1}$
$x_{1}=x_{2_{n−1−(ℓ+ℓ)}}=x_{2_{32}}=x_{2}$
$x_{2}=x_{2_{24}}=x_{3}$
$x_{3}=x_{2_{16}}=x_{4}$
$x_{4}=x_{2_{8}}=x_{5}$
$x_{5}=x_{2_{0}}=x$
Step 4: Compute $s_{i}$ and $t_{i}$
Next, we loop over $k$. This corresponds to lines 69 of the findSqRoot
Algorithm 1 in Sarkar 2020.
For $i=0$
Using Lemma 3:
$α_{0}g_{s_{0}}=1$
Substituting the definition of $α_{0}$ from equation 1:
$x_{0}g_{t_{0}}g_{s_{0}}=1$
Rearranging and substituting in $t_{0}=0$ (initial condition):
$x_{0}=g_{−s_{0}}$
Substituting in equation 2:
$x_{0}=g_{−q_{0}2_{n−ℓ}}=g_{−q_{0}2_{40}}$
This is almost in a form where we can look up $q_{0}$ in our s lookup table to get $q_{0}$ and thus $s_{0}$. If we define $q_{0}=2q_{0}$ we get:
$x_{0}=g_{−q_{0}2_{39}}$
Which we can use with our s lookup table to get $q_{0}$. Expressing $s_{0}$ in terms of $q_{0}$, we get $s_{0}=q_{0}2_{39}$.
For $i=1$
First we compute $t_{1}$ using equation 1:
$t_{1}=(t_{0}+s_{0})/2_{ℓ_{1}}=(t_{0}+s_{0})/2_{7}=q_{0}2_{32}$
Next, similar to the first iteration, we use lemma 3 and substitute in $t_{1}$ and $s_{1}$ to yield:
$α_{1}g_{s_{1}}=1$
$x_{1}g_{q_{0}2_{32}}=g_{−q_{1}2_{39}}$
In this expression we can compute the quantities on the left hand side, and the right hand side is in the form we expect for the s lookup table, yielding us $q_{1}$. Note that here too we define $q_{1}=2q_{1}$ such that the s lookup table can be used. Expressing $s_{1}$ in terms of $q_{1}$, we get $s_{1}=q_{1}2_{39}$.
For $i=2,…,5$
The remaining iterations proceed similarly, yielding the following expressions:
$t_{2}=q_{0}2_{24}+q_{1}2_{31}$
$s_{2}=q_{2}2_{39}$
Note for $q_{2}$ and the remaining iterations we do not require a trick (i.e. where $q_{i}=2q_{i}$) to get $s_{2}$ in a form where it can be used with the s lookup table. In the following expressions for $t_{i}$, $q_{1}$ is always even, and so the high bit of each value is unchanged when adding $q_{1}$.
$t_{3}=q_{0}2_{16}+q_{1}2_{23}+q_{2}2_{31}$
$s_{3}=q_{3}2_{39}$
$t_{4}=q_{0}2_{8}+q_{1}2_{15}+q_{2}2_{23}+q_{3}2_{31}$
$s_{4}=q_{4}2_{39}$
$t_{5}=q_{0}+q_{1}2_{7}+q_{2}2_{15}+q_{3}2_{23}+q_{4}2_{31}$
$s_{5}=q_{5}2_{39}$
At the end of this step, we have found $s_{i}$ and $t_{i}$ for $i∈0,…,k−1$.
Step 5: Return result $y$
Next, we can use equation 1 to compute $t=t_{5}+s_{5}$ using $t_{5}$ and $s_{5}$ from the previous step:
$t=q_{0}+q_{1}2_{7}+q_{2}2_{15}+q_{3}2_{23}+q_{4}2_{31}+q_{5}2_{39}$
This matches the expression from Lemma 4 in Sarkar 2020.
Next, to compute $g_{t/2}$, we lookup entries in the g lookup table. To do so, we can decompose $t/2$ into:
$t/2=q_{0}2_{0}+q_{1}2_{8}+q_{2}2_{16}+q_{3}2_{24}+q_{4}2_{32}+q_{5}2_{40}$
then $g_{t/2}$ is computed as:
$g_{t/2}=g_{q_{0}2_{0}}g_{q_{1}2_{8}}g_{q_{2}2_{16}}g_{q_{3}2_{24}}g_{q_{4}2_{32}}g_{q_{5}2_{40}}$
Multiplying in $uv$ from step 1, we compute:
$y=uvg_{t/2}$
This corresponds to line 10 of the findSqRoot
Algorithm 1 in Sarkar 2020.
In the nonsquare case, $q_{0}$ will be odd, and $t$ will be odd. We will have computed $y=(N/D)g $ and multiply $y$ by a correction $ζ/g $ to get our desired output.
We can use the result of this computation $y$ to determine whether or not the square exists, recalling from Step 1 that $u=N/D$:
 If $u$ is square, then $y=N/D $, and $y_{2}D−N=0$
 If $u$ is nonsquare, then $y=ζN/D $ and $y_{2}D−ζN=0$.
Decoding
Decoding to a point works as follows where $a$ and $d$ are the curve parameters as described here.

Decode
s_bytes
to a field element $s$. We interpret these bytes as unsigned littleendian bytes. We check if the length has 32 bytes, where the top 3 bits of the last byte are 0. The 32 bytes are verified to be canonical, and rejected if not (if the input is already a field element in the circuit case, skip this step). 
Check that $s$ is nonnegative, or reject (sign check 1).

$u_{1}←1+as_{2}$.

$u_{2}←u_{1}−4ds_{2}$.

(was_square, v) = sqrt_ratio_zeta(1, u_2 * u_1^2)
, rejecting ifwas_square
is false. 
$v←−v$ if $2su_{1}v$ is negative (sign check 2)^{1}.

$(x,y)←(2sv_{2}u_{1}u_{2},(1−as_{2})vu_{1})$.
The resulting coordinates are the affine Edwards coordinates of an internal representative of the group element.
Note this differs from the Decaf paper in Appendix A.2, but
implementations of decaf377
should follow the convention described here.
Encoding
Given a representative in extended coordinates $(X,Y,Z,T)$, encoding works as follows where $a$ and $d$ are the curve parameters as described here.

$u_{1}←(X+T)(X−T)$.

(_ignored, v) = sqrt_ratio_zeta(1, u_1 * (a  d) * X^2)
. 
$u_{2}←∣vu_{1}∣$ (sign check 1).

$u_{3}←u_{2}Z−T$.

$s←∣(a−d)vu_{3}X∣$.

Set
s_bytes
to be the canonical unsigned littleendian encoding of $s$, which is an integer mod $q$.s_bytes
has extra0x00
bytes appended to reach a length of 32 bytes.
Group Hash
Elligator can be applied to map a field element to a curve point. The map can be applied once to derive a curve point suitable for use with computational DiffieHellman (CDH) challenges, and twice to derive a curve point indistinguishable from random.
In the following section, $a$ and $d$ are the curve parameters as described here. $ζ$ is a constant and sqrt_ratio_zeta(v_1,v_2)
is a function, both defined in the Inverse Square Roots section.
The Elligator map is applied as follows to a field element $r_{0}$:

$r←ζr_{0}$.

$u_{1}←(dr−d+a)(dr−ar−d)$.

$n_{1}←(r+1)(a−2d)$.

$m,x=$
sqrt_ratio_zeta
$(1,u_{1}n_{1})$ where $m$ is a boolean indicating whether or not a square root exists for the provided input. 
If a square root for $u_{1}n_{1}$ does not exist, then $q=−1$ and $x=r_{0}ζx$. Else, $q=1$ and $x$ is unchanged.

$s←xn_{1}$.

$t←−qxs(r−1)(a−2d)_{2}−1$.

If ($s<0$ and $m$ is true) or ($s>0$ and $m$ is false) then $s=−s$.
The Jacobi quartic representation of the resulting point is given by $(s,t)$. The resulting point can be converted from its Jacobi quartic representation to extended projective coordinates via:
$E←2s$
$F←1+as_{2}$
$G←1−as_{2}$
$H←t$
$X←EH$
$Y←FG$
$Z←FH$
$T←EG$
For singlewidth hashtogroup (encode_to_curve
), we apply the above map once. For doublewidth (hash_to_curve
) we apply the map to two field elements and add the resulting curve points.
Test Vectors
Small generator multiples
This table has hexencodings of small multiples of the generator $B$:
Element  Hex encoding 

$0$  0000000000000000000000000000000000000000000000000000000000000000 
$B$  0800000000000000000000000000000000000000000000000000000000000000 
$[2]B$  b2ecf9b9082d6306538be73b0d6ee741141f3222152da78685d6596efc8c1506 
$[3]B$  2ebd42dd3a2307083c834e79fb9e787e352dd33e0d719f86ae4adb02fe382409 
$[4]B$  6acd327d70f9588fac373d165f4d9d5300510274dffdfdf2bf0955acd78da50d 
$[5]B$  460f913e516441c286d95dd30b0a2d2bf14264f325528b06455d7cb93ba13a0b 
$[6]B$  ec8798bcbb3bf29329549d769f89cf7993e15e2c68ec7aa2a956edf5ec62ae07 
$[7]B$  48b01e513dd37d94c3b48940dc133b92ccba7f546e99d3fc2e602d284f609f00 
$[8]B$  a4e85dddd19c80ecf5ef10b9d27b6626ac1a4f90bd10d263c717ecce4da6570a 
$[9]B$  1a8fea8cbfbc91236d8c7924e3e7e617f9dd544b710ee83827737fe8dc63ae00 
$[10]B$  0a0f86eaac0c1af30eb138467c49381edb2808904c81a4b81d2b02a2d7816006 
$[11]B$  588125a8f4e2bab8d16affc4ca60c5f64b50d38d2bb053148021631f72e99b06 
$[12]B$  f43f4cefbe7326eaab1584722b1b4860de554b23a14490a03f3fd63a089add0b 
$[13]B$  76c739a33ffd15cf6554a8e705dc573f26490b64de0c5bd4e4ac75ed5af8e60b 
$[14]B$  200136952d18d3f6c70347032ba3fef4f60c240d706be2950b4f42f1a7087705 
$[15]B$  bcb0f922df1c7aa9579394020187a2e19e2d8073452c6ab9b0c4b052aa50f505 
Hashtogroup
This table has input field elements along with the $(x,y)$ affine coordinates of the output point after applying the elligator map once:
Input field element  $(x,y)$ output point 

2873166235834220037104482467644394559952202754715866736878534498814378075613  (1267955849280145133999011095767946180059440909377398529682813961428156596086, 5356565093348124788258444273601808083900527100008973995409157974880178412098) 
7664634080946480262422274939177258683377350652451958930279692300451854076695  (1502379126429822955521756759528876454108853047288874182661923263559139887582, 7074060208122316523843780248565740332109149189893811936352820920606931717751) 
707087697291448463178823336344479808196630248514167087002061771344499604401  (2943006201157313879823661217587757631000260143892726691725524748591717287835, 4988568968545687084099497807398918406354768651099165603393269329811556860241) 
4040687156656275865790182426684295234932961916167736272791705576788972921292  (2893226299356126359042735859950249532894422276065676168505232431940642875576, 5540423804567408742733533031617546054084724133604190833318816134173899774745) 
6012393175004325154204026250961812614679561282637871416475605431319079196219  (2950911977149336430054248283274523588551527495862004038190631992225597951816, 4487595759841081228081250163499667279979722963517149877172642608282938805393) 
7255180635786717958849398836099816771666363291918359850790043721721417277258  (3318574188155535806336376903248065799756521242795466350457330678746659358665, 7706453242502782485686954136003233626318476373744684895503194201695334921001) 
6609366864829739556945402594963920739176902000316365292959221199804402230199  (3753408652523927772367064460787503971543824818235418436841486337042861871179, 2820605049615187268236268737743168629279853653807906481532750947771625104256) 
6875465950337820928985371259904709015074922314668494500948688901607284806973  (7803875556376973796629423752730968724982795310878526731231718944925551226171, 7033839813997913565841973681083930410776455889380940679209912201081069572111) 
Poseidon for BLS12377
The Poseidon hash function is a cryptographic hash function that operates natively over prime fields. This allows the hash function to be used efficiently in the context of a SNARK. In the sections that follow we describe our instantiation of Poseidon over BLS12377.
Overview of the Poseidon Permutation
This section describes the Poseidon permutation. It consists of rounds, where each round has the following steps:
AddRoundConstants
: where constants (denoted byarc
in the code) are added to the internal state,SubWords
: where the Sbox $S(x)=x_{α}$ is applied to the internal state,MixLayer
: where a matrix is multiplied with the internal state.
The total number of rounds we denote by $R$. There are two types of round in the Poseidon construction, partial and full. We denote the number of partial and full rounds by $R_{P}$ and $R_{F}$ respectively.
In a full round in the SubWords
layer the Sbox is applied to each element of the
internal state, as shown in the diagram below:
┌───────────────────────────────────────────────────────────┐
│ │
│ AddRoundConstants │
│ │
└────────────┬──────────┬──────────┬──────────┬─────────────┘
│ │ │ │
┌─▼─┐ ┌─▼─┐ ┌─▼─┐ ┌─▼─┐
│ S │ │ S │ │ S │ │ S │
└─┬─┘ └─┬─┘ └─┬─┘ └─┬─┘
│ │ │ │
┌────────────▼──────────▼──────────▼──────────▼─────────────┐
│ │
│ MixLayer │
│ │
└────────────┬──────────┬──────────┬──────────┬─────────────┘
│ │ │ │
▼ ▼ ▼ ▼
In a partial round, in the SubWords
layer we apply the Sbox only to one element
of the internal state, as shown in the diagram below:
│ │ │ │
│ │ │ │
┌────────────▼──────────▼──────────▼──────────▼─────────────┐
│ │
│ AddRoundConstants │
│ │
└────────────┬──────────────────────────────────────────────┘
│
┌─▼─┐
│ S │
└─┬─┘
│
┌────────────▼──────────────────────────────────────────────┐
│ │
│ MixLayer │
│ │
└────────────┬──────────┬──────────┬──────────┬─────────────┘
│ │ │ │
▼ ▼ ▼ ▼
We apply half the full rounds ($R_{f}=R_{F}/2$) first, then we apply the $R_{P}$ partial rounds, then the rest of the $R_{f}$ full rounds. This is called the HADES design strategy in the literature.
Poseidon Parameter Generation
The problem of Poseidon parameter generation is to pick secure choices for the parameters used in the permutation given the field, desired security level $M$ in bits, as well as the width $t$ of the hash function one wants to instantiate (i.e. 1:1 hash, 2:1 hash, etc.).
Poseidon parameters consist of:
 Choice of Sbox: choosing the exponent $α$ for the Sbox layer where $S(x)=x_{α}$,
 Round numbers: the numbers of partial and full rounds,
 Round constants: the constants to be added in the
AddRoundConstants
step,  MDS Matrix: generating a Maximum Distance Separable (MDS) matrix to use in the linear layer, where we multiply this matrix by the internal state.
Appendix B of the Poseidon paper provides sample implementations of both the Poseidon
permutation as well as parameter generation. There is a Python script called
calc_round_numbers.py
which provides the round numbers given the security level
$M$, the width of the hash function $t$, as well as the choice of $α$ used
in the Sbox step. There is also a Sage script, which generates the round numbers,
constants, and MDS matrix, given the security level $M$, the width of the hash
function $t$, as well as the choice of $α$ used in the Sbox step.
Since the publication of the Poseidon paper, others have edited these scripts, resulting in a number of implementations being in use derived from these initial scripts. We elected to implement Poseidon parameter generation in Rust from the paper, checking each step, and additionally automating the Sbox parameter selection step such that one can provide only the modulus of a prime field and the best $α$ will be selected.
Below we describe where we deviate from the parameter selection procedure described in the text of the Poseidon paper.
Choice of SBox
The Poseidon paper focuses on the cases where $α=[−1,3,5]$, as well as BLS12381 and BN254. For a choice of positive $α$, it must satisfy $gcd(α,p−1)=1$, where $p$ is the prime modulus.
For our use of Poseidon on BLS12377, we wanted to generate a procedure for selecting the optimal $α$ for a general curve, which we describe below.
We prefer small, positive $α$ for software (noncircuit) performance, only using $α=−1$ when we are unable to find an appropriate positive $α$. For positive $α$, the number of constraints in an arithmetic circuit per SBox is equal to its depth in a tree of shortest addition chains. For a given number of constraints (i.e. at a given depth in the tree), we pick the largest $α$ at that level that meets the GCD requirement. Since larger $α$ provides more security, choosing the largest at a given depth reduces the round requirement.
The procedure in detail:
We proceed down the tree from depth 2 to depth 5 (where depth 0 is the root of 1):
 At a given depth, proceed from largest number to smaller numbers.
 For a given element, check if $gcd(α,p−1)=1$ is satisfied. If yes, we choose it, else continue.
If we get through these checks to depth of 5 without finding a positive exponent for $α$, then we pick $α=−1$, which is wellstudied in the original Poseidon paper.
For decaf377
, following this procedure we end up with $α=17$.
Round Numbers
We implement the round numbers as described in the original paper. These are the number of rounds necessary to resist known attacks in the literature, plus a security margin of +2 full rounds, and +7.5% partial rounds.
We test our round number calculations with tests from Appendices G and H from the paper which contain concrete instantiations of Poseidon for $α=[−1,3,5]$ and their round numbers.
Round Constants
We do not use the Grain LFSR for generating pseudorandom numbers as described in Appendix F of the original paper. Instead, we use a Merlin transcript to enable parameter generation to be fully deterministic and easily reproducible.
We first append the message "poseidonparamgen"
to the transcript
using label domsep
.
We then bind this transcript to the input parameter choices:
 the width of the hash function $t$ using label
t
,  the security level $M$ using label
M
, and  the modulus of the prime field $p$ using label
p
.
We then also bind the transcript to the specific instance, as done with the Grain LFSR in Appendix F, so we bind to:
 the number of full rounds $R_{F}$ using label
r_F
,  the number of partial rounds $R_{P}$ using label
r_P
, and  the choice of Sbox exponent $α$ using label
alpha
.
We generate random field elements from hashes of this complete transcript of all the input parameters and the derived parameters $α$, $R_{F}$, and $R_{P}$.
Each round constant is generated by obtaining challenge bytes from the Merlin transcript, derived
using label roundconstant
. We obtain $(n+135)/8$ bytes where $n$ is the
field size in bits. These bytes are then interpreted as an unsigned littleendian
integer reduced modulo the field.
MDS Matrix
We generate MDS matrices using the Cauchy method. However instead of randomly sampling the field as described in the text, we deterministically generate vectors $x$ and $y$ as:
$x=[0..t)$ $y=[t..2t)$
Each element $M_{ij}$ of the matrix is then constructed as:
$M_{ij}=x_{i}+y_{j}1 $
where $i,j∈[0..t)$.
This deterministic matrix generation method has been verified to be safe over the
base field of decaf377
, using algorithms 13 described in Grassi, Rechberger
and Schofnegger 2020 over the t
range 1100.
The parameters that are used for poseidon377
are located in the params.rs
module of the source code. The parameters were generated on an Apple M1.
Test Vectors
The following are test vectors for the poseidon377
Poseidon instantiation.
Each section is for a given rate of a fixedwidth hash function, where
capacity is 1. Inputs and output are $F_{q}$ elements. The domain separator
used in each case are the bytes "Penumbra_TestVec"
decoded to an $F_{q}$ element,
where we interpret these bytes in littleendian order.
Rate 1
Input element:
7553885614632219548127688026174585776320152166623257619763178041781456016062
Output element:
2337838243217876174544784248400816541933405738836087430664765452605435675740
Rate 2
Input elements:
7553885614632219548127688026174585776320152166623257619763178041781456016062
2337838243217876174544784248400816541933405738836087430664765452605435675740
Output element:
4318449279293553393006719276941638490334729643330833590842693275258805886300
Rate 3
Input elements:
7553885614632219548127688026174585776320152166623257619763178041781456016062
2337838243217876174544784248400816541933405738836087430664765452605435675740
4318449279293553393006719276941638490334729643330833590842693275258805886300
Output element:
2884734248868891876687246055367204388444877057000108043377667455104051576315
Rate 4
Input elements:
7553885614632219548127688026174585776320152166623257619763178041781456016062
2337838243217876174544784248400816541933405738836087430664765452605435675740
4318449279293553393006719276941638490334729643330833590842693275258805886300
2884734248868891876687246055367204388444877057000108043377667455104051576315
Output element:
5235431038142849831913898188189800916077016298531443239266169457588889298166
Rate 5
Input elements:
7553885614632219548127688026174585776320152166623257619763178041781456016062
2337838243217876174544784248400816541933405738836087430664765452605435675740
4318449279293553393006719276941638490334729643330833590842693275258805886300
2884734248868891876687246055367204388444877057000108043377667455104051576315
5235431038142849831913898188189800916077016298531443239266169457588889298166
Output element:
66948599770858083122195578203282720327054804952637730715402418442993895152
Rate 6
Input elements:
7553885614632219548127688026174585776320152166623257619763178041781456016062
2337838243217876174544784248400816541933405738836087430664765452605435675740
4318449279293553393006719276941638490334729643330833590842693275258805886300
2884734248868891876687246055367204388444877057000108043377667455104051576315
5235431038142849831913898188189800916077016298531443239266169457588889298166
66948599770858083122195578203282720327054804952637730715402418442993895152
Output element:
6797655301930638258044003960605211404784492298673033525596396177265014216269
Fuzzy Message Detection
By design, privacypreserving blockchains like Penumbra don’t reveal metadata about the sender or receiver of a transaction. However, this means that users must scan the entire chain to determine which transactions relate to their addresses. This imposes large bandwidth and latency costs on users who do not maintain online replicas of the chain state, as they must “catch up” each time they come online by scanning all transactions that have occurred since their last activity.
Alternatively, users could delegate scanning to a third party, who would monitor updates to the chain state on their behalf and forward only a subset of those updates to the user. This is possible using viewing keys, as in Zcash, but viewing keys represent the capability to view all activity related to a particular address, so they can only be delegated to trusted third parties.
Instead, it would be useful to be able to delegate only a probabilistic detection capability. Analogous to a Bloom filter, this would allow a detector to identify all transactions related to a particular address (no false negatives), while also identifying unrelated transactions with some false positive probability. Unlike viewing capability, detection capability would not include the ability to view the details of a transaction, only a probabilistic association with a particular address. This is the problem of fuzzy message detection (FMD), analyzed by Beck, Len, Miers, and Green in their paper Fuzzy Message Detection, which proposes a cryptographic definition of fuzzy message detection and three potential constructions.
This section explores how Penumbra could make use of fuzzy message detection:

In Sender and Receiver FMD, we propose a generalization of the original definition where the false positive probability is set by the sender instead of the receiver, and discusses why this is useful.

In Constructing SFMD, we realize the new definition using a variant of one of the original FMD constructions, and extend it in two ways:
 to support arbitrarily precise detection with compact, constantsize keys;
 to support diversified detection, allowing multiple, publicly unlinkable addresses to be scanned by a single detection key.
Unfortunately, these extensions are not mutually compatible, so we only use the first one, and record the second for posterity.
 In SFMD Threat Model, we describe the threat model for SFMD on Penumbra.
 In SFMD in Penumbra, we describe how SFMD is integrated into Penumbra.
 In Parameter Considerations, we discuss how the false positive rates should be chosen.
Acknowledgements
Thanks to George Tankersley and Sarah Jamie Lewis for discussions on this topic (and each independently suggesting the modifications to realize SFMD), to Gabrielle Beck for discussions about the paper and ideas about statistical attacks, and to Guillermo Angeris for pointers on analyzing information disclosure.
Sender and Receiver FMD
The goal of detection capability is to be able to filter the global stream of state updates into a local stream of state updates that includes all updates related to a particular address, without identifying precisely which updates those are. The rate of updates on this filtered stream should be bounded below, to ensure that there is a large enough anonymity set, and bounded above, so that users processing the stream have a constant and manageable amount of work to process it and catch up with the current chain state. This means that the detection precision must be adaptive to the global message rates: if the false positive rate is too low, the filtered stream will have too few messages, and if it is too high, it will have too many messages.
However, this isn’t possible using the paper’s original definition of fuzzy message detection, because the false positive probability is chosen by the receiver, who has no way to know in advance what probability will produce a correctly sized stream of messages. One way to address this problem is to rename the original FMD definition as Receiver FMD (RFMD), and tweak it to obtain Sender FMD (SFMD), in which the sender chooses the detection probability.
Receiver FMD
The paper’s original definition of fuzzy message detection is as a tuple of
algorithms (KeyGen, CreateClue, Extract, Examine)
.^{1} The receiver uses
KeyGen
to generate a root key and a clue key. A sender uses the
receiver’s clue key as input to CreateClue
to produce a clue. The Extract
algorithm takes the root key and a false positive rate $p$ (chosen from some set
of supported rates), and produces a detection key. The Examine
algorithm
uses a detection key to examine a clue and produce a detection result.
This scheme should satisfy certain properties, formalizations of which can be found in the paper:
Correctness.
Valid matches must always be detected by Examine
; i.e., there are no false negatives.
Fuzziness.
Invalid matches should produce false positives with probability approximately $p$, as long as the clues and detection keys were honestly generated.
Detection Ambiguity.
An adversarial detector must be unable to distinguish between a true positive and a false positive, as long as the clues and detection keys were honestly generated.
In this original definition, the receiver has control over how much detection precision they delegate to a third party, because they choose the false positive probability when they extract a detection key from their root key. This fits with the idea of attenuating credentials, and intuitively, it seems correct that the receiver should control how much information they reveal to their detector. But the information revealed to their detector is determined by both the false positive probability and the amount of other messages that can function as cover traffic. Without knowing the extent of other activity on the system, the receiver has no way to make a principled choice of the detection precision to delegate.
Sender FMD
To address this problem, we generalize the original definition (now Receiver FMD) to Sender FMD, in which the false positive probability is chosen by the sender.
SFMD consists of a tuple of algorithms (KeyGen, CreateClue, Examine)
. Like
RFMD, CreateClue
creates a clue and Examine
takes a detection key and a
clue and produces a detection result. As discussed in the next
section, SFMD can be realized using tweaks to either of the
RFMD constructions in the original paper.
Unlike RFMD, the false positive rate is set by the sender, so CreateClue
takes both the false positive rate $p$ and the receiver’s clue key. Because the
false postive rate is set by the sender, there is no separation of capability
between the root key and a detection key, so KeyGen
outputs a clue key and a
detection key, and Extract
disappears.
In RFMD, flag ciphertexts are universal with respect to the false positive rate, which is applied to the detection key; in SFMD, the false positive rate is applied to the flag ciphertext and the detection key is universal.
Unlike RFMD, SFMD allows detection precision to be adaptive, by having senders use a (consensusdetermined) false positive parameter. This parameter should vary as the global message rates vary, so that filtered message streams have a bounded rate, and it should be the same for all users, so that messages cannot be distinguished by their false positive rate.
We change terminology from the FMD paper; the paper calls detection and
clue keys the secret and public keys respectively, but we avoid this in favor of
capabilitybased terminology that names keys according to the precise capability
they allow. The “clue” terminology is adopted from the Oblivious Message
Retrieval paper of Zeyu Liu and Eran Tromer; we CreateClue
and Examine
clues
rather than Flag
and Test
flag ciphertexts.
Constructing SFMD
The original FMD paper provides three constructions of RFMD. The first two realize functionality for restricted falsepositive probabilities of the form $p=2_{−n}$; the third supports arbitrary fractional probabilities using a much more complex and expensive construction.
The first two schemes, RFMD1 and RFMD2, are constructed similarly to a Bloom
filter: the CreateClue
procedure encrypts a number of 1
bits, and the Examine
procedure uses information in a detection key to check whether some subset of
them are 1
, returning true
if so and false
otherwise. The false positive
probability is controlled by extracting only a subset of the information in the
root key into the detection key, so that it can only check a subset of the bits
encoded in the clue. We focus on RFMD2, which provides more compact
ciphertexts and chosenciphertext security.
In this section, we:
 recall the construction of RFMD2, changing notation from the paper;
 adapt RFMD2 to SFMD2, which provides sender FMD instead of receiver FMD;
 extend SFMD2 to use deterministic derivation, allowing 32byte flag and detection keys to support arbitraryprecision false positive probabilities;
 extend SFMD2 to support diversified detection, allowing multiple, unlinkable flag keys to be detected by a single detection key (though, as this extension conflicts with the preceding one, we do not use it);
 summarize the resulting construction and describe how it can be integrated with a Sapling or Orchardstyle key hierarchy.
The RFMD2 construction
First, we recall the paper’s construction of RFMD2, changing from multiplicative to additive notation and adjusting variable names to be consistent with future extensions.
The construction supports restricted false positive values $p=2_{−n}$ for $n≤γ$, a global parameter determining the minimum false positive rate. $G$ is a group of prime order $q$ and $H_{1}:G_{3}→{0,1}$ and $H_{2}:G×{0,1}_{γ}→Z_{q}$ are hash functions.
RFMD2/KeyGen
Choose a generator $B$ G$. For $i=1,…,γ$, choose $x_{i}$ Z_{q}$ and compute $X_{i}=[x_{i}]B$. Return the root key $rk←(x_{1},…,x_{γ})$ and the clue key $ck←(B,X_{1},…,X_{γ})$.
RFMD2/Extract
On input $p=2_{−n}$ and root key $rk$, parse $(x_{1},…,x_{γ})←rk$ and return $(x_{1},…,x_{n})$ as the detection key.
RFMD2/Flag
On input $ck$, first parse $(B,X_{1},…,X_{n})←ck$, then proceed as follows:
 Choose $r$ Z_{q}$ and compute $P←[r]B$.
 Choose $z$ Z_{q}$ and compute $Q←[z]B$.
 For each $i=1,…,γ$, compute
 a key bit $k_{i}←H_{1}(P∣∣[r]X_{i}∣∣Q)$;
 a ciphertext bit $c_{i}←k_{i}⊕1$.
 Compute $m←H_{2}(P∣∣c_{1}∣∣⋯∣∣c_{γ})$.
 Compute $y←(z−m)r_{−1}$.
Return the clue $c←(P,y,c_{1},…,c_{γ})$.
RFMD2/Examine
On input $dk$, $c$, first parse $(x_{1},…,x_{n})←dk$, $(P,y,c_{1},…,c_{γ})←c$, then proceed as follows:
 Compute $m←H_{2}(P∣∣c_{1}∣∣⋯∣∣c_{γ})$.
 Recompute $Q$ as $Q←[y]P+[m]B$.
 For each $i=1,…,γ$, compute
 a key bit $k_{i}←H_{1}(P∣∣[x_{i}]P∣∣Q)$;
 a plaintext bit $b_{i}←c_{i}⊕k_{i}$.
If all plaintext bits $b_{i}=1$, return $1$ (match); otherwise, return $0$.
To see that the Test
procedure detects true positives, first note that
since $P=[r]B$ and $y=(z−m)r_{−1}$,
$Q =[y]P+[m]B=[(z−m)r_{−1}][r]B+[m]B=[z]B, $
so the detector will recompute the $Q$ used by the sender; then, since $[r]X_{i}=[r][x_{i}]B=[x_{i}]P$, the detector’s $k_{i}←H_{1}(P∣∣[x_{i}]P∣∣Q)$ is the
same as the sender’s $k_{i}←H_{1}(P∣∣[r]X_{i}∣∣Q)$.
On the other hand, if the clue was created with a different clue key, the resulting $k_{i}$ will be random bits, giving the desired $2_{−n}$ false positive probability.
One way to understand the components of the construction is as follows: the message consists of $γ$ bits of a Bloom filter, each encrypted using hashed ElGamal with a common nonce $r$ and nonce commitment $P=[r]B$. The points $P,Q$ are included in the ElGamal hash, ensuring that any change to either will result in a random bit on decryption. The pair $(Q,y)$ act as a public key and signature for a onetime signature on the ciphertext as follows: the sender constructs the point $Q$ as the output of a chameleon hash with basis $(P,B)$ on input $(0,z)$, then uses the hash trapdoor (i.e., their knowledge of the discrete log relation $P=[r]B$) to compute a collision $(y,m)$ involving $m$, a hash of the rest of the ciphertext. The collision $y$ acts as a onetime signature with public key $Q$.
The fact that the generator $B$ was selected at random in each key generation is not used by the construction and doesn’t seem to play a role in the security analysis; in fact, the security proof in Appendix F has the security game operate with a common generator. In what follows, we instead assume that $B$ is a global parameter.
From RFMD2 to SFMD2
Changing this construction to the SFMD model is fairly straightforward: rather
than having the sender encrypt $γ$ bits of the Bloom filter and only allow
the detector to decrypt $n≤γ$ of them, we have the sender only encrypt
$n≤γ$ bits of the Bloom filter and allow the detector to decrypt all
potential bits. As noted in the previous
section, this means that in the SFMD context,
there is no separation of capability between the root key and the detection key.
For this reason, we skip the Extract
algorithm and (conceptually) merge the
root key into the detection key.
The SFMD2 construction then works as follows:
SFMD2/KeyGen
For $i=1,…,γ$, choose $x_{i}$ Z_{q}$ and compute $X_{i}=[x_{i}]B$. Return the detection key $dk←(x_{1},…,x_{γ})$ and the clue key $ck←(X_{1},…,X_{γ})$.
SFMD2/CreateClue
On input clue key $ck$, first parse $(X_{1},…,X_{n})←ck$, then proceed as follows:
 Choose $r$ Z_{q}$ and compute $P←[r]B$.
 Choose $z$ Z_{q}$ and compute $Q←[z]B$.
 For each $i=1,…,n$, compute
 a key bit $k_{i}←H_{1}(P∣∣[r]X_{i}∣∣Q)$;
 a ciphertext bit $c_{i}←k_{i}⊕1$.
 Compute $m←H_{2}(P∣∣n∣∣c_{1}∣∣⋯∣∣c_{n})$.
 Compute $y←(z−m)r_{−1}$.
Return the clue $c←(P,y,n,c_{1},…,c_{n})$. (We include $n$ explicitly rather than have it specified implicitly by $c_{n}$ to reduce the risk of implementation confusion).
SFMD2/Examine
On input detection key $dk$ and clue $c$, first parse $(x_{1},…,x_{γ})←dk$ and $(P,y,n,c_{1},…,c_{n})←c$, then proceed as follows:
 Compute $m←H_{2}(P∣∣n∣∣c_{1}∣∣⋯∣∣c_{n})$.
 Recompute $Q$ as $Q←[y]P+[m]B$.
 For each $i=1,…,n$, compute
 a key bit $k_{i}←H_{1}(P∣∣[x_{i}]P∣∣Q)$;
 a plaintext bit $b_{i}←c_{i}⊕k_{i}$.
If all plaintext bits $b_{i}=1$, return $1$ (match); otherwise, return $0$.
The value of $n$ is included in the ciphertext hash to ensure that the encoding of the ciphertext bits is nonmalleable. Otherwise, the construction is identical.
Compact clue and detection keys
One obstacle to FMD integration is the size of the clue keys. Clue keys are required to create clues, so senders who wish to create clues need to obtain the receiver’s clue key. Ideally, the clue keys would be bundled with other address data, so that clues can be included with all messages, rather than being an optin mechanism with a much more limited anonymity set.
However, the size of clue keys makes this difficult. A clue key supporting false positive probabilities down to $2_{−γ}$ requires $γ$ group elements; for instance, supporting probabilities down to $2_{−14}=1/16384$ with a 256bit group requires 448byte flag keys. It would be much more convenient to have smaller keys.
One way to do this is to use a deterministic key derivation mechanism similar to BIP32 to derive a sequence of child keypairs from a single parent keypair, and use that sequence as the components of a flag / detection keypair. To do this, we define a hash function $H_{4}:G×Z→Z_{q}$. Then given a parent keypair $(x,X=[x]B)$, child keys can be derived as $x_{i}X_{i} ←x+H_{4}(X∣∣i)←X+[H_{4}(X∣∣i)]B $ with $X_{i}=[x$