By Cassandra Heart, Arash Afshar
Digital signatures are a foundational concept in blockchain and cryptocurrencies. Modern blockchains use digital signatures to secure billions of dollars of value. Digital signatures use what is known as a keypair, a pair of random-looking values, where one key is a “private key” and the other a “public key”. Through digital signatures, any person with the “private key” can “sign” a transaction and spend the digital currency that the key controls. Therefore, it is crucial to safeguard the “private key”. Some tech-savvy users of blockchains opt to safeguard this key themselves, and accept the risk of theft or loss of the key (and therefore the loss of their funds). In contrast, other blockchain users trust online wallets or exchanges with the safeguarding of their keys. Of course, this decision comes with its own set of risks, which depend on the competency of the third party.
In both these options, the user is putting all their trust in a single entity, which may not be desirable. Enter the Threshold Digital Signature: a solution which requires a “threshold” of at least two cooperating participants to produce a signature, and which removes the problem of trusting a single entity. In this article we:
- Provide an intuitive description of Threshold Signatures and their applications,
- Dig a bit deeper and look into various Threshold Signature schemes, and
- Compare Threshold Signatures with other techniques, such as Multisig wallets.
The intuition for threshold signatures
As a developer in the space of threshold cryptography, it’s really exciting to see these innovations becoming a topic in the mainstream, but readers unfamiliar with cryptography or the math behind it quickly hit roadblocks upon encountering phrases like “Paillier cryptosystem”, “homomorphic encryption” or “Galois field”. It gets even more complicated when you discuss all the moving pieces needed to coordinate the communication, and as a consequence, very few organizations have been willing to investigate its potential. But it doesn’t have to be scary; in the end, the math comes down to not much more than multiplication and addition. So let’s ELI5: What the heck is a threshold signature?
In metaphorical terms, signatures are akin to flying a kite on an invisible string. The kite itself is the public key — everyone can see it in the sky. The kite flier moves the kite around by manipulating the invisible string — the private key. The path it takes in the sky as it flies is the signature. Everyone saw the kite fly through the sky in that path, and only through the use of that invisible string was that flight path possible. This feels really simplified compared to the underlying math, but ultimately this metaphor is useful for demonstrating the coordination and work required to make threshold signing possible.
Enter threshold cryptography. The premise of threshold is literally in its name: some numerical value must be met for an operation to succeed. Oftentimes these processes are defined using the phrase “t of n”, where n is the number of total possible participants, and t is the threshold number that must be met. A common threshold cryptographic scheme that has been used for quite some time is Shamir’s secret sharing scheme. For those unfamiliar, the process involved uses a mathematical technique called Lagrange interpolation to recombine split values into a secret value. In the metaphorical world, it is taking that invisible string, and separating it into individual threads that many people can hold onto, and in order to fly the kite, the threshold number of people must come together and combine their threads into the string again.
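To make the thread-splitting concrete, here is a minimal sketch of Shamir’s scheme in Python. The prime modulus, the function names `split_secret` and `recover_secret`, and the 3-of-5 parameters are illustrative assumptions only; a real deployment would pick a field that matches the secret it protects and would use a vetted library.

```python
import secrets

# Assumption: a Mersenne prime chosen for illustration. The secret must be
# smaller than this modulus.
PRIME = 2**127 - 1


def split_secret(secret, t, n):
    """Split `secret` into n shares so that any t of them recover it."""
    # Random polynomial of degree t-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule: evaluate the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Five people each hold a thread; any three can rebuild the string.
shares = split_secret(secret=42, t=3, n=5)
recovered = recover_secret(shares[:3])
```

Note that `recover_secret` puts the whole secret back in one place, which is exactly the weakness that threshold signatures are designed to avoid.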
This process works well, and services all over the world use it to secure secret data. The downside is that everyone involved must be in a secure location whenever the secret is broken apart or recombined. In cryptocurrencies, this also means that once the private key has been recombined and used for signing, it should be considered exposed and all funds it holds should be moved to a new key, so that a participant who walks away with the recombined key can’t do anything meaningful with it. This is expensive, not to mention that it requires a lot of coordination between people. What if we could take the powerful math behind cryptography and improve upon this so that nobody ever has to meet in a secure location at all?
The great news is that we can! There are mountains of literature that have risen overnight with new approaches to existing cryptosystems, improvements on previous ones, and completely groundbreaking cryptographic protocols. Navigating this field requires significant time and expertise, but here at Coinbase, we have found and implemented strategies that enable us to leverage these approaches, and support the novel approaches as they are discovered and peer reviewed. There’s a lot involved in this process, so let’s bring it back to the metaphor.
The setup process for getting our avid kite fliers ready is the unique twist that enables this entire process to work. Each participant follows the same rule: they bring their own invisible thread and their own piece of kite. Each flier agrees with the others in advance how they are going to fly, and they all proceed to run with their piece of kite at the agreed speed, angle, and time. If anyone strays from the agreed flight plan, the whole tangled mess of kites comes crashing to the ground, but if everyone proceeds as agreed, the kite takes off as one combined piece through the sky, able to perform the flight as planned. When the flight concludes, the parts disassemble in midair, and everyone goes home with their kite and thread. At no point does any one person hold the whole kite or string, and each party sees the flight plan ahead of time to know that nobody is going to try some wild antics that will let them run away with the kite.
Deeper dive into threshold signatures
Now that we have an intuitive understanding of threshold signatures, let’s dive deeper into the concepts and terminology. Threshold signature schemes are part of the secure multi-party computation (MPC) field of cryptography. The main goal of MPC is to enable computation on private data without revealing that data to anyone but its owner. For example, in the kite metaphor, the invisible pieces of thread are the secret shares of the private key, and a threshold signature scheme uses these secret shares to sign the transaction as if with the full private key, without ever reconstructing that key in one place and without revealing either the composite private key or the secret shares.
A very important ingredient of threshold signing is a mathematical construct called Elliptic Curve Cryptography. The TL;DR version is that given `y = x · G`, where the public key `y` and the curve’s base point `G` are publicly known values and `x` is the private key, it is computationally infeasible to find `x` in a reasonable time frame. There are many “curves” that offer this property:
- Secp256k1, which is used in Bitcoin, Ethereum and many others
- Edwards25519, which is used in Cardano, Monero and many others
- BLS12-381, which is used in Ethereum 2.0 and some other chains
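As a rough illustration of `y = x · G`, the sketch below implements point addition and double-and-add scalar multiplication on Secp256k1 in plain Python. The constants are the curve’s public domain parameters; the function names `point_add` and `scalar_mult` are our own, and the code is neither constant-time nor hardened, so treat it as a teaching aid rather than something to sign real transactions with.

```python
import secrets

# Secp256k1 domain parameters (public constants): y^2 = x^3 + 7 over F_P.
P = 2**256 - 2**32 - 977  # field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a + (-a) is the point at infinity
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P  # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord line
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


# Going from x to y is fast; going from y back to x is the hard problem.
x = secrets.randbelow(N - 1) + 1  # private key, a random scalar in [1, N-1]
y = scalar_mult(x, G)             # public key, a point on the curve
```

The asymmetry is the whole point: `scalar_mult` takes a few hundred point operations, while no known method recovers `x` from `y` and `G` in a practical amount of time.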
Given an appropriate elliptic curve, the next step towards a threshold signature is to first choose a standard (i.e., single-signer) digital signature scheme. The popular digital signature schemes are as follows:
- ECDSA, based on the Secp256k1 curve used by Bitcoin
- Schnorr, based on the Secp256k1 curve used by Bitcoin Cash and Mina
- Ed25519, based on the Edwards25519 curve used by Cardano
Finally, given a digital signature scheme, we can now discuss threshold signature schemes. A threshold signature scheme starts from a single-signer scheme and splits the private key between `n` participants. Then, in the signing phase, any `t` of the `n` participants can jointly run the signing algorithm to obtain the signature. Finally, any single (external) party can verify the signature using the same algorithm that verifies single-signer signatures. In other words, the signatures generated by threshold and single-signer signature schemes are interchangeable. Stated differently, a threshold signing algorithm has three phases:
- Generate the public/private key pair. Next, split the private key into multiple secret shares and distribute these shares between the `n` parties. This phase can be performed in two modes.
  - Trusted Dealer mode: a single trusted party generates the private key, then splits it and distributes the shares. The main problem with this approach is that the dealer sees the private key in plaintext.
  - Distributed Key Generation (DKG): an MPC protocol is run between the `n` participants such that, at the end, the participants obtain their secret shares and no one ever sees the private key in plaintext at any point in the process.
- Gather a threshold of `t` participants and run an MPC protocol to sign the transaction.
- Verify the signature, using the standard signature’s verification algorithm.
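The three phases above can be sketched end to end with a deliberately simplified threshold Schnorr signature. Several things here are assumptions made for readability: the tiny multiplicative group (`P = 2039`, `Q = 1019`, `G = 4`) stands in for an elliptic curve and is far too small to be secure, key generation uses Trusted Dealer mode rather than DKG, all signers are simulated in one process, and the naive nonce handling omits the protections that real protocols add. It is not FROST or any production scheme.

```python
import hashlib
import secrets

# Toy Schnorr group (illustrative assumption): P = 2*Q + 1 with P, Q prime,
# and G generates the subgroup of order Q.
P, Q, G = 2039, 1019, 4


def deal_shares(secret, t, n):
    """Phase 1 (Trusted Dealer mode): Shamir-share `secret` over Z_Q."""
    coeffs = [secret] + [secrets.randbelow(Q) for _ in range(t - 1)]
    return {i: sum(c * pow(i, k, Q) for k, c in enumerate(coeffs)) % Q
            for i in range(1, n + 1)}


def lagrange_at_zero(i, signer_ids):
    """Lagrange coefficient of signer i for interpolating at x = 0."""
    num, den = 1, 1
    for j in signer_ids:
        if j != i:
            num = num * -j % Q
            den = den * (i - j) % Q
    return num * pow(den, -1, Q) % Q


def challenge(R, public_key, message):
    """Hash the nonce commitment, public key, and message into Z_Q."""
    data = f"{R}|{public_key}|".encode() + message
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def threshold_sign(shares, signer_ids, public_key, message):
    """Phase 2: signers combine partial signatures; the private key is
    never reconstructed. In a real protocol each signer keeps its nonce
    and share private and only sends G^k_i and z_i to the others."""
    nonces = {i: secrets.randbelow(Q - 1) + 1 for i in signer_ids}
    R = 1
    for i in signer_ids:  # combined commitment R = product of G^k_i
        R = R * pow(G, nonces[i], P) % P
    c = challenge(R, public_key, message)
    # Each partial signature is z_i = k_i + c * lambda_i * x_i (mod Q).
    z = sum(nonces[i] + c * lagrange_at_zero(i, signer_ids) * shares[i]
            for i in signer_ids) % Q
    return R, z


def verify(public_key, message, signature):
    """Phase 3: ordinary single-signer Schnorr verification."""
    R, z = signature
    c = challenge(R, public_key, message)
    return pow(G, z, P) == R * pow(public_key, c, P) % P


private_key = secrets.randbelow(Q - 1) + 1  # seen only by the dealer
public_key = pow(G, private_key, P)
shares = deal_shares(private_key, t=2, n=3)
signature = threshold_sign(shares, [1, 3], public_key, b"send 1 coin")
is_valid = verify(public_key, b"send 1 coin", signature)
```

The verifier in `verify` has no idea that more than one party was involved, which is what makes threshold and single-signer signatures interchangeable.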
Threshold signature schemes are evolving fast. At the time of writing this post, the secure and popular schemes include the following:
- FROST is a threshold signature and DKG protocol that offers minimal rounds of communication and remains secure when signing sessions run in parallel. The FROST protocol is a threshold version of the Schnorr signature scheme.
- DKLs18 is a 2-out-of-2 threshold signature and DKG protocol that offers fast signature computation for the ECDSA signature scheme.
Threshold Signatures and Multisig
Multisig, or multisignature, schemes offer similar capabilities to threshold signatures, with one difference: each participant has its own public key (instead of holding a secret share of a single common private key). This small difference has a huge impact on the cost, speed, and availability of multisig on various blockchains.
- Efficiency: in threshold signature schemes, each public key, and its corresponding private keyshares, belong permanently to a single, fixed group of signers; in multisignatures, each individual participant has its own distinct, dedicated public key. The benefit of this latter scheme is that each such participant can reuse its private–public keypair to participate in arbitrarily many distinct signing groups. The cost of using multisignatures, however, is that the size of the “public key” (actually, a list of public keys) representing any particular such group must grow linearly in the number of members of that group. Similarly, the verification time of a multisignature obviously must grow linearly in the size of the group, as the verifier must in particular read the entire list of public keys representing the group. In threshold schemes, by contrast, just one public key represents the entire group, and both key-size and verification time are constant.
- Availability: because the blockchain itself must check that the minimum threshold of `t` signatures is met, it needs native support for multisignatures. In most cases, this support takes the form of a smart contract. As a result, not all blockchains support multisig wallets. In contrast, MPC-based threshold signatures are independent of the blockchain, as long as the signature scheme that the blockchain uses has a secure threshold version.
Threshold digital signatures enable us to do incredible things previously not possible in cryptocurrencies: multisig contracts require additional costs to operate, but threshold signing can happen without a smart contract. This means that we can support a whole new tier of wallets. Where before there was the traditional custodial wallet, which Coinbase offers in many different ways, or self-custody wallet options like our Coinbase Wallet application, this threshold ECDSA approach allows customers to be active participants in the signing process. In this approach, the user holds one share of the private key and Coinbase holds another, and only when both agree to the flight plan can transactions be signed. This provides the security and trust we are known for at Coinbase, with the user remaining the one in control.
If you are interested in cutting-edge cryptography, check out our open roles here.