Finality: what is it and why is it important?

DaveH · July 29, 2020, 2:55pm

Japanese: Translation here
CoinPost Japan: Article here
Chinese: Translation here
Chinese 2: Translation here
Spanish: Translation here
Russian: Translation here
Italian: Translation here

Hi all,

There have been a few requests on various channels to understand Finality better, what the options are, why it is important and some more detail behind the process of what is being done for Symbol. This post tries to pull a few of those things together as an information resource.

Before getting into it: this is an area I personally have learned about in more depth very recently. I’m not an expert, I don’t claim to be, and I would not be able to design the solution from scratch. However, thanks to @jaguar0625 and @gimre’s patience, help from external security companies and a fair bit of reading online, I can explain it at a surface level for people. Frankly it makes my mind bend and my head hurt to keep up; what the devs are doing is complex and not for the faint of heart.

I have tried to stay out of the technical detail where possible, and it is deliberately oversimplified in places to make it easier to read. This will be translated and posted on the official blog as well.

The two second read is below:

Finality is important because without it, any transaction can theoretically be rolled back
There are two approaches: explicit (deterministic) and implicit (probabilistic)
Without Deterministic finality, users of the chain need to build in checks to manage risk until they can consider a transaction to be final, e.g. 6 BTC confirmations (~1 hour)
Newer chains have ways to add Deterministic finality, which means they are secure, more robust, and require fewer workarounds, but there are tradeoffs such as being more prone to DDoS. These chains are Symbol’s competition/peers.
When talking to exchanges, custodians, regulated entities, and ownership transfer solutions, finality is important and it has been raised in conversations by at least one Tier 1 exchange for XYM listing.
The Core Devs have worked on several solution designs based on leading approaches and bleeding edge research. It is a hard subject and we have engaged external assistance to support and help where appropriate. The design is progressing well but has areas that need to be fully decided and this is taking longer than estimated.
At this stage a launch date delay is looking likely. The devs have continued to develop in parallel with the design but are now at a point where progress stalls without the design being fully agreed. The length of the delay is likely to be weeks not months but won’t be known until the design is complete.

For those who want more detail, please continue reading and ask any questions.

Blockchain 101

As most people reading this will know, a blockchain is a series of linked blocks that a network of nodes agree are linked together and form one cryptographically verifiable chain/ledger that all nodes have a copy of. There are a few basic terms worth defining:

Consensus: The mechanism by which nodes on a network all agree that a single chain of blocks is the valid and verifiable chain, and how they handle disagreements, malicious nodes, nodes coming on/offline, etc.

NEM uses a modified NXT/PeerCoin Proof of Stake consensus. The modifications are an importance/reputational score and a VRF hash to help ensure harvesting randomness
Safety: How to ensure honest parties do not agree on conflicting things
Liveness: How to ensure agreement is made on something by honest parties
Verifiable Random Function (VRF): A function whose outputs can be cryptographically verified, publicly, to be genuinely pseudo-random.

In the Symbol wallet (and opt in) you will see VRF keys as an option for anyone who wants to harvest…make sure you leave it ticked or you won’t be eligible to harvest (don’t worry, it will be in the guides)
Partitioning: When a network splits/does not agree
- Consistency and Partition Tolerance: If a network partition occurs then no blocks will be added until the state is rectified (liveness suffers); the chain(s) effectively stall
- Availability and Partition Tolerance: If a network partition occurs then all partitions will make independent progress. The independent chains will grow separately, and when the partition is resolved the competing chains will need to be reconciled (often, but not always, the chain from the largest partition wins)
Finality: How users of a chain can “know” if a transaction is final and will not change, or could be rolled back on a blockchain.
- Byzantine Fault Tolerance (BFT): How a network can cope with any given actor turning malicious/adversarial or not acting honestly. In finality this is often PBFT (Practical BFT) or DBFT (Delegated BFT)

What is Finality

Finality in the blockchain space is the way someone can tell if a transaction or block on a chain is at risk of changing in the future or not.

All blockchains have micro forks, frequently, and they continually recover from that state to a single chain. In these micro forks some nodes think block A is the next valid block and some think block B is, so the network forks into 2 parallel chains that can persist for 1 or many blocks.

The network needs a way to come to Consensus on which parallel chain is correct and to roll back the incorrect one, so that all nodes are back on the same chain (and therefore a common ledger). It is also useful in that process if the chain is able to mark a given block as agreed upon (final).
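To make the rollback idea a little more concrete, here is a minimal Python sketch. It is not how Symbol or any real node is implemented: chains are assumed to be plain lists of block ids, the competing chain is assumed to have already won whatever comparison the consensus rules apply, and `rollback_depth`, `can_switch` and `finalized_height` are hypothetical names used only for illustration.

```python
from typing import List


def common_prefix_len(local: List[str], remote: List[str]) -> int:
    """Number of leading blocks the two chains share."""
    n = 0
    for a, b in zip(local, remote):
        if a != b:
            break
        n += 1
    return n


def rollback_depth(local: List[str], remote: List[str]) -> int:
    """Blocks the local node would have to undo to adopt the remote chain."""
    return len(local) - common_prefix_len(local, remote)


def can_switch(local: List[str], remote: List[str], finalized_height: int) -> bool:
    """Allow the switch only if no block marked final would be rolled back.

    Assumption: finalized_height is the count of leading blocks marked final,
    so 0 means nothing is final yet.
    """
    return common_prefix_len(local, remote) >= finalized_height


# A micro fork: both chains agree on "g" and "a1", then diverge.
local = ["g", "a1", "a2", "a3"]
remote = ["g", "a1", "b2", "b3", "b4"]
print(rollback_depth(local, remote))                  # 2
print(can_switch(local, remote, finalized_height=2))  # True
print(can_switch(local, remote, finalized_height=3))  # False
```

The last line is the point of marking blocks as final: once block 3 is agreed upon, a competing chain that forks below it is simply refused, however long it is.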

Finality is the process by which a transaction in a given chain can be considered final, so a transaction won’t or is highly unlikely to be rolled back and can be trusted as permanent.

What types of Finality exist?

There are 2 main ways to achieve it - Probabilistic or Deterministic.

Probabilistic relies on enough blocks passing that it is prohibitively expensive/impractical to deliberately start a parallel chain and try and roll back the real chain, it is probably final
Deterministic means the network agrees (and normally marks somehow) a block or transaction as final on chain – it is determined to be final

BTC and most other PoW chains use Probabilistic, the more modern PoS ones tend to aim for Deterministic, primarily because the security is better, they have faster block times and/or are aiming for specific use cases or behaviours. Ultimately it means their transactions and balances can be trusted more easily, more securely and more quickly.

Probabilistic finality is the reason why, for example, Binance takes 6 confirmations to credit your BTC deposit…they wait for the chain to grow long enough to reduce their counterparty risk. It is generally accepted that after 6 confirmations (~1 hour) a 51% attack that rolls the chain back is unlikely enough that the transaction is probably final, although it never technically is.
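As a rough illustration of why waiting for more confirmations helps, the sketch below implements the simplified attacker model from section 11 of the Bitcoin whitepaper. It assumes an attacker who controls a fixed share `q` of the hash power and tries to catch up from `z` blocks behind; it is not how any exchange actually models its risk, and the function name is made up for this example.

```python
import math


def attacker_success_probability(q: float, z: int) -> float:
    """Chance an attacker with hash-power share q ever overtakes a
    transaction buried under z blocks (Bitcoin whitepaper model)."""
    p = 1.0 - q  # honest share of the hash power
    if q >= p:
        return 1.0  # a majority attacker always catches up eventually
    lam = z * (q / p)  # expected attacker progress while z blocks are mined
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total


for z in (0, 1, 3, 6):
    print(z, round(attacker_success_probability(0.10, z), 7))
```

Under this model, with a 10% attacker, the probability falls from 1.0 at zero confirmations to roughly 0.0002 at six. It shrinks quickly but never reaches zero, which is exactly the "probably final" behaviour described above.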

Why is Finality important?

Anyone using any blockchain should not trust transactions in a given block, unless they are marked final or the chain is long enough that they are probabilistically final. The very simple answer is that it is important because otherwise people need to “guess” at when a transaction can be trusted, whether it’s a value transfer, a multi sig setup, information in a message – any transaction on chain.

By way of an example of what could happen without it:

I send 100BTC to Binance and they credit it to me after 1 confirmation
I trade it to Eth and withdraw to another wallet immediately (takes 10-20 mins)
At confirmation 4 (40 mins) BTC transaction rolls back due to being on the “wrong chain”

The net effect of the above is that my transfer to Binance never happened, my 100BTC is still in my account, but my Eth trades and subsequent withdrawal did so I also have Eth in my wallet…and Binance is down 100BTC. This is oversimplified but illustrates the concept.
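The check an exchange or payment provider builds in to avoid this can be sketched in a few lines of Python. This is an assumed, simplified policy rather than any exchange's real logic: `Deposit`, `can_credit` and the `finalized` flag are hypothetical, and the default of 6 confirmations is just the BTC convention mentioned earlier.

```python
from dataclasses import dataclass


@dataclass
class Deposit:
    amount: float
    confirmations: int       # confirmations reported for the deposit so far
    finalized: bool = False  # True only if the chain itself marks the block final


def can_credit(deposit: Deposit, required_confirmations: int = 6) -> bool:
    """Credit once the chain says final, otherwise fall back to counting blocks."""
    if deposit.finalized:
        return True
    return deposit.confirmations >= required_confirmations


print(can_credit(Deposit(100, confirmations=1)))                  # False
print(can_credit(Deposit(100, confirmations=6)))                  # True
print(can_credit(Deposit(100, confirmations=1, finalized=True)))  # True
```

On a chain with deterministic finality the first branch does the work and the waiting rule becomes a fallback; on a probabilistic chain the confirmation count is all there is.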

The double spend problem that most crypto payment systems have to solve if using BTC is another example. The customer doesn’t want to wait 1 hour to buy a coffee, and a coffee shop doesn’t want to risk losing funds after the customer leaves due to a roll back. This is part of what Lightning addresses on BTC. Another solution is that payment providers take the risk and write off losses from their revenue, so ultimately fees increase; providers also create off chain risk management solutions.

There are other instances such as transferring ownership of artwork, a security token, a collectible, etc. The way that is dealt with in older chains is to wait a period of time until it is probabilistically final, and that wait is coded into the solutions that use the chains. Newer chains have started to solve the issue by moving to Deterministic Finality, so competing chains have this and it is part of the decision making process for some use cases when selecting a chain.

The question of finality is not one that came up most of the time when XEM was being listed, it wasn’t a thing. But it has already come up from exchanges in relation to XYM, because exchanges (and custodial solutions) now know there are challenges with it and solutions for it. It is important when you start speaking about things like central bank usage, security token usage, new tier 1 exchange listings, cross chain swaps on public chains etc

It is mostly not important for private chains but can be in some cases.

What are some common Finality Approaches

In deterministic finality, there are a few main approaches being used on other chains:

Polkadot: GRANDPA (PBFT), validates chains not blocks, available partition
Ethereum 2.0: Casper FFG (PBFT), 20 min finality, available partition
Cosmos/Tendermint: PBFT, fast finality, consistent partition
Algorand: New, fast finality but uses consistent partition and assumes synchronicity

Cosmos/Tendermint were the first to put up a production solution (it relies on validators), Algorand is generally held up as one of the more advanced ways of doing it at present, however it relies on some assumptions that have yet to be proven over time and on a strong network synchronicity assumption.

If you want to mark a block/transaction as final, the problem breaks down into 2 primary steps:

Who should be allowed to vote on it being final
How do all the voters know to trust all the other voters and know when they have all voted
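A minimal sketch of those two steps, using the classic BFT bound rather than any specific chain's design: it assumes a fixed, known set of equally weighted voters, of which fewer than a third may be faulty. Real systems usually weight votes by stake and verify signatures, which is left out here, and all names are hypothetical.

```python
from typing import Set


def max_faulty(n: int) -> int:
    """Largest number of faulty voters f that n voters can tolerate (n >= 3f + 1)."""
    return (n - 1) // 3


def quorum_size(n: int) -> int:
    """Votes needed so that any two quorums share at least one honest voter."""
    return (n + max_faulty(n)) // 2 + 1


def is_final(votes: Set[str], voters: Set[str]) -> bool:
    """Step 1: only count votes from eligible voters.
    Step 2: the block is final once a quorum of them has voted."""
    valid_votes = votes & voters
    return len(valid_votes) >= quorum_size(len(voters))


voters = {"n1", "n2", "n3", "n4"}
print(quorum_size(len(voters)))                  # 3
print(is_final({"n1", "n2"}, voters))            # False
print(is_final({"n1", "n2", "n3"}, voters))      # True
print(is_final({"n1", "n2", "intruder"}, voters))  # False
```

The hard part in practice is not this arithmetic but getting every node to agree on who is in `voters` and to learn that the quorum has been reached, which is where the approaches below differ.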

There are largely two sub-types of deterministic finality:

Semi Decentralised

These solutions rely on a known pool of nodes that are pre-selected to be allowed to vote that a given block or transaction is final. This is used in most solutions where you hear terms like validator or witness (Cosmos, Steem, Eth 2.0, Polkadot etc). Common models for this include DBFT, PBFT, Threshold Signatures and a few others. Most of the solutions have theoretical scaling limits and haven’t been widely pushed beyond 250 nodes voting, primarily due to the network chatter involved in point 2 above. These limits are hard to define precisely as they also depend on message complexity, frequency of finalisation runs and a few other things.
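To give a feel for why network chatter limits the size of the voting pool, here is a back-of-the-envelope Python sketch. It assumes every voter sends one message to every other voter in each of two voting rounds (loosely modelled on PBFT's prepare and commit phases) and ignores proposals, retries and any signature aggregation, so the numbers are indicative only.

```python
def all_to_all_messages(n: int, rounds: int = 2) -> int:
    """Messages sent if each of n voters messages every other voter once per round."""
    return rounds * n * (n - 1)


for n in (10, 50, 250, 1000):
    print(n, all_to_all_messages(n))
```

Under these assumptions 10 voters exchange 180 messages per finalisation run, 250 voters exchange 124,500 and 1,000 voters nearly 2 million. The count grows with the square of the pool size, which is why frequency of finalisation runs and message size matter so much.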

In NEM’s scenario this option means we may need to select a subset of supernodes and have conversations about the fairness and robustness of that selection approach, if it can be randomised etc. There is also a risk as the number of voting nodes increases that finality begins to fall behind. These are hypothetical risks because the issues have not been widely dealt with in the real world and rely on mathematical/desktop modelling in research papers to an extent, they would be tested as part of Testnet.

These solutions can involve an Available or Consistent Partition approach.

Fully Decentralised

These solutions rely on a dynamic voting pool that any node can join or leave to vote on a given block. All nodes need to have a way to know who should vote and when voting is complete without needing to speak to every other node to establish that (reduced network chatter).

The solutions are more complex, but more robust, more secure and more scalable. There are a couple of chains who have well documented approaches to this problem, for example Algorand.

Algorand specifically uses a Consistent Partition approach, which means that it would stall in the presence of a network split and need to recover (automatically). A longer stall may lead to the available voting keys running out and finality needing to be restarted completely; this is automated and runs on an hourly basis. This means that consumers of the chain data need to handle the situation where finality exists but is expected to fail and recover, so any check that a transaction is final immediately needs an edge case path for when finality takes up to 1 hour to be marked.
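For a consumer of chain data, that edge case path might look something like the Python sketch below. It is an assumption-heavy illustration: `is_finalized` stands in for whatever call a real client library offers, and the function name, timeout and poll interval are invented for the example rather than taken from Algorand or Symbol.

```python
import time
from typing import Callable


def wait_for_finality(
    is_finalized: Callable[[], bool],
    timeout_seconds: float,
    poll_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    now: Callable[[], float] = time.monotonic,
) -> str:
    """Return "final" once the check passes, or "pending" if the timeout is hit."""
    deadline = now() + timeout_seconds
    while True:
        if is_finalized():
            return "final"
        if now() >= deadline:
            return "pending"  # caller takes the slow path: hold funds, retry later
        sleep(poll_seconds)


# Normal case: finality is already marked, so the fast path is taken.
print(wait_for_finality(lambda: True, timeout_seconds=5))   # final
# Stalled case: nothing is marked before the timeout expires.
print(wait_for_finality(lambda: False, timeout_seconds=0))  # pending
```

The design choice here is to treat "not final yet" as a normal return value rather than an error, since on a Consistent Partition chain a temporary stall is expected behaviour.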

Where are we up to with Symbol

The good news is that some other chains have begun to solve this issue, and the designs are open source, so our design process in theory is slightly easier, but is still complex. None of these other chains for example use the same Consensus base + modifications as we do.

The Core Devs have been working on and researching the solution design for several months and fully understand the problem domain. We have recently brought in a specialist security company who has worked on multiple similar projects, and whose reports are recognised by entities making decisions about things like due diligence by exchanges for listing. Their head cryptographer is actively working with the team right now and early feedback has been positive on the design but there is still work to do.

An Algorand-like approach had been considered and up until very recently looked possible to apply, however near the end of the process a particularly thorny issue was identified around Reduction. Specifically, the approach relies on validating blocks, however we (similar to Polkadot) validate chains, the micro forks (>1 blocks) mentioned earlier. To reconcile those two would require a considerable rewrite and likely a design decision toward a Consistent Partition approach that NEM chose not to follow early in the development work, favouring an Available Partition approach.

The next step is to (re)assess a PBFT solution for launch and the Core Devs are working on this actively at present, the conversation is progressing daily and involves a lot of research and modelling. This approach was previously set aside temporarily so research work is not beginning from scratch and the security firm is actively assisting with this approach.

If this approach proves fruitful and passes the design reviews, it is likely to provide a similar approach to chains that could be considered direct competition for Symbol (Eth 2.0, Polkadot and others).

The design process has taken longer than estimated, primarily because these feature sets are on the leading edge of blockchain academic research, building on top of solutions like PBFT that have been around for several years. It is a very dense area that there is no clear, common right answer to in the industry. Getting the solution right puts Symbol Public Chain right on the front edge of technology solutions in the market today and alongside projects that have had significant success (both in adoption and token price performance), but that is not a simple process.

The Core Developers have continued coding in parallel with the design until quite recently when we reached a point that it is no longer possible to progress without the final design, so progress has continued but has now reached a bottleneck.

The next steps from here are to finalise the design, which we hope to do in the coming 1-2 weeks (this is still an estimate not a hard date). Then we can make a more firm estimate of the work needed to complete the implementation.

At this stage it looks likely it may have an impact on testnet starting and therefore possibly launch date, but there are some options we are looking at to try and avoid or minimise this as much as possible. For context we are talking about a few weeks, not a few months IF it occurs.

I hope the above all makes sense, and anyone still reading is still awake. Please do ask questions and I will do my best to answer (or source answers for) them.

Useful further reading

The below resources are some of the content that has helped me understand the issue and options:

h-gocchi · July 24, 2020, 9:42pm

Thank you for the detailed and easy-to-understand explanation!
From this explanation, we can understand that finality is very important for avoiding rollback, safety and robustness.
The examples that are important in the case of Exchange usage, central bank usage, security token usage, new tier 1 exchange listings, cross chain swaps on public chains etc are also very helpful to imagine.

TakaNobu · July 25, 2020, 2:54am

Can I post a Japanese translation of part of this article on Qiita, the Japanese technology sharing service?

DaveH · July 25, 2020, 5:58am

Sure, I have no problem with it. A full translation will come soon but that might be helpful until it is ready

Edit: @TakaNobu, CoinPost Japan has already picked it up as well https://coinpost.jp/?p=170312&utm_source=coinview

TakaNobu · July 25, 2020, 6:31am

Thanks!
Qiita has a different target audience and sticks directly to people who want to do interesting things with new IT technologies. I’ll try it.

TakaNobu · July 26, 2020, 2:40pm

I’ve done

Dtbychris · July 26, 2020, 3:04pm

chinese^^（繁體）

DaveH · July 29, 2020, 2:55pm

Thanks all, I’ve edited the original article to include Japanese, Chinese, Spanish, Russian and Italian, credit to @TakaNobu, @Dtbychris, @psputnik, @klimgeran & @thilon for the translations

klimgeran · July 29, 2020, 11:30am

Russian

WebSite: https://nemnews.io/finalizacija-chto-jeto-takoe-i-pochemu-jeto-vazhno/

klimgeran · July 29, 2020, 11:31am

Italian

WebSite: https://nemitalia.io/finalizzazione-cose-e-perche-e-importante/

Dtbychris · July 29, 2020, 11:32am

@DaveH

thilon · July 29, 2020, 2:18pm

China
http://www.itechly.com/articles/2585.html

Dtbychris · July 29, 2020, 2:43pm

@DaveH

simonx · August 23, 2020, 9:05am

A great article. I would like permission to publish it in Turkish. I believe it will be useful.

DaveH · August 24, 2020, 6:36am

Thanks @simonx yep you have permission. Just link back to the original please.

Anyone is free to republish in any language and link back
