5.2 Ledger State
5.2.1 Introduction
The introduction of a voting-based consensus requires a fast and easy way to determine a node's initial opinion for every received transaction. This includes the ability to both detect double spends and transactions that try to spend non-existing funds. These conditions are fulfilled by the introduction of an Unspent Transaction Output (UTXO) model for record-keeping, which enables the validation of transactions in real time, see also the section on UTXO.
The concept of UTXO style transactions is directly linked to the creation of a directed acyclic graph (DAG), in which the vertices are transactions and the links between these are determined by the outputs and inputs of transactions.
To deal with double spends and leverage on certain properties of UTXO, we introduce the Realities Ledger State.
5.2.2 Dependencies
The Ledger State depends on:
- UTXO: see the Section on UTXO DAG as well as 5.1 UTXO.
- Tangle: the Tangle maps the approval relations between messages as well as transactions, see 4.1 The Tangle.
- Solidification: Secures that all non-conflicting transactions converge to the same ledger state, see 4.4 Solidification.
5.2.3 Realities Ledger State
In the Realities Ledger State, we model the different perceptions of the ledger state that exist in the Tangle. In each “reality” on its own there are zero conflicting transactions. Each reality thus forms an in itself consistent UTXO sub-DAG, where every transaction references any other transaction correctly.
Since outputs of transactions can only be consumed once, a transaction that double spends outputs creates a persistent branch in a corresponding UTXO DAG. Each branch receives a unique identifier branchID
. These branches cannot be merged by any vertices (transactions).
A transaction that attempts to merge incompatible branches fails to pass a validity check and is marked as invalid.
The composition of all realities defines the Realities Ledger State.
From this composition nodes are able to know which possible outcomes for the Tangle exist, where they split, how they relate to each other, if they can be merged and which messages are valid tips. All of this information can be retrieved in a fast and efficient way without having to walk the Tangle.
Ultimately, for a set of competing realities, only one reality can survive. It is then up to the consensus protocol to determine which branch is part of the eventually accepted reality.
In total the ledger state thus involves three different layers:
- the UTXO DAG,
- its extension to the corresponding branches and the branch DAG,
- the Tangle, which maps the parent relations between messages and thus also transactions.
5.2.4 The UTXO DAG
The UTXO DAG models the relationship between transactions, by tracking which outputs have been spent by what transaction, see also the section on UTXO. Since outputs can only be spent once, we use this property to detect double spends.
We allow for different versions of the ledger to coexist temporarily. This is enabled by extending the UTXO DAG by the introduction of branches (see the following section). We can then determine which conflicting versions of the ledger state exist in the presence of conflicts. Thus, we allow for different versions of the ledger to coexist temporarily.
5.2.4.1 Conflict Sets and Detection of Double Spends
For every output we maintain a list of consumers consumerList
, and where the consumers have the unique identifier consumerID
. For a given output this list keeps track of which transactions have spent that particular output. For every spending transaction we add an element with consumerID=transactionID
.
Outputs without consumers are considered to be unspent outputs. Transactions that consume an output that have more than one consumer are considered to be double spends.
When there are more than one consumer in the consumer list we shall create a conflict set list conflictSet
, whose elements have a unique identifier conflictID
each. The conflictSet
is uniquely identified by the unique identifier conflictSetID
. Since the outputID
is directly and uniquely linked to the conflict set, we set conflictSetID=outputID
. For every transaction that shall be added to the conflict set we add an element with conflictID=transactionID
.
5.2.5 Branches
The UTXO model and the concept of solidification, see section 4.4 Solidification, makes all non-conflicting transactions converge to the same ledger state no matter in which order the transactions are received. Messages containing these transactions could always reference each other in the Tangle without limitations.
However, every double spend creates a new possible version of the ledger state that will no longer converge. Whenever a double spend is detected (see the previous section), we track the outputs created by the conflicting transactions and all the transactions that spend these outputs, by creating a container for them in the ledger which we call a branch.
More specifically a container branch
shall be created for each transaction that double spends one or several outputs, or if messages aggregate those branches.
Every transaction that spends directly or indirectly from a transaction that created a branch, i.e. double spent funds, is also contained in this branch
or one of its child branches.
Note that a branch that was created by a transaction that spends multiple outputs can be part of multiple conflict sets.
In other words, a branch is a downward closed, conflict free collection of conflicts.
Every branch shall be identified by the unique identifier branchID
. We consider two kinds of branches: conflict branches and aggregated branches, which are explained in the following sections.
5.2.5.1 Conflict Branches
A conflict branch is created by a corresponding double spend transaction. Since the transaction identifier is unique, we choose the transaction id transactionID
of the double spending transaction as the branchID
.
Outputs inside a branch can be double spent again, recursively forming sub-branches.
On solidification of a message, we shall store the corresponding branchID
together with every output, as well as the transaction metadata to enable instant lookups of this information. Thus, on solidification, a transaction can be immediately associated with a branch.
5.2.5.2 Aggregated Branches
A transaction that does not create a double spend inherits the branches of the input's branches. In the simplest case, where there is only one input branch the transaction inherits that branch. If outputs from multiple non-conflicting branches are spent in the same transaction, then the transaction and its resulting outputs are part of an aggregated branch. This type of branch is not part of any conflict set. Rather it simply combines the perception that the individual conflict branches associated to the transaction's inputs are the ones that will be accepted by the network.
Furthermore, since a message inherits the branches from its parents, it also can create aggregated branches.
Each aggregated branch shall have a unique identifier branchID
, which is the same type as for conflict branches. Furthermore, the container for an aggregated branch is also of type branch
.
To calculate the unique identifier of a new aggregated branch, we take the identifiers of the branches that were aggregated, sort them lexicographically and hash the concatenated identifiers once:
# AggregatedBranchID returns the identifier for an aggregated branch.
FUNCTION aggregatedBranchID = GetAggregatedBranchID(branchIDs)
sortedBranchIDs = Sort(branchIDs)
RETURN Hash(sortedBranchIDs)
An aggregated branch can't aggregate other aggregated branches. However, it can aggregate the conflict branches that are part of the referenced aggregated branch.
Thus aggregated branches have no further branches as their children and they remain tips in the branch DAG. Furthermore, the sortation of the branchID
s in the function AggregatedBranchID()
ensures that even though messages can attach at different points in the Tangle and aggregate different aggregated branches they are treated as if they are in the same aggregated branch if the referenced conflict branches are the same.
These properties allow for an efficient reduction of a set of branches. In the following we will require the following fields as part of the branch data:
isConflictBranch
is a boolean flat that isTRUE
if the branch is a conflict branch orFALSE
if its an aggregated branch.parentBranches
contains the list of parent conflict branches of the branch, i.e. the conflict branches that are directly referenced by this branch.
Then the following function takes a list of branches (which can be either conflict or aggregated branches) and returns a unique set of conflict branches that these branches represent. This is done by replacing duplicates and extracting the parent conflict branches from aggregated branches.
FUNCTION reducedBranches = ReduceBranches(branches)
FOR branch IN branches
IF branch.isConflictBranch
IF NOT (branch IN reducedBranches)
Append(reducedBranches,branch)
ELSE
FOR parentBranch IN branch.parentBranches
IF NOT (parentBranch IN reducedBranches)
Append(reducedBranches,parentBranch)
RETURN reducedBranches
5.2.5.3 The Branch DAG
A new branch is created for each transaction that is part of a conflict set, or if a transaction aggregates branches.
In the branch DAG, branches constitute the vertices of the DAG. A branch that is created by a transaction that is spending outputs from other branches has edges pointing to those branches. The branch DAG maps the UTXO DAG to a simpler structure that ignores details about relations between transactions inside the branches and instead retains only details about the interrelations of conflicts.
The set of all non-conflicting transactions form the master branch. Thus, at its root the branch DAG has the master branch, which consists of non-conflicting transaction and resolved transactions. From this root of the branch DAG the various branches emerge.
In other words the conflict branches and the aggregated branches appear as the children of the master branch.
5.2.5.4 Detecting Conflicting Branches
Branches are conflicting if they, or any of their ancestors, are part of the same conflict set.
The branch DAG can be used to check if branches are conflicting, by applying an operation called normalization, to a set of input branches.
From this information we can identify messages or transactions that are trying to combine branches belonging to conflicting double spends, and thus introduce an invalid perception of the ledger state.
Since branches represent the ledger state associated with a double spend and sub-branches implicitly share the perception of their parents, we define a function NormalizeBranches()
to normalize a list of branches and that gets rid of all branches that are referenced by other branches in that list. The function returns NULL
if the branches are conflicting and can not be merged.
In order to explain this function in pseudo code we require the following global variables
- seenConflictSets = map[]conflictSetID
- traversedBranches = map[]branch
- parentsToCheck = map[]branch
as well as a function BranchCheck()
that performs certain checks and returns TRUE
when the branch is conflicting with a previously seen branch. However, we note that this is an implementation detail that must not match the implementation.
# reduce list of branches to normalized branches, and return NULL when detecting conflicting branches
FUNCTION normalizedBranches = NormalizeBranches(initialBranches)
IF Len(initialBranches) == 0
RETURN masterBranch
IF Len(initialBranches) == 1
RETURN initialBranches
# check original set of branches
normalizedBranches = ReduceBranches(initialBranches)
FOR branch IN normalizedBranches
BranchCheck(branch)
# check every ancestor
WHILE Len(parentsToCheck) != 0
branch = parentsToCheck[0]
Delete(parentsToCheck,branch) # delete this branch from the list
# remove this ancestor
IF branch IN normalizedBranches
Delete(normalizedBranches,branch)
# if branch check fails, i.e. a conflict set was seen twice, return a null list
IF BranchCheck(branch)
RETURN NULL
RETURN normalizedBranches
The branch check function BranchCheck()
checks if the branch was already traversed, i.e. we have handled this branch already. Then it checks if the branch's conflict set has been already seen, which proofs that the current branch conflicts with an already traversed branch. Lastly it adds new branches to the queue of branches that should be traversed.
FUNCTION isConflicting = BranchCheck(branch)
# abort if branch was traversed already
IF branch IN traversedBranches
RETURN FALSE
ELSE
Append(traversedBranches,branch.ID)
# check if conflict set was seen twice
IF branch.conflictSetID IN seenConflictSets
RETURN TRUE
ELSE
Append(seenConflictSets,branch.conflictSetID)
# queue parents to be checked when traversing ancestors
FOR parentBranch IN branch.parentBranches
IF branch NOT IN parentsToCheck
Append(parentsToCheck,parentBranch)
RETURN FALSE
5.2.5.5 Merging of Branches
A branch gains approval weight when messages from (previously non-attached) nodeID
s attach to messages in the future cone of that branch. Once the approval weight exceeds a certain threshold we consider the branch as confirmed, see also section 6.4 Finality.
However, there are two special cases of branches:
First the branch that is created by the genesis transaction is called master branch and has the identifier masterBranchID
. The masterBranchID
is confirmed on creation and thus it is the "correct" reality by definition.
Once a conflict branch is confirmed, it can be merged into the master branch. Since the approval weight is monotonically increasing for branches from the past to the future, branches are only merged into the master branch.
Second, a branch rejectedBranch
is created that is rejected by definition, and it has the identifier rejectedBranchID
.
Messages that are contained in a rejected branch or in one its child branches are booked into the rejectedBranch
.
5.2.6 Relation to the Tangle
Since messages in the Tangle are dependent on the fate of the messages they approve, we shall create dependencies between payloads, messages, and branches. The branch ID of a message or of a transaction represents all the conflicts upon which that object depends. Specifically, we associate a branch to a payload and to a message in the following way.
- The branch of a non-value payload is always the master branch.
- The branch of a transaction is assigned in one of two ways:
- If the transaction is a conflict, then a new branch is created whose branchID is that transactionID. The transaction gets assigned to this new branch.
- Otherwise, the transaction is assigned to the aggregated branch of all its inputs.
- The branch of a message is the aggregate of
- The branch of its payload
- The branches of each strong parent
- The branches of the payloads of the weak parents.
This assignments captures the essence of weak and strong parents, see 4.3 Tip Selection Specification. Strong arrows pick up the dependencies of the whole past cone, where as the weak arrows only penetrate to the paylaod of the parent, ignoring the history of the parent.
We that a message (resp. transaction ) belongs to a branch if the branch of (resp. ) is in the branch past of . Thus, branches, represent certain coherent sections of the Tangle which are then ordered by inclusion.
After a message is solidified, it and its payload are both assigned to their branch. During this check, the message is flagged as invalid if:
- The payload is a transaction, and the node cannot aggregate branchIDs of the transaction's input into a valid branchID.
- The branchIF of the message cannot be aggregated. If these branchID's cannot be computed, then the message contains a pair of its history, and thus does not support a coherent view of the ledger.