4.7 Markers
4.7.1 Introduction
This section specifies the requirements for the marker tool. A tool as defined here is a feature that adds functionality to the node software but is not an essential component. The marker tool improves the efficiency with which certain properties can be checked or certain metrics calculated.
The potential issues addressed by the use of the marker tool are in handling potentially numerically expensive operations. Specifically, the following operations can become numerically expensive:
- Future- or past cone inclusion. For certain applications it is necessary to know whether a certain message is in the past or future cone of another message. In the default approach the Tangle has to be walked until a given message is found.
- Approval weight. In order to compute the approval weight of a given message the node software needs to traverse the Tangle from that message to the tips and sum up the active consensus Mana of all the messages in its future cone, see also the section on approval weight.
The marker tool allows a node to efficiently determine whether certain markers are in the past or future cone of a given message, by reducing the proportion of the Tangle that needs to be traversed.
The marker tool achieves this by defining a parallel internal data structure, consisting of additional metadata applied to specific messages in the Tangle. Specifically, the marker tool "marks" certain messages, which form a subDAG which approximates the topological structure of the Tangle. Furthermore, the markers are grouped into sequences (which themselves form yet another DAG), which allow the node to quickly determine which markers reference each other.
Note, that we shall require that markers are assigned when booking a message. Thus, for that part of the message DAG that is already booked the corresponding marker DAG does not change anymore.
The Markers tool specification depends on:
4.7.2 Definitions
The following terms are defined in relation to markers:
- UTXO branch: This is a set of outputs that spawn off from a conflict transaction. Each UTXO branch by itself is conflict free. See also Section 5.1 - UTXO and Section 5.2 - Ledger State for a more complete discussion on UTXO and its branches.
- Aggregated branch: The aggregation of a combination of several branches.
- Branch identifier (
BID
): The unique identifier of a branch or aggregated branch. - Main branch: The part of the UTXO DAG, in which all outputs are considered to be good in the sense that all conflicts in their past have been resolved, either by a given conflict being accepted or rejected.
- Rank: The length of the longest directed path in DAG terminating in a given vertex/object. Specifically, if a vertex directly references only and then .
- Marker: A message that is assigned additional properties locally on the node, and that tracks a particular UTXO branch.
- Marker identifier (
MID
): The unique identifier of the marker. - Marker DAG: The collection of all markers.
- Marker rank (
MR
): The rank of a marker in the marker DAG. - Marker-sequence: A marker-sequence is a group of markers. Each marker-sequence maps to a UTXO branch; see Section 5.2 - Ledger State.
- Marker-sequence identifier (
SID
): A marker-sequence identifier is a number that uniquely identifies a marker-sequence. - Marker-sequence rank (
SR
): The rank of a marker-sequence in the marker-sequence DAG. - Future marker (
FM
): This field in the message metadata is (potentially) updated when a new marker is generated in the future cone of the message, following the rules defined in Section "Message Metadata". Essentially it contains the list of markers for which there is no other marker between the marker in question and the message, or in more mathematical terms, the minimal markers in the future cone. - Past marker (
PM
): A past marker of a message is a most recent past marker of the parents (with respect toMR
). The past marker of a marker is set to itself.
4.7.3 The Markers
A marker consists of the following data:
Variable | Type | Description |
---|---|---|
MID | uint64 | Unique identifier of the marker |
SID | uint64 | Unique identifier of the marker-sequence |
MR | uint64 | Marker rank |
Table 4.7.1: Marker data.
A new marker shall be created by the marker tool when any of the following conditions are met:
- a new UTXO branch is created and the message that would get a marker assigned is not yet booked. This also creates a new marker-sequence.
- more than a certain number of messages (
maxMsgPerMarker
) have been received since the last marker. This rule must be applied per marker-sequence. I.e. for each marker-sequence with more thanmaxMsgPerMarker
since the last marker in that marker-sequence, the rule shall be applied independently. - a certain time window (
maxTimePerMarker
) has passed since the last marker.
A marker is created with a MID
, an this MID
must be unique.
To set a new marker within a marker-sequence, the marker tool randomly selects from strong tips set a message whose past marker is the last marker in the sequence. The next marker will then reference that transaction. If there is no strong tip with the appropriate past marker, the selection shall be from message in the weak tips set. The rank of the new marker should be one greater than the rank of all the past markers of the message.
Since , marker ranks are monotonically non-decreasing such that , where is the future cone of .
4.7.4 The Marker-sequence
Marker-sequences are used to track the UTXO DAG branches. Each branch corresponds to a marker-sequence with a unique SID
, and the marker-sequences form a DAG.
4.7.4.1 Marker-sequence data
Each marker-sequence is associated with the following data:
Variable | Type | Description |
---|---|---|
SID | unit64 | The marker-sequence identifier |
SR | unit64 | The rank of a marker-sequence in the marker-sequence DAG |
MRMax | unit64 | The highest MR in the marker-sequence |
MRMin | unit64 | The lowest MR in the marker-sequence |
ParentReferences | map[Marker ] Marker | Relationship map from parent marker-sequences to markers (*) |
Table 4.7.2: Data associated to a marker-sequence.
*The field ParentReferences
models the relationship between marker-sequences. This maps which marker in this marker-sequence references which other markers from other marker-sequences.
Whenever a new marker is added that is a member of a given marker-sequence, MR_max
and ParentReferences
for that marker-sequence shall be updated.
4.7.4.2 Creation of Marker-sequences
A new marker-sequence shall be created when:
- there's a transaction that creates a new conflict, i.e. creates a new UTXO branch.
- the UTXO branches are aggregated.
- UTXO branches are merged.
Each new marker-sequence shall start with a new marker. Hence, with the creation of a new marker-sequence also a new marker must be assigned to the message that caused one of the three above events.
Whenever a new marker-sequence is created, the marker tool shall assign:
- a new
SID
, created by the rule . A new createdSID
must be unique. - a new .
To prevent assigning a new
SID
when combining the same marker-sequences at different times, the marker tool shall build parent-child relationships in a map whenever a new marker-sequence is created.
For further details about the UTXO model, please refer to the section on UTXO.
4.7.5 Message Metadata
For each message in the Tangle, the marker tool shall maintain metadata that provides information about the markers that are closest in the past or future cone of that message, as well as whether the message itself is a marker and what rank the message has. The following message metadata shall be defined in the marker tool to support that requirement:
Variable | Type | Description |
---|---|---|
IsMarker | bool | A flag to indicate whether a message is a marker. |
PastMarkers | map[SID ]MID | A list of the closest markers from different marker-sequences in the past cone of the message. |
FutureMarkers | map[SID ]MID | A list of the closest markers from different marker-sequences in the future cone of the message. |
MarkerBranchID | BID | The branch ID to which the marker is mapped, or nil if the message is no marker. |
PayloadBranchID | BID | The branch ID to which the Payload is mapped in case it is a conflict, or nil otherwise. |
IndividualBranchID | BID | The branch ID if there is need for mapping the message individually to a branch ID, or nil otherwise. |
Table 4.7.3: Markers metadata.
The PastMarkers
field contains
- only the marker identifier of itself, if the message is marked as a marker.
- the marker identifier of its closest past markers (PMs), i.e. from each referenced marker-sequence only the markers with the highest marker rank (
MR
). Markers which are referenced by other markers in this list shall be removed.
The FutureMarkers
list shall be empty at the start and shall be updated when a new marker directly or indirectly references that list.
The propagation of a FM to its past cone (i.e. the update of the FutureMarkers
list in the encountered messages) shall not continue beyond a message if:
FutureMarkers
of a message includes a previous marker of the same marker-sequence; the message that includes such a marker shall not get updated.- the message is the marker in a different marker-sequence. Then the
FutureMarkers
shall be updated for that marker only.
Through this approach past and future markers do not cross weak parents. It also prevents the lists from growing unboundedly.
The fields MarkerBranchID
, PayloadBranchID
and IndividualBranchID
allow for making connections between the marker DAG, the message DAG and the UTXO branch DAG. When a new Sequence is created the MarkerBranchID
is set to the branch that creates the sequence.
4.7.5.1 Update of Already Booked Messages on Double Spends
If a transaction arrives that double spends an already booked transaction, a new marker-sequence shall be created for the newly arrived message (containing the transaction), see Section Creation of marker-sequences.
For the already booked conflicting transaction no new marker or marker Sequence shall be created. This is because the marker DAG and Sequence DAG shall not be changed post-booking a message. However a new UTXO branch is created.
First, assume the existing booked transaction is a Marker itself. Then the marker gets mapped onto the new branch by updating the field MarkerBranchID
in the message metadata. Furthermore, the PayloadBranchID
is updated to the new branch. For all FM in the same sequence the MarkerBranchID
gets updated to the new branch. Furthermore, for every sequence that directly or indirectly references the sequence in which the double-spend occurs, the first marker is remapped to the new branch as well.
Second, assume the existing transaction is not a marker. Then all messages between the transaction and the following future markers (including the transaction itself) get mapped individually to the new branch mapping using the field IndividualBranchID
. From the future markers onwards, the same applies as in the first scenario.
For an example implementation of these scenarios also visit the example here.
4.7.6 Marker Application Description
Figure 1 shows an example of how the markers and marker-sequences (here also called Sequence) would look in the Tangle from the perspective of the Message DAG, the marker DAG and the marker-sequence DAG. The purple colored messages are markers:
Image 4.7.4: Markers and marker-sequences in the Tangle.
4.7.6.1 Example Implementation
An illustrative example of the markers tool in action is provided here for the prototype implementation.
4.7.6.2 Approval Weight Approximation
To approximate the approval weight of a message, the markers tool retrieves the approval weight of FutureMarkers
. Since a given message is in the past cone of its FMs, the approval weight and thus the finality of the message will be at least the same as the maximum weight of its FMs. This gives a lower bound (which is the “safe” bound), and if the markers are set frequently enough, this provides a good approximation of that bound.
4.7.6.3 Past Cone Check
By comparing the PastMarkers
of a message with the FutureMarkers
of another message, the markers tool can determine if that message is in the past cone of the other. For example, consider two messages X
and Y
that are members in the same marker-sequence. Then if PM(X)>FM(Y)
, then X
is in the future of Y
.
One way in which this check can be carried out is by traversing the marker DAG while remaining in the bounds of the marker ranks.
A potential optimization is that the marker-sequence DAG can be traversed while considering the marker-sequence ranks, prior to any traversal of the marker DAG.
It is possible that the marker DAG does not cover certain areas of the message DAG at a given point in time. In this case, a check on this question can return one of the following three values:
TRUE
FALSE
N/A
If the check returns a N/A
, then the Message DAG must be searched via a search algorithm.
For an example implementation of the algorithm for the past cone check visit GoShimmer markers.