Object Storage
In GoShimmer ObjectStorage
is used as a base data structure for many data collection elements such as conflictStorage
, conflictStorage
, blockStorage
and others.
It can be described by the following characteristics, it:
- is a manual cache which keeps objects in memory as long as consumers are using it
- uses key-value storage type
- provides mutex options for guarding shared variables and preventing changing the object state by multiple goroutines at the same time
- takes care of dynamic creation of different object types depending on the key, and the serialized data it receives through the utility
objectstorage.Factory
- helps with the creation of multiple
ObjectStorage
instances from the same package and automatic configuration.
In order to create an object storage we need to provide the underlying kvstore.KVStore
structure backed by the database.
Database
GoShimmer stores data in the form of an object storage system. The data is stored in one large repository with flat structure. It is a scalable solution that allows for fast data retrieval because of its categorization structure.
Additionally, GoShimmer leaves the possibility to store data only in memory that can be specified with the parameter CfgDatabaseInMemory
value. In-memory storage is purely based on a Go map, package mapdb
from hive.go.
For the persistent storage in a database it uses RocksDB
. It is a fast key-value database that performs well for both reads and writes simultaneously that was chosen due to its low memory consumption.
Both solutions are implemented in the database
package, along with prefix definitions that can be used during the creation of new object storage elements.
The database plugin is responsible for creating a store
instance of the chosen database under the directory specified with CfgDatabaseDir
parameter. It will manage a proper closure of the database upon receiving a shutdown signal. During the start configuration, the database is marked as unhealthy, and it will be marked as healthy on shutdown. Then the garbage collector is run and the database can be closed.
ObjectStorage
Assume we need to store data for some newly created object A
. Then we need to define a new prefix for our package in the database
package, and prefixes for single storage objects. They will be later used during ObjectStorage
creation. A package prefix will be combined with a store specific prefix to create a specific realm.
package example
type Storage struct {
A *generic.ObjectStorage
...
shutdownOnce sync.Once
}
ObjectStorage Factory
To easily create multiple storage objects instances for one package, the most convenient way is to use the factory function.
osFactory := objectstorage.NewFactory(store, database.Prefix)
It needs two parameters:
store
- the key valuekvstore
instancedatabase.Prefix
- a prefix defined in thedatabase
package for our newexample
package. It will be responsible for automatic configuration of the newly providedkvstore
instance.
After defining the storage factory for the group of objects, we can use it to create an *objectstorage.ObjectStorage
instance:
AStorage = osFactory.New(objPrefix, FromObjectStorage)
AStorage = osFactory.New(objPrefix, FromObjectStorage, optionalOptions...)
For the function parameter we should provide:
objPrefix
- mentioned before, we provide the object specific prefix.FromObjectStorage
- a function that allows the dynamic creation of different object types depending on the stored data.optionalOptions
- an optional parameter provided in the form of options array[]objectstorage.Option
. All possible options are defined inobjectstorage.Options
. If we do not specify them during creation, the default values will be used, such as enabled persistence or setting cache time to 0.
StorableObject
StorableObject
is an interface that allows the dynamic creation of different object types depending on the stored data. We need to make sure that all methods required by the interface are implemented to use the object storage factory.
SetModified
- marks the object as modified, which will be written to the disk (if persistence is enabled).IsModified
- returns true if the object is marked as modifiedDelete
- marks the object to be deleted from the persistence layerIsDeleted
- returns true if the object was marked as deletedPersist
- enables or disables persistence for this objectShouldPersist
- returns true if this object is going to be persistedUpdate
- updates the object with the values of another object - requires an explicit implementationObjectStorageKey
- returns the key that is used to store the object in the database - requires an explicit implementationObjectStorageValue
- marshals the object data into a sequence of bytes that are used as the value part in the object storage - requires an explicit implementation
Most of these have their default implementation in objectstorage
library, except from Update
, ObjectStorageKey
, ObjectStorageValue
which need to be provided.
StorableObjectFactory Function
The function ObjectFromObjectStorage
from object storage provides functionality to restore objects from the ObjectStorage
. By convention the implementation of this function usually follows the schema:
ObjectFromObjectStorage
uses ObjectFromBytes
func ObjectFromObjectStorage(key []byte, data []byte) (result StorableObject, err error) {
result, err := ObjectFromBytes(marshalutil.New(data))
...
return
}
ObjectFromBytes
unmarshals the object sequence of bytes with a help of marshalutil
library. The returned consumedBytes
can be used for the testing purposes.
The created marshalUtil
instance stores the stream of bytes and keeps track of what has been already read (readOffset
).
func ObjectFromBytes(bytes []byte) (object *ObjectType, consumedBytes int, err error) {
marshalUtil := marshalutil.New(bytes)
if object, err = ObjectFromMarshalUtil(marshalUtil); err != nil {
...
consumedBytes = marshalUtil.ReadOffset()
return
}
The key logic is implemented in ObjectFromMarshalUtil
that takes the marshaled object and transforms it into the object of specified type.
Because the data is stored in a sequence of bytes, it has no information about the form of an object and any data types it had before writing to the database.
Thus, we need to serialize any data into a stream of bytes in order to write it (marshaling), and deserialize the stream of bytes back into correct data structures when reading it (unmarshaling).
Let's consider as an example, unmarshaling of the Child
object.
type Child struct {
childType ChildType // 8 bytes
referencedBlockID BlockID // 32 bytes
childBlockID BlockID // 32 bytes
}
The order in which we read bytes has to reflect the order in which it was written down during marshaling. As in the example, the order: referencedBlockID
, childType
, childBlockID
is the same in both marshalling and unmarshalling.
// Unmarshalling
func ChildFromMarshalUtil(marshalUtil *marshalutil.MarshalUtil) (result *Child) {
result = &Child{}
result.referencedBlockID = BlockIDFromMarshalUtil(marshalUtil)
result.childType = ChildTypeFromMarshalUtil(marshalUtil)
result.childBlockID = BlockIDFromMarshalUtil(marshalUtil)
return
}
// Marshalling
func (a *Child) ObjectStorageChild() []byte {
return marshalutil.New().
Write(a.referencedBlockID).
Write(a.childType).
Write(a.childBlockID).
Bytes()
}
We continue to decompose our object into smaller pieces with help of MarshalUtil
struct that keeps track of bytes, and a read offset.
Then we use marshalutil
build in methods on the appropriate parts of the byte stream with its length defined by the data
type of the struct field. This way, we are able to parse bytes to the correct Go data structure.
ObjectStorage Methods
After defining marshalling and unmarshalling mechanism forobjectStorage
bytes conversion,
we can start using it for its sole purpose, to actually store and read the particular parts of the project elements.
Load
allows retrieving the corresponding object based on the provided id. For example, the method on the blockobjectStorage
is getting the cached object.To convert an object retrieved in the form of a cache to its own corresponding type, we can use
Unwrap
. In the code below it will return the block wrapped by the cached object.Exists
- checks weather the object has been deleted. If so it is released from memory with theRelease
method.func (s *Storage) Block(blockID BlockID) *CachedBlock {
return &CachedBlock{CachedObject: s.blockStorage.Load(blockID[:])}
}
cachedBlock := blocklayer.Tangle().Storage.Block(blkID)
if !cachedBlock.Exists() {
blkObject.Release()
}
block := cachedBlock.Unwrap()Consume
will be useful when we want to apply a function on the cached object.Consume
unwraps theCachedObject
and passes a type-casted version to the consumer function. Right after the object is consumed and when the callback is finished, the object is released.cachedBlock.Consume(func(block *tangle.Block) {
doSomething(block)
})ForEach
- allows to apply aConsumer
function for every object residing within the cache and the underlying persistence layer. For example, this is how we can count the number of blocks.blockCount := 0
blockStorage.ForEach(func(key []byte, cachedObject generic.CachedObject) bool {
cachedObject.Consume(func(object generic.StorableObject) {
blockCount++
})
}Store
- storing an object in the objectStorage. An extended version is methodStoreIfAbsent
that stores an object only if it was not stored before and returns boolean indication if the object was stored.ComputeIfAbsent
works similarly but does not access the value log.cachedBlock := blockStorage.Store(newBlock)
cachedBlock, stored := blockStorage.StoreIfAbsent(newBlock)
cachedBlock := blockStorage.ComputeIfAbsent(newBlock, remappingFunction)