Title: Three Examples of Data Storage and Integrity Proof in Distributed Systems
Introduction
Since their inception, blockchain and cryptocurrency projects have aimed to change the financial landscape by broadening access and eliminating intermediaries. The development of web3 has expanded the applications of blockchain technology, highlighting its potential for building a thriving internet in which creators and users control their own data.
Preserving decentralization while empowering end users requires resilient, censorship-resistant data storage. Distributed data storage systems meet this demand for fault tolerance and resilience by building a network of nodes that store, manage, and share data.
Tagion
Architecture
Tagion is a decentralized network built for high transaction throughput, aiming to establish a unique currency system based on technology and democratic governance. It relies on an innovative database architecture and cryptography to achieve scalability. Tagion’s proof mechanism is an example of a deterministic proof.
The core function of Tagion’s DART (Distributed Archive of Random Transactions) database is to store data under hash keys. As more information is stored, the structure naturally grows more branches, each of which holds up to 256 entries, where an entry is either a record or a sub-branch.
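The branching behaviour described above can be illustrated with a small hash-keyed tree. This is only a sketch under stated assumptions, not Tagion's implementation: it assumes SHA-256 as the hash, uses one byte of the key per level (which gives the 256 slots per branch), and the names `Branch`, `fingerprint`, `insert`, and `lookup` are hypothetical.

```python
import hashlib


class Branch:
    """One node of the tree; it has at most 256 slots, one per byte value."""

    def __init__(self):
        # Maps a byte value (0-255) to either a Branch or a (key, record) pair.
        self.slots = {}


def fingerprint(record: bytes) -> bytes:
    # Assumption: SHA-256 stands in for whatever hash the real system uses.
    return hashlib.sha256(record).digest()


def insert(root: Branch, record: bytes) -> bytes:
    """Store a record under its hash key and return that key."""
    key = fingerprint(record)
    node, depth = root, 0
    while True:
        index = key[depth]
        slot = node.slots.get(index)
        if slot is None:
            node.slots[index] = (key, record)
            return key
        if isinstance(slot, Branch):
            node, depth = slot, depth + 1
            continue
        existing_key, existing_record = slot
        if existing_key == key:
            return key  # identical record is already stored
        # Two keys share this byte: grow a sub-branch and push the old record down.
        child = Branch()
        child.slots[existing_key[depth + 1]] = (existing_key, existing_record)
        node.slots[index] = child
        node, depth = child, depth + 1


def lookup(root: Branch, key: bytes):
    """Return the record stored under key, or None if it is absent."""
    node, depth = root, 0
    while True:
        slot = node.slots.get(key[depth])
        if slot is None:
            return None
        if isinstance(slot, Branch):
            node, depth = slot, depth + 1
            continue
        return slot[1] if slot[0] == key else None
```

New sub-branches appear only where keys collide on a prefix, so the tree deepens gradually as the amount of stored data grows.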
Besides acting as a distributed hash table, Tagion’s infrastructure can be understood as a Sparse Merkle Tree (SMT). An SMT is an authenticated data structure over key-value pairs that supports standard database operations such as lookup, insertion, update, and deletion.
Tagion’s system uses a root hash derived from all sub-branch hashes to verify the state of the data quickly and with minimal computation. To further increase processing capacity, the system can create sub-DARTs for specific ecosystems, similar to shards in a sharded blockchain.
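To make the role of the root hash concrete, here is a minimal sketch of how a hash over sub-branch hashes lets two nodes compare state cheaply. It is an illustration only: the nested-dict layout, the SHA-256 hash, the domain-separation prefixes, and the names `branch_hash` and `differing_slots` are all assumptions, not Tagion's actual format.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def branch_hash(node) -> bytes:
    """Hash a node: bytes is a record, dict[int, node] is a branch."""
    if isinstance(node, bytes):
        return h(b"\x00" + node)  # prefix marks a leaf record
    # A branch hash commits to every occupied slot index and its child hash.
    parts = [bytes([index]) + branch_hash(node[index]) for index in sorted(node)]
    return h(b"\x01" + b"".join(parts))  # prefix marks a branch


def differing_slots(a: dict, b: dict) -> list:
    """Return the slot indices whose sub-branch hashes disagree."""
    return [
        i
        for i in sorted(set(a) | set(b))
        if i not in a or i not in b or branch_hash(a[i]) != branch_hash(b[i])
    ]


tree = {0x1F: b"alice:10", 0xA2: {0x03: b"bob:5", 0x77: b"carol:7"}}
replica = {0x1F: b"alice:10", 0xA2: {0x03: b"bob:5", 0x77: b"carol:8"}}

print(branch_hash(tree) == branch_hash(replica))  # False: one record differs
print(differing_slots(tree, replica))  # [162], i.e. only slot 0xA2 needs a closer look
```

If the two root hashes match, no further work is needed; if they differ, only the sub-branches with mismatching hashes have to be inspected.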
Using DART creates a stateless system, eliminating the need to maintain a complete history of state transitions. This means data can be deleted, which reduces overall storage requirements and may increase decentralization, since lighter nodes are cheaper to run.
Tagion also adopts HiBON (Hash-Invariant Binary Object Notation), a format in which the same content always produces the same hash, so that data can be retrieved by its associated hash. Through these mechanisms, Tagion securely stores data and efficiently verifies its integrity across the network.
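The idea of hash invariance can be sketched as follows. This is not HiBON, which is a binary format; as a loudly labeled stand-in, the example uses JSON with sorted keys and fixed separators to get a canonical byte encoding. The helper names `canonical_bytes` and `fingerprint` are hypothetical.

```python
import hashlib
import json


def canonical_bytes(document: dict) -> bytes:
    # Stand-in for a hash-invariant encoding: keys are sorted and whitespace
    # is fixed, so equal content always serializes to the same bytes.
    return json.dumps(document, sort_keys=True, separators=(",", ":")).encode("utf-8")


def fingerprint(document: dict) -> str:
    return hashlib.sha256(canonical_bytes(document)).hexdigest()


a = {"owner": "alice", "amount": 10}
b = {"amount": 10, "owner": "alice"}  # same content, different key order

print(fingerprint(a) == fingerprint(b))  # True
```

Because the encoding does not depend on how the document was built, the hash can serve as a stable key for storing and retrieving it.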
Data Integrity
All subsystems of Tagion undergo a so-called random walk, which checks whether data is stored and served as required. Nodes that fail the verification challenge are excluded from the network.
Filecoin
Filecoin is a decentralized storage network that incentivizes miners to provide storage capacity through its native token, FIL. To earn rewards, miners must generate proofs that verify the storage they provide.
Filecoin’s basic storage unit is the sector, which has a standard size and a lifespan that providers can extend. User data can be encrypted before it is stored so that miners cannot read file contents, and multiple copies can be distributed across the network for redundancy.
To ensure data integrity and availability, Filecoin relies on two kinds of proof: storage proofs and replication proofs.
Storage Proof
Miners in Filecoin generate proofs to verify that they hold data copies at any given time. The system challenges miners, and only those with the data can correctly respond.
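The challenge-response idea can be sketched in a few lines. This is a deliberately simplified illustration, not Filecoin's protocol, which relies on succinct cryptographic proofs over sealed sectors. Here the client is assumed to precompute a handful of challenges before handing the data over; the names `respond`, `prepare_challenges`, and `verify` are hypothetical.

```python
import hashlib
import secrets


def respond(nonce: bytes, data: bytes) -> bytes:
    """What a storage provider computes; it needs the full data to do so."""
    return hashlib.sha256(nonce + data).digest()


def prepare_challenges(data: bytes, count: int) -> list:
    """Client side: precompute (nonce, expected answer) pairs, then discard the data."""
    challenges = []
    for _ in range(count):
        nonce = secrets.token_bytes(16)
        challenges.append((nonce, respond(nonce, data)))
    return challenges


def verify(challenge: tuple, answer: bytes) -> bool:
    nonce, expected = challenge
    return secrets.compare_digest(expected, answer)


data = b"some file contents" * 100
challenges = prepare_challenges(data, count=3)

nonce, _ = challenges[0]
honest_answer = respond(nonce, data)          # provider still holds the data
cheating_answer = respond(nonce, data[:-1])   # provider lost part of the data

print(verify(challenges[0], honest_answer))    # True
print(verify(challenges[0], cheating_answer))  # False
```

Since the nonce is unpredictable, a provider cannot answer from a cached result and has to read the stored data each time it is challenged.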
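Filecoin introduces Proof-of-Spacetime (PoSt) to ensure continuous storage and data availability. PoSt comes in two forms: WinningPoSt, which a miner submits when it is selected to produce a block, and WindowPoSt, which miners submit at regular intervals to show that they are still maintaining the stored data.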
Replication Proof
Sealing is part of the Proof-of-Replication (PoRep) algorithm. Because sealing is computationally expensive, miners have an incentive to seal data once and keep the sealed copy rather than regenerate it on demand. This proof ensures that miners create and store unique copies of data on their physical hardware.
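The notion of a unique, miner-specific copy can be illustrated with a toy encoding. This is only a sketch: real sealing is a slow, sequential encoding accompanied by succinct proofs, whereas this example merely XORs the data with a keystream derived from a miner and sector identifier. The function names and the identifier format are assumptions made for illustration.

```python
import hashlib


def keystream(seed: bytes, length: int) -> bytes:
    """Expand a seed into `length` pseudo-random bytes using SHA-256 in counter mode."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def seal(data: bytes, miner_id: bytes, sector_id: bytes) -> bytes:
    """Encode data into a replica that is specific to one miner and sector."""
    stream = keystream(miner_id + b"|" + sector_id, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def unseal(replica: bytes, miner_id: bytes, sector_id: bytes) -> bytes:
    # XOR is its own inverse, so unsealing applies the same keystream again.
    return seal(replica, miner_id, sector_id)


data = b"the same client file" * 10
replica_a = seal(data, b"miner-a", b"sector-1")
replica_b = seal(data, b"miner-b", b"sector-1")

print(replica_a != replica_b)                               # True
print(unseal(replica_a, b"miner-a", b"sector-1") == data)   # True
```

Two miners storing the same file therefore hold different bytes on disk, so one physical copy cannot be passed off as two independent replicas.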
Celestia
Celestia is a data availability blockchain that provides consensus and data availability for modular blockchains, allowing them to outsource these core functions while handling execution themselves.
To make verification cheap, Celestia adopts data availability sampling (DAS). Instead of downloading a whole block, a light node repeatedly requests small, randomly chosen parts of the block data until a predetermined confidence level is reached. If every sampled part is available, this is treated as a probabilistic proof that the full data is available.
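The relationship between the number of samples and the confidence level can be worked out with a short calculation. The sketch below rests on assumptions that the text above does not state: block data is erasure coded so that an attacker has to withhold roughly a quarter of the shares to make a block unrecoverable, and samples are treated as independent. The function names are hypothetical.

```python
import math


def availability_confidence(samples: int, withheld_fraction: float = 0.25) -> float:
    """Probability that at least one of `samples` random requests hits a withheld share.

    Assumption: the attacker must withhold at least `withheld_fraction` of all shares.
    """
    return 1.0 - (1.0 - withheld_fraction) ** samples


def samples_needed(target: float, withheld_fraction: float = 0.25) -> int:
    """Smallest number of successful samples that reaches the target confidence."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - withheld_fraction))


for target in (0.90, 0.99, 0.999):
    n = samples_needed(target)
    print(f"target {target}: {n} samples, confidence {availability_confidence(n):.4f}")
```

Under these assumptions the required number of samples grows only logarithmically as the target confidence approaches 1, which is why light nodes can stay light.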
Conclusion
In summary, distributed networks employ a variety of methods for storing data and verifying its availability and integrity. Tagion, Filecoin, and Celestia each take a distinct approach to ensuring data integrity, availability, and accessibility, and each contributes to building the resilient data publication and storage systems that decentralized networks depend on.