The Evolution of Decentralized Storage: From Idealism to Realism
Storage has long been one of the hot narratives in the blockchain industry. Filecoin, the leading project of the last bull market, once had a market capitalization exceeding $10 billion. Arweave, whose selling point is permanent storage, peaked at a market capitalization of $3.5 billion. However, as the usefulness of cold-data storage has come into question, so has the necessity of permanent storage. Whether decentralized storage can truly take root remains an open question.
The emergence of Walrus has brought a breath of fresh air to the long-dormant storage track, and the recently launched Shelby project from Aptos and Jump Crypto aims to push decentralized storage for hot data to a new level. So, can decentralized storage make a comeback and serve a wider range of application scenarios, or is it merely another round of speculation? This article traces the evolution of the decentralized storage narrative through the development trajectories of four projects: Filecoin, Arweave, Walrus, and Shelby, and explores whether decentralized storage can achieve widespread adoption.
Filecoin: Storage is just a facade, mining is the essence.
Filecoin is one of the earliest cryptocurrency projects, and its direction naturally revolves around decentralization. This is a common trait of early crypto projects: seeking out a role for decentralization in one traditional field after another. Filecoin is no exception. It ties storage to decentralization and points to the drawback of centralized data storage services, namely the trust that must be placed in centralized storage providers. Filecoin's goal is therefore to turn centralized storage into decentralized storage. However, the compromises made to achieve decentralization became the very pain points that later projects such as Arweave and Walrus tried to address. To understand why Filecoin is essentially just a mining coin, one must understand the objective limitations of its underlying technology, IPFS, which is ill-suited to hot data.
IPFS: a decentralized architecture limited by transmission bottlenecks
IPFS (InterPlanetary File System) was launched around 2015 with the aim of disrupting the traditional HTTP protocol through content addressing. Its biggest drawback is extremely slow retrieval: in an era when traditional data service providers deliver millisecond-level responses, IPFS can still take several seconds to retrieve a file. This makes it hard to promote in practical applications and explains why, apart from a handful of blockchain projects, it is rarely adopted by traditional industries.
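To make the idea of content addressing concrete, here is a minimal sketch (illustrative only; real IPFS adds multihash/multibase encoding, chunking, and a DHT lookup on top of this): the identifier of a piece of data is derived from its hash, so any node holding the bytes can serve them, and the client can verify integrity without trusting the server.

```python
import hashlib

# A toy content-addressed store: keys are derived from the content itself.
store = {}

def put(content: bytes) -> str:
    cid = hashlib.sha256(content).hexdigest()   # address = hash of the content
    store[cid] = content
    return cid

def get(cid: str) -> bytes:
    content = store[cid]
    # Self-verifying: the client recomputes the hash and checks it against the
    # address, so it does not need to trust whichever node served the bytes.
    assert hashlib.sha256(content).hexdigest() == cid
    return content

cid = put(b"hello, decentralized storage")
print(cid)
print(get(cid))
```

The bottleneck discussed above is not the addressing itself but the retrieval: locating which peers in a global P2P network actually hold a given hash is what introduces the multi-second latency.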
The underlying P2P protocol of IPFS is mainly suitable for "cold data", which refers to static content that does not change often, such as videos, images, and documents. However, when it comes to handling hot data, such as dynamic web pages, online games, or artificial intelligence applications, the P2P protocol does not have a significant advantage over traditional CDNs.
Although IPFS itself is not a blockchain, its directed acyclic graph (DAG) design is highly compatible with many public chains and Web3 protocols, making it well suited as a foundational building block for blockchains. So even with limited practical value, IPFS is enough of a foundation to carry a blockchain narrative; early altcoin projects only needed a workable framework to set off on an ambitious journey. Once Filecoin reached a certain stage of development, however, the inherent flaws inherited from IPFS began to hinder its further growth.
Mining-coin logic beneath the storage facade
IPFS was designed so that users storing data would also become part of the storage network. But without economic incentives, it is hard to get users to use the system voluntarily, let alone become active storage nodes. In practice, most users only store files on IPFS; they do not contribute their own storage space or hold other people's files. It was against this backdrop that Filecoin was born.
Filecoin's token economic model has three main roles: users pay fees to store data; storage miners earn token incentives for storing user data; and retrieval miners deliver data when users request it and earn incentives in return.
This model leaves room for abuse. Storage miners can fill their committed space with garbage data purely to collect rewards. Because garbage data is never retrieved, losing it never triggers the penalty mechanism, so miners can delete it and repeat the process. Filecoin's proof-of-replication consensus only verifies that user data has not been privately deleted; it cannot prevent miners from filling space with garbage data.
Filecoin's operation therefore relies largely on miners' continuous investment in the token economy rather than on real end-user demand for distributed storage. Although the project is still iterating, at this stage Filecoin's ecosystem fits the logic of a mining coin more than the definition of an application-driven storage project.
Arweave: Succeeded by Long-Termism, Defeated by Long-Termism
If Filecoin's design goal was to build an incentivized, verifiable, decentralized "data cloud", then Arweave goes to the opposite extreme in storage: providing the capability to store data permanently. Arweave does not attempt to build a distributed computing platform; its entire system revolves around one core assumption: important data should be stored once and remain on the network forever. This extreme long-termism makes Arweave fundamentally different from Filecoin in mechanism, incentive model, hardware requirements, and narrative.
Arweave takes Bitcoin as its model, attempting to continuously optimize its permanent storage network over cycles measured in years. Arweave does not care about marketing, nor about competitors or market trends; it simply keeps iterating on its network architecture, indifferent even if no one pays attention, because that is the essence of the Arweave development team: long-termism. Thanks to long-termism, Arweave was warmly embraced during the last bull market; and because of long-termism, even after falling to the bottom, Arweave may yet survive several more rounds of bull and bear markets. The only question is whether the future of decentralized storage holds a place for Arweave; the value of permanent storage can only be proven by time.
From version 1.5 of the Arweave mainnet to the recent 2.9 release, Arweave has lost the market's attention but has kept working to let a broader range of miners join the network at minimal cost and to incentivize miners to store as much data as possible, continuously strengthening the robustness of the whole network. Arweave knows full well that it does not match market preferences, so it has taken a conservative path: it has not courted the miner community, its ecosystem has largely stagnated, and it upgrades the mainnet at minimal cost while continuously lowering hardware thresholds without compromising network security.
Review of the upgrade path from 1.5 to 2.9
Arweave version 1.5 exposed a vulnerability: miners could rely on stacking GPUs rather than on actual storage to improve their odds of producing blocks. To curb this trend, version 1.7 introduced the RandomX algorithm, restricting the use of specialized hardware and requiring general-purpose CPUs to participate in mining, thereby weakening the centralization of computing power.
In version 2.0, Arweave adopted SPoA (Succinct Proofs of Access), which condenses data proofs into a concise Merkle-tree path, and introduced format-2 transactions to reduce the synchronization burden. This architecture relieves network bandwidth pressure and significantly improves node collaboration. Even so, some miners could still evade the responsibility of actually holding data by relying on centralized high-speed storage pools.
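The core idea of a succinct Merkle-path proof can be illustrated with a generic sketch (a standard technique, not Arweave's exact SPoA wire format; the chunk names below are made up): the prover sends one leaf plus the sibling hashes along its path, and the verifier recomputes the root instead of downloading the whole dataset.

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def build_tree(leaves):
    """Return all levels of the Merkle tree, leaf hashes first, root last."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:                      # duplicate the last node if odd
            cur = cur + [cur[-1]]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels, index):
    """Collect the sibling hash at every level for the leaf at `index`."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        proof.append((index % 2, level[index ^ 1]))   # (am I the right child?, sibling)
        index //= 2
    return proof

def verify(root, leaf, proof):
    node = h(leaf)
    for is_right, sibling in proof:
        node = h(sibling + node) if is_right else h(node + sibling)
    return node == root

leaves = [f"chunk-{i}".encode() for i in range(8)]
levels = build_tree(leaves)
root = levels[-1][0]
proof = prove(levels, 5)
assert verify(root, b"chunk-5", proof)   # succinct: 3 sibling hashes, not 8 chunks
```

For n chunks the proof contains only about log2(n) hashes, which is what keeps such proofs "succinct" as the dataset grows.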
To correct this deviation, version 2.4 introduced the SPoRA (Succinct Proofs of Random Access) mechanism, adding a global index and slow-hash random access and requiring miners to genuinely hold data blocks in order to produce valid blocks, thereby weakening the effect of hash-power stacking at the mechanism level. As a result, miners began to focus on storage access speed, driving adoption of SSDs and other high-speed read/write devices. Version 2.6 introduced a hash chain to regulate the block production rhythm, balancing the marginal benefit of high-performance hardware and giving small and medium miners fair room to participate.
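Conceptually, the random-access requirement works roughly as follows: each mining attempt must read a chunk whose index is derived unpredictably from the chain state and the nonce, so a miner that does not hold the data locally cannot produce valid candidates at any useful rate. The sketch below is a toy illustration of that idea, not Arweave's actual mining code; every constant and function name in it is hypothetical.

```python
import hashlib
import os

# Toy replica held by the miner: 16 chunks of 256 KiB (hypothetical sizes).
CHUNK_SIZE = 256 * 1024
stored_chunks = [os.urandom(CHUNK_SIZE) for _ in range(16)]

def mining_attempt(prev_block_hash: bytes, nonce: int) -> bytes:
    # The recall index is derived from the previous block and the nonce,
    # so it cannot be predicted or precomputed without holding the data.
    seed = hashlib.sha256(prev_block_hash + nonce.to_bytes(8, "big")).digest()
    index = int.from_bytes(seed, "big") % len(stored_chunks)
    recall_chunk = stored_chunks[index]        # forces a real storage read
    return hashlib.sha256(seed + recall_chunk).digest()

# Toy difficulty: keep trying nonces until the hash starts with a zero byte.
prev = hashlib.sha256(b"genesis").digest()
nonce = 0
while not mining_attempt(prev, nonce).startswith(b"\x00"):
    nonce += 1
print("valid candidate found at nonce", nonce)
```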
Subsequent versions further strengthened network collaboration and storage diversity: 2.7 added coordinated mining and a pool mechanism to improve the competitiveness of small miners; 2.8 introduced a composite packing mechanism that lets large-capacity, low-speed devices participate flexibly; and 2.9 introduced a new packing process in the replica_2_9 format, significantly improving efficiency and reducing computational dependence, completing the closed loop of a data-oriented mining model.
Overall, Arweave's upgrade path clearly reflects its storage-oriented, long-term strategy: it keeps resisting the trend toward computing-power centralization while continuously lowering the participation threshold, so that the protocol can keep running over the long term.
Walrus: Is Embracing Hot Data Hype or a Hidden Treasure?
Walrus's design philosophy is entirely different from Filecoin's and Arweave's. Filecoin's starting point is a decentralized, verifiable storage system, at the cost of being limited to cold data; Arweave's starting point is an on-chain Library of Alexandria that stores data permanently, at the cost of having too few use cases; Walrus's starting point is optimizing the storage cost of a hot-data storage protocol.
A Heavily Modified Erasure Code: Cost Innovation or Old Wine in a New Bottle?
On storage cost, Walrus argues that the storage overhead of Filecoin and Arweave is unreasonable, since both adopt a fully replicated architecture. The main advantage of full replication is that every node holds a complete copy, giving strong fault tolerance and independence between nodes: even if some nodes go offline, the network still keeps the data available. But it also means the system needs multiple redundant copies to remain robust, which drives up storage costs. In Arweave's design especially, the consensus mechanism itself encourages nodes to store redundant data for security. Filecoin is more flexible on cost control, but at the expense of higher data-loss risk for some low-cost storage options. Walrus tries to strike a balance between the two: controlling replication costs while improving availability through structured redundancy, establishing a new trade-off between data availability and cost efficiency.
RedStuff, created by Walrus, is the key technology for reducing node redundancy. It is derived from Reed-Solomon (RS) coding, a classic erasure-correction algorithm. An erasure code adds redundant fragments to a data set so that the original data can be reconstructed even if parts of it are lost. From CD-ROMs to satellite communications to QR codes, it is used constantly in everyday life.
Erasure codes let a user take a block of, say, 1MB and "expand" it to 2MB, where the extra 1MB is special data called the erasure code. If any byte of the block is lost, it can easily be recovered from the code; even if up to 1MB of the block is lost, the entire block can still be recovered. The same technique lets a computer read all the data off a CD-ROM even when the disc is damaged.
The most widely used scheme today is RS coding. The method is to start with k information blocks, construct the corresponding polynomial, and evaluate it at additional x coordinates to obtain the encoded blocks. With RS erasure codes, the probability that random losses leave too few blocks to recover the data is very small.
For example: split a file into 6 data blocks and 4 parity blocks, 10 pieces in total. As long as any 6 of them survive, the original data can be completely restored (a minimal sketch of this scheme follows the pros and cons below).
Advantages: strong fault tolerance; widely used in CD/DVD, fault-tolerant disk arrays (RAID), and cloud storage systems (such as Azure Storage and Facebook's F4).
Disadvantages: decoding is computationally complex and relatively expensive, and it is ill-suited to data that changes frequently. It is therefore usually used for data recovery and scheduling in off-chain, centralized environments.
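The polynomial construction described above can be made concrete with a small, self-contained sketch over the prime field GF(257). It illustrates the k-of-n recovery principle only; it is not the production encoder of Walrus or any other project discussed here, and real implementations work over GF(2^8) for speed.

```python
# Minimal Reed-Solomon-style erasure coding demo over GF(257):
# k data symbols become the coefficients of a degree-(k-1) polynomial,
# which is evaluated at n points; any k of those points recover the data.
P = 257  # prime modulus; every byte value 0..255 fits in GF(257)

def encode(data, n):
    """Evaluate the polynomial with coefficients `data` at x = 1..n."""
    shares = []
    for x in range(1, n + 1):
        y = 0
        for coeff in reversed(data):          # Horner's rule
            y = (y * x + coeff) % P
        shares.append((x, y))
    return shares

def decode(shares, k):
    """Recover the k coefficients from any k shares by Lagrange interpolation."""
    shares = shares[:k]
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(shares):
        basis = [1]                            # numerator polynomial, coefficient form
        denom = 1
        for j, (xj, _) in enumerate(shares):
            if j == i:
                continue
            new = [0] * (len(basis) + 1)       # multiply basis by (x - xj)
            for d, c in enumerate(basis):
                new[d] = (new[d] - c * xj) % P
                new[d + 1] = (new[d + 1] + c) % P
            basis = new
            denom = denom * (xi - xj) % P
        scale = yi * pow(denom, P - 2, P) % P  # modular inverse via Fermat's little theorem
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + c * scale) % P
    return coeffs

# 6 data symbols expanded to 10 shares; simulate losing 4 of the 10.
data = [72, 101, 108, 108, 111, 33]            # "Hello!"
shares = encode(data, 10)
surviving = [shares[1], shares[3], shares[4], shares[6], shares[8], shares[9]]
assert decode(surviving, 6) == data
print("recovered:", bytes(decode(surviving, 6)))
```

For comparison, storing three full replicas costs 3x the original size, while this 6-of-10 scheme costs roughly 1.67x and still tolerates the loss of any 4 shards; that gap is the cost argument on which erasure-coded designs rest.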
In decentralized architectures, Storj and Sia have adapted traditional RS coding to the practical needs of distributed networks. Walrus has likewise proposed its own variant, the RedStuff encoding algorithm, to achieve a lower-cost, more flexible redundancy storage mechanism.
What is RedStuff's biggest feature? By improving the erasure-coding algorithm, Walrus can quickly and robustly encode unstructured data blocks into smaller shards that are distributed across a network of storage nodes. Even if up to two-thirds of the shards are lost, the original data block can be quickly reconstructed from the remaining ones, all while keeping storage redundancy low.