What Is Decentralized Cloud Storage?
Most cloud storage today is centralized, meaning data is stored in data centers owned and run by storage providers. The largest cloud storage providers are notably Amazon, Microsoft, and Google. While these storage providers have multiple data center locations regionally spaced in a hub and spoke model, data is still controlled in just a few locations by singular entities.
In a decentralized cloud storage model, data is stored across a large network of thousands of storage nodes run by unique entities that have no visibility into what is stored on their hardware. It’s a distributed model that has only become possible at a global scale in the last decade thanks to improvements in bandwidth speed and availability.
The biggest difference from the centralized model is the way privacy and security are handled. In a centralized storage model, you are trusting the storage provider to protect your data, but they have full access to that data. A decentralized storage model is zero trust, meaning you assume that you can’t trust anyone in the network. So the system is built with end-to-end encryption, erasure coding, and sophisticated access management to ensure that no one has access to your data other than you.
Netflix Has Already Moved to Decentralized Cloud Storage
Netflix customers were having difficulty streaming their video content in regions that were located far away from the source. The traditional centralized model was not able to transfer large media files fast enough for streaming.
To transfer large media files point-to-point throughout the world, Netflix had to develop its own decentralized cloud storage network. To facilitate this, they ship huge containers of servers to ISPs around the world so that they don’t have to go through the centralized hub and spoke delivery model for their video streaming. This has allowed Netflix to provide high quality and highly available access to their service at a larger scale than would be possible going through today’s centralized hyperscalers.
While the Netflix decentralized storage network proves many of the benefits of decentralization, it isn’t really a viable model for any other company. The deals brokered with the ISPs, let alone the server costs, are impressive, but far beyond what most companies can achieve individually. Luckily, decentralized cloud storage projects and networks are growing, and companies can achieve the same benefits as Netflix without the management complexity and exorbitant costs.
1. Storage network
A decentralized storage network is a network of storage nodes that can be utilized to store data. Typically users can choose what nodes host their data, but they are responsible for the security and resiliency of that arrangement.
2. Storage marketplace
A decentralized marketplace is a storage broker for decentralized storage providers. The marketplace handles integration with the decentralized storage vendors as well as with centralized services, for example through S3 compatibility.
3. Distributed storage
A decentralized network built for distributed storage uses erasure coding to segment data and distribute it to various storage nodes. This handles the security and resiliency of the network, keeping data protected.
Decentralized vendors are built on one of these models depending upon their goals for value and intended use. Some are built for long term archive, some for decentralization purists, while others are built as an alternative to centralized storage for business use cases. All decentralized networks have innovated in interesting ways that add value to Web3 as well as provide new, cost-efficient ways to store data.
Decentralized Cloud Storage Solution Options
A handful of decentralized networks are viable today, and each has approached decentralized cloud storage in its own way. Five vendors currently provide true decentralized storage services: Sia, Filecoin, Arweave, Filebase, and Storj. The graphic here illustrates how these providers serve the needs of individuals versus businesses.
Each decentralized storage vendor takes a different approach to how their network is architected. Some have created a storage network or a community of people offering storage. Some are more of a decentralized storage marketplace brokering storage services. Others have built a network of distributed storage. Here are the basics on each of these networks and where to go to get more information.
Arweave is a storage network/community that has organized its decentralized network to back data with sustainable and perpetual endowments to store data for 200 years.
Filebase is a marketplace that acts as a cloud service broker for decentralized networks. They have built S3 compatible gateways with multiple decentralized network backends.
Storj is a decentralized distributed storage network competitive with centralized storage. It is fully S3 compatible and is highly secure with erasure coding and encryption.
Sia is a fully decentralized distributed storage network where the data is stored using blockchain components and payment for storage is made using the Sia token.
Filecoin is a decentralized marketplace for cloud storage. The Filecoin network allows anyone to participate as a storage provider to compete for business.
For a deeper comparison of the decentralized cloud storage providers, see this write-up from Gemini.
How Decentralized Cloud Storage Works
While Netflix built a decentralized network by placing hardware with existing ISPs to gain the performance its customers demand, publicly available decentralized cloud storage networks are built without large servers or data centers. Instead, underutilized capacity on tens of thousands of hard drives located throughout the world makes up a multi-petabyte storage network. How the upload and download of data are managed, how security is handled, how storage nodes are maintained, how files are repaired, and how pricing is set are all unique to each decentralized vendor.
To show how decentralized cloud storage works relative to centralized cloud storage, Storj will be used as an example of the features and function of decentralized cloud storage.
Looking at Storj, the independent storage nodes within their distributed network store data without any access to any complete file or usable data. The data and applications are encrypted, encoded and split into fragments, then stored across the distributed nodes. Satellites automatically manage the network access controls, ensure data reliability and node integrity, and compensate the storage nodes for the capacity and bandwidth they supply.
Who or What Are Decentralized Storage Nodes?
Storage nodes at a basic level are hard drives with excess capacity that have been connected to the decentralized network for utilization. This can be an individual with extra desktop space, an SMB with a NAS or two, a small datacenter with spare server space, or any combination. Available high-speed bandwidth is the most important factor in a node being beneficial to the network. An open-source application is used to securely store and share that hard drive space, and storage nodes are incentivized via compensation with an ERC-20 utility token.
A recent survey of the storage node operators on the Storj network revealed that 72% of storage node operators operate only one node and 87% of these are located in a home or home office. Typical performance is high speed with 79% of storage nodes delivering 100MB to more than 1GB in bandwidth. Regarding sustainability, 69% of storage nodes are using existing hardware that was underutilized or repurposed for use as a storage node.
As to the question of why people choose to host a storage node, the answers vary from wanting to support a more sustainable storage model to earning compensation for their unused storage capacity.
It doesn’t matter who the person is or what the hardware is. The beauty of the decentralized and distributed model is that it’s zero trust, which ensures that storage node operators cannot see or access the data stored on their node.
Interested in becoming a storage node operator? Here’s how to host a node
How Can Developers Build on Decentralized Cloud Storage?
Decentralized cloud storage user interfaces, options and tools have evolved significantly to deliver ease of use. In the case of Storj, there is nothing for users to install. A choice of open source interfaces handles object upload, download and sharing as well as access management. There’s a simple drag-and-drop object browser option, plug-and-play S3 compatibility with the GatewayMT, or a more traditional native Uplink CLI. Before data gets uploaded to a decentralized network, it’s encrypted with keys that are held only by the customer, not by anybody operating the network. The data is then broken up into many pieces in a redundant way, using erasure coding rather than replication. That encryption and erasure coding of files is handled automatically right out of the box.
The best way to get a feel for how decentralized cloud storage works is to try it out. Sign up for Storj and get 150 GB capacity and bandwidth for free!
How Does Decentralized Cloud Storage Ensure Data Privacy and Security?
What the decentralized storage network does to protect data can be illustrated by thinking about grains of sand. Distributed storage splits data objects into pieces of encrypted sand that are randomly distributed on an encrypted beach. Sounds pretty secure, but let’s break down the technology used to make it happen.
Object Handling in Storj Decentralized Cloud Storage
1. Encrypt Data
If using the satellite Uplink CLI, end-to-end encryption is used. Server-side encryption is used for Object Browser and the S3 GatewayMT. Segments are encrypted using a salted, randomized encryption key that is then encrypted with the user’s encryption passphrase and stored in the object metadata.
2. Erasure Code
Erasure coding is then used for data redundancy. A 276% expansion factor is used and objects are broken into segments of 64MB or smaller. Each segment is then broken into 80 or more pieces.
3. Identify Nodes
The system then identifies which nodes will store those pieces, choosing among the tens of thousands of geographically diverse nodes and ISPs in more than 100 countries.
4. Distribute Pieces
The pieces are then distributed to the identified nodes for storage.
1. Identify Nodes
For downloads, the pieces needed to reconstitute an object (only 29 of the 80) are located on the closest nodes geographically.
2. Download Segments
The identified pieces to make up each segment needed are then downloaded.
3. Assemble File
Those pieces are assembled together into the encrypted file.
4. Decrypt Data
In the final step, the data is decrypted and the object can be accessed.
All of these steps are handled automatically for every object upload and download. This process delivers significantly better security, availability, and performance than the centralized cloud storage model.
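The figures quoted in the steps above (segments of 64MB or smaller, 80 pieces per segment, any 29 of which rebuild the segment) can be checked with a little arithmetic. The sketch below is only an illustration of those numbers, not Storj code; the function names are hypothetical, and it ignores encryption and metadata overhead.

```python
import math

# Figures quoted in the article.
REQUIRED_PIECES = 29   # pieces needed to rebuild one segment
TOTAL_PIECES = 80      # pieces stored per segment
MAX_SEGMENT_MB = 64    # largest segment size


def expansion_factor(required: int = REQUIRED_PIECES, total: int = TOTAL_PIECES) -> float:
    """Ratio of stored data to original data for a required-of-total erasure code."""
    return total / required


def storage_plan(object_mb: float) -> dict:
    """Rough plan for one object: segments, pieces, and raw capacity consumed."""
    segments = max(1, math.ceil(object_mb / MAX_SEGMENT_MB))
    return {
        "segments": segments,
        "pieces": segments * TOTAL_PIECES,
        "stored_mb": round(object_mb * expansion_factor(), 1),
        # Pieces that can disappear from one segment before it becomes unreadable.
        "max_lost_pieces_per_segment": TOTAL_PIECES - REQUIRED_PIECES,
    }


plan = storage_plan(1024)  # a 1 GB object
```

Here 80 / 29 is about 2.76, which is where the 276% expansion factor comes from. Plain three-copy replication would cost 300% while tolerating the loss of only two copies, whereas this scheme tolerates the loss of 51 pieces per segment.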
How Does Access Management Work in Decentralized Cloud Storage?
Access management on Storj decentralized cloud storage requires a parallel coordination of authorization and encryption. Authorization is a determination of whether a particular action request is valid. Authorization management is implemented in a decentralized system using hierarchical deterministic API keys based on macaroons. Objects on a decentralized cloud storage network are encrypted with a randomized encryption key that is salted with a predetermined salt. Paths and randomized encryption keys are encrypted using AES 256 GCM or Secretbox depending on whether the Object Browser, the S3 GatewayMT, or the Uplink CLI is used. Get more details on how Storj encryption works here.
When access has been authorized, an access grant is created. This is a security envelope that contains a satellite address, a restricted API key, and a restricted path-based encryption key. This is everything an application needs to locate an object on the network, access that object, and decrypt it. Access grants are created and managed client-side. This includes any encoded restrictions such as restricting certain operations, whether operations can be done on one or more buckets, specific paths, and specific time windows of access.
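To make the macaroon idea concrete, here is a minimal sketch of the general construction: each restriction (caveat) is folded into the key's signature with an HMAC, so anyone holding a key can narrow it client-side, but nobody can remove a restriction without the root secret. This is an illustration only. The caveat names (`op`, `bucket`, `prefix`, `not_after`), the tuple layout, and the function names are assumptions made for the example, not Storj's actual API key or access grant format, and the encryption-key half of an access grant is left out.

```python
import hashlib
import hmac


def _chain(key: bytes, message: str) -> bytes:
    """One link of the HMAC chain."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def mint(root_secret: bytes, key_id: str):
    """Create an unrestricted key as (identifier, caveats, signature)."""
    return key_id, [], _chain(root_secret, key_id)


def restrict(key, caveat: str):
    """Derive a narrower key. Needs only the existing key, not the root secret."""
    key_id, caveats, signature = key
    return key_id, caveats + [caveat], _chain(signature, caveat)


def _caveat_allows(caveat: str, request: dict) -> bool:
    """Evaluate one hypothetical 'name=value' caveat against a request."""
    name, _, value = caveat.partition("=")
    if name == "op":
        return request["op"] in value.split(",")
    if name == "bucket":
        return request["bucket"] == value
    if name == "prefix":
        return request["path"].startswith(value)
    if name == "not_after":
        return request["time"] <= int(value)
    return False  # unknown caveats fail closed


def verify(root_secret: bytes, key, request: dict) -> bool:
    """Recompute the chain from the root secret, then check every caveat."""
    key_id, caveats, signature = key
    expected = _chain(root_secret, key_id)
    for caveat in caveats:
        expected = _chain(expected, caveat)
    if not hmac.compare_digest(expected, signature):
        return False
    return all(_caveat_allows(caveat, request) for caveat in caveats)


# Example: a read-only key limited to one bucket, derived without the root secret.
root = b"demo-root-secret"
base_key = mint(root, "project-demo")
read_only = restrict(restrict(base_key, "op=read"), "bucket=videos")
```

Because every caveat changes the signature, dropping a caveat from the list makes verification fail, which is what lets restrictions be delegated safely.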
Common Use Cases for Decentralized Cloud Storage
Decentralized cloud storage is highly performant and economical for use cases where security, privacy and on-demand availability are extremely important. It suits large files and data sets, data that is written once but read many times, content that has hundreds of thousands of downloads a month, and workloads that require high transfer speed. The predominant use cases for decentralized cloud storage today are the ones that are most challenging for centralized cloud storage due to high costs or availability limitations. The most common use cases are video storage and streaming, cloud native applications, software and large file distribution, and backups.
Let’s dive into the specific challenges of these use cases and the reasons why these are a great fit for decentralized storage.
Video Storage and Streaming
Trying to do video storage and streaming in centralized cloud storage can result in high egress costs and can be limited in geographic distribution and availability. As seen in Netflix moving to a decentralized model, on-demand availability just isn’t fully achievable in a centralized network. Additionally, there can be the challenge of the performance needed for transcoding the video into different formats, as well as consistency in versions.
Decentralized cloud storage is great for video storage and streaming because it is highly performant at the edge and 1/5th to 1/40th the cost of centralized storage with no hidden egress fees. The service is multi-region by default and handles multi-threaded concurrent downloads. Decentralized storage also has an enterprise-grade SLA for durability and availability.
Cloud Native Applications
As cloud native applications get started, centralized cloud storage is a highly attractive and inexpensive option, but costs quickly become unmanageable as data continues to grow. While high cost is a factor, the biggest challenges with centralized cloud storage are customers’ perception of how the big cloud providers are handling data privacy and the increasing frequency of security breaches on centralized cloud storage. Access delegation in centralized cloud storage is also not a zero trust model, which adds to security concerns. And depending upon the regions you are looking to serve with your cloud apps, limitations in geographic distribution may also be a challenge.
Decentralized cloud storage is a great alternative for cloud native apps because it is a zero trust architecture. Neither the decentralized network provider nor the storage nodes can access your data. The system makes it virtually impossible to infiltrate, and even if a malicious actor could get in, they can’t reassemble your data. And this standard level of encryption is for data and metadata (the data about a user’s data). Read this IDC Analyst Brief to take a deeper look at how decentralized cloud storage architecture takes zero trust to the next level. Combined with the global distribution and significantly lower cost, this makes decentralized cloud storage a highly preferred option over centralized cloud storage.
Software & Large File Distribution
Similar to video storage and streaming, if you’ve got software or large files you are trying to distribute, you may find centralized cloud storage costly and limiting. This quickly becomes a problem when working with containers or large data sets such as in scientific research.
Decentralized cloud storage works very well for software and large file distribution thanks to its low, consistent pricing while being able to easily distribute files in any geography with great performance and durability.
Centralized cloud storage providers offer lower cost services for backups and archival, like Amazon’s Glacier. However, there are still significant limitations in geographic access and performance in disaster recovery scenarios. Data resilience is a concern as well, since centralized cloud storage is not 100% durable.
Decentralized cloud storage is a better choice for backups where you still need access to that data, particularly for disaster recovery scenarios. The backups are more secure and more available, while costs stay lower than cold storage. But if cold storage is what you need, some decentralized networks offer “forever storage” for 200 years of archival.
Limitations of Decentralized Cloud Storage
While the use cases covered so far are great fits for decentralized cloud storage, there are some use cases that don’t make sense for decentralized cloud storage. Fundamentally, these are use cases that aren’t a fit for object storage—whether centralized or decentralized.
Use cases that aren’t appropriate for decentralized cloud object storage
- Large numbers of small files (<1MB, or hundreds of KB)
- Files that are frequently updated
- Archival data that is stored indefinitely and never accessed
- CDN/ultrahigh frequency access
Advantages of Decentralized Cloud Storage
If you’ve got the right use case, decentralized cloud storage can yield incredible benefits over alternatives like centralized cloud storage. Performance, privacy, security, and cost are significantly improved. And perhaps even more important in the long term, decentralized cloud storage is a sustainable option that fulfills the vision of Web3. Are you not quite sold? Here’s the proof of each of these benefits, again using Storj as the example network for comparison.
Decentralized Cloud Storage Yields Better Performance
Decentralized cloud storage by design has better global availability, reliability, and resiliency than centralized cloud storage. All of the components in the network are multi-region by default with built-in redundancy, and even the satellites are multi-region within and across multiple continents. This immediately provides superior availability and scale to the network.
The Storj decentralized network is also self-healing. Storage nodes are audited via an automated process to determine the availability of segments based on storage node availability. These audits create a reputation score for each node using a statistical model for node quality and health. If a storage node leaves the network, the system automatically replaces the pieces held by that node and uploads them to new nodes.
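The article does not say which statistical model produces the reputation score. As one plausible illustration, the sketch below uses a beta-reputation style update, in which each audit result nudges two running counters and older results fade over time. The `forgetting` and `weight` parameters, their default values, and the function names are assumptions chosen for the example, not values taken from the network.

```python
def update_reputation(alpha: float, beta: float, audit_passed: bool,
                      forgetting: float = 0.95, weight: float = 1.0):
    """Fold one audit result into the (alpha, beta) counters.

    alpha accumulates evidence of good behavior, beta evidence of bad behavior.
    `forgetting` below 1.0 makes old audits count less than recent ones.
    """
    outcome = 1.0 if audit_passed else -1.0
    alpha = forgetting * alpha + weight * (1 + outcome) / 2
    beta = forgetting * beta + weight * (1 - outcome) / 2
    return alpha, beta


def reputation_score(alpha: float, beta: float) -> float:
    """Score between 0 and 1; higher means a healthier node."""
    return alpha / (alpha + beta)


# A node starts with a clean record, fails one audit, then passes ten in a row.
state = (1.0, 0.0)
state = update_reputation(*state, audit_passed=False)
after_failure = reputation_score(*state)
for _ in range(10):
    state = update_reputation(*state, audit_passed=True)
after_recovery = reputation_score(*state)
```

A model of this shape punishes a failed audit immediately but lets a node earn its score back through a run of successful audits, which fits the self-healing behavior described above.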
Regarding actual upload and download speed, it may seem difficult to believe that a system with all of that encryption, erasure coding and distribution would be fast. Yet decentralized cloud storage was developed to optimize for a bandwidth-constrained environment by minimizing coordination-related dependencies, maximizing parallelism, and working to hone and balance efficiency to reduce the long-tail effect.
Storj decentralized cloud storage uses a statistical model to determine relative performance of storage nodes in a distributed system. Performance is determined by the fastest performing nodes and oversampling is utilized for long tail elimination. For upload, the system attempts to upload 110 pieces, but stops at 80 successful uploads and uses parallelism to improve performance. Similarly, for downloads, the system attempts 39 pieces and stops at 29 successful.
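A small simulation shows why oversampling removes the long tail. When 110 transfers start in parallel and the upload finishes as soon as any 80 complete, the finish time is the 80th fastest latency; the 30 slowest nodes never hold anything up. The latency distribution below is a made-up log-normal model and the function names are hypothetical; only the 110 and 80 figures come from the text.

```python
import random


def time_to_finish(latencies, needed):
    """Time until `needed` transfers complete when all of them start at once."""
    if needed > len(latencies):
        raise ValueError("not enough transfers attempted")
    return sorted(latencies)[needed - 1]


def simulate_upload(seed=0, attempted=110, needed=80):
    """Compare finishing with and without oversampling for one simulated upload."""
    rng = random.Random(seed)
    # Hypothetical latency model: most nodes are quick, a few are very slow.
    latencies = [rng.lognormvariate(0.0, 1.0) for _ in range(attempted)]
    # With oversampling: wait for the fastest `needed` of all attempted transfers.
    with_oversampling = time_to_finish(latencies, needed)
    # Without oversampling: contact exactly `needed` nodes and wait for every one.
    without_oversampling = time_to_finish(latencies[:needed], needed)
    return with_oversampling, without_oversampling


fast, slow = simulate_upload()
```

The oversampled time can never be worse than waiting on a fixed set of 80 nodes, because any 80 nodes drawn from the 110 must include at least one that is no faster than the 80th fastest overall. The same reasoning applies to downloads, where 39 pieces are requested and the first 29 to arrive are used.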
But how fast is it really? A typical laptop can achieve 500MB transfer speed with a supporting internet connection while more powerful servers can exceed 2,500MB downloading and in excess of 1,000MB uploading. And these rates are true from any location, unlike centralized cloud storage that can have extremely slow download speeds from locations close to their data centers, let alone far away.
Take a deeper dive into the performance benefits of multilayered parallelism with this report from the University of Edinburgh.
Decentralized Cloud Storage Eliminates Data Privacy Concerns
Data privacy is a growing concern with developers as many end users see centralized cloud storage providers like Google, Microsoft, and Amazon as having ulterior motives to access their data and metadata. These companies have entire business models around using user data to better advertise to them so the concern is understandable. Bottom line is that end users perceive that centralized cloud providers have access to their personal data stored on centralized cloud networks.
While minimizing user data and metadata collection is an important first step, some collection is necessary for application functionality. Storing that data and metadata on decentralized cloud storage eliminates privacy concerns because no entity in the network has access to that data. Not the decentralized network provider and not the storage nodes. And with the encryption, erasure coding, and distribution malicious actors can’t access the data either. It truly is the most private storage option available.
If you want to dig deeper on data privacy in cloud storage, you can find a great comparison here.
Decentralized Cloud Storage Is the Most Secure Storage Option
There are two main concerns with data protection in cloud storage: ensuring data is kept secure from malicious actors and ensuring no data loss occurs. Let’s first consider data loss. In order to lose access to an object in Storj’s decentralized cloud storage, you would need to lose 52 out of the 80 storage nodes holding a segment simultaneously, since any 29 of the 80 pieces are enough to rebuild it. This is highly improbable since these storage nodes are in different locations, operated by different people, on different internet connections and providers, with independent power supplies. Plus, all of the satellites running in at least three different regional data centers within a geography must fail simultaneously. And all of the hosted edge services running in multiple data centers in different geographies must also fail simultaneously. This extremely unlikely combination of failures is why decentralized cloud storage has 99.95% availability and 11 9’s of durability.
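To get a feel for how improbable that is, the chance of losing at least 52 of 80 pieces can be written as a binomial tail. The sketch below assumes every node fails independently with the same probability, which is a simplification, and the 10% failure probability in the example is an arbitrary illustrative value rather than a measured one. It is not how the quoted durability figure was derived, which would also depend on audits and repair.

```python
from math import comb


def segment_loss_probability(p_node_loss: float, total: int = 80, required: int = 29) -> float:
    """Probability that fewer than `required` of `total` pieces survive.

    Assumes independent node failures with identical probability `p_node_loss`.
    """
    max_tolerated = total - required  # up to 51 lost pieces are survivable
    return sum(
        comb(total, lost) * p_node_loss**lost * (1 - p_node_loss) ** (total - lost)
        for lost in range(max_tolerated + 1, total + 1)
    )


# Illustrative only: even if 1 node in 10 vanished at the same moment.
example = segment_loss_probability(0.10)
```

Under these assumptions the result is vanishingly small, because more than half of the nodes holding a segment would have to disappear at once before any repair could take place.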
Regarding keeping data secure from malicious actors, decentralized cloud storage was built using zero trust architecture, which means that developers had to assume that no one in the entire process could be trusted and still ensure that data would be safe. The encryption, erasure coding, and distribution as well as the delegated authorization used for access management all work together to form a defense-in-depth strategy to ensure data protection and integrity. Fundamentally, it eliminates the risks of cloud storage.
For the details on the security risks of centralized cloud storage and how decentralized systems overcome them, go here.
Decentralized Cloud Storage Is the Most Economical Cloud Storage Option
Centralized cloud storage is quite inexpensive when starting out. However, as companies have generated more data, over time they’ve seen their storage costs rise to unsustainable levels. Additionally, high (and sometimes hidden) egress costs are adding to this burden.
Decentralized cloud storage is a completely different economic model. Instead of having to buy more servers and build more data centers as more data is produced, the decentralized network is able to take advantage of unused storage capacity that already exists. And because there is an excess of unused capacity (most hard drives are only 25% full), costs can stay low and don’t need to rise in conjunction with data growth. Decentralized cloud storage providers carry far less overhead because they don’t build or run data centers, and they don’t need huge security teams to protect those data centers because the system is inherently protected.
All of this allows decentralized cloud storage to be available at a fraction of the cost of centralized cloud storage. To be specific: decentralized cloud storage is 1/5 to 1/40 the cost of AWS storage. Additionally, decentralized storage has no hidden or added fees. End-to-end encryption comes standard, whereas it is an upcharge with centralized cloud providers. Pricing is simple and predictable.
Want a more detailed analysis of decentralized cloud storage economics? Read this article. Or get a side-by-side comparison of what you’re using today versus Storj and see how much you could save by switching. Get a free cost analysis.
Decentralized Cloud Storage Is Environmentally Sustainable
While sustainability might not always be at the forefront of business decisions, it is important to note that decentralized cloud storage is environmentally sustainable and fits in the vision of Web3. That makes it very beneficial for future planning.
Decentralized cloud storage utilizes existing storage capacity without adding incremental energy cost or new capacity. It takes advantage of latent, under-utilized network capacity. Unlike proof of work (PoW) systems that consume huge amounts of power and equipment, decentralized cloud storage actually reduces power and equipment needed over centralized systems, while outperforming storage systems built on PoW concepts.
Many of the more advanced decentralized cloud storage systems are even divesting their PoW cryptocurrency methods for compensating storage node operators in order to be fully sustainable. Environmental sustainability is becoming more important for businesses looking to go green and reduce their carbon footprint.
Learn more about the multiple efforts Storj is making to reduce its carbon footprint, innovate cloud storage, and conduct business in a more environmentally responsible way.
Factors Causing the Inevitable Growth of Decentralized Cloud Storage
Decentralized cloud storage is not only here to stay, but will see significant growth in the coming years. This growth will be driven by factors that we already see impacting storage decisions today and some that we have yet to even imagine.
Factors driving the growth of decentralized cloud storage
Decentralized storage providers today only offer object storage. The larger providers have integrations with S3 for compute, but that isn’t ideal for all use cases. Significant advancements are being made in edge computing that will make compute more accessible where it is needed. It is highly likely that decentralized storage providers will begin to partner with edge computing providers to help customers achieve both compute and storage at the edge. This will continue to improve performance and open up new use cases.
There is no question that data protection regulations are continuing to evolve, becoming more strict in how data needs to be secured and how private information is collected and stored. This trend is already forcing companies to find more secure and private ways to store their data and decentralized cloud storage absolutely fits that requirement.
Decentralized cloud storage is already significantly less expensive than centralized storage. As the network of storage nodes grows, the cost will fall further while the performance improves. Costs go down because there is more available storage capacity, and performance improves because the system can ensure data is stored on high performing nodes located even closer to where it is needed.
Sustainability and Web3
Just like the internet grew out of a necessity to have no single point of failure or ownership, the decentralized internet, or Web3, is putting data ownership and privacy back in the hands of the users versus a few large corporations. It is also a drive toward reducing the carbon footprint of the internet. Sustainability is becoming more important to companies and especially end users who want amazing innovation, but not at the cost of our planet. Decentralized cloud storage fits the bill for the vision of Web3 and reducing environmental impact because it isn’t building new data centers, it is using storage and bandwidth that already exists and is going unused. Decentralized cloud storage is the greenest storage option for environmental and economic sustainability. And this fact will absolutely drive its growth.
As decentralized cloud storage grows its storage network and its functionality, use cases that we haven’t even dreamed of yet will be possible—which will also contribute to future growth. Really, the stars are aligned for decentralized cloud storage to see extreme growth. And this will be a big step toward helping to realize the vision of Web3 and a more sustainable internet.
Transition from Web2 to Web3 Storage
Storj DCS offers an S3 compatible interface to make that transition as smooth and low risk as possible.