Storj Data Redundancy, Availability, and Durability

Increased Availability

Data plays a vital role in your business. How you store, secure, and make that data accessible can have significant impacts on several aspects of your operations - and your bottom line. Standard practice for on-premises storage has been to replicate or mirror data across multiple data centers to ensure its availability. As data storage has moved to the cloud, many organizations have continued that strategy by storing their data in multiple availability zones in the centralized cloud. Unfortunately, following that legacy approach in the cloud has significant drawbacks.

History has shown that when outages occur on a centralized cloud storage network, it's not uncommon for the outage to affect multiple availability zones, or even an entire region. As a result, to achieve acceptable levels of data availability you need to replicate your data across multiple availability zones and regions. Replicating your data in that manner multiplies your already high storage and egress costs, as well as your management complexity and effort. It also has adverse sustainability and carbon footprint impacts.

Storing your data in the decentralized cloud with Storj Decentralized Cloud Services (DCS) offers a better option that inherently increases data availability and reliability while dramatically decreasing cost and complexity.

High Data Redundancy and Availability

To achieve enterprise-grade 99.95% data availability, Storj uses Reed-Solomon erasure coding to automatically split each object into 80 or more encrypted pieces, which are distributed among roughly 13,500 geographically diverse nodes running on different ISPs and power grids across the globe. Reconstructing the object requires only 29 of those 80 pieces. Because Storj automatically distributes the pieces across multiple regions and countries, even an outage that spans several regions won't hurt availability. From a data availability standpoint, your data is literally everywhere.

Additionally, from a security perspective, there's no single location where the whole file exists. In other words, with Storj you get global data redundancy and high availability with no single point of failure. If you need to control the geographical locations where your data is stored due to General Data Protection Regulation (GDPR) requirements, Storj also provides geofencing capabilities. Furthermore, the redundancy that Storj provides is considerably more efficient than the replication used by most data storage systems, resulting in a much lower carbon footprint.
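The availability gain from a 29-of-80 scheme can be sanity-checked with a binomial tail calculation. The sketch below is illustrative only: it assumes independent node outages and a hypothetical per-node availability of 90%, neither of which is a published Storj figure.

```python
from math import comb

def availability(n: int, k: int, p: float) -> float:
    """Probability that at least k of n independently stored
    pieces are online, given per-node availability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 80 pieces, any 29 reconstruct the file; assume each node is up 90% of the time.
file_availability = availability(n=80, k=29, p=0.9)
print(file_availability)  # effectively 1.0 - losing 52+ pieces at once is vanishingly unlikely
```

Even under far more pessimistic per-node assumptions, the chance that fewer than 29 of 80 pieces remain reachable at the same moment is negligible, which is why the scheme tolerates multi-region outages.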

High Data Durability

When evaluating decentralized networks, considering how each one handles node churn is a critical step. Hard drives fail, and a storage node can go offline permanently at any time. A storage network's redundancy strategy must store data in a way that keeps it accessible with high probability, even when any given number of individual nodes is offline. Some decentralized networks try to address this durability issue by simply replicating the data several times across their network, but that comes at a great cost, increasing bandwidth consumption by 100% for every additional replica (see Replication is bad for decentralized storage).
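The cost contrast between replication and erasure coding can be made concrete with simple arithmetic. The snippet below is an illustrative comparison, not published Storj data: it assumes full-copy replication on one side and a 29-of-80 code on the other.

```python
# Replication: to survive f node losses you must store f + 1 full copies.
def replication_expansion(failures_tolerated: int) -> float:
    return failures_tolerated + 1.0

# Erasure coding: n pieces, any k reconstruct, so n - k losses are tolerated
# while storing only n/k times the original data.
def erasure_expansion(n: int, k: int) -> float:
    return n / k

# Tolerating 51 lost pieces via replication would mean 52 full copies;
# a 29-of-80 code tolerates the same 51 losses at ~2.76x expansion.
print(replication_expansion(51))   # 52.0
print(erasure_expansion(80, 29))   # ~2.7586
```

The same fault tolerance that would require dozens of full replicas costs under 3x the original data size with erasure coding, which is the source of both the cost and carbon-footprint advantages described above.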

Storj provides a better approach to data durability. By utilizing erasure codes to create data redundancy, Storj delivers eleven 9s of durability.

If a storage node goes offline or if data gets deleted or corrupted, the erasure coded pieces enable Storj to recover or repair that data using the file’s remaining pieces stored across the network. On top of that, Storj has an automated audit and repair process that continually checks the status of its storage nodes and stored data and makes repairs as necessary.

The audit and repair process maps where the pieces of a file are stored. As part of its ongoing status audits, if it detects that a storage node has been offline for a certain period of time, that node is evicted from the network. If the number of available pieces for a file drops to a predetermined threshold, the repair process automatically reconstructs the file using the fastest available 29 pieces. It then splits the reconstructed file back into at least 80 erasure-coded pieces and redistributes them to an equal number of nodes across the network, ensuring ongoing availability and durability.
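The repair flow described above can be sketched as a small simulation. Everything here - the threshold value, the piece identifiers, and the helper names - is hypothetical; the real network's parameters and implementation differ.

```python
import random

TOTAL_PIECES = 80      # pieces produced per file (assumed)
MIN_NEEDED = 29        # pieces required to reconstruct (assumed)
REPAIR_THRESHOLD = 35  # hypothetical trigger point for repair

def repair_if_needed(healthy_pieces: set) -> set:
    """If the healthy piece count falls to the repair threshold,
    rebuild the file from MIN_NEEDED pieces and redistribute a
    fresh set of TOTAL_PIECES pieces (both steps simulated)."""
    if len(healthy_pieces) > REPAIR_THRESHOLD:
        return healthy_pieces  # still enough redundancy; no action
    assert len(healthy_pieces) >= MIN_NEEDED, "file unrecoverable"
    # Download the fastest MIN_NEEDED pieces and reconstruct (simulated).
    fastest = set(list(healthy_pieces)[:MIN_NEEDED])
    # Re-encode and redistribute to TOTAL_PIECES nodes (simulated as fresh ids).
    return set(range(TOTAL_PIECES))

# Simulate churn: nodes holding 46 of the 80 pieces have gone offline.
remaining = set(random.sample(range(TOTAL_PIECES), 34))
restored = repair_if_needed(remaining)
print(len(restored))  # 80 - full redundancy restored
```

The key property this models is that repair runs well before the piece count approaches the reconstruction minimum, so a file never sits at the edge of unrecoverability.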

In addition to the built-in automated audit and repair, Storj also makes a significant effort to simply reduce overall node churn through an economy that incentivizes storage node operators to stay online with high-performing availability.
  1. Automated data orchestration ensures global availability for anytime, anywhere data retrieval
  2. Automated repair system ensures reliability by relocating file pieces in the event of storage node failures
  3. Reed Solomon erasure coding ensures data remains intact even if multiple nodes go offline for any reason - 99.95% availability and 99.999999999% durability.
  4. Bitrot resistance through statistical audits that validate both storage nodes and the retrievability of data, while the repair process provides a continuous refresh

No-Worry Management and SLA

The automated processes that ensure the high redundancy, availability, and durability on the DCS distributed network are all part of the comprehensive management layer that sits on top of Storj. We take care of the orchestration and management so you don’t have to. In addition to all the automated management processes, hands-on Storj engineers continually monitor network performance as well. Additionally, we provide a 99.95% availability SLA so you can be certain your data will always be accessible.

Build on the distributed cloud.

Get S3-compatible object storage with better security, performance and cost.
