Pick 2: Applying Lessons from the Electric Car to the Path to 100% Decentralization

As you may be aware, Storj operates Storj DCS (Decentralized Cloud Storage), the industry’s first enterprise-grade, economically sustainable, decentralized storage service with an S3-compatible Gateway.

We are really proud of our 99.99999999% durability, 99.95% availability, S3 compatibility, and that the platform was built with security and privacy for developers in mind. We’re proud of the fact that we’re able to sustainably price at 1/10th to 1/40th the price of large cloud providers while providing attractive economics to Storage Node Operators and partners. And, we are exceptionally gratified that we’ve done this entirely on decentralized Nodes, with a zero-knowledge, end-to-end encrypted architecture that delivers on the security, privacy, economic, and resilience promises of decentralization.
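To make those percentages more concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the availability figure is measured over a 365-day year and that the durability figure is an annual, per-object probability; the post does not state either, and the function names are purely illustrative.

```python
from decimal import Decimal

HOURS_PER_YEAR = Decimal(365 * 24)  # 8760 hours, ignoring leap years (assumption)


def downtime_hours_per_year(availability_percent: str) -> Decimal:
    """Hours per year a service may be unavailable at the given availability."""
    unavailable = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable * HOURS_PER_YEAR


def expected_annual_losses(durability_percent: str, object_count: int) -> Decimal:
    """Expected objects lost per year, treating durability as an annual
    per-object survival probability (an assumption, not stated in the post)."""
    loss_probability = (Decimal(100) - Decimal(durability_percent)) / Decimal(100)
    return loss_probability * Decimal(object_count)


if __name__ == "__main__":
    print(downtime_hours_per_year("99.95"))
    print(expected_annual_losses("99.99999999", 10**9))
```

Under those assumptions, 99.95% availability works out to a little under four and a half hours of downtime per year, and 99.99999999% durability to roughly one lost object per ten billion stored per year. Percentages are passed as strings so that `Decimal` keeps every digit; a binary float would round away most of the nines.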

However, the purists in the community (and in Storj as well) will correctly point out that we’re not 100% decentralized with regard to Satellites. While Satellites are highly distributed and highly redundant, they’re not fully decentralized within the Storj network. We intend to make them fully decentralized over time, but going to market with this intermediate stage first was a conscious choice.

A few thoughts on why:

Pick Two (for engineering projects generally)

You may be familiar with the engineering dictum: you can be fast, you can be cheap, and you can be good—but you can’t be all three. You have to pick two.

There’s a long list of engineering projects that have failed (or were never completed) because the project leaders tried to meet all three factors simultaneously and ended up succeeding at none.

Pick Two (for electric cars)

A similar scenario played out in the electric car industry a decade ago. Everyone wanted a car that was affordable, had a long range between charges, and was all-electric. But, at the time, it turned out that it was impossible to meet all three factors fully. You had to pick two. And, you had to pick the correct two.

Tesla, for example, initially decided to focus on being all-electric and delivering a long range between charges. When Tesla introduced the all-electric Roadster in 2008, it had an impressive >320 km of range and was 100% electric. But, at a starting price of over $120K, few could afford it. Fewer than 2,500 Roadsters were sold. However, the Roadster succeeded as a proof of concept and succeeded at generating interest. Over time, of course, Tesla used what they learned from the Roadster to produce more affordable and practical 100% electric cars, such as the Model S and Model 3. The Model 3 (at a ~$40K list price) sold over 500,000 units last year.

Toyota and others took a different approach. They created cars (e.g., the Prius, plug-in hybrids) that were affordable and had nearly unlimited range but achieved that range by not depending 100% on the battery. These cars delivered many of the most important fuel economy and environmental benefits of 100% electric cars but weren’t 100% electric. However, with an MSRP of ~$22K, the initial Prius model helped create a line that has sold millions of units (over 500K units in 2008, the year the Tesla Roadster launched) and helped Toyota and others fund moving towards 100% electric. By April of 2020, Toyota had sold over 15 million hybrid electric vehicles.

Both Tesla’s and Toyota’s approaches were, in my opinion, successful. Both made a clear choice of two out of three factors to start. Now, both have reached the point where they have long-range, all-electric, and (relatively) affordable offerings. But, both had to compromise on one of the factors in the early days and move to deliver completely on the third factor over time.

By contrast, there were many failed approaches in the electric car industry. These included a plethora of relatively affordable, all-electric vehicles with impractically short ranges (<15 km), which couldn’t be used as a primary vehicle by most consumers (i.e., they picked the wrong two factors). Other approaches failed to deliver sufficiently on any of the three factors (e.g., those that relied on non-existent networks of chargers or battery replacement stations to provide range).

Pick Two (for decentralized storage)

In building a decentralized storage network, we faced a similar set of choices. For us, the three factors are:

1) Being economically attractive and sustainable

2) Being enterprise-grade (security, performance, scalability, durability, service level agreements, etc.)

3) Being decentralized

Because this is a market-based network, the economics must work for users, Storage Node Operators, demand partners, and network operators alike. This is a two-sided marketplace, and we need both supply and demand to work to be sustainable. As an analogy, ride-sharing companies need to set prices low enough to attract riders but high enough so the amount shared with drivers makes driving attractive. Similarly, we need to have an attractive price for storage users but preserve enough margin to make operating a Storage Node work for our Node Operators.

To gain broad adoption and demand, of course, the service can’t just be inexpensive. The service has to work for enterprise apps and users, delivering durability, security, performance, etc., comparable to or better than centralized cloud storage. If we want to move beyond the early adopters and dApp enthusiasts, we need a service that is enterprise-grade and compatible with existing object storage apps.

Finally, of course, we want to deliver on the full decentralized vision, where there are no single points of failure, robust privacy, zero-knowledge, and full user control.

Many prior attempts at decentralized storage (including our own V2 network) failed to deliver fully on at least two of the three factors. For example, many have created impractical or unusable networks in the service of being 100% decentralized. These networks are interesting but have failed to attract large numbers of users or grow beyond a few terabytes, and most have yet to exit alpha or beta. Others are decentralized in terms of the economics, but have made the actual storage of the data weirdly centralized (e.g., services that store data unencrypted and non-redundantly on single services run by large miners).

When we began planning Storj DCS, we believed we needed to fully meet the economically sustainable and enterprise-grade factors to be viable. We believed that we could deliver significant user value by having a largely (but not 100%) decentralized architecture. I guess you could say that we chose the “plug-in hybrid” path for our Storj DCS launch.

We talked extensively on the path to our launch in April 2020 about being enterprise-grade and how we deliver our economics. We’ve continued to improve on these aspects of the service during the 1.5 years since launching and in the run-up to the launch of our upgraded Storj DCS service. Our approach to Satellites enabled us to deliver those factors and deliver them far more quickly. But, how decentralized are we? And, how close are we to delivering on the decentralized vision?

How decentralized is Storj DCS?

Let’s start with what is decentralized about the system:

NODES: Storj DCS is fully decentralized from a Storage Node perspective, delivered on a network of Nodes independently owned and operated by over 13,000 individuals and companies in over 90 countries. The network delivers more than nine 9s of durability, 99.95% availability, and exceptional performance due to this structure. Node reputation is determined algorithmically, and we can withstand the loss of huge numbers of Nodes (including outages caused by widespread power outages and natural disasters) without compromising file durability. We can withstand Nodes run by bad people, incompetent people, and Byzantine behavior without compromising security, durability, or performance. And, this architecture supports economic empowerment and sustainability. Of course, the Storage Node code is open source. Being open source is critical to any decentralized system.

USERS: Anyone can use the system from almost any location, paying in either token or fiat. The system is zero-knowledge and strongly encrypted by default, so user data cannot be mined by anyone or shared without fine-grained user permission. Of course, the user (Uplink) code is also open source.

PAYMENTS: All payments can happen transparently leveraging blockchain. All Node Operators are paid using the STORJ ERC-20 token, delivered on top of the Ethereum network. Users who pay in STORJ token receive a bonus of 10% on top of their deposited amount.
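The NODES point above says the network can lose huge numbers of Nodes without compromising file durability. One way to build intuition for that is a minimal Python sketch of a k-of-n redundancy scheme, where a file is split into n pieces on different Nodes and any k of them are enough to rebuild it. This post does not describe the actual mechanism or its parameters, so the scheme, the function name, and the numbers below are all hypothetical, and the model assumes Nodes fail independently.

```python
from math import comb


def file_loss_probability(n: int, k: int, p_node_lost: float) -> float:
    """Probability that fewer than k of n pieces survive.

    Assumes each piece sits on a different Node and that Nodes are lost
    independently with probability p_node_lost (a simplifying assumption).
    """
    p_survive = 1.0 - p_node_lost
    # Sum the binomial probabilities of having s = 0 .. k-1 surviving pieces.
    return sum(
        comb(n, s) * p_survive**s * p_node_lost ** (n - s)
        for s in range(k)
    )


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    for p in (0.1, 0.3, 0.5):
        print(p, file_loss_probability(n=80, k=30, p_node_lost=p))
```

In this toy model, even if a tenth of all Nodes disappeared at once, the chance of losing a given file stays vanishingly small, because more than 50 of its 80 pieces would have to be lost together. Real-world failures are not fully independent, so this is an intuition aid rather than a durability calculation.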

So, if we’re really decentralized for Nodes, users, and payments, where are we not fully decentralized? The answer is that our Satellites are largely, but not 100%, decentralized.

SATELLITES: The Satellite code is open source, supporting the creation of storage networks completely separate from Storj Labs.

However, our current Storj DCS network depends on a redundant, distributed set of Satellites that are (for the moment) all run by Storj Labs. We have gone to great lengths to make these Satellites as distributed as possible. The Satellites are multi-server instances located in multiple locations worldwide, with industry best practices on uptime, backups, etc. All Satellites are multi-region. A Satellite compromise would cause very little damage in terms of security, as Satellites never hold encryption keys and have only limited (and client-side encrypted) metadata. Similarly, today, Satellites have been designed to tolerate the loss of multiple instances (e.g., Chaos Monkey) or even full data centers (Chaos Gorilla or Chaos Kong) without impacting availability. If an entire multi-region Satellite is lost (worse than Chaos Kong, equivalent to large parts of a whole continent going offline), availability will suffer, but durability will be affected only for data stored since the last snapshot (snapshots are taken hourly). We are adding a write-ahead log to eliminate even this window of exposure to continent-wide outages. However, security is not in danger. Inherent to the design is that no one, not even Storj Labs, can mine or see user data. Not only is data always encrypted, but all possible metadata is also encrypted.
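To illustrate the exposure window described above, here is a small Python sketch that works out which writes would be at risk if a whole Satellite were lost at a given moment. It assumes snapshots run on a fixed one-hour schedule from a known starting time; the function names and timestamps are hypothetical and are not taken from the Satellite code.

```python
from datetime import datetime, timedelta

SNAPSHOT_INTERVAL = timedelta(hours=1)  # "hourly" per the text; exact schedule is assumed


def last_snapshot_before(outage: datetime, first_snapshot: datetime) -> datetime:
    """Most recent snapshot at or before the outage on a fixed hourly schedule."""
    if outage < first_snapshot:
        raise ValueError("outage precedes the first snapshot")
    completed = (outage - first_snapshot) // SNAPSHOT_INTERVAL  # whole intervals elapsed
    return first_snapshot + completed * SNAPSHOT_INTERVAL


def writes_at_risk(write_times, outage, first_snapshot):
    """Writes made after the last snapshot and up to the outage."""
    cutoff = last_snapshot_before(outage, first_snapshot)
    return [t for t in write_times if cutoff < t <= outage]


if __name__ == "__main__":
    first = datetime(2021, 1, 1, 0, 0)
    outage = datetime(2021, 1, 1, 5, 40)
    writes = [
        datetime(2021, 1, 1, 4, 30),
        datetime(2021, 1, 1, 5, 10),
        datetime(2021, 1, 1, 5, 39),
    ]
    print(writes_at_risk(writes, outage, first))
```

In this model the worst-case window is just under one hour. A write-ahead log, as mentioned above, would in effect shrink that window toward zero, since each write would be recorded as it happens rather than waiting for the next snapshot.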

Soon, we’ll enable partners to operate Storj Satellites. And, our midterm roadmap includes enabling failover between Satellites. We certainly hope that the success of the Storj network will encourage others to set up non-Storj-based networks.

Ultimately, of course, we want to progress to the point where Satellites can be run by any competent operator and be part of the decentralized network, much as any router or bridge can be part of the Internet.

For a fuller discussion of the limitations of our current system, see https://www.storj.io/terms-of-service

Conclusion

Ultimately, we feel that we have made the right choices going into this important launch. We’ve delivered an enterprise-grade network with sustainable and attractive economics. The network also delivers almost all of the benefits of decentralization to users and Storage Node Operators. We’ve still got a ways to go, but we hope that history and our users will prove us right.