Data Privacy Frameworks & Decentralized Storage

John Gleeson and Katherine Johnson

January 13, 2022

This is the third in a four-part series on data privacy and compliance

In our previous post in this series, we took a closer look at laws that regulate how individuals’ personal information is stored and managed. Because these regulatory frameworks were designed to solve privacy issues specific to on-premise and centralized cloud storage, they don’t apply in the usual ways to decentralized storage solutions.

Here, we explain how decentralization enables applications to store data more privately and securely, even when data privacy regulations don’t apply to them in the same way that they do to older technologies.

Unique Challenges for Decentralized Storage

Data privacy regulations were designed against the backdrop of technology infrastructure that has evolved from on-premise applications to include traditional centralized cloud storage services.

Decentralized cloud applications are architected to be private and censorship resistant. They achieve that privacy by ensuring, at the code level, that it can’t be compromised.

Beyond strong encryption and zero trust architecture, decentralized applications also differ from centralized ones in two significant ways: where the data is located and how it is monitored.

Data Location

Data stored on a decentralized network like Storj DCS is typically distributed over a number of different nodes and devices that share storage capacity with a decentralized service. In the case of Storj, that’s 13,000 nodes in 100 different countries today.

Data is broken up into segments that are encrypted, erasure coded, and distributed over these nodes, so the only thing stored on any node is an erasure coded piece of an encrypted file. A node may store tens, hundreds, or thousands of pieces, but never more than one of the 80 pieces for any particular segment. No node ever has access to any complete object, any complete file, or any unencrypted data.
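To make the placement rule concrete, here is a minimal sketch in Python. It is not how Storj selects nodes. It assumes nodes are picked uniformly at random, and the function names are hypothetical. The only figures taken from this post are the 80 pieces per segment and the 13,000 nodes. The sketch shows that sampling without replacement guarantees no node holds two pieces of the same segment, while a node can still hold pieces from many different segments.

```python
import random
from collections import Counter

PIECES_PER_SEGMENT = 80  # figure quoted in this post
NODE_COUNT = 13_000      # figure quoted in this post
SEGMENT_COUNT = 500      # arbitrary number of segments for the demo


def place_segment(segment_id, node_ids, rng):
    """Assign each piece of one segment to a different node.

    rng.sample draws without replacement, so a node cannot be
    chosen twice for the same segment.
    """
    chosen = rng.sample(node_ids, PIECES_PER_SEGMENT)
    return {(segment_id, piece_no): node for piece_no, node in enumerate(chosen)}


def max_pieces_per_segment(placement):
    """Largest number of pieces of any single segment held by one node."""
    per_segment_node = Counter(
        (segment_id, node) for (segment_id, _piece_no), node in placement.items()
    )
    return max(per_segment_node.values())


def pieces_per_node(placement):
    """Total number of pieces (across all segments) held by each node."""
    return Counter(placement.values())


rng = random.Random(0)  # seeded so the demo is repeatable
nodes = list(range(NODE_COUNT))

placement = {}
for segment_id in range(SEGMENT_COUNT):
    placement.update(place_segment(segment_id, nodes, rng))

print(max_pieces_per_segment(placement))         # 1
print(max(pieces_per_node(placement).values()))  # busiest node's piece count
```

With 40,000 pieces spread over 13,000 nodes, some nodes necessarily hold several pieces, but each of those pieces belongs to a different segment.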

One key aspect of decentralized applications is that the infrastructure itself is crowdsourced and can be operated almost anywhere in the world. That infrastructure is run predominantly by third parties, so when data is stored on a decentralized storage network, it must be transferred to hardware operated by third parties.

In contrast, when data is transferred between third parties in centralized storage, the entire data set is moved. It’s typically encrypted, but the third party may or may not have access to the encryption keys, depending on whether it will use or process the data.

Data privacy regulations are largely geared toward these 1:1 transfers. When data is transferred between third parties, that transfer is typically governed by a contract describing what can and can’t be done with the data, and an element of trust is required that the data won’t be misused.

In the case of decentralized storage, the architecture is extremely secure and private for the very reason that the software is designed to eliminate that layer of trust between third parties. It’s not that the infrastructure operators agree not to access the data; it’s that they can’t.

Data privacy laws frequently include data residency requirements, which are meant to ensure that data is not transferred to foreign jurisdictions that might have less stringent data protections. A decentralized network spreads data across nodes in many countries, yet distributing the data actually makes it more secure, since no third party has access to all of it. Moreover, availability and durability are improved without compromising privacy.

Data Monitoring

Data monitoring guards against loss, corruption, or misuse, and regulations rely on contracts to enforce this level of protection. It’s not that you can’t do the thing they don’t want you to do; it’s that you’re contractually obligated to behave in a certain way with regard to data handling. A legal contract says little about what can and can’t be done from a technical standpoint. It commits the parties in word, but not necessarily in deed.

The compliance concerns that come with centralized systems include a lack of efficiency, a lack of security, and the need to trust the provider. Ideally, we wouldn’t have 50% of the world’s data held by a handful of providers, but we do now. You have to trust that centralized provider and hope that these legal agreements are enough to protect you. You also need the time, patience, and training to go through them and understand what they’re saying and how your data is being handled.

In a decentralized service, these functions are both programmatic and monitored through statistical audits, and all of this is done while the data remains in an encrypted state at all times. Regulations haven’t yet adapted to this significant difference between Web 2.0 and Web 3.0 architectures.
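As a rough illustration of what a statistical audit over encrypted data could look like, here is a small Python sketch. It is not Storj’s audit protocol. It assumes a simplified scheme in which an auditor records a SHA-256 digest of each encrypted piece at upload time and later spot-checks a random sample. The `Node` class, the `ledger` structure, and the function names are all hypothetical. The point it shows is that the check runs entirely on opaque bytes, so the auditor never needs a decryption key.

```python
import hashlib
import random


class Node:
    """Hypothetical storage node holding opaque, already encrypted pieces."""

    def __init__(self):
        self.pieces = {}

    def store(self, piece_id, data):
        self.pieces[piece_id] = data

    def fetch(self, piece_id):
        return self.pieces.get(piece_id)


def digest(data):
    """SHA-256 digest of a piece, computed over the encrypted bytes."""
    return hashlib.sha256(data).hexdigest()


def audit(nodes, ledger, sample_size, rng):
    """Spot-check a random sample of pieces and return the IDs that fail.

    ledger maps piece_id -> (node_index, expected_digest).
    """
    sample = rng.sample(sorted(ledger), min(sample_size, len(ledger)))
    failures = []
    for piece_id in sample:
        node_index, expected = ledger[piece_id]
        data = nodes[node_index].fetch(piece_id)
        if data is None or digest(data) != expected:
            failures.append(piece_id)
    return failures


rng = random.Random(1)  # seeded so the demo is repeatable
nodes = [Node() for _ in range(10)]
ledger = {}

# Upload 200 stand-in "encrypted" pieces (random bytes) round-robin.
for piece_id in range(200):
    data = bytes(rng.getrandbits(8) for _ in range(32))
    node_index = piece_id % len(nodes)
    nodes[node_index].store(piece_id, data)
    ledger[piece_id] = (node_index, digest(data))

healthy = audit(nodes, ledger, 50, rng)
print(healthy)  # [] because nothing has been altered yet

# Simulate one misbehaving node that corrupts everything it holds.
bad_ids = set(nodes[3].pieces)
for piece_id in bad_ids:
    nodes[3].store(piece_id, b"corrupted")

sampled_failures = audit(nodes, ledger, 50, rng)
full_failures = audit(nodes, ledger, len(ledger), rng)
```

A sampled audit only catches the corrupted pieces that happen to land in the sample, which is why such checks are statistical. Repeating them over time makes it increasingly unlikely that a misbehaving node goes unnoticed.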

Next in the Series: 3 Data Privacy Trends to Watch

Even with increased data privacy regulations, we expect to see more high-profile breaches and concerns about data privacy.

In our next post, we’ll present three data privacy trends, and share ways that Storj is working to help our users manage the complex web of privacy regulations so they can make better decisions about data storage.
