Data Privacy

The regulatory landscape for data privacy changed in 2018, when the European Union’s sweeping General Data Protection Regulation (GDPR) came into effect. Data privacy regulations, GDPR included, are often seen as counter-intuitive and difficult to align with the permissionless, distributed, and immutable nature of public ledgers. The pace of change in data privacy continues to accelerate, and regulatory uncertainty persists among businesses and consumers alike. This holds true globally. For example, in the US, three states have passed personal data privacy legislation since 2018, including the California Consumer Privacy Act (CCPA), while many other states have data privacy bills moving through the legislative process.
These data privacy regulations share common principles and grant similar rights to the data subject. For example, both GDPR and CCPA grant rights to correct, update, or delete data held in a business database and require businesses to obtain consent for certain collection, processing, transfer, or sale of personal data. Software solutions, including blockchain-based ones, must be built with these requirements in mind. Scrutiny of the way organizations manage data privacy rights has never been higher. Regulators have full enforcement agendas; consumers across the globe are learning how they can take action to protect their own data privacy; and enterprises, large and small, are tackling the operational, technical, and reputational challenges this creates for day-to-day operations. In parallel, significant uncertainty remains about how these data privacy regulations apply to blockchain technology.
For enterprise users of blockchain technology in particular, data privacy warrants significant consideration. From a legal perspective, technology used within an enterprise ecosystem must meet specific regulatory demands that vary by jurisdiction.
It is therefore essential to build a solution that complies with regulatory data privacy demands. The main challenges in reconciling data privacy regulations with distributed ledger technologies (DLTs) are data mutability, data residency, and data democracy, which are covered in the following sections.
Data Mutability

Under the GDPR and other regulations, some data subject rights require the modification of previously collected data. For example, data subjects may have the right to correct inaccurate or outdated data, commonly referred to as the right of rectification, and the right to delete collected data, commonly referred to as the right of erasure. Blockchains are often described as immutable or tamper-proof ledgers or distributed databases. This immutability is portrayed as a key enabler of trust in the blockchain, because it provides resistance to malicious modifications. Once a transaction is added to a block and that block is added to the end of the chain, then after some number of additional blocks are added, the transaction in question is effectively written in stone, as it would be impractical for an attacker to modify it. Immutability therefore appears to make it difficult or impossible to satisfy the requirements of the right of erasure, at least with regard to data stored on-chain.
The fundamental challenge in supporting a user’s right to demand modification or erasure of personal data kept on an immutable chain is not that deleting a given piece of data is impossible, but rather that doing so would prevent subsequent validation of the chain. Specifically, the running hash that links each block to its predecessors would be invalidated if data were deleted.
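The effect of erasure on the running hash can be illustrated with a minimal sketch. It assumes a toy chain in which each block's SHA-256 hash covers its payload plus the previous block's hash; the block layout and the names `block_hash`, `build_chain`, and `is_valid` are hypothetical and do not correspond to any particular blockchain implementation.

```python
import hashlib

GENESIS = "0" * 64  # assumed placeholder for the hash before the first block


def block_hash(prev_hash, payload):
    """Running hash: covers this block's payload and the previous block's hash."""
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(payloads):
    """Build a toy chain as a list of dicts, linking each block to the one before."""
    chain = []
    prev = GENESIS
    for payload in payloads:
        current = block_hash(prev, payload)
        chain.append({"payload": payload, "prev_hash": prev, "hash": current})
        prev = current
    return chain


def is_valid(chain):
    """Recompute every hash and check every link back to the genesis value."""
    prev = GENESIS
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        if block_hash(block["prev_hash"], block["payload"]) != block["hash"]:
            return False
        prev = block["hash"]
    return True


chain = build_chain([
    "alice pays bob 5",
    "customer record: Jane Doe",  # personal data stored on-chain
    "bob pays carol 2",
])
valid_before = is_valid(chain)

# Honor an erasure request by blanking the personal data in place.
chain[1]["payload"] = ""
valid_after = is_valid(chain)

print(valid_before, valid_after)
```

In this sketch the chain validates before the erasure and fails validation afterwards, because the stored hash of the second block no longer matches its (now empty) payload. Recomputing that hash would in turn break the link from the third block, which is why the deletion cannot be made quietly.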
Data Residency

On a typical public blockchain network, the network data is distributed and replicated across many geographically dispersed nodes. Data privacy regulations often recognize that protections may be rendered useless if data can be freely transferred to another jurisdiction with less stringent data privacy requirements. In the context of GDPR, for example, there are detailed requirements about the transfer of EU data to, or viewing of such data from, other jurisdictions. In general, the recipient of this data must be under a legally binding obligation to follow GDPR data protection principles or their equivalent.
Data Democracy

Data democracy means that information is controlled by those who generate it. This offers a more holistic view of data ownership, usage, and rights. It may sound straightforward, but in a world where personal and enterprise data is exchanged with multiple parties, providers, governments, and others, it is a complex task.
Status Quo: Giant data silos and monopolistic control

Today, “cloud” is simply a slogan for data storage on someone else’s computer. When data is held in this centralized and siloed way by a third party, users are exposed to threats such as censorship, surveillance, and access restrictions, which can affect their autonomy and decision-making. For example, in fall 2019, a modification in export law required that U.S. companies block users connecting from Syria, Iran, Venezuela, and Cuba. Without warning, users were unable to access their data. Some Silicon Valley business models rely on monopolistic control of user data and interaction. This is why approaches to data democracy have not been pursued by many in the technology industry to date. User data is held hostage. It is sold to the highest bidder and not controlled by those who generate it.
Peer-to-peer networks: One way to overcome monopolistic data silos would be to host data on peer-to-peer networks, similar to BitTorrent, instead of a cloud or centralized server. This approach does eliminate centralization, but it has been shown to have major drawbacks. First, there has to be one peer-to-peer connection for every relationship. Consider a complex supply chain with hundreds of involved parties: this would require thousands of individual connections to be maintained. Second, and potentially even worse, is the lack of trust between parties. No one can be sure that their data is not being manipulated by the counterparty of a P2P connection; that is, there is no way to prove data integrity.
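As a rough worked example of the connection count, assume the worst case in which every party exchanges data directly with every other party (a full mesh). This is an assumption for illustration; a real supply chain may need fewer links. The helper name `max_pairwise_connections` is hypothetical.

```python
def max_pairwise_connections(parties):
    """Upper bound on distinct P2P links: one per unordered pair of parties."""
    return parties * (parties - 1) // 2


for n in (10, 100, 300):
    print(n, "parties ->", max_pairwise_connections(n), "connections")
# 10 parties -> 45 connections
# 100 parties -> 4950 connections
# 300 parties -> 44850 connections
```

Under this assumption the number of links grows quadratically with the number of parties, so even a network of a hundred participants already reaches the thousands of connections mentioned above.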
(Classical) Blockchain: Blockchain technology seems to address these issues. It has proven to be a clever mechanism that facilitates transactions across a web of potentially untrusted computers. It is also a distributed system and uses some of the same computer science concepts as peer-to-peer applications. But blockchains need a consensus mechanism, because public transactions are mediated between participants, any of whom may be malicious.
These trustless transactions are the key assumption baked into blockchains, and they distinguish blockchains from peer-to-peer applications. Most public blockchains, like Bitcoin and Ethereum, employ actors called “miners” for the consensus mechanism, who run computationally expensive algorithms in exchange for monetary compensation. This incentivizes the purchase of ever larger and more expensive data centers, rewarding miners who are able to take larger financial risks. In other words, the rich get richer.
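To make the "computationally expensive algorithms" concrete, here is a toy proof-of-work sketch. It assumes a simplified puzzle (find a nonce such that the SHA-256 hex digest of the data plus nonce starts with a given number of zeros); the functions `mine` and `verify` and the difficulty value are hypothetical, and real networks use different block formats and target encodings.

```python
import hashlib


def digest_for(data, nonce):
    """SHA-256 hex digest of the block data concatenated with a candidate nonce."""
    return hashlib.sha256(f"{data}{nonce}".encode("utf-8")).hexdigest()


def mine(data, difficulty):
    """Brute-force search for a nonce whose digest has `difficulty` leading hex zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = digest_for(data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(data, nonce, difficulty):
    """Checking a solution takes a single hash, unlike finding one."""
    return digest_for(data, nonce).startswith("0" * difficulty)


# Toy difficulty: roughly 16**4 = 65,536 attempts are expected on average.
nonce, digest = mine("block with transactions", difficulty=4)
print(nonce, digest)
print(verify("block with transactions", nonce, 4))
```

Each additional hex zero multiplies the expected search effort by 16 while verification stays a single hash. In this simplified model, a participant who can afford more hashing hardware finds valid nonces proportionally more often, which is the dynamic behind the concentration described above.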
Returning to the issue of data privacy, we see that this trend towards fewer, richer miners also leads to a centralized structure. There is a famous picture, taken a few years ago, showing fewer than 10 people who together represented about 90% of Bitcoin’s mining power. This handful of miners, some of them unknown and none bound by any contract, could decide the fate of the Bitcoin network. For example, they had the power to manipulate transaction data or halt the network. In this early model, the only protections the network had against these kinds of "bad actors" were monetary incentives. It follows that these classical proof-of-work (PoW) blockchains are not suited to data storage or processing when data privacy is taken into account.
Although some blockchain proponents may claim that proof-of-work is ‘trustless’, technology is not neutral, which in practice means that you have to trust someone at some point. Within these classical blockchains, one does not know who processes and stores the data, which rules and laws they follow (if any), or where and how the data is stored.