User data in Web3: Nirvana or Nightmare?
How should we think about managing user data, when we're now the owners?
Where do we point the fingers for data abuses in a world in which we own our data? The data-hungry tech companies have become political whipping boys for good reason, exploiting our data for their profit. Fuelled for blitz-scale growth by competitive VCs (and testosterone…), young (male) founders eventually replaced righteous missions with plausible business models - which ended up being targeted advertising. This requires mountains of data and software geniuses building opioid-level addictiveness. Web3 offers the promise that we’ll own our own data - no middlemen to exploit our data, but also none to blame.
What does all this mean for user data, and in particular the most valuable user data of all, health data?
Pharma companies are willing to pay up to $10,000 for health data about a single user, according to one of the speakers at an innovation forum that we recently held to explore this issue. EiC’s partner, The Collective and King’s College London’s ARK hosted a workshop on web3 and user data at King’s in London on April 21, 2022.
Challenges around user data and Web3 are numerous: we have data silos due to use case-specific solutions, the fact that our data is owned and managed by big tech companies (in particular makers of wearable devices) and the lack of standards around exchangeability of data.
Below are ten takeaways, the list of speakers and a link to the attendees - a diverse and eclectic bunch. Note - the meeting was held under Chatham House rules, so individuals are not quoted by name.
Five innovation ‘lenses’ for the discussion
We identified five ‘lenses’ that we applied to the topics around Web3 and user data:
User need. Do users want decentralized data? Many are happy with Apple etc., so why would they want to own their own data rather than have it managed by experts? Do we need to re-introduce intermediaries / stewards?
Usability. Today’s experience of most Web3 applications and crypto in general is very poor - partly due to the nascent state of the industry and partly because by definition, there is no centralised gatekeeper who can control and optimise the user experience.
Privacy. It seems most anonymised data can be de-anonymised, and genetic data is by its nature unique to the individual. How do we ensure privacy and security of valuable health data?
Value. Despite the push for independence and user control, the assumption is that user data will be valued and purchased by corporations. Do they want it? In what form will it need to be to be useful to them?
Interoperability. The promise of Web3.0 is data ownership and that requires interoperability among different stakeholders. How realistic is that, and where are we on that journey?
10 key messages
1. “Web3 is a paradigm shift, not a get-rich-quick scheme”
Most people will be familiar with Web3 via Bitcoin and cryptocurrencies, or perhaps Meta’s championing of the metaverse. The promise of Web3 is much broader than new currencies or 3D avatars - it’s the promise of genuine decentralization and broader participation by individuals themselves creating a new infrastructure and economic models (broadly called ‘tokenomics’). The web and the marketplaces built on top of it will potentially change from being managed by one vendor to a group orchestration challenge.
None of the participants in the meeting spent much time talking about currencies - participants emphasised the structural shifts that were coming in tech architectures, business models and user experiences, and all noted how early we were in the evolution of the industry and how things are evolving and iterating rapidly.
It’s a cliché but also true to say that as consumers, in Web2.0 if we’re not paying for the product, we are the product. In Web3, it’s a different story, as individuals have a new role - they ‘bring their own data’. A core tenet is the removal of intermediaries, so it’s not just a question about what policies the data-keepers need to follow, but how to manage an entirely new infrastructure of participation with different roles and different players.
“I believe that time to value for blockchain and web3 technologies will be faster than IoT has been. This was just more of a realisation than an insight from the conversation. With that said, every new technology looks stupid, scary, worthless and weird - until it becomes valuable through key use cases that impact people's day to day.”
2. “The Web3 user experience is broken today”
The lack of intermediaries means that technology and data interfaces need to do the heavy lifting and are generally falling short - an overall confusion about the space, security concerns (losing keys, risk of scams etc), imperfect interoperability and more point to an industry early in its development.
“It’s a mess out there”
“It is unclear whether we will spend considerable time defining a universal one-size-fits-all approach to enable its secondary use or settle with multiple possible scenarios, which will depend on the level of involvement of data owners (patients, wearable users, institutions) we want to maintain. Whichever it is, blockchain, opportunities for tokenisation, and gamification certainly contribute to involvement, retention and more complete "data profiles", at least on the patient level.”
There was consensus on a call to reduce complexity and make data collection non-intrusive. There is also a need to build technologically robust and trustable consent mechanisms - so people don’t just mindlessly click on EULAs like they do today, but have genuine buy-in and participation.
3. “We’re already using ‘tokens’ today for data”
We are already benefiting from “tokenised” personal data rewards from e.g. shopping points, airmiles, fuel, insurance trackers in cars. Web3 evolves this model by removing the middleman, but should retain the benefits of user data. Many bright minds are now focused on how this will happen. DAOs will be a big part of the answer:
“DAOs will be important platforms to promote & incentivise the secure collection and sharing of consumer wellness data.”
Tokenizing data is extending into the healthcare space. In addition to ‘traditional’ health data (health results and clinical updates), we’re seeing new categories of data play a role in predicting outcomes such as wellness information (e.g. steps), activity data (e.g. shopping) and sentiment (e.g. mental state). As mentioned above, some medical records are worth $10k+ to pharma companies. In reality most data is locked inside apps and hospital firewalls due to business choices and regulations (e.g. GDPR).
There needs to be greater understanding and dissemination of the differences between tokens and supermarket points for someone who’s new to the space. And there should be better science-based quantification and communication of the value of data, considering value is often situational, combinatorial and on-demand as opposed to a fixed price per data point.
4. “The digital divide could get even bigger”
There is a need to address the digital divide, fairness and engagement. The impact of the digital divide will be heightened as people have more control over their data - a significant challenge especially for older people and those on the wrong side of the digital divide. In the developing world, even micro-transactions could be a roadblock. Given the new role played by the individual, additional work will be needed to ensure that those participating are truly representative of the population at large. There will need to be scenarios for diverse cases (sick vs. healthy people, individuals vs. healthcare institutions, patients vs. consumers), and new discussions around a reliable and 'fair' model of data ownership. Further, engagement will be key - ensuring that individuals who now own and control their data are engaged and motivated to manage it becomes even more important. In Web2.0, identity and password management has become an increasing burden, and that is in a scenario where we still have trusted intermediaries to manage our accounts.
Part of addressing the digital divide is to reduce the costs of wearables (financial, but even more importantly the costs in attention, commitment and time) and to lower transaction costs.
5. “We won’t get genuine data ownership without hardware”
Data today all goes back to hardware providers, so individuals may get some control but they don’t have ownership of their own data. This will change in a future in which the hardware itself is trusted:
“Hardware in which the user directly owns the data and has full control over it would be of great importance. And also creating different levels of trust around the quality of the data. And then you have trust on different levels - how it's used and how it's valued. So I think the trusted hardware in the long term will also play a very important part in the creation of a preventative ecosystem.”
‘Staking’ for data quality. ‘Staking’ is widely used in Web3 to prove the efficacy of the network, and the Forum heard how it could also be used to guarantee data quality. There is an incentive for people to say they have high quality data as it is more valuable, and so new mechanisms are emerging to allow people to look at data quality. Three markers of high quality data are conformance, completeness and plausibility - owners of data can use staking as a kind of ‘escrow’ account to guarantee data quality. This is an interesting area that needs further investigation.
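To make the ‘escrow’ idea more concrete, here is a minimal sketch in Python of how a stake could be settled against the three quality markers. Everything in it is an assumption for illustration: the record schema, the step-count limit, the 0.9 threshold and the all-or-nothing refund rule are hypothetical, and nothing here runs on a blockchain or reflects a mechanism presented at the Forum.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

# Hypothetical schema for a wearable step-count record (an assumption).
REQUIRED_FIELDS = {"user_id": str, "date": str, "steps": int}
MAX_PLAUSIBLE_STEPS = 100_000  # assumed upper bound for one day


def _fraction(records: List[Record], check: Callable[[Record], bool]) -> float:
    """Share of records that pass a check (0.0 for an empty data set)."""
    if not records:
        return 0.0
    return sum(1 for r in records if check(r)) / len(records)


def is_complete(record: Record) -> bool:
    """Completeness: every required field is present and not None."""
    return all(record.get(name) is not None for name in REQUIRED_FIELDS)


def is_conformant(record: Record) -> bool:
    """Conformance: every required field has the expected type."""
    return all(isinstance(record.get(name), kind)
               for name, kind in REQUIRED_FIELDS.items())


def is_plausible(record: Record) -> bool:
    """Plausibility: the step count falls within a believable range."""
    steps = record.get("steps")
    return isinstance(steps, int) and 0 <= steps <= MAX_PLAUSIBLE_STEPS


def quality_scores(records: List[Record]) -> Dict[str, float]:
    """Score a data set on the three markers, each between 0 and 1."""
    return {
        "completeness": _fraction(records, is_complete),
        "conformance": _fraction(records, is_conformant),
        "plausibility": _fraction(records, is_plausible),
    }


def settle_stake(records: List[Record], stake: float,
                 threshold: float = 0.9) -> float:
    """Refund the full stake only if every marker meets the threshold."""
    scores = quality_scores(records)
    return stake if min(scores.values()) >= threshold else 0.0
```

In this sketch a data owner who stakes 50 tokens on clean data gets all 50 back, while a data set where half the step counts are negative forfeits the stake. A real scheme would need to decide who runs the checks and what happens to forfeited stakes.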
Web3 is about creating trustless environments - architecting the system so that you don’t need to trust anyone; the system operates regardless. Relating to trust, an interesting comment was made that “anonymous” data is rarely that. 80%+ of ‘anonymous’ data sets can be de-anonymised with some simple techniques. There’s a large “shadow economy” of personal data that is poorly understood.
6. “A ‘Basic Attention Token’ could turn Facebook on its head”
This referenced a powerful idea for a token that allows our attention to be valued and valuable. Web3 marketplaces could generate more value for the data, since with today’s model only e.g. Facebook has your data. With a Web3 approach you could create a bottom-up, emergent model. In the old web you couldn’t do it - Facebook’s models are hidden from the rest of us. This topic feels like it needs its own research track too.
Related to this is the idea that decentralisation doesn’t mean you lose the value of data analysis, machine learning and AI. There is a misconception that if you decentralise data you don’t get the benefits of machine learning needed to reap the benefits of AI. Enter ‘federated learning’. This is a way for analysis to happen on data that is distributed across a network rather than being centralised.
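For readers who want to see the mechanics, below is a minimal sketch of one common federated approach, federated averaging, in plain Python. The one-parameter model (y = weight * x), the learning rate and the toy data are all assumptions chosen for illustration. Each participant trains on their own data locally, and only the resulting model weight is shared and averaged.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held by one participant


def local_update(weight: float, data: Dataset,
                 lr: float = 0.1, steps: int = 20) -> float:
    """Train on one participant's private data with gradient descent.

    The model is y = weight * x with mean squared error loss.
    The raw data never leaves this function; only the weight is returned.
    """
    for _ in range(steps):
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def federated_round(global_weight: float, participants: List[Dataset]) -> float:
    """Average the locally trained weights, weighted by data set size."""
    updates = [(local_update(global_weight, data), len(data))
               for data in participants]
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def train(participants: List[Dataset], rounds: int = 10) -> float:
    """Run several rounds of federated averaging from a zero start."""
    weight = 0.0
    for _ in range(rounds):
        weight = federated_round(weight, participants)
    return weight


# Toy example: two participants whose private data both follow y = 2x.
participants = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(0.5, 1.0), (1.5, 3.0), (3.0, 6.0)],
]
learned_weight = train(participants)
```

Here the shared model recovers a weight very close to 2 without either participant revealing a single data point. Production systems typically add further protections, since model updates themselves can leak information.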
7. “As with Web2.0, incentives will be key”
How incentives work for people, in particular behaviour change, is key. We need to better understand who is getting rewarded for what, and which stakeholder(s) are involved (doctor, data gatherer/wearer, web3 company):
“The main question I have here is around the concept of rewards - mass audience/mainstream, the question of as people were talking it was clear to me that most people in the general public don't care - they automatically trust technology and don't ask too many questions until it is too late as consumers did in web2. I believe the general public will do the same for web3, so I hope the answer to the question above is thought through before mass adoption.”
8. “Web3 can resolve the tension between privacy and value”
Blockchain offers the ability to solve the tension between data that’s all locked up (and therefore not valuable in a marketplace) vs. data that’s bought and sold with impunity, with all the privacy implications that brings with it. In Web3, users’ preferences can be stored transparently, allowing them to make the decision about who gets to use their data. ‘Differential privacy’ was also referenced: a technique that adds noise to data to improve privacy.
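As an illustration of the noise-adding idea, here is a minimal sketch of the Laplace mechanism, one standard way to achieve differential privacy. It adds noise to the answer of a counting query rather than to each raw record. The function names, the step-count data and the choice of epsilon are assumptions made for this example.

```python
import random
from typing import Callable, List


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise.

    The difference of two independent exponential samples with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values: List[int], predicate: Callable[[int], bool],
                  epsilon: float, rng: random.Random) -> float:
    """Answer 'how many values match?' with epsilon-differential privacy.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon. A smaller
    epsilon means more noise and stronger privacy.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Toy example: how many users walked at least 10,000 steps?
daily_steps = [4200, 12500, 9800, 15000, 10100, 3000]
rng = random.Random(0)  # seeded only to make the example repeatable
noisy_answer = private_count(daily_steps, lambda s: s >= 10_000,
                             epsilon=1.0, rng=rng)
```

The true answer here is 3, and the noisy answer will land somewhere near it. Any single response hides whether a particular person is in the data, while averages over many users remain useful.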
Tokens allow data to be made valuable by enabling microtransactions. This can better align the interests of the different stakeholders (e.g. hospitals working with EMR, mobile health vendors and patients). Aligning interests is more viable since revenues can be shared via virtual currencies with institutions and patients themselves. Governance is also possible - tokens that come from activity also provide governance rights.
9. “EU’s Data Governance Act could be bigger than GDPR”
The new EU legislation aims to re-open the space that was closed down with GDPR and creates opportunities for altruistic use of data. Technologies for compliant data sharing are ready to scale. The Data Governance Act (DGA) will ease some of the cultural and organisational resistance to leveraging them, especially for hospitals.
This matters because data that influences health outcomes is far broader than what is considered ‘healthcare data’ today. The need to create personalised healthcare surfaces very specific challenges when it comes to data collection and management; we basically need massive amounts of data.
“Once we move towards personalisation, we need a lot of data on many small use cases within the population, meaning that we cannot really come to hospital and pool blood samples and run some machine learning on top of this data, we actually have to think how can we aggregate data on specific or very often rare diseases or people with different phenotypes? And how we can learn from all that distributed data?”
A broad array of lifestyle and activity data - more than just the traditional clinical data and medical records - has the potential to deliver insights for health outcomes.
One interesting tidbit was that the role of research and academia could change, given the new availability of rich data sets with participant consent, validated data, and new distribution models (e.g. marketplaces). Much of this was formerly the preserve of universities and academics.
10. “Not everything needs to be a transaction”
Smart voices have pointed out the dangers of Web3 becoming a more commercial version of Web2.0; each version of the Web starts out idealistic and ends up feeling tawdry. How will this be different, especially as there’s more value at stake?
This raises ethical and moral issues. Not everything needs to be a transaction or stored on a ledger.
“Should I be paying to access my X-rays?”
Final thoughts on next steps
The Forum surfaced multiple areas that invite further exploration as well as identifying the organisations and individuals most willing and able to drive a positive new vision of Web3 and user data at scale.
The reality is there’s a lot of noise and confusion about the benefits of Web3 to the individual, the evolution of user data, the prospects of exchangeability of data, the relationship to Web2.0 and the environmental footprint of different approaches, among other things. A number of participants suggested the need for pilot testing capabilities that integrate with and extend Web2.0 to Web3.
Quick, informative iterations on Web3 apps in real-world scenarios are needed to move the ball forward. Strategic partnerships between hospitals and companies can be the way to refine solutions and bring them jointly to market. Integrating into, or building on top of, existing systems and perceptions will help to share and disseminate best practices and learnings.
Your thoughts are welcome as we continue to iterate and build on these key insights from the discussion.
Presentations:
Speakers & attendees
The event was hosted by Stephen Johnston (EiC & The Collective) and Richard Siow (ARK), and brought together 34 attendees, mostly from startups in the Web3 space, and 8 speakers representing the UK, USA and Netherlands.
Stephen Johnston, Founder, The Collective (UK, Australia)
Richard Siow, Director, Ageing Research at King's (ARK) (UK)
Svitlana Surodina, CEO, Stein Group & EIR ARK (UK)
Rosanne Warmerdam, Co-founder & CEO, HealthBlocks BV & The Pando Network (NL)
Adam Sobol, Founder & CEO, ThirdWave (Remote) (USA)
Daniel Hirschmann, CEO, Hirsch & Mann (Remote) (UK)
Davide Zaccagnini, Co-founder & CEO, Agora (Remote) (USA)
Sergey Jakimov, Co-founder, LongeVC & Longevity Science Foundation (Remote) (Latvia)
The full list of registered attendees, including speakers, is here.