Precision medicine and cloud computing are shaking healthcare to its core. At Amadeus Capital Partners, we scout for wide open spaces and as such believe that there is an unprecedented opportunity to build the foundational framework for biodata.

Why now?

Hate to break it to you, but size does matter. Mutations that trigger a strong detrimental effect bring an evolutionary disadvantage, so they tend to be exceedingly rare and require large cohorts to detect. Historically, most clinically relevant gene variants were identified through genome-wide association studies, in which individuals with a given disease are examined to identify associated genetic signatures.

Population programs are usually led by governments and dominant research institutions. The UK has its 100,000 Genomes Project, focused on cancers, hereditary disorders and infectious diseases. On the other side of the Atlantic, the Million Veteran Program aims to build a database of the genetic, lifestyle and health information of veterans; it recently attracted its 500,000th volunteer, making it one of the broadest studies in the world. All of it thanks to Iceland’s deCODE project, now aged 20 and the poster child of population genetics.

I’m givin’ her all she’s got, Captain! Large studies have historically been the bedrock of personalised medicine. But there are catches.

A by-product of scale, lengthy collection periods and inelastic protocols is obsolescence. For instance, it is now accepted that the ultimate dataset contains genetic and phenotypic information collected longitudinally. Yet, even today, too few programs are designed to collect real-life data; most focus on genetics alone, missing the chance to understand triggers and outcomes.

Then there are issues around data accessibility and the relative lack of standardization. In theory, the tens of thousands of patients sequenced all over the world could already provide the evidence to make more clinical decisions with confidence. In practice, results are trapped within repositories that frequently differ in accessibility, reliability, usability, security, and scale — universal translator, anyone?

The final nail in the coffin is that, whilst patients are possibly the central component of healthcare and a primary source of phenotypic data, they are the furthest removed from the equation. More on that topic in a future post.

Taken in isolation, any of the above points is a material limiting factor. So, when it all happens simultaneously and the effects get amplified by the sheer size and the pivotal place of population studies, large programs suddenly appear in a new light: pillars of the entire ecosystem, or single points of failure in disguise?

As genome sequencing technology evolves and datasets get more complex and scattered over the world, the situation becomes untenable. Time to regroup and allow for dynamic aggregation of biodata.

Beam me up, Scotty. Pockets of activity are burgeoning. Folks are joining hands and concurrent studies are taking off at an accelerated rate. Enticed by the anticipated network effects of new collaborative models, Flatiron Health and Foundation Medicine, for instance, started depositing data on the NIH-sponsored Genomic Data Commons (targeting 50,000 patient cases by the end of 2017). AstraZeneca, Merck Group and Boehringer Ingelheim are part of a growing fleet pledging allegiance to Repositive.io, the world’s largest genomic data hub (1.1 million genomic datasets already accessible).

The likes of Illumina and Roche are coming on board, leveraging their venture arms to get closer to the data their hardware generates. Cloud giants too have joined the armada. Amazon, Google and Microsoft are certainly not pulling punches when it comes to being the ones hosting prominent programs. And the software layer for curation and analytics is thickening by the day. Alliances are forming, and it’s getting crowded and unstructured. Fair to say many won’t make it unless a reference framework emerges.

The United Federation of Phosphates. We think that as factions gradually come together, the leading coalition will be an overarching network of heterogeneous, autonomous and complementary data hubs, sharing a code of conduct (aka protocols and data visualisation models) and offering on-demand access to their assets (data, tools, skill sets).

To build on the existing diversity, the initial system will function as a federation, with specialised and more homogeneous sub-groups possibly organised in a distributed fashion. To the user, it will be one integrated marketplace, where the metadata will be freely available and the underlying data only accessible on a transactional basis, unless already in the public domain.
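
For the technically minded, here is a minimal Python sketch of that access model: a federated catalogue where anyone can search the metadata, while the records behind it are released only for public datasets or after an explicit grant. Every name in it (`Hub`, `Federation`, `grant`, the dataset fields, the users) is hypothetical and made up for illustration; it is a toy under those assumptions, not a description of any existing platform.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    dataset_id: str
    metadata: dict          # freely searchable description of the dataset
    records: list           # the underlying data, access-controlled
    public: bool = False    # public-domain datasets need no grant


@dataclass
class Hub:
    """One autonomous data hub (hypothetical) that keeps control of its own data."""
    name: str
    datasets: dict = field(default_factory=dict)
    grants: set = field(default_factory=set)  # (user, dataset_id) pairs

    def add(self, dataset: Dataset) -> None:
        self.datasets[dataset.dataset_id] = dataset

    def describe(self) -> list:
        # Metadata only: no records leave the hub here.
        return [{"hub": self.name, "id": d.dataset_id, **d.metadata}
                for d in self.datasets.values()]

    def grant(self, user: str, dataset_id: str) -> None:
        # Stands in for whatever transaction the hub requires.
        self.grants.add((user, dataset_id))

    def fetch(self, user: str, dataset_id: str) -> list:
        dataset = self.datasets[dataset_id]
        if dataset.public or (user, dataset_id) in self.grants:
            return dataset.records
        raise PermissionError(f"{user} has no grant for {dataset_id}")


class Federation:
    """Single entry point over several hubs: one marketplace to the user."""

    def __init__(self, hubs):
        self.hubs = {hub.name: hub for hub in hubs}

    def search(self, **criteria) -> list:
        # Match on metadata fields across every hub.
        return [entry for hub in self.hubs.values() for entry in hub.describe()
                if all(entry.get(k) == v for k, v in criteria.items())]

    def fetch(self, user: str, hub_name: str, dataset_id: str) -> list:
        return self.hubs[hub_name].fetch(user, dataset_id)


# Made-up example data.
hub_a = Hub("hub_a")
hub_a.add(Dataset("ds1", {"assay": "WGS", "condition": "disease_A"}, ["rec1", "rec2"]))
hub_b = Hub("hub_b")
hub_b.add(Dataset("ds2", {"assay": "WGS", "condition": "disease_B"}, ["rec3"], public=True))

federation = Federation([hub_a, hub_b])
hits = federation.search(assay="WGS")          # metadata from both hubs
open_data = federation.fetch("alice", "hub_b", "ds2")  # public, no grant needed
hub_a.grant("bob", "ds1")                      # the "transaction"
paid_data = federation.fetch("bob", "hub_a", "ds1")
```

The design choice worth noting is that the federation never copies records: it only routes requests, so each hub stays autonomous and enforces its own access rules.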

To boldly go where no man has gone before. Expect cross-fertilisation as silos break down and diverse skill sets get added to the mix. With the projected explosion in genomic data, it is critical to have tools that can be used as easily by consumers and physicians as by geneticists.

The first development phase will possibly be about improving database design, access, visualisation, interoperability and data anonymization. Building it on a blockchain infrastructure could embed principles of privacy, security and traceability, critical to generating intellectual property. Success will eventually depend on coherent data sharing policies.

Imagine creating synthetic cohorts of patients, assembled from a variety of sources and defined by specific clinical or genomic parameters. One would compare them to other groups to analyse survival, treatment response and other outcomes. They will also be used to develop algorithms for variant interpretation and diagnostic applications, and eventually guide the development of personalised therapies. In a cloud environment, where one can dial the computational power up and down, lots of simultaneous questions can be asked of any aggregated dataset. The possibilities are endless.
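
To make the idea concrete, here is a small Python sketch of a synthetic cohort: records pooled from two hypothetical hubs, filtered on one clinical and one genomic parameter, then split into groups for comparison. The record fields, hub names, labels and numbers are all invented for illustration, and the comparison uses a plain median that ignores censoring; a real survival analysis would more likely rely on something like a Kaplan-Meier estimate.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class PatientRecord:
    source: str             # hypothetical hub the record came from
    diagnosis: str          # clinical parameter
    variant: str            # genomic parameter
    treated: bool
    survival_months: float


def build_cohort(sources, diagnosis, variant):
    """Pool every record, from any source, that matches both criteria."""
    return [record
            for records in sources
            for record in records
            if record.diagnosis == diagnosis and record.variant == variant]


def median_survival(cohort):
    """Naive summary: median survival in months, or None for an empty group."""
    return median(r.survival_months for r in cohort) if cohort else None


# Made-up toy records, not real patient data.
hub_a = [
    PatientRecord("hub_a", "disease_A", "variant_X", True, 30.0),
    PatientRecord("hub_a", "disease_A", "variant_Y", True, 14.0),
]
hub_b = [
    PatientRecord("hub_b", "disease_A", "variant_X", True, 26.0),
    PatientRecord("hub_b", "disease_A", "variant_X", False, 12.0),
    PatientRecord("hub_b", "disease_B", "variant_X", False, 20.0),
]

cohort = build_cohort([hub_a, hub_b], "disease_A", "variant_X")
treated = [r for r in cohort if r.treated]
untreated = [r for r in cohort if not r.treated]

print(len(cohort), median_survival(treated), median_survival(untreated))
```

Neither hub alone holds enough matching patients to compare the two groups; the cohort only exists once the sources are pooled.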

It is when pooled together that the combined data starts talking, delivering value through new insights, new applications, which in turn generate more insights and more data. A pure network effect, right there, and the whole gamut of new business models.

Resistance is futile.

#datasharing #digitalhealth #genomics #bigdata #marketplace #PrecisionMedicine #healthcare #ArtificialIntelligence #ai

Pierre Socha is a Principal in the Early Stage Funds team at Amadeus Capital Partners.