Science rests on data; of that there can be no doubt. But peer through the hot haze of hype surrounding the use of big data in biology and you will see plenty of cold facts that suggest we need fresh thinking if we are to turn the swelling ocean of “omes” – genomes, proteomes and transcriptomes – into new drugs and treatments.
The relatively meagre returns from the Human Genome Project reflect how DNA sequences do not translate readily into an understanding of disease, let alone treatments. The rebranding of “personalised medicine” – the idea that decoding the genome will lead to treatments tailored to the individual – as “precision medicine” reflects the dawning realisation that using the -omes of groups of people to develop targeted treatments is quite different from using a person’s own genome.
Because we are all ultimately different, the only way to use our genetic information to predict how an individual will react to a drug is to have a profound understanding of how the body works, so that we can model the way each person will absorb and interact with the drug molecule. This is tough to do right now, so the next best thing is precision medicine, where we look at how genetically similar people react and then assume that a given person will respond in a similar way.
Even the long-held dream that drugs can be routinely designed from the atomic structures of proteins – pinpointing the site in a protein where a drug acts – has not been realised.
Most importantly, the fact that “most published research findings are false”, as famously reported by John Ioannidis, an epidemiologist from Stanford University, underlines that data is not the same as facts; one critical dataset – the conclusions of peer-reviewed studies – is not to be relied on without evidence of good experimental design and rigorous statistical analysis. Yet many now claim that we live in the “data age”. If you count research findings themselves as an important class of data, it is very worrying to find that they are more likely to be false than true.
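The statistical logic behind Ioannidis’s claim can be sketched with a back-of-envelope calculation. The numbers below are illustrative assumptions, not his exact figures: if only a small fraction of the hypotheses a field tests are actually true, then even well-powered studies with a standard 5 per cent false-positive rate will produce more false positives than true ones.

```python
# Illustrative, assumed numbers (not Ioannidis's own figures):
prior = 0.01   # fraction of tested hypotheses that are actually true
power = 0.80   # chance a real effect produces a positive result
alpha = 0.05   # chance a null effect produces a (false) positive

true_positives = prior * power
false_positives = (1 - prior) * alpha

# Positive predictive value: of all "significant" findings, how many are real?
ppv = true_positives / (true_positives + false_positives)
print(f"Share of positive findings that are true: {ppv:.1%}")
```

Under these assumptions, only around one in seven published positive results would reflect a real effect – the counter-intuitive arithmetic at the heart of the argument.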
“There’s no doubt of the impact of big data, which could contribute more than £200 billion to the UK economy alone over five years,” says Roger Highfield, director of external affairs at the Science Museum, London. But “the worship of big data has encouraged some to make the extraordinary claim that this marks the end of theory and the scientific method”.
The worship of big data downplays many issues, some profound. To make sense of all this data, researchers are using a type of artificial intelligence known as neural networks. But no matter their “depth” and sophistication, they merely fit curves to existing data. They can fail in circumstances beyond the range of the data used to train them. All they can, in effect, say is that “based on the people we have seen and treated before, we expect the patient in front of us now to do this”.
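The extrapolation problem described above can be demonstrated in a few lines. As a stand-in for a neural network, the sketch below fits a flexible polynomial curve to noisy data drawn from a known function – both are curve-fitters, and both share the same weakness: the fit is excellent inside the range of the training data and can diverge wildly beyond it. The specific function, noise level and polynomial degree are arbitrary choices for illustration.

```python
import numpy as np

# Generate noisy training data from a known curve on the interval [0, 3].
rng = np.random.default_rng(0)
x_train = np.linspace(0, 3, 50)
y_train = np.sin(x_train) + rng.normal(0, 0.05, x_train.size)

# Fit a flexible curve (degree-7 polynomial) to the training data.
model = np.poly1d(np.polyfit(x_train, y_train, deg=7))

# Inside the training range, the fitted curve tracks the truth closely...
err_in = abs(model(1.5) - np.sin(1.5))
# ...but outside it, the prediction has nothing to anchor it.
err_out = abs(model(6.0) - np.sin(6.0))

print(f"error at x=1.5 (inside training range):  {err_in:.3f}")
print(f"error at x=6.0 (outside training range): {err_out:.3f}")
```

The first error is a rounding-sized residual; the second is enormous. Nothing in the fitting procedure warns you that the second prediction is worthless – the model simply reports a number, just as a trained network will for a patient unlike any it has seen.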