Machine learning algorithms are becoming increasingly valuable for drug development. According to a recent paper, ‘Use of machine learning approaches for novel drug discovery’, they can now be applied at several stages of the drug discovery process. These include ‘the prediction of target structure, prediction of biological activity of new ligands through model construction, discovery or optimization of hits, and construction of models that predict the pharmacokinetic and toxicological (ADMET) profile of compounds.’
AstraZeneca’s announcement today that it has joined forces with Human Longevity, a US sequencing and machine learning company, to sequence 2 million genomes is therefore no surprise, but it does represent a step up in the scale of such projects. AstraZeneca will be able to use Human Longevity's database of 1 million genomic and health records alongside 500,000 DNA samples from its own clinical trials. Building the new database is likely to take as long as a decade, and the project will also include sequences from samples donated over the past 15 years.
Machine learning algorithms train a computer to find answers in datasets on its own, discovering patterns that let it improve its performance over time. Clearly, the volume of data AstraZeneca is analyzing is far beyond what humans could work through unaided, and the project is expected to uncover patterns in the genomes that lead to insights that would otherwise be impossible to garner, helping to better identify treatments and match them to patients.
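To make the idea of "discovering patterns" concrete, here is a toy sketch in Python. It is not AstraZeneca's or Human Longevity's method, and it is a simple association scan rather than a full machine learning pipeline. The data is entirely made up, and the names `patients`, `response_rate` and `variant_scores` are hypothetical. It checks which genetic variant is most strongly linked to whether a patient responded to a treatment.

```python
# Toy data (entirely made up): one entry per patient.
# First item: 1 if the patient carries variant 0, 1, 2, else 0.
# Second item: 1 if the patient responded to a hypothetical treatment.
patients = [
    ([1, 0, 1], 1),
    ([1, 1, 0], 1),
    ([1, 0, 0], 1),
    ([0, 1, 1], 0),
    ([0, 0, 1], 0),
    ([0, 1, 0], 0),
    ([1, 1, 1], 1),
    ([0, 0, 0], 1),
]


def response_rate(rows):
    """Fraction of patients in `rows` who responded (0.0 if empty)."""
    return sum(label for _, label in rows) / len(rows) if rows else 0.0


def variant_scores(rows):
    """For each variant, response rate of carriers minus non-carriers."""
    n_variants = len(rows[0][0])
    scores = []
    for i in range(n_variants):
        carriers = [r for r in rows if r[0][i] == 1]
        others = [r for r in rows if r[0][i] == 0]
        scores.append(response_rate(carriers) - response_rate(others))
    return scores


scores = variant_scores(patients)
# The variant whose presence changes the response rate the most.
best = max(range(len(scores)), key=lambda i: abs(scores[i]))
print("scores per variant:", scores)
print("most strongly associated variant:", best)
```

With three variants and eight patients, a person could spot the pattern by eye. The point of automating it is that the same scan still works when there are millions of genomes and millions of variants, which is where real projects rely on far more sophisticated models and careful statistics.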
The implications of machine learning for drug discovery are tremendous. A paper released by Google Research last year further explored the issue.