The task of Quantitative Structure-Activity Relationship (QSAR) learning is to learn a function that, given the structure of a small molecule (a potential drug), outputs the predicted activity of the compound in an assay (a test that indicates the compound's potential as a drug). Multi-Task Learning (MTL) is a machine learning approach that exploits commonalities between learning tasks by learning the tasks together.
QSAR learning is well suited to MTL because assays often share commonalities. In particular, many assays target proteins, and these proteins may be related. For example, many QSAR studies have targeted the dihydrofolate reductase (DHFR) proteins from Plasmodium falciparum and P. vivax in the search for potential antimalarial drugs. The DHFRs from P. falciparum and P. vivax are similar, but not identical. It is therefore reasonable to learn QSARs for both targets at the same time using MTL.
The two Plasmodium DHFRs are homologous, i.e., they evolved from a common ancestral protein. This allows a natural metric, evolutionary distance, to be inferred between the two targets from their sequence similarity: the smaller this distance, the more recently the targets shared a common ancestor, and the more effective we expect MTL to be. We are unaware of any previous MTL studies that use such a natural metric between problems. In this paper we show that MTL can improve on standard QSAR learning through the use of related targets, and that MTL QSAR is further improved by incorporating the evolutionary distance between targets.