Challenges and Opportunities for Novel Maths in Cheminformatics

Professor Douglas Kell (School of Chemistry and Manchester Institute of Biotechnology)

Alan Turing Building, Frank Adams Room 1

Reasoning about small molecules is at the core of drug discovery, and the ability to do this computationally underpins the whole field known as cheminformatics. It brings many interesting questions that cut across pure and applied maths, computer science, data analytics, multivariate statistics and machine learning.

In the first half of the seminar I will set out some example problems that I believe have the potential to inspire novel mathematics and novel approaches.

These will include:

  • What is the best mathematical representation of a molecule?
  • Is graph theory the best way? Bitstrings and hash functions?
  • How do I assess molecular similarity, and which similarity coefficients and clustering methods should I use?
  • How do I best navigate a database of 22 million molecules?
  • Are there numerical transformations that make reasoning about molecular structures much easier (in some sense)?
  • How do I best relate structures to activities?
  • What is the best way to analyze molecules in terms of their substructures (a known NP-hard problem)?
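To make the bitstring and similarity-coefficient questions concrete, here is a minimal sketch, not a real cheminformatics fingerprint. It assumes a molecule is given as a SMILES string, hashes each run of three characters into one position of a 1024-bit string, and compares two bitstrings with the Tanimoto (Jaccard) coefficient: bits set in both, divided by bits set in either. The names toy_fingerprint and tanimoto, the fragment length and the bit count are all illustrative choices. Production fingerprints hash paths or neighbourhoods in the molecular graph rather than characters in a string.

```python
import hashlib

N_BITS = 1024  # assumed fingerprint length, chosen only for illustration


def toy_fingerprint(smiles: str, n: int = 3, n_bits: int = N_BITS) -> int:
    """Hash every character n-gram of a SMILES string into a bitstring.

    The bitstring is stored as a Python int. This is a toy: it looks at
    text fragments, not at chemical substructures.
    """
    bits = 0
    # Strings shorter than n still contribute one (short) fragment.
    for i in range(max(len(smiles) - n + 1, 1)):
        fragment = smiles[i:i + n]
        digest = hashlib.sha256(fragment.encode("utf-8")).digest()
        position = int.from_bytes(digest[:4], "big") % n_bits
        bits |= 1 << position  # distinct fragments may collide on one bit
    return bits


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient: |A AND B| / |A OR B| over the set bits."""
    union = bin(a | b).count("1")
    if union == 0:
        return 1.0  # convention assumed here: two empty bitstrings are identical
    return bin(a & b).count("1") / union


if __name__ == "__main__":
    ethanol = toy_fingerprint("CCO")
    propanol = toy_fingerprint("CCCO")
    benzene = toy_fingerprint("c1ccccc1")
    print("ethanol vs propanol:", tanimoto(ethanol, propanol))
    print("ethanol vs benzene: ", tanimoto(ethanol, benzene))
```

Even this toy exposes several of the questions listed: hash collisions lose information, the same molecule can be written as different SMILES strings and so receive different bitstrings, and the coefficient measures overlap of features rather than any chemical notion of similarity.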

The second half will allow for an open exchange of interesting directions we might pursue.
