Last week my
attention was drawn to the forthcoming conference RECOMB-AB 2012: First RECOMB Satellite Conference on Open Problems in Algorithmic Biology:
“RECOMB-AB brings together leading
researchers in the mathematical, computational, and life sciences to discuss
interesting, challenging, and well-formulated open problems in algorithmic
biology.”
As someone
working in the field of “algorithmic biology” (which, I guess, could be defined
as the application of techniques from computer science, discrete mathematics,
combinatorial optimization and operations research to computational biology
problems) I was, predictably, immediately enthusiastic about the conference.
However, what really caught my attention was the following paragraph:
“The discussion panels at RECOMB-AB
will also address the worrisome proliferation of ill-formulated computational
problems in bioinformatics. While some biological problems can be translated
into well-formulated computational problems, others defy all attempts to bridge
biology and computing. This may result in computational biology papers that
lack a formulation of a computational problem they are trying to solve. While
some such papers may represent valuable biological contributions (despite
lacking a well-defined computational problem), others may represent
computational 'pseudoscience.' RECOMB-AB will address the difficult question of
how to evaluate computational papers that lack a computational problem
formulation.”
Calls for participation
rarely strike such a negative tone. However, in this case I think the
conference organizers have highlighted an extremely important point. Problems
arising in computational biology are inherently complex and this entails a
bewildering number of parameters and degrees of freedom in the underlying models.
Furthermore, it is commonplace for computational biology articles to utilize a
large number of intermediate algorithms and software packages to perform
auxiliary processing, and this further compounds the number of unknowns (and the
inaccuracies) in the system.
All this is,
to a certain extent, inevitable. However, this complexity sometimes
seems to have become an end in itself. This would be harmless except for the
fact that scientists subsequently attempt to draw biological conclusions from
this mass of data. Rarely is the question asked: is there actually any “biological
signal” left amongst all those numbers? Would we have obtained similar results
if we had just fed random noise into the system?
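One way to make the random-noise question concrete is a permutation test: run the same analysis on data whose labels have been shuffled, and see how often the shuffled data scores as well as the real data. The sketch below is purely illustrative. The function `pipeline_score` is a hypothetical stand-in for whatever a real analysis pipeline computes (here, simply the difference between two group means), and the group labels, parameter names and defaults are my own assumptions rather than anything taken from a particular paper or tool.

```python
import random


def pipeline_score(values, labels):
    """Hypothetical stand-in for a full analysis pipeline.

    Returns the absolute difference between the means of groups "a" and "b".
    """
    group_a = [v for v, g in zip(values, labels) if g == "a"]
    group_b = [v for v, g in zip(values, labels) if g == "b"]
    return abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))


def permutation_p_value(values, labels, n_permutations=1000, seed=0):
    """Estimate how often shuffled labels score at least as well as the real ones."""
    rng = random.Random(seed)
    observed = pipeline_score(values, labels)
    shuffled = list(labels)  # copy, so the caller's labels are left untouched
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Shuffling keeps the group sizes but destroys any real association.
        rng.shuffle(shuffled)
        if pipeline_score(values, shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    values = [1.0, 1.1, 0.9, 1.2, 1.05, 5.0, 5.1, 4.9, 5.2, 5.05]
    labels = ["a"] * 5 + ["b"] * 5
    print(permutation_p_value(values, labels))
```

A small value suggests that the score is unlikely to arise from noise alone; a value near 1 suggests the pipeline would have reported much the same thing whatever it was fed. Note that even this toy check only works because the score being compared is stated explicitly.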
The fact that these questions are not posed is directly linked to the lack of a clear and explicitly articulated optimization criterion. In other words: just what are we trying to optimize exactly? What makes one solution “better” than another? What, at the end of the day, is the question that we are trying to answer? This is exactly what RECOMB-AB is getting at with the sentence, “This may result in computational biology papers that lack a formulation of a computational problem they are trying to solve”. The wording might be slightly formal, but the point they raise is nevertheless fundamental.
It remains
to be seen what kind of a role phylogenetic networks will play at RECOMB-AB, if
any. Certainly, the field of phylogenetic networks continues to generate a vast
number of fascinating open algorithmic problems. However, are the underlying
biological models precise enough to allow us to say that we are actually
producing biologically meaningful output? Overall, I think the answer is still no.
However, I think that there is reason for optimism. The field is young and
evolving and it is likely that both biologists and algorithmic scientists will have
a significant role in shaping its future. Hopefully this interplay will allow
us to move forward on the biological front without losing sight of the need for
explicit optimization criteria.