The Machine Learning Approach
An unprecedented wealth of data is being generated by genome sequencing projects and other experimental efforts to determine the structure and function of biological molecules. The demands and opportunities for interpreting these data are expanding more than ever. Biotechnology, pharmacology, and medicine will be particularly affected by the new results and the increased understanding of life at the molecular level. Bioinformatics is the development and application of computer methods for analysis, interpretation, and prediction, as well as for the design of experiments. It has emerged as a strategic frontier between biology and computer science. Machine learning approaches (e.g., neural networks, hidden Markov models, and belief networks) are ideally suited for areas where there is a lot of data but little theory—and this is exactly the situation in molecular biology. As with its predecessor, statistical model fitting, the goal in machine learning is to extract useful information from a body of data by building good probabilistic models. The particular twist behind machine learning, however, is to automate the process as much as possible.In this book, Pierre Baldi and Søren Brunak present the key machine learning approaches and apply them to the computational problems encountered in the analysis of biological data. The book is aimed at two types of researchers and students. First are the biologists and biochemists who need to understand new data-driven algorithms, such as neural networks and hidden Markov models, in the context of biological sequences and their molecular structure and function. Second are those with a primary background in physics, mathematics, statistics, or computer science who need to know more about specific applications in molecular biology.