High-throughput sequencing has revolutionised the field of biological sequence analysis. Its application has enabled researchers to address important biological questions, often for the first time. This book provides an integrated presentation of the fundamental algorithms and data structures that power modern sequence analysis workflows. The topics covered range from the foundations of biological sequence analysis (alignments and hidden Markov models), to classical index structures (k-mer indexes, suffix arrays and suffix trees), Burrows-Wheeler indexes, graph algorithms and a number of advanced omics applications. The chapters feature numerous examples, algorithm visualisations, exercises and problems, each chosen to reflect the steps of large-scale sequencing projects, including read alignment, variant calling, haplotyping, fragment assembly, alignment-free genome comparison, transcript prediction and analysis of metagenomic samples. Each biological problem is accompanied by precise formulations, providing graduate students and researchers in bioinformatics and computer science with a powerful toolkit for the emerging applications of high-throughput sequencing.
List of contents
Notation; Preface; Part I. Preliminaries: 1. Molecular biology and high-throughput sequencing; 2. Algorithm design; 3. Data structures; 4. Graphs; 5. Network flows; Part II. Fundamentals of Biological Sequence Analysis: 6. Alignments; 7. Hidden Markov models (HMMs); Part III. Genome-Scale Index Structures: 8. Classical indexes; 9. Burrows-Wheeler indexes; Part IV. Genome-Scale Algorithms: 10. Read alignment; 11. Genome analysis and comparison; 12. Genome compression; 13. Fragment assembly; Part V. Applications: 14. Genomics; 15. Transcriptomics; 16. Metagenomics; References; Index.
About the author
Veli Mäkinen is a Professor of Computer Science at the University of Helsinki, Finland, where he heads a research group working on genome-scale algorithms as part of the Finnish Center of Excellence in Cancer Genetics Research. He has taught advanced courses on string processing, data compression and biological sequence analysis, along with introductory courses on bioinformatics.
Summary
Outlining the fundamental algorithms and data structures that power modern sequence analysis workflows, this book provides a powerful toolkit for students and researchers in bioinformatics and computer science. Its numerous examples and exercises are designed to help readers understand applications of the latest algorithmic techniques and to provide tools for further research.
Review
'Genome-Scale Algorithm Design is a well-thought-out ... book that fills a gap in the recent literature ... [on algorithms] for bioinformatics. It offers a sound, clear, and rich overview of computer science methods for the challenge of today's biological sequence analysis. I [recommend] it to students as well as to researchers in the field.' Nadia Pisanti, University of Pisa