This practical book teaches the skills that scientists need for turning large sequencing datasets into reproducible and robust biological findings. Many biologists begin their bioinformatics training by learning languages like Perl and R alongside the Unix command line. But there's a huge gap between knowing a few programming languages and being prepared to analyze large amounts of biological data.
Rather than teach bioinformatics as a set of workflows that are likely to change with this rapidly evolving field, this book demonstrates the practice of bioinformatics through data skills. Rigorous assessment of data quality and of the effectiveness of tools is the foundation of reproducible and robust bioinformatics analysis. Through open source and freely available tools, you'll learn not only how to do bioinformatics, but how to approach problems as a bioinformatician.
Go from handling small problems with messy scripts to tackling large problems with clever methods and tools
Focus on high-throughput (or "next generation") sequencing data
Learn data analysis with modern methods, rather than covering older theoretical concepts
Understand how to choose and implement the best tool for the job
Delve into methods that lead to easier, more reproducible, and robust bioinformatics analysis
About the author
Vince Buffalo is a bioinformatician at the UC Davis Department of Plant Sciences, in Jorge Dubcovsky's wheat genomics lab. Before this, he was the primary statistical programmer at the UC Davis Genome Center's Bioinformatics Core where he analyzed many diverse genomics datasets. An obsessive programmer since he was a young teenager, Vince was drawn to the statistical and computational problems of genomics. He works on open source bioinformatics tools in his work and free time, and enjoys fly fishing and cooking when away from the computer.
Summary
Learn the data skills necessary for turning large sequencing datasets into reproducible and robust biological findings. With this practical guide, you'll learn how to use freely available open source tools to extract meaning from large, complex biological datasets.