Many data-intensive applications that use machine learning or artificial intelligence techniques depend on humans to provide the initial dataset, which enables algorithms to process the rest, or on other humans to evaluate the performance of such algorithms. Not only can labeled data for training and evaluation be collected faster, more cheaply, and more easily than ever before, but we now also see the emergence of hybrid human-machine software that combines computations performed by humans and machines. There are, however, practical real-world issues with the adoption of human computation and crowdsourcing. Building systems and data processing pipelines that require crowd computing remains difficult. In this book, we present practical considerations for designing and implementing tasks that combine humans and machines, with the goal of producing high-quality labels.
List of contents
Preface.- Acknowledgments.- Introduction.- Designing and Developing Microtasks.- Quality Assurance.- Algorithms and Techniques for Quality Control.- The Human Side of Human Computation.- Putting All Things Together.- Systems and Data Pipelines.- Looking Ahead.- Bibliography.- Author's Biography.
About the author
Omar Alonso is a Principal Data Scientist Lead at Microsoft in Silicon Valley, where he works at the intersection of social media, information retrieval, knowledge graphs, and human computation. He holds a Ph.D. from the University of California at Davis and an undergraduate degree from UNICEN, Argentina.