Fr. 79.00
John Cheng, Max Grossman, Ty McKercher
Professional CUDA C Programming
English · Paperback
Usually ships within 2 to 3 weeks (the title is printed to order)
Description
Break into the powerful world of parallel GPU programming with this down-to-earth, practical guide
Designed for professionals across multiple industrial sectors, Professional CUDA C Programming presents the fundamentals of CUDA, a parallel computing platform and programming model designed to ease the development of GPU programs, in an easy-to-follow format, and teaches readers how to think in parallel and implement parallel algorithms on GPUs. Each chapter covers a specific topic and includes workable examples that demonstrate the development process, allowing readers to explore both the "hard" and "soft" aspects of GPU programming.
Computing architectures are experiencing a fundamental shift toward scalable parallel computing, motivated by application requirements in industry and science. This book demonstrates the challenges of efficiently utilizing compute resources at peak performance and presents modern techniques for tackling these challenges, while remaining accessible to professionals who are not necessarily parallel programming experts. The CUDA programming model and tools empower developers to write high-performance applications on a scalable, parallel computing platform: the GPU. However, CUDA itself can be difficult to learn without extensive programming experience. Recognized CUDA authorities John Cheng, Max Grossman, and Ty McKercher guide readers through essential GPU programming skills and best practices in Professional CUDA C Programming, including:
* CUDA Programming Model
* GPU Execution Model
* GPU Memory Model
* Streams, Events, and Concurrency
* Multi-GPU Programming
* CUDA Domain-Specific Libraries
* Profiling and Performance Tuning
The book makes complex CUDA concepts easy to understand for anyone with knowledge of basic software development, with exercises designed to be both readable and high-performance. For the professional seeking entrance to parallel computing and the high-performance computing community, Professional CUDA C Programming is an invaluable resource, with the most current information available on the market.
Table of contents
FOREWORD
PREFACE
INTRODUCTION
CHAPTER 1: HETEROGENEOUS PARALLEL COMPUTING WITH CUDA
Parallel Computing
Sequential and Parallel Programming
Parallelism
Computer Architecture
Heterogeneous Computing
Heterogeneous Architecture
Paradigm of Heterogeneous Computing
CUDA: A Platform for Heterogeneous Computing
Hello World from GPU
Is CUDA C Programming Difficult?
Summary
CHAPTER 2: CUDA PROGRAMMING MODEL
Introducing the CUDA Programming Model
CUDA Programming Structure
Managing Memory
Organizing Threads
Launching a CUDA Kernel
Writing Your Kernel
Verifying Your Kernel
Handling Errors
Compiling and Executing
Timing Your Kernel
Timing with CPU Timer
Timing with nvprof
Organizing Parallel Threads
Indexing Matrices with Blocks and Threads
Summing Matrices with a 2D Grid and 2D Blocks
Summing Matrices with a 1D Grid and 1D Blocks
Summing Matrices with a 2D Grid and 1D Blocks
Managing Devices
Using the Runtime API to Query GPU Information
Determining the Best GPU
Using nvidia-smi to Query GPU Information
Setting Devices at Runtime
Summary
CHAPTER 3: CUDA EXECUTION MODEL
Introducing the CUDA Execution Model
GPU Architecture Overview
The Fermi Architecture
The Kepler Architecture
Profile-Driven Optimization
Understanding the Nature of Warp Execution
Warps and Thread Blocks
Warp Divergence
Resource Partitioning
Latency Hiding
Occupancy
Synchronization
Scalability
Exposing Parallelism
Checking Active Warps with nvprof
Checking Memory Operations with nvprof
Exposing More Parallelism
Avoiding Branch Divergence
The Parallel Reduction Problem
Divergence in Parallel Reduction
Improving Divergence in Parallel Reduction
Reducing with Interleaved Pairs
Unrolling Loops
Reducing with Unrolling
Reducing with Unrolled Warps
Reducing with Complete Unrolling
Reducing with Template Functions
Dynamic Parallelism
Nested Execution
Nested Hello World on the GPU
Nested Reduction
Summary
CHAPTER 4: GLOBAL MEMORY
Introducing the CUDA Memory Model
Benefits of a Memory Hierarchy
CUDA Memory Model
Memory Management
Memory Allocation and Deallocation
Memory Transfer
Pinned Memory
Zero-Copy Memory
Unified Virtual Addressing
Unified Memory
Memory Access Patterns
Aligned and Coalesced Access
Global Memory Reads
Global Memory Writes
Array of Structures versus Structure of Arrays
Performance Tuning
What Bandwidth Can a Kernel Achieve?
Memory Bandwidth
Matrix Transpose Problem
Matrix Addition with Unified Memory
Summary
CHAPTER 5: SHARED MEMORY AND CONSTANT MEMO
About the authors
John Cheng, PhD, is a Research Scientist at BGP International in Houston. He has developed seismic imaging products with GPU technology and many high-performance parallel production applications on heterogeneous computing platforms. Max Grossman is an expert in GPU computing with experience applying CUDA to problems in medical imaging, machine learning, geophysics, and more. Ty McKercher has worked at NVIDIA since 2008, helping customers adopt GPU acceleration technologies.
Summary
Professional CUDA C Programming provides down-to-earth coverage of the complex topic of parallel computing, a topic increasingly essential in everyday computing. This entry-level programming book for professionals turns complex subjects into easy-to-comprehend concepts and easy-to-follow steps.
Product details
Authors | John Cheng, Max Grossman, Ty McKercher |
Publisher | John Wiley and Sons Ltd |
Language | English |
Format | Paperback |
Publication date | 07.10.2014 |
EAN | 9781118739327 |
ISBN | 978-1-118-73932-7 |
Pages | 528 |
Dimensions | 190 mm x 238 mm x 25 mm |
Categories |
Natural sciences, medicine, computer science, technology
> Computer science, EDP
> Computer science
Computer science, parallel processing, parallel and distributed computing, CUDA |