Heterogeneous Computing with OpenCL, 2nd Edition

Revised OpenCL 1.2 Edition

 
Heterogeneous Computing with OpenCL, 2nd Edition, by Benedict Gaster, Lee Howes, David Kaeli, Perhaad Mistry, and Dana Schaa (ISBN 9780124058941)
 
 

Imprint: Morgan Kaufmann

ISBN: 9780124058941

ISBN: 9780124055209

Pages: 308

Dimensions: 235 X 191

Learn parallel programming with CPUs, GPUs, and APUs, from OpenCL community leaders

eBook formats: VST (VitalSource Bookshelf); DRM-free EPUB, Mobi (for Kindle), and PDF
 
 

Key Features

  • Explains principles and strategies to learn parallel programming with OpenCL, from understanding the four abstraction models to thoroughly testing and debugging complete applications.
  • Covers image processing, web plugins, particle simulations, video editing, performance optimization, and more.
  • Shows how OpenCL maps to an example target architecture and explains some of the tradeoffs associated with mapping to various architectures.
  • Addresses a range of fundamental programming techniques, with multiple examples and case studies that demonstrate OpenCL extensions for a variety of hardware platforms.

Description

Heterogeneous Computing with OpenCL teaches OpenCL and parallel programming for complex systems that may include a variety of device architectures: multi-core CPUs, GPUs, and fully integrated Accelerated Processing Units (APUs) such as AMD Fusion technology. Designed to work on multiple platforms and with wide industry support, OpenCL will help you more effectively program for a heterogeneous future.

Written by leaders in the parallel computing and OpenCL communities, this book will give you hands-on OpenCL experience to address a range of fundamental parallel algorithms. The authors explore memory spaces, optimization techniques, graphics interoperability, extensions, and debugging and profiling. Intended to support a parallel programming course, Heterogeneous Computing with OpenCL includes detailed examples throughout, plus additional online exercises and other supporting materials.

Readership

Software engineers, programmers, hardware engineers, students / advanced students

Benedict Gaster

Benedict R. Gaster is a software architect working on programming models for next-generation heterogeneous processors, in particular looking at high-level abstractions for parallel programming on the emerging class of processors that contain both CPUs and accelerators such as GPUs. Benedict has contributed extensively to OpenCL's design and has represented AMD at the Khronos Group open standard consortium. Benedict has a Ph.D. in computer science for his work on type systems for extensible records and variants.

Affiliations and Expertise

OpenCL Architect, AMD

Lee Howes

Lee Howes has spent the last two years working at AMD and currently focuses on programming models for the future of heterogeneous computing. Lee's interests lie in declaratively representing mappings of iteration domains to data and in communicating complicated architectural concepts and optimizations succinctly to a developer audience, both through programming model improvements and education. Lee has a Ph.D. in computer science from Imperial College London for work in this area.

Affiliations and Expertise

Member of Technical Staff, AMD

David Kaeli

David Kaeli received a BS and PhD in Electrical Engineering from Rutgers University, and an MS in Computer Engineering from Syracuse University. He is the Associate Dean of Undergraduate Programs in the College of Engineering and a Full Professor on the ECE faculty at Northeastern University, Boston, MA, where he directs the Northeastern University Computer Architecture Research Laboratory (NUCAR). Prior to joining Northeastern in 1993, Kaeli spent 12 years at IBM, the last 7 at T.J. Watson Research Center, Yorktown Heights, NY. Dr. Kaeli has co-authored more than 200 critically reviewed publications. His research spans a range of areas, from microarchitecture to back-end compilers and software engineering. He leads a number of research projects in the area of GPU computing. He presently serves as the Chair of the IEEE Technical Committee on Computer Architecture. Dr. Kaeli is an IEEE Fellow and a member of the ACM.

Affiliations and Expertise

Northeastern University, Boston, MA, USA

Perhaad Mistry

Perhaad Mistry works in AMD’s developer tools group at the Boston Design Center, focusing on developing debugging and performance profiling tools for heterogeneous architectures. He is presently focused on debugger architectures for upcoming shared-memory and discrete Graphics Processing Unit (GPU) platforms. Perhaad has been working on GPU architectures and parallel programming since CUDA 0.8 in 2007. He has enjoyed implementing medical imaging algorithms for GPGPU platforms and architecture-aware data structures for surgical simulators. Perhaad's present work focuses on the design of debuggers and architectural support for performance analysis for the next generation of applications that will target GPU platforms. Perhaad graduated after 7 years with a PhD from Northeastern University in Electrical and Computer Engineering and was advised by Dr. David Kaeli, who leads the Northeastern University Computer Architecture Research Laboratory (NUCAR). Even after graduating, Perhaad is still a member of NUCAR and is advising on research projects on performance analysis of parallel architectures. He received a BS in Electronics Engineering from University of Mumbai and an MS in Computer Engineering from Northeastern University in Boston. He is presently based in Boston.

Affiliations and Expertise

Northeastern University, Boston, MA, USA

Dana Schaa

Dana Schaa received a BS in Computer Engineering from Cal Poly, San Luis Obispo, and an MS and PhD in Electrical and Computer Engineering from Northeastern University. He works on GPU architecture modeling at AMD, and has interests and expertise that include memory systems, microarchitecture, performance analysis, and general purpose computing on GPUs. His background includes the development of OpenCL-based medical imaging applications ranging from real-time visualization of 3D ultrasound to CT image reconstruction in heterogeneous environments. Dana married his wonderful wife Jenny in 2010, and they live together in San Jose with their charming cats.

Affiliations and Expertise

Northeastern University, Boston, MA, USA

Heterogeneous Computing with OpenCL, 2nd Edition

Foreword to the Revised OpenCL 1.2 Edition

Foreword to the First Edition

Preface

Our Heterogeneous World

OpenCL

This Text

Acknowledgments

About the Authors

Chapter 1. Introduction to Parallel Programming

Introduction

OpenCL

The Goals of This Book

Thinking Parallel

Concurrency and Parallel Programming Models

Structure

Reference

Further Reading and Relevant Websites

Chapter 2. Introduction to OpenCL

Introduction

Platform and Devices

The Execution Environment

Memory Model

Writing Kernels

Full Source Code Example for Vector Addition

Vector Addition with C++ Wrapper

Summary

Reference

Chapter 3. OpenCL Device Architectures

Introduction

Hardware trade-offs

The architectural design space

Summary

References

Chapter 4. Basic OpenCL Examples

Introduction

Example Applications

Compiling OpenCL Host Applications

Summary

Chapter 5. Understanding OpenCL’s Concurrency and Execution Model

Introduction

Kernels, Work-Items, Workgroups, and the Execution Domain

OpenCL Synchronization: Kernels, Fences, and Barriers

Queuing and Global Synchronization

The Host-Side Memory Model

The Device-Side Memory Model

Summary

Chapter 6. Dissecting a CPU/GPU OpenCL Implementation

Introduction

OpenCL on an AMD Bulldozer CPU

OpenCL on the AMD Radeon HD7970 GPU

Memory Performance Considerations in OpenCL

Summary

References

Chapter 7. Data Management

Memory management

Data transfer in a discrete environment

Data placement in a shared-memory environment

Example application—work group reduction

References

Chapter 8. OpenCL Case Study: Convolution

Introduction

Convolution Kernel

Conclusions

Code Listings

Reference

Chapter 9. OpenCL Case Study: Histogram

Introduction

Choosing the Number of Workgroups

Choosing the Optimal Workgroup Size

Optimizing Global Memory Data Access Patterns

Using Atomics to Perform Local Histogram

Optimizing Local Memory Access

Local Histogram Reduction

The Global Reduction

Full Kernel Code

Performance and Summary

Chapter 10. OpenCL Case Study: Mixed Particle Simulation

Introduction

Overview of the Computation

GPU Implementation

CPU Implementation

Load Balancing

Performance and Summary

Kernel for Uniform Grid Creation

Kernels for Simulation

Chapter 11. OpenCL Extensions

Introduction

Overview of Extension Mechanism

Device Fission

Double Precision

References

Chapter 12. Foreign Lands: Plugging OpenCL In

Introduction

Beyond C and C++

Haskell OpenCL

Summary

References

Chapter 13. OpenCL Profiling and Debugging

Introduction

Profiling with events

AMD Accelerated Parallel Processing Profiler

AMD Accelerated Parallel Processing KernelAnalyzer

Walking through the AMD APP Profiler

Debugging OpenCL Applications

Overview of gDEBugger

AMD Printf Extension

Conclusion

Chapter 14. Performance Optimization of an Image Analysis Application

Introduction

Description of the algorithm

Migrating multithreaded CPU implementation to OpenCL

Performance optimization

Power and performance analysis

Conclusion

References

Index

Quotes and reviews

"With parallel computing now in the mainstream, this book provides an excellent reference on the state-of-the-art techniques in accelerating applications on CPU-GPU systems."--David A. Bader, Georgia Institute of Technology

"Intended for software architects and engineers, this guide to OpenCL examines potential uses and practical application of the cross platform programming language for heterogeneous computing. The work explores the use of OpenCL to design and produce scalable applications that have the ability to be optimized for processor core and GPU usage. Chapters cover an overview of OpenCL, basic examples, CPU/GPU implementation and extensions. Illustrations and sample code, as well as sections outlining case studies for the use of OpenCL in several common situations, are provided."--SciTech Book News

"I always enjoy reviewing later editions of a book…this book does not disappoint. It is definitely worth the time spent reading it."--ComputingReviews.com, September 27, 2013

 
 
Free Shipping
Shop with Confidence

Free Shipping around the world
▪ Broad range of products
▪ 30 days return policy
FAQ

Contact Us