GPU Study Group - Session 1 Quick Reference
🔑 Key Concepts Cheat Sheet
GPU vs CPU Architecture
GPU | CPU |
---|
Thousands of simple cores | 4-16 complex cores |
High throughput | Low latency |
SIMT execution | SIMD/scalar execution |
Optimized for parallel tasks | Optimized for sequential tasks |
High memory bandwidth | Large cache hierarchy |
CUDA Terminology
Term | Definition | Example |
---|
Host | CPU and its memory | Where main() runs |
Device | GPU and its memory | Where kernels run |
Kernel | Function that runs on GPU | __global__ void myKernel() |
Thread | Single execution unit | One GPU core executing |
Block | Group of threads | Up to 1024 threads |
Grid | Collection of blocks | All blocks for one kernel |
Warp | 32 threads executing together | Hardware scheduling unit |
Thread Hierarchy
Grid
├── Block (0,0)
│ ├── Thread (0,0)
│ ├── Thread (0,1)
│ └── Thread (0,2)
├── Block (0,1)
│ ├── Thread (0,0)
│ └── Thread (0,1)
Memory Hierarchy (Fast → Slow)
- Registers - Per-thread, fastest
- Shared Memory - Per-block, fast
- Global Memory - All threads, slow but large
- Host Memory - CPU RAM, slowest access from GPU
Common CUDA Function Patterns
// Kernel declaration
__global__ void kernelName(parameters) { }
// Kernel launch
kernelName<<<gridSize, blockSize>>>(arguments);
// Synchronization
cudaDeviceSynchronize();
// Error checking
cudaError_t err = cudaGetLastError();
🗣️ Discussion Facilitation Quick Guides
Opening Questions
- “What’s your experience with parallel programming?”
- “What GPU applications have you used as an end user?”
- “What questions came up while reading the materials?”
When Someone Says “I Don’t Understand…”
- Ask for specifics: “Which part exactly?”
- Use analogies: “Think of it like…”
- Draw it out: Visual representation
- Get group help: “Who can explain this differently?”
Redirecting Off-Topic Discussion
- “That’s fascinating - how does it connect to GPU architecture?”
- “Let’s bookmark that for our advanced topics session”
- “Great question for our follow-up discussion”
⚡ Technical Quick Fixes
CUDA Not Found
# Check installation
which nvcc
echo $PATH
export PATH=/usr/local/cuda/bin:$PATH
Compilation Issues
# Basic compilation
nvcc hello_cuda.cu -o hello_cuda
# With debug info
nvcc -g -G hello_cuda.cu -o hello_cuda
# Specify architecture
nvcc -arch=sm_50 hello_cuda.cu -o hello_cuda
No GPU Available
- Backup: Google Colab with GPU runtime
- Alternative: Show host screen for live demo
- Workaround: CPU-only version to show concept
📊 Progress Tracking
Session Success Indicators
Red Flags
- Multiple people struggling with basic installation
- Confusion about fundamental concepts (host/device)
- No questions being asked
- Time running significantly over/under
Adaptation Strategies
- Too fast: Add more discussion time, deeper exercises
- Too slow: Skip advanced questions, focus on core concepts
- Technical issues: Pair programming, shared screen demos
- Mixed levels: Advanced participants help beginners