PipeOrgan: Modeling Memory-Bandwidth-Bound Executions for AI and Beyond
TL;DR: Latency-tolerant architectures, e.g., GPUs, increasingly use memory/storage hierarchies, for example to hold KV caches that speed up large language model (LLM) inference. To aid codesign of such workloads and architectures, we develop the simple PipeOrgan analytic model for...
In Memoriam: Remembering Mike Flynn
Michael J. Flynn is a widely respected contributor—indeed a giant—in the field of Computer Architecture. He made highly significant and impactful contributions throughout his career, both in industry and in academia. Sadly, he passed away peacefully December 24,...
Microarchitectural Modeling in the Era of Accelerator-Rich Systems and Computing at Scale
Microarchitecture simulators have been conceived and implemented to be valuable tools for the design of computing chips of all types (SimpleScalar, gem5, SMTSIM, Sniper, Qflex, Scarab, GPGPU-sim, Accel-Sim, Multi2Sim, NaviSim, SCALE-sim, gem5-Salam, TAO, PyTorchSim –...
The Hitchhiker’s Guide to Coherent Fabrics: 5 Programming Rules for CXL, NVLink, and InfinityFabric
This is the second article in the series, following our first blog post in Dec 2023: Tuning the Symphony of Heterogeneous Memory Systems. Modern applications are increasingly memory hungry. Applications like Large Language Models (LLMs), in-memory databases, and data...
IEEE Computer Architecture Letters (CAL) – An Update and FAQs
CAL has held a unique place in the computer architecture community for well over two decades as a periodical for publishing early and exciting results. CAL papers are only four pages long and undergo rigorous peer review to select those with novel ideas and/or...
From Theory to Practice: Introducing Architectural Prisms, an Experiment in AI-First Academic Dialogue
A little while ago, I published a post on this blog titled, “The Reviewer is Dead, Long Live the Review: Re-engineering Peer Review for the Age of AI.” In it, I argued that the traditional human-only peer-review system is buckling under the weight of...
An Invitation to Visual Computing
This post is a much simplified introductory chapter of an open, online textbook, Foundations of Visual Computing. Visual computing is wonderfully broad, touching everything from the sciences of human vision to the engineering of sensors, optics, displays, and computer...
All in on MatMul? Don’t Put All Your Tensors in One Basket!
Matrix multiplication dominates AI hardware and research. Betting everything on MatMul risks an innovation monoculture — it’s time to diversify our compute bets.
