Call for Participation:

ISPASS 2013 — Workshops and Tutorials

Submitted by Ioana Baldini

The following two tutorials and one workshop are being held on
Sunday April 21, 2013 in conjunction with ISPASS:

Modeling Exascale Applications with SST/macro and Eiger
(half day)

GPUWattch + GPGPU-Sim: An Integrated Framework for
Energy Optimizations in Manycore (half day)

FastPath 2013 Workshop on Performance Analysis
of Workload Optimized Systems (full day)
Submission deadline: March 10, 2013

Detailed information:

Modeling Exascale Applications with SST/macro and Eiger
(half day)

In high performance computing (HPC), the importance of fast, large-scale,
high-fidelity models is only increasing as we move toward the next
frontier of exascale. Hardware/software co-design is viewed as a key
methodology for reaching this end. The SST/macro toolkit[1] gives HPC
engineers the ability to explore current and future hardware/software
design constraints. Instead of cycle-accurate simulation, which is
costly in both time and user effort, macro-scale simulation can provide
valuable insight into the performance of large applications. The value
of these tools lies in high-quality application models for increasingly
complex hardware designs.

The Eiger Performance Modeling Framework[2] generates models by
applying statistical techniques from the field of machine learning to
empirical performance data. While macro-scale simulation can provide a
reasonable overview of system-wide phenomena, Eiger can leverage data
acquired from micro-scale sources to inform large-scale simulations in
SST/macro. Eiger provides an API and data store for aggregating data
from micro-scale sources such as simulators, emulators, and runtimes.

This tutorial will present attendees with the techniques and
methodologies to leverage SST/macro and Eiger for modeling large-scale
applications on upcoming supercomputer hardware. The presentation is
geared toward domain experts and HPC hardware designers, as well as
students and researchers whose work requires exploration of programming
models, interaction between computation and communication, and
data-driven modeling techniques for large-scale systems. These tools
emphasize ease of use and rapid iteration, allowing area experts to
generate detailed performance models without requiring intricate
knowledge of every facet of the computing environment. This tutorial
requires only a basic level of programming skill.
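As a rough illustration of the data-driven modeling that Eiger automates,
consider the simplest instance of the idea: fitting a statistical model to
empirical timings and using it to extrapolate to unmeasured scales. The
sketch below is illustrative only and does not reflect Eiger's actual API;
the function names, measurements, and linear-scaling assumption are all
hypothetical.

```python
# A minimal sketch of data-driven performance modeling -- NOT Eiger's
# actual API. All names and measurements below are hypothetical.

def fit_linear(xs, ys):
    """Ordinary least-squares fit of y ~ a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Hypothetical empirical data: problem size vs. measured runtime (s)
sizes = [1e4, 2e4, 4e4, 8e4]
runtimes = [0.11, 0.21, 0.41, 0.81]

a, b = fit_linear(sizes, runtimes)

def predict(n):
    """Use the fitted model to extrapolate to an unmeasured scale."""
    return a + b * n

print(f"predicted runtime at n=160000: {predict(160000):.2f} s")
```

In practice Eiger aggregates many such micro-scale measurements in a data
store and applies richer statistical techniques, but the workflow --
collect empirical data, fit a model, feed predictions into macro-scale
simulation -- is the same shape.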

GPUWattch + GPGPU-Sim: An Integrated Framework for
Energy Optimizations in Manycore (half day)

The objective of this tutorial is to present an overview of the design
and implementation of the GPGPU-Sim simulation infrastructure along
with a newly developed power model. The integrated GPUWattch power
model is highly configurable and extensible.

GPGPU-Sim version 3.x represents a significant update to GPGPU-Sim,
featuring a more accurate and detailed microarchitectural model. It
includes support for NVIDIA’s native ISA and the Fermi
Architecture. With the tightly-coupled GPUWattch, the simulation
infrastructure is now a complete platform for performance and energy
optimization research. The infrastructure follows a rigorous design
methodology that has been tested and validated against hardware
performance and power measurements for both Fermi and Quadro GPUs.

FastPath 2013 Workshop on Performance Analysis
of Workload Optimized Systems (full day)

The goal of FastPath is to bring together researchers and
practitioners involved in cross-stack hardware/software performance
analysis, modeling, and evaluation of workload optimized systems.

With microprocessor clock speeds held constant, optimizing systems
around specific workloads is an increasingly attractive means to
improve performance. The importance of workload optimized systems is
evident in their ubiquitous deployment across diverse systems, from
cellphones, tablets, routers, and game machines to Top500
supercomputers and IT appliances such as IBM’s DataPower and Netezza
and Oracle’s Exadata.

More precisely, workload optimized systems have hardware and/or
software specifically designed to run well for a particular
application or application class. The types and components of workload
optimized systems vary, but a partial list includes traditional CPUs
assisted with accelerators (ASICs, FPGAs, GPUs), memory accelerators,
I/O accelerators, hybrid systems, and IT appliances.

Exploiting CPU savings and speed-ups offered by workload optimized
systems for application level performance improvement poses several
cross-stack hardware and software challenges. These include developing
alternate programming models to exploit massive parallelism offered by
accelerators, designing low-latency, high-throughput H/W-S/W
interfaces, and developing techniques to efficiently map processing
logic on hardware.