Call for Papers: ISPASS 2016

Submitted by Erik Hagersten
April 17 to April 19, 2016

IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2016)
Uppsala, Sweden
April 17-19, 2016

Abstract registration deadline: October 2, 2015

The IEEE International Symposium on Performance Analysis of Systems and
Software provides a forum for sharing advanced academic and industrial
research work focused on performance analysis in the design of computer
systems and software. Authors are invited to submit previously unpublished
work for possible presentation at the conference.

Papers are solicited in fields that include the following:

1) Performance and power evaluation methodologies
-Analytical modeling
-Statistical approaches
-Tracing and profiling tools
-Simulation techniques
-Hardware (e.g., FPGA) accelerated simulation
-Hardware performance counter architectures
-Power/Temperature/Variability/Reliability models for computer systems
-Micro-benchmark based hardware analysis techniques

2) Performance and power analysis
-Bottleneck identification and analysis

3) Power/Performance analysis of commercial and experimental hardware
-General-purpose microprocessors
-Multi-threaded, multi-core and many-core architectures
-Accelerators and graphics processing units
-Embedded and mobile systems
-Enterprise systems and data centers
-Computer networks

4) Power/Performance analysis of emerging workloads and software
-Software written in managed languages
-Virtualization and consolidation workloads
-Internet-sector workloads
-Embedded, multimedia, games, telepresence
-Bioinformatics, life sciences, security, biometrics

5) Application and system code tuning and optimization

6) Confirmations or refutations of important prior results

In addition to research papers, we also welcome tool papers. The conference
is an ideal forum to publicize new tools to the community. Tool papers will
be judged more on their potential for wide impact and use than on their
research contribution. Tools in any of the above fields of interest are

Paper abstract submission: October 2, 2015
Full submission: October 9, 2015
Rebuttal: January 12-13, 2016
Notification: January 22, 2016
Final paper due: March 3, 2016

General Chair:
Erik Hagersten, Uppsala University

Program Chair:
Andreas Moshovos, University of Toronto

Call for Papers: 3rd Workshop on Near-Data Processing

Submitted by Boris Grot
December 5, 2015

3rd Workshop on Near-Data Processing (WoNDP)
in conjunction with MICRO’15
Waikiki, Hawaii, USA
December 5 or 6, 2015

Computing in large-scale systems is shifting away from the traditional
compute-centric model successfully used for many decades into one that is much
more data-centric. This transition is driven by the evolving nature of what
computing comprises, no longer dominated by the execution of arithmetic and
logic calculations but instead becoming dominated by large data volume and the
cost of moving data to the locations where computations are performed. Data
movement impacts performance, power efficiency and reliability, three
fundamental components of a system. These trends are leading to changes in the
computing paradigm, in particular the notion of moving computation to the data
in a so-called Near-Data Processing approach, which seeks to perform
computations in the most appropriate location depending on where data resides
and what needs to be extracted from that data. Examples already exist in
systems that perform some computations closer to disk storage, leveraging the
data streaming coming from the disks, filtering the data so that only useful
items are transferred for processing in other parts of the system.
Conceptually, the same principle can be applied throughout a system, by placing
computing resources close to where data is located, and decomposing
applications so that they can leverage such a distributed and potentially
heterogeneous computing infrastructure.

This workshop is intended to bring together experts from academia and industry
to share advances in the development of Near-Data Processing systems
principles, with emphasis on large-scale systems. This is the 3rd edition of
this workshop. The first two editions were held at MICRO 2013 and 2014, and had
over 60 attendees each. The workshop will consist of submitted papers and
invited talks. Topics of interest include but are not limited to:

– Analysis of applications illustrating the potential for Near-Data Processing
– System and software architectures for Near-Data Processing
– Programming models for distributed and heterogeneous infrastructures, driven
by location of the data
– Processing/Memory/Storage architectures and microarchitectures for Near-Data
Processing
– Performance evaluation of Near-Data Processing systems and subsystems
– Power-efficiency and reliability analysis and evaluation of Near-Data
Processing

Two kinds of papers are invited:
1) Technical papers (4-6 pages) with preliminary results.
2) Position papers (3 pages max) on directions for research and development.
Please submit an electronic copy of your paper (in PDF) in two-column format
with at least 10pt font. Blind submissions are optional. The selected papers
will be made available online. Publication in WoNDP does not preclude later
publication at regular conferences or journals.

Paper submissions due: October 9, 2015
Notification: November 6, 2015
Final paper due: December 1, 2015

Rajeev Balasubramonian, University of Utah
Boris Grot, University of Edinburgh
Jaime Moreno, IBM TJ Watson Research Center
Ravi Nair, IBM TJ Watson Research Center

Call for Papers: ADAPT'16

Submitted by Grigori Fursin
January 18, 2016

6th International Workshop on Adaptive Self-tuning Computing Systems (ADAPT)
co-located with HiPEAC 2016
Prague, Czech Republic
January 18, 2016

ADAPT is an interdisciplinary workshop to discuss and demonstrate practical and
reproducible techniques, methodology and tools that can help convert existing
or future software and hardware into adaptive, scalable and self-tuning
systems. Such systems should be able to automatically improve their
characteristics (execution time, energy usage, size, accuracy, reliability,
bandwidth, adaptation time and memory usage) depending on an application
and its input, available resources, run-time state of the system, and user

ADAPT topics include but are not limited to machine learning based autotuning,
representative benchmarking, real application self-tuning, automatic
performance modelling, self-tuning compilers, automatic bug detection, run-time
adaptation, automatic fault tolerance, dynamic hardware reconfiguration,
predictive scheduling, new programming models, green data centres, adaptive
embedded devices, reproducible experimentation, and optimization knowledge
sharing. You can check out accepted papers from the past ADAPT workshops at:

Based on positive feedback from last year’s edition, we have decided to
exclusively use our new submission and publication model: authors submit
their articles (and artifacts) directly to the open arXiv, we then open a
discussion thread for each paper on Reddit, and our PC eventually selects the
most appropriate ones for presentation.

We hope you will be able to submit a paper and be part of this new experiment.
See our ADAPT website for motivation and further details.

We encourage submissions ahead of the deadline to allow more time
for open discussions!

– Paper submission deadline: 9 October 2015
– End of public discussions: 30 October 2015
– Author notification: 20 November 2015
– Early fees registration: ~15 December 2015

Christophe Dubach, University of Edinburgh (UK)
Grigori Fursin, dividiti (UK) / cTuning Foundation (France)

Now Available: IEEE Micro Special Issue on Heterogeneous Computing

Submitted by Lieven Eeckhout

The IEEE Micro Special Issue on Heterogeneous Computing, guest edited by
Ravi Iyer and Dean Tullsen, is now available. The Special Issue comprises seven
articles on heterogeneous CPUs, GPUs, accelerators, control and data decoupling,
data placement, memory hierarchies, and programming models.

Tool Release: Dynamic Binary Instrumentation of Application, OS Kernel, Driver and BIOS

Submitted by Vijay Janapa Reddi

Dynamic Binary Instrumentation of Application, OS Kernel, Driver and BIOS

We introduce the Intel® Simulation and Analysis Engine (Intel® SAE) —
a framework for full-system instruction-level instrumentation of “ring
0” (privileged) and “ring 3” (user-level) code behavior. When
plugged into a Wind River® Simics Virtual Platform, Intel® SAE boots
native operating systems (e.g. Linux and Windows, as well as Android),
and runs unmodified binaries while facilitating flexible and
customizable instruction-level instrumentation on everything executing
on the CPU, i.e. BIOS, kernel, drivers and all kernel and user-space
processes. Furthermore, Intel SAE can facilitate distributed
node-to-node multi-system simulation and analysis of enterprise-scale
workloads such as CloudSuite, Hadoop, Memcached etc.

Available for academic use.

Call for Participation: Tutorial on Dynamic Binary Instrumentation of Application, OS Kernel, Driver and BIOS

Submitted by Vijay Janapa Reddi

Tutorial on Dynamic Binary Instrumentation of Application,
OS Kernel, Driver and BIOS

in conjunction with IISWC 2015
Atlanta, GA, USA
October 4, 2015

We introduce the Intel® Simulation and Analysis Engine (Intel® SAE) —
a framework for full-system instruction-level instrumentation of “ring
0” (privileged) and “ring 3” (user-level) code behavior. When
plugged into a Wind River® Simics Virtual Platform, Intel® SAE boots
native operating systems (e.g. Linux and Windows, as well as Android),
and runs unmodified binaries while facilitating flexible and
customizable instruction-level instrumentation of everything executing
on the CPU, i.e. BIOS, kernel, drivers and kernel and user-space processes.

At IISWC 2015, held in Atlanta, Georgia, on October 4, we will hold a
hands-on tutorial. The tutorial will cover the use of Intel® SAE tools
(called ztools) for conducting different types of architectural and
program analysis studies, such as cache modeling, instruction usage
characterization, and new instruction emulation. The tutorial will also
cover how users can write their own tools using the Intel® SAE generic
APIs. If you have used Pin or written a PinTool, you will find Intel SAE
ztools very useful. Beyond Pin-like features, Intel SAE performs distributed
node-to-node multi-system simulation, useful for analysis of
enterprise-scale workloads such as CloudSuite, Hadoop, and Memcached.

Call for Papers: Workshop on Low-Power Dependable Computing

Submitted by Abdullah Muzahid
December 14 to December 16, 2015

2nd Workshop on Low-Power Dependable Computing (LPDC)
in conjunction with International Green and Sustainable Computing Conference (IGSC)
Las Vegas, Nevada, USA
Dec 14-16, 2015

Submission Deadline: Sept. 15, 2015

Dependable computing is normally achieved through various redundancy
techniques (such as modular, temporal and/or information redundancy) at
different levels of a system (for instance, circuit, architecture, runtime,
operating system and software). With technology scaling and the
miniaturization of computing systems, faults will become more common, and it
is imperative for most modern computing systems to deploy fault-tolerance
techniques. On the other hand, redundancy-based fault tolerance generally has
energy implications, which warrant careful consideration now that energy has
become a first-class system resource.

This workshop aims at establishing a forum for practitioners and researchers
from both industry and academia who work on different aspects of fault
tolerance and system energy efficiency to exchange ideas on how to achieve
low-power dependable computing. To cover a broad range of research related to
energy efficiency and dependable computing, the workshop will consider various
levels (from circuits to software), components (from memory to computation) and
systems (from battery-powered embedded systems to large scale reliable

The topics of interest include (but are not limited to) the following:
– Novel energy-efficient redundant circuit design
– Novel energy-efficient fault-tolerance architecture
– Compilation techniques for redundancy and low-power
– Runtime management for energy-efficiency and fault tolerance
– Scheduling algorithms for energy-efficiency and fault tolerance
– Energy-efficiency tradeoffs for different fault-tolerance techniques
– Low-power reliable memory and storage systems
– Low-power and reliable on-chip communications
– Case study on low-power dependable systems

The workshop invites authors to submit papers in the above-mentioned areas that
describe original and unpublished work that is not concurrently under review

Papers submitted to this workshop are limited to six (6) single-spaced,
double-column pages (in IEEE Computer Society Proceedings Manuscripts style:
11-point font, 8.5 x 11 inch pages), including everything (e.g.,
abstract, research description, figures, tables, and references).

All submissions will be reviewed by the program committee. Once a submission
is accepted, at least one author must register for the conference, following
the instructions on the IGSC webpage, and attend the conference to present the
work. Each accepted workshop paper requires a full IGSC registration at the
IEEE member or non-member rate (NOT the student rate).

Accepted papers will be published in the workshop proceedings and included in
the IEEE Xplore Digital Library.

Please submit your papers through the following link with EasyChair:

Submission deadline: Aug. 28, 2015
Notification date: Sep. 30, 2015
Final paper due: Oct. 15, 2015

Workshop Chairs:
Tam Chantem (Utah State University, USA)
Dakai Zhu (University of Texas at San Antonio, USA)

Technical Program Committee:
Jian-Jia Chen (TU Dortmund, Germany)
Steven Drager (Air Force Research Lab, USA)
Petru Eles (Linköping University, Sweden)
Alireza Ejlali (Sharif University of Technology, Iran)
Nathan Fisher (Wayne State University, USA)
Yifeng Guo (NetApp. Inc., USA)
Song Han (University of Connecticut, USA)
Houman Homayoun (George Mason University, USA)
Cong Liu (University of Texas at Dallas, USA)
Muhammad Shafique (Karlsruhe Institute of Technology, Germany)
Kaijie Wu (Chongqing University, China)
Xuan Qi (Oracle, USA)
Jun Yi (Amazon, USA)
Zhao Zhang (Iowa State University, USA)
Baoxian Zhao (MicroStrategy Inc, USA)

Call for Papers: Workshop on Heterogeneous High-performance Reconfigurable Computing

Submitted by Jason D. Bakos
November 15, 2015

First International Workshop on Heterogeneous High-performance
Reconfigurable Computing (H2RC 2015)

in conjunction with Supercomputing 2015
Austin, TX, USA
Sunday, November 15, 2015

– Submission Deadline: August 31, 2015 (extended)
– Acceptance Notification: September 15, 2015
– Camera-ready Manuscripts Due: October 15, 2015
– Workshop Date: November 15, 2015

With Exascale systems on the horizon at the same time that conventional
von Neumann architectures are suffering from rising power densities, we are
facing an era with power, energy efficiency, and cooling as first-class
constraints for scalable HPC. FPGAs can tailor the hardware to the application,
avoiding the overheads of general-purpose architectures. Leading FPGA
manufacturers have recently made a concerted effort to provide a range of
higher-level, easier-to-use programming models for FPGAs, including the OpenCL
framework, which is already widely used by the HPC community for heterogeneous
computing. OpenCL is particularly appealing because it offers the potential for
portability to GPUs and Xeon Phi.

Such initiatives are already stimulating new interest within the HPC community
around the potential advantages of FPGAs over other architectures in terms of
both performance and energy consumption. With this in mind, this will be the
first workshop at SC to bring together HPC and heterogeneous-computing
researchers to demonstrate and share experiences on how newly-available
high-level programming models, including OpenCL, are already empowering
HPC software developers to directly leverage FPGAs, and to identify future
opportunities and needs for research in this area.

Submissions are solicited which explore the state of the art in the use of
FPGAs in heterogeneous high-performance compute architectures and, at a system
level, in data centers and supercomputers. FPGAs may be considered from the
perspective of their distributed, parallel and composable fabric of compute
elements, of their dynamic reconfigurability, or both. We especially
encourage submissions which
not only consider the implementation of algorithms and applications in FPGAs
but concretely relate this to the heterogeneous compute model consisting of
differently functioned cooperating compute elements and the overall impact of
such architectures on the compute capacity, cost and power efficiency, and
computational capabilities of data centers and supercomputers. Submissions
may report on theoretical or applied research, implementation case studies,
benchmarks, standards, or any other area that promises to make a significant
contribution to our understanding of heterogeneous high-performance
reconfigurable computing and will help to shape future research and
implementations in this domain. A non-comprehensive list of potential topics
of interest is given below:

1. FPGAs in the Cloud and Data Center: FPGAs in relation to challenges to
Cloud/Data Center/Supercomputing posed by the end of Dennard scaling

2. Cloud and Data Center Applications: Exploiting FPGA compute fabric to
implement critical cloud/HPC applications

3. Leveraging Reconfigurability: Using reconfigurability for new approaches
to algorithms used in cloud/HPC applications

4. Benchmarks: Compute performance and/or power and cost efficiency for
cloud/HPC with heterogeneous architectures using FPGAs

5. Implementation Studies: Heterogeneous Hardware and Management

6. Programming Languages/Tools/Frameworks for FPGA Heterogeneous
Computing

7. Future-gazing: New Applications/The Cloud Enabled by FPGA Heterogeneous
Computing, Evolution of Computer Architecture in relation to FPGA
Heterogeneous Computing

8. Community Building: Standards, consortium activity, open source, education,
initiatives to enable and grow FPGA Heterogeneous Computing

Prospective authors are invited to submit original and unpublished
contributions as 8-page papers. All contributions must be submitted
electronically in ACM SIG Proceedings format.

Submission site:

All papers selected for this workshop will be peer-reviewed. The authors of
accepted papers will be scheduled to present their work in one of the two
lightning talk sessions. Workshop proceedings will be made available online
and authors will retain their copyright.

Workshop Organizers:
Michaela Blott, Xilinx
Michael Leventhal, Xilinx
Michael Lysaght, ICHEC
Torsten Hoefler, ETH Zurich
Kevin Skadron, University of Virginia
Jason D. Bakos, University of South Carolina

Technical Program Co-Chairs:
Deming Chen, UIUC
Kees Vissers, Xilinx Research

Program Committee:
Junwei Bao, Baidu
Michaela Blott, Xilinx
Paul Chow, University of Toronto
Carl Ebeling, Altera
Hans Eberle, NVIDIA
Tarek El-Ghazawi, George Washington University
Wu Feng, Virginia Tech
Georgi Gaydadjiev, Maxeler
Alan George, University of Florida
Christoph Hagleitner, IBM
Martin Herbordt, Boston University
H. Peter Hofstee, IBM Research, Austin
Miriam Leeser, Northeastern University
Wayne Luk, Imperial College
Walid Najjar, University of California Riverside
Viktor Prasanna, University of Southern California
Desh Singh, Altera
Mitch Wright, HP

Tool Release: VoltSpot 2.0

Submitted by Kevin Skadron

VoltSpot 2.0

VoltSpot version 2.0 introduces several new features for power-grid
modeling: 1) support for PDN modeling of 3D architectures; 2) support for
voltage stacking for on-chip power delivery, especially in 3D architectures;
3) a compact RC model for on-chip switched-capacitor voltage converters, as
well as a framework for modeling other integrated voltage regulators.

You can download version 2.0 from

Call for Papers: 1st GPU Warp/Wavefront Scheduling Championship

Submitted by Adwait Jog
December 5 to December 6, 2015

1st GPU Warp/Wavefront Scheduling Championship (GPU-WSC)
in conjunction with MICRO 2015
Waikiki, Hawaii
December 5 or 6, 2015

The workshop on computer architecture competitions is a forum for holding
contests to evaluate computer architecture research topics. This workshop
is organized around a competition for scheduling algorithms for Graphics
Processing Units (GPUs). This 1st GPU Warp/Wavefront Scheduling Championship
(GPU-WSC) invites contestants to submit their GPU scheduler design to
participate in this competition. The contestants must develop algorithms to
optimize multiple metrics (e.g., IPC, cache miss-rates, memory bandwidth
utilization, hardware overheads etc.) on a common evaluation framework
provided by the organizing committee.

Phase-1 Submission: Oct 5, 2015
Notification of Acceptance for Phase 1: Nov 2, 2015
Phase-2 Submission: Nov 16, 2015
Championship Dates: Dec 5 or Dec 6, 2015

The championship process has the following two phases:

Phase 1
1) Write-up Submission:
Interested participants are invited to submit a write-up describing
their GPU warp scheduler design. This write-up must clearly present
the idea, motivation, and design trade-offs, and estimate the
hardware overheads of the proposed scheduler. In addition, the write-up
should provide details on the evaluation methodology and the impact of
the proposed scheduler on four metrics: 1) IPC, 2) L1 miss-rates
(all three caches: data, texture, constant), 3) L2 miss-rates, and 4)
DRAM bandwidth utilization, in comparison to the Greedy-then-Oldest
(GTO) GPU warp scheduler. The comparison results should be described
in the paper with the help of clearly visible graphs. Please use
GPGPU-Sim (version 3.2.2), an open-source GPU evaluation platform.
More details are on the championship website.
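
For readers unfamiliar with the GTO baseline, the following is a minimal
sketch of the Greedy-then-Oldest policy as it is commonly described: keep
issuing from the warp that issued last for as long as it is ready, and
otherwise fall back to the oldest ready warp. It is not GPGPU-Sim code; the
Warp fields and the gto_pick function are hypothetical names chosen for
illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Warp:
    wid: int     # hypothetical warp identifier
    age: int     # launch order; a smaller value means an older warp
    ready: bool  # True if the warp can issue an instruction this cycle


def gto_pick(warps: List[Warp], last_issued: Optional[int]) -> Optional[int]:
    """Return the id of the warp to issue from this cycle, or None."""
    ready = [w for w in warps if w.ready]
    if not ready:
        return None
    # Greedy part: stay on the warp that issued last while it remains ready.
    for w in ready:
        if w.wid == last_issued:
            return w.wid
    # Oldest part: otherwise choose the oldest ready warp (ties by id).
    return min(ready, key=lambda w: (w.age, w.wid)).wid


if __name__ == "__main__":
    pool = [Warp(0, 5, True), Warp(1, 2, True), Warp(2, 9, False)]
    print(gto_pick(pool, last_issued=0))  # greedy: warp 0 is still ready
    print(gto_pick(pool, last_issued=2))  # warp 2 stalled: oldest ready warp
```

A submitted scheduler would replace the selection logic inside the simulator
itself; the sketch only conveys the behavior that results are compared
against.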

Note that the primary metric for ranking schedulers is performance (IPC).
If the performance of two schedulers is fairly close, we will use
secondary metrics such as miss-rates, bandwidth utilization, and
hardware overheads to break ties.

The Program Committee, chaired by the organizers, will review the
submitted write-ups. Submissions will be evaluated based on
novelty, presentation of the results, and effectiveness of the
proposed scheduler on different metrics. Novelty is not a strict
requirement; for example, a contestant may submit a previously
published design or make incremental enhancements to a previously
proposed design.

2) File Submission:
The authors are required to submit source code, configuration files,
output result files (dump the simulator output to a file), and a diff
of their code against the unmodified GPGPU-Sim version 3.2.2.

Phase 2
The authors of the accepted write-ups will be required to submit their
final write-ups along with the updated files. In this phase, authors
have an option to incrementally revise their scheduler design to
become more competitive. We expect these changes to be only related to
some optimizations with regard to their scheduler design. A complete
change in the scheduler design is not acceptable.

1) Evaluation of Schedulers
The submitted files will be tested by the organizers on the
applications recommended by us (see Simulation Infrastructure section
below). The organizers also plan to include some applications in the
testing set that may not be known a priori to the contestants. The
final ranking of the schedulers will be based on performance (IPC). If
the performance of two schedulers is fairly close, we will use
secondary metrics such as miss-rates, bandwidth utilization, and
hardware overheads to break ties.
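
The ranking rule described above might be sketched as follows. This is an
illustration only, not the organizers' procedure: the announcement does not
define "fairly close" or the order in which the secondary metrics are
applied, so the 1% relative tolerance, the tie-break order, and the
assumption that higher bandwidth utilization is preferable are all
hypothetical, as are the Result and rank names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    name: str
    ipc: float             # primary metric, higher is better
    miss_rate: float       # secondary metric, lower is better
    bw_utilization: float  # secondary metric, assumed higher is better
    hw_overhead: float     # secondary metric, lower is better


def rank(results: List[Result], tolerance: float = 0.01) -> List[Result]:
    """Order by IPC; entries within `tolerance` of a group's best IPC are
    treated as tied and ordered by the secondary metrics instead."""
    by_ipc = sorted(results, key=lambda r: r.ipc, reverse=True)
    groups: List[List[Result]] = []
    for r in by_ipc:
        # groups[-1][0] is the highest-IPC entry of the current group.
        if groups and r.ipc >= groups[-1][0].ipc * (1 - tolerance):
            groups[-1].append(r)
        else:
            groups.append([r])
    ranked: List[Result] = []
    for group in groups:
        ranked.extend(sorted(
            group,
            key=lambda r: (r.miss_rate, -r.bw_utilization, r.hw_overhead)))
    return ranked


if __name__ == "__main__":
    entries = [
        Result("A", 1.000, 0.30, 0.5, 10.0),
        Result("B", 0.995, 0.20, 0.5, 10.0),
        Result("C", 0.800, 0.10, 0.9, 5.0),
    ]
    print([r.name for r in rank(entries)])
```

In this example A and B fall within the assumed tolerance, so B's lower miss
rate places it ahead of A, while C stays last because its IPC is clearly
lower.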

2) Incentives
The winner(s) will receive a trophy commemorating his/her triumph (OR
some other prize to be determined later). Authors of all the accepted
write-ups will also be invited to present their papers at the
workshop. All source code, final write-ups, and ranking results will
be made publicly available through this website.

Please use GPGPU-Sim (version 3.2.2), an open-source GPU evaluation
platform. More details are on the championship website.

Phase-1 Submission: Please use the standard LaTeX or Word ACM
templates. Write-up length should not exceed 6 pages. Email your
write-up to all the organizers by the Phase-1 deadline. Also,
submission of files is mandatory.

Phase-2 Submission: Please use the standard LaTeX or Word ACM
templates. Final Write-up length should not exceed 8 pages. Email your
final write-up to all the organizers by the Phase-2 deadline.

Workshop chairs:
Adwait Jog, College of William and Mary
Onur Kayiran, AMD Research
Tim Rogers, Purdue

Program Committee:

Steering Committee:
Alaa R. Alameldeen (Intel)
Hyesoon Kim (Georgia Tech)
Moin Qureshi (Georgia Tech)

Please contact the organizers with any questions regarding
the championship.