If you were one of the 1700 attendees of the first virtual ISCA last week, you hopefully got a chance to call in to one or more of this year’s mini-panels. Spread across three days and many timezones, ISCA hosted 12 mini-panels in total, which ranged in topic from accelerators, processing-in-memory, and security all the way to intermittent computing and quantum architectures.
The goal of this post is to provide a short overview of the mini-panel on datacenter architecture in which I participated: the topics that were discussed, the concerns regarding representative datacenter research in academia, and some short-term predictions on what’s in store for cloud hardware.
The panel consisted of David Brooks, Hsien-Hsin Sean Lee, and myself, with David and Sean each bringing their unique perspectives from both academia and industry. As with all the other mini-panels, attendees got to ask questions in advance, vote on existing questions, and ask questions live through the Zoom interface. Below I summarize the main topics the questions touched upon.
Academic Research in Cloud Computing
Unsurprisingly, several questions hinged on the challenges of doing representative research on cloud computing in academia. Specifically, a recurring theme was how academia can meaningfully contribute in this space, given that industry has much greater access to systems and applications. While this is certainly true, academia also has the freedom to try more radical solutions, while industry is understandably more risk-averse. So, if you’re thinking of an idea that can be easily implemented in a cloud system over the next year, then industry has probably already done it, or is in a much better position to do it in a representative way. Think further ahead!
Questions also came up about the hesitation of many graduate students to pursue research in datacenter computing, given the effort and time it takes to build representative benchmarks. The panel was unanimous that the open-source community can be a great tool for reducing the effort any one student or academic group needs to put together representative applications, by contributing to existing benchmark suites. Apart from lowering the amount of work, this also creates a common reference point that helps academic work become more reproducible than it is today.
Along the same lines, attendees pointed out that experiments at scale – apart from being time-consuming and expensive – are also prone to high performance variability, especially in public clouds, making it hard to get representative results. Our advice was not to ignore the performance variability! Real systems, and especially large-scale systems, are prone to jittery performance; your experiments should take it into account (and ideally develop techniques to reduce it), not try to sidestep it. Your typical multicore simulator is not entirely deterministic (and if it is, it shouldn’t be) because real multicores do not have entirely deterministic performance; the same applies to cloud systems as well.
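As a concrete illustration of that advice (my own sketch, not something presented in the panel), one simple way to take variability into account is to report tail percentiles alongside the mean rather than averaging it away. The latencies below are synthetic, mimicking a service that is usually fast but occasionally slow:

```python
import random
import statistics

def percentile(samples, p):
    """Nearest-rank p-th percentile (p in 0..100) of a list of samples."""
    s = sorted(samples)
    k = int(round(p / 100.0 * (len(s) - 1)))
    return s[k]

# Synthetic latencies: mostly ~10ms, with 5% slow outliers around ~100ms,
# mimicking the jitter that real large-scale systems exhibit.
random.seed(0)
latencies_ms = [random.gauss(10, 1) for _ in range(950)] + \
               [random.gauss(100, 10) for _ in range(50)]

mean = statistics.mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
# The mean alone hides the tail; the 99th percentile exposes it.
print(f"mean={mean:.1f}ms  p50={p50:.1f}ms  p99={p99:.1f}ms")
```

With this kind of distribution, the median stays near 10ms while the 99th percentile lands near 100ms; a mean-only report would suggest everything is fine.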
A topic brought up both by the panelists and the audience was that datacenters were for a long time unaffected (or less affected) by the end of Moore’s Law, because they could leverage scaling out to counterbalance the effects of limited technology scaling. As long as applications care about throughput, as batch workloads do, scaling out can ensure that performance continues to scale, albeit at the higher cost of requiring more machines. As more and more applications shift to latency-critical, interactive services, though, scaling out alone is no longer sufficient. For interactive services, tail latency, and hence single-thread performance, are the critical metrics, and tail-latency targets cannot be met by simply distributing work across more servers.
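A quick back-of-the-envelope sketch (my own illustration, with made-up numbers) shows why fan-out makes tails worse rather than better: a request that waits on many independent servers is slow whenever any one of them is slow.

```python
def frac_slow_requests(p_slow, fanout):
    """Probability that a request waiting on `fanout` independent
    servers sees at least one slow response: 1 - (1 - p)^N."""
    return 1 - (1 - p_slow) ** fanout

# Even if each server is slow only 1% of the time, a request that
# fans out to 100 servers hits the tail on most requests.
for n in (1, 10, 100):
    print(f"fan-out {n:>3}: {frac_slow_requests(0.01, n):.1%} of requests are slow")
```

With a 1% per-server slow probability, roughly 10% of requests are slow at fan-out 10 and over 60% at fan-out 100, which is why distributing work across more servers does not by itself fix tail latency.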
That is, in part, what has motivated an increasing number of hardware accelerators to find their way into datacenters, which have traditionally been almost entirely homogeneous systems (let’s ignore different server generations for this). Despite their performance, power, and in some cases cost benefits, accelerators also introduce increased maintenance, deployment, and programmability challenges in the system. For a more detailed discussion of the challenges and opportunities of heterogeneity in the cloud, you can read this post I wrote recently for the SIGOPS blog.
Cloud application trends were also a popular question from the audience, specifically the increasing popularity of frameworks like microservices and serverless, and how architects can ensure that the frequent application changes they promote do not mean equally frequent hardware redesigns. An older SIGARCH post describes the implications of these application trends in more detail. There were two proposals from the panelists. First, not all new programming frameworks require new hardware, especially given that a lot of the inefficiency lies in the many levels of indirection in the software stack (a recent Science paper discusses this discrepancy between hardware and software performance). Second, reconfigurability was discussed as critical to absorbing short-term application-level changes without needing an entirely new hardware platform at the same cadence. Nonetheless, the panelists observed that the increasing heterogeneity and stricter latency requirements of cloud applications will require fundamentally rethinking how we design and manage large-scale systems in a more practical and scalable way.
Finally, the panel briefly offered some projections on the trade-offs between cloud and edge processing, as well as on what datacenters are expected to look like ten years from now. The first topic concerns whether the recurring argument for centralizing computation, versus distributing it to smaller-scale resources physically closer to the end user, changes as latency requirements become stricter. Distributing computation near the endpoints has been very successful for content providers, like Akamai, but has so far seen less adoption in cloud computing. Added costs, a potentially less environmentally-friendly mode of operation, and the loss of some of the management benefits of centralized deployment were mentioned as reasons why that may be the case. Luiz Barroso’s excellent Eckert-Mauchly talk also briefly touched upon this subject.
As for what datacenters are likely to look like in ten years, the panelists were unanimous that people should expect stricter latency requirements from the application side and, as a result, an increasing number of hardware accelerators, not only for application-specific computation, but primarily to handle datacenter-wide tasks, such as network processing, memory management, and cross-application functionality.
We hope all those who attended the panel enjoyed it, and that this post has provided a quick summary for those who may have missed it. If we didn’t get to answer your question during the panel, feel free to continue the discussion below!
About the Author: Christina Delimitrou is an Assistant Professor in Electrical and Computer Engineering at Cornell University.
Disclaimer: These posts are written by individual contributors to share their thoughts on the Computer Architecture Today blog for the benefit of the community. Any views or opinions represented in this blog are personal, belong solely to the blog author and do not represent those of ACM SIGARCH or its parent organization, ACM.