Provably Space Efficient Parallel Functional Programming (Distinguished Paper)
Because of its many desirable properties, such as its ability to control effects and thus potentially disastrous race conditions, functional programming offers a viable approach to programming modern multicore computers. Over the past decade several parallel functional languages, typically based on dialects of ML and Haskell, have been developed. These languages, however, have traditionally underperformed procedural languages (such as C and Java). The primary reason for this is their hunger for memory, which only grows with parallelism, causing traditional memory management techniques to buckle under increased demand for memory. Recent work opened a new angle of attack on this problem by identifying a memory property of determinacy-race-free parallel programs, called disentanglement, which limits the knowledge of concurrent computations about each other’s memory allocations. That work has shown some promise in delivering good time scalability.
In this paper, we present provably space-efficient automatic memory management techniques for determinacy-race-free functional parallel programs, allowing both pure and imperative programs where memory may be destructively updated. We prove that for a program with sequential live memory of R*, any P-processor garbage-collected parallel run requires at most O(R* · P) memory. We also prove a work bound of O(W + R* · P) for P-processor executions, accounting also for the cost of garbage collection. To achieve these results, we integrate thread scheduling with memory management. The idea is to coordinate memory allocation and garbage collection with thread scheduling decisions so that each processor can allocate memory without synchronization and independently collect a portion of memory by consulting a collection policy, which we formulate. The collection policy is fully distributed and does not require communicating with other processors. We show that the approach is practical by implementing it as an extension to the MPL compiler for Parallel ML. Our experimental results confirm our theoretical bounds and show that the techniques perform and scale well.
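To get a feel for the two bounds stated above, the following minimal sketch evaluates them for a few processor counts. It is an illustration only: the function names `space_bound` and `work_bound`, the constant factor `c`, and the sample values for R* and W are all assumptions made here, and the real constants hidden by the O-notation are not given in the abstract.

```python
def space_bound(r_star: int, p: int, c: int = 1) -> int:
    """Return c * R* * P, the shape of the O(R* · P) memory bound.

    r_star: sequential live memory R* (any unit, e.g. bytes)
    p:      number of processors P
    c:      hypothetical constant factor hidden by the O-notation
    """
    if r_star < 0 or p < 1:
        raise ValueError("need r_star >= 0 and p >= 1")
    return c * r_star * p


def work_bound(w: int, r_star: int, p: int, c: int = 1) -> int:
    """Return c * (W + R* * P), the shape of the O(W + R* · P) work bound.

    w: work W of the program; the additive R* * P term is where the
       bound accounts for garbage collection.
    """
    if w < 0:
        raise ValueError("need w >= 0")
    return c * (w + space_bound(r_star, p))


if __name__ == "__main__":
    r_star = 2 * 1024**3   # assumed sample value: 2 GiB of sequential live memory
    w = 10**12             # assumed sample value: 10^12 units of work
    for p in (1, 8, 64):
        print(p, space_bound(r_star, p), work_bound(w, r_star, p))
```

With these assumed inputs the memory bound grows linearly in P, while the work bound is dominated by W as long as W is much larger than R* · P.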
Thu 21 Jan. Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
16:00 - 17:00

| Time | Duration | Type | Title | Authors | Track |
| --- | --- | --- | --- | --- | --- |
| 16:00 | 10m | Talk | Verifying Observational Robustness Against a C11-style Memory Model | | POPL |
| 16:10 | 10m | Talk | Provably Space Efficient Parallel Functional Programming (Distinguished Paper) | | POPL |
| 16:20 | 10m | Talk | Modeling and Analyzing Evaluation Cost of CUDA Kernels | | POPL |
| 16:30 | 10m | Talk | Optimal Prediction of Synchronization-Preserving Races | Umang Mathur (University of Illinois at Urbana-Champaign), Andreas Pavlogiannis (Aarhus University), Mahesh Viswanathan (University of Illinois at Urbana-Champaign) | POPL |
| 16:40 | 10m | Talk | Taming x86-TSO Persistency | | POPL |
| 16:50 | 10m | Break | Break | | POPL |