Journal Article
ACM Transactions on Architecture and Code Optimization, vol. 13, iss. 4, pp. 1-24, 2016
Authors
Mehmet Can Kurt, Sriram Krishnamoorthy, Gagan Agrawal, Bin Ren
Abstract
The emergence of the multi-core era has led to increased interest in designing effective yet practical parallel programming models. Models based on task graphs that operate on
single-assignment
data are attractive in several ways. Notably, they can support dynamic applications and precisely represent the available concurrency. However, for efficient execution, they also require nuanced algorithms for scheduling and memory management. In this article, we consider memory-efficient dynamic scheduling of task graphs. Specifically, we present a novel approach for dynamically recycling the memory locations assigned to data items as they are produced by tasks. We develop algorithms to identify memory-efficient store recycling functions by systematically evaluating the validity of a set of user-provided or automatically generated alternatives. Because recycling functions can be input data-dependent, we have also developed support for continued correct execution of a task graph in the presence of a potentially incorrect store recycling function.
Experimental evaluation demonstrates that this approach to automatic store recycling incurs little to no overheads, achieves memory usage comparable to the best manually derived solutions, often produces recycling functions valid across problem sizes and input parameters, and efficiently recovers from an incorrect choice of store recycling functions.