Empirical performance model-driven data layout optimization and library call selection for tensor contraction expressions

Journal Article
Journal of Parallel and Distributed Computing, vol. 72, iss. 3, pp. 338-352, 2012
Authors
Qingda Lu, Xiaoyang Gao, Sriram Krishnamoorthy, Gerald Baumgartner, J. Ramanujam, P. Sadayappan
English