Cache-Oblivious Algorithms in PracticeMaster's Thesis by Jesper Holm Olsen and Søren SkovABSTRACTThe memory of contemporary computers is structured in a hierarchy of successively larger, slower, and cheaper memory levels. Each level contains a working copy-or cache-of the level above. Recent developments in processor and memory technology imply an increasing penalty if programs do not take optimal advantage of the memory hierarchy. To alleviate this, the notion of cache-oblivious algorithms has been developed. The main idea behind cache-oblivious algorithms is to achieve optimal use of caches on all levels of a memory hierarchy without knowledge of their size. The purpose of this thesis is to examine cache-oblivious algorithms from a practical point of view. The thesis consists of a discussion of the theory, and a thorough benchmarking of some of the most recently published cache-oblivious algorithms, in the form of two priority-queue algorithms. The cache-oblivious theory has, so far, not incorporated the virtual memory system. We show that each of the levels in the virtual memory system can be seen as a separate level of cache, and is therefore also encompassed by the theoretical model. Several practical details that are not explained in the original articles, are described in order to implement the selected algorithms. Most importantly, we clarify how the algorithms should be constructed to use linear space. We furthermore develop a new optimal cache-oblivious algorithm for a priority deque, based on one of the cache-oblivious priority queues. Our results show, that for the cache-oblivious algorithms used in our case-study, the extra work incurred by making algorithms cache oblivious is too big, for them to be competitive on all levels in the memory hiearchy. Cache-obliviousalgorithms do outperform traditional RAM-model algorithms when working on data sets larger than main memory, but in the lower levels of the memory hierarchy, the traditional RAM-based algorithms are faster, due to a smaller amount of work. The cache-aware implementations, which we compare with, exhibit good use of the caches without too much extra work, and are therefore significally better. Based on our investigation of two priority-queue algorithms, we conclude that the cache-oblivious approach is currently not able to offer a competitive priority queue, except when working on data sets larger than main memory. ![]()
| ||