Compendia: Reducing virtual-memory costs via selective densification
Virtual-to-physical memory translation is an increasingly dominant cost in workload execution: as data sizes scale, each translation can require up to four memory accesses, or up to 24 in virtualised systems. However, the radix trees used today to hold these translations have many favourable properties, including cacheability, the ability to fit in conventional 4 KiB page frames, and a sparse representation. They are therefore unlikely to be replaced in the near future.
In this paper we argue that these structures are actually too sparse for modern workloads, so much of this overhead is unnecessary. Instead, where appropriate, we expand groups of 4 KiB layers, each able to translate 9 bits of address space, into a single 2 MiB layer that translates 18 bits in a single memory access. These layers fit in the standard huge-page allocations used by most conventional operating systems and architectures. With minor extensions to the page-table-walker structures to support these layers and aid their cacheability, we can reduce memory accesses per walk by 27%, or by 56% for virtualised systems, without significant memory overhead.
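The layer sizes and walk lengths quoted above follow from simple arithmetic, which the sketch below works through. It is an illustration only, not the paper's implementation: it assumes 8-byte page-table entries and a four-level layout with a 12-bit page offset (as on x86-64), and every function name in it is hypothetical.

```python
PTE_BYTES = 8      # assumption: 64-bit page-table entries
PAGE_SHIFT = 12    # assumption: 4 KiB base pages, so a 12-bit page offset


def table_bytes(index_bits: int) -> int:
    """Size in bytes of one layer that resolves `index_bits` address bits."""
    return (1 << index_bits) * PTE_BYTES


def layer_index(vaddr: int, shift: int, index_bits: int) -> int:
    """Extract the `index_bits`-wide table index that starts at bit `shift`."""
    return (vaddr >> shift) & ((1 << index_bits) - 1)


def nested_walk_accesses(guest_levels: int, host_levels: int) -> int:
    """Worst-case memory accesses for a two-dimensional (virtualised) walk.

    Each of the guest's table pointers, plus the final guest-physical
    address, needs a full host walk, and each guest level adds one read
    of the guest entry itself.
    """
    return (guest_levels + 1) * (host_levels + 1) - 1


# A 9-bit layer fills one 4 KiB frame; an 18-bit layer fills one 2 MiB huge page.
assert table_bytes(9) == 4 * 1024
assert table_bytes(18) == 2 * 1024 * 1024

# Four levels in both guest and host give the 24-access worst case.
assert nested_walk_accesses(4, 4) == 24

# One 18-bit lookup selects the same entry as two consecutive 9-bit lookups.
vaddr = 0x7F12_3456_7000  # arbitrary example address
upper = layer_index(vaddr, PAGE_SHIFT + 9, 9)
lower = layer_index(vaddr, PAGE_SHIFT, 9)
assert layer_index(vaddr, PAGE_SHIFT, 18) == (upper << 9) | lower
```

The last assertion is the core of the idea: the 18-bit index is the concatenation of the two 9-bit indices it replaces, so one access to the dense layer stands in for two accesses to the sparse ones. The 27% and 56% reductions reported above are the paper's results and are not derived by this sketch.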