Indices, not Pointers

Jul 15, 2025

There is a pattern I’ve learned while using Zig which I’ve never seen used in any other language. It’s an extremely simple trick which - when applied to a data structure - reduces memory usage, reduces memory allocations, speeds up accesses, makes freeing instantaneous, and generally makes everything much, much faster. The trick is to use indices, not pointers.

This is something I learned from a talk by Andrew Kelley (Zig’s creator) on data-oriented design. It’s used in Zig’s compiler to make very memory-efficient ASTs, and can be applied to pretty much any node-based data structure, usually trees.

So what does this mean exactly? Well, to use indices means to store the nodes of the data structure in a dynamic array, appending new nodes instead of individually allocating them. Nodes can then reference each other via indices instead of pointers.

A comparison of memory layouts with different storage methods

Pretty simple, right? But this strategy has some major performance benefits.

Smaller Nodes

A pointer costs 8 bytes to store on a modern 64-bit system, but unless your planning on storing over 4 billion nodes in memory, an index can be stored in just 4 bytes.

Faster Access

Due to the reduced node size and the fact that nodes are stored contiguously in memory, the data structure will fit into fewer memory pages and more nodes will fit in the cpu’s cache line, which generally improves access times significantly.

Less Allocation Overhead

The way most people learn to implement data structures like trees is to make a separate allocation for each individual node, one at a time. This is a very naive way of allocating memory, however, as each memory allocation comes with a small but significant overhead which can really slow things down for a large number of nodes. Storing nodes in a growable arraylist minimizes this overhead as arraylists grow superlinearly (e.g, doubling in size each time more space is needed) meaning the majority of new nodes can just be placed in the next available slot without requesting more memory!

An arraylist growing by moving elements to a bigger allocation

Instant Frees

Freeing structures which are allocated in the traditional “nest of pointers” fashion can be very slow, as the entire structure has to be traversed to find and individually free each node. Storing nodes in a single allocation eliminates this problem entirely and freeing the structure becomes just a single free call, as it should be.

A Downside - Freeing Single Nodes

One disadvantage of storing all the nodes in a contiguous buffer is that it makes it harder to free an individual node as removing a single element from an arraylist would involve shifting over all the elements after it, a linear time operation which is almost always too slow to be practical. In practice this isn’t something you normally need to do as many data structures, like an AST, can be freed all at once, but if you need to be able to free individual nodes and still want to use this technique then the obvious solution would be to use a freelist.

Freelists

A freelist is, as the name suggests, a list used to track free slots in memory allocators. In our case we can simply use a stack to store indices of free slots in our arraylist and attempt to pop off this stack any time we add a new element. The extra code complexity should be weighed against the actual performance benefit when considering this approach.

Code Example

Here is a short demo of this technique in Zig (v0.14.1). There are some Zig quirks involved like passing memory allocators and using an enum as an index type but hopefully the general idea is clear.

pub fn main() !void {
    var debug_allocator = std.heap.DebugAllocator(.{}).init;
    defer _ = debug_allocator.deinit();

    var tree = Tree{
        // Zig uses a memory allocator interface to allow us to pass in an allocation strategy for the arraylist to use.
        .nodes = ArrayList(Tree.Node).init(debug_allocator.allocator()),
    };
    defer tree.nodes.deinit();

    // append the root node.
    const root = try tree.createNode(45);

    const a = try tree.createNode(-10);
    const b = try tree.createNode(89000);
    const c = try tree.createNode(2);

    tree.setLeftChild(root, a);
    tree.setRightChild(root, b);
    tree.setLeftChild(b, c);

    printTree(&tree);
}

const Tree = struct {
    /// Stores all the nodes in the tree. The root node is at index 0.
    nodes: ArrayList(Node),

    const Node = struct {
        data: i32,
        left_child: NodeIndex = .none,
        right_child: NodeIndex = .none,
    };

    // In Zig it is common to use a non-exhaustive enum instead of a bare integer for indices
    // to add back some of the type safety which is lost since we're not using pointers.
    const NodeIndex = enum(u32) {
        // The root nodes is stored at index 0, so 0 can be used as a null-value for child indices.
        none = 0,
        _,
    };

    fn createNode(tree: *Tree, value: i32) std.mem.Allocator.Error!NodeIndex {
        const index: NodeIndex = @enumFromInt(@as(u32, @intCast(tree.nodes.items.len)));
        try tree.nodes.append(.{ .data = value });
        return index;
    }

    fn setLeftChild(tree: *const Tree, parent: NodeIndex, child: NodeIndex) void {
        tree.nodes.items[@intFromEnum(parent)].left_child = child;
    }

    fn setRightChild(tree: *const Tree, parent: NodeIndex, child: NodeIndex) void {
        tree.nodes.items[@intFromEnum(parent)].right_child = child;
    }
};

fn printTree(tree: *const Tree) void {
    assert(tree.nodes.items.len > 0);

    // print the root node.
    printNode(tree, @enumFromInt(0), 0);
}

fn printNode(tree: *const Tree, node_index: Tree.NodeIndex, depth: u32) void {
    const node = tree.nodes.items[@intFromEnum(node_index)];

    for (0..depth) |_| print("  ", .{});
    print("[{d}] {d}\n", .{ @intFromEnum(node_index), node.data });

    if (node.left_child != .none) printNode(tree, node.left_child, depth + 1);
    if (node.right_child != .none) printNode(tree, node.right_child, depth + 1);
}

const std = @import("std");
const ArrayList = std.ArrayList;
const assert = std.debug.assert;
const print = std.debug.print;

And here is the output:

$ zig run indices.zig
[0] 45
  [1] -10
  [2] 89000
    [3] 2