Abdul Rahman Sibahi | Writing a C Compiler, Chapter 4, in Zig

Three chapters out of who knows from Writing a C Compiler. Time onto Chapter 4. Chapter 4 is about, let me check, "logical and relational operators". Great. More operators.

Lexer

You would expect the lexer by now to be a boring done deal, and you would be right. However, this is slightly more interesting because I get to add more states to the state machine!

The tokens for today are the following: !, &&, ||, ==, !=, <, >, <=, and >=. As I am not supporting bitshift operations, I will consider those tokens (specifically & and |) a failure state. The tests will not have them, either way.

Adding the token tags is straightforward. The State enum however, should have a new state for every partial application. This is the new State:

const State = enum {
    start,
    identifier,
    int,
    hyphen,

    // new
    bang,
    ambersand,
    pipe,
    equals,
    lesser_than,
    greater_than,
};

The character in which the tokenizer enters these states is, I think, self explanatory. This, for example, is pipe, and lesser_than.

.start => switch (self.buffer[self.index]) {
    // snip --
    '|' => continue :state .pipe,
    '<' => continue :state .lesser_than,
    // snip --
],
.pipe => {
    self.index += 1;
    switch (self.buffer[self.index]) {
        '|' => {
            self.index += 1;
            result.tag = .double_pipe;
        },
        else => result.tag = .invalid,
    }
},
.lesser_than => {
    self.index += 1;
    switch (self.buffer[self.index]) {
        '=' => {
            self.index += 1;
            result.tag = .lesser_equals;
        },
        else => result.tag = .lesser_then,
    }
},

Parser

The parser is a relatively simpler affair. More binary operations with different precedence levels. None of them are right associative, so nothing but updating the AST and the various switch statements. Nothing much to write about.

Unit Tests

To spice up this section, I have decided to take on writing unit tests for the parser. Zig has first class support for unit tests. Make a block titled test and the compiler sees it. The only problem is hooking it in the build system.

Back in chapter 2, I think, I rejigged the zig build test command to run the Book's test suite. As in, it just compiles and installs the binary as normal, then runs a shell command. Zig's built in testing support requires a different kind of wiring.

Thankfully, the default output with zig init contains all the necessary pieces, I can reuse those. One question remains is whether to tie them to the suit's test suite, so both run with zig build test, or have it a separate command. I believe having it a separate command is best, to cut down on the, already slow, testing time.

I shall change the existing test command to submit (so I'd type zig build submit), and have the test command for unit tests. For reference, this is the old step after changing the name. (The block is organizational).

{ // `zig build submit` command
    const test_step = b.step("submit", "Run the Book's test suite");

    // subshells. how do they work.
    const inner_command = try std.mem.join(b.allocator, " ", &.{
        "../writing-a-c-compiler-tests/test_compiler",
        b.pathJoin(&.{ b.exe_dir, "paella" }),
        try std.mem.join(
            b.allocator,
            " ",
            b.args orelse &.{""},
        ),
    });

    // does this work like i think it does?
    const test_command = b.addSystemCommand(
        &.{ "arch", "-x86_64", "zsh", "-c", inner_command },
    );

    test_command.step.dependOn(b.getInstallStep());
    test_step.dependOn(&test_command.step);
}

Copying from the default build.zig and trimming unnecessary things, I get this:

{ // `zig build test` command
    const test_step = b.step("test", "Run unit tests");

    const exe_unit_tests = b.addTest(.{
        .root_module = exe_mod,
    });
    const run_exe_unit_tests = b.addRunArtifact(exe_unit_tests);

    test_step.dependOn(&run_exe_unit_tests.step);
}

And .. that should be it? To test whether the test command run, I shall write a test to test whether 2 + 2 equals 4 in main.zig.

test "test the test" {
    try std.testing.expect(2 + 2 == 4);
}

Ok zig build test comes and goes and says nothing. Let's see if it can detect a failure. I shall expect 2 + 2 does not equal 4.

test "test the test" {
    try std.testing.expect(2 + 2 != 4);
}

And it fails. So our tests work! Putting it parser.zig however, does not seem to work. Tests pass no matter what I expect. What gives?

After asking around in the Zig discord, this is a known thing. Zig's compiler is lazy, and does not look at functions not referenced from main.zig. But because the test executable does not run fn main, it does not see any modules that are referenced within main.zig but not referenced in main.zig's tests.

It is dumb as fuck, if you ask me. And honestly I cannot tell if this is a deliberate design decision or just a known bug. The solution is, well, to write a test block in main.zig that references the files in which I'd like to run tests.¹ The most straightforward way to do this is this:

test {
  std.testing.refAllDeclsRecursive(@This());
}

And now the tests in parse.zig pass and fail as expected. This is apparently a hack and not recommended. The recommended way is even more stupid looking, which only imports parser.zig, and I'd have to do it for every file and every sub import.

test {
    _ = parser;
}

Having gone that far, let's actually write our unit test in parse.zig.

test "precedence 1" {
    const t = std.testing;
    const src = "3 * 4 + 5;";
    var tokens = lexer.Tokenizer.init(src);

    var a_a = std.heap.ArenaAllocator.init(t.allocator);
    defer a_a.deinit();
    const a = a_a.allocator();

    const result = try parse_expr(a, &tokens, 0);

    try t.expect(result.* == .binop_add);
    try t.expectFmt("(+ (* 3 4) 5)", "{}", .{result});
}

My first drafts of the test failed because I was leaking memory, hence the adding an arena ceremony. It is a bit more ceremony than I expected, but it is fine. Who needs unit testing anyway? End to end testing is where it is at.

Internal Representation

The IR this chapter is a lot more interesting than usual. Since && and || are short circuting operators, it means now is the time have jumps and labels and gotos and jump-if-zeros. With conditionals, anything is possible.

the new instructions are a lot. Here is the new ir.Instr. You will note that there are no direct instructions for and and or.²


pub const Instr = union(enum) {
    ret: Value,

    copy: Unary,
    jump: [:0]const u8,
    jump_z: JumpIf,
    jump_nz: JumpIf,

    label: [:0]const u8,

    unop_neg: Unary,
    unop_not: Unary,
    unop_lnot: Unary, // <-- new

    binop_add: Binary,
    binop_sub: Binary,
    binop_mul: Binary,
    binop_div: Binary,
    binop_rem: Binary,

    binop_eql: Binary,
    binop_neq: Binary,
    binop_lt: Binary,
    binop_le: Binary,
    binop_gt: Binary,
    binop_ge: Binary,

    pub const Unary = struct {
        src: Value,
        dst: Value,

        pub fn init(src: Value, dst: Value) @This() {
            return .{ .src = src, .dst = dst };
        }
    };

    pub const Binary = struct {
        src1: Value,
        src2: Value,
        dst: Value,

        pub fn init(src1: Value, src2: Value, dst: Value) @This() {
            return .{ .src1 = src1, .src2 = src2, .dst = dst };
        }
    };

    pub const JumpIf = struct {
        cond: Value,
        target: [:0]const u8,

        pub fn init(cond: Value, target: [:0]const u8) @This() {
            return .{ .cond = cond, .target = target };
        }
    };
};

The jump instruction is straight up goto. ³ The two conditional jumps simply evaluate their cond operand, and act accordingly. One of them is enough, but it is apparently simpler to do it this way. These conditional jumps are the backbone of && and ||, and the ternary operator come chapter 6.

The operations other than && and || have a straightforward impl similar to the previous ones. It is && and || that depend on the new instructions. This is the implementation of &&.

.binop_and => |b| {
    const false_label = try bp.make_temporary("false_and");
    const end_label = try bp.make_temporary("end_and");
    const result = try bp.make_temporary("dst_and");

    const src1 = try expr_emit_ir(bp, b.@"0");
    try bp.append(.{ .jump_z = .init(src1, false_label) });
    const src2 = try expr_emit_ir(bp, b.@"1");
    try bp.append(.{ .jump_z = .init(src2, false_label) });
    const dst : ir.Value = .{ .variable = result };
    try bp.append(.{ .copy = .init(.{ .constant = 1 },dst ) });
    try bp.append(.{ .jump = end_label });
    try bp.append(.{ .label = false_label });
    try bp.append(.{ .copy = .init(.{ .constant = 0 }, dst) });
    try bp.append(.{ .label = end_label });

    return dst;
},

Let me walk this one bit by bit, just to make sure I understand it. && short circuits if the first operand is false, or zero.

evaluate src1.
if src1 is zero, the result is zero, and we jump to where result is set to zero.
if not, then we evaluate src2.
if src2 is zero, the result is zero, and we jump to where result is set to zero.
if not, then the result is one, and we jump to where the result is set to one.

For || this would be the following:

evaluate src1.
if src1 is one, the result is one, and we jump to where result is set to one.
if not, then we evaluate src2.
if src2 is one, the result is one, and we jump to where result is set to one.
if not, then the result is zero, and we jump to where the result is set to zero.

Which means this should be our || IR generation.

.binop_or => |b| {
    const true_label = try bp.make_temporary("true_or");
    const end_label = try bp.make_temporary("end_or");
    const result = try bp.make_temporary("dst_or");

    const src1 = try expr_emit_ir(bp, b.@"0");
    try bp.append(.{ .jump_nz = .init(src1, true_label) });
    const src2 = try expr_emit_ir(bp, b.@"1");
    try bp.append(.{ .jump_nz = .init(src2, true_label) });
    const dst : ir.Value = .{ .variable = result };
    try bp.append(.{ .copy = .init(.{ .constant = 0 }, dst) });
    try bp.append(.{ .jump = end_label });
    try bp.append(.{ .label = true_label });
    try bp.append(.{ .copy = .init(.{ .constant = 1 }, dst) });
    try bp.append(.{ .label = end_label });

    return dst;
},

This should work.

This is the C file I am experimenting with this evening:

int main(void) {
    return 0 || 0 && (1 / 0);
}

And this is the generated IR:

PROGRAM
	FUNCTION main
		jnz 0 => true_or.0
		jz  0 => false_and.3
		div.6 <- 1 / 0
		jz  div.6 => false_and.3
		dst_and.5 <- 1
		jump => end_and.4
		=> false_and.3
		dst_and.5 <- 0
		=> end_and.4
		jnz dst_and.5 => true_or.0
		dst_or.2 <- 0
		jump => end_or.1
		=> true_or.0
		dst_or.2 <- 1
		=> end_or.1
		ret dst_or.2

Eh. This looks wrong at first glance, but working through it, it seems to be .. fine? The division by 0 is immediately jumped over to greener pastures and the function returns 0.

The tests pass, which means I have corrected all the typos and exhausted all the switches and there are no panics. Verification of the logic will come later.

Assembly Generation

The Book goes into a lengthy explanation of the "flags" mechanism in c86 assembly, in addition with the first brush with Undefined Behaviour. No point in repeating that here, go read the book. I just want to write code.

The new Assembly AST includes new assembly instructions, as well as cond_codes: the various flags that will be set and unset depending on given comparisons. These are kinda free floating right now. Maybe I will stuff them within the Instr namespace later.

const CondCode = enum { e, ne, g, ge, l, le };

That's it. The rest comes when adding doubles and floats and nasty other stuff. And here are the new instructions:

pub const Instr = union(enum) {
    // snip --
    cmp: Mov,
    jmp: [:0]const u8,
    jmp_cc: struct { CondCode, [:0]const u8 },
    set_cc: struct { CondCode, Operand },
    label: [:0]const u8,
    // snip --
};

cmp just uses the Mov struct, which is really an Operand tuple. The other tuples are not reused enough for me to give them a name (yet?).

Then comes the conversion between the IR and the Assembly.

fn instr_to_asm(
    alloc: std.mem.Allocator,
    instr: ir.Instr,
) ![]assembly.Instr {
    switch (instr) {
        // snip==
        .unop_lnot => |u| {
            const src = value_to_asm(u.src);
            const dst = value_to_asm(u.dst);
            return try alloc.dupe(assembly.Instr, &.{
            .{ .cmp = .init(.{ .imm = 0 }, src) },
            .{ .mov = .init(.{ .imm = 0 }, dst) },
            .{ .set_cc = .{ .e, dst } },
            });
        },
        // snip --
        .binop_eql, .binop_neq, .binop_lt, .binop_le, .binop_gt, .binop_ge => |b| {
            const src1 = value_to_asm(b.src1);
            const src2 = value_to_asm(b.src2);
            const dst = value_to_asm(b.dst);

            const cc: assembly.Instr.CondCode = switch (instr) {
                .binop_eql => .e,
                .binop_neq => .ne,
                .binop_lt => .l,
                .binop_le => .le,
                .binop_gt => .g,
                .binop_ge => .ge,
                else => unreachable,
            };

            return try alloc.dupe(assembly.Instr, &.{
                .{ .cmp = .init(src2, src1) },
                .{ .mov = .init(.{ .imm = 0 }, dst) },
                .{ .set_cc = .{ cc, dst } },
            });
        },
        .label => |s| return try alloc.dupe(assembly.Instr, &.{
            .{ .label = s },
        }),
        .jump => |s| return try alloc.dupe(assembly.Instr, &.{
            .{ .jmp = s },
        }),
        .jump_z, .jump_nz => |j| return try alloc.dupe(assembly.Instr, &.{
            .{ .cmp = .init(.{ .imm = 0 }, value_to_asm(j.cond)) },
            .{ .jmp_cc = .{ if (instr == .jump_z) .e else .ne, j.target } },
        }),
        .copy => |u| return try alloc.dupe(assembly.Instr, &.{
            .{ .mov = .init(
                value_to_asm(u.src),
                value_to_asm(u.dst),
            ) },
        }),
    }
}

The rest of replacing pseudo registers is fairly mechanical and boring and I hate it. Here is the current fixup_instrs, just to see what it has grown into in three short chapters.


pub fn fixup_instrs(
    alloc: std.mem.Allocator,
    prgm: *assembly.Prgm,
) !void {
    var out: std.ArrayListUnmanaged(assembly.Instr) = try .initCapacity(
        alloc,
        prgm.func_def.instrs.capacity,
    );
    defer {
        std.mem.swap(
            std.ArrayListUnmanaged(assembly.Instr),
            &out,
            &prgm.func_def.instrs,
        );
        out.deinit(alloc);
    }

    const State = enum {
        start,
        mov_stack_stack,
        cmp_stack_stack,
        cmp_to_imm,
        add_stack_stack,
        sub_stack_stack,
        mul_to_stack,
        idiv_const,
        legal,
    };

    for (prgm.func_def.instrs.items) |instr| {
        state: switch (State.start) {
            .start => switch (instr) {
                .mov => |m| if (m.src == .stack and m.dst == .stack)
                    continue :state .mov_stack_stack
                else
                    continue :state .legal,
                .cmp => |m| if (m.src == .stack and m.dst == .stack)
                    continue :state .cmp_stack_stack
                else if (m.dst == .imm)
                    continue :state .cmp_to_imm
                else
                    continue :state .legal,
                .add => |m| if (m.src == .stack and m.dst == .stack)
                    continue :state .add_stack_stack
                else
                    continue :state .legal,
                .sub => |m| if (m.src == .stack and m.dst == .stack)
                    continue :state .sub_stack_stack
                else
                    continue :state .legal,
                .mul => |m| if (m.dst == .stack)
                    continue :state .mul_to_stack
                else
                    continue :state .legal,
                .idiv => |o| if (o == .imm)
                    continue :state .idiv_const
                else
                    continue :state .legal,

                else => continue :state .legal,
            },

            .mov_stack_stack => try out.appendSlice(alloc, &.{
                .{ .mov = .init(instr.mov.src, .{ .reg = .R10 }) },
                .{ .mov = .init(.{ .reg = .R10 }, instr.mov.dst) },
            }),
            .cmp_stack_stack => try out.appendSlice(alloc, &.{
                .{ .mov = .init(instr.cmp.src, .{ .reg = .R10 }) },
                .{ .cmp = .init(.{ .reg = .R10 }, instr.cmp.dst) },
            }),
            .cmp_to_imm => try out.appendSlice(alloc, &.{
                .{ .mov = .init(instr.cmp.dst, .{ .reg = .R11 }) },
                .{ .cmp = .init(instr.cmp.src, .{ .reg = .R11 }) },
            }),
            .add_stack_stack => try out.appendSlice(alloc, &.{
                .{ .mov = .init(instr.add.src, .{ .reg = .R10 }) },
                .{ .add = .init(.{ .reg = .R10 }, instr.add.dst) },
            }),
            .sub_stack_stack => try out.appendSlice(alloc, &.{
                .{ .mov = .init(instr.sub.src, .{ .reg = .R10 }) },
                .{ .sub = .init(.{ .reg = .R10 }, instr.sub.dst) },
            }),
            .mul_to_stack => try out.appendSlice(alloc, &.{
                .{ .mov = .init(instr.mul.dst, .{ .reg = .R11 }) },
                .{ .mul = .init(instr.mul.src, .{ .reg = .R11 }) },
                .{ .mov = .init(.{ .reg = .R11 }, instr.mul.dst) },
            }),
            .idiv_const => try out.appendSlice(alloc, &.{
                .{ .mov = .init(instr.idiv, .{ .reg = .R11 }) },
                .{ .idiv = .{ .reg = .R11 } },
            }),

            .legal => try out.append(alloc, instr),
        }
    }
}

I kind of like how it looks, in that there is a clear listing of each illegal state, and a direct transformation from a legal form to the other, and the discovery code is split up from the replacement code. It might even be easier to review? I do not know.

All tests pass. Even the earlier dividng by zero one. If you are curious, here is the generated final assembly.

.globl _main
_main:
	pushq   %rbp
	movq    %rsp, %rbp
	subq    $12, %rsp
	movl    $0, %r11d
	cmpl    $0, %r11d
	jne     .Ltrue_or.0
	movl    $0, %r11d
	cmpl    $0, %r11d
	je      .Lfalse_and.3
	movl    $1, %eax
	cdq
	movl    $0, %r11d
	idivl   %r11d
	movl    %eax, -4(%rsp)
	cmpl    $0, -4(%rsp)
	je      .Lfalse_and.3
	movl    $1, -8(%rsp)
	jmp    .Lend_and.4
.Lfalse_and.3:
	movl    $0, -8(%rsp)
.Lend_and.4:
	cmpl    $0, -8(%rsp)
	jne     .Ltrue_or.0
	movl    $0, -12(%rsp)
	jmp    .Lend_or.1
.Ltrue_or.0:
	movl    $1, -12(%rsp)
.Lend_or.1:
	movl    -12(%rsp), %eax
	movq    %rbp, %rsp
	popq    %rbp
	ret

There is one complication. Even tho the tests passed, while reviewing through the book I realized I am supposed to emit different names for the registers when they are in the set_cc instruction. Since the tests passed I can just .. ignore this? But let's do it.

I am going to abuse the width property (same one I used to get indentation) to pass in the register's width. For reference, this the current emission/printing code for set_cc:

.set_cc => |s| try writer.print("\tset{s:<7}{gen}", .{ @tagName(s.@"0"), s.@"1" }),

Fairly straightforward and avoids special formatting logic for CondCode: just use the tag name! The :_<7 syntax is how you're supposed to use the width property, by passing optional padding.

For the Operand, all the change needed here is to change {gen} to {gen:1}, passing a width of 1 into the downstream formatter.

In the Operand formatter, this is the Register printing code:

.reg => |r| switch (r) {
    .AX => try writer.print("%eax", .{}),
    .DX => try writer.print("%edx", .{}),
    .R10 => try writer.print("%r10d", .{}),
    .R11 => try writer.print("%r11d", .{}),
},

And this is when a conditional for width == 1 is added. I am treating the default value as before.

.reg => |r| if (options.width == 1) switch (r) {
    .AX => try writer.print("%a1", .{}),
    .DX => try writer.print("%d1", .{}),
    .R10 => try writer.print("%r10b", .{}),
    .R11 => try writer.print("%r11b", .{}),
} else switch (r) {
    .AX => try writer.print("%eax", .{}),
    .DX => try writer.print("%edx", .{}),
    .R10 => try writer.print("%r10d", .{}),
    .R11 => try writer.print("%r11d", .{}),
},

All tests continue to pass. So that is nice I guess.

Lessons Learned

Not much really. This is a fairly mechanical chapter. The new info I had already learned when going through the Book the first time. Maybe I learned not to target x86 assembly directly for my own language. Why not target wasm? It does not have goto.