
A simple CLI tool to execute and debug LMC (assembly-like) programs. The Little Man Computer (LMC) is a simplified model of a von Neumann processor. Its simplicity makes it ideal for learning about low-level programming and machine architecture.

LMC Architecture

One must imagine a “little man” within a room with 100 mailboxes (the memory), a work bench (the accumulator), a program counter (which starts at 0) and two trays for input and output. The mailboxes are numbered 0-99 (the addresses) and each stores a three-digit number between 0 and 999 (the data). Note that the data within any mailbox can be either raw data or an instruction (as per the von Neumann architecture: data is indistinguishable from instructions).

There’s also a list of “opcodes” that translate a piece of data into an action. For example, 901 may mean “fetch the next value from the input tray and put it in the accumulator”, and 123 may mean “fetch whatever is in address 23 and add it to what is already in the accumulator”. Each instruction contains two fields: an opcode (indicating the operation to perform, usually the first digit) and an address field (indicating where to find the data to operate on, usually the last two digits).
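As a minimal sketch of the two fields, assuming the three-digit layout described above (first digit is the opcode, last two digits are the address), an instruction can be split with integer division. The function name `decode` is illustrative and not taken from this tool.

```python
def decode(instruction: int) -> tuple[int, int]:
    """Split a three-digit LMC instruction into (opcode, address)."""
    if not 0 <= instruction <= 999:
        raise ValueError("an LMC mailbox holds a value from 0 to 999")
    # Hundreds digit is the opcode, the remaining two digits are the address.
    return divmod(instruction, 100)


print(decode(123))  # (1, 23): opcode 1, address 23
print(decode(901))  # (9, 1): opcode 9, the address field selects the input tray
```

Note that for input and output the address field does not name a mailbox; in the example above it only distinguishes which tray is meant.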

The little man does these simple repetitive tasks in a loop:

1. Check the content of the mailbox whose address equals the counter.
2. Increment the counter by 1 (so that it contains the mailbox number of the next instruction).
3. Decode the instruction. If the instruction uses data stored in another mailbox, use the address field to find the mailbox number for the data it will work on, e.g. “get data from mailbox 42” or “get data from input”.
4. Fetch the data (from the input, the accumulator, or the mailbox with the address determined in step 3).
5. Execute the instruction based on the opcode given.
6. Branch or store the result (in the output, the accumulator, or the mailbox with the address determined in step 3).
7. Return to the counter to repeat the cycle, or halt.
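The cycle above can be sketched in a few lines of Python. This is an illustration, not this tool's implementation: it assumes the commonly used LMC numbering (1xx add, 2xx subtract, 3xx store, 5xx load, 6xx branch, 7xx branch if zero, 8xx branch if zero or positive, 901 input, 902 output, 000 halt), which may differ from the opcodes this tool accepts. It also ignores the three-digit limit on the accumulator.

```python
def run(program, inputs):
    """Run an LMC program (a list of integers) and return the output tray."""
    memory = list(program) + [0] * (100 - len(program))  # 100 mailboxes
    accumulator = 0
    counter = 0
    in_tray = list(inputs)
    out_tray = []

    while True:
        instruction = memory[counter]               # check the mailbox at the counter
        counter += 1                                # point at the next instruction
        opcode, address = divmod(instruction, 100)  # decode into opcode and address

        if instruction == 0:                        # halt
            return out_tray
        elif opcode == 1:                           # add mailbox to accumulator
            accumulator += memory[address]
        elif opcode == 2:                           # subtract mailbox from accumulator
            accumulator -= memory[address]
        elif opcode == 3:                           # store accumulator in mailbox
            memory[address] = accumulator
        elif opcode == 5:                           # load mailbox into accumulator
            accumulator = memory[address]
        elif opcode == 6:                           # branch always
            counter = address
        elif opcode == 7:                           # branch if accumulator is zero
            if accumulator == 0:
                counter = address
        elif opcode == 8:                           # branch if accumulator >= 0
            if accumulator >= 0:
                counter = address
        elif instruction == 901:                    # input tray -> accumulator
            accumulator = in_tray.pop(0)
        elif instruction == 902:                    # accumulator -> output tray
            out_tray.append(accumulator)
        else:
            raise ValueError(f"unknown instruction {instruction}")


# Read two numbers, add them, and output the sum.
add_two = [901, 399, 901, 199, 902, 0]
print(run(add_two, [2, 3]))  # [5]
```

The example program stores the first input in mailbox 99, reads the second input into the accumulator, adds mailbox 99 to it, and outputs the result before halting.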

Real processors

Registers

LMC has only one register, the accumulator. This rules out any real efficiency, as every operation that needs more than one piece of data has to constantly load from and store to memory. In reality, CPUs have multiple registers: x86-64 chips, for example, have 16 general-purpose registers, each 64 bits wide.

CPUs also work with binary numbers rather than base 10, so the size of addresses and opcodes is bounded by a number of bits (e.g. 32 bits or 64 bits) rather than a number of decimal digits.
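To make the comparison concrete, the number of distinct addresses follows directly from the base and the number of digits. The helper below is a hypothetical illustration of that arithmetic, not part of this tool.

```python
def addressable_locations(digits: int, base: int) -> int:
    """Number of distinct addresses expressible with `digits` digits in `base`."""
    return base ** digits


print(addressable_locations(2, 10))  # 100: the LMC's two-digit decimal address field
print(addressable_locations(32, 2))  # 4294967296: a 32-bit binary address
```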

Memory

Modern CPUs come with many types of “mailboxes”, or address spaces. The most obvious memory space is the RAM, which can be many gigabytes in size, but accessing it is slow from the CPU's point of view, in part because the two are physically far apart.

Because of this, CPUs now have a local memory called “cache” that lives inside the CPU itself and makes for a much faster memory space. The cache holds data that was recently used, as it is likely to be used again soon. Nowadays we can find multiple layers of cache in a single CPU (L1, L2, L3) with different sizes and speeds. There is naturally an inverse relationship between size and speed, so caches are limited to megabytes or even kilobytes.
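The "recently used" idea can be sketched with a small least-recently-used (LRU) cache placed in front of a slower memory. This is a simplified model under stated assumptions: the class name `TinyCache` is hypothetical, the capacity must be at least 1, and real hardware caches work on fixed-size lines and typically only approximate LRU.

```python
from collections import OrderedDict


class TinyCache:
    """A toy LRU cache in front of a slower memory (a plain list here)."""

    def __init__(self, capacity: int):
        self.capacity = capacity      # assumed to be >= 1
        self.lines = OrderedDict()    # address -> value, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, memory, address):
        if address in self.lines:
            self.hits += 1
            self.lines.move_to_end(address)     # mark as most recently used
        else:
            self.misses += 1
            self.lines[address] = memory[address]  # slow fetch from memory
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict the least recently used
        return self.lines[address]


memory = list(range(100))
cache = TinyCache(capacity=2)
for address in [10, 11, 10, 12, 11]:
    cache.read(memory, address)
print(cache.hits, cache.misses)  # 1 4
```

Only the second read of address 10 is a hit; by the time address 11 is read again it has already been evicted to make room for address 12, which shows how a small cache trades capacity for speed.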

ALU & opcodes

The list of valid opcodes is constantly growing in modern CPUs. This is because CPUs are made out of hardware, and when engineers find a better circuit to calculate, for example, a multiplication of 3x3 matrices, they “hard-code” it into an opcode.

This is also why processing is split across specialized units, such as the Arithmetic & Logic Unit (ALU), the Floating Point Unit (FPU), or Single Instruction, Multiple Data (SIMD) units for vectors. The resulting design can be very complex, as not every instruction is executed by the same circuit.

Many little men

Having one little man do both the fetching and the calculating is obviously a bottleneck. As with the ALU, modern CPUs separate these essential tasks into different processing units, both to make each circuit more efficient at its own task and to allow more parallel processing.
