Modifying the Acheron VM Instruction Set

The instruction set and related documentation are defined with an assortment of ca65 macros.  These macro invocations must be included in the main body of acheron.asm, between the prologue and epilogue processing that defines the mechanics of the system.

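Declaration macros: OP and OPE (Instruction Authoring), OP_CATEGORY and OP_CATDOC (Categories), PSEUDO (Pseudo Instructions), NATIVE (Native Routines), ZPVAR (Zeropage Allocation).

Additional information: Instruction encoding, Parameter Encodings, Documentation Strings.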

Some low-level options for dispatch and encoding are present in options.inc, and are commented there.


Instruction encoding

Each instruction has an opcode encoding (either unembedded or embedded), and one of many parameter encodings.  Adding new parameter encodings is easily supported.

Unembedded instructions are similar to most 8-bit architectures.  The first byte is the opcode, and the following bytes (if any) represent parameters.

The add unembedded instruction, with rDrA parameter encoding:

 Byte 1         Byte 2
+------------+ +------+------+
| Opcode     | |  rD  |  rA  |
+------------+ +------+------+

Unembedded opcodes dispatch very quickly, though often some time must be taken within the instruction implementation to isolate the various 4-bit parameters.  Unembedded instructions that take zero or only byte-size parameters execute very quickly, and might be only a few times slower than equivalent native machine code in practice.
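
As a rough illustration of the two-byte layout above, the following Python sketch packs an rDrA instruction the way the assembler output would look.  The opcode value $10 is a made-up placeholder (real opcodes are allocated by the build), and encode_rdra is a hypothetical helper name, not part of Acheron.

```python
def encode_rdra(opcode, rd, ra):
    """Pack an unembedded rDrA instruction into its two bytes.

    opcode is a placeholder value; rd and ra are 4-bit register numbers.
    """
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in one byte")
    if not (0 <= rd <= 0xF and 0 <= ra <= 0xF):
        raise ValueError("register numbers must fit in 4 bits")
    # Byte 1 is the whole opcode; byte 2 holds rD in the high nybble
    # and rA in the low nybble.
    return bytes([opcode, (rd << 4) | ra])


# Hypothetical example: opcode $10 with rD = 3, rA = 5.
print(encode_rdra(0x10, 3, 5).hex())  # 1035
```

The instruction implementation has to undo the packing of the second byte at run time, which is the extra work mentioned above.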

Embedded instructions use a 4-bit opcode with a parameter embedded as the lower 4 bits of the opcode byte itself.  This is a compact representation, but consumes 16 of the possible 256 opcodes for a single instruction.  These instructions may also have further parameters; the parameters then total an odd number of nybbles, with the odd one "spilling over" into the opcode byte.

The set16 embedded instruction, with rDimm16 parameter encoding:

 Byte 1          Byte 2       Byte 3
+------+------+ +----------+ +-----------+
|Opcode|  rD  | |         imm16          |
+------+------+ +----------+ +-----------+
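
The embedded layout can be sketched the same way in Python.  This is only an illustration under stated assumptions: the base opcode $B0 is a made-up placeholder, encode_rdimm16 is a hypothetical helper name, and the low-byte-first order of imm16 is assumed from the usual 6502 convention rather than stated in this document.

```python
def encode_rdimm16(opcode_base, rd, imm16):
    """Pack an embedded rDimm16 instruction into its three bytes.

    opcode_base is a placeholder whose low nybble must be zero, because
    the register number rd is added into the opcode byte itself.
    """
    if not 0 <= opcode_base <= 0xF0 or opcode_base & 0x0F:
        raise ValueError("embedded opcode base must be $00-$F0, low nybble 0")
    if not 0 <= rd <= 0xF:
        raise ValueError("register number must fit in 4 bits")
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("immediate must fit in 16 bits")
    # ASSUMPTION: imm16 is stored low byte first.
    return bytes([opcode_base + rd, imm16 & 0xFF, imm16 >> 8])


# Hypothetical example: base opcode $B0, rD = 2, imm16 = $1234.
print(encode_rdimm16(0xB0, 2, 0x1234).hex())  # b23412
```

Because rd occupies the low nybble of the opcode byte, one embedded instruction claims all 16 opcode values from its base upward.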

If an instruction has an odd number of 4-bit parameters, such as a single register parameter, there are several potential ways of encoding it:

  1. Embedded instruction, taking up 16 opcodes of the 256 available.
  2. Unembedded instruction, wasting the other 4 bits of a parameter byte.
  3. Use rP instead of an explicit parameter, which may be less convenient.
  4. Add another parameter to round out the parameter byte usage.

The pushregs/popregs instructions are an example of #4.  Given the register stack, I did not expect CPU stack operations to be commonly used.  I did not want to waste 32 opcodes on having "push rD" and "pop rD" embedded instructions, or even 17 with "push rP" and "pop rD".  By considering what an additional parameter could offer, in this case the ability to push a range of registers with a single instruction, the instruction becomes more powerful and more useful for the overall case to which it applies.

When developing software-emulated instruction sets, adding more complexity to a single instruction has the potential to make the system run faster.  The alternative would be to dispatch and execute multiple independent software-emulated instructions, each of which must save and complete its own working state, without the interplay that a single, more complex instruction can exploit in native code.  However, since more specific and complex instructions eat away opcode space and add a larger footprint to the interpreter, care must be taken that they are useful as general tools and don't become mostly unused dead weight.  Use custom instructions to legitimately compact the size of the interpreted programs and/or speed-optimize very common cases, and drop to inline native code otherwise.


Instruction Authoring

Instructions are, for the most part, completely self-contained.  Here is the complete, standalone implementation of one of the simplest instructions in the system:

OP dropc, none, bits, "Discard the most recent carry bit."
asl cstack
jmp mainLoop1

Removing these 3 lines of source would cleanly remove all traces of the instruction from the VM, freeing up 1 opcode for use elsewhere.

Declaration

The OP macro (OPE for embedded instructions) takes care of everything: It allocates an opcode, registers a vector in the dispatch table, outputs the instruction's syntax and encoding in the include file, and generates the instruction's documentation.

OP[E] <name>, <parameter-encoding>, <category>, <documentation>

The OP macro does not place any bytes into the code stream, so instructions can safely chain into each other:

OP hibyte, none, bits, "rP := upper byte of rP."
lda 1,x ; move high byte to low byte
sta 0,x
OP lobyte, none, bits, "rP := lower byte of rP."
lda #0 ; clear high byte
sta 1,x
bcc mainLoop1 ; carry is still clear, so this always branches; close enough to shave a byte off versus using a jmp

Contract

On entry:

.X      zp address of rD for embedded instructions,
        zp address of rP for unembedded instructions
.Y      ready to read the next parameter byte with lda (iptr),y, so it's usually 1
.A      undefined
Carry   clear
rP      for embedded instructions, already set to rD

In the body:

Three parameter decoding macros are included to help extract 4-bit parameters from the bytes following the opcode.  .Y must already point to the next byte in the instruction; these helpers already include the lda (iptr),y instruction.  The carry bit remains clear after these decoding helpers.  rP is automatically set to the newly decoded rD.

Macro           Byte contents       Effect
decode_rdra     (rD << 4) + rA      .X = zp pointer to rD
                                    .Y = zp pointer to rA
decode_rdimm4   (rD << 4) + imm4    .X = zp pointer to rD
                                    .Y = imm4
decode_rd       rD << 2             .A = zp pointer to rD
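
The byte contents for decode_rdra and decode_rdimm4 can be read as in the Python sketch below.  It only models splitting the parameter byte into its two nybbles; the real macros go further and leave zeropage pointers in the registers, which is not modelled here, and decode_rd is not covered.  split_param_byte is a hypothetical name used for illustration.

```python
def split_param_byte(param_byte):
    """Split a packed parameter byte into (high nybble, low nybble).

    For decode_rdra the result is (rD, rA); for decode_rdimm4 it is
    (rD, imm4).  Both share the same byte layout.
    """
    if not 0 <= param_byte <= 0xFF:
        raise ValueError("parameter must be a single byte")
    return param_byte >> 4, param_byte & 0x0F


# A parameter byte of $35 carries rD = 3 and rA (or imm4) = 5.
print(split_param_byte(0x35))  # (3, 5)
```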

3 temporary bytes are available during instruction processing, at 'zptemp'.

On exit:

Jump to one of the following routines to increment iptr to the next opcode.  These are sorted in order from fastest to slowest.

Cycles  Routine      Effect
 0      mainLoop0    iptr already points to the next instruction; common after branches, returns, etc.
 8      mainLoopAc   Increment iptr by .A, assuming the carry is already clear.  Useful after tya.
 8      mainLoop1    Increment iptr by 1.
10      mainLoop2c   Increment iptr by 2, assuming the carry is already clear.
12      mainLoop2    Increment iptr by 2.
13      mainLoopYc   Increment iptr by .Y, assuming the carry is already clear.
14      mainLoop3    Increment iptr by 3.
15      mainLoopY    Increment iptr by .Y.  Useful for variable-length instructions, or if .Y has been incremented.
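
One way to read the table is as a lookup from instruction length to exit routine.  The Python sketch below is only an aid for reasoning about the choice; pick_exit is a hypothetical helper, the cycle counts are copied from the table, and mainLoopAc is left out.  It does not search for the cheapest exit in every situation (for example, mainLoopYc is listed as cheaper than mainLoop3 when .Y already holds the length and the carry is clear).

```python
# (routine name, cycles) pairs copied from the table above.
FIXED_EXITS = {
    0: ("mainLoop0", 0),
    1: ("mainLoop1", 8),
    2: ("mainLoop2", 12),
    3: ("mainLoop3", 14),
}
CARRY_CLEAR_EXITS = {
    2: ("mainLoop2c", 10),
}


def pick_exit(advance, carry_clear=False):
    """Suggest an exit routine for advancing iptr by `advance` bytes."""
    if advance < 0:
        raise ValueError("advance must not be negative")
    if carry_clear and advance in CARRY_CLEAR_EXITS:
        return CARRY_CLEAR_EXITS[advance]
    if advance in FIXED_EXITS:
        return FIXED_EXITS[advance]
    # Other lengths: the instruction would need the length in .Y.
    return ("mainLoopYc", 13) if carry_clear else ("mainLoopY", 15)


print(pick_exit(2, carry_clear=True))  # ('mainLoop2c', 10)
print(pick_exit(5))                    # ('mainLoopY', 15)
```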

Branch instructions are coded right next to the mainLoop functions, so native branch instructions can exit directly back to the main loop, instead of needing to use JMP.


Parameter Encodings

The param-encodings.inc file holds the ca65 macros that define how many parameters are accepted by an instruction, and how they are encoded into bytes.  Encodings for embedded and unembedded instructions are in two separate sections.  Each encoding takes 2 lines, one to test the parameter and the other to define it:

...
.elseif .xmatch(args, imm8)
EMIT u, name, category, desc, "imm8", ", imm8"
.elseif .xmatch(args, rdimm4)
EMIT u, name, category, desc, "rD, imm4", ", (((rD)<<4) + (imm4))"
...

The .xmatch and the last 2 string parameters of EMIT are what's important here; the rest is boilerplate for the section it happens to reside in.

Parameter encoding names are not quoted, and thus need to be legal symbol names.  The first string is the text of the input parameter list; the second string is the output to be added to a ca65 .byte command.  The opcode byte immediately precedes this output, so embedded instructions start their output with "+ ..." to modify the opcode byte itself, while unembedded instructions start with ", ..." to define the next bytes.

The addp8 and addic instructions use the parameter encodings sampled above.  This example of a generated include file shows how the output is correlated:

[[from ops-addsub.asm]]
OP addp8, imm8, math, "rP := rP + imm8"
OP addic, rdimm4, math, "rD := rD + imm4 + carry"

[[generated in acheron.inc, opcodes are arbitrary]]
.define addp8 (imm8)             .byte $2c, imm8
.define addic (rD, imm4)         .byte $04, (((rD)<<4) + (imm4))

Here is an example of an embedded parameter encoding.  Note that the opcode byte is modified by one of the parameters:

OPE set8, rdimm8, regs, "rD := imm8"

[[param-encodings.inc]]
.elseif .xmatch(args,rdimm8)
  EMIT e, name, category, desc, "rD, imm8", " + (rD), imm8"

[[generated in acheron.inc]]
.define set8 (rD, imm8)          .byte $a0 + (rD), imm8
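
The way the two EMIT strings end up in the generated line can be mimicked with a small Python sketch.  emit_define is a hypothetical stand-in for the string assembly only; it is not how the ca65 macro is written, and it does not reproduce the column alignment of the real output.

```python
def emit_define(name, opcode, params, encoding):
    """Build the .define line for one instruction.

    params   -- text of the input parameter list, e.g. "rD, imm8"
    encoding -- text appended right after the opcode byte; starts with
                ", " for unembedded or " + " for embedded encodings
    """
    return f".define {name} ({params}) .byte ${opcode:02x}{encoding}"


# Opcode values are arbitrary, as in the examples above.
print(emit_define("addp8", 0x2C, "imm8", ", imm8"))
# .define addp8 (imm8) .byte $2c, imm8
print(emit_define("set8", 0xA0, "rD, imm8", " + (rD), imm8"))
# .define set8 (rD, imm8) .byte $a0 + (rD), imm8
```

Because the encoding text is pasted directly after the opcode, a leading " + " folds the parameter into the opcode byte, while a leading ", " starts a new byte.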

Parameter encodings are the one place where instruction implementations might not be completely self-contained, because they might require an encoding not already included in the system.  Fortunately, all that's required to support it is the addition of 2 lines to param-encodings.inc.

Parameter encodings have no memory overhead in the 6502 code; they purely exist within the workings of the assembler.


Categories

All the other declarations have category fields, which are used to organize the instruction set documentation into groups.  Categories themselves should be declared, and can be given additional documentation.

OP_CATEGORY <category>, <title text> [, <header text>]
OP_CATDOC <category>, <paragraph text>

Each category has an optional header paragraph which appears above any declared content (instructions, pseudo-ops, etc), and zero or more OP_CATDOC paragraphs which appear in the order that they were found in the code, below the declared content.

Categories do not need to be declared before use.  If any declaration references an unknown category, an empty category will be created in the documentation to house it.


Documentation Strings

All documentation strings are in HTML format.  Unfortunately, ca65 syntax does not allow embedding newlines or double-quotes in string literals, has no line continuation marker, and has a fairly short maximum length for strings, meaning we have to jump through a few hoops.

Curly braces group a list of strings to be concatenated.  Within the list, the EOL preprocessor define stands in for a newline, and a double-quote character surrounded by single quotes stands in for a double-quote:

OP_CATDOC sample, {"This is a ",'"',"long",'"'," single string.","<pre>line1",EOL,"line2</pre>"}

Pseudo Instructions

Rather than having to choose between instructions for different parameter sizes like clr, set8, or set16, it is much easier to set up a single pseudo instruction, set, which automatically chooses which of the 3 to use based on the range of the immediate parameter.  ca65 macros are wrapped to accomplish this.

PSEUDO <name>, <parameter string>, <category>, <documentation>, <macro body>

Pseudo ops directly contain a plain macro body in string format, meaning it all has to fit on one line.  Therefore, it is easier to write out and test the macro directly, then fold the body into a PSEUDO declaration.

Raw macro for testing:

.macro set reg, imm
 .if (imm > 255) .or (imm < 0)
   set16 reg, imm
 .elseif (imm > 0)
   set8 reg, imm
 .else
   clr reg
 .endif
.endmacro

Edited into final declaration:

PSEUDO set, "reg, imm", regs, "Becomes clr, set8, or set16.", {" .if (imm > 255) .or (imm < 0)",EOL," set16 reg, imm",EOL," .elseif (imm > 0)",EOL," set8 reg, imm",EOL," .else",EOL," clr reg",EOL," .endif"} 

Note that neither the .macro nor .endmacro lines are part of the included body; the PSEUDO macro automatically adds those.
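
For checking the boundary cases, the selection logic of the set macro can be mirrored in Python.  choose_set is a hypothetical helper that follows the same three conditions as the macro body above; it only names the instruction that would be emitted.

```python
def choose_set(imm):
    """Return the instruction the `set` pseudo op would expand to."""
    if imm > 255 or imm < 0:
        return "set16"   # does not fit in an unsigned byte
    elif imm > 0:
        return "set8"    # 1..255
    else:
        return "clr"     # exactly 0


for value in (0, 1, 255, 256, -1):
    print(value, choose_set(value))
```

Note that negative values fall into the set16 case, and only an immediate of exactly 0 becomes clr.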


Native Routines

Some instructions or features include helper routines that are intended to be called from native code.  These can be declared to the system to be exported and documented in their category as well:

NATIVE <name>, <category>, <call type>, <documentation> [, <nolabel>]

The call type is typically either jsr or jmp to indicate how it should be entered, but anything listed in there will be printed before the name in the documentation.

If anything is present in the nolabel field (typically the literal symbol 'nolabel' is used there), then the name will be assumed to already exist, and the declaration will just add it to the documentation and exports.

NATIVE acheron, flow, jsr, "Enter Acheron mode, interpreting bytecodes immediately after the JSR instruction."

Zeropage Allocation

Zeropage state which is intended to be long-lived with the VM process needs to be declared, so it is allocated properly at the top of the ZPSTACK area.

ZPVAR <name>, <category>, <size>, <documentation> [, <nolabel>]

If anything is present in the nolabel field (typically the literal symbol 'nolabel' is used there), then the name will be assumed to already exist, and the declaration will just add it to the documentation and exports.

Size is the number of bytes to allocate.  If nolabel is given, then this is just for reference and no allocation takes place.  A size of 0 is legal, and is used to mark a specific point in memory.

ZPVAR zpTop, regs, 0, "Highest zeropage memory location in use, for copying out process context.", nolabel
ZPVAR iptr, flow, 2, "Instruction pointer. Always points to the beginning of an instruction, with (zp),y addressing reading the parameters."