asm

package
v1.0.0
Warning

This package is not in the latest version of its module.

Published: Aug 14, 2016 License: Apache-2.0 Imports: 8 Imported by: 0

Documentation

Overview

Package asm provides utility functions to assemble and disassemble Ngaro VM code.

Supported assembler mnemonics:

TOS is the value on top of the data stack. NOS is the next value on the data stack.
Instructions with a check mark in the "arg" column expect an argument in the cell
following them.

opcode	asm	alias	arg	stack	description
------	---	-----	---	-----	------------------------------------------------------------------------
0	nop				no-op
1	lit		✓	-n	push the value in the following memory location to the data stack.
2	dup			n-nn	duplicate TOS
3	drop			n-	drop TOS
4	swap			xy-yx	swap TOS and NOS
5	push			n-	push TOS to address stack
6	pop			-n	pop value on top of address stack and place it on TOS
7	loop		✓	n-?	decrement TOS. If >0 jump to address in next cell, else drop TOS and do nothing
8	jump	jmp	✓		jump to address in next cell
9	;	ret			return: pop address from address stack, add 1 and jump to it.
10	>jump	jgt	✓	xy-	jump to address in next cell if NOS > TOS
11	<jump	jlt	✓	xy-	jump to address in next cell if NOS < TOS
12	!jump	jne	✓	xy-	jump to address in next cell if NOS != TOS
13	=jump	jeq	✓	xy-	jump to address in next cell if NOS == TOS
14	@			a-n	fetch: get the value at the address on TOS and place it on TOS.
15	!			na-	store: store the value in NOS at address in TOS
16	+	add		xy-z	add NOS to TOS and place result on TOS
17	-	sub		xy-z	subtract TOS from NOS and place result on TOS
18	*	mul		xy-z	multiply NOS with TOS and place result on TOS
19	/mod	div		xy-rq	divide NOS by TOS and place remainder in NOS, quotient in TOS
20	and			xy-z	do a logical and of NOS and TOS and place result on TOS
21	or			xy-z	do a logical or of NOS and TOS and place result on TOS
22	xor			xy-z	do a logical xor of NOS and TOS and place result on TOS
23	<<	shl		xy-z	do a logical left shift of NOS by TOS and place result on TOS
24	>>	asr		xy-z	do an arithmetic right shift of NOS by TOS and place result on TOS
25	0;	0ret		n-?	ZeroExit: if TOS is 0, drop it and do a return, else do nothing
26	1+	inc		n-n	increment TOS
27	1-	dec		n-n	decrement TOS
28	in			p-n	I/O in (see Ngaro VM spec)
29	out			np-	I/O out (see Ngaro VM spec)
30	wait			?-	I/O wait (see Ngaro VM spec)
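
To make the stack notation in the table concrete, here is a minimal sketch of an interpreter for a handful of the opcodes. The Cell type and the run helper are hypothetical stand-ins written for this illustration; they are not the ngaro VM implementation, and the sketch ignores jumps, calls and I/O.

```go
package main

import "fmt"

// Cell is a stand-in for the VM cell type (the real package uses vm.Cell).
type Cell int

// Opcode numbers copied from the table above.
const (
	opNop  Cell = 0
	opLit  Cell = 1
	opDup  Cell = 2
	opDrop Cell = 3
	opSwap Cell = 4
	opAdd  Cell = 16
	opMul  Cell = 18
)

// run executes a straight-line image and returns the final data stack.
// Hypothetical helper: it assumes the program never underflows the stack.
func run(img []Cell) []Cell {
	var data []Cell
	for pc := 0; pc < len(img); pc++ {
		switch img[pc] {
		case opNop:
			// nothing to do
		case opLit:
			pc++ // the argument lives in the cell following the opcode
			if pc >= len(img) {
				return data // unterminated lit: stop quietly
			}
			data = append(data, img[pc])
		case opDup:
			data = append(data, data[len(data)-1])
		case opDrop:
			data = data[:len(data)-1]
		case opSwap:
			n := len(data)
			data[n-1], data[n-2] = data[n-2], data[n-1]
		case opAdd:
			n := len(data)
			data[n-2] += data[n-1] // result replaces NOS, which becomes TOS
			data = data[:n-1]
		case opMul:
			n := len(data)
			data[n-2] *= data[n-1]
			data = data[:n-1]
		}
	}
	return data
}

func main() {
	// lit 6  dup  *  lit 6  +   computes 6*6 + 6
	img := []Cell{opLit, 6, opDup, opMul, opLit, 6, opAdd}
	fmt.Println(run(img)) // prints [42]
}
```

Note how lit occupies two cells: this is what the check mark in the "arg" column means.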

Comments:

Comments are placed between parentheses, i.e. '(' and ')'. The body of the comment must be separated from the enclosing parentheses by a space.

Some valid comments:

( this is a valid comment )
( this is a
  rather long
  multiline comment )

The following are invalid comments:

(this will be seen by the parser as label "(this" and will not work )
( comments may ( not be nested ) here, the parser will complain trying to resolve
  "here," as a label )

Literals and label/const identifiers:

The parser behaves almost like a Forth parser: input is split at white space (space, tab or new line) into tokens. The parser then does the following:

  • If a token can be converted to a Go integer (see strconv.ParseInt), it will be converted to an integer literal.

  • If it is a Go character literal between single quotes, it will be converted to the corresponding integer literal. Watch out with Unicode chars: they will be converted to the proper rune (int32), but they are not natively supported by the VM I/O code.

  • If a token is the name of a defined constant, it will be replaced internally by the constant's value and can be used anywhere an integer literal is expected.

  • Otherwise, name resolution applies:

      • if an instruction is expected, the token is looked up in the assembler mnemonics; if no match is found, it is considered to be a label.

      • if an argument is expected, the token is always considered a label.

You may therefore define unusual labels or constant names (at least for Go programmers) such as "2dup", "(weird" or "end-weird)". Also, more than one instruction may appear on the same line and comments can be placed anywhere between instructions.
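
The token rules above can be sketched in Go as follows. This is only an illustration under stated assumptions: classify, consts and mnemonics are hypothetical names, base 0 and a 64-bit size are assumed for strconv.ParseInt (the text does not specify them), and the real parser may differ in details.

```go
package main

import (
	"fmt"
	"strconv"
)

// classify reports how a token would be treated under the rules above.
// consts and mnemonics are hypothetical lookup tables supplied by the caller;
// wantInstr says whether the parser is expecting an instruction or an argument.
func classify(tok string, wantInstr bool, consts map[string]int, mnemonics map[string]bool) string {
	// Rule 1: anything strconv.ParseInt accepts is an integer literal.
	// Base 0 (assumed) allows decimal, 0x hex and leading-0 octal.
	if _, err := strconv.ParseInt(tok, 0, 64); err == nil {
		return "integer"
	}
	// Rule 2: a Go character literal between single quotes.
	if n := len(tok); n >= 3 && tok[0] == '\'' && tok[n-1] == '\'' {
		_, _, tail, err := strconv.UnquoteChar(tok[1:n-1], '\'')
		if err == nil && tail == "" {
			return "char"
		}
	}
	// Rule 3: a previously defined constant.
	if _, ok := consts[tok]; ok {
		return "constant"
	}
	// Rule 4: name resolution. Mnemonics only match where an
	// instruction is expected; everything else is a label.
	if wantInstr && mnemonics[tok] {
		return "mnemonic"
	}
	return "label"
}

func main() {
	consts := map[string]int{"SOMECONST": 42}
	mnemonics := map[string]bool{"drop": true, "1+": true}

	fmt.Println(classify("0x27", true, consts, mnemonics))  // integer
	fmt.Println(classify("'a'", true, consts, mnemonics))   // char
	fmt.Println(classify("drop", true, consts, mnemonics))  // mnemonic
	fmt.Println(classify("drop", false, consts, mnemonics)) // label
	fmt.Println(classify("2dup", true, consts, mnemonics))  // label
}
```

Because "2dup" fails integer conversion and is not a mnemonic, it falls through to the label case, which is why such names are legal.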

Implicit "lit":

Where the parser is expecting an instruction, integer literals, character literals and constants will be compiled with an implicit "lit":

lit 42
42	( will compile as "lit 42", just like above )
( like ) 'a' ( compiles as ) lit 'a' ( which in fact compiles as ) lit 97

Labels:

Labels are defined by prefixing them with a colon (:) and can be used as an address in any lit, jump or loop instruction (without the ':' prefix). For example:

foo		( forward references are ok. This will be compiled as a call to foo )
lit foo		( this will compile as lit <address of foo>. This is actually the
		  only way to place the address of a label on the stack. )

:foo		( foo defined here )
nop
;

:bar	nop	( label definitions can be grouped with other instructions on the same line )
	;

:foobar	nop ;	( we can actually place any number of instructions on the same line )

Local labels:

Local labels work in the same way as in the GNU assembler. They are defined as a colon followed by a sequence of digits (i.e. :007, :0, :42). Although they can be defined multiple times, the compiler internally assigns them a unique name of the form N·counter (the middle character is '\u00b7'). References to such labels must be suffixed with either a '-' (meaning backward reference to the last definition of this label), or a '+' (meaning a forward reference to the next definition of this label). For example, in the following code:

:1	jump 1+	( not to be confused with the '1+' mnemonic. Here it means next occurrence of :1 )
:2	jump 1-
:1	jump 2+
:2	jump 1-

the labels will be internally converted to:

:1·1	jump 1·2
:2·1	jump 1·1
:1·2	jump 2·2
:2·2	jump 1·2

As a consequence, you should not use or define labels of the form N·N where N is any non-empty sequence of digits. This also prevents the definition of labels of the form N+ or N- because they will not be addressable.
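
The renaming scheme described above can be sketched as a single pass over the tokens. This is an illustration, not the assembler's actual code: renameLocals, isDigits and the takesArg table are hypothetical, comments are not handled, and no check is made that a referenced definition really exists.

```go
package main

import (
	"fmt"
	"strings"
)

// isDigits reports whether s is a non-empty sequence of ASCII digits.
func isDigits(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// takesArg lists a few mnemonics that expect an argument (see the opcode
// table). It is deliberately incomplete.
var takesArg = map[string]bool{"lit": true, "loop": true, "jump": true, "jmp": true}

// renameLocals rewrites local label definitions and references to the
// N·counter form. Hypothetical helper written for this example.
func renameLocals(src string) string {
	seen := map[string]int{} // number of definitions of each N so far
	toks := strings.Fields(src)
	for i, t := range toks {
		switch {
		case strings.HasPrefix(t, ":") && isDigits(t[1:]):
			n := t[1:]
			seen[n]++
			toks[i] = fmt.Sprintf(":%s·%d", n, seen[n])
		case i > 0 && takesArg[toks[i-1]] && len(t) > 1 && isDigits(t[:len(t)-1]):
			n := t[:len(t)-1]
			switch t[len(t)-1] {
			case '-': // last definition seen
				toks[i] = fmt.Sprintf("%s·%d", n, seen[n])
			case '+': // next definition to come
				toks[i] = fmt.Sprintf("%s·%d", n, seen[n]+1)
			}
		}
	}
	return strings.Join(toks, " ")
}

func main() {
	fmt.Println(renameLocals(":1 jump 1+ :2 jump 1- :1 jump 2+ :2 jump 1-"))
}
```

A reference is only rewritten in argument position, so a bare "1+" where an instruction is expected is left alone and still means the increment mnemonic.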

Please note that the parser does not prevent you from using or defining labels with the same name as instructions either. The only caveat, besides confusing yourself, is that you will not be able to use implicit calls to such labels:

:drop	'D' 1 1 out 0 0 out wait ( print 'D' )
	drop ;	( this will not loop forever, drop will be compiled as opcode 3, not a call )
drop		( still opcode 3 )
.dat drop	( will compile an implicit call to our custom drop )

Assembler directives:

The assembler supports the following directives:

.equ <identifier> <value>

Defines a constant value. <identifier> can be any valid identifier (any combination of letters, symbols, digits and punctuation). The value must be an integer value, named constant or character literal. Constants must be defined before being used. Constants can be redefined; the compiler will always use the last assigned value.

.org <value>

Will place the next instruction at the address specified by the given integer literal or named constant.

.dat <value>

Will compile the specified integer value, named constant or character literal as-is (i.e. with no implicit "lit"). This is primarily used for data storage structures:

:table	.dat 65
	.dat 'B'

The cells at addresses table+0 and table+1 will contain 65 and 66 respectively.

.opcode <identifier> <value>

Defines a custom opcode. <identifier> can be any valid identifier (any combination of letters, symbols, digits and punctuation). The value must be an integer value, named constant or character literal. Custom opcodes must be defined before being used. They can be redefined; the compiler will always use the last assigned value. Default opcodes can also be redefined (think override) with this directive, so it should be used with caution.

For example, suppose that we have a VM implementation that maps opcode -42 to a function that computes the square root of the number on top of the data stack:

.opcode sqrt -42

lit 49
sqrt		( this compiles as .dat -42 )
7 !jump error

Note that there is no mechanism to tell the assembler that a given custom opcode expects an argument from the next memory location (like lit or jump). Should you need to implement this type of opcode, constant and integer arguments would have to be prefixed with a .dat directive. For example, a compare instruction would look like:

.opcode cmp -1		( compares TOS with value in next memory location )

cmp 0		( Wrong: would compile as ".dat -1 lit 0" )
cmp .dat 0	( Correct: will compile as ".dat -1 0" )

Example (Locals)

Demonstrates use of local labels

package main

import (
	"fmt"
	"os"
	"strings"

	"github.com/db47h/ngaro/asm"
)

func main() {
	code := `
	:1	jump 1+
	:2	jump 1-
	:1	jump 2+
	:2	jump 1-
	`

	img, err := asm.Assemble("locals", strings.NewReader(code))
	if err != nil {
		fmt.Println(err)
		return
	}

	for pc := 0; pc < len(img); {
		fmt.Printf("% 4d\t", pc)
		pc = asm.Disassemble(img, pc, os.Stdout)
		fmt.Println()
	}

}
Output:

   0	jump 4
   2	jump 0
   4	jump 6
   6	jump 4

Constants

This section is empty.

Variables

This section is empty.

Functions

func Assemble

func Assemble(name string, r io.Reader) (img []vm.Cell, err error)

Assemble compiles assembly read from the supplied io.Reader and returns the resulting image and error if any.

The name parameter is used only in error messages to name the source of the error. If the io.Reader is a file, name should be the file name.

The returned error, if not nil, can safely be type-asserted to an ErrAsm value that will contain up to 10 entries.

Example

Shows off some of the assembler features.

package main

import (
	"fmt"
	"os"
	"strings"

	"github.com/db47h/ngaro/asm"
)

func main() {
	code := `
		( this is a comment. brackets must be separated by spaces )
		
		( a constant definition. Does not generate any code on its own )
		.equ SOMECONST 42
		
		nop
		123			( implicit literal )
		SOMECONST   ( const literal )
		drop
		drop
		foo			( implicit call )
		pop
		lit table	( address of table )
		'x'			( char literal, compiles as lit 'x' )
		
		.org 32 ( set compilation address )
		
:foo	42 bar drop ;
:bar	1+ ;  ( several instructions on the same line )

		.opcode sqrt -1	( test custom opcode )
		sqrt			( should compile like .dat -1 )
		
:table	( data structure )
		.dat -100		( will appear in the disassembly as "call -100" )
		.dat 0666		( octal )
		.dat 0x27		( hex )
		.dat '\u2033'	( unicode char )
		.dat SOMECONST
		.dat foo		( address of some label )
`

	img, err := asm.Assemble("raw_string", strings.NewReader(code))
	if err != nil {
		fmt.Println(err)
		return
	}

	for pc := 0; pc < len(img); {
		fmt.Printf("% 4d\t", pc)
		pc = asm.Disassemble(img, pc, os.Stdout)
		fmt.Println()
	}

}
Output:

   0	nop
   1	123
   3	42
   5	drop
   6	drop
   7	call 32
   8	pop
   9	40
  11	120
  13	nop
  14	nop
  15	nop
  16	nop
  17	nop
  18	nop
  19	nop
  20	nop
  21	nop
  22	nop
  23	nop
  24	nop
  25	nop
  26	nop
  27	nop
  28	nop
  29	nop
  30	nop
  31	nop
  32	42
  34	call 37
  35	drop
  36	;
  37	1+
  38	;
  39	call -1
  40	call -100
  41	call 438
  42	call 39
  43	call 8243
  44	call 42
  45	call 32

func Disassemble

func Disassemble(i []vm.Cell, pc int, w io.Writer) (next int)

Disassemble disassembles the cells in the given slice at position pc to the specified io.Writer and returns the position of the next valid opcode.
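
The contract of returning the position of the next opcode can be illustrated with a toy disassembler. Everything here (Cell, names, hasArg, disasm, dump) is a hypothetical sketch that covers only a few opcodes. The output conventions are inferred from the example outputs on this page: lit shows only its value, unknown opcodes show as "call N" and an unterminated lit shows as "???". How the real function prints an unterminated jump is a guess.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Cell is a stand-in for vm.Cell.
type Cell int

const opLit Cell = 1

// names and hasArg cover only a small subset of the opcode table.
var names = map[Cell]string{0: "nop", 1: "lit", 2: "dup", 3: "drop", 7: "loop", 8: "jump", 9: ";"}
var hasArg = map[Cell]bool{1: true, 7: true, 8: true}

// disasm writes the instruction at pc to w and returns the position of the
// next opcode, skipping over the argument cell when there is one.
func disasm(img []Cell, pc int, w io.Writer) (next int) {
	op := img[pc]
	next = pc + 1
	name, known := names[op]
	if !known {
		fmt.Fprintf(w, "call %d", op) // anything else is shown as a call
		return next
	}
	if !hasArg[op] {
		fmt.Fprint(w, name)
		return next
	}
	arg := "???" // argument missing at end of image
	if next < len(img) {
		arg = fmt.Sprint(int(img[next]))
		next++
	}
	if op == opLit {
		fmt.Fprint(w, arg) // literals are shown as the bare value
	} else {
		fmt.Fprintf(w, "%s %s", name, arg)
	}
	return next
}

// dump lists a whole image, one instruction per line, prefixed by its address.
func dump(img []Cell) string {
	var sb strings.Builder
	for pc := 0; pc < len(img); {
		fmt.Fprintf(&sb, "%d\t", pc)
		pc = disasm(img, pc, &sb)
		sb.WriteByte('\n')
	}
	return sb.String()
}

func main() {
	fmt.Print(dump([]Cell{1, 42, 8, 0, -1, 1}))
}
```

Driving the loop with the returned position, rather than incrementing pc by one, is what keeps argument cells from being decoded as opcodes.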

Example

Disassemble is pretty straightforward. Here we disassemble a hand-crafted Fibonacci function.

package main

import (
	"fmt"
	"os"
	"strings"

	"github.com/db47h/ngaro/asm"
)

func main() {
	fibS := `
	:fib
		push 0 1 pop	( like [ 0 1 ] dip )
		jump 1+		( jump forward to the next :1 )
	:0  push		( local label )
		dup	push
		+
		pop	swap
		pop
	:1  loop 0-		( local label back )
		swap drop ;
		lit		( lit deliberately unterminated at end of image for testing purposes )
		`
	fib, err := asm.Assemble("fib", strings.NewReader(fibS))

	if err != nil {
		fmt.Println(err)
		return
	}

	for pc := 0; pc < len(fib); {
		fmt.Printf("% 4d\t", pc)
		pc = asm.Disassemble(fib, pc, os.Stdout)
		fmt.Printf("\n")
	}

}
Output:

   0	push
   1	0
   3	1
   5	pop
   6	jump 15
   8	push
   9	dup
  10	push
  11	+
  12	pop
  13	swap
  14	pop
  15	loop 8
  17	swap
  18	drop
  19	;
  20	???

Types

type ErrAsm

type ErrAsm []struct {
	Pos scanner.Position
	Msg string
}

ErrAsm encapsulates errors generated by the assembler.

func (ErrAsm) Error

func (e ErrAsm) Error() string
