

8086/80186
----------
Add CPU options
Make push immediate, imul, shift immediate and enter/leave 186 specific
Optimise >> 8 and friends as register moves (note that the core compiler
code also goes off and turns * 256 into << 8)

Unstarted
---------
Float

Struct/Union testing

Maths optimisation. Div/mul on 8086 are *slow* so generate stack/double and
other optimised forms

Long helper methods (out of line 32bit mul etc) 

Make long long a class E with 64bit types and four virtual registers
rewritten as memory.

Other call formats

Optimisation
------------
Avoid sp,bp set up if we can on entry/exit

Delayed sp adjustments

Don't adjust sp then reload it from bp!

Register constant tracking (especially important as we often know where a
zero word or byte is and we can use it for push immediate #0)

Spotting two halves of a 32bit value being the same and merging

Allow moves between half registers and their full shadow (al->ax) for char
to short etc. How to balance compilers tendancy to allocate AL, AH, etc
which is good for register freedom and bad for chars ?

Better uchar handling - if we allocated xL before xH somehow in all cases
then we would be able to do xL -> xX efficiently.

We get very poor code because the compiler doesn't understand that type
converting al does not destroy al, only ah and the al/ax relationship. In
particular with char args it loves to write crap like


	mov bl, 1
	...
	mov al, bl
	cbw
	push ax
	mov al, bl
	cbw
	inc ax
	push ax

not
	mov al, 1
	cbw
	push ax
	inc al
	cbw
	push ax







Questions
---------
An assembler programmer would not 

	mov al, #14
	cbw

but would

	mov ax, #0014

not obvious how we do that optimisation with pcc in all cases but its important

Can we optimise compare with zero cases ?

How do we make use of rep and loopXX, these are rather useful on 8086 but
need indexes to tend to use the right register

Should there be a CPU type/feature match in table.c ?
