Warning
This version is intended for compiler hackers. We are in the midst of substantial structural changes, and this is a snapshot. It only supports the Sparc, Alpha, HPPA, and PowerPC architectures. Furthermore, there are performance bugs that have to be fixed.
Summary:
This version is primarily intended to have a working x86 back end that will aid in migrating FLINT changes into the central repository.
X86 back end
The new x86 back end in this version is one point in a series of experiments aimed at code generation for the x86. See the paper X86-k32 under Compiler notes:
cm.bell-labs.com/cm/cs/what/smlnj/compiler-notes/index.html
In summary, on a 433MHz Pentium II, floating point programs show a larger improvement (over a factor of 2 for mandelbrot) than integer programs, when compared with 110.0.3.
Name          110.0.3    New    Speedup
tsp              7.55   6.49     16.39%
logic            5.45   4.87     11.84%
fft              1.11   0.81     37.89%
barnesHut        3.89   3.04     27.64%
ray              3.93   3.26     20.28%
mandelbrot       1.32   0.56    134.22%
simple           2.85   2.76      3.02%
vliw             2.20   1.78     23.64%
mlyacc           0.49   0.47      3.86%
lexgen           0.97   0.91      6.99%
knuthBendix      1.05   0.78     34.98%
life             0.14   0.12     13.51%
boyer            0.25   0.22     13.74%

Compile time, however, has uniformly increased by 40%. The compiler compiles itself in 10.51 minutes, which is reasonable for the time being. For very fast compile times, a quick and simple code generator is possible, but my priority right now is execution speed.
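The Speedup column in the table above appears to be the ratio of the old time to the new time, minus one, expressed as a percentage. That reading is an assumption, not something the notes state. The sketch below recomputes it for a few rows; because the table shows times rounded to two decimals, the recomputed values differ slightly from the published ones, which were presumably derived from unrounded measurements.

```python
# Times copied from the table above: (110.0.3, new back end).
TIMES = {
    "tsp": (7.55, 6.49),
    "fft": (1.11, 0.81),
    "mandelbrot": (1.32, 0.56),
    "simple": (2.85, 2.76),
}

def speedup_percent(old, new):
    """Percentage speedup, read here as (old / new - 1) * 100 (an assumption)."""
    return (old / new - 1.0) * 100.0

for name, (old, new) in TIMES.items():
    print(f"{name:12s} {speedup_percent(old, new):7.2f}%")
```

Under this reading, a speedup above 100% means the new code runs in less than half the old time, which matches the "over a factor of 2" remark for mandelbrot.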
Compiler changes
The CPS compiler will frequently generate a sequence of SELECTs at the start of a function and use the results later in the function body. This is bad for the x86, which has few registers: it will normally spill the result of the select and reload it at the use, only to have it written into the heap. Many of these selects are used just once, so the MLRISC generator moves these pure operations to the point of their use to reduce register pressure. It turns out that this optimization is also useful for RISC machines.
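The effect of moving single-use selects can be illustrated on a toy straight-line program. The sketch below is not the MLRISC generator's code: the tuple-based instruction format and the names `sink_single_use_selects` and `max_live` are hypothetical, chosen only for illustration. It assumes every name is defined once (as in CPS), so moving a pure select later cannot change its result.

```python
from collections import Counter

# Toy IR (hypothetical, not the CPS or MLRISC representation):
#   ("select", dst, index, src)   dst := src[index], a pure load
#   ("op", dst, args)             dst := some operation on args (dst may be None)

def uses(instr):
    """Names read by an instruction."""
    return [instr[3]] if instr[0] == "select" else list(instr[2])

def emit(instr, pending, out):
    """Emit instr, first emitting any deferred selects it reads."""
    for name in uses(instr):
        if name in pending:
            emit(pending.pop(name), pending, out)
    out.append(instr)

def sink_single_use_selects(code):
    """Move each select whose result is read exactly once to just before that use."""
    use_count = Counter(name for instr in code for name in uses(instr))
    pending, out = {}, []
    for instr in code:
        if instr[0] == "select" and use_count[instr[1]] == 1:
            pending[instr[1]] = instr   # defer until the use is reached
        else:
            emit(instr, pending, out)
    return out

def max_live(code):
    """Peak number of simultaneously live values defined inside the block."""
    last_use = {}
    for i, instr in enumerate(code):
        for name in uses(instr):
            last_use[name] = i
    live, peak = set(), 0
    for i, instr in enumerate(code):
        live -= {n for n in live if last_use[n] == i}   # operands dying here
        if instr[1] is not None and instr[1] in last_use:
            live.add(instr[1])                          # result becomes live
        peak = max(peak, len(live))
    return peak

code = [
    ("select", "a", 0, "r"),
    ("select", "b", 1, "r"),
    ("select", "c", 2, "r"),
    ("op", "x", ["a"]),
    ("op", "y", ["b", "x"]),
    ("op", "z", ["c", "y"]),
    ("op", None, ["z"]),
]
sunk = sink_single_use_selects(code)
print(max_live(code), max_live(sunk))
```

In this example the three selects up front keep three values live at once, while the sunk version needs at most two. On a machine with as few registers as the x86, that kind of reduction is what avoids the spill and reload described above.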
110.16, on a 533MHz DEC Alpha, compiles itself in 5.92 minutes:
{gc="26.961",sys="6.740",tot="382.769",usr="349.067"}
whereas 110.15 compiles the sources for 110.16 in 6.72 minutes:
{gc="50.055",sys="9.132",tot="453.331",usr="394.143"}
an improvement of 14%.
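The minute figures quoted above seem to correspond to user plus system time from the timing records, with garbage collection time reported separately and `tot` being the sum of all three. This is an inference from the numbers, not something the notes state. A small sketch that checks the arithmetic (the helper name `cpu_minutes` is made up for this example):

```python
def cpu_minutes(rec):
    """User plus system time in minutes (assumed to be what the text quotes)."""
    return (float(rec["usr"]) + float(rec["sys"])) / 60.0

# Timing records copied from the text above (values are seconds).
new = {"gc": "26.961", "sys": "6.740", "tot": "382.769", "usr": "349.067"}
old = {"gc": "50.055", "sys": "9.132", "tot": "453.331", "usr": "394.143"}

improvement = (cpu_minutes(old) / cpu_minutes(new) - 1.0) * 100.0
print(f"110.16: {cpu_minutes(new):.2f} min")
print(f"110.15: {cpu_minutes(old):.2f} min")
print(f"improvement: {improvement:.1f}%")
```

The results land within rounding distance of the quoted 5.92 and 6.72 minutes, and the ratio comes out at roughly 13 to 14%, consistent with the stated improvement.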
MLRISC
The interface to the register allocator now takes a list of registers to prohibit from spilling.
Lal George Last modified: Thu Jun 3 11:20:13 EDT 1999