This version consists primarily of enhancements to CM and changes to the module system.
Bug Fixes
Relative to 110.8
No.   Description
1352  Equality status of reals is compromised
1354  spurious? "possibly inconsistent structure definitions"
1357  rebinding of constructors and exceptions not allowed
1364  Mistakenly accepted datatype spec when signature matching.
1374  Datatype replication causes nonexhaustive match error.
1384  incorrect complaint about "inconsistent structure definitions"
1385  functor defn including "where" structure defs shouldn't elaborate
1386  redefinition of a type spec is not detected
1414  Compiler bug: Instantiate: unexpected DATATYPE 354
1417  problems with datatype replication in functors
1418  CM.set_path is a constant function
1421  incorrect type comparison for val spec in signature match
1422  Core dump on Sparc when using lazy features
1423  2.0 + 2.0 = nan
1426  smlnj-c interface function, second call core dumps
1428  CM docs out of date
1432  signature match fails for datatype specs if "where type" is used
1433  eqtype u=t doesn't force eqtype. [partial fix]
1438  Wrong types for TextIO.StreamIO.inputAll and TextIO.StreamIO.mkInstream
1440  CM.set_path has no effect in Win 32 or Irix 6.4
1445  uncaught Unbound in FLINT/trans/transtypes.sml
1450  bugs in Array2.fromList and Array2.row
--    fixed Real.fromString (regression test failures in basis/tests/reals.sml)
Module System
1. Layered Redefinitions Ignored (Bug 1354 fixed)
Structure definition specs can easily give rise to redefinitions. Consider the following example:
    signature S1 = sig type t end;

    signature S2 =
    sig
      structure A : S1
      structure B : S1 = A
    end;

    signature S3 =
    sig
      structure C : S2
      structure D : S2 = C
    end;

Here the substructure D.B of S3 is defined within S2 in terms of D.A (implying D.B.t = D.A.t), while by the definitional spec in S3, D.B is equal to C.B (implying D.B.t = C.B.t).

There are two ways of dealing with such secondary, or layered, definitional specs.

(1) Secondary definitions can be treated as errors (excepting those cases where the equivalence of the definitions can easily be verified). This was the policy for 110.0.3 through 110.8, where examples like the above gave rise to the error message

    Error: possibly inconsistent structure definitions

    [This was bug 1354.]

(2) The redefinitions can be regarded as producing implied sharing constraints (e.g. D.A.t = C.B.t in this case).

In case (2), these implied sharing constraints will be verified automatically during signature matching, in the process of matching all the definitional specifications. The question is whether they should be taken into account (i.e. satisfied) when instantiating a signature like S3. We adopt the lenient policy of not dealing with these implied constraints during instantiation. Instead, during instantiation secondary definitions are simply ignored. In the example, the inner definition D.B.t = D.A.t takes effect while the secondary definition D.B.t = C.B.t is ignored.

The consequence of this policy is that some additional inconsistent signatures will be successfully instantiated. As usual, any attempt to match these inconsistent signatures will fail.
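To make the matching side concrete, here is a minimal sketch of a structure that satisfies every definitional spec in the S1/S2/S3 example, including the implied constraint D.A.t = C.B.t. The structure names X, Y and Z are made up for illustration, and the signatures rely on structure definition specs, which are an SML/NJ extension.

```sml
signature S1 = sig type t end;
signature S2 = sig structure A : S1 structure B : S1 = A end;
signature S3 = sig structure C : S2 structure D : S2 = C end;

(* X, Y and Z are hypothetical names used only for this sketch. *)
structure X = struct type t = int end;

(* Y.B is literally Y.A, so the definitional spec in S2 holds. *)
structure Y = struct structure A = X structure B = A end;

(* Z.D is literally Z.C, so the definitional spec in S3 holds, and
   all four types C.A.t, C.B.t, D.A.t, D.B.t coincide. *)
structure Z : S3 = struct structure C = Y structure D = C end;

(* All substructure types are the same type (int here). *)
val n : Z.D.B.t = 1;
val m : Z.C.A.t = n;
```

Because the matching structure is built by rebinding the same substructure, both the inner and the secondary definitions are satisfied at once, which is the situation signature matching checks for.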
The compiler flag

    Compiler.Control.multDefWarn : bool ref

controls whether a warning message will be generated when a secondary definition is ignored. Its default value is false, meaning no warning messages will be produced.

We follow a stricter policy in some other cases, such as layered type definition specs:

    signature S =
    sig
      type t
      type s = t
    end where type s = int

Here the secondary redefinition "where type s = int" is detected and causes an error message.

    Error: where type defn applied to definitional spec: s

NOTE: This is an SML/NJ divergence, since the above signature is legal in SML '97. See Note "Sharable Types" below.

2. Noninterference of sharing and definitional specs
While the above solution relaxed a constraint enforced by the compiler, this problem involves enforcing a new constraint relating to the interaction between sharing and definitional specs.
The Definition requires that a type constructor involved in a sharing constraint be (1) not defined as a type function, and (2) not defined in terms of some "rigid" type constructor (i.e. a type constructor previously defined in the context).
We choose to define "sharable" as meaning simply that there is no definition applying to a type constructor. We'll use the term "defined" for the opposite of sharable. A more subtle definition is possible; see note "Sharable Types" below.
Thus the following signature is illegal

    signature S =
    sig
      type s = int          (* s is defined *)
      type t                (* t is sharable *)
      sharing type t = s    (* s is not sharable *)
    end

and has to be reexpressed as (for instance):

    signature S =
    sig
      type s = int
      type t = s
    end

With "where type" definitions, things are a little more complicated. An inner type sharing constraint can be affected by an outer definitional constraint, as in the following example:

    signature S1 =
    sig
      type s
      type t
      sharing type t = s    (* ok, because s and t are sharable here *)
    end where type t = int; (* this converts both s and t to rigid types *)

This is legal, but the following declaration is not:

    signature S2 =
    sig
      structure A : S1
      type v
      sharing type v = A.s  (* A.s not flexible *)
    end;

However, S2 can easily be converted to the legal S3 below by replacing the outer sharing constraint by a definition.

    signature S3 =
    sig
      structure A : S1
      type v = A.s
    end;

In general, we recommend avoiding sharing constraints that can easily be expressed by definitional specs. So one should always prefer

    type t = s

to

    type t sharing type t = s

The same applies to structure sharing and structure definition specs (which are an SML/NJ language extension). Violations of this newly enforced constraint can often be eliminated by replacing structure sharing by structure definition specs, e.g. replacing

    structure A : SIGA sharing A = B.C

with

    structure A : SIGA = B.C

SML/NJ Exception for Structure Sharing with Same Signature
SML/NJ provides one important exception to the rule about sharing rigid types. This is the case where the type sharing is implied by structure sharing between two structures with the same signature.
Here is an example:

    signature S = sig type t = int end

    signature S1 =
    sig
      structure A : S
      structure B : S
      sharing A = B
    end

This is allowed in SML/NJ because A and B have the same signature, even though the sharing constraint is equivalent to

    sharing type A.t = B.t

and A.t and B.t have the rigid spec type t = int.

Note: Sharable Types
[Mostly for language lawyers]

There is some controversy about what type constructors should be allowed in sharing constraints. We can illustrate this by the following example:

    signature S =
    sig
      type s
      type t = s
      type u
      sharing type t = u
    end

By our definition above, t is defined, and therefore not sharable, and this signature declaration is rejected. Technically, however, the semantic representation of t in the signature is the type function \().()n_s (a nullary type function, where n_s is the semantic type "name" for s), and this type function is eta-equivalent to n_s, a simple flexible type name. Therefore, if this eta-reduction is assumed, t meets the requirements of the definition and can appear in the sharing constraint. On the other hand, consider

    signature S =
    sig
      type s
      type t = s list
      type u
      sharing type t = u
    end

Here the representation of t is the type function \().(()n_s) list, which does not reduce to a simple type name, so the sharing constraint is clearly illegal.

The reasons that we adopt the simpler and more restrictive meaning of sharable are that it is easier to explain and it admits all sensible usages. I claim that it promotes a cleaner and simpler style in signature writing. It is also much simpler and more efficient to implement (at least for SML/NJ).

Here are some example signatures that are admitted under the more complicated version of the definition (thanks to Martin Elsman):

    signature S1 =
    sig
      type t
      type s = t
    end where type s = int

    signature S2 =
    sig
      type t
      type s = t
      sharing type t = s
    end

    signature S3 =
    sig
      type s
      structure U :
        sig
          type 'a t
          type u = (int * real) t
        end where type 'a t = s
    end where type U.u = int

It is not clear that examples like these have any importance to anyone other than language lawyers. The last is particularly perverse: reading a signature should not be an exercise in puzzle solving!
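As a small illustration of the recommended style, the first signature in this note (rejected under the restrictive rule because t is defined) can be rewritten with a definitional spec in place of the sharing constraint. This is only a sketch; the names SDef and M are invented for the example.

```sml
(* Rejected under the restrictive rule, since t is "defined":
     signature S = sig type s type t = s type u sharing type t = u end
   The same intent expressed with definitional specs only.
   SDef and M are hypothetical names. *)
signature SDef =
sig
  type s
  type t = s
  type u = t
end

(* Transparent ascription, so M.s, M.t and M.u are all visibly int. *)
structure M : SDef =
struct
  type s = int
  type t = s
  type u = t
end

val x : M.u = 3
val y : M.s = x   (* type checks because u = t = s *)
```

The definitional form states the equality of t and u directly at the point where u is introduced, so no question of sharability arises.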
CM
User Level Changes
- assorted bugs fixed (e.g., CM.set_path is no longer a constant function, etc.)
- CM now uses its own archive file format for storing stable groups. To this end, the contents of the binfiles are attached to the end of each stablefile and the header includes information about the corresponding positions. This means that after a group or library has been stabilized, one can safely remove
- all source files
- all dependency files
- all binfiles
Only the group description itself (one file) and the stablefile (one file per architecture/OS combination) must be present.
- argument types of CM interface functions: All multi-argument functions now use _records_ (as opposed to _tuples_). This way there is less room for confusion because the arguments are named.
- result types of CM interface functions: Boolean results now indicate whether or not CM had to recompile or re-link anything. Functions such as CM.make, CM.recompile, CMB.make, etc. report a boolean result which will be true iff the function had to recompile or relink (re-execute) anything.
- Improved documentation: The manual has been brought up-to-date. This includes some old omissions, previously undocumented changes, and documentation of the current changes.
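The record-argument and boolean-result conventions described in the list above can be sketched in plain SML. The function build and its fields group and force are hypothetical stand-ins, not part of the CM interface; they only mimic the "true iff work was done" convention.

```sml
(* Hypothetical stand-in for a build function; NOT the CM API. *)
val upToDate : string list ref = ref []

(* Record argument: fields are named, so call sites are self-describing.
   The result is true iff something had to be (re)built. *)
fun build {group : string, force : bool} : bool =
  if force orelse not (List.exists (fn g => g = group) (!upToDate))
  then (upToDate := group :: !upToDate; true)   (* work was done *)
  else false                                    (* nothing to do *)

val first  = build {group = "sources.cm", force = false}
(* Field order is irrelevant with records, unlike tuple positions. *)
val second = build {force = false, group = "sources.cm"}
```

Here first is true because the group had not been built yet, and second is false because nothing needed to be redone.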
COMPILER NOTES
- CMB.make now builds a directory hierarchy that is analogous to the hierarchy under src/compiler. (There are some special hacks to make ".." work, though.) This removes the need to keep all source file names distinct.
Bootstrap Internals
A bit of History
Originally, when you said CMB.make();, CM would compile the entire source tree and build compilers as well as interactive systems for all available architectures. CM would then output "list" files in the bin.[arch]-[os] directory that omitted those sources that are not necessary for the current architecture.
But still, the name of the Compiler structure as well as the name of the structure representing the interactive system was architecture-specific. Therefore, the boot process would select the bindings for those structures from the environment and re-bind them as "Compiler" etc.
Moreover, CM itself would be built as an "ordinary" SML program functorized by the "Compiler" structure.
The New Organization
Since sources of compiler and CM are merged, some things have already been streamlined. I simplified things further in the following way:
- CMB.make (); builds the compiler for the current architecture only.
- Conditional compilation (via #if-directives in CM description files) is used to create a binding for structure Compiler directly without hacking the environment during boot.
- Both CM and the interactive loop refer to structure Compiler (or relevant parts thereof -- see point 4 below) directly. This way there only has to be one version of CM and one interactive loop (i.e., one "glue").
- Therefore, the boot process does not have to do any hacking to get the right names bound. (However, it still does some filtering of the environment as it did before.)
- CMB.retarget (the replacement of CMR.retarget) builds the desired cross-compiler and makes two new structures available at top-level. One of them is the cross-compiler version of CMB, the other the cross-compiler version of the machine-dependent part of Compiler. The original structures CMB and Compiler are retained. Example for naming conventions:
    CMB.retarget { cpu = "sparc", os = "unix" };

will create bindings for

    structure SparcUnixCMB (* cross-compiler version of CMB *)

and

    structure SparcVisComp (* cross-compiler version of the machine-dependent part of Compiler *)

Visible Compiler Internals
The Compiler structure is split into two parts:
- a machine-independent part called GenericVC
- a machine-dependent part called MachDepVC
These two structures are later merged again to form the familiar Compiler structure. The advantage, however, is on CM's side: much of the excessive functorization was untangled because CM does not need to be abstracted over the "generic" part of the visible compiler. This part contains virtually all the relevant types, which enables the elimination of almost all "sharing" constraints from CM's source code.
To our pleasant surprise, this change has reduced the size of the heap image by a few hundred kilobytes.
The main motivation, however, was to clean up CM's source code. It may also make compilation of the compiler somewhat faster, because CM is less heavy on functors now. (This effect doesn't seem to be very noticeable, though.)
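The effect of the split on functorization can be sketched in miniature. Everything below (GenericPart, MachDepPart, Merged, Client) is a hypothetical toy, not the actual GenericVC or MachDepVC; it only shows why a functor that is abstracted over the machine-dependent part alone needs no sharing constraints when the types live in a fixed generic structure.

```sml
(* Hypothetical miniature of the split; the real structures are far larger. *)
structure GenericPart =
struct
  type instruction = string
  fun show (i : instruction) = i
end

structure MachDepPart =
struct
  val architecture = "sparc"
  fun nop () : GenericPart.instruction = "nop"
end

(* Merging both parts back into one structure for ordinary clients. *)
structure Merged =
struct
  open GenericPart
  open MachDepPart
end

(* A functor abstracted over the machine-dependent part only. The type
   comes directly from GenericPart, so no sharing constraint is needed. *)
functor Client (M : sig val nop : unit -> GenericPart.instruction end) =
struct
  val sample = GenericPart.show (M.nop ())
end

structure C = Client (MachDepPart)
```

If the generic types were also functor parameters, each client functor would need sharing constraints to relate them, which is the tangle the reorganization is described as removing.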
Basis
- The in_pos type and the functions getPosIn and setPosIn are no longer part of the STREAM_IO signature in the SML'97 basis, and have been removed from our implementation. Likewise, the functions getPosIn and setPosIn in the IMPERATIVE_IO signature have been removed. The function filePosIn in STREAM_IO now takes an instream argument (instead of an in_pos).
- The type of the mkInstream function in STREAM_IO was fixed to agree with the SML'97 basis specification.
- The types of OS.Path.{mkRelative,mkAbsolute} were changed to track changes in the SML'97 basis specification.
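These notes do not state the new types of the OS.Path functions. As a sketch only, the calls below use the record-based types from the published SML Basis Library specification, on the assumption that those are the types being tracked here; the paths are arbitrary examples in Unix syntax.

```sml
(* Assumes the record-based types of the published Basis Library:
     mkRelative, mkAbsolute : {path : string, relativeTo : string} -> string
   The example paths are arbitrary and use Unix syntax. *)
val rel = OS.Path.mkRelative {path = "/usr/local/lib", relativeTo = "/usr"}

(* Re-anchoring the relative path at the same directory. *)
val abs = OS.Path.mkAbsolute {path = rel, relativeTo = "/usr"}
```

Code written against an earlier signature of these functions would need its call sites adjusted to whatever argument form the installed basis provides.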
MLRISC
- The MLTREECOMP interface was extended to allow the injection of native instructions into the flowgraph being created. This is necessary on the SPARC in order to insert SAVE and RESTORE instructions.
- Insert NOPs where required after floating point comparisons for older SPARC machines.
- The FLOWGRAPH interface has been removed from several modules that did not need it.
- FCMP, FTEST, FBCC on the HPPA have been replaced by a composite FBRANCH instruction. This makes scheduling and other tasks easier. BLR and BL were also added to the instruction set.
- Fixed various assembly output bugs for the SPARC.
- Not directly MLRISC related, but gc invocation points for known functions were marked as module entry points. This added an unnecessary edge in the flowgraph.
Lal George Last modified: Wed Jan 6 16:19:15 EST 1999