[MLton] Windows port of MLton using the Microsoft tools (e.g. without MinGW)

skaller skaller at users.sourceforge.net
Thu Jul 26 11:25:33 PDT 2007


On Thu, 2007-07-26 at 12:06 -0500, Matthew Fluet wrote:
> I'd like to suggest that we not let this thread of discussion drift too 
> far from its original topic.  That is, let us stick to technical issues, 
> rather than philosophic issues.  

They're hard to distinguish sometimes :)

> Let us take it as given that:
> 
>   1) There are pragmatic reasons for using the MS toolchain rather than the
>      GNU toolchain, not least of all that the one exists (and the other
>      does not) for 64-bit versions of the Windows operating systems.
> 
>   2) There is no a priori reason that MLton (the compiler) could not forgo
>      using any of the Posix.* structures from the Basis Library, and
>      instead use only the (portable) OS.* structures.  Nonetheless, many
>      portions of MLton's implementation of the Basis Library are built up
>      from the Posix.* structures/functions.  It would be a non-trivial (but
>      not impossible) amount of work to change this.

I have some portable libraries providing some Posix functionality
on Windows, including for 

	* threading, mutex, condition variables, etc
	* socket I/O

however the code is C++, not C. Still, some of it might be translated
to C. The condition variable emulation was lifted from
some C code in the ACE library.

In other cases, I might take on writing translations of
some other functions, since I need them (for the same reasons)
for Felix anyhow. 

> 
> I know that John Skaller has in the past made it clear that Felix 
> generates portable C/C++ as a backend.  What useful advice can you pass 
> on: Do you fall back to C89 for the MS toolchain? 

No, there's no need to. There is a single hack in the library
due to a parsing fault .. in gcc, not MSVC. gcc prior to around
version 4 couldn't parse C++ properly.  I don't use any
high level C++ stuff like heavy template meta-programming,
nested classes, etc.

So the rough advice is the obvious KISS (Keep It Simple Sally!)

This applies to C too.

The major porting problem for Windows which affects source
is the requirement to specify 'export/import' on all
symbols that need exporting (or importing) from a library.

You have to put

__declspec(dllexport) void f();

when building a DLL from which 'f' is to be available,
and you have to say

__declspec(dllimport) void f();

when trying to use that function from another DLL or executable.

This means you must decorate every declaration of every public library
function and variable with a macro, which switches
meaning between these two annotations depending on whether
you're building the DLL or simply using it.
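
A minimal sketch of such a switching macro follows. The names MYLIB_API,
MYLIB_BUILD_DLL and mylib_add are hypothetical, chosen only for
illustration; they are not taken from Felix or MLton. On non-Windows
targets the macro expands to nothing.

```c
/* Hypothetical convention: MYLIB_BUILD_DLL is defined only while the
 * library itself is being compiled. It would normally come from the
 * build command line; it is defined here so the sketch stands alone. */
#define MYLIB_BUILD_DLL 1

#if defined(_WIN32)
#  if defined(MYLIB_BUILD_DLL)
#    define MYLIB_API __declspec(dllexport)  /* building the DLL */
#  else
#    define MYLIB_API __declspec(dllimport)  /* using the DLL */
#  endif
#else
#  define MYLIB_API                          /* no decoration elsewhere */
#endif

/* What the public header would contain. */
MYLIB_API int mylib_add(int a, int b);

/* What the library source would contain. */
MYLIB_API int mylib_add(int a, int b)
{
    return a + b;
}
```

Client code includes the same header without defining MYLIB_BUILD_DLL,
so the same declaration then reads as an import.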

It is also possible to do this with *.DEF files, I think,
but the 'right' way is to annotate the C source code.

This affects dynamically linked run time libraries,
not executables, i.e. it doesn't impact the code generator,
just the library code (because MLton is a whole program
analyser and doesn't build libraries).

>  Is that difficult? 

None of it is difficult, it is just time consuming
and annoying. Nicolas's hackery getting GMP to build by
trickery is a good example of that. This one is very
annoying, because it is very hard to automate, and so
it is hard to upgrade because you have to repeat the
trickery by hand each time ;(

> Are there other 'compromises' that need to be made to support the MS 
> toolchain? 

Yes, but you will have to find them by trial and error.

Microsoft is stricter than gcc on some language issues,
and gcc is stricter than MS on others.

Arrays of 0 elements are an example (already cited).
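
ISO C requires a positive array size, and gcc accepts "int a[0]" only
as an extension. One possible workaround for generated code, sketched
below, is to round every emitted array length up to at least one. The
names AT_LEAST_ONE, FRAME_COUNT and frame_sizes are hypothetical and
only stand in for whatever a code generator might emit.

```c
/* Hypothetical helper: never declare fewer than one element. */
#define AT_LEAST_ONE(n) ((n) > 0 ? (n) : 1)

/* Assumed value a generator might emit for an empty table. */
#define FRAME_COUNT 0

/* Valid in C89, C99 and C++: the size is a positive constant. */
static int frame_sizes[AT_LEAST_ONE(FRAME_COUNT)];

/* Number of meaningful entries (may be zero). */
static int frame_count(void)
{
    return FRAME_COUNT;
}

/* Number of elements actually allocated (always at least one). */
static int frame_storage(void)
{
    return (int)(sizeof frame_sizes / sizeof frame_sizes[0]);
}
```

The cost is one unused element per empty table; code that walks the
table must use the logical count, not the storage size.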

If you write and generate good quality C, there shouldn't
be many problems.

If you switch to C++, some of the problems will go away.
By that I mean: generate C code, but compile it with 
g++ or MSVC++ as C++ code.
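
For this to work the generated C has to stay inside the common subset
of the two languages. A small sketch of what that looks like, using a
hypothetical "pair" type that is not from either compiler:

```c
#include <stdlib.h>

#ifdef __cplusplus
extern "C" {            /* keep C linkage when compiled as C++ */
#endif

/* Hypothetical record type a code generator might emit. */
typedef struct pair {
    int first;
    int second;
} pair;

pair *pair_new(int first, int second)
{
    /* The cast is redundant in C but required in C++. */
    pair *p = (pair *)malloc(sizeof *p);
    if (p != NULL) {
        p->first = first;
        p->second = second;
    }
    return p;
}

int pair_sum(const pair *p)
{
    return p->first + p->second;
}

#ifdef __cplusplus
}
#endif
```

The main things to watch are implicit conversions from void*, C++
keywords used as identifiers, and linkage of anything called from
hand-written C.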


>  Do you ever generate assembly code?  What are the pitfalls 
> there?

No, but I do conditionally use some gcc extensions, such
as computed gotos in place of switches. This is supported
with macros. Felix also uses 'assembler labels' here,
and we had a previous discussion of that because MLton did
too, and gcc can create problems if it unrolls code
containing such an assembler label -- I've never run into
that myself, because of the way it is used.
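
The macro-switched computed goto can be sketched roughly as below. This
is an illustration only, not Felix's actual code: the tiny opcode set
and the names run, DISPATCH and USE_COMPUTED_GOTO are made up for the
example. With gcc it uses the "labels as values" extension; any other
compiler gets a plain switch.

```c
#if defined(__GNUC__)
#  define USE_COMPUTED_GOTO 1
#else
#  define USE_COMPUTED_GOTO 0
#endif

/* Hypothetical opcodes: add one, double, stop. */
enum { OP_INC, OP_DBL, OP_HALT };

/* Runs the program at pc and returns the accumulator at OP_HALT. */
int run(const unsigned char *pc)
{
    int acc = 0;
#if USE_COMPUTED_GOTO
    /* gcc extension: &&label yields the address of a label. */
    static void *const targets[] = { &&do_inc, &&do_dbl, &&do_halt };
#   define DISPATCH() goto *targets[*pc++]
    DISPATCH();
do_inc:
    acc += 1;
    DISPATCH();
do_dbl:
    acc *= 2;
    DISPATCH();
do_halt:
    return acc;
#   undef DISPATCH
#else
    /* Portable fallback with the same behaviour. */
    for (;;) {
        switch (*pc++) {
        case OP_INC:  acc += 1; break;
        case OP_DBL:  acc *= 2; break;
        case OP_HALT: return acc;
        }
    }
#endif
}
```

The table order must match the opcode numbering, since the opcode is
used directly as the index.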

The hard thing, I think, is tricking C compilers into generating
fast code. gcc and MSVC++ are going to be tricked by different
techniques. THAT is the really hard bit.

For example, in a couple of places I use a macro wrapping
a gcc annotation which is supposed to help gcc generate 
code supporting better branch prediction -- in the case
of various failures, the goto to the error handler is 'predicted'
as 'unlikely'.
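
A sketch of that kind of wrapper, assuming the annotation in question
is gcc's __builtin_expect (the original does not name it). The names
UNLIKELY and checked_div are hypothetical. On other compilers the macro
degrades to the bare condition, so behaviour is unchanged.

```c
#if defined(__GNUC__)
   /* Tell gcc the condition is expected to be false. */
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
   /* No hint available: just evaluate the condition. */
#  define UNLIKELY(x) (x)
#endif

/* Stores a / b in *out and returns 0, or returns -1 if b is zero. */
int checked_div(int a, int b, int *out)
{
    if (UNLIKELY(b == 0))
        goto error;
    *out = a / b;
    return 0;
error:
    return -1;
}
```

The hint only affects code layout and prediction; whether it helps has
to be measured on the target compiler.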

The only way I know to get good performance code is to
make the C compiler generate assembler and actually look at it,
change some things and try again, and do some microbenchmarking.


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


