[MLton] A few patches we had to apply to port from version 20030716 to 20051202 and for the MinGW port

Nicolas Bertolotti nicolas.bertolotti at polyspace.com
Thu Mar 29 12:07:01 PST 2007


Thanks for your feed-back ... Here are my answers ...

> > - bug-fix-cygwin-1_3_17.patch :
> I question whether these should be applied. It seems reasonable to me
> to simply require that users use the newer cygwin. If the newest
> cygwin had these problems, then it might be another story. However,
> over time the old cygwin versions will vanish, so these patches will
> become cruft in the MLton source tree. Worse, the headers are not
> under the MLton copyright, nor can they be made so as none of us
> wrote them. It might cause problems to include them in the MLton
> distribution.
> 
> Perhaps we could instead offer this patch as a separate download?

The problem with Cygwin is that it is not possible to run concurrently
multiple versions of the DLL. As a consequence, if you have a product which
requires a specific version and an other product which requires a more
recent one, you get into troubles.

Another reason why I like this particularly old version of Cygwin is that
we've noticed that the newer versions come with some performance
regressions.

Anyway, as we plan to stop using Cygwin and run through MinGW in the future,
I'm not so interested in having this patch included in the official release
of the MLton compiler.

> > - bug-fix-solaris7.patch
> >
> Does this MAP_ANON value actually work? If so, is it declared in
> another header? If not, why?
> 
> Also, I don't think it's a good idea to use 'char dummy' as the
> contents of sockaddr_in6. A real IPv6 structure is much larger,
> leading to a potential overrun. Also, it is guaranteed that sizeof
> (sockaddr_storage) >= sizeof(sockaddr_*). Is sockaddr_in really the
> largest? sockaddr_in6 is usually bigger. Are these structures missing
> from headers or missing from the system's support?

Solaris 7 is an old version of Solaris which does not provide support for
anonymous mapping and IPv6 so the definitions are really missing from the
system's support.

As a consequence, the value we set for MAP_ANON (I used the value the
Solaris 8 headers set) is simply ignored when the function is executed on
Solaris 7 and will be taken into account on more recent versions. We have
tested our binaries on both Solaris 7 and 8 and they work fine.

About IPv6, the dummy definitions are activated when we build the MLton
compiler for a Solaris 7 target. The correct definitions are used on more
recent releases so I don't see any problem. If you want to build code that
runs on Solaris 7, you won't use IPv6 because it is not supported. If you
want to build code that runs on a more recent release, you should not build
a compiler for a Solaris 7 target.

> >  - bug-fix-fullpath-windows.patch
> I've only read the code, not run it, but it seems reasonable.
> 
> What's this part about?
> +                          (* Fix bug in isLink when NFS mounted
> problem *)
> +                           if ((not isMinGW) andalso (isLink arc))
> handle _ =>
> +                              raise PosixError.SysErr("System
> problem in path " ^ p, NONE)
> 
> isLink can raise an exception when it's on NFS? Should we pass this
> exception through?

Sorry, I based the implementation on a piece of code we used when we built
our product using the 20030716 version of MLton (the fullPath implementation
was broken on this version and we had included and modified the one from a
newer release directly in our code). I could not manage to find the reason
why we had done this.

Maybe the exception which is raised when we run isLink on a stale NFS file
system is not a SysErr exception and we wanted to change that but I'm afraid
it's going to be hard to reproduce.

> >  - bug-fix-nethostdb-mingw.patch
> I see that other parts of the basis are doing this as well.
> Therefore, I see no harm in putting these calls into those methods.
> 
> However, on another note, shouldn't we be calling this ONCE from the
> main method? I seem to recall it used to work this way? What happened?

I guess the idea is to avoid loading the WinSock2 DLL when the binary does
not need it. Loading a DLL is not "free". Do you think it could be possible
to call the function from SML code and rely on the dead code elimination in
order to achieve the same result ?

> > - bug-fix-times-mingw.patch
> >
> > This patch is designed to avoid raising an exception during the
> > initialization of the basis library on MinGW when the code contains
> > calls to the "times" function which are not detected as dead code
> > during compilation. Then, if the "times" function is called, the
> > exception will be raised from the location of the call which is
> > easier to understand from a user's point of view.
> No idea about this.

In my point of view, the patch is safe and it's definitely more complicated
to investigate an exception which occurs during the basis initialisation
than during the execution of the user's code itself.

> > - evol-getpid-mingw.patch
> I'm a bit concerned. We also return a 'pid' in
> Windows_Process_create. Are these of the same type? ie: Will the
> created process's getpid() return the same value as
> windows.c:Windows_Process_create?

I've also modified the Windows_Process_create function as well as the
spawn... functions in order to make them return the real PID for the
process. This is part of the "waitpid" patch I also provided. Anyway, I have
to admit I did not really test it because we don't use this function in our
code.

The idea behind this is that the PID is unique. Any process which knows the
PID can use it in order to refer to a particular process whereas the handle
is process specific and can not be communicated to an other process as is.

> >  - evol-poll-mingw.patch
> I guess we need Matthew to relicence this. ;-)

While surfing on the internet, I've seen a number of articles which mention
that on some systems, the poll() system function may use more resources than
the select() call. I'm not sure anyway but maybe it would be a good idea to
update the compiler in order to really implement the SML select function
using the select() call.  

> >  - evol-waitpid-mingw.patch
> Again I'm concerned about this mixing of pids. cwait works with the
> pid from Create_Process. Does waitpid? I suspect this will break on
> cygwin, at least. On that platform the Create_Process will return a
> real windows pid, but the getpid/waitpid/etc are wrappers that use
> some other numbering scheme AFAICR.

As I previously mentioned, I don't use the Windows_Process_create function
so I would not be surprised the implementation does not work in all cases. 

If we only talk about evolutions, I would say this patch is probably the
most useful evolution I've made because it really makes it easier to port an
existing Unix application without changing the application code but it is
also the one with the worst code design (sorry for that).

> 
> There's more PID madness happening in the spawne.c changes. Why are
> the Windows_Process_*register methods added to cygwin.h but used in
> an #ifdef __MINGW32__?

This is because the Cygwin version of the spawn* functions already returns a
real PID whereas the MinGW version returns a handle. The only idea behind
the "register" method is that we need to keep an opened handle to the
process in order to make sure we can still read the exit status of the
process after it dies (if we don't do so, the system won't keep the process
information after it dies).

> > - evol-dash-in-pathes-cygwin.patch
> >
> > This patch is designed to use '/' as the path separator when
> > building a path on Cygwin. The idea behind the patch is to use '/'
> > when the path is relative or uses the Unix format and to use '\\'
> > when the path is an absolute Windows like path.
> I'm not sure I understand the rationale. Could you give some examples
> of the old (broken) behaviour and the new correct one? Does this
> still satisfy the basis specification?

As far as I remember, if we run a function from the Path structure on a
Cygwin path, the '/' in the source path used to be converted to '\\' even
when the source path was given in Posix format except the first dash in case
of an absolute path (/usr/local -> /usr\local). 

Then, if we use the resulting path in a shell command, the command may fail
because the backslash will be interpreted as an escape sequence if we don't
pay attention to escape the string before. Usually, the ones who write code
or scripts for Cygwin are more familiar with Unix than Windows and do not
think they have to pay attention to such things and may be disappointed.

> > - evol-new-bitarray-functions.patch
> >
> > We have added a few extensions to the BitArray structure in smlnj-
> > lib. Implementing what we needed outside the compiler using the
> > existing public signature was not efficient enough. You may want to
> > consider this as a PolySpace specific need anyway.
> No comment. I've never used SML/NJ once. :-)

We used SML/NJ before we use MLton and now, we still use SML/NJ for
debugging purpose because the compilation is much faster with it than with
MLton.
 




More information about the MLton mailing list