[MLton] Re: [MLton-commit] r6699

Wesley W. Terpstra wesley at terpstra.ca
Mon Jun 15 19:06:45 PDT 2009


On Tue, Jun 16, 2009 at 1:13 AM, Matthew Fluet<fluet at tti-c.org> wrote:
> While I can understand the marshalling/unmarshalling of arguments through a
> single string, what I'm unclear on is where Cygwin and MinGW interpose their
> own conventions.  That is, spawn{,p}{,e} and CreateProcess are Win32
> functions (right?) --- yet Cygwin and MinGW interpose their own version that
> (may or may not) munge the arguments (before calling the "real"
> spawn{,p}{,e} and CreateProcess)?

There is a difference between a system library function and a kernel
call. Cygwin applications are not linked against the Windows CRT. All
their calls go through cygwin1.dll. That means CreateProcess, spawn,
exec, ... everything runs out of cygwin1.dll. I'm not even
certain that 'spawn' corresponds to a kernel call, since you could
implement it on top of something like CreateProcess.

> Similarly, starting a program from the
> console should begin execution at main; though, technically, it is wherever
> the loader begins execution, so Cygwin and MinGW could provide their own
> _start (or whatever symbol it is in Windows) that (may or may not) unmunge
> the arguments before calling main.

Correct. main is not the start of a program; crt1.o is.
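
To make the "unmunging" step concrete, here is a rough sketch of the
kind of work startup code has to do before it can call main: split the
single command-line string into an argv list. It follows the
backslash/double-quote rules Microsoft documents for its C runtime, in
simplified form. The function name split_command_line is made up for
illustration, and nothing here claims to match what Cygwin's or
MinGW's startup code actually does.

```python
def split_command_line(cmdline):
    """Split one command-line string into a list of arguments.

    Simplified sketch of the documented Microsoft C runtime rules:
    - spaces and tabs separate arguments unless inside double quotes;
    - 2n backslashes before a quote give n backslashes and toggle quoting;
    - 2n+1 backslashes before a quote give n backslashes and a literal quote;
    - backslashes not followed by a quote are taken literally.
    Special cases (for example, how the program name is parsed) are omitted.
    """
    argv = []
    current = []          # characters of the argument being built
    in_quotes = False
    have_arg = False      # distinguishes an empty "" argument from no argument
    backslashes = 0       # run of backslashes not yet emitted

    for ch in cmdline:
        if ch == "\\":
            backslashes += 1
            have_arg = True
            continue
        if ch == '"':
            current.append("\\" * (backslashes // 2))
            if backslashes % 2 == 0:
                in_quotes = not in_quotes   # quote acts as a delimiter
            else:
                current.append('"')         # quote was escaped
            backslashes = 0
            have_arg = True
            continue
        # Any other character: pending backslashes are literal.
        current.append("\\" * backslashes)
        backslashes = 0
        if ch in " \t" and not in_quotes:
            if have_arg:
                argv.append("".join(current))
            current = []
            have_arg = False
        else:
            current.append(ch)
            have_arg = True

    current.append("\\" * backslashes)
    if have_arg:
        argv.append("".join(current))
    return argv
```

A runtime that applies different rules at this point would hand main a
different argv for the same string, which is the mismatch being
discussed.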

> Of course, when calling spawn{,p}{,e} or CreateProcess from a Cygwin or
> MinGW program, it can't know whether the called executable is itself a
> Cygwin, MinGW, or plain Windows program.

Actually, that's false. Cygwin programs recognize each other and do
special magic to communicate. For instance, there is no kill() call in
Windows. Yet Cygwin processes are able to kill their children and fire
off a signal handler. How? They secretly open a pipe between the
processes to carry signaling information. Similarly, fork() does
extremely frightening voodoo where it copies memory from its "parent"
process to emulate Unix.
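
As a toy illustration of the pipe idea (not Cygwin's actual
mechanism, which I have not read), the sketch below sends a signal
number down a pipe and has the receiving end look up and run a
handler. A thread stands in for the child process, and the 4-byte
packet format and all function names are assumptions made up for the
example.

```python
import os
import struct
import threading

# Hypothetical wire format: one little-endian 32-bit signal number per packet.
PACKET = struct.Struct("<i")


def send_signal(write_fd, signo):
    """Sender side: write the signal number into the pipe."""
    os.write(write_fd, PACKET.pack(signo))


def run_receiver(read_fd, handlers):
    """Receiver side: read packets and dispatch to registered handlers."""
    with os.fdopen(read_fd, "rb") as pipe:
        while True:
            packet = pipe.read(PACKET.size)
            if len(packet) < PACKET.size:
                return  # the writer closed its end of the pipe
            (signo,) = PACKET.unpack(packet)
            handler = handlers.get(signo)
            if handler is not None:
                handler(signo)  # signals without a handler are ignored


def demo():
    """Deliver two 'signals'; only the one with a handler is recorded."""
    received = []
    read_fd, write_fd = os.pipe()
    receiver = threading.Thread(
        target=run_receiver, args=(read_fd, {15: received.append})
    )
    receiver.start()
    send_signal(write_fd, 15)  # has a handler
    send_signal(write_fd, 2)   # no handler registered
    os.close(write_fd)         # lets the receiver loop end
    receiver.join()
    return received
```

The point is only that both ends must agree on the private channel and
its format, which is why this kind of trick works between two Cygwin
programs but not with an arbitrary Windows program.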

> So, I don't see why it is sensible for Cygwin
> or MinGW to munge/unmunge arguments at all, since it can't know what
> was/will-be done on the other end.

Well, I won't debate whether it's sensible, but that is how it works.
It does seem likely that the string eventually delivered to a
CreateProcess kernel call is escaped similarly for both Cygwin and
MinGW (though, as I mentioned, it is possible this is false).

I wouldn't be surprised if the spawn() function in cygwin1.dll
requires no escaping at all. They're in a position to fix this bug for
their applications. CreateProcess will definitely need some sort
of escaping, though. It might be different from MinGW's, because the
library re-munges the arguments. I don't know. Someone who has it
installed would have to reverse engineer it, or find example code via
Google to test.
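
For reference, here is a sketch of one plausible escaping scheme: the
one that matches the argument-parsing rules Microsoft documents for
its C runtime. Whether cygwin1.dll or MinGW expects exactly this is
the open question above, so treat it as a starting point for testing,
not as an answer. The names quote_arg and join_command_line are
invented for the example.

```python
def quote_arg(arg):
    """Quote one argument so the documented MS C runtime rules recover it."""
    # Arguments without spaces, tabs, or quotes can be passed through as-is.
    if arg and not any(ch in arg for ch in ' \t"'):
        return arg

    out = ['"']
    backslashes = 0  # run of backslashes seen but not yet written
    for ch in arg:
        if ch == "\\":
            backslashes += 1
        elif ch == '"':
            # Double the pending backslashes, then escape the quote itself.
            out.append("\\" * (2 * backslashes + 1))
            out.append('"')
            backslashes = 0
        else:
            # Backslashes not followed by a quote stay literal.
            out.append("\\" * backslashes)
            out.append(ch)
            backslashes = 0
    # Trailing backslashes precede the closing quote, so double them.
    out.append("\\" * (2 * backslashes))
    out.append('"')
    return "".join(out)


def join_command_line(argv):
    """Build the single string that would be handed to CreateProcess."""
    return " ".join(quote_arg(arg) for arg in argv)
```

For example, join_command_line(["prog", "a b", "C:\\dir\\"]) keeps the
last argument unquoted because it contains no space, tab, or quote,
while "a b" gets wrapped in double quotes.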


