[MLton] Re: [MLton-commit] r6699

Wesley W. Terpstra wesley at terpstra.ca
Mon Jun 15 10:03:10 PDT 2009


On Mon, Jun 15, 2009 at 5:21 PM, Matthew Fluet<fluet at tti-c.org> wrote:
> It's about the Win32 spawn* functions (and possibly the CreateProcess
> function), which provide fork/exec-like functionality.
>
> The issue (as I understand it) is that the  char **argv  argument passed to
> spawnv{,p}{,e} becomes the  const char **argv  argument passed to main of
> the created process.  One doesn't expect the contents of those character
> arrays to be changed from spawn{,p}{,e} to main (that is, one shouldn't need
> to do any escaping at all and one certainly doesn't need to for the *nix
> exec{,p}{,e} functions), but there is some evidence that MinGW does (or
> un-does?) escaping of the arguments.

The root problem is that Windows does not have an **argv. That's a
Unix convention. Windows programs receive a single flat command-line
string (see CreateProcess). The CRT has code which parses and splits
this flat string to emulate argv functionality. The exec() and spawn()
functions have code which pastes the arguments together.
Unfortunately, a long-standing bug in Windows is that these pasting
and parsing operations are NOT compatible.
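
To make the mismatch concrete, here is a minimal sketch in Python. The
names paste() and parse() are hypothetical stand-ins, and the parser is
deliberately simplified to whitespace splitting (the real CRT parser
also understands quotes and backslashes). It only illustrates that
pasting without quoting loses the argument boundaries:

```python
def paste(args):
    # Naive pasting: join the arguments with single spaces, no quoting.
    return " ".join(args)


def parse(cmdline):
    # Simplified stand-in for the start-up parser: split on whitespace.
    return cmdline.split()


original = ["a b", "c"]
flat = paste(original)      # the single flat command-line string
recovered = parse(flat)     # what the child would see as argv

print(original, "->", repr(flat), "->", recovered)
```

Two arguments go in and three come out, because nothing in the flat
string records where "a b" ended.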

The MinGW (/ Windows CRT) version of pasting is simply ("a", "b", "c")
-> "a b c". Obviously this breaks for ("a b", "c") -> "a b c". That's
why MinGW needs to escape arguments to spawn as well as CreateProcess.
The escaping function in mlton/process.sml was hand-crafted to match
the parsing done by the Windows CRT at program start-up. The
launchWithCreate method similarly combines ("a b", "c") -> "a b c",
but after it escapes its arguments the same as it would for spawn().
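
As an illustration of what "escape to match the parser" means, here is
a rough Python sketch. It is NOT the code in mlton/process.sml; the
names quote_arg(), paste_escaped() and parse_cmdline() are made up for
this example. It assumes the commonly documented Microsoft C runtime
rules (backslashes are literal unless they precede a double quote; 2n
backslashes plus a quote give n backslashes and toggle quoting; 2n+1
backslashes plus a quote give n backslashes and a literal quote), and
it ignores the special handling of the program name and of "" inside a
quoted region:

```python
def quote_arg(arg):
    # Leave simple, non-empty arguments untouched.
    if arg and not any(c in arg for c in ' \t"'):
        return arg
    out = ['"']
    backslashes = 0
    for ch in arg:
        if ch == '\\':
            backslashes += 1          # defer: meaning depends on what follows
        elif ch == '"':
            # Double the pending backslashes, then escape the quote itself.
            out.append('\\' * (2 * backslashes + 1) + '"')
            backslashes = 0
        else:
            out.append('\\' * backslashes + ch)
            backslashes = 0
    # Trailing backslashes must be doubled before the closing quote.
    out.append('\\' * (2 * backslashes) + '"')
    return ''.join(out)


def paste_escaped(args):
    # Escape each argument first, then paste with single spaces.
    return ' '.join(quote_arg(a) for a in args)


def parse_cmdline(s):
    # Simplified emulation of the start-up parser under the rules above.
    args, cur, in_quotes, have_arg = [], [], False, False
    i = 0
    while i < len(s):
        ch = s[i]
        if ch == '\\':
            n = 0
            while i < len(s) and s[i] == '\\':
                n += 1
                i += 1
            if i < len(s) and s[i] == '"':
                cur.append('\\' * (n // 2))
                if n % 2:
                    cur.append('"')            # odd count: literal quote
                else:
                    in_quotes = not in_quotes  # even count: quote toggles
                i += 1
            else:
                cur.append('\\' * n)           # not before a quote: literal
            have_arg = True
        elif ch == '"':
            in_quotes = not in_quotes
            have_arg = True
            i += 1
        elif ch in ' \t' and not in_quotes:
            if have_arg:
                args.append(''.join(cur))
                cur, have_arg = [], False
            i += 1
        else:
            cur.append(ch)
            have_arg = True
            i += 1
    if have_arg:
        args.append(''.join(cur))
    return args


args = ["a b", "c", 'say "hi"', "dir\\", "", "C:\\Program Files\\x\\"]
flat = paste_escaped(args)
print(flat)
print(parse_cmdline(flat) == args)
```

The point of the design is the round trip: whatever escaping the parent
applies has to be exactly what the child's parser undoes, which is why
the escaping has to be written against one specific runtime.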

Cygwin has to paste and parse arguments just as MinGW does; however,
it's possible that the Cygwin parsing/pasting actually matches (but I
wouldn't bet on this). If they do match, then no escaping is needed
for spawn. However, like MinGW, Cygwin sometimes calls CreateProcess.
The arguments will need to be escaped and pasted together in whatever
way matches the Cygwin runtime. I don't know how the Cygwin runtime
parses its single argument string, but what I read said:

      (* In cygwin, according to what I read, \ should always become \\.
       * Furthermore, more characters cause escaping as compared to MinGW.
       * From what I read, " should become "", not \", but I leave the old
       * behaviour alone until someone runs the spawn regression.
       *)

However, I didn't (and don't) have a Cygwin installation to poke at
for the parsing algorithm used.


