new regexp library

Henry Cejtin henry@sourcelight.com
Thu, 21 Jun 2001 02:35:11 -0500


I  don't think I would bother with returning the list of matches yet.  What I
usually do is have a regexp that just handles  one  initial  element  of  the
list,  and then, using that, I extract the element and the rest of the string
and loop.  This really pushes you to have some kind of  match-regexp-against-
substring kind of thing, but that should be no problem.

This division can often be more efficient than matching everything in one big
regular expression any way.  If you think about it, it is very  analogous  to
Prolog's  cut  (`!').  The notion is that once you see the start of the list,
you are NOT willing to backtrack around this choice.  If you write it as  one
big regular expression, you can't express that, and so the matching can often
take longer because it is going to keep these other possibilities alive.   In
order  to  avoid any order n^2 thing, you have to stop scanning when you come
to a state which can't possibly succeed.  This shouldn't be hard even with an
NFA, let alone with a DFA.