
Re: Proposal for \v and \V, the small- and large- cut regex operators.

From: Jeffrey Friedl
Date: August 6, 2000 02:40
Subject: Re: Proposal for \v and \V, the small- and large- cut regex operators.
Message ID: 200008060939.CAA10570@ventrue.yahoo.com

Hugo  <hv@crypt.compulink.co.uk> wrote:
|> I don't think I understand what aspects of the current implementation
|> could change that would invalidate your \v/\V proposal without also
|> undermining the documented definitions of what matches when, but 
|> consider the (?>...) definition to which he refers (from the latest
|> perlre.pod):
|> 
|>   An "independent" subexpression, one which matches the substring
|>   that a I<standalone> C<pattern> would match if anchored at the given
|>   position, and it matches I<nothing other than this substring>.
|> 
|> Can you write a definition of \v and \V that does not invoke the
|> details of the current implementation?

I guess the question then becomes how far away from "implementation" one
has to step. I believe that my definition steps back as far as the general
semantics of regular expressions applied with a nondeterministic finite
automaton-based engine, and in particular, the description of Perl regex
semantics given on pages 197-201 of PP3.

If we were to step back further to include deterministic finite automata,
my description wouldn't be valid, but then, neither would the concept :-)
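
As an illustration of the NFA/DFA distinction drawn above, here is a minimal
sketch in Python (whose `re` module is a backtracking engine, like Perl's).
With ordered alternation, a backtracking engine reports the first alternative
that leads to a match, while a leftmost-longest matcher of the kind a DFA
gives you would report the longest one. The helper `leftmost_longest` is
hypothetical and only emulates that second behavior for literal alternatives;
it is not part of any library.

```python
import re

text = "abcd"

# Backtracking (NFA-style) semantics: alternatives are tried in order,
# and the first one that allows an overall match wins.
m = re.match(r"ab|abcd", text)
print(m.group(0))  # ab


def leftmost_longest(alternatives, text):
    """Hypothetical helper: emulate leftmost-longest semantics for literal
    alternatives by picking the longest one that matches at position 0."""
    hits = [alt for alt in alternatives if text.startswith(alt)]
    return max(hits, key=len) if hits else None


print(leftmost_longest(["ab", "abcd"], text))  # abcd
```

Under leftmost-longest semantics there is no order of attempts to speak of,
so there is nothing for a "cut" to prune, which is presumably why the concept
does not carry over.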

To be clear, I believe that the definitions I posted rely only on
documented features of Perl, and on nothing implementation-specific at all.

I don't believe that you can have a meaningful understanding of Perl regex
semantics without understanding the concept of how a nondeterministic finite
automaton engine works. (But I don't believe it's important to actually know
the phrase "nondeterministic finite automaton", and probably a lot better
not to.)
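
To make the backtracking behavior concrete, here is a small sketch of the
"independent subexpression" definition quoted earlier, written in Python
rather than Perl. It is an emulation under stated assumptions: instead of
`(?>a+)` it uses the lookahead-plus-backreference idiom `(?=(a+))\1`, which
relies on the engine not backtracking into a lookahead once it has succeeded.
In Perl the second pattern would presumably be written `(?>a+)ab`.

```python
import re

text = "aaab"

# What a standalone a+ matches when anchored at position 0.
standalone = re.match(r"a+", text)

# Ordinary greedy quantifier: a+ first takes "aaa", then gives one "a"
# back so that the trailing "ab" can match.
plain = re.match(r"a+ab", text)

# Emulated independent subexpression: the lookahead captures what the
# standalone a+ would match ("aaa"), and \1 consumes exactly that text.
# The engine cannot re-enter the lookahead to try a shorter capture,
# so the trailing "ab" has nothing left to match against.
independent = re.match(r"(?=(a+))\1ab", text)

print(standalone.group(0))                          # aaa
print(plain.group(0) if plain else None)            # aaab
print(independent.group(0) if independent else None)  # None
```

The third pattern fails precisely because the subexpression matches "nothing
other than this substring": once it has claimed "aaa", no shorter alternative
is ever offered to the rest of the pattern.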

My opinion is that it's better to talk up front about what's happening, and
let understanding of the complex parts blossom from a true understanding of
the basics. I realize, though, that not everyone will have the
desire/patience for that kind of approach, which is why it's good that
there are so many different forms of documentation.

But a rose by any other name still smells as sweet (you can quote me on
that :-), and however you choose to describe it, I believe that the \v & \V
wedges would make meaningful additions to the language.

	Jeffrey
------------------------------------------------------------------------------
Jeffrey Friedl <jfriedl@yahoo-inc.com> Yahoo! Finance http://finance.yahoo.com



