[HN Gopher] Regular expression matching can be simple and fast (...
___________________________________________________________________
Regular expression matching can be simple and fast (2007)
Author : momonga
Score : 65 points
Date : 2024-05-21 01:26 UTC (10 hours ago)
(HTM) web link (swtch.com)
(TXT) w3m dump (swtch.com)
| jjice wrote:
| This and the associated series of regex posts from Russ Cox are
| probably the best content on practical regular expression engine
| internals you can get. If you have a cursory understanding of
| NFAs and DFAs, it's some of the best technical writing I've ever
| had the pleasure to read.
|
| These posts are responsible for my love of regular expressions :)
| kqr wrote:
| And if you don't have a cursory understanding of NFAs and DFAs,
| they're still great for getting that cursory understanding.
|
| These articles were my introduction to finite automata (and
| almost the only place I've learned about them aside from
| practical use) and my intuition for them seem to often be
| better than my peers'.
| banish-m4 wrote:
| NFAs are apparently undecidable since the same input can go
| to multiple states. DFA states can only go to one state when
| given a particular input sequence.
|
| The Dragon book contains an NFA to DFA algorithm (powerset)
| which explodes states with multiple exiting token
| transitions.
|
| PS: All your DAGs are belong to us.
| gliptic wrote:
| NFAs are decidable. As you said, a NFA can be converted to
| a DFA which is decidable. Maybe you mean they are non-
| deterministic (that's in the name)?
| ChrisArchitect wrote:
| Some discussion from 5 years ago:
|
| https://news.ycombinator.com/item?id=20310669
| dang wrote:
| Thanks! Macroexpanded:
|
| _Regular Expression Matching Can Be Simple and Fast (2007)_ -
| https://news.ycombinator.com/item?id=20310669 - June 2019 (33
| comments)
|
| _Regular Expression Matching Can Be Simple and Fast (2007)_ -
| https://news.ycombinator.com/item?id=16341519 - Feb 2018 (20
| comments)
|
| _Regular Expression Matching Can Be Simple and Fast (2007)_ -
| https://news.ycombinator.com/item?id=9374858 - April 2015 (16
| comments)
|
| _Regular Expression Matching Can Be Simple And Fast [2007]_ -
| https://news.ycombinator.com/item?id=6593331 - Oct 2013 (1
| comment)
|
| _Regular Expression Matching Can Be Simple And Fast_ -
| https://news.ycombinator.com/item?id=820201 - Sept 2009 (15
| comments)
|
| _Regular Expression Matching Can Be Simple And Fast_ -
| https://news.ycombinator.com/item?id=466845 - Feb 2009 (17
| comments)
| rurban wrote:
| Perl needed a couple of years to catchup. At first they had no
| idea what Russ was talking about
| adrianN wrote:
| Regexes can be fast and simple, but sometimes it's faster and
| simpler to use normal string search. I recently replaced a regex
| with two finds for 10x speedup.
| secondcoming wrote:
| Well if people spend time learning arcane regex syntax it'd be
| a shame to not use them!
|
| But yes, you're right. Sometimes simple finds are sufficient.
| cies wrote:
| Or write a parser!
|
| In most cases the parser will not be shorter (code) than the
| regex, but it will most certainly be simpler (cognitively; it's
| hard to understand all that regexes do, they're close to magic)
| and more testable (you can test the parts of the parser, like
| the number parser or the string parser, individually), and
| often faster too.
| cies wrote:
| https://stackoverflow.com/questions/11905506/regular-
| express...
| chx wrote:
| > As mentioned earlier, no one knows how to implement regular
| expressions with backreferences efficiently, though no one can
| prove that it's impossible either. (Specifically, the problem is
| NP-complete, meaning that if someone did find an efficient
| implementation, that would be major news to computer scientists
| and would win a million dollar prize.)
|
| It really needs to be noted how NP-complete and an algorithm
| efficient _in practice_ have nothing to do with each other. If an
| algorithm takes C * 1.000...eighty zeroes...1^n time it is
| exponential but for any n you will encounter in real life the
| time it takes will be indistinguishable from C.
| bubblyworld wrote:
| There's a fun list of problems like this here:
| https://cstheory.stackexchange.com/questions/6660/polynomial...
|
| On the other hand, it's remarkable that so many algorithms _do_
| have reasonable constants/exponents. So the concept works
| pretty well.
___________________________________________________________________
(page generated 2024-05-21 12:01 UTC)