[HN Gopher] Prefix Trees in Action
___________________________________________________________________
Prefix Trees in Action
Author : YuukiRey
Score : 15 points
Date : 2021-04-01 12:21 UTC (2 days ago)
(HTM) web link (medium.com)
(TXT) w3m dump (medium.com)
| staticassertion wrote:
| Thanks for writing this up. I'm solving a similar problem, albeit
| different in a few important ways, and was planning on taking a
| semi-similar approach. Like you, today I'm using a fairly naive
| implementation with some less-than-ideal complexity.
|
| I've never actually worked with a prefix tree before, so this was
| quite helpful for me.
|
| This may be of interest as well - a related paper, Pigasus, does
| this with specialized hardware but it's a similar problem of
| applying 10s of thousands of rules as quickly as possible. I
| found it while reading through acolyer's morning paper.
| https://blog.acolyer.org/2020/11/16/pigasus/
|
| Again, slightly different task - they wouldn't be going for the
| longest match exclusively, but all matches. Also, the rules they
| use are fundamentally regular expressions, so the filtering pass
| is to take advantage of the domain specific knowledge that rule-
| matches are very rare.
| taeric wrote:
| My guess is your "fundamentally regular expressions" is, in
| fact "fundamentally an automata". Which, my guess is even this
| article could have used an automata such that they are
| basically lexing the data they are using over.
| staticassertion wrote:
| I assume you're saying that the regular expressions used are
| likely to be equivalent to a DFA? If so, I agree, that's very
| likely, and I was also thinking about that.
|
| I believe Go's regex engine is built on RE2 though, so idk.
| taeric wrote:
| Essentially, yes. My point is if you implemented the dfa
| directly, then you could just iterate all the characters
| doing whatever the current state of the automata says to do
| at each character.
| secondcoming wrote:
| Someone's Marketing team forced them to write an article!
|
| But yes, tries are very cool.
| taeric wrote:
| I fully agree, and yet I have so rarely had direct need for
| then that it is kind of scary when I see folks reach for them.
|
| Worse, they are stupid common in interviews.
| musingsole wrote:
| > Because of that, the first step on the road to fast programs is
| to measure!
|
| > Making an operation that accounts for 1% of the overall runtime
| 100% faster is far less effective than making something that
| accounts for 50% of the runtime10% faster.
|
| Currently dealing with a slow DB and a team of "engineers" who
| have just _NOT_ internalized the above. It 's been very
| demoralizing.
___________________________________________________________________
(page generated 2021-04-03 23:01 UTC)