[HN Gopher] Efficient set-membership filters and dictionaries ba...
___________________________________________________________________
Efficient set-membership filters and dictionaries based on SAT
Author : keepamovin
Score : 28 points
Date : 2025-06-29 14:10 UTC (3 days ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| convolvatron wrote:
| the reference in the repo is paywalled (US$ 30!). I did find this
| https://arxiv.org/pdf/1912.08258 which may or may not be related.
| but what I found interesting is that the construction looks alot
| like perfect hashes
| joe_the_user wrote:
| They list two references I think. The first is from the Journal
| Of Satifiability. The link appears empty to my browser but this
| link leads to an article of the same authros and the same
| journal;
| https://www.cs.uky.edu/~marek/papers.dir/11.dir/JSAT8_10_Wea...
|
| The second paper was from a conference originally and found
| this link to it through Google Scholar (also listed in another
| comment);
| http://t-news.cn/Floc2018/FLoC2018-pages/proceedings_paper_4...
| inasio wrote:
| Membership filters are very efficient filters that guarantee no
| false negatives, but false positives are possible (how much and
| how many can be adjusted based on the dataset and filter's
| parameters). An obvious application could something like checking
| whether passengers are in a no-fly list, where false-positives
| could be handled by further checks. As far as I know cuckoo
| filters [0] are the state of the art for this, but per this work
| in principle you could make very efficient with using a SAT (or
| XORSAT) solver that could generate many feasible solutions out of
| random SAT problems.
|
| - Google scholar pointed out this link to get a pdf for one of
| the papers cited in the repo [1]
|
| [0] https://en.wikipedia.org/wiki/Cuckoo_filter
|
| [1]
| http://t-news.cn/Floc2018/FLoC2018-pages/proceedings_paper_4...
| thaumasiotes wrote:
| > An obvious application could something like checking whether
| passengers are in a no-fly list, where false-positives could be
| handled by further checks.
|
| Why is this an obvious application? How does this application
| benefit from a "very efficient" first pass? Just the boarding
| process on an airplane takes 20-30 minutes; you can easily
| check the entire passenger manifest in an error-free way in
| much less time than that. People have to buy their tickets
| before the boarding process _begins_.
| gopalv wrote:
| My favourite part of these research publications from the US Gov
| is the licensing.
|
| All of the USDS work is published with "No Copyright".
|
| The SAT filters however still do not support incremental
| building, which is one of bloom filters fun features when you use
| them in distributed databases (you can build N of them and then
| OR bloom filters to get a single one).
|
| I imagine it will still be incredibly useful where you can
| iterate over them and do OR the old fashioned way, but at higher
| accuracy for the same size.
___________________________________________________________________
(page generated 2025-07-02 23:00 UTC)