[HN Gopher] Hash Functions
___________________________________________________________________
Hash Functions
Author : rdpintqogeogsaa
Score : 62 points
Date : 2023-06-03 13:20 UTC (9 hours ago)
(HTM) web link (papa.bretmulvey.com)
(TXT) w3m dump (papa.bretmulvey.com)
| [deleted]
| layer8 wrote:
| I don't understand the encryption use case. Can someone
| elaborate?
| tptacek wrote:
| It's not written super clearly (though the article itself is an
| interesting and ambitious piece of technical writing) but the
| impression I get is that he's referring to the role of hash
| functions in cryptosystems: for signatures, transcript hashes,
| key derivation, channel binding, and things like that.
| Cryptographic hash functions are the glue that binds crypto
| protocols together.
|
| But you can also trivially turn a hash function into a cipher
| and encrypt with it (apologies if I missed an explanation of
| this in the article). Just hash a key and a counter together to
| create a keystream and XOR your plaintext to it. That's how
| Salsa20 and ChaCha20 work. (Interestingly, the reverse process
| --- converting a block cipher into a hash function --- is where
| we historically get our cryptographic hash functions from).
| layer8 wrote:
| I'm familiar with hash functions for signatures, however I'm
| confused about the mention for encryption, in particular with
| respect to the context of common schemes like AES and RSA. I
| guess hashing a key is an option though.
| commandersaki wrote:
| Modern hashing functions pretty much offer encryption out
| of box like gimli or BLAKE2 family (I think they call it
| XOF mode).
|
| This is pretty much thanks to the sponge construction.
| tptacek wrote:
| You can spitball a hash-based stream cipher with any hash
| function in just a couple lines of python. Take a 128 bit
| key string, and then hash it with an incrementing counter
| to get successive 32 bytes of keystream data, just like
| you would with AES in CTR mode.
| adamzochowski wrote:
| There is a fourth category of hash functions: find similar items.
|
| For English words, this is usually done with soundex / metaphone
| / doublemetaphone. Fun fact, soundex was developed over 100 years
| ago to help immigrants find their families in USA even if they
| didn't know how to spell their last name. Soundex is supported by
| most DBs.
|
| For images, there are hashes like ahash/phash/dhash/whash/blockme
| at/colormoment/colorhas/marrHildreth .
|
| Most popular Audio hash is acoustid. There was also EchoPrint
| (with two incompatible versions), but once EchoNest was bought
| out by Spotify the public support and development into echoprint
| died. Some groups of people tried to keep echoprint alive but I
| am unaware if it is still used by anyone publicly.
| joezydeco wrote:
| Soundex is still out there. If you live in Illinois, Florida,
| or Wisconsin your drivers license number is a soundex hash of
| your name combined with your birthday.
|
| http://www.highprogrammer.com/alan/numbers/dl_us_shared.html
| preseinger wrote:
| hash functions are ultimately just a mapping of an input set of
| high/infinite cardinality to an output set of fixed/finite
| cardinality
|
| you correctly point out that some hash functions exist which
| map input words in some specific language into similarity
| groups based on some concept of semantics
|
| but this is essentially unrelated to the topics raised in the
| linked article
| jll29 wrote:
| Robert C. Russell (and Margaret King Odell) invented SOUNDEX,
| which is described in US Patent 1,261,167 (1918) -see also
| Knuth (1973) TAOCP vol. 3.
|
| https://patentimages.storage.googleapis.com/31/35/a1/f697a3a...
| lynndotpy wrote:
| If I remember correctly, Apple's Voxel Net (~2018 neural
| network, used for doing machine learning on sparse lidar point-
| cloud data) uses a simple "spatial hash" for this purpose,
| using the coordinate of the voxel as a key.
| esafak wrote:
| Today you would use embeddings optimized for similarity search.
| nighthawk454 wrote:
| Embedding models are still essentially a hash function, just
| very complex ones
| scentoni wrote:
| Those are examples of useful functions. They are not examples
| of useful hash functions.
| gabetax wrote:
| Specifically, these functions to not provide "uniform
| distribution".
| jagged-chisel wrote:
| That isn't a requirement of an algorithm called a hash
| function
| gabetax wrote:
| From the article:
|
| > For a function to be useful as a hash function, it must
| exhibit the property of uniform distribution
|
| It's also listed as the first property on the Wikipedia
| page:
|
| https://en.wikipedia.org/wiki/Hash_function#Uniformity
| jagged-chisel wrote:
| I can't find the text you quoted. The article starts off:
|
| > A hash function is any function that can be used to map
| data of arbitrary size to fixed-size values
|
| And the first sentence in the section you link says
|
| > A good hash function should map the expected inputs as
| evenly as possible over its output range.
|
| It doesn't have to be "good" to be a hash function.
| anamexis wrote:
| That's not a meaningful definition of a hash function,
| then. def hash(val): return
| 0
| jagged-chisel wrote:
| Yes, this is a hash function. Whether it's "useful" or
| "good" is the purview of the architect.
|
| Tangential edit: https://xkcd.com/221/
| throwmeaway1212 wrote:
| That has a pretty uniform distribution..
| derefr wrote:
| I think you're conflating "functions used to build a hash
| table" with "functions used to build a dictionary ADT." A
| hash table doesn't have to serve the purpose of storing items
| in a dictionary ADT. A hash table where everything that's
| similar collides into the same buckets is just as useful for
| finding "a linked list of similar items" in O(1), as a hash
| table where everything gets its own unique bucket is useful
| for finding items in O(1).
| michaelcampbell wrote:
| While researching Soundex in the mid/late 80's, I of course
| soundex'd my last name and realized that it was the first part
| of my state's driver's license #.
| [deleted]
| metadat wrote:
| Intriguing, I just checked mine and it seems California
| doesn't appear to do this.
|
| If you're comfortable sharing, would like to find out which
| state you're referring to. Cheers.
| ComputerGuru wrote:
| Another poster shared this link: http://www.highprogrammer.
| com/alan/numbers/dl_us_shared.html
|
| It has the encoding formula for Wisconsin, Illinois, and
| Florida (which match the parent's comment) but there may be
| others.
___________________________________________________________________
(page generated 2023-06-03 23:01 UTC)