       Analyzing New Unique Identifier Formats (UUIDv6, UUIDv7, and
       UUIDv8)
        
       Author : futurecat
       Score  : 34 points
       Date   : 2024-10-09 13:57 UTC (3 days ago)
        
 (HTM) web link (blog.scaledcode.com)
        
       | SturgeonsLaw wrote:
       | https://xkcd.com/927/
       | 
       | edit: I don't intend for this to be dismissive because I actually
       | find the thought that goes into designing UUIDs extremely
       | interesting
        
         | ktm5j wrote:
         | Well, technically this is all about different versions of the
         | same standard.
        
         | lukev wrote:
         | The cool thing about the various versions of UUID is that
         | they're all compatible. The differences almost all come down to
         | database locality (and therefore performance.)
         | 
         | The exception is if you're extracting the time portion of a
         | time-based UUID and using it for purposes other than as a
         | unique key, but in my experience this is typically considered
         | bad practice and time is usually stored in a separate column
         | for cases where it matters for business purposes.
        
         | refulgentis wrote:
         | It's not necessarily that it's dismissive, more that it's a
         | fuzzy pattern-matching comment that's incorrect, and just a
         | wordless link. Trivial to make, nontrivial to respond to:
         | "Funny", in the way in-group cultural references usually are -
         | responding means you're taking it too seriously. Yet incorrect
         | enough that it'll misinform anyone who hasn't diligently read
         | the full article and understood the historical context. Noise
         | that's likely to generate noise. Trolling, just missing active
         | intent to derail.
        
       | maxfurman wrote:
       | I'm having trouble understanding the use of v8. It can be pretty
       | much any bits as long as it has 1000 in the right spot? It
       | strikes me as too minimal to be useful. I must be missing
       | something
        
         | SigmundA wrote:
         | The useful part is you can do anything you want with the other
         | bits and have it still be a valid UUID.
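
To make the v8 point concrete, here is a minimal Python sketch that packs
caller-defined bits around the fixed version and variant fields. The helper
name `make_uuid_v8` and the choice to pass the payload as one 122-bit integer
are assumptions for illustration, not something from the article; the field
layout (48 + 12 + 62 custom bits) follows RFC 9562.

```python
import uuid


def make_uuid_v8(custom_bits: int) -> uuid.UUID:
    """Pack up to 122 caller-defined bits into a UUIDv8 (hypothetical helper)."""
    if not 0 <= custom_bits < (1 << 122):
        raise ValueError("custom_bits must fit in 122 bits")

    custom_a = custom_bits >> 74                # top 48 bits
    custom_b = (custom_bits >> 62) & 0xFFF      # next 12 bits
    custom_c = custom_bits & ((1 << 62) - 1)    # low 62 bits

    value = (
        (custom_a << 80)
        | (0x8 << 76)      # version field: binary 1000
        | (custom_b << 64)
        | (0b10 << 62)     # variant field: binary 10
        | custom_c
    )
    return uuid.UUID(int=value)


u = make_uuid_v8(0)
print(u, u.version)  # 00000000-0000-8000-8000-000000000000 8
```

Everything outside those six fixed bits is up to the application, which is
the whole appeal: the result still parses anywhere a UUID is accepted.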
        
       | refulgentis wrote:
       | v7 is really helpful for meaningful UX improvements.
       | 
       | ex. I'm loading your documents on startup.
       | 
       | Eventually, we're going to display them as a list on your home
       | screen, newest to oldest.
       | 
       | Now, instead of having to parse the entire document to get the
       | modified date, or wait for the file system, I can just sort by
       | the UUID v7 that's in the filename.
       | 
       | Is it perfect? No, ex. we could have a really old doc that's the
       | most recently modified, and the doc ID is only a proxy for the
       | _creation_ date.
       | 
       | But it's _much_ better than the status quo of "we're parsing
       | 1000+ docs at ~random at startup, please wait 5 seconds for the
       | list to stop updating over and over."
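
The filename trick above can be sketched in a few lines of Python. This
assumes each file is named `<uuidv7>.<ext>` with the UUID in canonical
lowercase hex, so plain string comparison follows the embedded millisecond
timestamp. The function name `newest_first` and the sample filenames are
hypothetical.

```python
from pathlib import PurePath


def newest_first(filenames):
    """Order files newest to oldest using only the UUIDv7 in the name."""
    # Canonical UUIDv7 strings start with the 48-bit Unix-millisecond
    # timestamp in big-endian hex, so lexical order is creation order.
    return sorted(filenames, key=lambda name: PurePath(name).stem, reverse=True)


docs = [
    "018f0000-0000-7000-8000-000000000001.json",
    "01920000-0000-7000-8000-000000000002.json",
    "018a0000-0000-7000-8000-000000000003.json",
]
for name in newest_first(docs):
    print(name)
```

No file is opened or stat'ed here. IDs minted within the same millisecond
fall back to comparing their remaining bits, so their relative order is
only meaningful if the generator uses a counter.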
        
       | oezi wrote:
       | I have recently wondered why Ruby on Rails is using a full-length
       | SHA256 for their ETag fingerprinting (64 characters) when a UUID
       | at 36 chars would probably be entirely enough to prevent
       | collisions and be more readable at the same time. Esbuild on the
       | other hand seems to use just 32bit (8 chars) for their content
       | hash.
        
         | nertzy wrote:
         | Isn't it because you can generate the same content two
         | different times and hash it and come to the same ETag value?
         | 
         | Using a UUID wouldn't help here because you don't want
         | different identifiers for the same content. Time-based UUID
         | versions would negate the point of ETag, and otherwise if you
         | use UUIDv8 and simply put a hash value in there, all you're
         | doing is reducing the bit depth of the hash and changing its
         | formatting, for limited benefit.
        
           | oezi wrote:
           | I would assume that you would only create a new UUID if the
           | content of the tagged file changed serverside.
           | 
           | Benefits are readability and a reduced amount of data to be
           | transferred. A UUID is reasonably safe to be unique for the
           | ETag use case (I think 64 bits actually would be enough).
        
             | vlovich123 wrote:
             | SHA256 has the benefit that you can generate the ETag
             | deterministically without needing to maintain a database
             | (i.e. content-based hashing). That way you also don't need
             | to track if the content changes which reduces bugs that
             | might creep in with UUIDs. Also, if typically you only
             | update a subset of all files, then aside from not needing
             | to keep track of assigned UUIDs per file, you can do a
             | partial update. Reasons to do content-based hashing are not
             | invalidated because of a new UUID format.
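
The determinism argument in this subthread can be illustrated with a short
Python sketch. It is not how Rails or esbuild compute their fingerprints;
`content_etag` and its `hex_chars` truncation parameter are hypothetical,
included only to show that a shorter tag can still be content-derived.

```python
import hashlib
import uuid


def content_etag(body: bytes, hex_chars: int = 64) -> str:
    """Derive a quoted ETag value from the content alone (hypothetical helper)."""
    digest = hashlib.sha256(body).hexdigest()
    # Truncating trades collision resistance for a shorter header value.
    return f'"{digest[:hex_chars]}"'


page = b"<h1>hello</h1>"

# Same bytes always give the same tag, with no stored state.
print(content_etag(page) == content_etag(page))

# A random UUID per render would differ each time, so it would have to be
# stored and invalidated whenever the content changes.
print(uuid.uuid4() == uuid.uuid4())
```

Two servers, or two deploys, that produce the same bytes agree on the tag
without coordinating, which is the property a freshly generated UUID
cannot provide.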
        
       | wood_spirit wrote:
       | I am a big fan of the new uuid v7 format.
       | 
       | It has the advantage of being a drop-in replacement in most
       | places where people use v4 today. It also has the advantage over
       | other specs such as ULID in that it can be parsed easily even in languages
       | and databases with no libraries because you just need some
       | obvious substr replace and from_hex to extract the timestamp.
       | Other specs typically used some custom lexically sortable base64
       | or something that always needed a library.
       | 
       | Early drafts of the spec included a few bits to increment if
       | there were local ids generated in the same millisecond for
       | sequencing. This was a good fit for lots of use cases like using
       | the new ids for events generated in normal client apps. Even
       | though it didn't make the final spec I think it's worth
       | implementing as it doesn't break compatibility.
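
The "substr, replace and from_hex" extraction mentioned above looks roughly
like this in Python, using only string operations and integer parsing. The
function name `uuid7_timestamp` is an assumption for illustration; the only
layout fact relied on is that the first 48 bits of a UUIDv7 hold Unix time
in milliseconds.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def uuid7_timestamp(text: str) -> datetime:
    """Read the creation time out of a UUIDv7 string (hypothetical helper)."""
    hex_digits = text.replace("-", "")
    # The 13th hex digit is the version nibble; reject anything but v7.
    if len(hex_digits) != 32 or hex_digits[12] != "7":
        raise ValueError("not a UUIDv7 string")
    millis = int(hex_digits[:12], 16)  # first 48 bits: Unix time in ms
    return EPOCH + timedelta(milliseconds=millis)


print(uuid7_timestamp("00000000-03e8-7000-8000-000000000000"))
# 1970-01-01 00:00:01+00:00
```

The same three steps (strip dashes, take the first 12 hex characters, parse
as an integer) translate directly to SQL or shell, which is the portability
advantage described in the comment.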
        
       ___________________________________________________________________
       (page generated 2024-10-12 23:00 UTC)