[HN Gopher] Nanosecond timestamp collisions are common
___________________________________________________________________
Nanosecond timestamp collisions are common
Author : ingve
Score : 273 points
Date : 2023-07-21 07:01 UTC (16 hours ago)
(HTM) web link (www.evanjones.ca)
(TXT) w3m dump (www.evanjones.ca)
| vbezhenar wrote:
| If you need unique nanosecond timestamps, keep track of the
| previously generated one and increase it if necessary. Would
| require a global lock or atomics, but should be good enough for
| practical uses.
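|
| A minimal C sketch of that idea (assuming C11 atomics and
| POSIX clock_gettime; names are illustrative):
|
|     #include <stdatomic.h>
|     #include <stdint.h>
|     #include <time.h>
|
|     /* Last timestamp handed out, shared by all threads. */
|     static _Atomic uint64_t last_ns;
|
|     /* Return a unique, strictly increasing nanosecond value. */
|     uint64_t unique_now_ns(void) {
|         struct timespec ts;
|         clock_gettime(CLOCK_REALTIME, &ts);
|         uint64_t now = (uint64_t)ts.tv_sec * 1000000000u
|                      + (uint64_t)ts.tv_nsec;
|         uint64_t prev = atomic_load(&last_ns);
|         for (;;) {
|             uint64_t next = now > prev ? now : prev + 1;
|             /* Only one thread wins each value; on failure,
|                prev is reloaded and we retry. */
|             if (atomic_compare_exchange_weak(&last_ns,
|                                              &prev, next))
|                 return next;
|         }
|     }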
| pyrale wrote:
| One benefit of UUID is that you don't need coordination. In
| coordinated systems, unique IDs are a non-issue.
| OJFord wrote:
| Is there any other benefit? That's the raison d'être -
| _Universally_.
| eru wrote:
| Depends on what you compare them with.
| buster wrote:
| If you do that, just take some redis server and increment an
| integer every time you need a new number?
| marcus_holmes wrote:
| or y'know, use the thing that was designed specifically for
| this case: UUIDs
| hnlmorg wrote:
| UUIDs are designed to solve a slightly different, albeit
| related, problem: where you don't have synchronisation of
| the system as a whole. The problem the GP is solving is
| for systems that are synchronised and therefore generating
| a UUID is additional and unnecessary overhead.
| LAC-Tech wrote:
| "Logical Physical Clocks" may be of interest. The timestamps
| are monotonic and don't require atomic clocks like in Google
| Spanner.
|
| _HLC captures the causality relationship like logical clocks,
| and enables easy identification of consistent snapshots in
| distributed systems. Dually, HLC can be used in lieu of
| physical/NTP clocks since it maintains its logical clock to be
| always close to the NTP clock. Moreover HLC fits in to 64 bits
| NTP timestamp format, and is masking tolerant to NTP kinks and
| uncertainties._
|
| https://cse.buffalo.edu/~demirbas/publications/hlc.pdf
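|
| The update rules themselves are short. A rough C sketch
| (types and names are mine, not taken from the paper):
|
|     #include <stdint.h>
|
|     /* HLC state: l tracks physical time, c breaks ties
|        between events sharing the same l. */
|     typedef struct { uint64_t l, c; } hlc_t;
|
|     static uint64_t max2(uint64_t a, uint64_t b)
|     { return a > b ? a : b; }
|
|     /* Local or send event; pt is the current NTP time. */
|     void hlc_send(hlc_t *h, uint64_t pt) {
|         uint64_t old = h->l;
|         h->l = max2(old, pt);
|         h->c = (h->l == old) ? h->c + 1 : 0;
|     }
|
|     /* Receive event carrying the sender's (ml, mc). */
|     void hlc_recv(hlc_t *h, uint64_t ml, uint64_t mc,
|                   uint64_t pt) {
|         uint64_t old = h->l;
|         h->l = max2(max2(old, ml), pt);
|         if (h->l == old && h->l == ml)
|             h->c = max2(h->c, mc) + 1;
|         else if (h->l == old)
|             h->c = h->c + 1;
|         else if (h->l == ml)
|             h->c = mc + 1;
|         else
|             h->c = 0;
|     }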
| LeoNatan25 wrote:
| If collisions are seen, that means there would be lock
| contention at this critical section, thus not a "good enough"
| solution.
| getmeinrn wrote:
| I was going to post about "use a UUID", but I was surprised to
| learn that no UUID uses both timestamp + a random component. You
| can either get fully random with UUID4, or have a time + MAC
| based UUID with UUID1. Strange, I would have thought there would
| exist a UUID that uses time + random to minimize collisions like
| described in the post.
| wood_spirit wrote:
| the new UUIDv6, 7 and 8 work the way you describe.
| dtech wrote:
| Only v6 and v7, v8 is "whatever you want".
| mcv wrote:
| That's not much of a standard.
| nabogh wrote:
| Can't find much info about it but is this UUID v7?
| [deleted]
| zote wrote:
| someone else in the thread mentioned UUIDv7
| LAC-Tech wrote:
| I believe the proposed UUIDv7 standard uses this.
|
| https://datatracker.ietf.org/doc/html/draft-peabody-dispatch...
|
| It's a draft but there's a lot of implementations out there.
| OJFord wrote:
| ULID (and I think UUIDv7 draft) has millisecond timestamp + 80b
| randomness.
| sethammons wrote:
| Ksuid may be what you want. Pretty much time sortable uuid.
|
| Go implementation: https://github.com/segmentio/ksuid
| wood_spirit wrote:
| This is why you should use ids that combine both a time component
| and a sequence.
|
| Eg UUIDv7 has a milliseconds time component and then a field that
| increments for each event in the same millisecond, and then
| enough random bits to make collisions between ids generated on
| different machines astronomically unlikely.
|
| Of course there are only so many bits so you might generate too
| many events in the same time slice so the sequence overflows, and
| you might actually get collisions between machines, and you are
| limiting your event generation speed by forcing your cpu to sync
| on the increment etc.
|
| But in practice UUIDv7 works great at scale.
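|
| The bit layout is simple enough to produce by hand. A C
| sketch of the draft's field order (48-bit unix-ms timestamp,
| version and variant bits, the rest random; the draft also
| allows spending some of the random bits on a counter;
| getentropy() assumed):
|
|     #include <stdint.h>
|     #include <time.h>
|     #include <unistd.h>   /* getentropy() */
|
|     int uuid7(uint8_t out[16]) {
|         struct timespec ts;
|         clock_gettime(CLOCK_REALTIME, &ts);
|         uint64_t ms = (uint64_t)ts.tv_sec * 1000
|                     + (uint64_t)ts.tv_nsec / 1000000;
|
|         if (getentropy(out, 16) != 0)  /* 128 random bits */
|             return -1;
|
|         /* First 48 bits: big-endian millisecond timestamp. */
|         for (int i = 0; i < 6; i++)
|             out[i] = (uint8_t)(ms >> (40 - 8 * i));
|
|         out[6] = (out[6] & 0x0F) | 0x70;  /* version 7 */
|         out[8] = (out[8] & 0x3F) | 0x80;  /* variant 10 */
|         return 0;
|     }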
| hardwaresofton wrote:
| And they play nicely with sort ordering in popular databases
| like postgres!
|
| They're not in tree yet but there are a bunch of awesome pg
| extensions that provide access to uuidv7 (outside of just using
| it at the application level)
| junon wrote:
| Don't even need the "same millisecond" part to save a few
| cycles, use case depending. An overflowing increment with a
| counter of any sort plus a few random bits is usually enough.
|
| If you're clever enough, neither needs a branch.
| SV_BubbleTime wrote:
| UUIDv7 has at least two such monotonic strategies. You
| can use a counter, or you can use those bytes to increase the
| random entropy.
| qwerty456127 wrote:
| The problem with UUIDs is they are completely unreadable. Not
| only can you not understand them, it is prohibitively hard to
| even distinguish between them visually. This is why identifiers
| which would not include any bit of information (or noise)
| beyond the necessary minimum are useful in some cases.
| SV_BubbleTime wrote:
| Thus UUIDv7.
| hinkley wrote:
| I'm feeling a bit like an accidental time traveler, because I
| can recall a conversation at a tech meetup that had to have
| been at least ten years ago where someone was struggling with
| unique UUIDs because they were bursting above 1000 UUIDs per
| millisecond and not happy with the available options.
|
| How old is UUID7? I can't get the internet to tell me.
| yohannesk wrote:
| Were they not happy due to possible collisions or something
| else?
| hinkley wrote:
| Yeah they had a problem where the amortized ID rate was
| less than the number of valid UUIDs you could generate in a
| given time interval, but with a peak rate well above that,
| so UUIDs weren't quite going to function for
| naming/numbering them. You gotta be pushing a lot of
| messages through a system to hit that, but it's possible.
| And adding more layers of coordination on top of something
| meant to reduce coordination overhead tends to... make
| things messy.
|
| I've half-heartedly tried to look up his problem every time
| I've had to introduce UUIDs to something (most recently
| fixing agents not generating correlation IDs), and I have
| not figured out which version of UUID he was talking about.
| I now suspect it was some proprietary or industry-specific
| quote-unquote UUID spec rather than an industry wide one. I
| may look through the UUID7 spec at some point to see if
| they mention prior art. Much more plausible than him being
| a time traveler.
| tredre3 wrote:
| UUID7 was first drafted a bit over a year ago:
| https://datatracker.ietf.org/doc/html/draft-peabody-dispatch...
| hinkley wrote:
| That's kinda what I figured. So you can see my confusion.
| eru wrote:
| Why do you need the time component anyway? It's just eating up
| bits in your UUID without contributing much entropy.
| ericHosick wrote:
| UUIDs can serve different purposes. As others have mentioned,
| database performance on inserts might trump the need for
| difficult-to-guess UUIDs.
|
| In other cases, the UUID needs to be as random as possible.
|
| It really depends on the use case.
| SV_BubbleTime wrote:
| Yea. This is an easy one. We use both.
|
| For our "session" records, it's a UUIDv7. This sorts
| beautifully, and if I wanted to, I could look at a log and
| easily see all the entries in a particular session.
|
| For a larger db, we just need unique entries and at least
| in Dynamo, it is an advantage to equally distribute them as
| much as possible. UUIDv4 there.
| madsbuch wrote:
| You need it to make database indices perform better.
|
| If you don't need that, but just need a random UUID, UUIDv4
| is better.
| EGreg wrote:
| I don't know why people use relational databases other than
| they were first and "that's the way it's always been done".
|
| Why not use a graph database? O(1) lookups instead of O(N).
| Why need indices if you can just point to the data? Why use
| JOINs when map-reduce querying is far more flexible?
| madsbuch wrote:
| How do you do constant time lookups in graph databases?
|
| My intuition tells me that you cannot get below O(n log n)
| (the lower limit for comparison-based ordering).
| tharkun__ wrote:
| They were not first.
|
| ISAM and VSAM come to mind. Yes, it says "files" in there
| a lot, but it got used like a database, with programming
| interfaces for finding records. If you
| will, this method is more like NoSQL databases than a
| relational DB. The I in ISAM was a step towards not
| having to know the key (true "NoSQL"). Kind of like how
| today's NoSQL databases all also give the ability to add
| indexes now.
|
| https://en.m.wikipedia.org/wiki/ISAM
| dlisboa wrote:
| I've been interested in learning more about them and how
| to best utilize them in my company. What graph database
| and query language would you recommend (regardless of
| stack)?
| EGreg wrote:
| 1. Memgraph
| 2. Neo4j
|
| As for query language: definitely Cypher!
| insanitybit wrote:
| You aren't describing a property of a graph database.
| You're describing a property of some set of key value
| based systems.
|
| The reason why you want indices is because some query
| patterns don't have a key to perform a look up on.
| dtech wrote:
| I feel like v7 is almost a strictly better v4. Assuming you
| can generate a v7 (you have a time source), what are the
| disadvantages?
| andersa wrote:
| Anyone who can see the uuidv7 can determine at what time
| it was generated. You might not want that for things
| exposed to users.
| madsbuch wrote:
| Entropy. You lose bits to a known source, which reduces the
| entropy of the UUID.
| dtech wrote:
| I do feel 74 bits per second is enough though. That would
| require 2^(74/2) = 137 billion UUIDs generated in a
| single second for a 50% chance of a single collision.
| dralley wrote:
| 128 bits is a lot of entropy to go around, though.
| wolf550e wrote:
| You can hand out uuidv4 to clients without revealing
| anything.
| mcv wrote:
| But shouldn't that be a separate field then?
| mattashii wrote:
| The logistics of combining fields in indexes and
| identifiers is relatively complex, while the logistics of
| indexing a single field is comparatively trivial. This is
| also why you don't ship timestamps using separate fields
| for second/minute/hour/day/month/year, but a single ISO-
| string or UNIX timestamp as representation: it makes
| querying and interpreting the value more consistent.
| btilly wrote:
| Disagree on "relatively complex".
| create index on $table ($field1, $field2);
|
| Seems pretty simple to me.
| frankreyes wrote:
| Create a new column with an MD5 hash of the other
| columns. Easy.
|
| /s
| rvnx wrote:
| This is how it works, except it's not MD5 but a less-
| costly hash function
| madsbuch wrote:
| I am curious about this, and might be misunderstanding
| what you mean.
|
| Can you lay out a demo architecture where you use multiple
| keys like you propose?
| junon wrote:
| There are many DBMSes that combine columns into a
| singular primary key. Performance tradeoffs vary,
| especially when it comes to indexing.
| e12e wrote:
| Any pure relational database design will eschew surrogate
| keys - most real-world systems will (should) add them
| back - because a surprising number of good natural keys
| end up changing (names change, phone numbers change,
| emails change, twitter handles become
| irrelevant/disappear, location of birth may change
| subject to geographic regions changing size...).
|
| And on top of all that, there are efficiency concerns.
|
| That said, at least for join tables AFAIK - there isn't
| often a need for row IDs beyond the involved foreign keys
| - unless you need to add metadata from other
| relations/tables...
| TRiG_Ireland wrote:
| > location of birth may change
|
| My British passport says I was born in Birr; my Irish
| passport says I was born in Galway. These are both
| correct, because they are answering different questions.
| (I was born in the town of Ballinasloe in County Galway,
| but my mother's place of residence at the time of my
| birth was the town of Birr in County Offaly.)
| pc86 wrote:
| So British place of birth means "mother's residence" and
| not place of birth?
| rvnx wrote:
| Official guidance here:
| https://assets.publishing.service.gov.uk/government/uploads/...
|
| Could maybe be just a mistake when doing the UK
| registration? It's an easy fix if wanted.
| dtech wrote:
| But then you have 2 identifiers which complicates
| everything.
| qwerty456127 wrote:
| > Why do you need the time component anyway?
|
| To sort or filter the records by time. Sure, you can just add
| an extra column if you need this, but there are cases when
| this is not convenient to do. E.g. when you want to be able
| to export the records into files, naming the files with the
| IDs and still be able to sort them.
| willis936 wrote:
| One benefit of keeping it separate is that you can choose
| the necessary timestamp precision. "Millisecond
| precision" is arbitrary and commonly insufficient.
| preseinger wrote:
| it's commonly totally sufficient for IDs to represent a
| rough sort order
|
| millisecond precision is great for a lot of use cases
| willis936 wrote:
| I didn't say that it wasn't. Hell even ms is too precise
| for many use cases (usually where date is used instead).
|
| What I said was that it's useful to be able to select
| timestamp precision independently of UUID implementation.
| One size that fits all fits none best.
| dtech wrote:
| Lucky for you, they also define UUIDv8 as a free-for-all
| where you can do whatever you want, and nanosecond
| precision is one of the examples given in the RFC.
| danbruc wrote:
| More importantly if you have an index on purely random IDs,
| then each insert will go to some random position in the
| tree, whereas having IDs that increase with time will make
| all new IDs end up at the end of the tree which reduces
| index fragmentation.
| lazide wrote:
| depending on scale and architecture, either behavior can
| be better. it's easier to shard when writes occur
| randomly over the overall space. it's easier to coalesce
| when writes all happen in a given place (head or tail)
| remram wrote:
| If that is the case then why shouldn't the storage system
| hash the IDs itself, to spread them as it requires?
| lazide wrote:
| Because sometimes you want some data to be collocated,
| while the rest sharded.
|
| For instance, you might use a random object ID as a
| prefix value in the index, followed by attribute ID which
| isn't. Or a modified time, so you can have a history of
| values which can be read out linearly.
|
| If using it directly, that means objects and their data
| are sharded randomly across the space, but when looking for
| an object's attributes (or attribute by time), their index
| entries are always co-located and you can read them out
| linearly with good performance.
|
| If blindly hashing keys to distribute them, you can't do
| that. Also, you can't really do a linear read at all,
| since no data will be 'associatable' with others, as the
| index value is randomized, and what is stored in the
| index has no relation to the key provided by the user.
|
| You can only do a straight get, not a read. That is very
| limiting, and expensive with large data sets as most
| algorithms benefit greatly from having ordered data.
| (Well, you could do a read, but you'd get back entries in
| completely random order)
|
| Needless to say, this is 'advanced' usage and requires
| pretty deep understanding of your data and
| indexing/write/read patterns, which is why random hashing
| is the most common hash map behavior.
| sgarland wrote:
| Page splits... page splits everywhere.
| EGreg wrote:
| Or, you could use a graph database and stop having
| frustrating relational impedance mismatch, nonlocality
| etc. You can have O(1) lookups instead of O(log N) for
| almost everything
| coding123 wrote:
| Graph databases don't solve that. All databases,
| document, graph, rel - ALL implement indexes to find
| specific things in exactly the same way: very well-
| known tree, hash and other techniques.
|
| The representation (outside of indexing) has properties
| that make your USE CASE better or worse. EGreg would not
| be someone to hire to architect a solution. He'll just
| put your 1-trillion-rows-per-month use-case in a graph DB
| like Neo4J and you'll just watch it fall over when you
| run billing.
| victor106 wrote:
| That sounds too good to be true. Is that really true of
| all graph DBs?
|
| Also, if that's really true why can't everyone just use
| graph DBs?
| EGreg wrote:
| Because of vendor lock-in with the LAMP stack over the
| years. Every host used MySQL; how many had Neo4j
| available?
| e12e wrote:
| How big is the 1 though?
| EGreg wrote:
| Look, if you have N items related to X, at insert time,
| you store them in an array and have X point to that
| array, instead of foreign keys.
|
| For example, when a user has 7 articles. Do you want to
| just point to where the articles are stored? Or do you
| want to do O(log n) lookup for each article?
|
| And if you have many-to-many, do you want to join an
| Intermediate Table for even more processing, or just
| follow a pointer to a range of an intermediate node
| directly and traverse?
| ndriscoll wrote:
| How is that different from a clustered index?
| EGreg wrote:
| A clustered index requires O(log N) lookups, since it's
| still an index.
|
| I'm talking about pointing directly to the location of an
| array where things are stored. That array isn't a
| clustered index. Each array is different.
| ndriscoll wrote:
| What about when you delete rows? Do you just leave the
| space unused now? Or if you update a row to be larger?
| Rewrite the whole array (so possibly O(n) updates)?
|
| How do you deal with data that gets accessed in different
| orders based on relationships from multiple other
| entities? Duplicate the data? If so, updates now get
| amplified and you can fit less data in RAM so you're more
| likely to require disk IO. If not, you need a layer of
| indirection (so you have an array of pointers instead of
| an array of data).
|
| Even with a layer of indirection, updates that grow a row
| and require a reallocation will cause you to have to go
| update all pointers (also, those pointers need to be
| indexed to find who you need to update). To avoid write
| amplification, you can use an array of ids instead of an
| array of pointers. Now you want an id <-> pointer lookup,
| which could be done with a hash map in O(1), or with a
| tree in O(logN) if you want to also allow efficient range
| queries.
|
| I'm not exactly sure on the implementation details, but
| for ballpark numbers, for an 8 kB page size with 80% fill
| factor and 16 B entries (8 B key + 8 B pointer), with
| 10E9 rows, log(N) ought to be ~4 (page IOs). Not ideal,
| but the point is for a lot of real-world engineering,
| O(logN) is effectively O(1) (with a small enough constant
| factor that it's fine for general use).
|
| So if your data is not write-once, trees are a well-
| rounded default data structure.
| hinkley wrote:
| When you're talking about data sets so large they dictate
| what hardware you use, and introduce terms like
| "cluster", then 1 = [?]n
|
| Which is why we need a version 2 of complexity theory,
| that doesn't treat memory access or arithmetic on
| arbitrary precision numbers (aka as n actually goes to
| infinity) as O(1) operations. They aren't. Which every
| large system engineer knows but few will talk about.
| hansvm wrote:
| Why sqrt(n) and not log(n)?
|
| And that complexity theory already exists. Typical
| whiteboard engineering uses transdichotomous models to
| gloss over some polylogarithmic factors (as do much of
| the literature), but more accurate models exist.
|
| The difference isn't usually super relevant when
| comparing multiple solutions all using the same model of
| computation though since the extra terms don't tend to
| bump one complexity class above another when switching
| models, and if you cared about actual runtime
| characteristics you wouldn't be relying on log factors in
| such a crude tool anyway.
| hinkley wrote:
| Speed of light.
|
| Imagine a data center containing exabytes of data. How
| long does it take to access an arbitrary bit of that
| data?
|
| We use clusters because computers cannot contain an
| infinite amount of memory, storage, or CPUs, because of
| physics. You see this same thing play out at smaller
| scales but it's more obvious at the macro scale. More
| addresses take logn time to sort out, but time to access
| is measured in radii, not gate depth.
|
| In a world where clusters are rare, Knuth made decent
| approximations. In a world where clusters are not only de
| rigueur but hosted on multitenant hardware spread out over
| upwards of 100 miles, those approximations are bullshit
| and need to change.
| hansvm wrote:
| Oooh interesting, thank you. I totally misunderstood this
| as the "integers never need more than 64 bits, so hash
| tables are constant" argument.
| hinkley wrote:
| Integer arithmetic is really quantized logarithmic
| complexity. If your hardware has a bucket your number
| fits into, you're calculating n+1 or n×n in constant
| time. But if your data set doubles in size (especially
| for multiplication) you may find yourself in a bucket
| that doesn't exist or a more expensive one. Contemporary
| code is more likely to reach for bignum which is logn,
| but again stairstepped to each number of integers it
| takes to represent the entire number. A bigger problem
| when your data set is sparse, so that values grow faster
| than population (eg, UUID).
|
| But on the topic of hash tables, you can only reach
| 'O(1)' if you can avoid collisions. And to avoid
| collisions you need a key of length m, which grows as n
| grows. You cannot put 51 pigeons in 50 pigeonholes. So
| the key length of your hash keys is m >= logn, which
| means the time to compare keys is also logn, which means
| hash tables are never actually O(1) access or insert
| time. Actual access time for a hash table on non-
| imaginary hardware is √n·logn, not O(1).
|
| If you consider that we have applications that are just a
| single hash table occupying the entire RAM of a single
| machine, then this is not picking nits. It's capacity
| planning.
| danbruc wrote:
| That will depend on which graph database you use as a
| graph database might just store the graph in an
| underlying relational database. And it will also depend
| on what kind of data you have and what kind of queries
| you want to perform. For a graph database it might be
| faster to navigate along links in the graph but I would
| guess you will have to pay a big performance penalty if
| you have to operate orthogonally to your links, like
| aggregate across all instances of some entity.
| [deleted]
| dtech wrote:
| When you sort them they will be ordered by generation time.
| This is useful in UIs, and greatly improves performance in
| databases (how much depends on DB and use case).
| glodalitazic wrote:
| IIRC you don't want to spend entropy that you don't need. The
| advantage of using the time is that you need to spend fewer
| bits on expensive randomness, and thus generation is cheaper.
| SV_BubbleTime wrote:
| I only work in embedded, so I'm not sure about fancy
| computers - but when we make UUIDv7s, I am certain the
| random is faster than pulling the time, formatting for
| milliseconds, then storing in position.
| KMag wrote:
| Otherwise, failure recovery requires robust storage. Upon
| startup, you just wait until your timestamp ticks over, and
| then you know you're not re-issuing any UUIDs.
|
| With a pure counter-based system, you need robust distributed
| storage and you need the machine to reserve batches of IDs
| via committing writes to the robust distributed storage.
| Otherwise, a disk failure may cause you re-issue UUIDs.
|
| Though, seeding a thread-local AES-256-ctr or ChaCha20
| instance via getrandom()/getentropy() is probably a better
| fit for most situations. If you need sequential IDs for
| database performance, seed a simple 128-bit counter using
| getrandom()/getentropy(). If you're going to need more than
| 2^40 unique IDs, then use IDs larger than 128 bits.
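|
| A sketch of the simple-counter variant in C (assuming
| getentropy() for the seed; the 128-bit increment is just a
| two-word add):
|
|     #include <stdint.h>
|     #include <unistd.h>   /* getentropy() */
|
|     /* Per-thread 128-bit counter, randomly seeded once. */
|     static _Thread_local uint64_t id_hi, id_lo;
|     static _Thread_local int seeded;
|
|     void next_id(uint64_t *hi, uint64_t *lo) {
|         if (!seeded) {
|             uint64_t seed[2];
|             getentropy(seed, sizeof seed);  /* assume success */
|             id_hi = seed[0];
|             id_lo = seed[1];
|             seeded = 1;
|         }
|         if (++id_lo == 0)  /* carry into the high word */
|             id_hi++;
|         *hi = id_hi;
|         *lo = id_lo;
|     }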
|
| Assuming getrandom()/getentropy() is getting you a high-
| entropy seed, it's easy to size the IDs to bring the
| probability of collision much lower than the probability of
| background radiation flipping a bit and breaking your
| equality test between IDs. If you're not running redundant
| calculations and/or on radiation-hardened hardware, it's
| difficult to argue against randomized IDs.
| c0balt wrote:
| Because it allows the encoding of information in the id. This
| makes it, at least in my experience, somewhat sortable.
| darkclouds wrote:
| > This is why you should use ids that combine both a time
| component and a sequence.
|
| Computers should run like clockwork. So in this example of
| using all the cores: in Windows and likely some other OSes,
| threads are assigned to cores when they are started and you can
| have many threads per core, ergo the time component should also
| have the thread and the core number combined with it on multi-
| core systems.
|
| It's possible to write the code in such a way that some
| variables are bounced out of the CPU cache back to memory to
| avoid any caching issues, because I think the caches on some
| CPUs are per core, and some are per CPU.
| Joker_vD wrote:
| > threads are assigned to cores when they are started
|
| Do they? I thought the normal behaviour was for cores to pick
| any available thread to run, so core migration is quite
| normal.
|
| > ergo the time component should also have the thread and the
| core number combined with it with multi core systems.
|
| Sorry, how exactly does it follow from the previous? You seem
| to have omitted the other half of your syllogism. After all,
| clockworks do not have thread nor core numbers so I don't
| quite see how having those in UUIDs will make computers run
| like clockwork.
| Mikhail_Edoshin wrote:
| Isn't it simpler to use sequence keys then?
| Cthulhu_ wrote:
| Yes, but only on single machines; UUID and co are intended
| for distributed systems.
|
| Although now I wonder if / how UUID v7 can do sequential keys
| on distributed systems. Mind you, on those systems "close
| enough" will probably be good enough, and sorting will be
| done by date instead of incremental ID.
| moffkalast wrote:
| You could also keep a global incremental index for that,
| assuming there's some authoritative server that has to be
| polled sequentially to get them.
|
| Probably too much overhead when conflicting UUIDs are a few
| orders of magnitude less likely than the clients crashing
| from some random bug though.
| EGreg wrote:
| So just prefix the node ID which is assigned when the node
| joins a swarm
| Timon3 wrote:
| How is that easier or better? Now you're letting your
| infrastructure details bleed into your database.
| EGreg wrote:
| They _should_ bleed into your database. The database is
| part of your infrastructure and it contains details like
| where to find the replicas and so on
| Timon3 wrote:
| Why? Tight coupling doesn't make things easier.
| EGreg wrote:
| You're complaining that the infrastructure details are
| bleeding into the infrastructure implementation?
| Timon3 wrote:
| Infrastructure details are bleeding into non-
| infrastructure implementation. Your database content is
| _not_ an infrastructure implementation.
| EGreg wrote:
| the content is not, but it can use a database function,
| such as CURRENT_TIMESTAMP or RANDOM(), can it not?
| michaelt wrote:
| Ah but you see, UUIDs are web scale.
|
| And by web scale, I mean they're too big to exchange by any
| offline channel.
| SV_BubbleTime wrote:
| FWIW... we're using UUIDv7s over BLE.
| _flux wrote:
| Should you, though? BLE itself uses UUIDs, but the ones
| it uses are in a shortened format just because the BLE
| frames are small and the protocol is not too fast.
|
| Granted I've even used JSON over GATT so maybe I should
| keep my mouth shut :).
| SV_BubbleTime wrote:
| It's for an auth use. And at least we're using CBOR. But
| I get your point.
| Tade0 wrote:
| It's 128 bits vs 64 bits - not really that much of a
| difference.
| efitz wrote:
| Related
|
| I used to be the program manager owner of the security event log
| in Windows.
|
| When things happen simultaneously or very closely in time on
| multi-core systems, thread scheduling can significantly affect
| observations of those things. For example, your thread quantum
| might expire before you get to the syscall to get a time stamp,
| or before you can pass a buffer to queue your event for
| subsequent time stamping.
|
| In fact, on multiprocessing systems, it was very common to see
| out of order event log entries on Windows back in the day
| (2000-oughts). You also could not count on the log timestamp
| accuracy too precisely; 1s was pretty much the safe lower bound
| (some components truncated or rounded time stamps IIRC).
| maxekman wrote:
| There is some interesting content regarding this in the
| Timestamps and Sequencing section in this post about the Go
| execution tracer overhaul:
| https://go.googlesource.com/proposal/+/ac09a140c3d26f8bb62cb...
| forrestthewoods wrote:
| > On my system, the minimum increment was 32 ns.
|
| Then you aren't using a nanosecond clock.
|
| If you want actual nanosecond precision then you probably want
| rdtsc.
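|
| For example, with GCC/Clang on x86 (a sketch; the result is
| in TSC ticks, not wall-clock nanoseconds):
|
|     #include <stdint.h>
|     #include <stdio.h>
|     #include <x86intrin.h>   /* __rdtscp() */
|
|     int main(void) {
|         unsigned aux;
|         uint64_t t0 = __rdtscp(&aux);
|         uint64_t t1 = __rdtscp(&aux);
|         /* Divide by the TSC frequency to get seconds. */
|         printf("back-to-back delta: %llu ticks\n",
|                (unsigned long long)(t1 - t0));
|         return 0;
|     }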
| non-e-moose wrote:
| Silly Rabbit - absolutely accurate times are a security problem.
| CPU designers (even way back in DEC's Alpha) intentionally
| introduced clock jitter, just to prevent total predictability.
| For x86, I think if you performed 3-4 timestamp reads and then
| saved the values into registers, and then upon completion
| reported those values, you would find that the time deltas are
| NOT exactly the same.
| grose wrote:
| This is why it scares me a bit to use a raw timestamp as a sort
| key in DynamoDB. I append a (random) unique ID to the timestamp
| text to avoid it. Better safe than sorry, I figure.
| moduspol wrote:
| We ran into this, though we were using millisecond timestamps
| with the Number type.
|
| Ended up working around it by adding numbers after the decimal
| point. DynamoDB apparently supports up to 38 digits of
| precision, and JavaScript (we're NodeJS based) by default
| encodes numbers to allow for up to 16 digits (combined, before
| and after the decimal point). A UNIX epoch millisecond
| timestamp is 13 digits, so we were able to use 3 just to add
| uniqueness without having to update anything else in our
| codebase.
|
| We certainly now plan to essentially always use string-based
| (and generically named) partition and sort keys, which would
| allow us to more easily do what you're describing.
| fer wrote:
| I have a use case that uses a similar solution, but for a
| different reason: it doesn't scare me to use a timestamp as sort
| key since I know my hash keys are unique and I only write
| every few seconds. But I still add a random amount of ms (well
| under the frequency of writes/updates), because otherwise I'll
| be hitting hot partition issues on the GSI (indexed by
| timestamp, as you can guess).
| rdtsc wrote:
| Erlang/Elixir (BEAM VM) makes this very clear - it's a
| distinction between monotonic vs strictly monotonic.
|
| https://www.erlang.org/doc/apps/erts/time_correction.html#mo...
|
| > In a monotonically increasing sequence of values, all values
| that have a predecessor are either larger than or equal to its
| predecessor.
|
| These are available via the
| https://www.erlang.org/doc/man/erlang.html#monotonic_time-0
| function.
|
| https://www.erlang.org/doc/apps/erts/time_correction.html#st...
|
| > In a strictly monotonically increasing sequence of values, all
| values that have a predecessor are larger than its predecessor.
|
| Strictly monotonic values imply some synchronization /
| coordination, with an associated performance impact when there
| are many concurrent processes. This functionality is available
| via the
| https://www.erlang.org/doc/man/erlang.html#unique_integer-1
| function. With a warning associated with it:
|
| > Strictly monotonically increasing values are inherently quite
| expensive to generate and scales poorly. This is because the
| values need to be synchronized between CPU cores. That is, do not
| pass the monotonic modifier unless you really need strictly
| monotonically increasing values.
| derefr wrote:
| Relatedly, Erlang's own refs aren't created by a strictly-
| monotonic global generator; but rather are internally a pair of
| a regular monotonic identifier, and the PID of the requesting
| process. In other words, they're akin to UUIDv1s, or to
| Snowflake IDs (https://en.wikipedia.org/wiki/Snowflake_ID).
|
| You only really _need_ strictly monotonic global identifiers if
| you need immediately-consistent first/last-write-wins. If you
| can instead rely on eventually-consistent first/last-write-wins
| (i.e. if your write events enter an event store/queue that
| linearizes them by ID, and then all "simultaneous" writes but
| the one with highest ID priority can be dropped/ignored, either
| during processing, or on read), then I'd recommend first
| considering packed (nodeID, seq) pairs. And, if you want global
| event orderability, considering the snowflake ID formulation
| (timestampMajor, nodeID, timestampMinor, seq) specifically.
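|
| Packing such an ID into 64 bits is a few shifts and masks. A
| sketch using the classic Twitter Snowflake field widths
| (41-bit ms timestamp, 10-bit node ID, 12-bit sequence), not
| Erlang's actual ref layout:
|
|     #include <stdint.h>
|
|     uint64_t snowflake(uint64_t ms_since_epoch,
|                        uint32_t node_id, uint32_t seq) {
|         return ((ms_since_epoch & 0x1FFFFFFFFFFULL) << 22)
|              | ((uint64_t)(node_id & 0x3FF) << 12)
|              |  (uint64_t)(seq & 0xFFF);
|     }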
| datenwolf wrote:
| In other news: Water is wet.
|
| The high resolution precision time counters are derived from the
| system base clock, usually operating at ~33MHz, which translates
| exactly into that 30ns granularity observed.
|
| If you really want robust time-derived timestamp identifiers,
| truncate the high resolution timer to at best 10us resolution,
| and replace the low bits with a hash of them concatenated with
| the value of the CPU cycle counter (`RDTSC` on x86).
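|
| A sketch of that recipe in C (x86 only; a splitmix64-style
| mixer stands in for a real hash like xxHash):
|
|     #include <stdint.h>
|     #include <time.h>
|     #include <x86intrin.h>   /* __rdtsc() */
|
|     static uint64_t mix64(uint64_t x) {  /* stand-in hash */
|         x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
|         x ^= x >> 27; x *= 0x94d049bb133111ebULL;
|         return x ^ (x >> 31);
|     }
|
|     /* High bits: clock truncated to 10us. Low bits: hash
|        of the truncated-away bits and the cycle counter. */
|     uint64_t ts_id(void) {
|         struct timespec ts;
|         clock_gettime(CLOCK_REALTIME, &ts);
|         uint64_t ns = (uint64_t)ts.tv_sec * 1000000000u
|                     + (uint64_t)ts.tv_nsec;
|         uint64_t coarse = ns / 10000;  /* 10us resolution */
|         uint64_t low    = ns % 10000;  /* discarded bits */
|         uint64_t h = mix64(low ^ mix64(__rdtsc()));
|         return (coarse << 14) | (h & 0x3FFF);
|     }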
| hinkley wrote:
| The HRT patch set for Linux (which is eight years old) claims
| that Linux had a 10ms clock resolution before the patch, and
| conversations on SO suggest the resolution is now 1ns, so I
| believe this information is dated, or at least OS dependent.
| usefulcat wrote:
| I thought clock_gettime() usually does use rdtsc(p) on Linux?
| Possibly depending on the particular clock type (montonic,
| realtime, etc). Either way I'd be interested in knowing more.
| datenwolf wrote:
| RDTSC is directly influenced by frequency scaling. So while
| monotonic, its clock interval is neither constant, nor
| deterministic on modern systems.
|
| Here's a small online visualization of RDTSC average and
| standard deviation I just hacked together:
| https://gist.github.com/datenwolf/151486f6d73c9b25ac701bdbde...
|
| On a system with frequency scaling you can see that under
| higher load the difference between RDTSC in subsequent
| iterations of a tight loop that does nothing else than
| reading that register will drop. Here's how it looks on the
| system I'm currently using:
| https://www.youtube.com/watch?v=FKKjSJ1JZ78
| vient wrote:
| > RDTSC is directly influenced by frequency scaling
|
| Unfortunate wording, RDTSC itself is not influenced by
| frequency scaling, it has constant frequency on modern CPUs
| after all. Your video nicely shows that RDTSC delta is
| influenced by CPU frequency, as expected, but how does it
| affect using RDTSC as a clock? On my CPU RDTSC seems to
| tick at 3GHz, for example. I wonder how precise it is
| though, how much its frequency can drift from spec.
| datenwolf wrote:
| I did update my program, now it measures the ratio.
| datenwolf wrote:
| Ah crud, you're right. What'd really be interesting to
| see is the ratio between d/dt rdtsc and d/dt
| clock_monotonic.
| amelius wrote:
| > replace the low bits with the hash of them
|
| Doesn't this hash-step only increase the probability of
| collisions?
|
| What is this step intended to solve?
| jjk166 wrote:
| > replace the low bits with the hash of them, concatenated
| with the value of the CPU cycle counter (`RDTSC` on x86)
|
| you're concatenating two values and then taking the hash of
| the combination, ie:
|
| hash(low bits + CPU cycle counter)
| dheera wrote:
| Why? "low bits + CPU cycle counter" is enough. No need for
| the hash().
| datenwolf wrote:
| You don't know precisely at which frequency the cycle
| counter runs. Depending on the system load it might
| either run faster or slower than the lowest bits of the
| HPTC. For what it's worth this part is more or less
| nondeterministic, so the sane thing to do, is spread out
| the information as much as possible (maximize entropy),
| in order to minimize the probability of collisions.
| dheera wrote:
| That's ok, the bits past the low bits are just there to
| avoid collisions, not an actual measure of high precision
| time beyond the low bits.
|
| It's not worse than the hash solution, I'm just saying
| it's not necessary to hash it if the only objective is to
| reduce collisions.
|
| In fact the hashing solution, if it is replacing the low
| bits with a hash of low bits plus something else, is
| actually destroying valuable time information.
| datenwolf wrote:
| That only works if you know exactly that the low bits
| are constant. Otherwise you may run into the issue that
| due to unsteady rate of RDTSC between two
| processes/threads that may be preemptively unscheduled
| between reading the HPTC and the RDTSC you might again
| end up with colliding time stamps. And if you took the
| derivative between timestamps taken in succession, you
| might even find it to be non-monotonic in places.
|
| The combination of multiple counters incremented by
| individual unsteady clocks used to be a source for pseudo
| random scrambler sequences; these days we prefer LFSRs,
| but overall this is something that can be weird.
|
| Hence my recommendation: just throw xxHash32 on the
| concatenation of the HPTC's low bits and the CPU clock
| cycle counter, and forgo any pretense of monotonicity in
| the low bits (because very likely you don't have it anyway).
| amelius wrote:
| But a hash() destroys information.
|
| When you are trying to reduce collisions, why would you use
| a function that is known to introduce collisions?
| datenwolf wrote:
| Because the raw value from the timestamp will be very low
| entropy and have the short scale variation concentrated
| in just a few bits. A hash not just destroys information,
| it also creates entropy by mixing that information over
| all the bits that come out of it. And using 64 bit hash
| replacing a 64 bit nanosecond counter that has short term
| entropy of less than 16 bits, you're in fact reducing the
| likelihood of a collision by a factor of 2^48.
| marcosdumay wrote:
| > it also creates entropy
|
| It's a nitpick, but it concentrates the entropy. It
| doesn't create any.
|
| I do think it will make the answer more clear, as the
| hash concentrates the less than 64 bits of entropy on
| that 128 bits of original data into a usable 64 bits
| package.
| datenwolf wrote:
| Actually hashes do create entropy (every computation
| creates entropy in some form or another). What's the
| entropy of a 4 bit number? What's the entropy of a 4 bit
| number hashed by a 64 bit hash function? The act of
| computation does in fact create entropy, as per the 2nd
| law of thermodynamics, a part of which shows up in the
| hash.
| amelius wrote:
| > every computation creates entropy in some form or
| another
|
| Ok, what is the entropy created by this function that
| maps a 4-bit number to a 64 bit number:
| 0 -> 0 1 -> 1 2 -> 1 3 -> 1
| 4 -> 1 ... 15 -> 1
| datenwolf wrote:
| 60 bits. Yes, I know, you can compress it down very well.
| But consider that entropy in computation involves not
| just the bits you store, but also the bits that the
| processor touches and eventually dissipates as heat into
| the universe.
| amelius wrote:
| What definition of entropy do you use?
|
| (I'm using Shannon entropy.)
| pxx wrote:
| I don't think you understand what this conversation is
| about. We are considering information theoretic entropy,
| not thermodynamic entropy from the mechanism of
| computation itself.
|
| The result of applying a deterministic function on a
| random variable cannot have more entropy than the
| underlying random variable. This is a theorem, one that
| is trivial enough to not have a name. But you can find
| solution sets to homework that will prove it for you:
| https://my.ece.utah.edu/~rchen/courses/homework1_sol_rr.pdf
| zamadatix wrote:
| The hash, in this case, is just one deterministic way to
| shorten the CPU counter to a few bits which can then be
| used to increase the entropy of the timestamp by
| replacing the timestamp's stale bits. What's being asked
| here is not whether using some compressed bits of the CPU
| counter increases the entropy of the timestamp overall.
| Rather, why you'd use a hash of the entropy providing
| information to do this since hashes allow for many:1
| mappings (i.e. allow for decreases in entropy) while
| never providing better than 1:1 mappings (i.e. preserving
| entropy). At best, the hash is preserving the entropy of
| the timestamp and counter bits and, at worst, it is
| throwing some away. The question that follows: is there a
| better way that preserves the entropy of the inputs
| without the risk of throwing some away and, if so, was
| there some other reason to still use the hash? This is
| what I took amelius to be asking.
|
| Knowing the timestamp delta is ~30ns then even a 1 THz
| processor would only execute 30,000 cycles before said
| timestamp increases and the outcome stays unique anyways.
| From that perspective, taking the low 16 bits of the
| cycle counter is already guaranteed to help produce
| perfectly unique timestamps, and without the risk of
| introducing hash collisions. It's also significantly
| easier to compute and now monotonic*, whereas hashes
| were neither. From that perspective, I too don't see what
| value the hash is supposed to add to the information
| being fed to it in this case.
|
| In threaded/distributed/parallel scenarios you may wish
| to throw the lane/core/node number in the bits as well.
| In the case having the same timestamp is supposed to be
| invalid this leaves you a simple deterministic way to
| tiebreak the situation as part of the creation of the
| timestamp itself.
|
| *A 10 GHz CPU running for a little over 58 years could
| risk a 64 bit counter rollover in the middle of a time
| delta. If that's too close for comfort or you want the
| code to work on e.g. 32 bit counter systems, you'd need
| to eat more cycles and registers to set another bit to
| the outcome of whether current cycle count is < the one
| at the start of the current timestamp delta.
| sfink wrote:
| Yeah, when I was thinking about the hash, I thought of it
| as stuffing to fill the unused portion of the number that
| would look better than using zeroes.
|
| TIMESTAMP010111101010101TSC "looks better" than
| TIMESTAMP000000000000000TSC even though they contain the
| same information.
|
| I would drop the hash, it's deceptive.
|
| I don't believe it breaks the monotonicity, though? I
| mean, it would if it weren't already broken. If you're
| taking the low 16 bits of the TSC, then a rollover in
| those 16 bits during the same timestamp will already go
| backwards. TIMESTAMP0...0 follows TIMESTAMPf...f.
| byteknight wrote:
| Wouldn't those operations reduce the accuracy of the time
| stamp?
| crote wrote:
| Modern CPUs don't really give you accurate nanosecond-scale
| time stamps anyways. The CPU will randomly speed up or slow
| down, execute instructions out of order, and even
| speculatively execute instructions. Not to mention that it'll
| have a dozen different clocks - which are not guaranteed to
| be in sync.
| blibble wrote:
| > The CPU will randomly speed up or slow down
|
| constant_tsc has been a thing for more than a decade
|
| > execute instructions out of order, and even speculatively
| execute
|
| rdtscp serialises
|
| > Not to mention that it'll have a dozen different clocks -
| which are not guaranteed to be in sync.
|
| synchronised_tsc has been a thing for about 6 years now
| lazide wrote:
| None of these are performant, no?
|
| Generally, you can have consistency, speed, or low cost -
| but not more than two at the same time.
| kprotty wrote:
| Invariant (Constant) TSC is detectable via `cpuid` and
| applies to `rdtsc/rdtscp` by default. In that aspect,
| there's no tradeoff being made there (observable to
| software) AFAICK.
| lazide wrote:
| interesting results! i guess it depends on if you
| consider 35ish cycles expensive or not.
|
| [https://community.intel.com/t5/Software-Tuning-Performance/H...]
| byteknight wrote:
| This may be coming from a place of ignorance, but if that
| were the case, then time would drift significantly and
| constantly due to Intel SpeedStep, for example. And if that
| were the source of truth for time, then when your computer
| is off, it wouldn't be able to keep time. I'm pretty sure
| they all have real-time clock chips on the motherboards.
| nomel wrote:
| There's time keeping (which uses the real time clock) and
| high resolution clocks, which are good for relative
| timers. They're never the same components.
| datenwolf wrote:
| What is this "accuracy" you're talking about? Between the
| scheduler hanging over a process's head, ready to yank away
| its compute time for milliseconds between the syscall to read
| out the HPTC or the CPU cycle counter, and the process
| actually writing the timestamp into a variable/in-memory
| struct, reading the HPTC from the underlying hardware also
| not being super accurate, and the CPU cycle counter being
| influenced by frequency scaling, on a bog-standard machine the
| highest precision you can reasonably expect is on the order
| of 100us or so.
| imglorp wrote:
| Integrating a hash might improve (not guarantee) the uniqueness
| situation but not the monotonicity situation. Right?
| datenwolf wrote:
| Well, you can always attempt to catch the CPU cycle counter
| overflow (happens at roughly 10Hz on current machines),
| adding up the carries and add it to a nanosecond counter
| shifted up by a few bits. Problem with the CPU cycle counter
| is that it's not in lockstep with the HPTC, due to dynamic
| frequency scaling.
|
| If you really, really, really need system wide, nanosecond
| precision timestamps, you'll have to resort to dedicated
| hardware incrementing a counter at the desired rate, and with
| a well known latency for each read access. On the hardware
| and driver level you'd have some MMIO port mapped into user
| space with an identity transform between bus address and
| process address space.
|
| However this still doesn't solve the problem of the
| scheduler being able to throw in several milliseconds between
| reading the high precision timer value into a register and
| writing it back into the memory holding the timestamp
| variable. Seriously, on your typical computer system software
| derived timestamps at more than 100us resolution are kind of
| bogus.
| jsolson wrote:
| Some of this is not current.
|
| Constant-rate TSCs are ~15-20 years old. Synchronized TSCs
| are at least ten.
|
| Also RDTSC produces a 64-bit result, so it does not
| overflow at a rate of several Hz.
| datenwolf wrote:
| It's 64 bit on 64-bit systems. In the world of hard realtime
| applications there's still a huge armada of systems out
| there in the field running on 32 bit (think motion
| control, CNC machines, industrial robotics). If you're
| developing software that's concerned with this kind of
| precision, you might find yourself confronted with such
| "outdated" systems more often, than not.
|
| Also see https://news.ycombinator.com/item?id=36814762
| BiteCode_dev wrote:
| Or UUID7
| dan-robertson wrote:
| I don't think this is right. I am pretty doubtful of the
| discussion of the granularity of the results of clock_gettime,
| but I failed to find any sources and I don't have a suitable
| computer to hand to experiment.
|
| But here are two more doubts:
|
| 1. On at least some systems, clock_gettime will be implemented
| by a vdso call that (on x86) uses rdtsc to give you the time,
| so you should expect its results to be pretty highly correlated
| with an rdtsc immediately afterwards, so you aren't really
| adding useful entropy (whereas you are losing time-locality of
| your identifiers which may be desirable if they are to end up
| as primary keys somewhere)
|
| 2. On some systems (eg some AMD processors), rdtsc gave me
| pretty low precision results (eg rounding to the nearest 10
| cycles) so you also won't necessarily get the entropy you would
| expect from the description of the instruction. I failed to
| find a reference for this too apart from an offhand mention in
| a paper about timing attacks.
| godelski wrote:
| Despite the resolution being nanoseconds, what is the actual
| precision of computer clocks? I can't imagine it is actually
| nanoseconds. Takes me back to teaching physics labs where I had
| to hound students to remember that the accuracy of their
| measuring device is not identical to the smallest number it
| displays...
| Rapzid wrote:
| > I can't imagine it is actually nanoseconds
|
| It's nanoseconds.
| godelski wrote:
| 1? 10? 100? Those are all nanoseconds. When someone is asking
| about precision it tends to be a good thing to be precise in
| your answer.
| wewtyflakes wrote:
| You were similarly not precise in your skepticism. ;)
| ajb wrote:
| For devices clocked above 1Ghz it's perfectly possible for the
| clock to increment every ns, although that doesn't make it
| accurate to that level, and multi core systems may have clocks
| that are not synchronised to that level.
|
| ARMv8 guarantees that its clock increments at at least 1GHz;
| for Intel and earlier ARM it's more complicated.
| zokier wrote:
| Cycle counting is one thing, but it gets tricky when you have
| frequency scaling in play. Another problem is that even without
| freq scaling, cpu clocks are not designed to be super
| accurate/exact, and the true frequency might vary
| significantly even if the nominal freq is fixed
| daniel-cussen wrote:
| It doesn't matter that much they're not super accurate
| exact, they do in fact count ticks n tocks j dandy. I
| wonder if they are counted equally, interesting question if
| i do say so myself about my own question, i know rising
| edge is steady w the next rising edge, n falling edge w
| falling edge, but yeah...no i think that's a specification
| figure of merit, that they be balanced w respect to each
| other. N Intel(r) despite its many failures, long ago
| forecast n long overdue due to the sheer difficulty of
| their business model n how long they kept it going--they
| were saying it was going to end soon in the late seventies
| already--n you know what, i'm fine w that. They don't make
| old chips rot like other software companies i could
| mention, bits rot but glass is timeless. Which brings me
| back to the point, in my analysis the problem is not the
| clocks being inaccurate but rather the jitter, which means
| a single run will not suffice in describing a repeatable
| exact clock time taken for eg an inner loop, which is what
| is worth bothering with. The minimum jitter attainable
| currently is 1 cycle, n then i guess you run the same code
| repeatedly n take the minimum with more repeatability as a
| consequence of that low jitter.
|
| In the early nineties it was not so, you'd get the same
| number of clock cycles again n again n again.
|
| N then it gets tricky because cycle counts are thresholds
| set within which, if voltages n frequency are proper, the
| operation will complete deterministically w a soft error
| rate no greater than the system as a whole, about one per
| 30 years of hammering on the chip at sea level. Which is
| not enough for my tastes, n the jitter is a fucking mess.
|
| I much prefer the GA144, least jitter of any platform bar
| none, sensible because it is fully asynchronous logic, no
| clock anywhere in sight until you connect it to an
| oscillator, n even then the vibration doesn't rock the
| system like that of a grandfather clock synchs w another
| grandfather clock with whose pendulum's swing the former
| clock's pendulum swing is aligned. GA144 it's pretty easy
| to tell average case complexity of a function down to the
| tens of picoseconds, at which point you have to check that
| there's no open window letting a draft in. In fact the time
| trial will tell you such a draft is coming into the house,
| it happened to me, while not from a parallel universe in
| spacial dimensions it is by all means from another universe
| in the time dimension.
| ChuckMcM wrote:
| Let me tell you about the time the performance guy figured out
| one of the big cycle eaters was that every CPU in the SMP system
| was trying to read the time of day from the same place in memory
| so they "fixed" that by giving every CPU its only copy of the
| current time ...
| techNoob123 wrote:
| At some point doesn't this come down to the ISA? A CPU running at
| 3GHz gets 3 clock cycles per nanosecond.
|
| I bet there is a fair amount of optimization in the compiler that
| leads to back to back assembly calls of reading the clock
| register. If subsequent time.Now() calls happen within 3 clock
| cycles of each other, can you really fairly expect unique
| nanosecond precision...
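|
| A quick way to check on a given machine (a C sketch; on
| Linux, Go's time.Now() bottoms out in the same clock_gettime
| call):
|
|     #include <stdint.h>
|     #include <stdio.h>
|     #include <time.h>
|
|     int main(void) {
|         struct timespec a, b;
|         for (int i = 0; i < 5; i++) {
|             clock_gettime(CLOCK_REALTIME, &a);
|             clock_gettime(CLOCK_REALTIME, &b);
|             int64_t d =
|                 (int64_t)(b.tv_sec - a.tv_sec) * 1000000000
|                 + (b.tv_nsec - a.tv_nsec);
|             /* Typically a small multiple of the clock's
|                granularity, and sometimes 0. */
|             printf("delta: %lld ns\n", (long long)d);
|         }
|         return 0;
|     }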
| rwmj wrote:
| Linux (x86_64) will use RDTSC and adjust that against a value
| read from the VDSO, so it can indeed happen very quickly.
| zokier wrote:
| I guess I'm old, the macOS behaviour is more in line with my
| expectations.
|
| But this got me thinking, how feasible it would be to tie the
| various clock systems of a computer to some reference clock, like
| 10 MHz GPSDO? Obviously it wouldn't improve the granularity, but
| you could ensure that the timestamps are actually accurate.
| Because otherwise I doubt that random computer clock would be
| accurate down to 32ns even with NTP.
| mgaunard wrote:
| Getting a 10MHz PPS signal requires specialized expensive
| hardware and typically doesn't scale well to cover every
| server. That sort of thing is best left to extreme applications
| with FPGAs or ASICs. In particular the 10MHz version; a lot of
| commodity hardware only supports the 1Hz one.
|
| There's still an in-between of PPS and NTP: PTP.
| mlichvar wrote:
| A major problem in synchronization of the system clock is
| PCIe. Hardware can timestamp PPS signal or PTP/NTP packets
| with accuracy of a few nanoseconds if everything is well
| compensated, but the PCIe bus between the CPU and the
| timestamping HW has a latency of hundreds of nanoseconds with
| potentially large asymmetry, degrading the accuracy
| significantly.
|
| Measuring that error is difficult. A possibility is to run a
| program periodically making random reads from memory
| (avoiding the CPU cache) to generate a PPS signal on the
| memory bus, which can be observed with a scope. There is a
| lot of noise due to other memory activity, RAM refresh, etc.
| From the configured memory speed and tRCD+tCL timings the
| uncertainty of the error can be reduced.
|
| This might improve with new hardware. There is a feature
| called Precision Time Measurement (PTM), which is a hardware
| implementation of an NTP-like protocol in PCIe, but so far I
| have seen this working only on some onboard NICs.
| quickthrower2 wrote:
| Another case of precision but not accuracy.
| dnstalk wrote:
| Comment out CLOCK_MONOTONIC_RAW because that's not available on
| FreeBSD, and it looks OK to me? My understanding is that we
| should see some timestamps repeating if there's a collision,
| correct? I can't get any collisions...
|
|     > ./clock_gettime_demo/clock_gettime_demo
|     clock_getres(CLOCK_REALTIME, ...)=1 ns
|     clock_getres(CLOCK_MONOTONIC, ...)=1 ns
|     CLOCK_REALTIME 30 samples:
|     1689957434662039455
|     1689957434662039526 (diff=71)
|     1689957434662039566 (diff=40)
|     1689957434662039606 (diff=40)
|     1689957434662039636 (diff=30)
|     1689957434662039676 (diff=40)
|     1689957434662039706 (diff=30)
|     1689957434662039736 (diff=30)
|     1689957434662039766 (diff=30)
|     1689957434662039796 (diff=30)
|     1689957434662039826 (diff=30)
|     1689957434662039856 (diff=30)
|     1689957434662039886 (diff=30)
|     1689957434662039916 (diff=30)
|     1689957434662039946 (diff=30)
|     1689957434662039987 (diff=41)
|     1689957434662040016 (diff=29)
|     1689957434662040047 (diff=31)
|     1689957434662040076 (diff=29)
|     1689957434662040107 (diff=31)
|     1689957434662040137 (diff=30)
|     1689957434662040167 (diff=30)
|     1689957434662040197 (diff=30)
|     1689957434662040227 (diff=30)
|     1689957434662040257 (diff=30)
|     1689957434662040297 (diff=40)
|     1689957434662040327 (diff=30)
|     1689957434662040357 (diff=30)
|     1689957434662040387 (diff=30)
|     1689957434662040417 (diff=30)
|     CLOCK_MONOTONIC 30 samples:
|     2168262770533974
|     2168262770534023 (diff=49)
|     2168262770534054 (diff=31)
|     2168262770534094 (diff=40)
|     2168262770534124 (diff=30)
|     2168262770534164 (diff=40)
|     2168262770534194 (diff=30)
|     2168262770534224 (diff=30)
|     2168262770534254 (diff=30)
|     2168262770534284 (diff=30)
|     2168262770534314 (diff=30)
|     2168262770534344 (diff=30)
|     2168262770534374 (diff=30)
|     2168262770534404 (diff=30)
|     2168262770534435 (diff=31)
|     2168262770534464 (diff=29)
|     2168262770534495 (diff=31)
|     2168262770534524 (diff=29)
|     2168262770534555 (diff=31)
|     2168262770534585 (diff=30)
|     2168262770534615 (diff=30)
|     2168262770534645 (diff=30)
|     2168262770534685 (diff=40)
|     2168262770534715 (diff=30)
|     2168262770534745 (diff=30)
|     2168262770534775 (diff=30)
|     2168262770534805 (diff=30)
|     2168262770534835 (diff=30)
|     2168262770534865 (diff=30)
|     2168262770534895 (diff=30)
| vient wrote:
| Do you run it on 4 cores simultaneously as the author did?
| dnstalk wrote:
| I'm running it on real hardware (Ryzen 9 5900X, 12 cores / 24
| threads) and all I'm doing is executing the program once. My
| understanding is that the program uses runtime.GOMAXPROCS(0),
| which just returns the current setting (defaulting to
| runtime.NumCPU()), to ensure it's using all available CPU
| threads?
| dnstalk wrote:
| Oh, I was only running the C program, not the Go binary that
| orchestrates it; that's why I wasn't seeing the longer stats.
| But I'm still getting no duplicate collisions.
| running longer zeros test ...
| sampled 10000000 pairs; 0 time diff zeros = 0.000000%;
| 0 nano diff zeros = 0.000000%
| starting parallel test 24 goroutines x 10000000 samples ...
| (each thread: 10000000 samples, 0 collisions inside the thread)
| collisions with other threads, per thread:
|   0        945466   1982688  2739919  3334361  4109772
|   4489881  5157178  5596508  5854763  5937583  6434076
|   6521917  6932626  7104428  7285076  7514904  7833317
|   7737265  7723545  7730441  8311161  7965117  8460173
| 102297835 final samples; 137702165 total collisions =
| 57.375902%; possible duplicate collisions? 0
| qwerty456127 wrote:
| > it is unsafe to assume that a raw nanosecond timestamp is a
| unique identifier
|
| It is reasonable to assume that a raw nanosecond timestamp is a
| unique identifier in cases where you can reasonably expect such
| identifiers not to be generated too often.
|
| E.g. this is OK (and even redundant) for identifying the manual
| interactions a single user would produce. It is probably also
| OK for labelling the manual interactions multiple users would
| produce, if that's a small team.
| LAC-Tech wrote:
| A lot of mention of UUIDv7 in this thread, which is good. But I
| also wonder what the collision rate for ULIDs is.
| 3cats-in-a-coat wrote:
| I frankly don't understand how it's good. UUID originally was
| intended as something you use very sparingly, to name, say, a
| product SKU maybe, an organization, something like that. Not
| literally content that collides commonly at the same
| nanosecond, in the same application, in the same platform/org.
|
| At some point we have to question the sanity of using one
| single flat address space for everything from the tiniest
| identifier to the... well, "Universe", as if it makes sense.
|
| We can have registrars for global ids, and we can nest local
| ids inside them, we can have a hierarchy, and we can have local
| compact ids, 32, 64 or 128 bit, which will never collide, and
| be locally cohesive.
|
| So why aren't we doing this? Is it ignorance? Or is it because
| magic is easier than actually figuring out the synchronization
| and order of things in a system like this? (And no,
| synchronization does NOT imply you need to issue ids strictly
| linearly.)
|
| Honestly, I'm at a loss for words.
| creatonez wrote:
| > I frankly don't understand how it's good. UUID originally
| was intended as something you use very sparingly, to name,
| say, a product SKU maybe, an organization, something like
| that. Not literally content that collides commonly at the
| same nanosecond, in the same application, in the same
| platform/org.
|
| Wikipedia gives a different history: it was originally for
| networked computers -
| https://en.wikipedia.org/wiki/Universally_unique_identifier#...
| RicardoLuis0 wrote:
| > So why aren't we doing this?
|
| isn't IPv6 basically that? Just restricted to internet
| addresses.
| 3cats-in-a-coat wrote:
| It's sort of that. Although it would've been nice if the size
| of the address wasn't fixed, so that one day we could add an
| optional segment on top and connect the whole Milky Way, or
| something.
| fer wrote:
| Fun fact: a ping from the center to the edge of the Milky Way
| would overflow a 64 bit nanosecond timestamp: 2^63 ns is only
| about 292 years, while the round trip is on the order of
| 100,000 years.
| pravus wrote:
| > We can have registrars for global ids, and we can nest
| local ids inside them, we can have a hierarchy, and we can
| have local compact ids, 32, 64 or 128 bit, which will never
| collide, and be locally cohesive.
|
| We have this. It's called OID (Object Identifier) and is used
| in X.509, LDAP, SNMP, and many other technologies. A global
| registry exists (I have an entry) and it's a giant tree.
|
| > So why aren't we doing this? Is it ignorance?
|
| The problem you are solving here is for durable, long-lasting
| keys. There is also a need to generate large batches of
| short-lived keys that need to never collide for
| security/identification purposes. A centralized registry will
| not work for that and it requires a wholly different
| technique.
| 3cats-in-a-coat wrote:
| My point was, let's start with that large OID tree, for
| example.
|
| And continue this concept downward. Except inside your org,
| you're the registrar of your namespace's sub-OIDs, and so on.
| And there's precisely no reason not to have hierarchical
| sequential ids for everything. You need to generate ids on 100
| servers? Good, give each server a namespace. You need to do
| that in 100 processes on each server? Good, give a sub-
| namespace to those too.
|
| And best of all, the above-organization part of the OID tree is
| ONLY needed if you show these ids outside your organization.
| Otherwise you can use the internal private part of the
| identifiers.
|
| So what am I missing here? Maybe they need to be hard to guess,
| so add a random key part to them (distinct from the generated
| sequence part) as a "password". Done.
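|
| A minimal sketch of the shape I mean (all names invented, not
| any real API): each level hands out ids inside its own
| namespace, so uniqueness never needs more than local
| coordination.
|
|   package hierid
|
|   import "sync/atomic"
|
|   // ID is hierarchical: each field is assigned by the level
|   // above it; only Seq is generated at runtime.
|   type ID struct {
|       Org    uint32 // from a registrar; only needed externally
|       Server uint16 // assigned by the org
|       Proc   uint16 // assigned by the server
|       Seq    uint64 // local counter; no coordination at all
|   }
|
|   // Generator issues ids for one (org, server, proc) namespace.
|   type Generator struct {
|       org    uint32
|       server uint16
|       proc   uint16
|       seq    atomic.Uint64
|   }
|
|   func (g *Generator) Next() ID {
|       return ID{g.org, g.server, g.proc, g.seq.Add(1)}
|   }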
| pravus wrote:
| I'd also need to make namespaces dynamic and centrally
| coordinate their lifecycle along with the rest of my
| infrastructure, as I stand up anything that needs some sort of
| coordinated ID. So I've just moved the issue to an even larger
| scope with even more complexity.
|
| What do I gain for this? UUIDs solve all of this because
| the chance of creating a duplicate is so low it can
| effectively be ignored. I can namespace UUIDS and create
| chains of them if needed.
|
| This is the reason both exist. We need both. We can use
| both.
| 3cats-in-a-coat wrote:
| Once you get used to this basic principle (whoever creates
| something slaps a name on it), it stops being considered a
| hassle or confusing.
|
| More specifically:
|
| 1. When you create something, the creator names it.
|
| 2. When you bring something you created outside your
| scope, you prepend your name to it.
|
| It's kind of what we do with (actual) children, if you think
| about it; slightly less structured, but the idea was always
| there. Just add recursion to complete the principle.
| pravus wrote:
| I generate children faster than the birth registry can issue
| certificates, and Cronus eats them before a certificate could
| be issued anyway. Nesting namespaces doesn't solve the problem
| of scale within a single namespace.
| newaccount74 wrote:
| The problem with hierarchic identifiers is that you need to be
| extremely diligent when assigning them, and you make it
| extremely hard to refactor anything in the future. But
| refactoring is something that happens all the time.
|
| For example, you might have "comments" and "posts", and later
| merge the two entities and call them "notes". Now you have to
| figure out a way to merge the identifiers -- with UUIDs you
| don't have issues like this, because they are universally
| unique.
|
| A lot of developers have found that UUIDv7 provides the
| properties they want (unique identifiers which sort by
| creation time), and they don't come with the hassle that
| other approaches come with.
| aljarry wrote:
| What's wrong with using UUIDs for things that are created every
| nanosecond? It's still improbable to get the same UUID for two
| different "things" in the system, even if you generate billions
| of them per second. It's a pretty good way to get non-colliding
| IDs without a central registry. That's the main feature.
|
| I've never seen a duplicate UUID generated, and if that's the
| concern, you may think about using a cryptographic randomness
| source to generate them.
|
| UUID v7 has millisecond precision + 74 bits of randomness,
| which is _a lot_.
|
| Also, a central registry for IDs seems like a great way to
| shoot yourself in the foot: what if it's down, your network is
| down, you're on a cellular connection, on a plane...
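|
| For illustration, a rough sketch of the v7 layout as described
| in the draft (48-bit unix-ms timestamp, 4-bit version, 2-bit
| variant, 74 random bits); the field packing here is my reading
| of the draft, so treat it as a sketch rather than a reference
| implementation:
|
|   package main
|
|   import (
|       "crypto/rand"
|       "encoding/binary"
|       "fmt"
|       "time"
|   )
|
|   func newV7() [16]byte {
|       var u [16]byte
|       ms := uint64(time.Now().UnixMilli())
|       binary.BigEndian.PutUint64(u[:8], ms<<16) // bytes 0-5: ms
|       if _, err := rand.Read(u[6:]); err != nil {
|           panic(err)
|       }
|       u[6] = (u[6] & 0x0F) | 0x70 // version 7
|       u[8] = (u[8] & 0x3F) | 0x80 // variant 10
|       return u
|   }
|
|   func main() {
|       fmt.Printf("%x\n", newV7())
|   }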
| newaccount74 wrote:
| ULIDs are almost the same as UUIDv7, except that they have a
| few extra random bits where UUIDv7 encodes the version and
| variant. Every UUIDv7 is a valid ULID, and the main difference
| is the encoding when displayed in textual form.
| bArray wrote:
| > I was wondering: how often do nanosecond timestamps collide on
| modern systems? The answer is: very often, like 5% of all
| samples, when reading the clock on all 4 physical cores at the
| same time.
|
| A few things to consider:
|
| 1. This would depend a lot on how often you are checking: the
| more often you check, the more collisions.
|
| 2. There may also be some weirdness where threads doing similar
| tasks will synchronise or desynchronise to maximize throughput.
|
| 3. Your system may not offer accurate nanosecond time. For this
| reason some OSes don't offer anything finer than microsecond
| time, as the clocks simply cannot offer higher usable accuracy.
| You're also measuring the time it takes to call the
| function(s), create the timestamp, etc.
|
| A simple solution I used years ago was to add random extra
| precision to reduce collisions. That breaks most ties. If you
| have 5% ties at nanosecond precision, then with three random
| "picosecond" digits your collisions are down to 0.005%. You
| could also seed the random offset with an ID.
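|
| A sketch of the idea (here with 32 random bits rather than
| decimal "picoseconds", since scaling epoch nanoseconds by 1000
| would overflow an int64; the extra bits carry no real time
| information, they only break ties):
|
|   package tiebreak
|
|   import (
|       "encoding/binary"
|       "math/rand"
|       "time"
|   )
|
|   // Key is a 96-bit sortable key: the nanosecond timestamp
|   // followed by 32 random sub-nanosecond tiebreaker bits.
|   func Key() [12]byte {
|       var k [12]byte
|       binary.BigEndian.PutUint64(k[:8],
|           uint64(time.Now().UnixNano()))
|       binary.BigEndian.PutUint32(k[8:], rand.Uint32())
|       return k
|   }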
| p-e-w wrote:
| If you want unique identifiers, use version 4 (random) UUIDs.
| Problem solved.
|
| The probability of a collision is roughly the same as the
| probability of a fully grown dinosaur spontaneously manifesting
| in your bedroom due to quantum fluctuations.
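|
| For scale, the birthday bound: with n random 122-bit v4 UUIDs,
| P(collision) ~= n^2 / 2^123. Even after generating 103 trillion
| of them (n ~= 1.03e14), that is roughly
| (1.03e14)^2 / 1.06e37 ~= 1e-9, a one-in-a-billion chance.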
| beefield wrote:
| Knowing nothing about UUID v4 generation, I have likely a
| stupid question. What makes you so confident that all
| implementations and their entropy sources are flawless enough
| to make actual collision probability close enough to theory?
| Thiez wrote:
| What makes us so confident that our database implements ACID
| correctly, the RAM stores bits correctly, and the disk
| drivers store the data correctly?
|
| In the end we have to make _some_ assumptions about the
| correctness of (some of the) components.
| tjoff wrote:
| We are not confident in any of that. Which is why we
| mitigate it with checksums, ECC etc.
|
| So depending on the consequences you might opt to reduce
| the risk or help mitigate the consequences.
|
| To just state that it is about as likely as my coffee maker
| being a portal to the future isn't very helpful. Poor
| entropy sources or bugs are not uncommon.
| littlestymaar wrote:
| > The probability of a collision is roughly the same as the
| probability of a fully grown dinosaur spontaneously manifesting
| in your bedroom due to quantum fluctuations.
|
| I'd love to see the math for the probability of the second
| option.
| amelius wrote:
| > The probability of a collision is roughly the same as the
| probability of a fully grown dinosaur spontaneously manifesting
| in your bedroom due to quantum fluctuations.
|
| But now the probability of bad things happening increased by
| about a factor of two, which is not acceptable.
| jraph wrote:
| I will take my chance :-)
|
| More seriously, If you can use them, good old increments are
| probably best. They are fast and cheap. Especially in a
| database. They can have privacy/security issues (you could
| guess things by the values of ids of stuff). UUIDs are better
| in those case or when you deal with a distributed system.
| p-e-w wrote:
| Unless the RNG itself is the problem, there's no reason not
| to. All kinds of civilization-ending catastrophes are
| _vastly_ more likely than a collision in a space of size
| 2^122.
| p-e-w wrote:
| > They can have privacy/security issues (you could guess
| things by the values of ids of stuff).
|
| Push them through a secure hash function, and that problem is
| solved too (assuming you can keep the base counter private).
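|
| Sketch of that, using a keyed hash rather than a bare one
| (hashing a small counter without a secret can simply be brute-
| forced, as pointed out below; note it is also one-way, so
| mapping external ids back to internal ones needs its own
| lookup, as also discussed below):
|
|   package extid
|
|   import (
|       "crypto/hmac"
|       "crypto/sha256"
|       "encoding/binary"
|       "encoding/hex"
|   )
|
|   // External derives an unguessable external id from an
|   // internal sequential one; the secret key is what makes it
|   // unguessable.
|   func External(key []byte, internal uint64) string {
|       mac := hmac.New(sha256.New, key)
|       var buf [8]byte
|       binary.BigEndian.PutUint64(buf[:], internal)
|       mac.Write(buf[:])
|       return hex.EncodeToString(mac.Sum(nil)[:16])
|   }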
| cacheyourdreams wrote:
| If you're going to do that then you might as well just use
| UUIDs, since you effectively reintroduce their negative aspects
| (an infinitesimally small chance of collisions, the computation
| involved in the calculation, etc.)
| p-e-w wrote:
| The difference is that you can still use sequential IDs
| internally, while exposing hashed IDs to the outside.
| This protects your database from collisions under all
| circumstances, while in the absolute worst case, a single
| user might experience bugs because two external IDs
| collide.
| asimpletune wrote:
| If you're hashing for security reasons, I think you
| should still maintain a salt.
| cacheyourdreams wrote:
| Yes, I tend to like this philosophy in database design,
| of internal sequential ids which are used for joins
| between tables etc. and an exposed "external reference".
| But I typically would use a UUID for my external
| reference rather than a hash of the internal id.
| newaccount74 wrote:
| Doesn't that just add a whole lot of unnecessary
| complexity? If elements have multiple IDs, one of which
| should not be leaked to the outside, that's just asking
| for trouble in my opinion.
|
| Is generating UUIDv4 or UUIDv7 really too much effort?
| I'd assume that writing the row to the database takes
| longer than generating the UUID.
| iforgotpassword wrote:
| It also means that once your hash secret leaks for whatever
| reason, or gets brute-forced because of some weird weakness in
| your system, it's game over: everybody will forever be able to
| predict future ids, guess neighboring ids, etc., unless you're
| willing to change the hash and invalidate all links to any
| content on your site.
|
| If I'm in a scenario where I think I need consecutive ids
| internally and random ones externally, I'll just have two
| fields in my tables.
| jstanley wrote:
| You need 2 fields anyway, unless you want to have to
| brute force your hash function when you need to invert
| it.
| wongarsu wrote:
| Store just the sequential id, compute the hash on the
| edge.
|
| This keeps your database simple and performant, and
| pushes complexity and work to the backend servers. This
| can be nice because developers are typically more at home
| at that layer, and scaling the backend can be a lot
| easier than scaling your database. But it also comes with
| the downsides listed in this thread.
| jstanley wrote:
| That's fine, but when a request comes in referencing only
| a hash and not an id (because you're not leaking ids to
| clients), how do you get the id?
| wongarsu wrote:
| Good point. Back when we did that we just used a
| reversible hash function (some would call it encryption).
| There are some simple algorithms meant for encrypting
| single integers with a reasonable key.
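|
| A toy sketch of that kind of construction: a 4-round Feistel
| network over 64-bit ids. Illustrative only, not a vetted
| cipher; for real use pick an established format-preserving
| encryption scheme.
|
|   package permute
|
|   import (
|       "crypto/hmac"
|       "crypto/sha256"
|       "encoding/binary"
|   )
|
|   // round is a keyed pseudorandom function on a 32-bit half.
|   func round(key []byte, r uint32, n int) uint32 {
|       mac := hmac.New(sha256.New, key)
|       var buf [5]byte
|       binary.BigEndian.PutUint32(buf[:4], r)
|       buf[4] = byte(n)
|       mac.Write(buf[:])
|       return binary.BigEndian.Uint32(mac.Sum(nil))
|   }
|
|   // Encrypt is a permutation of uint64, so distinct inputs
|   // can never collide; Decrypt runs the rounds backwards.
|   func Encrypt(key []byte, v uint64) uint64 {
|       l, r := uint32(v>>32), uint32(v)
|       for n := 0; n < 4; n++ {
|           l, r = r, l^round(key, r, n)
|       }
|       return uint64(l)<<32 | uint64(r)
|   }
|
|   func Decrypt(key []byte, v uint64) uint64 {
|       l, r := uint32(v>>32), uint32(v)
|       for n := 3; n >= 0; n-- {
|           l, r = r^round(key, l, n), l
|       }
|       return uint64(l)<<32 | uint64(r)
|   }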
| iforgotpassword wrote:
| I might be misremembering, but didn't YouTube do this in
| the early days? So yeah, that was what I was thinking of
| when replying, not a traditional hash function.
| stickfigure wrote:
| This is a weird proposal. If you're using non-hashed IDs
| internally and exposing hashed IDs externally, you are
| going to need to map those (securely hashed) ids back to
| internal ids when the client hands them to you.
|
| I guess you could do this with complete table scans,
| hashing the ids and looking for matches, but that would
| be horribly inefficient. You could maintain your own
| internal reverse index of hash -> id but now I have to
| ask what's the point? You aren't saving any storage and
| you're adding a lot of complexity.
|
| Seems like if you want random unguessable external ids,
| you're always better off just generating them and using
| them as primary keys.
|
| Also, you aren't protecting your database "from
| collisions under all circumstances" - there's no
| guarantee your hash won't collide even if the input is
| small.
| fluoridation wrote:
| A block cipher is better, as it's guaranteed to not have
| collisions.
| marcosdumay wrote:
| Hum, no. You can easily hash the numbers from 1 up to the value
| you see and guess the next value.
|
| If you want a secure identifier, generate a random 64- or 128-
| bit number (a UUID type 4). And do not use this number as an
| internal identifier, because identifier performance is all
| about predictability and low entropy.
| gnulinux wrote:
| If you use a robust salt, such as a randomly generated long
| salt, an attacker won't be able to guess the next hash.
| enyone wrote:
| the dinosaur is now on my bed, what next?
| quickthrower2 wrote:
| Enjoy the non heat-death of the universe
| MattPalmer1086 wrote:
| Close the door and immediately switch to v5.
| lyu07282 wrote:
| I would just call Randall Munroe, he will know what to do
| fendy3002 wrote:
| but is it higher or lower if the dinosaur is not fully grown?
| dtech wrote:
| v7 looks nicer since it solves v4's locality issue and you're
| still a gazillion times more likely to win the lottery than
| generate a collision.
| p-e-w wrote:
| Unfortunately, many libraries don't implement it yet.
| gmhafiz wrote:
| It is still in draft and I've watched the spec change at
| least once.
| hinkley wrote:
| I've met too many people who are surprised by millisecond or
| microsecond timestamp collisions.
|
| My most memorable and least favorite variety of this is when
| people try to assemble a timestamp from two system calls, one for
| the most significant digits and a second for the least.
|
| If the smallest digits roll over from 99x to 00x after you read
| the large digits, due to process preemption, you can create a
| timestamp for an entity that happens before the entity that
| caused it to exist. This breaks some code spectacularly (I've
| seen infinite loops at least twice).
|
| If you haven't memorized this as a thing to always avoid, you end
| up with tests that pass 99.5% of the time, and it can take
| someone with very good pattern matching skills to catch that the
| same test has been red once a week for a month and a half, which
| is a very long time for a logic bomb to live in CI/CD code before
| getting fixed.
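|
| The usual fix is to read the coarse part twice and retry if it
| ticked over. A sketch, with seconds() and millis() standing in
| for whatever pair of reads the platform actually forces on you
| (here both just re-read the Go clock):
|
|   package stamp
|
|   import "time"
|
|   func seconds() int64 { return time.Now().Unix() }
|   func millis() int64 {
|       return int64(time.Now().Nanosecond() / 1000000)
|   }
|
|   // BuggyStamp can be off by a full second: if the clock
|   // ticks over between the two reads, the parts belong to
|   // different seconds and timestamps can run backwards.
|   func BuggyStamp() int64 {
|       return seconds()*1000 + millis()
|   }
|
|   // Stamp re-reads the coarse part and retries on rollover,
|   // so both parts come from the same second.
|   func Stamp() int64 {
|       for {
|           s := seconds()
|           m := millis()
|           if seconds() == s {
|               return s*1000 + m
|           }
|       }
|   }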
| tsenart wrote:
| Our Go ULID package has millisecond precision + monotonic random
| bytes for disambiguation while preserving ordering within the
| same millisecond. https://github.com/oklog/ulid
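|
| Usage is roughly this (going from the README, so the API may
| have drifted):
|
|   package main
|
|   import (
|       "fmt"
|       "math/rand"
|       "time"
|
|       "github.com/oklog/ulid/v2"
|   )
|
|   func main() {
|       // Monotonic entropy: within one millisecond the random
|       // component is incremented rather than re-drawn, which
|       // preserves ordering.
|       seed := rand.New(rand.NewSource(time.Now().UnixNano()))
|       entropy := ulid.Monotonic(seed, 0)
|       id := ulid.MustNew(ulid.Timestamp(time.Now()), entropy)
|       fmt.Println(id)
|   }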
| chaoticmass wrote:
| Reminds me of some Lotus Notes folklore. Apparently they used
| to use timestamps with a resolution of 1 second as unique IDs
| for things. When there was a collision, they would just add 1
| second. Eventually you'd end up with items dated in the future
| because there would be so many collisions.
| donatj wrote:
| Slack's API uses straight nanoseconds for the IDs of its
| messages, which I always found very curious, but I figured they
| must have some way of resolving collisions on the backend.
| rocho wrote:
| A message ID only needs to be unique per workspace, in which
| case you'd expect very few collisions to begin with, and you
| could even retry on insert failure with a new timestamp. I
| don't think that would cause significant performance penalties.
| scandum wrote:
| I get the current time in microseconds, and increment by one
| nanosecond for each call between updates.
|
| Problem solved for the next 200 years.
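|
| A sketch of that scheme (mutex for simplicity; the stored value
| is "nanoseconds" but only the microsecond digits come from the
| clock, leaving 1000 counter slots per microsecond):
|
|   package uniq
|
|   import (
|       "sync"
|       "time"
|   )
|
|   var (
|       mu   sync.Mutex
|       last int64
|   )
|
|   // Next returns a strictly increasing nanosecond-shaped
|   // timestamp: the microsecond clock reading * 1000, bumped by
|   // one fake nanosecond per extra call in the same microsecond.
|   func Next() int64 {
|       mu.Lock()
|       defer mu.Unlock()
|       now := time.Now().UnixMicro() * 1000
|       if now <= last {
|           last++
|       } else {
|           last = now
|       }
|       return last
|   }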
| ourmandave wrote:
| Next article: _Picosecond timestamp collisions are common._
| tourist2d wrote:
| Problem solved until your boss calls you an idiot for storing
| fake nanosecond precisions.
| JacobSeated wrote:
| Timestamps should probably never be used as a "unique" id.
| 3cats-in-a-coat wrote:
| The problem is achieving locality in cohesion/meaning, which
| usually involves locality in time, which provokes using
| timestamps as at least part of the id. But it's a chain of very
| lazy thinking IMHO, and a glimpse of the house of cards some
| large systems are built on.
| Mawr wrote:
| Would a Snowflake ID work for this?
| https://en.wikipedia.org/wiki/Snowflake_ID
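|
| For reference, the layout is simple enough to sketch: 41 bits
| of milliseconds since a custom epoch, 10 bits of machine id,
| and 12 bits of per-machine sequence, packed into an int64.
|
|   package snowflake
|
|   // Compose packs the three Snowflake fields into one int64.
|   func Compose(ms int64, machine, seq uint16) int64 {
|       return ms<<22 | int64(machine&0x3FF)<<12 |
|           int64(seq&0xFFF)
|   }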
| eru wrote:
| Why do you want this kind of locality anyway?
| traviscj wrote:
| Database indices.
___________________________________________________________________
(page generated 2023-07-21 23:01 UTC)