[HN Gopher] My Postgres Story: Internationalization
       ___________________________________________________________________
        
       My Postgres Story: Internationalization
        
       Author : PaulHoule
       Score  : 73 points
       Date   : 2023-09-29 12:47 UTC (10 hours ago)
        
 (HTM) web link (obartunov.livejournal.com)
 (TXT) w3m dump (obartunov.livejournal.com)
        
       | hot_gril wrote:
       | One serious way in which the location of your Postgres cluster
       | might matter is the timestamp vs timestamptz ("timestamp with
       | time zone") thing, where timestamptz is counterintuitively almost
       | always what you want. Despite the naming, both store the exact
       | same info, an 8-byte duration since epoch with microsecond
       | precision. The difference is in how each is printed in string
       | conversion, and it depends on your cluster's configured time zone
       | and locale. Here's the problem:                 postgres=# SET
       | timezone TO 'US/Pacific';       SET            postgres=# SELECT
       | now()::timestamptz;                   now
       | ----------------------------        2023-09-29 09:26:40.277-07
       | (1 row)            postgres=# SELECT now()::timestamp;
       | now                  -------------------------        2023-09-29
       | 09:26:43.927       (1 row)
       | 
       | timestamp removes the timezone offset, making it ambiguous. Most
       | Postgres drivers will silently convert this to some kind of
       | datetime assuming UTC, which is really the wrong behavior.
       | timestamptz is fine because the driver knows to apply the 7hr
       | offset. So unless you set your cluster to UTC or maybe the "C"
       | locale (which I think converts everything to UTC but I forget),
       | any backend reading from your DB will get timestamps off by
       | several hours.
        
         | iknownothow wrote:
         | In my opinion that is not how you should be timestamps
         | localized to a time zone in Postgres. You should instead use:
         | SELECT now()::timestampz at time zone 'US/Pacific'
         | 
         | The output is a timestamp that has no time zone offsets and
         | therefore is localized. However I've noticed most novice
         | programmers don't fully understand the implications of 'at time
         | zone ...' syntax.
         | 
         | It makes sense to me that the actual time zone is not stored
         | within the timestamp itself since tomorrow a time zone can
         | introduce DST and therefore some timestamps that were valid
         | today will no longer be valid tomorrow.
        
           | hot_gril wrote:
           | In a backend running this SQL, you'd have to hardcode or
           | track which timezone to localize to ("at time zone"), which
           | isn't desirable.
        
           | masklinn wrote:
           | > It makes sense to me that the actual time zone is not
           | stored within the timestamp itself since tomorrow a time zone
           | can introduce DST and therefore some timestamps that were
           | valid today will no longer be valid tomorrow.
           | 
           | It's the exact inverse. If you want to store a future moment
           | in time then you must store the local time with its timezone,
           | so that you can adjust everything relative to it. Otherwise
           | you lose critical information and any adjustment you do will
           | be the wrong way around.
           | 
           | Once the instant has passed it doesn't matter, because all
           | the offsets are fixed.
           | 
           | > tomorrow a time zone can introduce DST and therefore some
           | timestamps that were valid today will no longer be valid
           | tomorrow.
           | 
           | That happens either way. But the one timestamp which remains
           | valid throughout should be the one for the reference
           | timezone. Whether that timezone is the one which introduces
           | DST or not, the rest is what should shift relative to it.
           | 
           | For instance let's say that last week a buddy told you to
           | meet at 4, you store this information normalised without
           | timezone. In the meantime the legislature declared summer
           | time. Now when you ask for the meeting time your program
           | tells you it's 5 local. Which it is not.
        
         | masklinn wrote:
         | > where timestamptz is counterintuitively almost always what
         | you want.
         | 
         | Funny I think the exact opposite. timestamptz would be useful
         | if it stored a timezone, since it does not, it's useless, the
         | application is a much better location to perform timezone
         | handling (or store the timezone in a nearby column, if
         | applicable, possibly even define a custom domain to bundle
         | both).
         | 
         | > So unless you set your cluster to UTC or maybe the "C" locale
         | (which I think converts everything to UTC but I forget), any
         | backend reading from your DB will get timestamps off by several
         | hours.
         | 
         | AKA they'll get the actual timestamp.
         | 
         | A timestamp is either UTC, or it's zoned and associated with a
         | timezone. Adjusting randomly based on the locale machine's
         | configuration or a piece of connection state is definitely a
         | thing I have no interest in.
         | 
         | Locale string configurations have a similar issue of being
         | half-assed and broken. They barely even worked before the
         | internet existed, as soon as more than two different cultures
         | interact with the software they might as well be chucked away
         | for all the good they do (which is close to none). And that's
         | before finding out all the harm they do.
        
           | hot_gril wrote:
           | > AKA they'll get the actual timestamp.
           | 
           | No, they'll be 7 hours ahead. In my example, the actual data
           | stored in the table is approx 1696004800, which it translated
           | to 2023-09-29 09:26:40.277-07 for timestamptz output, and for
           | timestamp it outputted 2023-09-29 09:26:40.277 which the
           | client would incorrectly assume is 2023-09-29 09:26:40.277Z,
           | or 1696030000. [Edited with Unix timestamps to be clearer.]
           | 
           | > Adjusting randomly based on the locale machine's
           | configuration or a piece of connection state is definitely a
           | thing I have no interest in.
           | 
           | Right, and timestamp is the one that causes this random
           | adjustment. timestamptz avoids this by being explicit about
           | the time zone.
        
             | masklinn wrote:
             | > Right, and timestamp is the one that causes this random
             | adjustment.
             | 
             | No, you having decided to leverage timezones is what causes
             | it. This is a case of violence begetting more violence.
        
               | hot_gril wrote:
               | I'm not sure what you mean by leveraging timezones. The
               | SET TIMEZONE line was to simulate a fresh Postgres setup
               | on my machine, which defaults to US/Pacific for me. If my
               | cluster is hosted elsewhere, idk what it defaults to.
               | Postgres's built-in now() returns a timestamptz, and if I
               | were instead inserting values from - say - JS w/ node-pg
               | and using `new Date()`, I'd get the same behavior. I'm
               | not going out of my way here.
               | 
               | If you're suggesting using timestamp rather than
               | timestamptz, how do you do that without changing or
               | assuming the DB's locale settings?
        
         | jihadjihad wrote:
         | > So unless you set your cluster to UTC
         | 
         | sounds like a good idea to me
        
           | hot_gril wrote:
           | Generally it is, and many cloud services do this, but the
           | default Postgres packages for Linux/Mac/Windows do not iirc.
           | And sometimes it's convenient to see things in your own time
           | zone when querying manually.
        
           | marcosdumay wrote:
           | This breaks about as many things as it fixes.
           | 
           | But the things it breaks are noisy, and the things it fixes
           | are silent. So, yeah, sounds like a good trade-off.
           | 
           | Still, I have no idea what was going through the head of the
           | people that standardized timestamp as local time.
        
             | hot_gril wrote:
             | I can see why they'd want to support local time zone
             | formatting. Even setting to UTC is merely covering up the
             | problem that if you have a string like "2023-09-29
             | 09:26:43.927", that does not indicate a particular point in
             | time any better than my friend across the world proposing a
             | Zoom call "at 8PM today."
        
               | masklinn wrote:
               | > Even setting to UTC is merely covering up the problem
               | that if you have a string like "2023-09-29 09:26:43.927",
               | that does not indicate a particular point in time any
               | better than my friend across the world proposing a Zoom
               | call "at 8PM."
               | 
               | If the string in question is known to be UTC, which it
               | should be since there is no timezone associated, then it
               | does indicate a particular point in time.
        
               | hot_gril wrote:
               | UTC is a time zone too. ISO standard "2023-09-29
               | 09:26:43.927" isn't known to be UTC; you need a "Z" at
               | the end for that. For whatever reason, the client doesn't
               | ask the DB for its implied time zone either.
               | If no UTC relation information is given with a time
               | representation, the time is assumed to be in local time.
               | If the time is in UTC, add a Z directly after the time
               | without a space.
               | 
               | The only way to totally avoid the concept of time zones
               | is to use a Unix-like timestamp, 1696004800.927.
        
       ___________________________________________________________________
       (page generated 2023-09-29 23:01 UTC)