[HN Gopher] My Postgres Story: Internationalization
___________________________________________________________________
My Postgres Story: Internationalization
Author : PaulHoule
Score : 73 points
Date : 2023-09-29 12:47 UTC (10 hours ago)
(HTM) web link (obartunov.livejournal.com)
(TXT) w3m dump (obartunov.livejournal.com)
| hot_gril wrote:
| One serious way in which the location of your Postgres cluster
| might matter is the timestamp vs timestamptz ("timestamp with
| time zone") thing, where timestamptz is counterintuitively almost
| always what you want. Despite the naming, both store the exact
| same info, an 8-byte duration since epoch with microsecond
| precision. The difference is in how each is printed in string
| conversion, and it depends on your cluster's configured time zone
| and locale. Here's the problem: postgres=# SET
| timezone TO 'US/Pacific'; SET postgres=# SELECT
| now()::timestamptz; now
| ---------------------------- 2023-09-29 09:26:40.277-07
| (1 row) postgres=# SELECT now()::timestamp;
| now ------------------------- 2023-09-29
| 09:26:43.927 (1 row)
|
| timestamp removes the timezone offset, making it ambiguous. Most
| Postgres drivers will silently convert this to some kind of
| datetime assuming UTC, which is really the wrong behavior.
| timestamptz is fine because the driver knows to apply the 7hr
| offset. So unless you set your cluster to UTC or maybe the "C"
| locale (which I think converts everything to UTC but I forget),
| any backend reading from your DB will get timestamps off by
| several hours.
| iknownothow wrote:
| In my opinion that is not how you should be timestamps
| localized to a time zone in Postgres. You should instead use:
| SELECT now()::timestampz at time zone 'US/Pacific'
|
| The output is a timestamp that has no time zone offsets and
| therefore is localized. However I've noticed most novice
| programmers don't fully understand the implications of 'at time
| zone ...' syntax.
|
| It makes sense to me that the actual time zone is not stored
| within the timestamp itself since tomorrow a time zone can
| introduce DST and therefore some timestamps that were valid
| today will no longer be valid tomorrow.
| hot_gril wrote:
| In a backend running this SQL, you'd have to hardcode or
| track which timezone to localize to ("at time zone"), which
| isn't desirable.
| masklinn wrote:
| > It makes sense to me that the actual time zone is not
| stored within the timestamp itself since tomorrow a time zone
| can introduce DST and therefore some timestamps that were
| valid today will no longer be valid tomorrow.
|
| It's the exact inverse. If you want to store a future moment
| in time then you must store the local time with its timezone,
| so that you can adjust everything relative to it. Otherwise
| you lose critical information and any adjustment you do will
| be the wrong way around.
|
| Once the instant has passed it doesn't matter, because all
| the offsets are fixed.
|
| > tomorrow a time zone can introduce DST and therefore some
| timestamps that were valid today will no longer be valid
| tomorrow.
|
| That happens either way. But the one timestamp which remains
| valid throughout should be the one for the reference
| timezone. Whether that timezone is the one which introduces
| DST or not, the rest is what should shift relative to it.
|
| For instance let's say that last week a buddy told you to
| meet at 4, you store this information normalised without
| timezone. In the meantime the legislature declared summer
| time. Now when you ask for the meeting time your program
| tells you it's 5 local. Which it is not.
| masklinn wrote:
| > where timestamptz is counterintuitively almost always what
| you want.
|
| Funny I think the exact opposite. timestamptz would be useful
| if it stored a timezone, since it does not, it's useless, the
| application is a much better location to perform timezone
| handling (or store the timezone in a nearby column, if
| applicable, possibly even define a custom domain to bundle
| both).
|
| > So unless you set your cluster to UTC or maybe the "C" locale
| (which I think converts everything to UTC but I forget), any
| backend reading from your DB will get timestamps off by several
| hours.
|
| AKA they'll get the actual timestamp.
|
| A timestamp is either UTC, or it's zoned and associated with a
| timezone. Adjusting randomly based on the locale machine's
| configuration or a piece of connection state is definitely a
| thing I have no interest in.
|
| Locale string configurations have a similar issue of being
| half-assed and broken. They barely even worked before the
| internet existed, as soon as more than two different cultures
| interact with the software they might as well be chucked away
| for all the good they do (which is close to none). And that's
| before finding out all the harm they do.
| hot_gril wrote:
| > AKA they'll get the actual timestamp.
|
| No, they'll be 7 hours ahead. In my example, the actual data
| stored in the table is approx 1696004800, which it translated
| to 2023-09-29 09:26:40.277-07 for timestamptz output, and for
| timestamp it outputted 2023-09-29 09:26:40.277 which the
| client would incorrectly assume is 2023-09-29 09:26:40.277Z,
| or 1696030000. [Edited with Unix timestamps to be clearer.]
|
| > Adjusting randomly based on the locale machine's
| configuration or a piece of connection state is definitely a
| thing I have no interest in.
|
| Right, and timestamp is the one that causes this random
| adjustment. timestamptz avoids this by being explicit about
| the time zone.
| masklinn wrote:
| > Right, and timestamp is the one that causes this random
| adjustment.
|
| No, you having decided to leverage timezones is what causes
| it. This is a case of violence begetting more violence.
| hot_gril wrote:
| I'm not sure what you mean by leveraging timezones. The
| SET TIMEZONE line was to simulate a fresh Postgres setup
| on my machine, which defaults to US/Pacific for me. If my
| cluster is hosted elsewhere, idk what it defaults to.
| Postgres's built-in now() returns a timestamptz, and if I
| were instead inserting values from - say - JS w/ node-pg
| and using `new Date()`, I'd get the same behavior. I'm
| not going out of my way here.
|
| If you're suggesting using timestamp rather than
| timestamptz, how do you do that without changing or
| assuming the DB's locale settings?
| jihadjihad wrote:
| > So unless you set your cluster to UTC
|
| sounds like a good idea to me
| hot_gril wrote:
| Generally it is, and many cloud services do this, but the
| default Postgres packages for Linux/Mac/Windows do not iirc.
| And sometimes it's convenient to see things in your own time
| zone when querying manually.
| marcosdumay wrote:
| This breaks about as many things as it fixes.
|
| But the things it breaks are noisy, and the things it fixes
| are silent. So, yeah, sounds like a good trade-off.
|
| Still, I have no idea what was going through the head of the
| people that standardized timestamp as local time.
| hot_gril wrote:
| I can see why they'd want to support local time zone
| formatting. Even setting to UTC is merely covering up the
| problem that if you have a string like "2023-09-29
| 09:26:43.927", that does not indicate a particular point in
| time any better than my friend across the world proposing a
| Zoom call "at 8PM today."
| masklinn wrote:
| > Even setting to UTC is merely covering up the problem
| that if you have a string like "2023-09-29 09:26:43.927",
| that does not indicate a particular point in time any
| better than my friend across the world proposing a Zoom
| call "at 8PM."
|
| If the string in question is known to be UTC, which it
| should be since there is no timezone associated, then it
| does indicate a particular point in time.
| hot_gril wrote:
| UTC is a time zone too. ISO standard "2023-09-29
| 09:26:43.927" isn't known to be UTC; you need a "Z" at
| the end for that. For whatever reason, the client doesn't
| ask the DB for its implied time zone either.
| If no UTC relation information is given with a time
| representation, the time is assumed to be in local time.
| If the time is in UTC, add a Z directly after the time
| without a space.
|
| The only way to totally avoid the concept of time zones
| is to use a Unix-like timestamp, 1696004800.927.
___________________________________________________________________
(page generated 2023-09-29 23:01 UTC)