[HN Gopher] PostgreSQL and Its Annoying Crosstab
___________________________________________________________________
PostgreSQL and Its Annoying Crosstab
Author : leononame
Score : 40 points
Date : 2024-04-04 10:24 UTC (2 days ago)
(HTM) web link (blog.aurelianix.com)
(TXT) w3m dump (blog.aurelianix.com)
| mbork_pl wrote:
| I admit that I only skimmed the article, but I'm wondering if
| psql's \crosstabview would help?
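|
| For anyone unfamiliar, a rough sketch of how it's used (the
| table and column names below are placeholders, not from the
| article): run the query without a trailing semicolon and let
| the meta-command pivot the result client-side:
|
|     SELECT column_name, meta_key, value FROM tall_table
|     \crosstabview column_name meta_key value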
| sgarman wrote:
| OP says it needs to be a function in SQL, and backslash commands
| are psql commands, not queries. It probably would not suit their
| case.
| leononame wrote:
| Yes, that's exactly right. Other requirements around this
| whole thing force us to do this in a function and not through
| psql. We've actually been thinking about using
| pl{perl,python,rust} instead because this approach works, but
| is annoying.
| nicholasjon wrote:
| Just ran into the fiddliness of crosstab this week while
| attempting to extend an existing query that used it. I ended up
| converting the query to use FILTER clauses instead, which was a
| failure on my part to get through that fiddliness -- but I'm
| consoling myself by saying it's a little more explicit this way
| for the next person. :)
| TheRealDunkirk wrote:
| AND the function is hard-limited to 1600 columns. Don't ask me
| how I know; you already know.
| account-5 wrote:
| I don't know PostgreSQL, but is it worse than SQLite? MS Access,
| on the other hand, is easy!
| to11mtm wrote:
| IIRC PostgreSQL is the preferred 'reference' for when features
| are added to SQLite.
|
| It's a different sort of beast than SQLite/Access.
|
| SQLite, Access, and FirebirdDB in 'local' mode are examples of
| 'file based databases' (Access of course is normally used via
| the application, but you can absolutely get to it in other
| ways.)
|
| Postgres, MySQL/MariaDB, Microsoft SQL Server (MSSQL), Oracle,
| and FirebirdDB in 'client/server' mode involve having a
| dedicated server (or cluster of servers) that store the data
| and run the actual SQL logic.
|
| Hope that helps?
| account-5 wrote:
| Haha, probably should have worded my previous post better;
| karma taking a pounding due to that, I think.
|
| I meant in reference to Crosstab query, not the databases
| themselves.
| mjevans wrote:
| https://www.postgresql.org/docs/current/tablefunc.html#TABLE...
|
| Given what this function is doing, a limit on the output seems
| reasonable. Others have mentioned 1600 columns, which seems like
| a bit of a strange number, but might correlate internally to a
| maximum tuple of pointers to the source data.
|
| In a quick skim of various searches I don't see any good
| alternatives to crosstab (AKA pivot) for such monstrously large
| use cases. Nearly all the toy examples include months, or
| possibly days where the aggregate inputs collect summary
| statistics. At 1600+ columns, it's probably better to re-process
| the data and convert it from tabular to columnar format in
| whatever way best fits the particular data.
| bionhoward wrote:
| Maybe you could use column_name, dtype, min, and max columns and
| sidestep the nulls, but you'd have to convert "min" and "max" to
| text so they're all the same type?
|
| You could also split the table by dtype outside of SQL to ditch
| the null values without converting the non-null ranges to text,
| but then your data is spread across multiple tables -- maybe it's
| inconvenient to break the table schema into chunks?
| makmanalp wrote:
| In a similar vein, here is my very similar rant and a teeny tiny
| SQLAlchemy plugin that deals with the grossness a bit if you're
| using Python. I'm not entirely sure if it still works; I was
| floored to realize I'd written this a decade ago! But the
| SQLAlchemy API tends to be quite stable.
|
| https://github.com/makmanalp/sqlalchemy-crosstab-postgresql
| tommasoamici wrote:
| Thanks for your work on that!
|
| I recently implemented a pivot table widget at work and it was
| mostly based on your code.
|
| It was a while ago, but I remember it worked with very few
| changes.
| wswope wrote:
| In case it saves anyone a headache: Pivoting with Postgres' JSON
| features (json_build_object and json_object_agg, specifically) is
| usually less difficult than crosstab IME.
|
| I've used it for exactly this same sorta metadata logging.
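|
| Roughly along these lines, to sketch the idea (table and key
| names here are made up for the example):
|
|     -- Fold each column_name's metrics into one JSON object,
|     -- then pull the interesting keys back out as columns.
|     SELECT column_name,
|            metrics ->> 'total_rows'     AS total_rows,
|            metrics ->> 'not_null_count' AS not_null_count
|     FROM (SELECT column_name,
|                  json_object_agg(meta_key, value) AS metrics
|           FROM tall_table
|           GROUP BY column_name) s;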
| singingfish wrote:
| Interesting, thanks! On a similar but unrelated note, I think
| instead of MERGE INTOs, self-updating CTEs are a better bet for
| similar reasons.
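|
| Something like the sketch below, i.e. a data-modifying CTE
| doing the update and a follow-up insert for the rows it missed
| (table and values are illustrative only, and it's not
| concurrency-safe without extra care):
|
|     WITH updated AS (
|         UPDATE metrics m
|            SET value = v.value
|           FROM (VALUES ('total_rows', '42'))
|                AS v(meta_key, value)
|          WHERE m.meta_key = v.meta_key
|      RETURNING m.meta_key
|     )
|     INSERT INTO metrics (meta_key, value)
|     SELECT v.meta_key, v.value
|     FROM (VALUES ('total_rows', '42')) AS v(meta_key, value)
|     WHERE v.meta_key NOT IN (SELECT meta_key FROM updated);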
| kanobi wrote:
| Whenever I needed to do a pivot in Postgres, I used the approach
| described in this Stack Overflow answer:
|
| https://stackoverflow.com/questions/20618323/create-a-pivot-...
|
| So for example, when you have a table like the one described
| (column_name, meta_key, value), you would create a query like
| this:
|
|     SELECT column_name,
|       MAX(CASE WHEN meta_key = 'total_rows'
|                THEN value ELSE NULL END) AS total_rows,
|       MAX(CASE WHEN meta_key = 'not_null_count'
|                THEN value ELSE NULL END) AS not_null_count
|       -- ...one MAX(CASE ...) per remaining metric
|     FROM tall_table
|     GROUP BY 1
|
| (edit: formatting)
| jasonpbecker wrote:
| Most of the time I'm using `filter ... where` for cases like
| these... for example:
|
|     SELECT column_name,
|       MAX(value) FILTER (WHERE meta_key = 'total_rows')
|         AS total_rows,
|       MAX(value) FILTER (WHERE meta_key = 'not_null_count')
|         AS not_null_count,
|       ROUND(SUM(amount_in_cents) FILTER (WHERE
|         EXTRACT(MONTH FROM TIMESTAMP '2006-01-01 03:04:05') = 1
|       ) / 100.0, 2) AS january_sub_total
|     FROM tall_table
|     GROUP BY column_name
| ttfkam wrote:
| Someday Postgres will get support for PIVOT/UNPIVOT, but today is
| sadly not that day.
|
| https://learn.microsoft.com/en-us/sql/t-sql/queries/from-usi...
|
| https://blogs.oracle.com/sql/post/how-to-convert-rows-to-col...
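|
| For reference, a sketch of what the T-SQL flavour from the
| first link looks like, reusing the made-up tall_table example
| from upthread:
|
|     SELECT column_name, [total_rows], [not_null_count]
|     FROM tall_table
|     PIVOT (
|         MAX(value) FOR meta_key IN ([total_rows], [not_null_count])
|     ) AS p;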
___________________________________________________________________
(page generated 2024-04-06 23:00 UTC)