[HN Gopher] Don't bring a tree to a mesh fight
___________________________________________________________________
Don't bring a tree to a mesh fight
Author : valand
Score : 26 points
Date : 2021-11-22 02:11 UTC (1 days ago)
(HTM) web link (valand.dev)
(TXT) w3m dump (valand.dev)
| dragontamer wrote:
| Don't bring a mesh / graph to a hypergraph fight. Hypergraphs
| seem to be the "biggest generalization" I've come across in the
| list -> tree -> graph -> hypergraph hierarchy. (That is: all
| lists are trees. All trees are graphs. All graphs are
| hypergraphs)
|
| In a graph, an "edge" connects two nodes. In hypergraphs, a
| "hyperedge" can connect 2-or-more nodes.
|
| -------
|
| You might think that hypergraphs are purely theoretical,
| except... they are in common usage all the time. You might also
| know hypergraphs by their alternative name: Relational Tables.
|
| Well, I'm being loose with the language last paragraph. A
| relational-table is really a hyper-edge. And the nodes of the
| hypergraphs are the domains of whatever you're working with.
|
| If your relational-table only has 2-columns, you have a normal
| graph relationship between the data. But as anyone who has worked
| with relational databases knows: you often have 3-columns,
| 4-columns or more to represent the data. Once you reach 3-columns
| / 4-columns or beyond, you no longer are working with "simple"
| graphs but instead hypergraphs, and the theoretical complexity of
| the whole system reaches a new level.
|
| -------- CREATE TABLE PersonsHyperedges (
| PersonID int, LastName varchar(255),
| FirstName varchar(255), Address varchar(255),
| City varchar(255) );
|
| You can imagine this Table as a list of hyperedges, connecting
| "PersonID", "LastName", "FirstName", "Address" and "City" nodes
| together.
|
| -------- CREATE TABLE PersonsGraphEdges (
| LastName varchar(255), FirstName varchar(255),
| );
|
| In this case, since there's only 2-elements in the "hyperedge",
| this table is a list of "normal graph-edges"
|
| -----------
|
| There are also things "in-between". Bipartite graphs are very
| common for example. The hierarchy fits as "tree -> bipartite
| graph -> graph" (all trees are bipartite graphs).
|
| If you have assurances that your problem you're working with is
| bipartite, then favor bipartite graph algorithms. There are a lot
| of NP-complete problems over graphs that are in P for bipartite-
| graphs
| brrrrrm wrote:
| A hypergraph can always be viewed as a bipartite graph
| dragontamer wrote:
| I'm not sure if you're correct or wrong? But its a strange
| statement to make.
|
| Like, any graph can have a "minimum spanning tree"
| representation for example. But the minimum-spanning tree has
| some degree of information loss compared to the original.
|
| Similarly: a graph / tree could have a Lexicographic ordering
| (ie: a list representation), but this list-representation
| lost a lot of information in the translation process.
|
| ---------
|
| It just turns out that a lot of algorithms could work on the
| minimum-spanning tree (or the lexicographic list) of the
| graph.
|
| So instead of working on the general graph of a problem, you
| simplify it down into a simpler structure, and THEN work on
| the simplified structure.
| m_a_g wrote:
| "Hypergraphs can be viewed as incidence structures. In
| particular, there is a bipartite 'incidence graph' or 'Levi
| graph' corresponding to every hypergraph, and conversely,
| most, but not all, bipartite graphs can be regarded as
| incidence graphs of hypergraphs."
| lmeyerov wrote:
| The duality is super useful in practice!
|
| In the table/dataframe -> hypergraph dataframe transform @
| https://github.com/graphistry/pygraphistry , we do
| `hypergraph(multicolumn_table, direct=True |
| False)['graph'].plot()` , which renders hypergraphs as a
| regular graph. That lets you pick. Saves a lot of
| wrangling!
|
| Consider exploring some logs of customer activity or
| security events. A hyperedge becomes either:
|
| - a node of a bipartite graph. Ex: each log event becomes a
| node connecting the various entity nodes it mentions. Event
| <> IPs, accounts, countries, ...
|
| - .. or a bunch of pairwise entity<>entity edges. Ex:
| connect each IP<>account<>country directly, and label each
| edge with the hyperedge event id it came from.
|
| In both cases, you can now directly leverage a lot of
| traditional graph thinking, and in our case, GPU
| acceleration & visualization.
|
| Other systems might render hyperedges as say circles
| encomposing their nodes, but that's trickier at even
| small/medium scales, so I haven't seen much widely popular
|
| I increasingly just directly equate 'logs' with 'property
| hypergraphs' and skip the relational step. Funny enough, a
| lot of our enterprise+gov work is undoing the weird tabular
| & taxonomy optimizations of SQL engines that leak into the
| user experience when we're to get simple 360 views for
| users. It's cool an org built out 10,000 tables, but.....
| :) Same thing when we're doing graph neural networks: logs
| => hypergraphs (encoded as bipartite or regular property
| graphs) => event embeddings / classifications / predictions
| .
| dragontamer wrote:
| > I increasingly just directly equate 'logs' with
| 'property hypergraphs' and skip the relational step.
| Funny enough, a lot of our enterprise+gov work is undoing
| the weird tabular optimizations of SQL engines to get 360
| views of data back to users. It's cool someone has 10,000
| tables, but..... :)
|
| The more I program, the more I realize that all these
| data-structures are conceptually the same if you squint
| enough.
|
| Graphs are matrices (Edge == 1. Non-edge == 0). Matrices
| are graphs (see sparse matricies, like COO and its really
| obvious). Computer Code is graphs, Databases are
| Hypergraphs. Circuits are hypergraphs. Etc. etc.
| Everything seems to convert into each other.
|
| The study of NP-completeness is the study of graph (or
| hypergraph) complexity. Chordal graphs and Bipartite
| Graphs seem to always be in P-complexity despite the
| generalized forms being NP-complete. (Hmmm, I think all
| Bipartite graphs are Chordal, giving us another notch in
| the hierarchy of representations)
|
| --------
|
| There are questions about which forms are easier to think
| about... both for the human and the computer. At least,
| for the equivalent forms (Relational Databases vs
| Hypergraph representations seem to be equivalent, though
| I'm not 100% sure)
|
| When one form is "less powerful" than another form, you
| often get significant improvements in speed. Lists are
| less powerful than trees, but almost all list algorithms
| operate way faster.
___________________________________________________________________
(page generated 2021-11-23 23:00 UTC)