[HN Gopher] Don't bring a tree to a mesh fight
       ___________________________________________________________________
        
       Don't bring a tree to a mesh fight
        
       Author : valand
       Score  : 26 points
       Date   : 2021-11-22 02:11 UTC (1 days ago)
        
 (HTM) web link (valand.dev)
 (TXT) w3m dump (valand.dev)
        
       | dragontamer wrote:
       | Don't bring a mesh / graph to a hypergraph fight. Hypergraphs
       | seem to be the "biggest generalization" I've come across in the
       | list -> tree -> graph -> hypergraph hierarchy. (That is: all
       | lists are trees. All trees are graphs. All graphs are
       | hypergraphs)
       | 
       | In a graph, an "edge" connects two nodes. In hypergraphs, a
       | "hyperedge" can connect 2-or-more nodes.
       | 
       | -------
       | 
       | You might think that hypergraphs are purely theoretical,
       | except... they are in common usage all the time. You might also
       | know hypergraphs by their alternative name: Relational Tables.
       | 
       | Well, I'm being loose with the language last paragraph. A
       | relational-table is really a hyper-edge. And the nodes of the
       | hypergraphs are the domains of whatever you're working with.
       | 
       | If your relational-table only has 2-columns, you have a normal
       | graph relationship between the data. But as anyone who has worked
       | with relational databases knows: you often have 3-columns,
       | 4-columns or more to represent the data. Once you reach 3-columns
       | / 4-columns or beyond, you no longer are working with "simple"
       | graphs but instead hypergraphs, and the theoretical complexity of
       | the whole system reaches a new level.
       | 
       | --------                   CREATE TABLE PersonsHyperedges (
       | PersonID int,             LastName varchar(255),
       | FirstName varchar(255),             Address varchar(255),
       | City varchar(255)         );
       | 
       | You can imagine this Table as a list of hyperedges, connecting
       | "PersonID", "LastName", "FirstName", "Address" and "City" nodes
       | together.
       | 
       | --------                   CREATE TABLE PersonsGraphEdges (
       | LastName varchar(255),             FirstName varchar(255),
       | );
       | 
       | In this case, since there's only 2-elements in the "hyperedge",
       | this table is a list of "normal graph-edges"
       | 
       | -----------
       | 
       | There are also things "in-between". Bipartite graphs are very
       | common for example. The hierarchy fits as "tree -> bipartite
       | graph -> graph" (all trees are bipartite graphs).
       | 
       | If you have assurances that your problem you're working with is
       | bipartite, then favor bipartite graph algorithms. There are a lot
       | of NP-complete problems over graphs that are in P for bipartite-
       | graphs
        
         | brrrrrm wrote:
         | A hypergraph can always be viewed as a bipartite graph
        
           | dragontamer wrote:
           | I'm not sure if you're correct or wrong? But its a strange
           | statement to make.
           | 
           | Like, any graph can have a "minimum spanning tree"
           | representation for example. But the minimum-spanning tree has
           | some degree of information loss compared to the original.
           | 
           | Similarly: a graph / tree could have a Lexicographic ordering
           | (ie: a list representation), but this list-representation
           | lost a lot of information in the translation process.
           | 
           | ---------
           | 
           | It just turns out that a lot of algorithms could work on the
           | minimum-spanning tree (or the lexicographic list) of the
           | graph.
           | 
           | So instead of working on the general graph of a problem, you
           | simplify it down into a simpler structure, and THEN work on
           | the simplified structure.
        
             | m_a_g wrote:
             | "Hypergraphs can be viewed as incidence structures. In
             | particular, there is a bipartite 'incidence graph' or 'Levi
             | graph' corresponding to every hypergraph, and conversely,
             | most, but not all, bipartite graphs can be regarded as
             | incidence graphs of hypergraphs."
        
             | lmeyerov wrote:
             | The duality is super useful in practice!
             | 
             | In the table/dataframe -> hypergraph dataframe transform @
             | https://github.com/graphistry/pygraphistry , we do
             | `hypergraph(multicolumn_table, direct=True |
             | False)['graph'].plot()` , which renders hypergraphs as a
             | regular graph. That lets you pick. Saves a lot of
             | wrangling!
             | 
             | Consider exploring some logs of customer activity or
             | security events. A hyperedge becomes either:
             | 
             | - a node of a bipartite graph. Ex: each log event becomes a
             | node connecting the various entity nodes it mentions. Event
             | <> IPs, accounts, countries, ...
             | 
             | - .. or a bunch of pairwise entity<>entity edges. Ex:
             | connect each IP<>account<>country directly, and label each
             | edge with the hyperedge event id it came from.
             | 
             | In both cases, you can now directly leverage a lot of
             | traditional graph thinking, and in our case, GPU
             | acceleration & visualization.
             | 
             | Other systems might render hyperedges as say circles
             | encomposing their nodes, but that's trickier at even
             | small/medium scales, so I haven't seen much widely popular
             | 
             | I increasingly just directly equate 'logs' with 'property
             | hypergraphs' and skip the relational step. Funny enough, a
             | lot of our enterprise+gov work is undoing the weird tabular
             | & taxonomy optimizations of SQL engines that leak into the
             | user experience when we're to get simple 360 views for
             | users. It's cool an org built out 10,000 tables, but.....
             | :) Same thing when we're doing graph neural networks: logs
             | => hypergraphs (encoded as bipartite or regular property
             | graphs) => event embeddings / classifications / predictions
             | .
        
               | dragontamer wrote:
               | > I increasingly just directly equate 'logs' with
               | 'property hypergraphs' and skip the relational step.
               | Funny enough, a lot of our enterprise+gov work is undoing
               | the weird tabular optimizations of SQL engines to get 360
               | views of data back to users. It's cool someone has 10,000
               | tables, but..... :)
               | 
               | The more I program, the more I realize that all these
               | data-structures are conceptually the same if you squint
               | enough.
               | 
               | Graphs are matrices (Edge == 1. Non-edge == 0). Matrices
               | are graphs (see sparse matricies, like COO and its really
               | obvious). Computer Code is graphs, Databases are
               | Hypergraphs. Circuits are hypergraphs. Etc. etc.
               | Everything seems to convert into each other.
               | 
               | The study of NP-completeness is the study of graph (or
               | hypergraph) complexity. Chordal graphs and Bipartite
               | Graphs seem to always be in P-complexity despite the
               | generalized forms being NP-complete. (Hmmm, I think all
               | Bipartite graphs are Chordal, giving us another notch in
               | the hierarchy of representations)
               | 
               | --------
               | 
               | There are questions about which forms are easier to think
               | about... both for the human and the computer. At least,
               | for the equivalent forms (Relational Databases vs
               | Hypergraph representations seem to be equivalent, though
               | I'm not 100% sure)
               | 
               | When one form is "less powerful" than another form, you
               | often get significant improvements in speed. Lists are
               | less powerful than trees, but almost all list algorithms
               | operate way faster.
        
       ___________________________________________________________________
       (page generated 2021-11-23 23:00 UTC)