[HN Gopher] Possibly all the ways to get loop-finding in graphs ...
___________________________________________________________________
Possibly all the ways to get loop-finding in graphs wrong
Author : matt_d
Score : 188 points
Date : 2024-09-11 05:47 UTC (17 hours ago)
(HTM) web link (www.chiark.greenend.org.uk)
(TXT) w3m dump (www.chiark.greenend.org.uk)
| wonnage wrote:
| Any time you're looking for a graph algorithm there's a pretty
| good chance Tarjan's name is on it
| 082349872349872 wrote:
| Tarjan's SCC is elegant with a heavily imperative flavour; has
| anyone come up with an elegantly pure (yet still linear)
| functional variation?
|
| EDIT: looks like Okasaki has a linear-time functional; I'm
| pragmatic enough that 2 passes are (even small constant passes
| would be) fine for me -- my graphs are all tiny enough that I'm
| not about to store them on tape.
|
| EDIT2: my bad, not Okasaki himself, King & Launchbury:
| https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&d...
| scc' :: Graph -> Forest Vertex scc' g = dfs g (reverse
| (postOrd (transposeG g)))
|
| (but their sol'n requires laziness to avoid generating an
| infinite intermediate; has anyone done it linearly,
| functionally, and eagerly?)
| n4r9 wrote:
| I'm not too deep into CS theory so I'm not sure if this is an
| answer to your question but... SCC algorithms are useful when
| building routeable road networks from source data (like OSM)
| in order to filter out isolated sub-networks (like an
| airfield). We use dotnet but have something that looks
| possibly similar to what you wrote(?):
| StronglyConnectedVertices(Vertex source) => ReachableVertices
| (source).Intersect(ReverseReachableVertices(source))
|
| The methods ReachableVertices and ReverseReachableVertices
| are effectively depth-first searches on the graph.
|
| The challenge - in terms of performance - is actually not
| finding a single SCC but returning the full set of
| potentially thousands SCCs in large networks. I originally
| implemented an iterative (imperative) version of Gabow's
| "Path-Based" algorithm but it was too slow on continent-sized
| datasets. We did some experimentation with a parallel variant
| of Tarjan which didn't improve things much.
|
| Eventually I settled for something "good enough" - randomly
| pick a vertex, calculate the SCC with the above code, return
| it if it's large enough (>1k vertices) otherwise filter it
| out, keep going until I've accounted for >90% of the graph.
| The remaining <10% gets filtered out. That's basically
| imperative (using a while loop) but could probably be
| functionalised.
|
| -----------------------------------------------------
|
| EDIT I've taken a better look at the linked paper and what I
| wrote is _definitely_ not an answer to your question. But I
| 'll leave it in case anyone's interested.
| Hendrikto wrote:
| I think it is strange that there is no mention of either time or
| space complexity, in an article about graph algorithms.
|
| If you don't need to be efficient, any problem is relatively easy
| to solve.
| 082349872349872 wrote:
| If you're not yet correct, complexity is moot.
|
| > _When in doubt, use brute force._ --KLT
| xelxebar wrote:
| First time I've heard of the disjoint-set data structure. For a
| DSF implementation, Wikipedia [0] gives the wildest asymptotic
| complexity I've seen, using the _inverse Ackermann function_!
| "Almost constant time" indeed.
|
| [0]:https://en.wikipedia.org/wiki/Disjoint-set_data_structure
| kragen wrote:
| more commonly it's called union-find; it came up on here last
| thursday: https://news.ycombinator.com/item?id=41450948
| shepherdjerred wrote:
| Disjoint sets are awesome! You can use them with a _very_
| simple algorithm to form a minimum spanning tree.
|
| https://en.wikipedia.org/wiki/Kruskal%27s_algorithm
| kevinventullo wrote:
| What's especially surprising to me is that inverse Ackerman is
| provably optimal for this problem, regardless of
| implementation.
| tromp wrote:
| The problem of finding cycles in (pseudo randomly generated)
| graphs is actually one of the first examples of an _asymmetric_
| Proof-of-Work puzzle [1]. Asymmetry means that while finding a
| cycle takes a lot of memory and effort, a solution can be
| verified instantly.
|
| Several of the suggested algorithms were in fact implemented in
| the various Cuckoo Cycle solvers [2]. But by maintaining directed
| paths, my union-find based algorithm _did_ identify the entire
| cycle and not just the cycle completing edge [1] (implemented in
| [3]).
|
| This algorithm suffers from three problems though. One, it only
| finds one among a set of interconnected cycles. Second, it's
| rather slow, being latency bound by the random memory accesses.
| Third, it uses much more memory than needed.
|
| So repeatedly trimming edges with only one endpoint is a much
| faster approach. And for the random graphs that the PoW
| generates, it can be done using only 1 bit per edge to record if
| it's been trimmed.
|
| Once sufficiently trimmed, a loop tracing algorithm (like [4] for
| the Cuckatoo Cycle variant) can find _all_ remaining (simple)
| cycles.
|
| [1] https://www.semanticscholar.org/paper/Cuckoo-Cycle%3A-A-
| Memo...
|
| [2] https://github.com/tromp/cuckoo
|
| [3]
| https://github.com/tromp/cuckoo/blob/master/src/cuckoo/cycle...
|
| [4]
| https://github.com/tromp/cuckoo/blob/master/src/cuckatoo/gra...
| pxeger1 wrote:
| I initially took issue with "first asymmetric proof of work",
| thinking that in practice any NP-complete problem would surely
| be fine, even if we can't prove that it can't be solved
| quickly. But is it difficult to find NP-complete problems which
| can't also be quickly approximated? And I suppose it might also
| be hard to find NP-complete problems in a "Goldilocks zone" of
| difficulty to make for practical proofs of work, since their
| complexity increases so sharply.
| JohnKemeny wrote:
| Approximation doesn't help, since you can require optimal
| solutions, but yes, it's very difficult to come up with hard
| instances. After all, you need to know the correct answer, so
| you can't just randomly generate instances. In addition, SAT
| solvers are ridiculously good these days.
| tromp wrote:
| For Proof-of-Work applications, at least in blockchains, you
| need puzzles that
|
| * can be pseudo randomly generated from a seed (usually the
| hash of a partial blockchain header)
|
| * have a known optimal solution method with a (nearly)
| constant running time that's at most about one second (for
| progress freeness this should be a small fraction of the
| target block time)
|
| NP-hard problems rarely satisfy both these constraints.
| Finding fixed length cycles in random (bipartite) graphs with
| billions of edges does.
| JohnKemeny wrote:
| After trimming, you can also branch on edges: either it is in a
| cycle, in which you find one, otherwise you can remove it and
| further reduce the instance by trimming more edges.
| JohnVideogames wrote:
| I had to tackle a problem much like this one when analysing the
| polygons in planar graphs (analysis of planar chemical networks).
| A fun approach I found was to use a Delaunay triangulation (the
| dual of a Voronoi diagram; draw the Voronoi polygons and join
| their centres if they share an edge). Then join the triangles
| into larger polygons by removing edges that are in the Delaunay
| triangulation but not in your original planar graph. Once you've
| got no more edges to remove you've got all the polygons in the
| graph and convenient triangles for rendering / analysis.
|
| For planar graphs on the surface of a torus, like they faced
| here, I just ended up adding extra images of the central unit
| cell, and then removing the excess. Not elegant (lots of weird
| corner cases) but convenient for visualisation.
| JohnKemeny wrote:
| Can you expand on what you wanted to analyze? Why didn't you
| just output the polygons immediately?
| gilleain wrote:
| Chemistry does ring finding in various ways. I've always liked
| the SSSR (smallest set of smallest rings) just for the name :)
|
| https://depth-first.com/articles/2020/08/31/a-smallest-set-o...
| slashdave wrote:
| Unfortunately, as noted in your link, the SSSR solution is
| not unique. Also unfortunate, Daylight (the inventor of
| SMILES) added SSSR as an optional atom identifier in their
| SMARTS query specification without realizing this. Because
| OpenEye doesn't like SSSR, they actually leave out this
| feature in their implement of SMARTS, breaking from standard.
|
| https://daylight.com/dayhtml/doc/theory/theory.smarts.html
| yohannesk wrote:
| If this article triggers a desire to work on an actual problem
| with graphs try https://leetcode.com/problems/freedom-trail
| devit wrote:
| Seems like a trivial problem to me.
|
| If the graph is directed, do an SCC decomposition in linear time
| using a graph library and then any SCC with size more than one
| has at least a loop, trivially extracted by following any edges
| in the SCC.
|
| If the graph is undirected, compute a spanning tree in linear
| time using a graph library and then any edge outside the tree
| forms a loop, again trivially extractable from the edge and the
| paths from its endpoints to their least common ancestor in the
| tree.
|
| Since those algorithms are already asymptotically optimal, it's
| just an engineering problem of finding the solution with the best
| constant factor given the precise data structures and goal in
| question.
| zeroonetwothree wrote:
| The most obvious way is to just run a DFS, if you ever come
| back to a node you've seen then you have a cycle (and by
| storing parent pointers you can get the cycle back). That's
| like CS 101 level.
|
| However in the article it's actually a graph being modified. If
| you don't want to run a full dfs every time, there are better
| ways. This problem is called "online cycle detection" and you
| can google some other algorithms.
| layer8 wrote:
| You obviously didn't read the article. It's about undirected
| graphs, and about visualizing the loops, not just testing for
| their existence. The final solution is indeed based on spanning
| trees.
| zeroonetwothree wrote:
| This problem is called "online cycle detection" in the
| literature. You can search for various better algorithms that
| have been developed.
| layer8 wrote:
| The article isn't interested in the "online" aspect, but in
| finding all (what turns out to be called) non-bridge edges, in
| order to visualize the loops.
| konstantinua00 wrote:
| it is interested in both
| karmakaze wrote:
| I've always thought Floyd's tortoise and hare[0] algorithm was
| handy as it can find a cycle in a DAG with low memory usage.
|
| [0]
| https://en.wikipedia.org/wiki/Cycle_detection#Floyd's_tortoi...
| konstantinua00 wrote:
| how does it work on DAG?
|
| I always thought it's a linked list algo
| tejtm wrote:
| By never finding a cycle, it proves the graph is a DAG.
| jerf wrote:
| But you have to run it on a sequence, or perhaps even more
| accurately, something that is putatively a sequence but may
| have an infinite loop in it. Tortoise & hare requires an
| unambiguous "next" function, which a graph does not have.
| Modifying it to not require that so it can run on a graph
| may be possible but that would constitute a different
| algorithm for sure, with different complexity analysis,
| etc.
| pavon wrote:
| It is useful for determining whether a given traversal of a
| graph has encountered a cycle, rather than determining
| whether the entire graph has any cycles. So different from
| what the article was about, but still a nice tool to have in
| the box.
| water-data-dude wrote:
| Here is where a smartass might point out that by definition you
| won't find any cycles in your DAG
| pavon wrote:
| It's short for Directed Accidentally-cyclic Graph.
| vouwfietsman wrote:
| Funny I've had to do this occasionally and I always do the
| following: for each edge E, attempt to construct a clockwise loop
| from that edge by traversing each vertex and picking the edge
| that is "most clockwise" compared to the incoming edge. Converse
| for counter clockwise. This gives you, for each edge, the two
| loops the edge is a part of (if they exist).
|
| It sounds horribly slow, but isn't because the loop finding
| process for one edge can mark all edges it traverses as either
| being on the same loop, or not having a loop, for the orientation
| its checking. These edges can be skipped henceforth.
|
| This does not find all loops, only the smallest disjoint loops
| (as every edge is at most part of two disjoint loops). But
| basically results in a similar planar "face" partition the author
| describes.
|
| Its not clear to me whether my approach does not work (perhaps
| the angle compares fail on a torus? I've never had to deal with
| torii), or simply isn't suited because it misses some loops, but
| I thought I'd share it anyway :-)
| cvoss wrote:
| Notwithstanding the limitation of assuming the graph is
| embedded in a coordinate space where you can try to measure
| angles, I think this algorithm still doesn't work.
|
| I may be missing a detail, but it seems to fail on a graph
| consisting of a loop with some extra edges branching off the
| loop toward its interior and some edges branching off toward
| the exterior. The left hand walk hits a dead end in one branch,
| and the right hand walk dead ends in the other, no matter where
| on the loop you start (except for some special cases).
|
| Edit: strengthening the proposed counterexample
| vouwfietsman wrote:
| Hmm I'm having a hard time visualizing your counter example,
| I'll try to think of it some more. An important detail I left
| out though is that in my problem sets edges do not cross,
| which I suppose simplifies the entire problem a lot :-)
| mrgoldenbrown wrote:
| I think you are trying to say your method only works if the
| graph is "planar", ie it can be drawn on a flat piece of
| paper with no lines crossing.
| vouwfietsman wrote:
| Indeed, thanks for the clarification!
| eigenket wrote:
| If I understand your algorithm correctly I think it fails
| on this graph (drawn on my phone, using an online
| implementation of Paint).
|
| https://imgur.com/a/ihPrwrX
|
| I think this is roughly the counterexample the person above
| was suggesting.
| konstantinua00 wrote:
| how is it different from the "sidewalk" algo in the article?
|
| you'd get the same pitfall of torus topology with 2
| perpendicular loops
| penguin_booze wrote:
| I'm curious: If I were to share the same link again now, I'll be
| forwarded to the latest submission instead. Interestingly, this
| link was shared twice in the past few days (click the "past" link
| above - and, yes, one of the submitters was I). How, then, was it
| that neither of the recent submitters, including myself, were
| forwarded to the latest at the time, but instead new submissions
| were created?
| sim7c00 wrote:
| one of the more interesting reads ive had on here so firstly
| thanks for that! love all the explanations of the intermediate
| solutions. ive been working lately with graphs to deduce network
| topologies and finding loops is one of my problems. excited to
| learn about the final answer, and happy to see it involves
| spanning tree :') kind of saves me reading through how that
| protocol works (or should work) so i can get on with implementing
| it! saving me likely all sorts of wrong solutions <3 thanks!
___________________________________________________________________
(page generated 2024-09-11 23:00 UTC)