[HN Gopher] Build C++ Graph Analytics Without Worrying About Memory
___________________________________________________________________
Build C++ Graph Analytics Without Worrying About Memory
Author : taubek
Score : 54 points
Date : 2022-10-06 13:49 UTC (9 hours ago)
(HTM) web link (memgraph.com)
(TXT) w3m dump (memgraph.com)
| worthless443 wrote:
| Automatic memory management is indeed the first thing one needs
| to look for when writing performance-critical software, and it's
| first on my checklist. But
|
| > in-memory storage of databases
|
| Doesn't it sound a bit expensive to need large-capacity memory?
| Although read/write I/O is far cheaper for in-memory analysis.
| Is such a trade-off worth it?
| mbuda wrote:
| Excellent observation/question :D It depends; sometimes, it's
| worth it, and sometimes it is not (as always with tradeoffs).
| Graphs are a bit specific because most of the traversals or
| expensive graph analytics like PageRank touch the whole graph
| (even multiple times) -> the entire graph will end up in memory
| -> why not keep it in memory for faster performance?
|
| But for a vast dataset, the hardware cost might be too much. I
| think we are aware of the tradeoff. We'll probably provide a
| disk-first storage option at some point because that's definitely
| a valid setup (sometimes the only possible setup). Of course,
| we'll invest time in making it as performant as possible.
|
| Do you have some specific workload in mind? :D
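
The point above, that analytics like PageRank touch the whole graph several times, can be illustrated with a minimal power-iteration sketch. This is not Memgraph's implementation or API: the `Graph` alias, the `PageRank` function, and the default damping factor of 0.85 are assumptions made for illustration. Every iteration walks every vertex and every edge, so a disk-backed graph would be read in full on each pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical in-memory graph: out_edges[v] lists the targets of v's
// outgoing edges. Targets are assumed to be valid vertex indices.
using Graph = std::vector<std::vector<std::size_t>>;

// Plain power-iteration PageRank (illustrative only).
// Each iteration visits every vertex and every edge exactly once.
std::vector<double> PageRank(const Graph& out_edges, int iterations,
                             double damping = 0.85) {
  const std::size_t n = out_edges.size();
  assert(n > 0 && iterations >= 0);
  std::vector<double> rank(n, 1.0 / n);
  for (int it = 0; it < iterations; ++it) {
    std::vector<double> next(n, (1.0 - damping) / n);
    double dangling = 0.0;  // rank held by vertices with no outgoing edges
    for (std::size_t v = 0; v < n; ++v) {
      if (out_edges[v].empty()) {
        dangling += rank[v];
        continue;
      }
      const double share = damping * rank[v] / out_edges[v].size();
      for (std::size_t target : out_edges[v]) next[target] += share;
    }
    // Spread the rank of dangling vertices evenly over all vertices.
    for (std::size_t v = 0; v < n; ++v) next[v] += damping * dangling / n;
    rank.swap(next);
  }
  return rank;
}
```

With 20 iterations the loop above reads the full edge list 20 times, which is the access pattern that makes keeping the graph in memory attractive.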
| worthless443 wrote:
| If a large graph needs to be read multiple times, then sure,
| memory bandwidth will give the best possible performance for a
| workload like PageRank (and further optimization of memory
| allocation and management will boost performance even more).
|
| So to my understanding (and a novice one at that), the graph
| should be stored on disk first and copied once into volatile
| memory when the objects are initialized. But my question is:
| memory regions are more likely to fault and become corrupted,
| and then the graph stored in memory is completely lost too
| (unless the results are saved to disk at specific intervals)?
| Does that make any sense?
| mbuda wrote:
| I'm not sure I understand the part about corruption. How
| would data in memory become corrupted?
|
| The way Memgraph currently works: it stores data in memory and
| asynchronously writes data to disk in small chunks called
| deltas; later these chunks are deleted and replaced with a
| whole-graph snapshot (there is also a sync option, but that's
| slower in terms of committing a transaction and letting the
| user know the data is written; e.g., RocksDB works similarly).
| All disk-related stuff is purely for durability (recovery
| after the Memgraph process restarts), and all interactions
| with the disk are made automatically in the background during
| standard system runtime and startup time.
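
The delta-plus-snapshot scheme described above can be sketched roughly as follows. This is a simplified illustration, not Memgraph's code: `FakeDisk`, `Store`, `Put`, `TakeSnapshot`, and `RestartDemo` are hypothetical names, the "disk" is an in-memory stand-in, and writes here are synchronous, whereas the comment describes asynchronous writes by default.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for durable storage; a real system would use files.
struct FakeDisk {
  std::map<int, std::string> snapshot;               // last full copy
  std::vector<std::pair<int, std::string>> deltas;   // changes since then
};

class Store {
 public:
  // Recovery on startup: load the snapshot, then replay deltas in order.
  explicit Store(FakeDisk& disk) : disk_(disk), data_(disk.snapshot) {
    for (const auto& d : disk_.deltas) data_[d.first] = d.second;
  }

  void Put(int key, const std::string& value) {
    data_[key] = value;                     // in-memory state serves reads
    disk_.deltas.emplace_back(key, value);  // small record kept for durability
  }

  void TakeSnapshot() {
    disk_.snapshot = data_;  // write the whole state
    disk_.deltas.clear();    // then drop the deltas the snapshot covers
  }

  const std::map<int, std::string>& data() const { return data_; }

 private:
  FakeDisk& disk_;
  std::map<int, std::string> data_;
};

// Usage: simulate a process restart and check that the state comes back.
void RestartDemo() {
  FakeDisk disk;
  {
    Store store(disk);
    store.Put(1, "a");
    store.TakeSnapshot();
    store.Put(2, "b");  // exists only as a delta at this point
  }                     // "process exits": the in-memory state is gone
  Store recovered(disk);
  assert(recovered.data().size() == 2);
  assert(recovered.data().at(2) == "b");
}
```

Reads never touch the disk stand-in in this sketch; it is consulted only in the constructor, which mirrors the statement that disk is used purely for recovery.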
| worthless443 wrote:
| > it stores data in memory, and async starts writing data
| to disk in small data chunks called deltas, later these
| chunks are deleted and replaced with the whole graph
| snapshot
|
| Thanks, that answers my question about the recoverability of
| in-memory graphs.
| mbuda wrote:
| Perfect!
| timmy777 wrote:
| Awesome. But how is this different from dgraph?
| mbuda wrote:
| If you are asking about Memgraph in general, overall it's a
| graph storage + analytics system. Dgraph is probably more on
| the pure storage side, while Memgraph is more about graph
| analytics (in-memory graph storage, but it also stores data on
| disk). In terms of the API, Dgraph exposes GraphQL, while
| Memgraph is Cypher + Bolt protocol. There is much more; which
| aspect are you most interested in? :D
| mbuda wrote:
| Is there any interest in a detailed comparison between C++ and
| Rust when it comes to the different tradeoffs of
| implementing/using the query modules?
| ncmncm wrote:
| Differences would be about staffing. For any given specialty,
| having C++ skills too is common.
|
| Finding somebody with the needed skills and also Rust experience
| will be impossible, so you would either need to plan on
| training up some Ruster on the specialty, or hire somebody
| already up on it with C++ skills and expect them to pick up
| enough Rust to get by.
| mbuda wrote:
| Yep, from the business perspective that's by far the biggest
| concern :D
___________________________________________________________________
(page generated 2022-10-06 23:01 UTC)