[HN Gopher] Data Fabric vs. Data Mesh: What's the Difference?
___________________________________________________________________
Data Fabric vs. Data Mesh: What's the Difference?
Author : electrum
Score : 53 points
Date : 2021-11-18 16:55 UTC (6 hours ago)
(HTM) web link (blog.starburst.io)
(TXT) w3m dump (blog.starburst.io)
| ekzhu wrote:
| There is no reason for both approaches to not coexist: a
| centralized catalog managed by a small team, setting the "gold
| standard" for the many decentralized data producers and curators,
| who are incentivized to maximize their impacts (i.e., usage) by
| having higher quality data following the standard.
|
| Another thing to point out: besides relying on the future
| promises of ML, there are already many signals that can be used
| by a centralized catalog for data discovery. For example: data
| sketches (MinHash, HyperLogLog) for joinable datasets, social
| signals (likes, comments, stars, etc.; see Alation and Select Star
| SQL), lineages through data movements (e.g., Azure Data Factory
| and Azure Purview). If the centralized catalog uses those
| signals, then the data producers are incentivized to provide them
| for better visibility.
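
A minimal sketch of how the MinHash signal mentioned above could flag joinable columns, assuming a catalog stores one fixed-size signature per column and compares signatures instead of raw data. The function names (`minhash_signature`, `estimate_jaccard`) and the choice of 128 hash functions are hypothetical, picked for illustration; they are not taken from any product named in the thread.

```python
import hashlib


def minhash_signature(values, num_perm=128):
    """Build a MinHash signature for a column of values.

    For each of num_perm salted hash functions, keep the smallest hash
    seen over the distinct values. Values are normalized with str(), so
    1 and "1" are treated as the same key (an assumption of this sketch).
    """
    distinct = {str(v).encode("utf-8") for v in values}
    if not distinct:
        raise ValueError("cannot sketch an empty column")
    signature = []
    for seed in range(num_perm):
        # A different salt per seed gives an independent 64-bit hash function.
        salt = seed.to_bytes(8, "little")
        signature.append(
            min(
                int.from_bytes(
                    hashlib.blake2b(v, digest_size=8, salt=salt).digest(),
                    "little",
                )
                for v in distinct
            )
        )
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching positions."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    # Hypothetical key columns from two datasets sharing half their keys.
    orders_customer_id = range(0, 1000)
    customers_id = range(500, 1500)
    score = estimate_jaccard(
        minhash_signature(orders_customer_id),
        minhash_signature(customers_id),
    )
    print(f"estimated Jaccard similarity: {score:.2f}")
```

The estimate is approximate: its error shrinks roughly with the square root of the number of hash functions, so a catalog would trade signature size against accuracy. A high score only suggests that two columns share many values; whether the join is meaningful still needs a human or further metadata.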
| abadid wrote:
| I agree that in theory they could both co-exist for the reasons
| you state, but in practice I think it's unlikely a company that
| invests in a data fabric (which is largely a technology cost)
| is going to simultaneously invest in the incentives for the
| data product creators that are necessary for the data mesh not
| to become the wild west.
| chrisjc wrote:
| I had never heard of Data Fabric before. Now that I have, I'm
| not sure they can exist without each other. In fact I would
| imagine that the metadata accumulated through the data
| fabric would/could end up driving the data mesh implementation.
|
| Perhaps apps and services will end up having to go through
| data-coverage and data-quality verification steps before being
| released. Analytics (and caching, joins, etc.) as an
| afterthought is unacceptable in this day and age.
| nerdponx wrote:
| Maybe it's good to think of "fabric", "mesh", "warehouse", and
| "lake" as design patterns for data.
| oconnore wrote:
| This reminds me of David Mindell's work on situated autonomy:
| https://www.youtube.com/watch?v=M0-tafxh7gc
| https://www.robotics.org/userAssets/riaUploads/file/20-OurRo...
|
| > The highest levels of technology are not necessarily full
| autonomy, but situated autonomy
|
| > All autonomous systems are joint human-machine cognitive
| systems
|
| Fundamental questions: Where are the people? Which people are
| they? What are they doing? When are they doing it?
| claytonjy wrote:
| So Data Fabric is a bolt-on solution that we can't even honestly
| attempt today, while Data Mesh requires everyone in the
| engineering org to embrace data products?
|
| Sounds like startups might adopt Data Mesh, while it's easy to
| reorient org-level behavior, but big enterprises are doomed to
| carry forward their current messes until AI magically delivers us
| Data Fabric as a viable option.
|
| Anyone having more success with these approaches than my
| pessimistic take implies? Is it easier to adopt Data Mesh in a
| large org than I realize? Is Data Fabric a more viable option
| than the author considers?
| justicezyx wrote:
| A surprisingly well-organized and meaningful description of two
| marketing concepts; it seems targeted at CXOs making IT
| solution buying decisions.
| DrBenCarson wrote:
| These are two buzzwords that I still, for whatever reason, feel
| no momentum for.
| recursive wrote:
| So is data lake over now?
___________________________________________________________________
(page generated 2021-11-18 23:03 UTC)