[HN Gopher] Data Mesh Architecture
___________________________________________________________________
Data Mesh Architecture
Author : aiobe
Score : 71 points
Date : 2022-03-18 12:15 UTC (1 days ago)
(HTM) web link (www.datamesh-architecture.com)
(TXT) w3m dump (www.datamesh-architecture.com)
| politelemon wrote:
| Is there an underlying assumption here that all of the datasets'
| domains are perfectly in sync with each other in the context of
| domain metadata?
|
| As an example, a Team1 might define the manufacturer of a
| Sprocket as the company that assembled it, whereas a Team2 might
| define the manufacturer as the company that built the Sprocket's
| engine. Since the purpose of a datamesh is to enable other teams
| to perform cross-domain data analytics, there needs to be
| reconciliation regarding these definitions, or it'll become a
| datamess. Where does that get resolved?
| gxt wrote:
| The chief data officer, in close collaboration with the chief
| data engineering officer, must elaborate automated normalization
| guidelines, backed by implementations used across all data
| streams, to ensure any skew in the data model is limited to
| non-production environments and all data entities are
| materialized consistently across the whole data model.
| i_like_waiting wrote:
| What type of company are you working for? Usually there is
| not even a CIO; I haven't heard of a company with both a
| CDO and a CDEO (or even a CDEO at all).
|
| I thought a big part of the need that data mesh fills comes
| from organizations that lack resources in their core BI
| team.
| tremoloqui wrote:
| A data mesh approach probably wouldn't work in the sort of
| organization you describe.
|
| IMO - To make it work you need a consistent taxonomy or way
| of translating from a particular domain to some sort of
| interchange format.
|
| If you have that then a set of centralized tools can pull
| from the separate domains using a core set of protocols to
| produce reports etc.
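|
| A minimal sketch of that translation step, using the Sprocket
| example from upthread. SprocketRecord, from_team1, from_team2 and
| merge are hypothetical names, not part of any data mesh tooling;
| the assumption is that each domain ships its own adapter into a
| shared format whose field names are unambiguous.
|

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class SprocketRecord:
    """Hypothetical interchange format with unambiguous field names."""
    sprocket_id: str
    assembler: Optional[str]       # company that assembled the sprocket
    engine_builder: Optional[str]  # company that built the engine


def from_team1(row: dict) -> SprocketRecord:
    # Assumption: Team1's "manufacturer" means the assembler.
    return SprocketRecord(row["id"], assembler=row["manufacturer"], engine_builder=None)


def from_team2(row: dict) -> SprocketRecord:
    # Assumption: Team2's "manufacturer" means the engine builder.
    return SprocketRecord(row["id"], assembler=None, engine_builder=row["manufacturer"])


def merge(records: Iterable[SprocketRecord]) -> Dict[str, SprocketRecord]:
    """Combine records per sprocket, keeping the first known value of each field."""
    merged: Dict[str, SprocketRecord] = {}
    for rec in records:
        prev = merged.get(rec.sprocket_id)
        if prev is None:
            merged[rec.sprocket_id] = rec
        else:
            merged[rec.sprocket_id] = SprocketRecord(
                rec.sprocket_id,
                assembler=prev.assembler or rec.assembler,
                engine_builder=prev.engine_builder or rec.engine_builder,
            )
    return merged
```

| Centralized reporting tools would then read only SprocketRecord
| and never the domain-specific "manufacturer" column.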
| gxt wrote:
| There's no magic. You need a core team that pivots from
| writing code at O(n) cost enterprise-wide to more or less
| amortized O(1), where n is the number of data streams: i.e.
| writing code once per stream vs. once for a standardized
| stream that gets reused. With only datamesh I don't think
| it's going to work, but with standardized tools that allow
| your teams to write transformations and code as data, every
| team effectively gets access to a self-service data
| warehouse with only access to pre-approved happy paths that
| can be automatically monitored for the most part. That's
| where you gain in efficiency and can let your BI teams focus
| on BI and not boilerplate code, infrastructure, conformity,
| etc.
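|
| A rough sketch of "transformations as data" restricted to
| pre-approved happy paths. APPROVED_OPS and apply_pipeline are
| hypothetical names, and the three operations are only
| placeholders: the point is that a team submits a declarative
| spec, while the core team owns and monitors the code that runs
| it.
|

```python
# Pre-approved operations (hypothetical). Each takes a row dict plus
# arguments from the spec and returns a new row dict.
APPROVED_OPS = {
    "rename": lambda row, src, dst: {
        **{k: v for k, v in row.items() if k != src},
        dst: row[src],
    },
    "strip": lambda row, col: {**row, col: row[col].strip()},
    "lowercase": lambda row, col: {**row, col: row[col].lower()},
}


def apply_pipeline(spec, rows):
    """Run a declarative pipeline, rejecting any step that is not pre-approved."""
    for step in spec:
        if step["op"] not in APPROVED_OPS:
            raise ValueError(f"unapproved operation: {step['op']}")
    out = []
    for row in rows:
        for step in spec:
            row = APPROVED_OPS[step["op"]](row, *step["args"])
        out.append(row)
    return out


# The pipeline itself is plain data that a domain team could submit.
spec = [
    {"op": "rename", "args": ["Mfr", "manufacturer"]},
    {"op": "strip", "args": ["manufacturer"]},
    {"op": "lowercase", "args": ["manufacturer"]},
]
cleaned = apply_pipeline(spec, [{"Mfr": " ACME "}])
```

| Onboarding a new stream then means writing a new spec, not new
| code, which is where the O(n) to amortized O(1) shift would come
| from.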
| i_like_waiting wrote:
| Yes, it's a similar path to the one I am taking (while leading
| BI in my org). Getting a first taste of self-service from the
| analysis perspective is super easy thanks to tools like
| Metabase.
|
| Bringing data in is a completely different story,
| especially in non-tech organizations. The gap between how a
| power user from a specific department and somebody from my
| team brings in and transforms data is still too big, and
| standards are hard to enforce (following naming conventions,
| keeping the same data formats for the same columns,
| lowercasing certain columns so joins are done correctly...).
| They usually have their "playground schemas" they use, but
| that's very far from saying that they "own" data quality
| there.
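|
| To illustrate the join problem with a small sketch (normalize_key
| and inner_join are hypothetical helpers, and the email column is
| made up): if both sides normalize the key the same way, rows that
| differ only in case or stray whitespace still match.
|

```python
def normalize_key(value: str) -> str:
    """Shared convention for join keys: trim whitespace, then lowercase."""
    return value.strip().lower()


def inner_join(left, right, key):
    """Inner-join two lists of row dicts on a normalized key column."""
    index = {}
    for row in right:
        index.setdefault(normalize_key(row[key]), []).append(row)
    joined = []
    for l_row in left:
        k = normalize_key(l_row[key])
        for r_row in index.get(k, []):
            # Left values win on column clashes; the key is stored normalized.
            joined.append({**r_row, **l_row, key: k})
    return joined


power_user_rows = [{"email": "Ann@Example.com", "dept": "sales"}]
bi_team_rows = [{"email": "ann@example.com ", "region": "EU"}]
joined = inner_join(power_user_rows, bi_team_rows, "email")
```

| A plain equality join on the raw values would return no rows
| here, which is the kind of silent mismatch that is hard to spot.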
| LaserToy wrote:
| It looks like a weird attempt to build a consulting business
| around a simple idea.
|
| Treat data assets like microservices and pipelines like the
| network. Period.
|
| Prescribing everything else rubs me the wrong way.
|
| So, data mesh is: an architecture in which a company's data is
| organized into loosely coupled data assets.
| robertlagrant wrote:
| It really feels like data mesh is a fairly half-baked concept
| born out of short-term consulting gigs and a desire to become a
| technical thought leader.
| i_like_waiting wrote:
| Reminds me a lot of the first OLAP cubes: something that
| consultants online praise as much as possible, only to be
| contracted 3-4 years later by the company to fix the mess it
| created.
| edmundsauto wrote:
| What are the downsides of OLAP cubes, and how were they
| fixed? Curious to level up my understanding.
| i_like_waiting wrote:
| I guess they had their place at some point in time, but I
| still vividly remember my old manager talking about
| building an OLAP cube in 2018.
| https://www.holistics.io/blog/the-rise-and-fall-of-the-
| olap-...
| i_like_waiting wrote:
| So if I understand this correctly, data mesh is just a data mart
| that doesn't load data into a database as tables, but uses S3
| storage instead (I assume because that's cheaper in the cloud?)
| skrrr wrote:
| That + a central data platform team that provides infra,
| quality monitors, data lineage and catalogue capabilities + a
| central team that provides guidelines on SLAs, metadata
| standards etc. Sounds good in theory; I am eager to see how it
| fails in practice.
| mountainriver wrote:
| This seems like mostly common sense. Infrastructure teams should
| always be building tools that the org consumes (and ideally the
| general public).
|
| In a lot of orgs this goes sideways and the infrastructure teams
| end up owning everything and never have time to do anything else.
| Usually this happens due to upper management putting on the
| squeeze.
|
| In order for teams to actually own their infrastructure and data
| we need better tooling to help them. This is coming along
| nowadays but isn't fully there.
| sdze wrote:
| If you need so many "slides" to persuade your clients of
| something, I think you've already lost.
| rad_gruchalski wrote:
| Considering how many big companies are implementing this
| right now, I don't agree. The C-suite likes slides.
| MikeDelta wrote:
| Indeed, the Future State Architecture documents from the
| central architects that I have seen were all PowerPoint
| presentations with at least 100 slides.
___________________________________________________________________
(page generated 2022-03-19 23:00 UTC)