#107 - Data Mesh: Delivering Data-Driven Value at Scale - Zhamak Dehghani

 

 

“If you want to unlock the value of your data by generating data-driven values, and you want to do it reliably and resiliently at scale, then you need to consider data mesh.”

Zhamak Dehghani is the author of the “Data Mesh” book. In this episode, we discussed data mesh in depth, a concept she founded in 2018 that has since become an industry trend. We started our conversation by discussing the current challenges of working with data, such as the data centralization approach and why the current data tools are still inadequate. Zhamak then described data mesh and why organizations should adopt it to generate data-driven value at scale. She then explained the 4 principles of data mesh: domain ownership, data as a product, the self-serve data platform, and federated computational governance.

Listen out for:

  • Career Journey - [00:06:49]
  • Challenges Working with Data - [00:10:19]
  • Centralization of Data - [00:13:53]
  • Why Current Tools Are Not Adequate - [00:16:00]
  • Data Mesh - [00:19:32]
  • Drivers Adopting Data Mesh - [00:22:16]
  • Principle of Domain Ownership - [00:25:54]
  • Data Pipeline is an Internal Implementation - [00:29:40]
  • Single Source of Truth Myth - [00:32:09]
  • Principle of Data as a Product - [00:35:57]
  • Data Product Owner & Developer - [00:38:17]
  • Principle of The Self-Serve Data Platform - [00:40:51]
  • Agnostic Data Platform - [00:44:00]
  • Principle of Federated Computational Governance - [00:46:01]
  • Data SLO - [00:50:20]
  • 3 Tech Lead Wisdom - [00:52:23]

_____

Zhamak Dehghani’s Bio
Zhamak Dehghani works as the CEO and founder of a stealth tech startup reimagining the future of data developer experience. She founded the concept of Data Mesh in 2018 and has since been implementing the concept and evangelizing it with the wider industry. She is the author of the Data Mesh book and a co-author of Software Architecture: The Hard Parts.

Zhamak serves on multiple tech advisory boards. She has worked as a technologist for over 24 years and has contributed to multiple patents in distributed computing communications. She is an advocate for the decentralization of all things, including architecture, data, and ultimately power.

 

Our Sponsor - Founders Wellbeing
Mental well-being is a silent pandemic. According to the WHO, depression and anxiety cost the global economy over USD 1 trillion every year. It’s time to make a difference!
Learn how to enhance your lives through a master class on mental wellness. Visit founderswellbeing.com/masterclass and enter TLJ20 for a 20% discount.
Our Sponsor - iSAQB SAG 2022
The iSAQB® Software Architecture Gathering is the international conference highlight for all those working on solution structures in IT projects: primarily software architects, developers and professionals in quality assurance, but also system analysts who want to communicate better with their developers. A selection of well-known international experts will share their practical knowledge on the most important topics in state-of-the-art software architecture. The conference takes place online from November 14 to 17, 2022, and we have a 15% discount code for you: TLJ_MP_15.
Our Sponsor - DevTernity 2022
DevTernity 2022 (devternity.com) is the top international software development conference with an emphasis on coding, architecture, and tech leadership skills. The lineup is truly stellar and features many legends of software development like Robert "Uncle Bob" Martin, Kent Beck, Scott Hanselman, Venkat Subramaniam, Kevlin Henney, Allen Holub, Sandro Mancuso, and many others!
The conference takes place online, and we have the 10% discount code for you: AWSM_TLJ.
Our Sponsor - Skills Matter
Today’s episode is proudly sponsored by Skills Matter, the global community and events platform for software professionals.
Skills Matter is an easier way for technologists to grow their careers by connecting you and your peers with the best-in-class tech industry experts and communities. You get on-demand access to their latest content, thought leadership insights as well as the exciting schedule of tech events running across all time zones.
Head on over to skillsmatter.com to become part of the tech community that matters most to you - it’s free to join and easy to keep up with the latest tech trends.
Our Sponsor - Tech Lead Journal Shop
Are you looking for some cool new swag?
Tech Lead Journal now offers swag that you can purchase online. Each item is printed on demand based on your preference and delivered safely to you anywhere in the world where shipping is available.
Check out all the cool swag available by visiting techleadjournal.dev/shop. And don't forget to show it off once you receive it.

 

Like this episode?
Subscribe and leave us a rating & review on your favorite podcast app or feedback page.
Follow @techleadjournal on LinkedIn, Twitter, and Instagram.
Pledge your support by becoming a patron.

 

Quotes

Career Journey

  • To my surprise, the world of data was far from the agility and nimbleness and autonomy and distribution decentralization that were seen in app dev world. It’s slow moving. It’s based on paradigms around centralization of data and middleman moving data around.

Challenges Working with Data

  • The problem that I saw was that we had an assumption that to get value from data, and value being through BI (Business Intelligence), through reporting, through training machine learning models, all sorts of analysis of data, data needs to come and get modeled or raw and get centralized. That was a response to the silo-ing of data in application databases that really didn’t allow cross cutting analysis of data across the applications.

  • That idea around centralization had led to a very fragile, very slow-moving architecture and bottlenecks for really getting value from data. What are the points of fragility that cause so much waste? Those are data pipelines. To centralize data from applications, we have this concept of ETL, ELT, extraction of data from application databases. I think it’s one of the biggest crimes that we can commit in architecture. It’s so intrusive, right? Because there’s no contract. There is no abstraction. And that is very fragile, and it causes a lot of waste. By the time the data has popped out of the other end of the pipeline, the source has moved on and you have got problems to solve.

  • On the other hand, you have this big bottleneck. So the assumption is that there is a data team responsible for the data from everywhere, and they put it in a warehouse or lake. It is a flawed assumption in an organization that needs to move fast, needs to share data more peer-to-peer. They become a bottleneck. The team and the architecture itself become a bottleneck. So you have frustrated users, data users. They can’t find the data they need. They can’t access it. They don’t trust it. By the time data got made available to them, the source has moved on.

  • Data people in the middle of that, they have been given this impossible task of getting data from people that have no intention of sharing data and giving it to people that they have no idea how they’re going to use them, and they’re stuck in the middle, like cleaning data, watching and shoveling data with no real purpose, to be honest. They are under a lot of pressure and they don’t perform.

  • The applications don’t have much of a visibility or even opportunity to use that data. They just put data out to put that in the databases and use it for their operational needs, and they’re never in the conversation in analytics. They’re isolated from the real application of the possibility of ML embedded into their applications.

Centralization of Data

  • It’s easy to get started with, right? When your problem space is new or the solution space is new and you require specialization. You’ve got a new set of tools and new ways of working with data. You can’t leverage your majority of organization. You can’t really push the responsibility into every team. You have to centralize the specialized people. And also, it’s easier to control and easier to get started with.

  • There has to be a pivotal point that the complexity of your environment increases to the point that centralized simple solution does not work anymore.

Why Current Tools are Not Adequate

  • The tools are solving niche problems. They are created for an operating model that at the end of the day is pipeline, data, transform, put into storage, layer with metadata, or sprinkle ML on top and voila, you get value on the other end. So they have been organized around this very centralized pipeline model at a very micro level. They’re organized to solve ingestion problems. They’re organized to solve big, hairy pipeline problems.

  • Unless you change the operating model and that mental architecture, no matter how locally you optimize a particular solution, we’re going to optimize and solve the really hairy centralized data pipelines, you’re still going to have a hairy centralized data pipeline. You get a little bit better at maybe detecting the errors or connecting to more resources, but ultimately, you are still stuck in the past paradigm, and those fundamental assumptions that need to be invalidated and need to change remain.

  • People who are deeply in the data space and try to use these tools, they stay struggling and they suffer. They’ve got this massive landscape of fragmented technologies that frankly work with difficulty with each other. The cost of integration of these technologies into a meaningful, scalable solution is very high.

Data Mesh

  • It actually started as an architecture because I am a technologist. So, I kind of apply the lens of technology to solve problems. So I saw this as an architecture to organize how we decouple and how we break down this big problem of how I get value from data.

  • But very quickly, I realized, as we know from Conway’s law and just real-life experience, that technology and architecture mirror and get influenced by the way we organize our organizations and teams. So very quickly I had to self-correct.

  • We’ve got to rethink the organization of teams, the modes of communications, the contract for data sharing between the teams and the responsibilities, like data product owner was a new role that we introduced. So hence, it became a socio technical, as in we tried to find excellence in our solutions involving the interaction of people and teams and the technology.

  • If you don’t have the organization complexity, if your lake or lake house or warehouse model is not a bottleneck for you, if the centralized data team is doing a great job and everyone’s happy, well, why introduce a concept that’s rather complex? It creates a kind of system complexity. It’s not for everyone at this point in time.

  • Maybe in the future, technology advances, our thinking, our approaches, our process advances in a way that bootstrapping with data mesh is as easy as bootstrapping or even easier than if it’s centralized. So then at that point, data mesh is for everyone, because the maturity of support of the environment has reached that level of maturity.

Drivers Adopting Data Mesh

  • If they want to get value from their data, generate data-driven value, and they want to do that by applying analytics and AI in almost every aspect of their business, they need to utilize data from all aspects and all touchpoints and all applications inside the company and outside. If they have such a mission and they want to do that reliably, resiliently, and do that at scale fast, then they’ve got to consider data mesh. It’s all about really unlocking the value of the data.

  • It’s all of these other teams that you need to now put on your centralized backlog and plan somewhere and get me to the data that I need. That doesn’t scale.

  • So imagine your organization, imagine the missions and the values that can be enabled through the data, and see if you have bottlenecks that need to be addressed. And if you do, then think about data mesh.

Principle of Domain Ownership

  • Domain ownership is the same as what Domain-Driven Design meant at the strategic design level to applications, which is you have these smaller business-domain-oriented teams and groups of people that are collaboratively, cross-functionally working to solve business problems with technology. So they are not only responsible for developing applications and software to enable that business outcome, but they’re also responsible for using and sharing data for analytical purposes and for the application of machine learning models.

  • Break down the responsibility of data sharing around the themes of organizations. So you have this infinitely scalable model as you introduce new domains, you introduce new data such that those domains can use and share and give the responsibility of data sharing to people for analytical use cases, for this kind of crosscutting use cases, to people that are capable to be responsible for it because they’re so close to it. They understand. They know what this data is about. Don’t give that responsibility to someone downstream that actually doesn’t know the domain, and it’s very hard for them to keep the cognitive load of knowing all the domains in the team’s heads.

  • We came from the traditional approach where we have a centralized team. They have to understand all the domains within the company organization. Understand the data model, the evolution of all data, and try to put them in one central place. But I think your approach here, your principle putting into a domain ownership means that the domain team themselves is responsible for not just the operational data, but also the analytical data part where probably they will transform their operational data, and responsible for sharing them for analytical purpose as well. And since they’re the experts of the domain, they are probably the best person to come up with that kind of data model.

Data Pipeline is an Internal Implementation

  • If data mesh is really successful the way I had imagined it, there’ll no longer be any data pipelines. As in, if you go to the macro level, macro view of your architecture, you really shouldn’t see data pipelines anymore between these domains, between the data products.

  • Pipelines are job-oriented, task-oriented computations that happen on some input from the data and transform and put it on some output sink. And you repeat these task-oriented processes until data gets transferred in a mode that somebody can use it. Usually, there is structure around: we need to extract information. We need to cleanse them. We need to model. And they’re usually done in between the sinks of the data outside of the source, outside of the destination, somewhere in between.

  • Data mesh completely challenges that concept because we’re no longer working this task-oriented kind of environment. We’re working on this value-oriented, outcome-oriented environment, data product-oriented environment. So then the job of the transformation is really an implementation detail of one of these data products in one domain. It’s not something that happens in between.

  • It really falls into the similar principle of microservices that we had. In microservices world, this enterprise service bus was an anti-pattern, right? Because we want to localize logic and computation and complexity inside a boundary of a contract abstracted within a service.

  • So we came up with this idea of smart endpoints and dumb pipes. So your pipes are super dumb. They just transport data. And your endpoints are smart because they implement the logic behind the APIs, and that led to a more API-thinking world.

  • These pipelines are like the enterprise service bus analogy which I’ve used, and they have the same challenges. So we’ve got to break them apart. The pieces that are relevant should go where they should belong. They can be implemented as a pipeline if you want to within the implementation or other ways.
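
One way to picture "the transformation is an implementation detail of a data product" is the minimal Python sketch below. It is only an illustration under stated assumptions: the `DataProduct` class, its `source`, `steps`, and `read` names, and the sample customer data are all hypothetical and do not come from the episode or from any data mesh library.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

Record = dict  # one row of data; a plain dict keeps the sketch small


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical data product: consumers call read() and never see the steps."""
    name: str
    source: Callable[[], Iterable[Record]]  # input: an app database or an upstream product
    steps: tuple = ()                        # internal transformations, private to the domain

    def read(self) -> list:
        """Output port: apply the internal steps and return the shaped records."""
        rows = list(self.source())
        for step in self.steps:
            rows = [step(row) for row in rows]
        return rows


def cleanse(row: Record) -> Record:
    # Normalize the email address.
    return {**row, "email": row["email"].strip().lower()}


def model(row: Record) -> Record:
    # Reshape the operational row into the analytical shape this domain shares.
    return {"customer_id": row["id"], "email": row["email"]}


def app_database() -> list:
    # Stand-in for the operational data of the customer domain.
    return [{"id": 1, "email": " A@Example.com "}]


customers = DataProduct("customers", app_database, (cleanse, model))

# A downstream product reads from the upstream output port.
# Nothing transforms the data in between the two products.
email_domains = DataProduct(
    "email_domains",
    customers.read,
    (lambda row: {"customer_id": row["customer_id"], "domain": row["email"].split("@")[1]},),
)

print(email_domains.read())
```

In this sketch the cleansing and modeling steps still exist, but they live inside the `customers` product. The connection between the two products only hands data over, which is the "dumb pipe" side of the analogy.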

Single Source of Truth Myth

  • Because a single source of truth is not a real thing. It just doesn’t exist in the real world.

  • The aim is we still want to be able to get a consistent view and understanding of the data. But we want to do that in a way that it doesn’t slow movement. It doesn’t slow value generation. It doesn’t become a stale source of truth very quickly. We want to do that in a way that we support the chaos of the reality of organizations. There are different teams moving at different pace. They generate different bits and pieces of properties of the same entity, but those properties come from different sources with different cadence.

  • So we want to embrace that almost complexity in chaos, but yet create a system that gives the same outcome that a single source of truth wants to give, which is if I search for information about customer, even though that information comes from different places, I can understand a particular snapshot of the customer at a point in time has a sort of consistent values.

  • Data mesh has a lot of constraints and disciplines built into it. For example, the data that data nodes provide is read-only. It never changes. It’s temporal. It has two timestamps. Bitemporal. So at any point in time, the data is streamed across different nodes in a way that if new data arrives upstream, the downstream nodes that are copying that and transforming it into a new shape of the data get notified. They have a responsibility to either react to it and generate a new slice of the data at this point in time or not.

  • Also, data mesh provides a means of stitching these polyseme concepts together. The polyseme concept of the customer can be stitched together by the consumer because there are links that are created between those systems or between those data products.

  • There are a set of constraints. There are a set of operational disciplines like this polyseme linkage and bi-temporality and immutable data that still results in the same outcome as a single source of truth. But it’s designed for an inherently complex business model and operating model.
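
The read-only, bitemporal discipline described above can be sketched in a few lines of Python. This is a minimal illustration, not an implementation from the episode: the `Slice` and `ReadOnlyLog` classes, the `as_of` method, and the field names `actual_time` and `processing_time` for the two timestamps are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)  # frozen: attributes of a published slice cannot be reassigned
class Slice:
    entity_id: str
    address: str
    actual_time: datetime      # when the fact became true in the business
    processing_time: datetime  # when the data product recorded it


class ReadOnlyLog:
    """Append-only store: corrections arrive as new slices, never as updates."""

    def __init__(self) -> None:
        self._slices: list = []

    def append(self, new_slice: Slice) -> None:
        self._slices.append(new_slice)

    def as_of(self, entity_id: str, actual: datetime, known_at: datetime) -> Optional[Slice]:
        """Return the latest slice valid at `actual`, using only what was recorded by `known_at`."""
        candidates = [
            s for s in self._slices
            if s.entity_id == entity_id
            and s.actual_time <= actual
            and s.processing_time <= known_at
        ]
        return max(candidates, key=lambda s: (s.actual_time, s.processing_time), default=None)


log = ReadOnlyLog()
log.append(Slice("c1", "12 Old St", datetime(2022, 1, 1), datetime(2022, 1, 2)))
# A later correction of the same fact is appended as a new slice.
log.append(Slice("c1", "12 Old Street", datetime(2022, 1, 1), datetime(2022, 1, 10)))

before = log.as_of("c1", actual=datetime(2022, 1, 5), known_at=datetime(2022, 1, 3))
after = log.as_of("c1", actual=datetime(2022, 1, 5), known_at=datetime(2022, 1, 11))
print(before.address, "|", after.address)
```

Because nothing is overwritten, a consumer can reproduce the snapshot it saw on January 3 and also see the corrected view later, which is one way to get a consistent picture at a point in time without a central single source of truth.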

Principle of Data as a Product

  • A mind shift that needs to happen when we put data sharing and serving and delighting the experience of people using data as a first class concern. It’s also an antidote to the first principle. The problems of the first principle.

  • Domain-oriented data ownership, one can imagine, can lead to data siloing. Why should I care about sharing that and be responsible for consumers, which adds a ton of work? Data as a product tries to incentivize people to share that data and be part of an ecosystem that is generating value through data exchange and data sharing. It puts some discipline and constraints in place for that to be done effectively.

  • It defines such roles of people that are responsible for that, defines success metrics for data as a product, defines an architectural concept. If we define usability characteristics around discoverability, addressability, like all the things that make the experience of a data user really easy, really delightful, those need to be translated into structural architectural components that are built into this data as a product. So then, you have a kind of technology that needs to shift and change.

Data Product Owner & Developer

  • If data product becomes a thing that we are creating, maintaining, operating, evolving, retiring when it’s useless and nobody uses it, then there have to be people and roles to take that responsibility on. And it’s very unfair, I think it is impossible to say to app developers whose work and consumer are very different personas, like they’re serving the end user, to say, oh, now from here on you also have these other responsibilities.

  • Half of your day, you have to turn around and face these data analysts and data scientists in different domains who want to use the analytical data that you’re generating. You’ve got to serve them too. You can’t serve these two purposes at the same time.

  • You need to have a very explicit responsibility and accountability for that part of the job. If there are people that are pairing like the data product developer and app developer, they’re pairing and collaborating closely, but there are two different people playing these roles and that’s fine too.

  • We have to allocate space. We have to empower people. We have to keep people accountable and responsible. Hence the roles.

  • For data to be also usable as a product, you mentioned about all other usability attributes, like discoverability, addressability. It should also be understandable, and it’s trustworthy as well, and it can be interoperable.

  • Interoperability is also a good concern that the data domain owner should think about. And that’s why I think having a data product owner that maybe can define a set of requirements and maybe this kind of all usability concerns.

Principle of The Self-Serve Data Platform

  • When we think about the roles of platforms in general, platforms are often shared kind of infrastructure on top of which you build domain specific solutions. So they’re often like domain agnostic infrastructure that empowers other teams to build solutions on top. So very horizontal, not really vertical so much.

  • In an organization that’s implementing data mesh, now you have these domain teams. You have data product folks in them, and they’re doing their daily job. Their daily job should be focused on delivering value based on the outcomes of that domain. Their daily job should not be focused on metal work, creating the kind of domain agnostic pieces of technology that they need. Platform as a means to empower autonomous teams to lower their cognitive load, to do what they need to do more easily.

  • The data platforms, many of them exist. There are lots of technologies out there. They’re built to give basic tools. But the tools are too low level for developing data products. So we need a new layer of the platform that really takes the data product, treats it as a first-class concern, and hides away those details of, oh, I need a storage, a pipeline or this or that.

  • And it gives life to a completely new concept. This new concept of data quantum and data product.

  • The job of that platform, the reason I put it in, was making it feasible for independent domain teams to do data work.

  • Some of the attributes that you mentioned of this data platform are that it should be autonomous, interoperable, and domain agnostic.

Agnostic Data Platform

  • The agnosticity depends on the platform and its independence from the underlying technologies. The level of it depends on the appetite of an organization and how independent they want to be.

  • I think what we need is interoperability between the different technologies. So if we build the solution on top and the solution requires data from across two different platforms, there is a level of interoperability so that I can access data across two different clouds or across two different technology stacks. So at the minimum, we need a layer that creates that interoperability even if the vendors themselves are not incentivized to do that right now.

  • When it comes to being agnostic to the underlying infrastructure, again, I don’t think it’s meaningful in all organizations because you end up with the lowest common denominator of the set of features that are available on all the platforms, and that’s not ideal. That’s a lot of work and very little result.

  • I don’t know if you need to be completely agnostic, but we have to have the pieces of it that enable interoperability and movement. Moving from one to another and removing lock-in as much as possible. And those pieces are usually around crosscutting concerns. How do I manage security? How do I build automation so that if tomorrow I want to move to a different set of infrastructure, my processes are automated and not hand cranked? I can, through automation, facilitate the movement much faster.

Principle of Federated Computational Governance

  • If I could sneak in another word, I probably would have called this the principle of embedded federated computational governance.

  • The concept really is an antidote to the problems that arise from the previous principles, which is we need interoperability. We have now these disparate sets of data products, domain-oriented, their own teams, their own cadence, their own life cycle. How can we apply a set of concerns that need to be standardized across all of them? And what’s the best way to go about defining them? What’s the best way to go about implementing them and observing and enforcing them?

  • As an example, if you need to have secure data or you need to have high-quality data; let’s go with high quality, some sort of quality data, definition of quality and then enforcing quality, and sharing quality data.

  • One way of doing it is to say, okay, I’m going to put a quality control team, my governance team, in the process of generating every data. And these guys are going to sit in the middle and verify at some point in that life cycle of the data, if this data is acceptable to be shared.

  • That system is going to have a massive bottleneck, and it’s not going to scale. So how can we achieve a defined level of quality without creating necessarily just controls? We need the consensus or definition around what constitutes quality? As in what attributes do we use to describe the quality of data? Is it completeness? Is it integrity? Is it timeliness? Like is it all the above and others? So let’s define those. And in that definition, let’s have subject matter experts. Let’s have domain people who actually know their data and how they can articulate the quality involved in defining that. And once that’s defined, let’s automate it. Let’s put it into the platform as a platform capability.

  • The moment you are instantiating a data product, you will get, out of the box, a library or some SDK or something that gives you the ability to calculate, capture and share these quality metrics. And then you will have observability that runs across this mesh, across all of these data products, and captures that information, shares that information, and also validates whether you are meeting the quality requirements that you’ve defined. So that’s the computational part.

  • The embedded part that I haven’t put in the title is that this enforcing quality and measuring quality becomes an embedded concern in every single data product. It’s not something that’s smeared over and added later on. It’s actually from ground up, built in the data product itself. So it’s embedded in there.
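
The "define it once, then automate it in the platform" idea above might look roughly like the following Python sketch. Everything named here is hypothetical: `QualityPolicy`, `completeness`, and `quality_report` stand in for whatever a real platform SDK would provide, and completeness is used as the only quality attribute to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityPolicy:
    """Agreed once by the federated governance group, then applied to every data product."""
    required_fields: tuple
    min_completeness: float  # minimum share of rows with all required fields present


def completeness(rows: list, required_fields: tuple) -> float:
    """Share of rows in which every required field has a value."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(field) is not None for field in required_fields) for row in rows
    )
    return complete / len(rows)


def quality_report(rows: list, policy: QualityPolicy) -> dict:
    """What an embedded platform library could calculate and share for each product."""
    score = completeness(rows, policy.required_fields)
    return {"completeness": score, "meets_policy": score >= policy.min_completeness}


policy = QualityPolicy(required_fields=("customer_id", "email"), min_completeness=0.9)
rows = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": "b@example.com"},
    {"customer_id": 3, "email": None},
    {"customer_id": 4, "email": "d@example.com"},
]

report = quality_report(rows, policy)
print(report)
```

The point of the sketch is the split of responsibilities: people agree on the policy, while the calculation and the check run inside every data product with no review team in the middle.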

Data SLO

  • For people to trust your data, you need to share a set of real time almost, or at least as real time as your data is, information to give people trust that this is a suitable data. The dimensions of that are around quality. They’re around timeliness. They’re around completeness.

  • There are classes of information, additional data that you’ve got to provide. So that the people that want to directly self-serve use the product, they can self-assess if this data suits their use case or not.

  • SLOs in app dev are more about uptime and response time and downtime and so on. And in the data world, the computational part of it still has those concerns. But the data part of it has a different set of concerns that define the usability metrics of the data.
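
As a small illustration of consumers self-assessing a data product from its published guarantees, consider the Python sketch below. The `PublishedSlo` class, the `suits_use_case` function, and the thresholds are assumptions made for the example, not terms from the episode.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class PublishedSlo:
    """Information a data product shares alongside its data (hypothetical shape)."""
    last_updated: datetime  # timeliness: when the newest data was made available
    completeness: float     # share of expected records that are present


def suits_use_case(slo: PublishedSlo, now: datetime,
                   max_staleness: timedelta, min_completeness: float) -> bool:
    """A consumer checks the published values against its own needs."""
    fresh_enough = (now - slo.last_updated) <= max_staleness
    complete_enough = slo.completeness >= min_completeness
    return fresh_enough and complete_enough


slo = PublishedSlo(last_updated=datetime(2022, 1, 1, 12, 0), completeness=0.95)
now = datetime(2022, 1, 1, 12, 30)

# A daily dashboard tolerates data that is an hour old.
dashboard_ok = suits_use_case(slo, now, timedelta(hours=1), 0.9)
# A near-real-time model needs data that is at most five minutes old.
realtime_ok = suits_use_case(slo, now, timedelta(minutes=5), 0.9)
print(dashboard_ok, realtime_ok)
```

The same published numbers lead to different answers for different consumers, which is why the assessment sits with the consumer and not with a central gatekeeper.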

3 Tech Lead Wisdom

  1. Separate leadership from management.

    • As a leader, you need to believe in your mission, create a mission-oriented team and organization, and continuously, through communication and reinforcement, embody the right behavior to achieve that mission. Reinforce that, remind your teams, and keep realigning the team.

    • Maybe there are different styles of leadership, but that mission oriented, vision oriented leadership resonates with me.

  2. To get to that mission, you have two ways of going there.

    • You have a way of leaving maybe some casualties behind, like going in a way that not everyone can catch up, and leaving wounded soldiers along the way.

    • You want to make sure that everybody’s aligned. You need to think about every single member of the team, their needs, their pace, their specific hopes. It’s about not only having the vision and charting the path but also making sure everyone can come along, and thinking beyond yourself.

    • Your people need to be able to trust you and believe in you. You need to be very self-aware in terms of your strengths and your weaknesses.

  3. If you are a technical leader, I personally respect technical leaders that still stay close to their craft. They still keep up to date with their craft.

    • The technology moves really fast, so you have to find a way to keep yourself relevant and up to date. And sometimes that means going really deep for a moment in time, get your hands dirty and coming back up.

    • As your scope of leadership grows, your ability to go really deep diminishes, because just time doesn’t allow for that.

    • So carving out space to go deep when it’s needed, even for a very short period of time.

Transcript

[00:02:16] Episode Introduction

Henry Suryawirawan: Hello again, my friends and my listeners. Welcome to the Tech Lead Journal podcast, the show where you can learn about technical leadership and excellence from my conversations with great thought leaders in the tech industry. And you’re listening to episode number 107. If this is your first time listening to Tech Lead Journal, subscribe and follow the show on your podcast app and on LinkedIn, Twitter, and Instagram. And if you’d like to support my journey creating this podcast, subscribe as a patron at techleadjournal.dev/patron.

My guest for today’s episode is Zhamak Dehghani. Zhamak founded the data mesh concept in 2018 and has since been evangelizing it to the wider industry, including writing her latest book, “Data Mesh”. In this episode, we discussed the data mesh concept in depth, which is starting to become an industry trend nowadays. We started our conversation by discussing the current challenges of working with data, such as the outdated data centralization approach and why the current data tools are still inadequate. Zhamak then described data mesh and why organizations should adopt it to generate data-driven value at scale. Zhamak then explained the four core principles of data mesh: domain ownership, data as a product, the self-serve data platform, and federated computational governance.

I really enjoyed my conversation with Zhamak, learning the data mesh concept in-depth, which has been something I would love to learn more about, and this episode taught me a lot about it. If you also enjoy listening to this episode, will you help share it with your friends and colleagues who can also benefit from listening to this episode? My ultimate mission is to spread this podcast to more listeners. And I really appreciate your support in any way towards fulfilling my mission. Before we continue to the conversation, let’s hear some words from our sponsors.

[00:06:00] Introduction

Henry Suryawirawan: Hello, everyone. Welcome to another new episode of the Tech Lead Journal podcast. Today, I’m so excited to meet Zhamak Dehghani. She was most recently a Director of Emerging Technologies at ThoughtWorks. She was there for probably around 11 years. I know Zhamak from following her work when I was working at ThoughtWorks as well. So she was part of the Technology Radar committee, and always came up with all these emerging technologies. And recently, in the last few years, Zhamak came out with this concept called data mesh. I think it was around 2018, if I’m not wrong. Since then, the data mesh concept has taken many people by surprise, and many people rave about it. So today we’ll be talking a lot about data mesh and I’m really looking forward to having this conversation with you, Zhamak.

Zhamak Dehghani: It’s a pleasure to be here, Henry. And thank you for that. Perfect pronunciation of my name.

Henry Suryawirawan: Oh, surprised. Okay.

Zhamak Dehghani: Yes.

[00:06:49] Career Journey

Henry Suryawirawan: So Zhamak, I would love to probably know more about your career. So I always start to ask my guests to share their career journey or any turning points or highlights in their career. Maybe if you can share a little bit about yourself.

Zhamak Dehghani: Sure. Love to. I guess my journey is filled with detours and going to new places led by curiosity. So I started as a software engineer. For the first 14 years of my career, I worked in deep tech R&D companies where they were building a technology product. So I did distributed systems before cloud and large scale distributed systems were a thing. Building monitoring and observability from scratch. Building streaming systems. Building databases that basically get signals from critical infrastructure, analyze those signals, turn them into reports, and so on. The full stack of what a real world distributed system could look like. And I did that on multiple operating systems, various flavors of UNIX and HP Tandem and Windows, and so on. That really gave me a good insight bottom-up into the technology stack. I realized that a lot of technologists today maybe start with web development or app development, and they don’t get the opportunity to really look inside the kernel, at how to do system programming, how to build protocols. I did all of that, which was awesome.

And then I took a bit of a detour. I went to hardware for a little bit. I worked for a company that was building various hardware from scratch. We were building the firmware on digital pen systems, and that was interesting as well. Again, a lot of great learnings. I had to build embedded firmware.

And then another detour. I came to consulting with ThoughtWorks. As we worked at ThoughtWorks, we worked on kind of larger scale execution. So that led to microservices, and again, building large scale, I guess, distributed solutions with microservices and that ecosystem. I did quite a bit of work in that space for quite a while. And I was excited about service mesh and Kubernetes and all of the technologies that’s really made it possible. And as you said, about four years ago, I kind of started putting my nose into the data space. To my surprise, the world of data was far from the agility and nimbleness and autonomy and distribution decentralization that were seen in app dev world. It’s slow moving. It’s based on paradigms around centralization of data and middleman moving data around. And I thought it was a very sad observation, to be honest. And I thought, okay. I could be the little kid who points to the naked emperor. And I don’t mind doing that even if I get attacked.

So I started like talking about, okay, how can we shift the paradigm? What are the pain points? Like, wake up people. And yeah, data mesh really came as a hypothesis, as a question with some answers, some basic answers. Since then, I’ve been kind of building it, evangelizing it. And as of two weeks ago, I decided to start a company to build a product, a technology deep tech kind of dev product, developer facing product that is going to make it so easy to kind of work with data under the data mesh principles. To really show the world a different way of doing things, and enable developers most importantly.

Henry Suryawirawan: Wow. Sounds really exciting. Yeah. I saw your post about quitting ThoughtWorks job. It was surprising to me. But I think looking at the opportunity of investing more effort in building the tools about data mesh, I think that will be definitely crucial and will help a lot of people.

[00:10:19] Challenges Working With Data

Henry Suryawirawan: But before we go into this exciting journey, I think first of all, the topic of this conversation is about data mesh. And you wrote a book with the same title, “Data Mesh” with the subtitle of “Delivering data-driven value at scale”. So maybe if you can share a little bit. What did you actually see the problems when you were working with the data? You mentioned about some old ways, paradigm, centralization, and things like that. But what kind of challenges and problems when you work with the data problem during that time? And maybe if you can give an overview how data has always been approached in the delivery or in the day-to-day development?

Zhamak Dehghani: Yeah. That’s a good way of positioning data mesh. The problem that I saw was that we had an assumption that to get value from data, and value being through BI (Business Intelligence), through reporting, through training machine learning models, all sorts of analysis of data, data needs to come and get modeled or raw and get centralized. That was a response to a siloing of data in application databases that really didn’t allow cross cutting analysis of data across the applications.

But that centralization, that idea around centralization, had led to a very fragile, very slow-moving architecture and bottlenecks for really getting value from data. So what are the points of fragility that cause so much waste? Those are data pipelines. So to centralize data from applications, we have this concept of ETL, ELT, you know, extraction of data from application databases. I think it’s one of the biggest crimes that we can commit in architecture. It’s so intrusive, right? Because there’s no contract. There is no abstraction. So you are constantly breaking very task-oriented, job-oriented, elaborate pipelines, like complex pipelines. Moving stuff around and converting it, putting it from one sink to another sink. And that is very fragile, and it causes a lot of waste. By the time the data has popped out of the other end of the pipeline, the source has moved on and you’ve got problems to solve.

And then, on the other hand, you have this big bottleneck. So the assumption that there is a data team responsible for the data from everywhere, and they put it in a warehouse or lake. It is a flawed assumption in an organization that needs to move fast, needs to share data more peer-to-peer. They become a bottleneck. The team and the architecture itself become a bottleneck. So you have frustrated users, data users. They can’t find the data they need. They can’t access it. They don’t trust it. By the time data gets made available to them, the source has moved on. Data people are in the middle of that. Honestly, I have all the love and empathy for them in the world, because they have been given this impossible task of getting data from people that have no intention of sharing data and giving it to people when they have no idea how those people are going to use it. And they’re stuck in the middle, like cleaning data, watching and shoveling data with no real purpose, to be honest. They are under a lot of pressure and they don’t perform.

And then the applications really don’t have much visibility or even opportunity to use that data. They just put data into the databases and use it for their operational needs, and they’re never in the conversation in analytics. They’re isolated from the real possibility of embedding ML into their applications, from becoming analytical data users, because they just got put aside from this data world. Yeah. So fragility, long lead time from data to value, bottlenecks. All of those are major problems that you see with the past paradigms.

[00:13:53] Centralization of Data

Henry Suryawirawan: I think in my career so far, I’ve experienced things like, for example, big giant database, where you have vendor databases, like Oracle, SQL Server, right? And then we moved into the paradigms of data warehouse concept, where you have another set of tools where you pump the data from the OLTP database into this analytical database. And then in the last, maybe 10 years, we moved to another concept called data lake where you probably centralized raw data, put it there, and from there, you move into different data marts maybe, or different small data warehouse. And now we move into this cloud model where we also see cloud technologies helping. I’m familiar much more with like BigQuery. In Amazon, maybe you have Redshift. So if you see all this historical, the unique thing about it, like you mentioned, the centralization. Why people actually move more towards centralization rather than the decentralization part?

Zhamak Dehghani: Yeah. It’s easy to get started with, right? When your problem space is new or solution space is new and you require specialization. You’ve got a new set of tools and new ways of working with data. You can’t leverage your majority of organization. You can’t really push the responsibility into every team. You have to centralize the specialized people. And also, it’s easier to control and easier to get started with, like we know, right? Any startups start building an application or solution would say, like, start with the monolith. Find your market fit and then break it down, because decentralized distributed systems are inherently complex. So there has to be a pivotal point that the complexity of your environment increases to the point that centralized simple solution does not work anymore.

And I think the data warehouses or lakes or lake houses as an architectural paradigm, not so much as an underlying technology, they’ve been suitable for this world that the data wasn’t ubiquitous, perhaps. We weren’t capturing data from every touchpoint. The data wasn’t being used in every single application and domain. And having a centralized team and a centralized technology site to deal with it was acceptable. But we no longer live in that world. We’ve passed that pivotal point of complexity.

[00:16:00] Why Current Tools Not Adequate

Henry Suryawirawan: So you mentioned about this point about operational versus analytical divide, where the application teams probably just dump data into database, and maybe there’s another set of data analyst, data engineer trying to get the data from those databases and put it in a central place and getting insights. There are a lot of complexities, definitely. But I think in the last few years we also see a lot of advancement in data technologies. So I think I saw in your book and presentation, you have this one slide where probably there are so many different technologies, the logos are too small. Why those tools still couldn’t solve this kind of problem? Because it seems like there are so many advancement.

Zhamak Dehghani: Yeah. That’s a very good question. I mean, the tools are solving niche problems. I’ll actually clarify with an example. Nevertheless, they are created for an operating model that at the end of the day is pipeline, data, transform, put into storage, layer with metadata, or sprinkle ML on top and voila, you get value on the other end. So they have been organized around this very centralized pipeline model at a very micro level, right? If you zoom out and if you start sprinkling these tools onto an overall kind of big picture meta architecture, that’s what they’re organized to do. They’re organized to solve ingestion problems. They’re organized to solve big, hairy pipeline problems. Airflow, those sorts of technologies. They’re designed to solve the big data storage or parallel processing problems. So, unless you change the operating model and that mental architecture, no matter how well you locally optimize a particular solution for the really hairy centralized data pipelines, you’re still going to have a hairy centralized data pipeline. You get a little bit better at maybe detecting the errors or connecting to more resources, but ultimately, you are still stuck in the past paradigm, and those fundamental assumptions that need to be invalidated and need to change remain.

In fact, people who are deep in the data space and try to use these tools still struggle and suffer. They’ve got this massive landscape of fragmented technologies that frankly work with each other only with difficulty. And they have to work with them. They have to integrate them themselves, and the cost of integrating these technologies into a meaningful, scalable solution is very high. I mean, if you look at every vendor on that diagram, and if you go to their connector pages, there are businesses built around just custom proprietary connectors to yet another data source, yet another data sink. Lack of standardization is just mind-boggling in this space. So it’s like the tower of Babylon. Like it’s just falling apart, in my mind. It just feels like nobody speaks the same language. To a large degree, tools are built to solve a very custom problem, and the foundation of this house is about to fall apart. So you still have a broken foundation. I know I can be very polarizing as I describe this because I’m very passionate about this: let’s fix the foundation. Let’s rethink our operating model.

Henry Suryawirawan: Totally makes sense, because, yeah, the way you mentioned about cost of integration, I can imagine every data product that I see, you will see all these integrations. Then the more, the better. After you do it, actually, the investment, you kind of like lock down and you put so much effort and maybe money to actually move your data. But eventually, if you need to switch also the costs for you to move the data out to different technologies, I think that’s also painful. So I’ve been into some of these kind of projects, I think I agree with you about that problem.

[00:19:32] Data Mesh

Henry Suryawirawan: So you invented this data mesh, right? So if I may describe data mesh from your book. You mentioned data mesh is a decentralized sociotechnical approach to share access and manage analytical data in complex and large-scale environments within or across organizations. So there are so many interesting topics, but the first that I picked is actually, you mentioned it as a decentralized socio-technical approach. So tell us more about this.

Zhamak Dehghani: Sure. It actually started as an architecture because I am a technologist. So, I kind of apply the lens of technology to solve problems. So I saw this as really an architecture to organize how we decouple and how we break down this big problem of how I get value from data. But very quickly, I realized as we know, Conway’s law and just real life experience, technology and architecture mirrors and get influenced by the way we organize our organizations and teams. So very quickly I had to like self correct. No, this is not just a technical solution. No, this is not just an architecture. We’ve got to rethink the organization of teams, the modes of communications, the contract for data sharing between the teams and the responsibilities, like data product owner was a new role that we introduced. So hence, it became a socio technical, as in we tried to find excellence in our solutions involving the interaction of people and teams and the technology. Some people say, oh, is it a techno social, or is it a social techno? I don’t really care which one comes first, as long as they’re both involved. So hence the word.

Henry Suryawirawan: Thanks for sharing that. It seems like the Conway’s law is really like a true principle in many software designs or architecture, right? So I think data mesh is probably one of it. And you mentioned that it is an approach to solve complex and large-scale data problems. So does it mean that not everyone will need to go to data mesh since the beginning?

Zhamak Dehghani: Yeah, I think at this point in time. I mean, I answer usually this question by saying, well, at this point in time, if you don’t have the organization complexity, if your lake or lake house or warehouse model is not a bottleneck for you, if the centralized data team is doing a great job and everyone’s happy, well, why introduce a concept that’s rather complex? It creates kind of system complexity. So yes. The short answer is, yes, it’s not for everyone at this point in time. And maybe in the future, technology advances, our thinking, our approaches, our process advances in a way that bootstrapping with data mesh is as easy as bootstrapping or even easier than if it’s centralized. So then at that point, you say, well, data mesh is for everyone, because the maturity of support of the environment has reached that level of maturity.

[00:22:16] Drivers Adopting Data Mesh

Henry Suryawirawan: So you have shared all these problems, challenges that you saw before, and you came up with this concept. But for those people who are already in this state of complexity dealing with their data, either data architecture, pipelines and things like that in organization that is very large scale, maybe global, where they have all these data challenges. Maybe tell them what are some of the reasons why they should consider moving to data mesh? So maybe in business value or maybe in some kind of more impact value-driven kind of a benefits.

Zhamak Dehghani: Yeah. Just simply, it’s the subtitle of my book. If they want to get value from their data, generate data-driven value, and they want to do that by applying analytics and AI in almost every aspect of their business, then to do that, they need to utilize data from all aspects and all touchpoints and all applications inside the company and outside. If they have such a mission and they want to do that reliably, resiliently, and do that at scale, fast, then they’ve got to consider data mesh. It’s all about really unlocking the value of the data.

So let’s give a real world example. If you are in a particular part of the business, let’s say, I use this example in my book of a Spotify like company. I called it App Inc. It’s a digital streaming company. And if you have a team whose job is really to create immersive musical experiences personalized for every moment of every person in the world depending of what they do. That team constantly comes up with new hypothesis on how to use data about music and artists and listeners and their behavior to create a more immersive experience, more personalized to that moment in life. Every one of those hypotheses, they require discovery of the data and access to the data. So are they going to be more successful to be able to discover and get access to the data and even ask people to provide the data, the data’s not there, if they were working in a peer-to-peer fashion. Or are they going to be more successful if there was a centralized team in between all parts of the business?

So as an example, if the playlist team that generates immersive playlist want to create targeted music for people if they’re doing cycling or running. Are they going to be more successful to go and talk to teams that are taking care of partnership with cycling platforms directly and say, “Look, we need to see what people are responding to when they’re on their pelotons”, I suppose, as an example? Or are they successful if they say to a middleman data broker team and say, “Look, I have this hypothesis”. So it’s all of these other teams that you need to now put on your centralized backlog and plan somewhere and get me to the data that I need. That doesn’t scale. So imagine your organization, imagine the missions and the values that can be enabled through the data, and see if you have bottlenecks that need to be addressed. And if you do, then think about data mesh.

Henry Suryawirawan: The way you described this use case is very interesting, because yeah, maybe not all organizations are in this state where you have data and you do discovery and maybe shape the next set of data that the application produced. Again, do hypothesis and maybe analyze, and then, again, reshape the data over and over iteratively. And then maybe one day you’ll come up with a new insight and maybe new business lines as well. Because the data has transformed so much with the scale of the discovery and also the scale of the hypothesis that the team does. So I think that’s really a very interesting concept. I haven’t experienced it myself, because I haven’t worked in this kind of organization, but thanks for sharing that.

[00:25:54] Principle of Domain Ownership

Henry Suryawirawan: So let’s move on to the principles. I think when I read all these data mesh concept that you share, you always come up with these four principles of data mesh. And when I read all of them, I find it interesting because you kind of like use other kind of framework or maybe approach from application development, and mix it into the approach dealing with data. So maybe if you can go briefly one by one, right? The first one is you call it the principle of domain ownership. And this is something like applying Domain-Driven Design to data. Maybe if you can tell us more about this principle?

Zhamak Dehghani: Sure. You are right. Your observation is very correct that I wasn’t as clever and as creative. I basically contextualized the things that I had seen working in, you know, 24 years of my career in complex environments and operational systems and said, “Let’s contextualize them. Let’s apply them to the world of data. We’ve seen these principles work before. Why shouldn’t they work with this bottleneck? They’ve solved previous bottlenecks”. So domain ownership is the same as what Domain-Driven Design meant at the strategic design level to applications, which is you have these smaller, business domain-oriented teams and groups of people that are collaboratively, cross-functionally working to solve business problems with technology. So they are not only responsible for developing applications and software to enable that business outcome, but they’re also responsible for using and sharing data for analytical purposes, for the application of machine learning models.

Again, back to the digital streaming, if you have a team that is working on your player application, and their job is to give the best digital experience to the user that’s playing music or playing and liking and recording, or whatever the interactions are. You are also responsible as a team, well augmented perhaps with new roles and team members, for sharing that data in a way that data can be used to directly by some sort of an analytical workload, and that analytical workload might be a machine learning model that is being trained by your data and some other data from other domains. Or it could be a report that we are producing in terms of errors and anomalies of this application. So we can improve it over time.

So that’s the core of it, is that break down the responsibility of data sharing around the themes of organizations. So you have this infinitely scalable model as you introduce new domains, you introduce new data such that those domains can use and share and give the responsibility of data sharing to people for analytical use cases, again, for this kind of crosscutting use cases, to people that are capable to be responsible for it because they’re so close to it. They understand. They know what this data is about. Don’t give that responsibility to someone downstream that actually doesn’t know the domain, and it’s very hard for them to keep the cognitive load of knowing all the domains in the team’s heads. So that’s the first principle.

Henry Suryawirawan: The way you described it, it sounds intuitive, right? Yeah. Why not? But we came from the traditional approach where we have a centralized team. They have to understand all the domain within the company organization. Understand the data model, the evolution of all data, and try to put them in one central place. But I think your approach here, your principle putting into a domain ownership means that the domain team themselves is responsible for not just the operational data, but also the analytical data part where probably they will transform their operational data, and responsible for sharing them for analytical purpose as well. And since they’re the experts of the domain, they are probably the best person to come up with that kind of a data model. So I think it’s really intuitive after you explain it.

[00:29:40] Data Pipeline is an Internal Implementation

Henry Suryawirawan: There’s also one concept that is on this principle, right? Where you mentioned that the data pipeline now is not a responsibility of a central team, but now it becomes an internal implementation of that domain team itself. So maybe if you can describe why this is so important?

Zhamak Dehghani: Sure. Well, if data mesh is really successful the way I had imagined it, there’ll no longer be any data pipelines. As in, if you go to the macro level, macro view of your architecture, you really shouldn’t see data pipelines anymore between these domains, between the data products, which is a concept that we can introduce in the next principle.

So if you think about pipelines, what’s the purpose of them? Pipelines are job-oriented, task-oriented computations that happen on some input from the data and transform and put it on some output sink. And you repeat these task-oriented processes until data gets transferred in a mode that somebody can use it. Usually, there is structure around, okay, we need to extract information. We need to cleanse them. We need to model. And they’re usually done in between the sinks of the data outside of the source, outside of the destination, somewhere in between.

So data mesh completely challenges that concept because we’re no longer working in this task-oriented kind of environment. We’re working in this value-oriented, outcome-oriented environment, data product-oriented environment. So then the job of the transformation is really an implementation detail of one of these data products in one domain. It’s not something that happens in between. And it really falls into the similar principle of microservices that we had. In the microservices world, the enterprise service bus was an anti-pattern, right? Because we want to localize logic and computation and complexity inside the boundary of a contract, abstracted within a service. The enterprise service bus wasn’t doing that. So we came up with this idea of smart endpoints and dumb pipes. So your pipes are super dumb. They just transport data. And your endpoints are smart because they implement the logic behind the APIs, and that led to a kind of more API-thinking world. It’s the same concept. So these pipelines are like the enterprise service bus, in the analogy which I’ve used, and they have the same challenges. So we’ve got to break them apart. The pieces that are relevant should go where they belong. They can be implemented as a pipeline within the implementation if you want to, or in other ways.
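The "smart endpoints and dumb pipes" idea can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the book: the `DataProduct` class, the `dumb_pipe` function, and the play-count example are all hypothetical names chosen for this illustration. The point it tries to show is that the transformation lives inside the data product, while the pipe only hands records over unchanged.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

Record = dict  # hypothetical record shape for this sketch


def dumb_pipe(records: Iterable[Record]) -> list[Record]:
    """Transport only: hands records over unchanged, with no business logic."""
    return list(records)


@dataclass
class DataProduct:
    """Hypothetical data product: the transformation is an internal detail."""
    name: str
    transform: Callable[[list[Record]], list[Record]]

    def serve(self, upstream: Iterable[Record]) -> list[Record]:
        # The "smart endpoint": logic is applied inside the product boundary.
        return self.transform(dumb_pipe(upstream))


def daily_play_counts(events: list[Record]) -> list[Record]:
    """Domain logic owned by the (assumed) player team."""
    counts: dict = {}
    for event in events:
        key = (event["day"], event["track"])
        counts[key] = counts.get(key, 0) + 1
    return [
        {"day": day, "track": track, "plays": plays}
        for (day, track), plays in sorted(counts.items())
    ]


play_events = [
    {"day": "2022-01-01", "track": "a"},
    {"day": "2022-01-01", "track": "a"},
    {"day": "2022-01-01", "track": "b"},
]
player_product = DataProduct("daily-play-counts", daily_play_counts)
result = player_product.serve(play_events)
```

A consumer only calls `serve`; whether the product internally runs a pipeline, a query, or something else stays hidden behind that boundary.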

[00:32:09] Single Source of Truth Myth

Henry Suryawirawan: Again, very intuitive. If you, again, compare with the application development, right? So ESB has long been an anti-pattern. So the same concept in data. If you want to connect two different data products, you should not create like a complex data pipeline. It should be just maybe transferring the data. When transferring the data, this is also another concept under this principle that you mentioned there is now probably not a kind of like notion of one source of truth anymore about the data. You will have multi, maybe, shape of the data, multiple copies. Maybe it is in the domain team themselves, or maybe it’s already copied to the other consumer of the data where they will use it for their use case. So why is this the case? Why no more single source of truth? I thought like in the data world, people love to have, hey, where’s this single source of truth? Where is the data that I can trust?

Zhamak Dehghani: Because a single source of truth is not a real thing. It just doesn’t exist in real world. Okay. So let’s unpack that. I don’t intend to claim that we are becoming irresponsible about data and you will have contradictory copies of the data lying around and you have to kind of reconcile those. That’s not the aim. The aim is we still want to be able to get a consistent view and understanding of the data. But we want to do that in a way that it doesn’t slow movement. It doesn’t slow value generation. It doesn’t become a stale source of truth very quickly. We want to do that in a way that we support the chaos of reality of organizations. There are different teams moving at different pace. They generate different bits and pieces of properties of the same entity, but those properties come from different sources with different cadence. So we want to embrace that almost complexity in chaos, but yet create a system that gives the same outcome that a single source of truth wants to give, which is if I search for information about customer, even though that information comes from different places, I can understand a particular snapshot of the customer at a point in time has a sort of consistent values. That’s why I challenge this notion of a single source of truth.

So data mesh, and this is what people don’t actually read or understand, has a lot of constraints and disciplines built into it. For example, the data that data nodes provide are read-only. They never change. They’re temporal. They have two timestamps. Bitemporal. So at any point in time, they’re streamed across different nodes in a way that if new data arrives upstream, the downstream nodes that are copying that and transforming it into a new shape of the data get notified. They have a responsibility to either react to it and generate a new slice of the data at this point in time, or not.
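As a rough illustration of read-only, bitemporal data, here is a minimal Python sketch. The names `Slice`, `ReadOnlyLog`, `actual_time`, and `processing_time` are assumptions made for this example, as is the idea of modelling a correction as a newly appended slice. It is not a reference implementation of any data mesh platform.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)  # frozen: a published slice can never be mutated
class Slice:
    actual_time: datetime      # when the facts were true in the business
    processing_time: datetime  # when the data product recorded them
    listeners: int


class ReadOnlyLog:
    """Append-only store: corrections are new slices, never in-place updates."""

    def __init__(self) -> None:
        self._slices: list[Slice] = []

    def append(self, new_slice: Slice) -> None:
        self._slices.append(new_slice)

    def as_of(self, actual_time: datetime, known_by: datetime) -> Optional[Slice]:
        """Return what was known about `actual_time` at moment `known_by`."""
        candidates = [
            s for s in self._slices
            if s.actual_time == actual_time and s.processing_time <= known_by
        ]
        return max(candidates, key=lambda s: s.processing_time, default=None)


log = ReadOnlyLog()
jan1 = datetime(2022, 1, 1)
log.append(Slice(jan1, datetime(2022, 1, 2), listeners=100))
log.append(Slice(jan1, datetime(2022, 1, 5), listeners=120))  # later correction

early_view = log.as_of(jan1, known_by=datetime(2022, 1, 3))
late_view = log.as_of(jan1, known_by=datetime(2022, 1, 6))
```

Two consumers asking about the same day at different moments each get a consistent answer for their point in time, and neither answer is ever overwritten.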

And then also data mesh provides a means of stitching these polyseme concepts, like a customer that comes from the call center versus a customer that comes from the eCommerce platform. The polyseme concept of the customer can be stitched together by the consumer because there are links that are created between those systems or between those data products. So there is a set of constraints. There is a set of operational disciplines, like this polyseme linkage and bi-temporality and immutable data, that still results in the same outcome as a single source of truth. But it’s designed for an inherently complex business model and operating model, if that makes sense.
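One way to picture polyseme linkage is as a shared identifier that lets a consumer stitch together views of the same customer from different data products. The sketch below assumes, purely for illustration, that both products expose a common `customer_id`; the product names, fields, and the `stitch` helper are hypothetical.

```python
# Two hypothetical data products describing the same "customer" polyseme.
call_center_customers = [
    {"customer_id": "C-1", "open_tickets": 2},
    {"customer_id": "C-2", "open_tickets": 0},
]
ecommerce_customers = [
    {"customer_id": "C-1", "orders": 5},
    {"customer_id": "C-3", "orders": 1},
]


def stitch(global_id: str, *products: tuple) -> dict:
    """Consumer-side stitching: follow the shared identifier across products.

    Each product is passed as a (name, rows) pair. The shared `customer_id`
    link is an assumption of this sketch.
    """
    view: dict = {"customer_id": global_id}
    for product_name, rows in products:
        for row in rows:
            if row["customer_id"] == global_id:
                # Keep each domain's attributes under its own namespace.
                view[product_name] = {
                    key: value for key, value in row.items() if key != "customer_id"
                }
    return view


customer_view = stitch(
    "C-1",
    ("call_center", call_center_customers),
    ("ecommerce", ecommerce_customers),
)
```

Each domain keeps its own shape of the customer; the consumer composes them only when it needs a combined view.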

Henry Suryawirawan: And you mentioned it as the most relevant copy, right? So maybe you don’t get the latest up to date. So it’s like the concept of this asynchronous or maybe eventual consistency. Maybe you just need a snapshot of a data at certain point in time, and yet the consumer will decide, okay, I just need this kind of data instead of always getting the latest.

Zhamak Dehghani: Exactly.

[00:35:57] Principle of Data as a Product

Henry Suryawirawan: Let’s move on to the next principle, which is a principle of data as a product. And I see that you are applying product thinking to this principle. So tell us more why it is important to treat data as a product?

Zhamak Dehghani: Yeah. I think it’s a mind shift that needs to happen, where we put data sharing and serving and delighting the experience of people using data as a first-class concern. It’s also an antidote to the first principle. The problems of the first principle. So domain-oriented data ownership, one can imagine, can lead to data siloing. In the player domain, I’ve got the data that I need to improve my application. So why should I care about sharing that and be responsible for consumers, which adds a ton of work? So data as a product tries to incentivize people to share that data, to be part of an ecosystem that is generating value through data exchange and through data sharing. And again, to put some discipline and constraints in place for that to be done effectively.

So it defines such roles of people that are responsible for that, defines success metrics for data as a product, defines an architectural concept. If we define usability characteristics around discoverability, addressability, like all the things that make the experience of a data user really easy, really delightful, those need to be translated into structural architectural components that are built into this data as a product. So then, you have kind of technology that needs to shift and change. So yeah. So in short, it’s an antidote to the problems that arise from the first principle and also really focus on, again, getting value from data. Who gets value from data? Data users do. So let’s put them first.
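To make usability characteristics such as discoverability and addressability a little more concrete, here is a small Python sketch of a catalog of data product descriptors. Everything in it is an assumption for illustration: the `DataProductDescriptor` fields, the `Catalog` class, and the `mesh://` address scheme are invented names, not part of any standard or of the book.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProductDescriptor:
    address: str       # addressability: one stable, unique name (hypothetical scheme)
    owner: str         # the accountable data product owner
    description: str   # human-readable text used for discovery
    tags: frozenset = field(default_factory=frozenset)  # lowercase keywords


class Catalog:
    """Hypothetical registry where domains publish their data products."""

    def __init__(self) -> None:
        self._by_address: dict = {}

    def publish(self, descriptor: DataProductDescriptor) -> None:
        if descriptor.address in self._by_address:
            raise ValueError(f"address already taken: {descriptor.address}")
        self._by_address[descriptor.address] = descriptor

    def resolve(self, address: str) -> DataProductDescriptor:
        """Addressability: look a product up by its unique address."""
        return self._by_address[address]

    def discover(self, keyword: str) -> list:
        """Discoverability: find products by description text or tag."""
        needle = keyword.lower()
        return [
            d for d in self._by_address.values()
            if needle in d.description.lower() or needle in d.tags
        ]


catalog = Catalog()
catalog.publish(DataProductDescriptor(
    address="mesh://player/play-events",
    owner="player-team",
    description="Play events emitted by the player application",
    tags=frozenset({"listeners", "events"}),
))
found = catalog.discover("play events")
```

The design choice worth noting is that the owner is part of the descriptor, so a data user who finds a product also finds who is accountable for it.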

Henry Suryawirawan: And I like the way you explained why the concept of the product is apt here. Because you mentioned that successful products need three attributes: they are feasible, valuable, and usable. And I think traditionally, again, we just treat data as, okay, this is just data. You go and figure it out. So sometimes it’s not usable. Sometimes maybe the query language is different. The database technology is different because we all have these polyglot database technologies. Or maybe it’s so ancient, right? Legacy technologies that we don’t know how to deal with. So I think if we treat it as a product, we also need to think about the usability aspect. That was definitely a key point for me when I read about this principle.

[00:38:17] Data Product Owner & Developer

Henry Suryawirawan: There are also new roles being created because of this concept of data as a product. You mentioned the data product owner and the data product developer. Not all teams have these roles yet. Tell us more about the importance of these two roles.

Zhamak Dehghani: Yeah. So, if the data product becomes a thing that we are creating, maintaining, operating, evolving, and retiring when it’s useless and nobody uses it, then there have to be people and roles to take that responsibility on. And it’s very unfair, I think it is impossible, to say to app developers, whose work and consumers are very different personas, like they’re serving the end user, to say, oh, now from here on you also have these other responsibilities. So not only are you serving these end users that are interacting with the player application, pressing buttons, and making sure they have a responsive app and all of those great things.

But also, half of your day, you have to turn around and face these data analysts and data scientists in different domains who want to use the analytical data that you’re generating. You’ve got to serve them too. It’s impossible. You can’t have two bosses. You can’t serve these two purposes at the same time. So if there are super humans that can do both jobs, so be it, that’s fine. Maybe it is possible to share your time and split your time that way. But nevertheless, you need to have a very explicit responsibility and accountability for that part of the job. If the data product developer and the app developer are pairing and collaborating closely, but there are two different people playing these roles, that’s fine too. So this idea of the data product doesn’t happen out of good intentions alone. We have to allocate space. We have to empower people. We have to keep people accountable and responsible. Hence the roles.

Henry Suryawirawan: Yeah. And also not to mention the skill set. They are totally different technologies, different paradigms. I think it’s very difficult to find people who can master both application development and data engineering. For data to be usable as a product, you also mentioned other usability attributes besides discoverability and addressability. It should also be understandable, trustworthy, and interoperable. So imagine if you have multiple consumers, and they want to access your data, but they have a set of constraints on how they would integrate with your data. I think interoperability is also a good concern that the data domain owner should think about. And that’s why I think having a data product owner who can define a set of requirements and all these usability concerns is key to why these roles exist.

[00:40:51] Principle of The Self-Serve Data Platform

Henry Suryawirawan: So let’s move on to the next principle, which is the principle of the self-serve data platform. I think this is also interesting because you apply platform thinking to data mesh. Why should we have a self-serve data platform?

Zhamak Dehghani: Yeah. I think it’s an obvious one. But let’s go deeper into it and maybe answer: what is this data platform? So when we think about the roles of platforms in general, platforms are often shared infrastructure on top of which you build domain specific solutions. So they’re often domain agnostic infrastructure that empowers other teams to build solutions on top. So very horizontal, not really vertical so much. So in an organization that’s implementing data mesh, now you have these domain teams. They have app dev folks in them. You have data product folks in them, and they’re doing their daily job. Their daily job should be focused on delivering value based on the outcomes of that domain. Their daily job should not be focused on meta work, creating the domain agnostic pieces of technology that they need. So I think platforms are a means to empower autonomous teams, to lower their cognitive load, to do what they need to do more easily. They’re wonderful. They’re necessary.

In terms of data mesh, the data platforms, many of them exist. There is lots of technology out there. They’re built to give basic tools. Like you want storage? Sure. I will give you storage. I can provision that for you. You want workflow processing? That’s fine. I’ll give you that. But those tools are too low level for developing data products. So we need a new layer of the platform that really takes the data product, at least the way data mesh defines and envisions it, treats it as a first class concern, and hides away those details of, oh, I need a storage, a pipeline or this or that. And it gives life to a completely new concept. This new concept of data quantum and data product. So the job of that platform, the reason I put it in, was making it feasible for independent domain teams to do data work.

Henry Suryawirawan: And some of the attributes that you mentioned for this data platform are that it should be autonomous, interoperable, and domain agnostic. One of the challenges when I was working with data related stuff is that, yeah, you have all these tools, like you mentioned, but bootstrapping all these tools takes a long time. You have to set up clusters. You have to maybe install things, dependencies, and on top of that, you have to write code. You have to understand the source, the sink, and things like that. And then you have to write the pipeline itself and then deploy it. It takes a lot of effort just to come up with a very simple pipeline. And I could imagine having this kind of self-serve data platform, maybe something like a UI console where you can just log in and click, I want this data from this source moved to my sink. And then it just creates everything for you. I think that would be a perfect scenario, though maybe some of the tools are not yet catching up. But hopefully one day we will reach that experience for the data engineer, or maybe for business people, so they don’t even need to deal with data engineers themselves.
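
The experience Henry describes here, declaring a source and a sink and letting the platform create the rest, can be pictured with a small Python sketch. Everything in it is a hypothetical illustration rather than a real platform API: the `PipelineRequest` fields, the `plan_provisioning` function, and the example names are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineRequest:
    """What a domain team declares. Every field name here is hypothetical."""
    data_product: str
    source: str            # e.g. a table in an operational database
    sink: str              # e.g. a location in an analytical store
    schedule: str = "daily"


def plan_provisioning(request: PipelineRequest) -> list[str]:
    """Expand a declarative request into the low-level steps a team
    would otherwise perform by hand (storage, connector, workflow, deploy)."""
    return [
        f"provision storage for {request.sink}",
        f"create connector from {request.source}",
        f"schedule {request.schedule} workflow for {request.data_product}",
        f"deploy pipeline for {request.data_product}",
    ]


# Hypothetical request from the "player" domain used in the conversation.
request = PipelineRequest(
    data_product="play-events",
    source="player_db.events",
    sink="analytics/play_events",
)
for step in plan_provisioning(request):
    print(step)
```

The point of the sketch is the shape of the interaction: the team states what it wants, and the platform owns how the storage, workflow, and deployment details are carried out.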

[00:44:00] Agnostic Data Platform

Henry Suryawirawan: So you mentioned about this concept of self-serve data platform. Because one of the challenges of building this platform is that you will need to make it agnostic. So I think this is probably one of the challenges because we have so many different data technologies. So how should we think about building this platform so that it becomes agnostic? Because yeah, we have so many technologies, right? We have so many different shapes of database technologies.

Zhamak Dehghani: Yeah. I mean, it depends. The agnosticity, I guess, of the platform and its independence from the underlying technologies, the level of it depends on the appetite of an organization and how independent they want to be. So I think what we need is interoperability between the different technologies. So if we build the solution on top and the solution requires data from across two different platforms, there is a level of interoperability so that I can access data across two different clouds or across two different technology stacks. So at the minimum, we need a layer that creates that interoperability even if the vendors themselves are not incentivized to do that right now.

When it comes to being agnostic to the underlying infrastructure, again, I don’t think it’s meaningful in all organizations, because you end up with the lowest common denominator of the set of features that are available on all the platforms, and that’s not ideal. That’s a lot of work and very little result. So I don’t know if you need to be completely agnostic, but we have to have the pieces that enable interoperability and movement. Moving from one to another and removing lock-in as much as possible. And those pieces are usually around crosscutting concerns, right? How do I manage security? How do I build automation so that if tomorrow I want to move to a different set of infrastructure, my processes are automated and not hand cranked? I can, through automation, facilitate the movement much faster. That’s the way I think about being tech agnostic, as opposed to a nice layer on top. I don’t think that’s really realistic.

[00:46:01] Principle of Federated Computational Governance

Henry Suryawirawan: Speaking about crosscutting concerns, this also touches on the next principle, the last principle, which is the principle of federated computational governance. It’s quite a mouthful to mention that. But it’s taking care of all these crosscutting concerns that you mentioned. Things like, for example, security, policies, and so on. You are applying systems thinking to this principle, so that we can govern the data better. So share more about this principle, because to some people, this might be hard to understand.

Zhamak Dehghani: Yeah. It’s a mouthful already. And if I could sneak in another word, I probably would have called this the principle of embedded federated computational governance. But I think Martin Fowler would not have posted my article if I did that. Yeah. So I think that the concept really is, again, an antidote to the problems that arise from the previous principles, which is that we need interoperability. We now have these disparate sets of data products, domain-oriented, with their own teams, their own cadence, their own life cycle. How can we apply a set of concerns that need to be standardized across all of them? And what’s the best way to go about defining them? What’s the best way to go about implementing them and observing and enforcing them? That leads to this principle. So as an example, if you need to have secure data or you need to have high quality data, let’s go with quality: some definition of quality, and then enforcing quality, and sharing quality data.

One way of doing it is to say, okay, I’m going to put a quality control team, my governance team, in the process of generating every piece of data. And these guys are going to sit in the middle and verify, at some point in the life cycle of the data, if this data is acceptable to be shared. That’s where systems thinking comes into play. That system is going to have a massive bottleneck, and it’s not going to scale. So how can we achieve a defined level of quality without necessarily creating these controls? We need consensus on the definition of what constitutes quality. As in, what attributes do we use to describe the quality of data? Is it completeness? Is it integrity? Is it timeliness? Is it all of the above and others? So let’s define those. And in that definition, let’s have subject matter experts. Let’s have domain people, who actually know their data and can articulate quality, involved in defining that. And once that’s defined, let’s automate it. Let’s put it into the platform as a platform capability. The moment you are instantiating a data product, you will get, out of the box, a library or some SDK or something that gives you the ability to calculate, capture and share these quality metrics. And then you will have observability that runs across this mesh, across all of these data products, and captures that information, shares that information, and also validates whether you are meeting the quality requirements that you’ve defined. So that’s the computational part.

And the embedded part that I haven’t put in the title is that this enforcing quality and measuring quality becomes an embedded concern in every single data product. It’s not something that’s smeared over and added later on. It’s actually built into the data product itself from the ground up. So it’s embedded in there. So hopefully that gives a good example of achieving, I guess, a well oiled and cohesive mesh of interconnected data products through embedding policies, the standard policies, in an automated fashion in every data product, and having the teams that are responsible for guaranteeing those policies involved in defining what these policies are.
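
The "computational" part described above, agreed quality attributes that each data product calculates and shares, can be sketched in a few lines of Python. This is only an illustrative sketch under assumptions: the function names, the `event_time` field, and the sample records are hypothetical, and only two of the attributes named in the conversation (completeness and timeliness) are shown.

```python
from datetime import datetime, timedelta, timezone


def completeness(records: list[dict], required: list[str]) -> float:
    """Fraction of records in which every required field is present and non-null."""
    if not records:
        return 0.0
    complete = sum(
        all(r.get(field) is not None for field in required) for r in records
    )
    return complete / len(records)


def timeliness(records: list[dict], now: datetime, max_age: timedelta) -> float:
    """Fraction of records whose 'event_time' is no older than max_age."""
    if not records:
        return 0.0
    fresh = sum(now - r["event_time"] <= max_age for r in records)
    return fresh / len(records)


def quality_report(records, required, now, max_age) -> dict:
    """Metrics a data product could publish alongside its data."""
    return {
        "completeness": completeness(records, required),
        "timeliness": timeliness(records, now, max_age),
    }


# Hypothetical play events from the "player" domain.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
records = [
    {"user_id": "u1", "track_id": "t1", "event_time": now - timedelta(minutes=10)},
    {"user_id": "u2", "track_id": None, "event_time": now - timedelta(minutes=30)},
    {"user_id": "u3", "track_id": "t3", "event_time": now - timedelta(hours=3)},
    {"user_id": "u4", "track_id": "t4", "event_time": now - timedelta(minutes=5)},
]
report = quality_report(records, ["user_id", "track_id"], now, timedelta(hours=1))
print(report)  # {'completeness': 0.75, 'timeliness': 0.75}
```

In the setup Zhamak describes, functions like these would ship with the platform so that every data product reports the same metrics in the same way, instead of each team inventing its own.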

Henry Suryawirawan: Yeah. We can borrow things from application development, right? We have this concept there as well, policy as code. There are some tools for Kubernetes clusters where you can embed this kind of policy. So before you apply something, it is checked against the policy, and if it doesn’t comply, it is rejected. Data governance is probably one of the least sexy parts of data management. Because there are things like PII and data security, where data should not be leaked or exposed. Or things like data quality, right? How much can the data lag, for example? And I think all of these definitely need to be governed, because otherwise it’s really difficult.
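
The policy-as-code idea Henry mentions, check before applying and reject on non-compliance, might look like the following Python sketch for a data product. The descriptor layout, the `pii`, `masked` and `max_lag_minutes` keys, and the two rules are all assumptions made for illustration. They are not taken from the book or from any particular tool.

```python
def check_policies(descriptor: dict) -> list[str]:
    """Return the policy violations for a data product descriptor.
    An empty list means the descriptor complies."""
    violations = []
    for column in descriptor.get("columns", []):
        # Hypothetical rule: PII columns must be masked before sharing.
        if column.get("pii") and not column.get("masked"):
            violations.append(f"PII column '{column['name']}' is not masked")
    # Hypothetical rule: every data product must declare how far it may lag.
    if descriptor.get("max_lag_minutes") is None:
        violations.append("no maximum data lag declared")
    return violations


def admit(descriptor: dict) -> bool:
    """Admission-style check: reject when any policy is violated."""
    return not check_policies(descriptor)


descriptor = {
    "name": "play-events",
    "max_lag_minutes": 60,
    "columns": [
        {"name": "email", "pii": True, "masked": False},
        {"name": "track_id", "pii": False},
    ],
}
print(check_policies(descriptor))  # ["PII column 'email' is not masked"]
print(admit(descriptor))           # False
```

Because the rules are ordinary code, they can run automatically every time a data product is created or changed, which avoids the central review bottleneck discussed earlier.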

[00:50:20] Data SLO

Henry Suryawirawan: And you mentioned about observability and you used the concept from SRE, where you have also data SLO. So maybe if you can touch a little bit about this data SLO.

Zhamak Dehghani: Yeah, absolutely. So with data products, for people to trust your data, you need to share a set of almost real-time information, or at least as real time as your data is, to give people trust that this is suitable data. So, again, the dimensions of that, I think I unpacked in the book. I probably don’t remember all of them off the top of my head, but the dimensions of that are around quality. They’re around timeliness. They’re around completeness. There’s a whole set of, often in the language of big data people, they call it metadata. As language, I don’t like it, and I don’t want to use it, because it’s just a catch-all bag of all things. But there are classes of information, additional data, that you’ve got to provide.

For what purpose? So that the people that want to directly self-serve and use the product can self-assess if this data suits their use case or not. As an example, the distribution of the data. So if I’m doing an analysis or training a machine learning model for a particular use case, I’d perhaps like to have a very, I don’t know, nice bell curve distribution of the data and the samples that I can get for training that machine learning model, rather than biased data. So how biased is the data? So, again, SLOs in app dev are more about uptime and response time and downtime and so on. And in the data world, the computational part of it still has those concerns. But the data part of it has a different set of concerns that define the usability metrics of data.
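
A data SLO in this sense can be pictured as a published objective that is compared against a measured value, so that a consumer can self-assess the data. The Python sketch below rests on assumptions throughout: the metric names, the thresholds, and the use of the most frequent label's share as a crude stand-in for "how biased is the data" are illustrative choices, not definitions from the book.

```python
from collections import Counter


def majority_share(labels: list[str]) -> float:
    """Share of the most frequent label. 1.0 means only one class is present."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def evaluate_slos(measured: dict, objectives: dict) -> dict:
    """Compare measured metrics with objectives.
    objectives maps metric name -> (kind, threshold), where kind is
    'min' (value must be >= threshold) or 'max' (value must be <= threshold)."""
    results = {}
    for name, (kind, threshold) in objectives.items():
        value = measured[name]
        results[name] = value >= threshold if kind == "min" else value <= threshold
    return results


# Hypothetical genre labels in a training sample drawn from a data product.
labels = ["pop"] * 7 + ["jazz"] * 2 + ["rock"]
measured = {
    "completeness": 0.98,
    "freshness_minutes": 45,
    "majority_share": majority_share(labels),  # 0.7
}
objectives = {
    "completeness": ("min", 0.95),
    "freshness_minutes": ("max", 60),
    "majority_share": ("max", 0.6),
}
print(evaluate_slos(measured, objectives))
# {'completeness': True, 'freshness_minutes': True, 'majority_share': False}
```

Here the data is complete and fresh enough, but a consumer training a model would see that one class dominates the sample and could decide the data does not suit that use case.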

Henry Suryawirawan: So thanks, Zhamak, for explaining all this. It feels like a real crash course on data mesh. I hope people do study data mesh, maybe by reading your book or your articles or watching some of your talks. I think it’s really eye-opening for people who work with traditional data management. So thank you again for this.

[00:52:23] 3 Tech Lead Wisdom

Henry Suryawirawan: I have one last question before I let you go. So normally, I ask these three things called three technical leadership wisdom. Maybe if you can share some of your wisdom for us maybe to learn from your journey, from your experience or your expertise.

Zhamak Dehghani: Oooh. It’s a hard one. So maybe just a few things that I didn’t do as well, that I can share or things that maybe I did okay. I separate leadership from management. I’m a terrible manager. Now, you don’t want me as a manager, but maybe I’m an okay leader because I believe in a mission. I’m a very mission oriented person. So as a leader, you need to believe in your mission and create a mission oriented team and organization and continuously through communication, through reinforcement to have like embodying the right behavior to achieve that mission. Reinforce that and remind your teams and keep realigning the team. Maybe there are different styles of leadership, but that mission oriented, vision oriented leadership resonates with me. I love working with people like that.

And then to get to that mission, you have two ways of going there. You have a way of leaving maybe some casualties behind, like going in a way that not everyone can catch up, with wounded soldiers along the way. Or you want to make sure that everybody’s aligned. You need to think about every single member of the team, their needs, their pace, their specific hopes, and, really, it’s about not only having the vision and charting the path but also making sure everyone can come along, and thinking beyond yourself. That’s probably an area that I need the most help with personally, because my mission oriented kind of leadership usually has casualties on the way. Your people need to be able to trust you and believe in you. You need to be very self aware in terms of your strengths and your weaknesses. Where do you want to delegate? And where do you want to actually take something on?

And if you are a technical leader, I personally respect technical leaders that still stay close to their craft. They still keep up to date with their craft. As we all know, technology moves really fast, so you have to find a way to keep yourself relevant and up to date. And sometimes that means going really deep for a moment in time, getting your hands dirty and coming back up. And of course, as your scope of leadership grows, your ability to go really deep diminishes, because time just doesn’t allow for that. So carving out space to go deep when it’s needed, even for a very short period of time. I’ve seen some technical leaders do that. I admire people who can strike a balance between the depth and the breadth of knowledge and the relevance of their knowledge.

Henry Suryawirawan: Wow. Really beautiful. Thanks for sharing that. I think it speaks to some of the leaders where they are more vision-driven as well, rather than managing people. So thanks for sharing that.

So maybe, Zhamak, for people to learn more from you, maybe about data mesh or just to reach out and follow up with the discussion, is there a place where they can reach out?

Zhamak Dehghani: Well, as of now, really Twitter and LinkedIn will be the place. I listen to both channels. But hopefully soon my company’s website will be up and we will have jobs and we’ll have places for people to reach out directly through that. But that’s not up yet. But when it is, I will let you know, and you can share with your network.

Henry Suryawirawan: Really excited to hear about that. So many different data mesh technologies, maybe, will be coming from that. So thanks, Zhamak, for your time. Really, a pleasure to have this discussion with you.

Zhamak Dehghani: It was wonderful to be here, Henry. Thank you.

– End –