#70 - Continuous Architecture (Part 2) - Principles, Scalability, and Performance - Pierre Pureur

 

   

“Delay design decisions until it’s necessary. Architecture is an art, not a science. Don’t architect for things you don’t know. Your design decisions should always be built on facts, not guesses.”

Pierre Pureur is the co-author of “Continuous Architecture in Practice” and an acclaimed software architect. In this second of a three-part series of “Continuous Architecture” episodes, Pierre shared his own perspectives on the 6 key principles of continuous architecture. We then discussed in depth two important quality attributes: scalability and performance. For each quality attribute, Pierre described its definition, why it is an important architectural concern, and some of the common tactics used to improve it in modern system architecture.

Listen out for:

  • Career Journey - [00:05:34]
  • Architect Products, Not Projects - [00:07:31]
  • Focus on Quality Attributes - [00:11:25]
  • Delay Design Decisions Until Necessary - [00:13:41]
  • Power of Small - [00:15:58]
  • Architect for Build, Test, Deploy, and Operate - [00:17:40]
  • Conway’s Law - [00:19:53]
  • Essential Activities - [00:23:18]
  • Quality Attribute: Scalability - [00:26:00]
  • Scalability on The Cloud - [00:28:59]
  • Scalability Tactic: Sharding - [00:31:01]
  • Scalability Tactic: Asynchronous Communication - [00:32:58]
  • Quality Attribute: Performance - [00:35:06]
  • Measuring Performance - [00:37:23]
  • Performance Tactics - [00:39:23]
  • 3 Tech Lead Wisdom - [00:41:15]

_____

Pierre Pureur’s Bio
Pierre Pureur is an experienced software architect, with extensive innovation and application development background, vast exposure to the financial services industry, broad consulting experience and comprehensive technology infrastructure knowledge. His past roles include serving as Chief Enterprise Architect for a major financial services company, leading large architecture teams, managing large-scale concurrent application development projects and directing innovation initiatives, as well as developing strategies and business plans. He is coauthor of the book Continuous Architecture: Sustainable Architecture in an Agile and Cloud-Centric World (2015) and has published many articles and presented at multiple software architecture conferences on this topic.


Quotes

Career Journey

  • [Continuous Architecture] back in 2015-16, that was the first continuous architecture book. And the idea was we wanted to really, number one, write down our experiences about architecture, and what we felt were the lessons learned, and what would be useful to others.

  • Also one thing that we’re looking at. There is a little bit of a fight now. And the fight has been going on for over 20 years between the Agile people who say, you know, architecture emerges, and the traditional architects that tend to say, well, you should really build your architecture upfront and not change it.

  • I don’t believe that either of those camps is right. Because yes, architecture can emerge, but you have to be very good at what you’re doing, so it emerges right. If not, you end up with a monster.

  • Creating an architecture upfront, hoping that it’s never going to change, I think it’s crazy. Things change all the time. So that’s what “Continuous Architecture” was trying to do.

Architect Products, Not Projects

  • Architect products, not projects. Projects tend to be short-lived while products are long-lived. So it’s better actually to architect products than projects. Also, if you think products, you start thinking about the discipline of product management. You start thinking about the business, which is very important in the architecture.

  • Quality attributes. People say that any blob of software can handle functional requirements, not very well, but it can. What really makes software tick is the quality attribute requirements. Unfortunately, they are usually hard to define.

  • You need a product roadmap, which is different from the project plan. But what I have seen also, people do something that they call a product roadmap, which looks like a project plan.

  • Basically if you take a product roadmap, the whole concept of base products and derived products, you need to understand what are your base products, how you really build your product roadmap, and how you currently build on top of these base products? Does it get different levels of products? Most importantly, how do you actually adapt to what customers want?

  • One thing that we talk a lot about is feedback loops, which is something we borrowed from Agile. Feedback loops are important. Because honestly, nobody gets it right the first time. You put an architecture there, and you build something and you hope it’s going to work. Well, most of the time, it doesn’t quite work the way you plan it. And unless you have a feedback loop, you can’t really change and adjust your architecture.

  • In a project, you have this fallacy that you’re going to be able to do your architecture and you’ll be done. In reality, you evolve all the time. You change all the time.

Focus on Quality Attributes

  • When we wrote the new book, there are four quality attributes which are very important. No special order. You have scalability, performance, security, and resilience.

  • Scalability is interesting because people 20-30 years ago didn’t really think too much of scalability. Basically, you kind of were hoping your system would be scalable, and of course, at some point, you hit a brick wall and the system is not scalable anymore.

  • Performance, on the other hand, is one that people are very knowledgeable about because a badly performing system basically is hard to use. So you lose your users very quickly.

  • Security is everybody’s problem. It’s not just a problem of security people. It absolutely needs to be built upfront.

  • Resilience is critical. Systems break down, but what happens when they break down? So resilience is becoming a very, very important quality.

Delay Design Decisions Until Necessary

  • “Delay design decisions until it’s necessary” is interesting because architecture is an art, not a science. The whole idea is don’t architect for things you don’t know. So your design decisions should always be built on facts, not guesses.

  • Cost is a very important attribute. Because in the real world, outside of IT, people who architect and design always think of costs. That’s a question we never ask in IT. We never give choices based on, okay, if you spend X, I can build you that. If you spend Y, I can build you that. We never see that. You have a budget and you say, okay, you know, I’m going to make the budget. Most of the time, I’m going to exceed the budget.

  • You need to bake costs into your thinking. Don’t design for things you don’t have to design for. It’s wasteful. When you make a decision, always try to say, do I really need to make that decision now? Or can I delay it a little bit longer?

Power of Small

  • By that, I don’t mean microservices. What I mean is, think small. The idea is it doesn’t mean necessarily think small code, because there are many dimensions of architecture, more than just code. But try to think in terms of modules. Decoupling.

  • Try to think there are some independent modules. Things that you can either deploy on the same box, or same whatever, or different things without major changes. The idea is the smaller those things are, the easier it is to change them and to actually put them in different places should you need to.

  • If you have a choice, small is better than large. So the whole idea is to get the architects to think, do I really need to design a big thing that does a lot of things, or should I go for a module that only has one responsibility? Try to cut it down, just do one thing, do it well.

Architect for Build, Test, Deploy, and Operate

  • Principle number five says “architect for build, test, deploy, and operate”.

  • One of the most important things you can do for scalability is to have a monitoring framework. What’s going on in your system? And also to plan for failure. Because at the end of the day, every system tends to fail. It’s not a matter of if, it’s when it’s going to fail. And you have a choice. You can fail gracefully, and get to a point where your customers don’t even see your system failing, which is great. Or you can just crash, which is not good.

  • How do I really monitor my system? My system is going to be put into production and it’s going to have to run. How do I monitor that? How do I prepare for a failure, because it’s guaranteed that I’m going to run out of capacity somewhere? How can I compensate for that before crashing the whole system? How can I stop a domino effect that basically destroys the system?

  • Another example is performance. If you look at a system like Amazon, it makes you wonder how they can even test for scalability or performance.

  • The key is you build and you architect around testing. So how do I make sure that it’s possible to test that system? And ask, how is it possible to test the system? How is it possible to deploy the system, to operate it?

  • Amazon deploys a new version, I think, in less than one second. It used to be every second, now it’s faster than that. How do I do that? You need to build that into your architecture. Architecture is much more than just architecting code and designing code. It’s the whole life cycle you need to keep in mind.
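
One common way to stop the domino effect Pierre describes is a circuit breaker: after repeated failures, the caller stops hitting the broken dependency and serves a fallback instead. The sketch below is only an illustration and is not something described in the episode; the class name `CircuitBreaker`, its thresholds, and the injected `clock` parameter are all hypothetical.

```python
import time


class CircuitBreaker:
    """Hypothetical sketch: stop calling a failing dependency so the failure does not cascade."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so the cool-down can be tested
        self.failures = 0
        self.opened_at = None             # None means the breaker is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()         # open: skip the dependency entirely
            self.opened_at = None         # cool-down over: allow one trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the breaker
            return fallback()
        self.failures = 0                 # success closes the breaker
        return result
```

With a fallback such as a cached value, customers see a degraded answer rather than an error while the dependency recovers.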

Conway’s Law

  • The principle six is really Mel Conway’s principle. The idea is really, if you have your organization, which is really organized around strata, around layers of people, one of the best ways to avoid that is to actually group people vertically instead of horizontally. That was Mel Conway’s insight.

  • The whole idea is that the design of a system tends to mirror the organization of the teams. So you organize your people vertically: instead of having silos of front-end, mid-tier, and back-end, you now have your front-end, mid-tier, and back-end in one team, so they can talk to each other easily.

  • The idea is you create teams with those people because you have to have multiple people with multiple skills. They actually start working together and communicating, and you’ll see problems go away.

  • When I was running an innovation center, one thing I learned is the value of sitting people together. Actually, we didn’t have offices. We just had desks in an open space, and communication problems absolutely went away.

Essential Activities

  • Principles are a nice framework. They really define how we think about architecture. But they are not very actionable. So in order to be actionable, you need what we call the essential activities. And essential activities are things that you need to think about day-to-day.

    • The first one is to focus on quality attributes.

    • The second thing is driving architecture decisions. Decisions are actually the most important thing that an architect can do. The most important thing an architect can do is deal with hard decisions, and make sure that the decisions are properly documented somewhere in a log. By the way, the best way to do that is probably to keep your decisions next to your code.

    • The third one is technical debt. Anything you do may increase the technical debt. There’s no escaping that.

      • No matter what I do, anything I do on developing a project is going to increase or decrease technical debt. Technical debt is not necessarily bad. Sometimes, you basically have to take technical debt because you just have to deliver a product quickly, and you have to take shortcuts. That’s technical debt.

      • What’s bad is when you don’t repay technical debt. Because technical debt accrues, and interest on technical debt can run very high. When you have to repay it, it’ll be an “ouch” moment. And the more you wait, the worse it gets.

    • The fourth of the essential activities is feedback loops. Nobody gets the architecture or design right the first time. You take some guesses. You take some bets, and you put something out there. And you need to get a loop to understand how far you are from the target, and to correct course.
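
The advice to log decisions and keep them next to the code could look something like the sketch below: each decision becomes a small text file that lives in the repository. This is an assumption about one possible format, not the format used in the book; the `Decision` class, its fields, and the file naming scheme are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """Hypothetical decision-log entry meant to be stored alongside the code."""

    number: int
    title: str
    context: str       # the facts that forced the decision
    decision: str      # what was chosen
    consequences: str  # trade-offs, including any technical debt taken on

    def filename(self) -> str:
        # Zero-padded number keeps the log sorted in a directory listing.
        slug = "-".join(self.title.lower().split())
        return f"{self.number:04d}-{slug}.md"

    def render(self) -> str:
        return (
            f"# {self.number}. {self.title}\n\n"
            f"Context: {self.context}\n\n"
            f"Decision: {self.decision}\n\n"
            f"Consequences: {self.consequences}\n"
        )
```

Because the entries are plain files, they are versioned and reviewed together with the code they affect.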

Quality Attribute: Scalability

  • 20-30 years ago, very few people talked about scalability. In those old days when systems were running on mainframes, IBM would come up with a bigger mainframe and you were scalable. And then, we moved to distributed systems.

  • I define scalability in the following way: it’s really a property of a system to be able to handle an increased or decreased workload by increasing or decreasing the cost of the system. Again, back to the cost.

  • People get the increase part very well. They say, okay, so I’m going to basically increase my workload. I’m going to be able to handle an increased workload by increasing the cost of the system. But you should also be able to decrease the cost when you handle decreased volumes. And not many systems can do that. I think that’s an important thing.
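
Pierre’s definition ties workload to cost in both directions. A minimal sketch of that relationship, assuming a horizontally scaled system where each instance handles a fixed load; the capacity and price figures are invented for illustration only:

```python
import math


def instances_needed(requests_per_sec, capacity_per_instance, min_instances=1):
    """Smallest number of instances that can serve the given workload."""
    return max(min_instances, math.ceil(requests_per_sec / capacity_per_instance))


def hourly_cost_cents(requests_per_sec, capacity_per_instance=500, price_cents=10):
    """Cost follows workload in both directions: up at peak, down when quiet."""
    return instances_needed(requests_per_sec, capacity_per_instance) * price_cents


for load in (200, 2000, 300):  # quiet, peak, quiet again
    print(load, instances_needed(load, 500), hourly_cost_cents(load))
```

A system that stays at its peak instance count after the load drops handles the "increase" half of the definition but not the "decrease" half.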

Scalability on The Cloud

  • Cloud is very important nowadays. A lot of systems run in the cloud. But what people forget, they think that putting your system and app in a cloud is going to make it scalable. Magic happens. And they think that basically scalability is the problem of the cloud provider.

    • Well, it’s not. If it scales badly on your own infrastructure, there’s no way that putting it on the cloud is going to make it better. The only thing you will achieve is making it more expensive. Yeah, because now you have an inefficient system trying to scale. The system may be able to cope with additional volumes, but at what cost?

  • This goes back to the principle about architect for build, test, deploy, and operate. That’s exactly the same idea, because you need to know, when you architect your system, where you are going to run that system. The challenge is that unless your system is well designed for the cloud, it’s going to cost you an arm and a leg to run it there.

  • There are tactics. And the idea of vertical scalability is we solve the scalability problem by running on a bigger box. That works well until you hit your ceiling, because how do you find a bigger box? The point of scalability is there’s always a price to be paid.

  • There’s always a fine line to be walked on between, do I rely on vertical scalability to a point? And then do I switch to horizontal? Am I planning to actually change the system to re-architect for horizontal scalability? Am I really able to run efficiently on the cloud? Those are the questions that architects must really ask themselves.

Scalability Tactic: Sharding

  • The first part of the system where you’re going to hit a scalability issue is usually the database. Databases are the first thing that needs to be scaled. The bad news is that it is also the hardest thing to scale. So you can go for a bigger database, but that has a limit. So you start distributing your data. You start replicating your data, partitioning it, and then sharding it.

  • I mentioned sharding last because honestly, it’s a hard technique to use. And remember, delay design decisions until you have to. Unless you’re absolutely sure you need sharding, and you’re absolutely sure that you got a reason to do that, I would say, try to stay away from that. Cause you’re going to make your code much more complex, and complexity is bad.

  • The whole point of architecture is to decrease complexity. So if you architect on purpose for complexity, it gets you to a paradox. You really need to avoid complex techniques.

  • Sharding tends to depend on database. Sharding is not offered by all DBMSes.

  • Keep in mind at the end of the day, the skills of the team are probably what drives the success or failure of your project or your product.
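
To make the added complexity concrete, here is a minimal sketch of hash-based sharding, one common way to decide which shard holds a record. It is an illustration under stated assumptions, not a technique Pierre specifies: `shard_for` and `ShardedStore` are hypothetical names, and the dictionaries stand in for separate database servers.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """In-memory stand-in for several database shards."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

Even this toy version hints at the cost: changing the number of shards remaps most keys, and any query that is not keyed has to visit every shard. That is consistent with the advice to delay sharding until it is clearly needed.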

Scalability Tactic: Asynchronous Communication

  • We all know messaging, and that async is definitely better than synchronous for both scalability and performance. Here is the problem. Not many people are familiar with an async model. People read about it. We all know the concepts. But being able to actually efficiently code for it is a different story.

  • Asynchronous is more efficient than synchronous because the idea is you send a message, you put that on some kind of bus or some kind of message carrier. It’s picked up, so you don’t have to block until you get a response.

  • It’s easier to start with synchronous because that’s what people are used to, but you should also ask: okay, I’m going to hit some limitations. How can I avoid painting myself into a corner and having to do a major rewrite when I switch to asynchronous?

  • Keep in mind, the cost is going to be important. Cost in terms of money and time. My message here is to be very careful. Companies like Amazon, Google use asynchronous because they have to. But unless you have to and you know you’re going to have to, try not to do that.
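
The "send a message and don’t block" idea can be sketched with an in-process queue and a consumer thread. This is a simplified stand-in for a real message bus, chosen only for illustration; `worker` and `send_all` are hypothetical names.

```python
import queue
import threading


def worker(inbox: queue.Queue, results: list) -> None:
    """Consumer: picks messages off the queue and processes them."""
    while True:
        message = inbox.get()
        if message is None:  # sentinel: no more messages
            break
        results.append(f"processed {message}")


def send_all(messages):
    """Producer: enqueue every message without waiting for it to be processed."""
    inbox = queue.Queue()
    results = []
    consumer = threading.Thread(target=worker, args=(inbox, results))
    consumer.start()
    for message in messages:
        inbox.put(message)  # returns as soon as the message is queued
    inbox.put(None)
    consumer.join()         # only here do we wait for the consumer to finish
    return results
```

The sketch also shows where the difficulty starts: the sender no longer gets a result from the call itself, so errors, ordering, and retries all have to be handled separately.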

Quality Attribute: Performance

  • Performance is interesting because that’s one that you don’t have to warn people about. Scalability, you do have to warn people that it’s important. But the challenge is many people confuse scalability and performance, and they are very different.

  • Performance is about timing. So that’s very clear. Now they also have a relationship, because when scalability gets bad, when the system hits or gets close to a scalability limit, performance just becomes abysmal.

  • Assuring performance is really, to me, same as scalability. First, document your requirements. Requirements for performance and scalability are usually very badly documented. And that’s why in the book, we advise them to use scenarios.

  • The idea of scenarios is to define your requirements precisely, in terms of what you can test. A scenario has a stimulus, a response, and a measurement. The stimulus is what’s going to happen to your system. It could be just that someone needs to do a transaction. The response is how the system is going to respond to that. And the measurement is how long it is going to take.

  • What we find is that by going through the discipline of writing those things down, instead of just saying my system must be fast, you ask: what do I mean exactly? If you do that, two things happen. Number one, your designers start thinking much more in practice about what they need to design. And number two, your testers know what to test, which is very important. If all you’re being told is that the system must be fast, how do you test that?
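
A scenario with a stimulus, a response, and a measurement maps naturally onto a small testable structure. The sketch below is one hypothetical way to express it; the `Scenario` fields, the `meets` helper, and the 0.5 second limit are assumptions for illustration, not values from the book.

```python
import time
from dataclasses import dataclass


@dataclass
class Scenario:
    stimulus: str       # what happens to the system
    response: str       # how the system is expected to respond
    max_seconds: float  # measurement: the limit the response must meet


def meets(scenario: Scenario, action) -> bool:
    """Run the action once and compare its duration to the scenario's limit."""
    start = time.perf_counter()
    action()
    elapsed = time.perf_counter() - start
    return elapsed <= scenario.max_seconds


submit_order = Scenario(
    stimulus="A user submits one order",
    response="The order is confirmed",
    max_seconds=0.5,
)
```

A single timed run is noisy, so a real test would likely repeat the action and check a percentile, but even this form gives testers a number to test against instead of "the system must be fast".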

Measuring Performance

  • Most of the time, we aren’t working in isolation. Most of the time, we’re working with systems that we don’t necessarily control.

  • The whole concept of scalability and performance is changing, because the code I’m going to write is probably going to have to live within an ecosystem I don’t control. That puts performance and scalability in a different light. Because now, how can I control the performance of something like Salesforce, whose architecture I know but can’t control?

  • Having said that, monitoring is very important. And instrumentation is very important. Remember principle number [five]: architect for build, sure, but also test, deploy, and run. So make sure that you actually build that instrumentation, those calls, those pieces of code that you need to manage your system.

  • No matter how small your system is, you’re going to need some instrumentation. And you’re going to need to measure performance at each level. If you don’t measure, there’s no way you can actually design efficiently.

  • The same way, remember, dealing with failure. You need to put code in place that will allow you to deal with failure. So you need to be able to fail gracefully. If you think that it’s expensive to do that, think about the cost of not doing it. It’s quite mind-boggling. Think about your system just stopping dead in its tracks in the middle of a trading day.
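
Instrumentation can start very small. As a minimal sketch, assuming an in-memory `timings` dictionary in place of a real metrics system, a decorator can record how long each call takes; `timed` and `lookup_price` are hypothetical names.

```python
import functools
import time

timings = {}  # function name -> list of call durations in seconds


def timed(func):
    """Record how long each call takes, whether it succeeds or fails."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # finally runs on both normal return and exception
            timings.setdefault(func.__name__, []).append(time.perf_counter() - start)
    return wrapper


@timed
def lookup_price(symbol):
    return {"ACME": 12.5}.get(symbol)
```

Applying the same decorator at each level of the system gives the per-level measurements Pierre asks for, without changing the functions being measured.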

Performance Tactics

  • You have two types of tactics. The first one is about controlling the resource demands (such as prioritizing requests) and the second one is about managing the supply of resources. So caching is a good example of the second type. Of course, it is not a new one, but it is a very important one.

  • On the database side, you have a lot of tactics. NoSQL is the most famous one. Materialized views, indexes, and so on and so forth. But remember all these techniques, all these tactics come with a cost. NoSQL is a great idea. NoSQL databases usually perform better than SQL databases. It depends on the use case, right? If you have structured data, well-defined data, SQL databases are probably a better choice. If you have data that is not well defined or not structured and so forth, NoSQL is a better choice.

  • However, you fall into the skill issue, because do the skills in your team match your architecture? It makes no sense to design something that you can’t build.
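
The two families of tactics can each be sketched in a few lines: a cache on the supply side and a priority queue on the demand side. This is an illustrative sketch only; `load_profile` and `drain_by_priority` are hypothetical, and the dictionary returned by `load_profile` stands in for a slow database query.

```python
import functools
import heapq


@functools.lru_cache(maxsize=128)
def load_profile(user_id):
    """Supply-side tactic: cache results so repeat requests skip the slow backend."""
    return {"id": user_id}  # stands in for a slow database query


def drain_by_priority(requests):
    """Demand-side tactic: serve (priority, name) requests, lowest number first."""
    heap = list(requests)
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

`load_profile.cache_info()` reports hits and misses, which is a cheap way to check whether the cache is earning its keep. The cost Pierre mentions shows up here too: cached results can go stale, and low-priority requests can wait a long time.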

3 Tech Lead Wisdom

  1. Architecture is a skill, not a role.

    • The whole concept of being an architect is morphing into architecture belongs to the team. Everybody in a team has to have architectural skills.

  2. Architecture requires a continuous flow of decisions.

    • Architecture is morphing into a continuous flow of decisions. You make decisions. You document them. Everybody knows what decisions are. They need to be well communicated.

    • Everybody needs to be part of the decision. You should not make that decision in an ivory tower. You should have the whole team participating here, understanding the pluses and minuses.

  3. Always plan for monitoring and dealing with failure.

    • Your system will fail. Response time may not be the one you expect.

    • You really need to be alerted on what’s going on in the system, which part of the system is slowing down. Same thing for failure. If part of the system is going to start turning hot, you need to understand that before the whole thing crashes.

Transcript

[00:00:45] Episode Introduction

[00:00:45] Henry Suryawirawan: Hello to all of you, my friends and listeners. Welcome to the last 2021 episode of the Tech Lead Journal podcast. I’ll be taking a couple of weeks break until our next episode in 2022. Happy holidays! I hope that you are enjoying your end of year break, and I wish you a wonderful New Year 2022. This time of the year is always a good time to do some reflections of all the things that have happened this year: the good things, the bad things, and the things that we can definitely do better in the next year ahead.

This year, Tech Lead Journal has released a total number of 50 episodes. And it has also surpassed the 50,000 total number of all-time plays a couple of weeks ago. Thank you so much for listening and for your continuous support this year. It really, really means a lot to me. Out of those 50 episodes, do you have any favorite Tech Lead Journal episodes that you listened to this year? If you do, I would encourage you to share those favorite episodes in the social media, tagging Tech Lead Journal, and tell us why you like those episodes and what you learn out of it.

And if you’re listening to this episode on Spotify, Spotify has recently released a new feature that allows you to give a rating on podcasts. I would really appreciate it if you can take a pause and leave the show a rating on Spotify. It will help me a lot to increase the discoverability of this podcast to more people.

And if you’re new to Tech Lead Journal, I invite you to also subscribe and follow the show on your podcast app and our growing social media communities on LinkedIn, Twitter, and Instagram. And if you have been regularly listening and enjoying this podcast, you can also support the show by subscribing as a patron at techleadjournal.dev/patron.

My guest for today’s episode is Pierre Pureur. Pierre is the co-author of “Continuous Architecture in Practice” and a highly experienced software architect. This episode is the second of the three-part series on Continuous Architecture. Previously in episode 67, the first in this series, I had Murat Erder sharing his insights on Continuous Architecture’s principles and essential activities.

In this episode, Pierre shared his own perspectives on the six key principles of Continuous Architecture. And we then discussed in-depth the two important quality attributes covered extensively in the book, which are scalability and performance. For each quality attribute, Pierre described the attribute definition, why it is an important architectural concern that we all should put attention to when designing our systems, and some common tactics used to improve the attribute in the modern system architecture.

I really enjoyed my conversation with Pierre, continuing my learning on Continuous Architecture, deepening my understanding on the six key principles, and the two important quality attributes: scalability and performance. If you also enjoy and find this episode useful, I encourage you to share it to someone you know who would also benefit from it. And you can also leave a rating and review on your podcast app, or share some comments on the social media on what you enjoy from this episode. So let’s get our episode started right after our sponsor message.

[00:04:36] Introduction

[00:04:36] Henry Suryawirawan: Hello, everyone. Welcome back to another new episode of Tech Lead Journal podcast. Today I have with me another co-author of the book “Continuous Architecture in Practice”. His name is Pierre Pureur. So Pierre is actually a very, very experienced software architect, enterprise architect. He has an extensive software development background for sure. He has been Chief Enterprise Architect for a major financial services company. He has also directed large architecture teams, doing all these large-scale projects. And yeah, as I mentioned, he has co-authored a book “Continuous Architecture in Practice”. In fact, this episode is also a continuation of the previous episodes that we have. Today, we’ll be covering some of the aspects of architecture. What is continuous architecture in practice? And what are those principles that I think will be useful for us when we go through architecture journey? So Pierre, thank you so much for spending your time with me today. Looking forward to this conversation.

[00:05:30] Pierre Pureur: Thank you. Thank you, Henry, for having me. I’m also looking forward to this conversation.

[00:05:34] Career Journey

[00:05:34] Henry Suryawirawan: So Pierre, normally I would start by asking my guests to share their career journey or any highlights or turning points worth to share with the audience as a learning journey.

[00:05:43] Pierre Pureur: Sure. So a little bit of my background. I mean, I’ve been an architect for longer than I can remember. Quite a long time. At my last position, I was actually the Chief Architect of a large insurance company in Connecticut, which was a very interesting experience because it’s really more about chief than architecture. Chief Architect is an interesting title. You basically get to coordinate a lot of people, and get them to do what you think is best, whether they like it or not, which is interesting. So you don’t do a lot of what we traditionally call architecture. You do a lot of coordination and a kind of synergy between people.

So let’s talk about the book. One of my co-authors, Murat, that you interviewed before, and I wrote the book back in 2015-16, that was the first “Continuous Architecture” book. And the idea was we wanted to really, number one, write down our experiences about architecture, and what we felt were the lessons learned, and what would be useful to others. But also one thing that we’re looking at, there is a little bit of a fight now. And the fight has been going on for over 20 years between the Agile people (who) say, you know, architecture emerges, and the traditional architects that tend to say, well, you should really build your architecture upfront and not change it. And that fight has been going on. And I don’t believe that either of those camps is right. Because yes, architecture can emerge, but you have to be very good at what you’re doing, so it emerges right. If not, you end up with a monster. Creating an architecture upfront, hoping that it’s never going to change, I think it’s crazy. Things change all the time. So that’s what “Continuous Architecture” was trying to do. So the way we define it is we said first, we need some principles. You probably read them, the six principles, and none of those are really new. I mean, they have been used before. But I think what’s new is putting them together in a nice set.

[00:07:31] Architect Products, Not Projects

[00:07:31] Pierre Pureur: So the first one is architect products, not projects. That’s a well-known thing because projects tend to be short-lived while products are long-lived. So it’s better actually to architect products than projects. Also, if you think products, you start thinking about the discipline of product management. You start thinking about the business, which is very important in architecture. So that’s why we focus on products.

Quality attributes. People say that any blob of software can handle functional requirements. Not very well, but it can. What really makes software tick is the quality attribute requirements. Unfortunately, and we’ll talk more about that, they are usually hard to define. Performance is a good example. You sometimes get requirements from your stakeholders that say, “Well, your system must be performing. Our system must be fast. Our system must be whatever.” No actual definition, no practical numbers, nothing we can test against. So that’s hard.

[00:08:26] Henry Suryawirawan: About this principle one, right? So we need to go from projects to products. I think it’s kind of like these days, there’s a lot of movement about people should move from project to product, tends to be long-lived rather than short-lived. So the incentive is actually building for a long-term. So in your experience in the last, maybe few years, recently with all these Agile movements, what do you think is the state of the current industry? They understand that basically, yeah, we have to go through products or there are still aspects of projects that actually still meaningful that we need to do it in our technology industry?

[00:08:59] Pierre Pureur: Well, yeah. Great question Henry. I mean, number one, I don’t think we talk a lot about products and projects. In reality, I don’t think that people are thinking products, really. You do have some product managers, that’s a fact. But you also have project managers. You know when you start saying that you don’t really need a project plan, when we say that, I don’t know, a hundred people raising their arms and say, what do you mean no project plan? What does that mean? I say, no, you really need a product roadmap, which is different from the project plan. But what I have seen also, I don’t know if you’ve seen the same thing, people say, okay, fine, you want a product roadmap. So they do something that they call a product roadmap, which looks like a project plan. One thing I have to be careful about what I’m saying here, but you have a very successful and very followed methodology called SAFe, right? Scaled Agile. It’s way too easy to turn SAFe into a waterfall, with a very thin veneer of Agile on top. I mean, you talk a little bit Agile, but in reality, you’re doing waterfall. So it’s good for senior management because they feel that they’re all using SAFe, so they’re Agile. In reality, you’re still doing project management. You’re still doing basically waterfall. Culture is the hardest thing to change. Waterfall is ingrained in the way people think. So you try to change that, and you run into a lot of big issues here. It’s going to take years and years before we actually get to a point where people see your projects are done with.

[00:10:22] Henry Suryawirawan: So what does it mean by architect products? So you mentioned in this principle, architect products, so what does it mean?

[00:10:27] Pierre Pureur: So basically if you take a product roadmap, the whole concept of base products and derived products, so you need to really understand what is your base products? How do you really build your product roadmap? And how do you build on top of these base products? Does it get different levels of products? Most importantly, how do you actually adapt to what customers want? One thing that we talk a lot about are feedback loops, which is something we borrowed from Agile.

Feedback loops are important. Because honestly, nobody gets it right the first time. You put an architecture there, and you build something and you hope it’s going to work. Well, most of the time, it doesn’t quite work the way you plan it. And unless you have a feedback loop, you really can’t adjust. And you can’t really change and adjust your architecture. So I think that’s also the big change compared to project, is that project you have this fallacy that you’re going to be able to do your architecture and you’ll be done. In reality, you evolve all the time. You change all the time.

[00:11:25] Focus on Quality Attributes

[00:11:25] Henry Suryawirawan: So let’s move on to the second principle. Focus on quality attributes, not functional requirements. I think it’s very interesting the way you mentioned about it this way, because now that I hear about it, it’s kind of right as well. Normally, we think that, okay, product features, whatever that is, is a functional requirement, and that’s what we need to build. But actually, a lot of times what makes or breaks the product is actually the quality attributes, not really the functions or the features. So what are those quality attributes that you can mention? Maybe the core quality attributes?

[00:11:54] Pierre Pureur: Yes. So that’s the four quality attributes. What we found when we wrote the new book is that there are four of them which are very, very important. No special order. You have scalability, performance, security, and resilience.

Scalability is interesting. We’ll talk more about that in a minute. But scalability is interesting because people 20 years ago, or 30 years ago, didn’t really think too much of scalability. Basically, you kind of were hoping your system would be scalable, and of course, at some point, you hit a brick wall and the system is not scalable anymore.

Performance, on the other hand, is one that people are very knowledgeable about, because a badly performing system basically is hard to use. So you lose your users very quickly. But it’s important.

Security, well, I don’t really need to sell you on security. When I started a long time ago, we used to think security was a problem for the security department. They did the security checks and they even reviewed your code and so on so forth. You didn’t have to worry about that. That has changed a lot. Now, security is everybody’s problem. It’s not just a problem for security people. It absolutely needs to be built upfront.

And then resilience. Resilience, of course, is critical. I mean, if you just look at what happened last year and this year about the whole phenomenon of retail investment. That’s probably linked to the pandemic, but people now are starting investing without going through investment advisors. But those systems break down. Because honestly, nobody thought that it would be so important. And nobody’s thought they will be basically so much used. And it breaks down. Could be (due to) scalability. But what happens when it breaks down? Well, a lot of people can’t trade anymore, and that always happens on a day when the market is the most active. So resilience is becoming very, very important quality.

[00:13:32] Henry Suryawirawan: Thanks for sharing these four quality attributes and giving us a recap. I think those are very interesting points and I see these days, those four really are becoming more and more important.

[00:13:41] Delay Design Decisions Until Necessary

[00:13:41] Henry Suryawirawan: So let’s move on to the third principle. Maybe you can describe what is the third principle?

[00:13:46] Pierre Pureur: Yes. So that’s “delay design decisions until it’s necessary”, which is interesting because architecture is an art, not a science. The whole idea is don’t architect for things you don’t know. So your design decisions should always be built on facts, not guesses. It is very hard because you may have some stakeholders that say, “Hey, I want a system which is scalable for 200% of the current workload, or 2000%”. And you say, wait a minute. Okay. Are you sure? Are you really sure? Because if I start with your system, that’s going to be able to handle those volumes, you’re not going to be happy with the cost.

Cost is a very important attribute. Because in the real world, outside of IT, people who architect and design always think of costs. If you hire an architect to build a house, the first question he’s going to ask you, or she’s going to ask you, is how much money do you want to spend? That’s a question we never ask in IT. We never give choices based on, okay, if you spend X, I can build you that. If you spend Y, I can build you that, so on and so forth. We never see that. You have a budget and you say, okay, you know, I’m going to make the budget. Most of the time I’m going to exceed the budget. We know IT spends always more than budget. We know that. But the real question is really, you need to bake costs into your thinking. Don’t design for things you don’t have to design for. It’s wasteful. Now, if you don’t make the right decisions, at some point, you want to say, oh shoot, now what do I do? But when you make a decision, always try to say, do I really need to make that decision now? Or can I delay it little bit longer?

[00:15:15] Henry Suryawirawan: That’s a very key question. Do I really need this right now? Or can I decide it sometimes later? Because I also heard a few other things about architecture. Architecture is about something that is hard to change. So that’s another perspective that some people said. And it’s really funny as well, when you say costs, right? Cost is always kind of like neglected in this current IT world. In my experience these days, about building projects, products for technology, they never say you have to build this product within X budget. It’s always about building it fast, building whatever features. But then cost is always the afterthought. So, thanks for sharing this delay decision, I think is a very critical thing in architecture. Because once you decide on a hard to change architecture, basically you’re stuck, and you have to do a major rewrite.

[00:15:58] Power of Small

[00:15:58] Henry Suryawirawan: Which brings us to principle four. Can you explain to us what is principle four?

[00:16:02] Pierre Pureur: Principle four is really the art and craft of really leveraging what I call the power of small. By that, I don’t mean microservices. What I mean is, think small. The idea is it doesn’t mean necessarily think small code, because there are many dimensions of architecture, more than just code. But try to think in terms of modules. Decoupling. The idea of loose coupling is very old that dates back to 1960s. But try to think there are some independent modules. Things that you can either deploy on the same box, or same whatever, or different things without major changes. The idea is the smaller those things are, the easier it is to change them and to actually put them in different places should you need to.

People talk a lot of monoliths, and sometimes you don’t have a choice. You have to have monoliths because the cost of breaking it down is just too high, back to cost. However, if you have a choice, small is better than large. So the whole idea is get the architects to think, do I really need to really design a big thing that does a lot of things, or should I go into really a module that only has one responsibility? Try to cut it down, just do one thing, do it well.

[00:17:10] Henry Suryawirawan: Yeah. I can see like, especially, the movement about microservices this day, and also coming back to the concept of team boundaries and all that. They try to break down teams, also into smaller and smaller teams rather than a big team members where they have to coordinate with each other. So I think the power of small these days becomes more important, especially when you build like a distributed system. Because these days, systems talk to each other, it becomes so complex. So I think once you get into a certain large size, I guess it’s pretty hard to change and also evolve the system, so to speak.

[00:17:40] Architect for Build, Test, Deploy and Operate

[00:17:40] Henry Suryawirawan: So let’s go to the principle number five. Maybe you can elaborate on that.

[00:17:44] Pierre Pureur: Oh yes. So principle number five says “architect for build, test, deploy, and operate”. That’s a very important thing. One of the most important things you can do on scalability is to have a monitoring framework. What’s going on in your system? And also to plan for failure. Because at the end of the day, every system tends to fail. It’s not a matter of if, it’s when it’s going to fail. And you have a choice. You can fail gracefully, and get to a point where your customers don’t even see your system failing, which is great. Or you can just crash, which is not good. Crashing is absolutely not good. So have the architects think about that. How do I really monitor my system? My system is going to be put into production and it’s going to have to run. How do you monitor that? How do I make sure that if there’s a failure, because it’s guaranteed that I’m going to run out of capacity somewhere, I can compensate for that before crashing the whole system? How can I stop a domino effect that basically destroys the system? That’s one example.
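
A minimal sketch of the "fail gracefully" and "stop a domino effect" idea, using a circuit breaker. This is one common technique, not one Pierre names in the episode. The class name, thresholds, and fallback shape are all assumptions made for illustration.

```python
import time


class CircuitBreaker:
    """Stops calling a failing dependency so one failure does not cascade."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds while open
        self.clock = clock                # injectable so the breaker is testable
        self.failures = 0
        self.opened_at = None             # None means closed (healthy)

    def call(self, operation, fallback):
        if self.opened_at is not None:
            # Open: skip the dependency and degrade gracefully.
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()
            # Cool-down is over: allow one trial call. A single further
            # failure re-opens the circuit immediately.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result
```

A caller would wrap each call to a fragile dependency, for example `breaker.call(fetch_live_price, use_cached_price)`, where both function names are hypothetical. The customer sees a degraded answer instead of a crash, and the struggling dependency gets time to recover.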

Another example is performance, of course. Testing for performance is interesting. If you look at a system like Amazon, it makes you wonder how they can even test for scalability or performance. Last year, with COVID, absolutely all physical shops were closed. So everybody worldwide went and did their shopping on Amazon. How do you survive that? And they have a lot of techniques, but the whole key is really you build and you architect around testing. So how do I make sure that it’s possible to test that system? And ask, how is it possible to test the system? How is it possible to deploy the system, to operate? Amazon deploys a new version, I think, more often than once a second. It used to be every second, now it’s faster than that. How do I do that? You need to build that in your architecture. So architecture is much more than just about architecting code and designing code. It’s the whole life cycle you need to keep in mind.

[00:19:28] Henry Suryawirawan: So I think it’s very important as well to understand about the whole life cycle. So you mentioned here architect for build, test, deploy, and operate. Most of the times, I see failing projects or project that didn’t go smooth is because they forget some of these aspects from the lifecycle. Maybe either it’s tests, maybe it’s the deployment that is still manual and clunky, or is it the operation itself, right? There’s no proper enough observability. So I think this principle definitely is a key.

[00:19:53] Conway’s Law

[00:19:53] Henry Suryawirawan: And let’s go to the last principle number six.

[00:19:56] Pierre Pureur: So last principle is really what most people are familiar with. That’s Mel Conway’s principle. The idea is really, if you have your organization, which is really organized around strata, around layers of people. So you have the front-end, the mid-tier, and the back-end, it’s going to be very inefficient. Because the front-end people are going to create that front-end, and they will be happy. The mid-tier people are going to do the middle where they’re happy, and the back-end people are happy as well. But at some point, you get into, oh my God. Now you need to do integration testing. You need to do system testing, and nothing works anymore. The protocols don’t talk to each other. To avoid that, cause that’s expensive and usually happens at the end of the lifecycle, and basically, when senior leadership says, you know, you’ve got to deliver and, of course, nothing works. One of the best ways to avoid that is to actually group people vertically instead of horizontally. That was Mel Conway’s insight.

The whole idea is that the design of system tends to mirror the organization of the teams. So if you organize your people vertically, so instead of having silos of front-end, mid-tier and back-end, you now have the front-end, the mid tier and back-end in one team. So they can talk to each other easily. Basically, the problems go away. I think that’s a very important insight. So now, that’s another principle that everybody agrees with, and very few people follow. I mean, I’m sure that you have seen people look for full-stack developers. But the reality is the skill set you need to be a full-stack developer is very hard to find because you need people who are very cognizant of the front-end, the mid-tier, and the back-end. In an insurance company, back-end systems could be as old as 50 years old, COBOL. So finding someone who is actually going to be able to know that, to know the mid-tier, all the middleware things and the front-end, is impossible.

But the idea is really, you create teams with those people because you have to have multiple people with multiple skills. They actually start working together and communicating, and you’ll see problems go away. For example, if you talk about interface, if you sit next to the people you interface with, it’s easy to solve. If you sit in a different location, you’re going to try to solve it by email, it’s very hard. When I was running an innovation center, one thing I learned is by sitting people together. Actually, we didn’t have offices. We just had desks in an open space. And communication problems absolutely went away.

[00:22:10] Henry Suryawirawan: So this is definitely one of the key themes I keep listening these days, about Conway’s law. Team topologies also have this concept of stream aligned teams, cross-functional team, two pizza team, and now there’s also this data mesh, bounded context. So all seem to revolve around all this same concept, which is basically, you have to build a cross-functional team that is self-independent. And they can build something that is fast without the burden of communication, and all the misunderstandings that could happen if you start to divide them into multiple layers.

So let me recap the whole six principles. I think these are very important in the continuous architecture. The first one is architect products, evolve from projects to products. Principle number two, focus on quality attributes, not on the functional requirements. Principle number three, delay design decisions until they are absolutely necessary. Principle number four, architect for change, leverage the power of small. Number five, architect for build, test, deploy and operate. And lastly, number six, model the organization of your teams after the design of the system you are working on. So basically, the Conway’s law. Thanks so much for explaining all this. I think it really is important key principle.

[00:23:18] Essential Activities

[00:23:18] Pierre Pureur: So the principles were from the first book. And also there is something else we discovered as we wrote the second book. Principles are a nice framework. They really define how we think about architecture. But they are not very actionable. So in order to be actionable, you need what we call the essential activities. And essential activities are things that you need to think about day-to-day. We came up with four of them in the new book. The first one is to focus on quality attributes. We talked about that. We don’t need to repeat that.

The second thing is driving architecture decisions. Decisions are actually the most important thing that an architect can do. People think of architects as people who do complex diagrams. Everybody likes to think that they’re intelligent. And doing a complex diagram is a way to show, especially if nobody else can understand it, that you are very smart. You have to do diagrams, but that’s not the most important thing. The most important thing an architect can do is make hard decisions, and make sure that the decisions are properly documented somewhere in a log. By the way, the best way to do that is probably to keep your decisions next to your code, in GitHub or something like this.
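
One possible way to keep a decision log next to the code is one small text file per decision, committed to the same repository. The sketch below is an illustration only: the field names, the `docs/decisions/` path, and the Markdown layout are assumptions, not a format prescribed in the episode or the book.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Decision:
    """One entry in a decision log that lives in the code repository."""

    number: int
    title: str
    decided_on: date
    context: str       # the facts that forced the decision
    decision: str      # what was decided
    consequences: str  # trade-offs accepted, including any technical debt

    def filename(self) -> str:
        # Hypothetical location; any folder under version control works.
        slug = "-".join(self.title.lower().split())
        return f"docs/decisions/{self.number:04d}-{slug}.md"

    def to_markdown(self) -> str:
        return (
            f"# {self.number}. {self.title}\n\n"
            f"Date: {self.decided_on.isoformat()}\n\n"
            f"## Context\n{self.context}\n\n"
            f"## Decision\n{self.decision}\n\n"
            f"## Consequences\n{self.consequences}\n"
        )
```

Because each record is a plain file in the repository, a decision gets reviewed and versioned the same way as the code it affects.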

The third one is technical debt. Anything you do may increase technical debt. There’s no escaping that. There’s a great book, “Managing Technical Debt” by Philippe Kruchten, Robert Nord, and Ipek Ozkaya. And this is the ultimate book on technical debt, which is something that people need to be keeping in mind. No matter what I do, anything I do on developing a project is going to increase or decrease technical debt. Technical debt is not necessarily bad. Sometimes, you basically have to take technical debt because you just have to deliver a product quickly, and you have to take shortcuts. That’s technical debt. What’s bad is when you don’t repay that technical debt. Because technical debt accrues, and interest on technical debt can run very high. When you have to repay it, it’ll be an “ouch” moment. And the longer you wait, the worse it gets.

The fourth one on the essential activities are the feedback loops. We talk a little bit of feedback loops. It’s also so important. And again, that’s an Agile idea, but it’s a great idea. Nobody gets the architecture or design right the first time. You take some guesses. You take some bets, and you put something out there. And you need to get a loop to understand how far you are from the target, and to correct course. So those four things, QAs, architecture decisions, technical debt, and feedback loops are critical to doing architecture.

[00:25:37] Henry Suryawirawan: Thanks for highlighting this. Yeah, I agree. I tend to agree that principles, sometimes, they are good. They are like the fundamentals of understanding and mindset. But they are not really actionable, and thanks for chipping that in the essential activities. So those things are the things that architect needs to take care of. The activities to drive the project well, to drive the good product, not just thinking about the principles and let the team decide that. So thanks for chipping that in.

[00:26:00] Quality Attribute: Scalability

[00:26:00] Henry Suryawirawan: Let’s move on to the next two things which we are going to cover, the two quality attributes that are important that you mentioned earlier. The first one, which is about scalability. So you mentioned a little bit and describe what is scalability. Why do you think scalability now should become top of mind of any architect this day?

[00:26:17] Pierre Pureur: As I said, 20-30 years ago, very few people talked about scalability. In those old days when systems were running on mainframes, IBM would come up with a bigger mainframe and you were scalable. And then, we moved to distributed systems. Now, I define scalability in the following way: it’s a property of a system to be able to handle an increased or decreased workload by increasing or decreasing the cost of the system. Again, back to the cost. People get the increase part very well. They say, okay, so I’m going to basically increase my workload. I’m going to be able to handle an increased workload by increasing the cost of the system. But you should be able also to decrease the cost when you handle decreased volumes. And not many systems can do that. I think that’s an important thing.

Think about Amazon, for example, again, back to Amazon. They can handle normal volumes, and, a few times a year, like for example, around holidays, they can also handle additional volumes. A lot of additional volumes. They don’t have a lot of servers gathering dust all year long, waiting for Christmas to come. That’s the whole idea of scalability to be able to really, so like an accordion, you basically can increase and decrease, but your cost has to increase or decrease as well. It’s really hard. Why was scalability not so important? Well, it’s a hard question to answer, but people didn’t really think too much about it. And suddenly, now you get the big internet companies that came to the fore. But now people look at that, and they said, wow, Amazon can really handle all these volumes. Why can’t I do that? Well, not every company is Amazon. Not every company is Google. The tactics that Amazon and Google use are not necessarily relevant to your business, unless you’re, of course, Amazon or Google. So scalability, you have to take that with a grain of salt. Yes, it’s very important now. Ask Robinhood. I mean, I’m sure they understand scalability is important. But on the other hand, Robinhood cannot afford to build a system as large as Amazon. That’s reality. They just can’t afford that. So when you actually build a system that can increase or decrease or expand or contract, I think that’s very important.
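
The "accordion" definition can be put into numbers with a small sketch. Everything in it is a made-up assumption for illustration: the per-instance capacity, the hourly price, and the idea that capacity grows linearly with instance count, which real systems rarely achieve.

```python
import math


def instances_needed(requests_per_second: float,
                     capacity_per_instance: float = 100.0,
                     minimum: int = 1) -> int:
    """Smallest number of identical instances that covers the workload."""
    return max(minimum, math.ceil(requests_per_second / capacity_per_instance))


def hourly_cost(requests_per_second: float,
                capacity_per_instance: float = 100.0,
                price_per_instance_hour: float = 0.10) -> float:
    """Cost follows the workload in both directions, not only upward."""
    count = instances_needed(requests_per_second, capacity_per_instance)
    return count * price_per_instance_hour


if __name__ == "__main__":
    # A normal day, a holiday peak, then back to normal.
    for load in (200, 5000, 200):
        print(load, instances_needed(load), hourly_cost(load))
```

Under this definition, a system that reaches 50 instances for the peak but cannot return to 2 afterwards only meets the first half of the requirement.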

[00:28:13] Henry Suryawirawan: Thanks for highlighting that. When we talk about scalability, sometimes it’s a bit fluffy, right? I want a scalable system. So basically, you can accommodate more and more traffic. But yeah, the other aspect, when we have decreased number of workloads, I think the decreased cost is also something that is important, which sometimes we neglect. We tend to just scale up, but we tend not to scale down. Typically, this will happen if you do a vertical scaling. For example, you increase the spec of the machine, but you basically cannot reduce it down. So I think that’s not a good example of scalability based on the description that you mentioned just now. And also, auto scaling is something that is well understood by many people. You know, we have this function as a service, Kubernetes and all that, which can auto scale really well. So what other types of scalability tactics, or scalability techniques that you think these days are quite important for us to understand?

[00:28:59] Scalability on The Cloud

[00:28:59] Pierre Pureur: Before we go into tactics, one point to be made, I think is very important, is: cloud, of course, is very important nowadays. So a lot of systems run in the cloud, public cloud or vendor cloud. But what people forget, they think that putting your system and app in a cloud is going to make it scalable. Magic happens. And they think that scalability is the problem of the cloud provider. Well, it’s not. If it scales badly on your own infrastructure, there’s no way that putting it on the cloud is going to make it better. The only thing we will achieve is to make it more expensive. Yeah, because now you have an inefficient system trying to scale. The system may be able to cope with additional volumes, but at what cost?

Again, this goes back to the principle about architect for build, test, deploy, and operate. That’s exactly the same idea, because you need to know, when you architect your system, where you are going to run that system. I don’t believe that you can just architect a system, and say, oh, it doesn’t matter where it’s going to run. You need to know if your system is going to run at Amazon, at Azure. Most of the time, you know that. But the challenge is that unless your system is well designed to run in the cloud, it’s going to cost you an arm and a leg to do that. So that’s very important to keep in mind.

So yes, there are tactics. And the idea of vertical scalability is to solve the scalability problem by running on a bigger box. That works well until you hit your ceiling, because how do you find a bigger box? The point of scalability is there’s always a price to be paid. And the price is you can’t just take any app, written or developed for whatever, and say, okay, vertical scalability works. Doesn’t work anymore, but I don’t want to change the app. No, you’re going to have to change your app. That’s the bad news. So there’s always a fine line to be kind of walked on between, do I rely on vertical scalability to a point? And then do I switch to horizontal? Am I planning to actually change the system to re-architect for horizontal scalability? Am I really able to run efficiently on the cloud? Those are the questions that architects must really ask themselves.

[00:30:52] Henry Suryawirawan: So, vertical scalability, I think, is something that tends to be avoided these days. I mean, like in the cloud. Yeah. You can find bigger and bigger box with a certain price, definitely.

[00:31:00] Pierre Pureur: Yes.

[00:31:01] Scalability Tactic: Sharding

[00:31:01] Henry Suryawirawan: But yeah, these days people are looking more towards horizontal auto-scaling, that kind of scalability. There are other techniques which I’ve seen happen as well in distributed services, especially when it grows so large. Things like sharding, like database sharding, because at one point in time, database also can’t scale up all the time. So maybe you can also mention a little bit about this technique, about sharding.

[00:31:21] Pierre Pureur: Sure. Actually, you touched on a very important point, which I should have mentioned. The first part of the system where you’re going to hit a scalability issue is usually the database. Databases are the first thing that needs to be scaled and, bad news, they are also the hardest thing to scale. So you can go for a bigger database, but that has a limit. So you start distributing your data. You start replicating your data, partitioning it, and then sharding it.

Now, I mentioned sharding last because honestly, it’s a hard technique to use. And remember, delay design decisions until you have to. Unless you’re absolutely sure you need sharding, and you’re absolutely sure that you got a reason to do that, I would say, try to stay away from that. Cause you’re going to make your code much more complex, and complexity is bad. The whole point of architecture is to decrease complexity. So if you architect on purpose for complexity, it gets you to a paradox. So you really need to avoid complex techniques. One thing also to keep in mind is you are basically doing these with a team. So if you have basically a bunch of hotshot developers that are very conversant with sharding, okay, by all means go for it. But most of the time, we don’t.

In my experience, I didn’t find too many people who use sharding. Also, sharding tends to depend on the database. Sharding is not offered by all DBMSes. Sharding is usually, of course, NoSQL, and each NoSQL database is a little bit different. So it’s hard enough to find experts on NoSQL databases. You’re not only going to have to find people who know the NoSQL database, but who are also very familiar with sharding on that database. You’re going to have a hard time finding candidates. So keep in mind, at the end of the day, the skills of the team are probably what drives the success or failure of your project or your product.
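
To make the complexity concrete, here is a minimal in-memory sketch of hash-based sharding. It assumes plain dictionaries in place of database nodes and simple hash-modulo routing. Databases that offer sharding do the routing themselves, each in its own way, so this shows the idea rather than any product’s behavior.

```python
import hashlib


class ShardedStore:
    """Spreads keys across several stores so no single one holds all data."""

    def __init__(self, shard_count: int):
        # Each dict stands in for a separate database node (an assumption).
        self.shards = [dict() for _ in range(shard_count)]

    def _shard_for(self, key: str) -> dict:
        # A stable hash, so the same key always routes to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shards)
        return self.shards[index]

    def put(self, key: str, value) -> None:
        self._shard_for(key)[key] = value

    def get(self, key: str):
        return self._shard_for(key).get(key)

    def count_where(self, predicate) -> int:
        # Anything not addressed by key has to visit every shard.
        return sum(
            1
            for shard in self.shards
            for value in shard.values()
            if predicate(value)
        )
```

Even in this toy version, two costs show up. Queries that are not keyed must fan out to all shards, and changing `shard_count` would remap most keys under modulo routing, which forces data to move. That is the kind of added complexity the advice to delay this decision is about.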

[00:32:58] Scalability Tactic: Asynchronous Communication

[00:32:58] Henry Suryawirawan: And there’s another important technique for scalability which we will cover, which is asynchronous communication. So especially when you have distributed systems, why is synchronous communication worse than asynchronous communication for scalability?

[00:33:11] Pierre Pureur: Yes, it is. But there is a reasonable challenge, same as with sharding. Yes, we all know messaging, and async is definitely better than synchronous for both scalability and performance. Here is the problem. Not many people are familiar with an async model. People read about it. We all know the concepts. But being able to efficiently code for it is a different story. So definitely, asynchronous is more efficient than synchronous because the idea is you send a message, you put that on some kind of bus or some kind of message carrier. It’s picked up, so you don’t have to block until you get a response. Yes, that’s good news. The bad news is you’re going to have your response coming back at you at some point, and you’d better be prepared for that.
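
A minimal sketch of that trade-off, using Python’s `asyncio` queues as a stand-in for a message bus. The worker, the payloads, and the request IDs are all hypothetical. The sender never blocks on processing, but it has to track what is still pending and match each reply to its request when the reply eventually arrives.

```python
import asyncio


async def worker(requests: asyncio.Queue, replies: asyncio.Queue) -> None:
    """Consumes messages and publishes a reply for each one later."""
    while True:
        request_id, payload = await requests.get()
        await asyncio.sleep(0.01)  # simulated processing time
        await replies.put((request_id, payload.upper()))
        requests.task_done()


async def main() -> dict:
    requests: asyncio.Queue = asyncio.Queue()
    replies: asyncio.Queue = asyncio.Queue()
    consumer = asyncio.create_task(worker(requests, replies))

    # Send everything first; putting a message does not wait for processing.
    pending = {}
    for request_id, payload in enumerate(["open", "amend", "close"]):
        await requests.put((request_id, payload))
        pending[request_id] = payload

    # The part you must be prepared for: replies come back later and have
    # to be correlated with the requests that are still outstanding.
    results = {}
    while pending:
        request_id, result = await replies.get()
        pending.pop(request_id)
        results[request_id] = result

    consumer.cancel()
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The bookkeeping around `pending` is the extra code a synchronous call never needs, and it grows once timeouts, retries, and lost replies have to be handled.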

So again, you have to walk a very fine line. There is one point that we make in the book. And the book, by the way, was built around a case study. We spent a lot of time defining that case study. It’s not a fully baked system. It’s actually a letter of credit system. In the case study, we make a point of saying, okay, if you know that at some point you’re going to have to switch from synchronous to asynchronous, it’s easier to start with synchronous because that’s what people are used to, but you’re going to say, okay, I’m going to hit some limitations. How can I avoid painting myself into a corner and having to do a major rewrite when I switch to asynchronous? And the idea was maybe instead of basically doing direct point-to-point communication, we’re going to have a service that handles that point-to-point. And at some point in the future, we can switch from synchronous to asynchronous for certain communications that we know are going to have problems. But keep in mind, the cost is going to be important. Cost in terms of money and time. Again, my message here is be very careful. Companies like Amazon and Google use asynchronous because they have to. But unless you have to and you know you’re going to have to, try not to do that.
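
One way to read "a service that handles that point-to-point" is a thin messaging interface that callers depend on, with a synchronous implementation today and a queued one later. The sketch below is an interpretation, not the design from the book’s case study. All class and function names are invented, and the in-process queue stands in for a real message broker.

```python
import queue
from typing import Callable, Protocol


class Messenger(Protocol):
    """What callers depend on, instead of calling the other side directly."""

    def send(self, message: str, on_reply: Callable[[str], None]) -> None: ...


class DirectMessenger:
    """Synchronous: the reply is delivered before send() returns."""

    def __init__(self, handler: Callable[[str], str]):
        self.handler = handler

    def send(self, message: str, on_reply: Callable[[str], None]) -> None:
        on_reply(self.handler(message))


class QueuedMessenger:
    """Asynchronous: send() only enqueues; replies arrive when drained."""

    def __init__(self, handler: Callable[[str], str]):
        self.handler = handler
        self.outbox: queue.Queue = queue.Queue()

    def send(self, message: str, on_reply: Callable[[str], None]) -> None:
        self.outbox.put((message, on_reply))

    def drain(self) -> None:
        # Stands in for a consumer process reading from a broker.
        while not self.outbox.empty():
            message, on_reply = self.outbox.get()
            on_reply(self.handler(message))


def request_approval(messenger: Messenger, document: str, replies: list) -> None:
    # Caller code is identical for both implementations.
    messenger.send(document, replies.append)
```

The design choice that matters is that callers receive replies through a callback even while the transport is synchronous. Because they never assume the answer is available when `send()` returns, swapping `DirectMessenger` for `QueuedMessenger` leaves `request_approval` untouched.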

[00:34:52] Henry Suryawirawan: So this comes back to the principle of delay decisions until you really need it. I would say, every time you think that you need asynchronous, for example, think about the cost, complexity, and also the time it requires you to build. Thanks for highlighting that again.

[00:35:06] Quality Attribute: Performance

[00:35:06] Henry Suryawirawan: So let’s move on to the next quality attribute, which is about performance. I think many people must be able to relate about performance, which is about speed, latency and all that. Anything to add to this performance? Why is it important these days about performance quality attributes?

[00:35:19] Pierre Pureur: Well, so performance is interesting because that’s one that you don’t have to warn people about. Scalability, you do have to warn people that it’s important. Performance, they already get it. But the challenge is many people confuse scalability and performance, and they are very different. Performance is about timing. So that’s very clear. Now they have also a relationship because when scalability gets bad, when the system hits or gets close to a scalability limit, performance just becomes abysmal. So keep that in mind.

Assuring performance is really, to me, the same as scalability. First, document your requirements. Requirements for performance and scalability are usually very badly documented. And that’s why in the book, we advise using scenarios. This actually came from Carnegie Mellon, the SEI (Software Engineering Institute): the Architecture Tradeoff Analysis Method. The idea of scenarios is to try to define precisely what your requirements are, in terms of what kind of test. So we have a stimulus, a response, and a measurement. Stimulus is what’s going to happen to your system, so that you are going to need to measure performance. It could be just that someone needs to do a transaction. Response is how the system is going to respond to that. And measurement is how long it is going to take.

So simple things, but what we find is by kind of going through discipline of writing those things down and saying, okay, instead of just saying my system must be fast, what do I mean exactly? If you do that, two things happen. Number one, your designers start thinking much more in practice about what they need to design. And number two, your testers know what to test, which is very important. If all you’re being told is system must be fast, how do you test that?
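
The stimulus, response, and measurement structure maps naturally onto a small record that a tester can execute. This is a sketch under assumptions: the field names and the single wall-clock threshold are illustrative, and a real performance test would take many samples and look at percentiles rather than one timing.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PerformanceScenario:
    """A requirement written down as stimulus, response, and measurement."""

    stimulus: str       # what happens to the system
    response: str       # what the system must do in return
    max_seconds: float  # the measurement: how long it may take

    def check(self, action: Callable[[], object]) -> bool:
        """Runs the action once and compares elapsed time to the limit."""
        started = time.perf_counter()
        action()
        elapsed = time.perf_counter() - started
        return elapsed <= self.max_seconds


# A hypothetical requirement replacing "the system must be fast".
transfer = PerformanceScenario(
    stimulus="a customer submits a money transfer",
    response="the transfer is accepted and confirmed",
    max_seconds=2.0,
)
```

A tester can then call `transfer.check(submit_transfer)`, where `submit_transfer` is whatever hypothetical function drives the system under test, and get a pass or fail that traces back to a written requirement.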

[00:36:52] Henry Suryawirawan: Yeah. I think especially for product requirements that are vague, or for something that you build from scratch. But I think I see what your point is. If you just say fast, basically it’s an abstract thing, right? How do you measure it? How do you know the user actually appreciates that performance, right? Because sometimes user interaction doesn’t have to be in milliseconds, unless you are doing some kind of trading, like Robinhood, just what you mentioned. But a lot of things like money transfer, maybe it doesn’t have to be milliseconds. It could be a few seconds and people are still happy about it. As long as the money arrives safely in the other account, for example.

[00:37:23] Measuring Performance

[00:37:23] Henry Suryawirawan: So the other aspect of performance is actually to measure it, as you mentioned. And I’ve seen in my career as well, a lot of times people don’t actually measure. So, what do you think are some of the good strategies to actually start measuring? Is it at the code level? Is it at the service-to-service level? Or what kind of measurement that people should have?

[00:37:40] Pierre Pureur: Yes. So before I answer the question, actually I’d like to give some context here. Most of the time, we aren’t working in isolation. Most of the time, we’re working with systems that we don’t necessarily control. For example, Salesforce. Most companies I know of use Salesforce. The interesting thing is, about 10 years ago, an architect used to think, how do we integrate with Salesforce? Nowadays it is, how do I work within Salesforce? So the whole concept of scalability and performance is changing. The code I’m going to write is probably going to have to live within an ecosystem I don’t control, which kind of puts performance and scalability in a different light. Because now, how can I control the performance capability of something like Salesforce, whose architecture I know but can’t control? So just keep that in mind.

Now, having said that, monitoring is very important. And instrumentation is very important. Remember principle number four, right? Architect for build, sure. But test, deploy, and run. So make sure that you actually build that instrumentation, those calls, those pieces of code that you need to manage your system. No matter how small your system is, you’re going to need some instrumentation. And you’re going to need to measure performance at each level. If you don’t measure, there’s no way you can actually design efficiently. People don’t like to think about that, because senior management especially will say, why do you waste time writing that code? This is not productive. Yes, it is. It’s very productive code. The same way, remember, as dealing with failure. You need to put code in place that will allow you to deal with failure. So you need to be able to fail gracefully. If you think that it’s expensive to do that, think about the cost of not doing it. It’s quite mind-boggling. Think about your system just stopping dead in its tracks in the middle of the trading day.
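As a rough illustration of the kind of instrumentation code discussed above, the sketch below records how long each call to a function takes. The `timed` decorator, the in-memory `timings` store, and the `lookup_account` function are all hypothetical names chosen for this example. A real system would more likely send these measurements to a monitoring backend than keep them in a dictionary.

```python
import time
from collections import defaultdict
from functools import wraps

# In-memory store of durations per operation name (an assumption for this sketch).
timings = defaultdict(list)


def timed(name):
    """Record the wall-clock duration of every call under `name`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even when the call raises, so failures are measured too.
                timings[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("lookup_account")
def lookup_account(account_id):
    # Hypothetical stand-in for a call to a database or another service.
    time.sleep(0.005)
    return {"id": account_id}


for i in range(5):
    lookup_account(i)

samples = timings["lookup_account"]
print(f"calls={len(samples)} slowest={max(samples):.4f}s")
```

Applying the same decorator at several layers (function, service call, end-to-end request) is one way to approximate measuring "at each level".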

[00:39:23] Performance Tactics

[00:39:23] Henry Suryawirawan: So yeah, looking at the tactics and techniques, just like what we covered for scalability. I mean, there might be some obvious ones, but what do you think are some of the interesting new techniques for increasing a system’s performance?

[00:39:36] Pierre Pureur: So basically, you have two types of tactics. One is where you control the resource demand-related forces. So caching, of course, that’s not a new one, but caching is a very important one. And I think that, honestly, this is one which is not necessarily well understood. Sometimes people use caching because they think it is good. So, you know, I want to make sure that the messages go through. So I just put some caching in it. I don’t really try to understand how to use it. But also, what happens with caching sometimes is, oh, my system doesn’t run the way I expected, so let’s put some caching in it. Well, too late. This is one case where the principle “delay design decisions” should not have been used, because you should build caching in a little bit earlier. And that’s why it’s an art, not a science.
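To make the caching tactic concrete, here is a small sketch using Python’s standard `functools.lru_cache`. The `exchange_rate` function, the backend stand-in, and the rates themselves are invented for illustration. The point is only that repeated requests for the same value reach the slow resource once.

```python
from functools import lru_cache

calls_to_backend = 0


def fetch_exchange_rate_from_backend(currency):
    """Hypothetical slow lookup; counts how often it is actually called."""
    global calls_to_backend
    calls_to_backend += 1
    return {"EUR": 1.1, "GBP": 1.3}.get(currency, 1.0)  # made-up rates


@lru_cache(maxsize=128)
def exchange_rate(currency):
    # Results are kept in memory, so repeated lookups skip the backend.
    return fetch_exchange_rate_from_backend(currency)


for _ in range(100):
    exchange_rate("EUR")

info = exchange_rate.cache_info()
print(f"backend calls={calls_to_backend} hits={info.hits} misses={info.misses}")
```

The trade-off is that a cached value can become stale. This sketch never expires entries, so deciding when and how to invalidate them is part of the design, not something to bolt on afterwards.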

On the database side, you have a lot of tactics. NoSQL is, of course, the most famous one, then materialized views, indexes, and so on and so forth. But remember, all these techniques, all these tactics come with a cost. Back to cost, right? NoSQL is a great idea. NoSQL databases usually perform better than SQL databases. It depends on the use case, right? If you have structured, well-defined data, SQL databases are probably a better choice. If your data is not well defined or not structured, NoSQL is a better choice. However, you run into the skills issue: do the skills in your team match your architecture? It makes no sense to design something that you can’t build.
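One of the database tactics mentioned above, adding an index, can be sketched with Python’s built-in `sqlite3` module. The `payments` table, its columns, and the index name are assumptions made up for this example. The sketch asks SQLite for its query plan before and after the index is created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (id INTEGER PRIMARY KEY, account_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO payments (account_id, amount) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10_000)],
)

QUERY = "SELECT amount FROM payments WHERE account_id = ?"


def query_plan(connection):
    """Return SQLite's description of how it would run QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    # The fourth column of each row holds the human-readable plan detail.
    return " | ".join(row[3] for row in rows)


plan_before = query_plan(conn)
conn.execute("CREATE INDEX idx_payments_account ON payments (account_id)")
plan_after = query_plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index the plan is a full table scan. With it, SQLite searches through the index instead. The cost is that every insert and update now has to maintain the index as well, which is the kind of trade-off described above.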

[00:40:57] Henry Suryawirawan: That’s a great point. There’s no point of designing something that you cannot build. So sometimes, I see that the role of architects, some people are like rolling their eyes if there’s an architect in the room, drawing diagrams, fancy diagrams, but actually leaving it back to the development team. I think a good point there. There’s no point in designing a system that nobody can build, and even understand.

[00:41:15] 3 Tech Lead Wisdom

[00:41:15] Henry Suryawirawan: So Pierre, thank you so much for this time. I really learned a lot. I enjoyed this conversation. You really are passionate about this topic, definitely, right? I can tell. But due to time, unfortunately, we have to cut it short. But I have one question that I always ask to all my guests, which is to share about three technical leadership wisdom. So you have a very long career and a lot of experience as well. So can you maybe share some of wisdom? So in particular, three from your journey, from your career experience, for us to learn from.

[00:41:41] Pierre Pureur: Sure, I can. So the first thing that we are seeing now is architecture is a skill, not a role. So the whole concept of being an architect is morphing into architecture belongs to the team. I think that’s a very important thing. Everybody in a team has to have architectural skills. The role is really disappearing more and more.

One thing I think we spoke about that quite a few times today is architecture is morphing into a continuous flow of decisions. You make decisions. You document them. Everybody knows what decisions are. They need to be well communicated. For example, if you’re going to decide to go NoSQL database, you need to communicate that very well, and everybody needs to be part of the decision. You should not make that decision in an ivory tower. You should really have the whole team participating here. Understanding basically the pluses and minuses. So the whole concept of architecture moving to a continuous flow of decisions.

And a third one, which I’ve learned the hard way, is always plan for monitoring, and dealing with failure. Because your system will fail. Response time may not be the one you expect. If you wait until you get calls from your users saying, well, this system is slow, like it’s terrible, that’s too late. You really need to be alerted on what’s going on in the system, which part of the system is slowing down. Same thing for failure. If part of the system is going to start turning hot, you need to really understand that before the whole thing crashes. So those are the three things.

[00:42:59] Henry Suryawirawan: So thanks again for sharing that. Spoken like a true architect, actually. All the wisdoms are related with architecture. So Pierre, if people want to continue their conversation about architecture, is there a place where they can find you online or maybe even the book, right?

[00:43:14] Pierre Pureur: Yeah, the book, and we actually have a blog. What I’ll do, I’ll send you, rather than trying to give you the address. It’s ContinuousArchitecture.com, but in one word. That site has a lot of information. I’m also going to publish an article on StackOverflow. So that’s going to also happen soon.

[00:43:29] Henry Suryawirawan: I’ll make sure to put all these in the show notes for people who want to follow and refer further. So thanks again, Pierre, for your time today. It’s been a pleasure learning about architecture from you.

[00:43:39] Pierre Pureur: Thank you so much, Henry.

[00:43:40] Henry Suryawirawan: Thanks.

[00:43:41] Pierre Pureur: Take care.

– End –