×

Podcast: Making the DevOps Pipeline Transparent and Governable

MMS Founder
MMS David Williams

Article originally posted on InfoQ. Visit InfoQ

Subscribe on:






Transcript

Shane Hastie: Good day folks. This is Shane Hastie for the InfoQ Engineering Culture podcast. Today, I’m sitting down with David Williams from Quali. David, welcome. Thanks for taking the time to talk to us today.

David Williams: Thanks, Shane. It’s great to be here.

Shane Hastie: Probably my first starting point for most of these conversations is who’s David?

Introductions [00:23]

David, he’s a pretty boring character, really. He’s been in the IT industry all his life, so there’s only so many parties you can go and entertain people with that subject now. But I’ve been working since I first went to school. My first jobs were working in IT operations in a number of financial companies. I started at the back end. For those of you who want to know how old I was, I remember a time when printing was a thing. And so decorating was my job, carrying tapes, separating print out, doing those sort of things. So really I got a very grassroots level of understanding about what technology was all about, and it was nowhere near as glamorous as I’ve been full to believe. So I started off, I’d say, working operations. I’ve worked my way through computer operations systems administration, network operations. So I used to be part of a NOC team, customer support.

David Williams: I did that sort of path, as low as you can get in the ladder, to arguably about a rung above. And then what happened over that period of time was I worked a lot with distributed systems, lights out computing scenarios, et cetera and it enabled me to get more involved in some of the development work that was being done, specifically to manage these new environments, specifically mesh computing, clusters, et cetera. How do you move workloads around dynamically and how does the operating system become much more aware of what it’s doing and why? Because obviously, it just sees them as workloads but needed to be smarter. So I got into development that way, really. I worked for Digital Equipment in its heyday, working on clusters and part of the team that was doing the operating system work. And so that, combined with my knowledge of how people were using the tech, being one of the people that was once an operations person, it enabled me as a developer to have a little bit of a different view on what needed to be done.

And that’s what really motivated me to excel in that area, because I wanted to make sure that a lot of the things that were being built could be built in support of making operations simpler, making the accountability of what was going on more accountable to the business, to enable the services to be a little more transparent in how IT was using them around. So that throughout my career, luckily for me, the tech industry reinvents itself in a very similar way every seven years. So I just have to wait seven years to look like one of the smart guys again. So that’s how I really got into it from the get go.

Shane Hastie: So that developer experience is what we’d call thinking about making it better for developers today. What are the key elements of this developer experience for us?

The complexity in the developer role today [02:54]

David Williams: When I was in development, the main criteria that I was really responsible for was time. It was around time and production rates. I really had no clue why I was developing the software. Obviously, I knew what application I was working on and I knew what it was, but I never really saw the results. So over the years, I wasn’t doing it for a great amount of time, to be honest with you. Because when I started looking at what needed to be done, I moved quite quickly from being a developer into being a product manager, which by the way, if you go from development to product management, it’s not exactly a smooth path. But I think it was something that enabled me to be a better product manager at the time, because then I understood the operations aspects, I was a developer and I understood what it was that made the developer tick because that’s why I did it.

It was a great job to create something and work on it and actually show the results. And I think over the years, it enabled me to look at the product differently. And I think that as a developer today, what developers do today is radically more advanced than what I was expected to do. I did not have continuous delivery. I did not really have a continuous feedback. I did not have the responsibility for testing whilst developing. So there was no combined thing. It was very segmented and siloed. And I think over the years, I’ve seen what I used to do as an art form become extremely sophisticated with a lot more requirements of it than was there. And I think for my career, I was a VP of Products at IBM Tivoli, I was a CTO at BMT software, and I worked for CA Technology prior to its acquisition by Broadcom, where I was the Senior Vice President of Product Strategy.

But in all those jobs, it enabled me to really understand the value of the development practices and how these practices can be really honed in, in support between the products and the IT operations world, as well as really more than anything else, the connection between the developer and the consumer. That was never part of my role. I had no clue who was using my product. And as an operations person, I only knew the people that were unhappy. So I think today’s developer is a much more… They tend to be highly skilled in a way that I was not because coding is part of their role. Communication, collaboration, the integration, the cloud computing aspects, everything that you have to now include from an infrastructure is significantly in greater complexity. And I’ll summarize by saying that I was also an analyst for Gartner for many years and I covered the DevOps toolchains.

And the one thing I found out there was there isn’t a thing called DevOps that you can put into a box. It’s very much based upon a culture and a type of company that you’re with. So everybody had their interpretation of their box. But one thing was very common, the complexity in all cases was significantly high and growing to the point where the way that you provision and deliver the infrastructure in support of the code you’re building, became much more of a frontline job than something that you could accept as being a piece of your role. It became a big part of your role. And that’s what really drove me towards joining Quali, because this company is dealing with something that I found as being an inhibitor to my productivity, both as a developer, but also when I was also looking up at the products, I found that trying to work out what the infrastructure was doing in support of what the code was doing was a real nightmare.

Shane Hastie: Let’s explore that when it comes, step back a little bit, you made the point about DevOps as a culture. What are the key cultural elements that need to be in place for DevOps to be effective in an organization?

The elements of DevOps culture [06:28]

David Williams: Yeah, this is a good one. When DevOps was an egg, it really was an approach that was radically different from the norm. And what I mean, obviously for people that remember it back then, it was the continuous… Had nothing to do with Agile. It was really about continuous delivery of software into the environment in small chunks, microservices coming up. It was delivering very specific pieces of code into the infrastructure, continuously, evaluating the impact of that release and then making adjustments and change in respect to the feedback that gave you. So the fail forward thing was very much an accepted behavior, what it didn’t do at the time, and it sort of glossed over it a bit, was it did remove a lot of the compliance and regulatory type of mandatory things that people would use in the more traditional ways of developing and delivering code, but it was a fledging practice.

And from that base form, it became a much, much bigger one. So really what that culturally meant was initially it was many, many small teams working in combination of a bigger outcome, whether it was stories in support of epics or whatever the response was. But I find today, it has a much bigger play because now it does have Agile as an inherent construct within the DevOps procedures, so you’ve got the ability to do teamwork and collaboration and all the things that Agile defines, but you’ve also got the continuous delivery part of that added on top, which means that at any moment in time, you’re continually putting out updates and changes and then measuring the impact. And I think today’s challenge is really the feedback loop isn’t as clear as it used to be because people are starting to use it for a serious applications delivery now.

The consumer, which used to be the primary recipient, the lamp stacks that used to be built out there have now moved into the back end type of tech. And at that point, it gets very complex. So I think that the complexity of the pipeline is something that the DevOps team needs to work on, which means that even though collaboration and people working closely together, it’s a no brainer in no matter what you’re doing, to be honest. But I think that the ability to understand and have a focused understanding of the outcome objective, no matter who you are in the DevOps pipeline, that you understand what you’re doing and why it is, and everybody that’s in that team understands their contribution, irrespective of whether they talk to each other, I think is really important, which means that technology supporting that needs to have context.

I need to understand what the people around me have done to be code. I need to know what stage it’s in. I need to understand where it came from and who do I pass it to? So all that needs to be not just the cultural thing, but the technology itself also needs to adhere to that type of practice.

Shane Hastie: One of the challenges or one of the pushbacks we often hear about is the lack of governance or the lack of transparency for governance in the DevOps space. How do we overcome that?

Governance in DevOps [09:29]

David Williams: The whole approach of the DevOps, initially, was to think about things in small increments, the bigger objective, obviously being the clarity. But the increments were to provide lots and lots of enhancements and advances. When you fragmented in that way and give the ability for the developer to make choices on how they both code and provision infrastructure, it can sometimes not necessarily lead to things being unsecure or not governed, but it means that there’s different security and different governance within a pipeline. So where the teams are working quite closely together, that may not automatically move if you’ve still got your different testing team. So if your testing is not part of your development code, which in some cases it is, some cases it isn’t, and you move from one set of infrastructure, for example, that supports the code to another one, they might be using a completely different set of tooling.

They might have different ways with which to measure the governance. They might have different guardrails, obviously, and everything needs to be accountable to change because financial organizations, in fact, most organizations today, have compliance regulations that says any changes to any production, non-production environment, in fact, in most cases, requires accountability. And so if you’re not reporting in a, say, consistent way, it makes the job of understanding what’s going on in support of compliance and governance really difficult. So it really requires governance to be a much more abstract, but end to end thing as opposed to each individual stay as its own practices. So governance today is starting to move to a point where one person needs to see the end to end pipeline and understand what exactly is going on? Who is doing what, where and how? Who has permissions and access? What are the configurations that are changing?

Shane Hastie: Sounds easy, but I suspect there’s a whole lot of… Again, coming back to the culture, we’re constraining things that for a long time, we were deliberately releasing.

Providing freedom withing governance constraints [11:27]

David Williams: This is a challenge. When I was a developer of my choice, it’s applicable today. When I heard the word abstract, it put the fear of God into me, to be honest with you. I hated the word abstract. I didn’t want anything that made my life worse. I mean, being accountable was fine. When I used to heard the word frameworks and I remember even balking at the idea of a technology that brought all my coding environment into one specific view. So today, nothing’s changed. A developer has got to be able to use the tools that they want to use and I think that the reason for that is that with the amount of skills that people have, we’re going to have to, as an industry, get used to the fact that people have different skills and different focuses and different preferences of technology.

And so to actually mandate a specific way of doing something or implementing a governance engine that inhibits my ability to innovate is counterproductive. It needs to have that balance. You need to be able to have innovation, freedom of choice, and the ability to use the technology in the way that you need to use to build the code. But you also need to be able to provide the accountability to the overall objective, so you need to have that end to end view on what you’re doing. So as you are part of a team, each team member should have responsibility for it and you need to be able to provide the business with the things that it needs to make sure that nothing goes awry and that there’s nothing been breached. So no security issues occurring, no configurations are not tracked. So how do you do that?

Transparency through tooling [12:54]

David Williams: And as I said, that’s what drove me towards Quali, because as a company, the philosophy was very much on the infrastructure. But when I spoke to the CEO of the company, we had a conversation prior to my employment here, based upon my prior employer, which was a company that was developing toolchain products to help developers and to help people release into production. And the biggest challenge that we had there was really understanding what the infrastructure was doing and the governance that was being put upon those pieces. So think about it as you being a train, but having no clue about what gauge the track is at any moment in time. And you had to put an awful lot of effort into working out what is being done underneath the hood. So what I’m saying is that there needed to be something that did that magic thing.

It enabled you with a freedom of choice, captured your freedom of choice, translated it into a way that adhered it to a set of common governance engines without inhibiting your ability to work, but also provided visibility to the business to do governance and cost control and things that you can do when you take disparate complexity, translate it and model it, and then actually provide that consistency view to the higher level organizations that enable you to prove that you are meeting all the compliance and governance rules.

Shane Hastie: Really important stuff there, but what are the challenges? How do we address this?

The challenges of complexity [14:21]

David Williams: See, the ability to address it and to really understand why the problems are occurring. Because if you talk to a lot of developers today and say, “How difficult is your life and what are the issues?”, the conversation you’ll have with a developer is completely different than the conversation you’ll have with a DevOps team lead or a business unit manager, in regards to how they see applications being delivered and coded. So at the developer level, I think the tools that are being developed today, so the infrastructure providers, for example, the application dictates what it needs. It’s no longer, I will build an infrastructure and then you will layer the applications on like you used to be able to do. Now what happens is applications and the way that they behave is actually defining where you need to put the app, the tools that are used to both create it and manage it from the Dev and the Op side.

So really what the understanding is, okay, that’s the complexity. So you’ve got infrastructure providers, the clouds, so you’ve got different clouds. And no matter what you say, they’re all different impact, serverless, classic adoption of serverless, is very proprietary in nature. You can’t just move one serverless environment from one to another. I’m sure there’ll be a time when you might be able to do that, but today it’s extremely proprietary. So you’ve got the infrastructure providers. Then you’ve got the people that are at the top layer. So you’ve got the infrastructure technology layer. And that means that on top of that, you’re going to have VMs or containers or serverless something that sits on your cloud. And that again is defined by what the application needs, in respect to portability, where it lives, whether it lives in the cloud or it’s partly an edge, wherever you want to put it.

And then of course on top of that, you’ve got all the things that you can use that enables you to instrument and code to those things. So you’ve got things like Helm charts for containers, and you’ve got a Terraform where developing the infrastructure as code pieces, or you might be using Puppet or Chef or Ansible. So you’ve got lots of tools out there, including all the other tools from the service providers themselves. So you’ve got a lot of the instrumentation. And so you’ve got that stack. So the skills you’ve got, so you’ve got the application defining what you want to do, the developer chooses how they use it in support of the application outcome. So really what you want to be able to do is have something that has a control plane view that says, okay, you can do whatever you want.

Visibility into the pipeline [16:36]

David Williams: These are the skills that you need. But if people leave, what do you do? Do you go and get all the other developers to try and debug and translate what the coding did? Wouldn’t it be cool instead to have a set of tech that you could understand what the different platform configuration tools did and how they applied, so look at it in a much more consistent form. Doesn’t stop them using what they want, but the layer basically says, “I know, I’ve discovered what you’re using. I’ve translated how it’s used, and I’m now enabling you to model it in a way that enables everybody to use it.” So the skills thing is always going to exist. The turnover of people is also very extremely, I would say, more damaging than the skills because people come and go quite freely today. It’s the way that the market is.

And then there’s the accountability. What do the tools do and why do they do it? So you really want to also deal with the governance piece that we mentioned earlier on, you also want to provide context. And I think that the thing that’s missing when you build infrastructure as code and you do all these other things is even though you know why you’re building it and you know what it does to build it, that visibility that you’re going to have a conversation with the DevOps lead and the business unit manager, wouldn’t it be cool if they could actually work out that what you did is in support of what they need. So it has the application ownership pieces, for example, a business owner. These are the things that we provide context. So as each piece of infrastructure is developed through the toolchain, it adds context and the context is consistent.

So as the environments are moved in a consistent way, you actually have context that says this was planned, this was developed, and this is what it was done for. This is how it was tested. I’m now going to leverage everything that the developer did, but now add my testing tools on top. And I’m going to move that in with the context. I’m now going to release the technology until I deploy, release it, into either further testing or production. But the point is that as things get provisioned, whether you are using different tools at different stages, or whether you are using different platforms with which to develop and then test and then release, you should have some view that says all these things are the same thing in support of the business outcome and that is all to do with context. So again, why I joined Quali was because it provides models that provide that context and I think context is very important and it’s not always mentioned.

As a coder, I used to write lots and lots of things in the code that gave people a clue on what I was doing. I used to have revision numbers. But outside of that and what I did to modify the code within a set of files, I really didn’t have anything about what the business it was supporting it. And I think today with the fragmentation that exists, you’ve got to give people clues on why infrastructure is being deployed, used, and retired, and it needs to be done in our life cycle because you don’t want dormant infrastructure sitting out there. So you’ve got to have it accountable and that’s where the governance comes in. So the one thing I didn’t mention earlier on was you’ve got to have ability to be able to work out what you’re using, why it’s being used and why is it out there absorbing capacity and compute, costing me money, and yet no one seems to be using it.

Accountability and consistency without constraining creativity and innovation [19:39]

David Williams: So you want to be out of accountability and with context in it, that at least gives you information that you can rely back to the business to say, “This is what it cost to actually develop the full life cycle of our app, in that particular stage of the development cycle.” So it sounds very complex because it is, but the way to simplify it is really to not abstract it, but consume it. So you discover it, you work out what’s going on and you create a layer of technology that can actually provide consistent costing, through consistent tagging, which you can do with the governance, consistent governance, so you’re actually measuring things in the same way, and you’re providing consistency through the applications layer. So you’re saying all these things happen in support, these applications, et cetera. So if issues occur, bugs occur, when it reports itself integrated with the service management tools, suddenly what you have there is a problem that’s reported in response to an application, to a release specific to an application, which then associates itself with a service level, which enables you to actually do report and remediation that much more efficiently.

So that’s where I think we’re really going is that the skills are always going to be fragmented and you shouldn’t inhibit people doing what they need. And I think the last thing I mentioned is you should have the infrastructure delivered in the way you want it. So you’ve got CLIs, if that’s a preferred way, APIs to call it if you want to. But for those who don’t have the skills, it’s not a developer only world if I’m an abstraction layer and I’m more of an operations person or someone that doesn’t have the deep diving code skills, I should need to see a catalog of available environments built by coding, built by the people that actually have that skill. But I should be able to, in a single click, provision an environment in support of an application requirement that doesn’t require me to be a coder.

So that means that you can actually share things. So coders can code, that captures the environment. If that environment is needed by someone that doesn’t have the skills, but it’s consistently, because it has all that information in it, I can hit a click. It goes and provisions that infrastructure and I haven’t touched code at all. So that’s how you see the skills being leveraged. And you just got to accept the fact that people will be transient going forward. They will work from company to company, project to project, and that skills will be diverse, but you’ve got to provide a layer with which that doesn’t matter.

Shane Hastie: Thank you very much. If people want to continue the conversation, where do they find you?

David Williams: They can find me in a number of places. I think the best place is I’m at Quali. It is David.W@Quali.com. I’m the only David W., which is a good thing, so you’ll find me very easily. Unlike a plane I got the other day, where I was the third David Williams on the plane, the only one not to get an upgrade. So that’s where you can find me. I’m also on LinkedIn, Dave Williams on LinkedIn can be found under Quali and all the companies that I’ve spoken to you about. So as I say, I’m pretty easy to find. And I would encourage, by the way, anybody to reach out to me, if they have any questions about what I’ve said. It’d be a great conversation.

Shane Hastie: Thanks, David. We really appreciate it.

David Williams: Thank you, Shane.

Mentioned

About the Author

.
From this page you also have access to our recorded show notes. They all have clickable links that will take you directly to that part of the audio.

Subscribe for MMS Newsletter

By signing up, you will receive updates about our latest information.

  • This field is for validation purposes and should be left unchanged.