Mobile Monitoring Solutions


Presentation: Quarkus and GraalVM: Booting Hibernate at Supersonic Speed, Subatomic Size

MMS Founder

Article originally posted on InfoQ. Visit InfoQ


Grinovero: I have a lot of very interesting things to share so I will start right away and not waste much time. I hope to fit it all in. First, let me introduce myself, I’m Sanne Grinovero. I’m Dutch, I’m Italian, I’m living in London and I came here because I was invited to introduce this to you, so I’m very glad for that, thank you. I work for Red Hat where I’m in the middleware research and development area, so mostly Java. I’m known to lead the Hibernate team. I have been working for 10 years on Hibernate now and more recently we started this Quarkus thing, which started a bit like, “Hi, can you improve bootstrap times because clouds work better if you boot faster,” and many other ideas.

While working on that, all this time I have been contributing to several other open-source projects. That’s a little list; I have started contributing a bit to GraalVM as well now because I got really interested in the potential of this project. What am I going to talk about? I’ll try to introduce you to GraalVM and native images, what they are and what the problems and benefits are with them, and that’s where we get to Quarkus. Then I’ll do a little demo, hopefully showing you how nice it is to code on this platform. I really hope to be able to explain in a bit more detail how it actually works behind the scenes, so what the tricks are and how we actually get this stuff working.

Native Image

Let me start with native image. This is a term I’ve been throwing around, but it’s not always clear. What is a native image? Let’s start with a very quick demo on that. First thing, if you want to try this at home, it’s really simple. You will need to download the GraalVM distribution, and it looks like a JDK. You point your environment variables like JAVA_HOME to where you extracted it, and you probably want to put it on your path as well because it has some additional binaries in there which are useful. Then you build your application as usual, but there is an additional phase called native-image, which can convert a JAR into a native image: a platform-specific, highly optimized binary that runs straight away.

Let’s see this directly. I have a demo here, which is extremely simple. You all know Hello World, as a main class here, which is printing Hello World. Let’s make a native image out of this. The first thing I do is, of course, I need to compile my file. Initially, we had this and now we have the Main.class file as well. Now we need the JAR from this, ‘cfe,’ the name of the application I want to run, what’s the main entry point, and which classes to include. Ok, let me see if we created a JAR – it’s there. Now, in Java, I run this like that, and it’s printing, “Hello World!” Nothing fancy or special about this so far.

We can also do this: native-image -jar app.jar. This is using the GraalVM compiler to compile this code and the JDK code it uses to build a highly optimized binary. Let me just show you what this looks like. There is this app binary now; you see this is an executable and it’s 2.4 megabytes. This is a Linux ELF binary, so it’s linking directly to some other libraries and I can run it like that: “Hello World!” Another benefit is this doesn’t require a JDK. You just take this binary, drop it in a Docker container and that’s your application. It’s ready to roll, ready to go.
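The demo above can be reproduced with something like the following. The transcript does not include the source file, so this is only a minimal sketch: the `greeting()` helper is an addition for illustration, and the file names `Main.java` and `app.jar` are assumed from the commands mentioned in the talk.

```java
// Main.java: the whole application used in the native-image walkthrough.
//
// Assumed command sequence, following the steps described in the talk:
//   javac Main.java
//   jar cfe app.jar Main Main.class   (c = create, f = jar file name, e = entry point)
//   java -jar app.jar                 (runs on the JVM)
//   native-image -jar app.jar         (needs a GraalVM distribution on the PATH)
//   ./app                             (runs the native executable, no JDK required)
class Main {
    // Kept as a separate method so the message can be checked without capturing stdout.
    static String greeting() {
        return "Hello World!";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

The same JAR feeds both the JVM run and the native build, which is why the talk can compare the two side by side.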

That’s a native image. The question is, of course, “Ok, that’s Hello World. How do I get it running for more complex applications?” Well, you need to learn about some of the limitations. It cannot just convert any application and make it executable.


Let’s talk a second about GraalVM. GraalVM is a large project which has several interesting tools in there; it has the Truffle components, which I’m not going to talk about today. It’s also integrated in OpenJDK, so you can actually run the Graal compiler within the JVM to optimize your code instead of C2. Chris Thalinger explained how to do that earlier in this room. We are going to focus now on the Substrate VM component, which is using the GraalVM compiler to compile your application classes, the JDK classes that your application is using, and these bits from Substrate VM, which are the implementation of some components that cannot be ported otherwise. That’s all mangled together, extremely optimized by the compiler, and you get this executable out of that.

A big thing about this is that it needs to run a static analysis, so this is ahead-of-time compilation. You’re not doing just-in-time compilation like we’re used to in Java. It needs to compile everything, which means it also needs to see everything. That brings us to the closed world assumption: when you’re building the application, you need to have all the code there, and the compiler needs to be able to see all the code and all the possible flows that are going to be executed.

There are great benefits from this approach, mostly like dead code elimination. For example, in the demo I ran before, all the components in the JDK – and I know we have modules, but modules are very [inaudible 00:06:15]. This is looking at method by method, field by field. Are you actually going to need this field in your final image? If not, it’s thrown out, it’s not being included. That’s keeping the memory costs very low, the bootstrap times very low and just the size on this, all the resources pushed down to the minimum. We really like this aspect of aggressive dead code elimination, which comes from a static analysis.

It’s a strong suit, but it’s also a very weird way to look at the Java platform because we’re not used to this, and all the libraries we’re running and all the platforms we’re running make assumptions on the fact that it is a very dynamic and rich platform that you can do stuff, generate code at run time and do interesting things.

These limitations bring us to this: you cannot have dynamic classloading. In practice, if you try to get a reference to a class loader, or to the current context class loader, you’ll get a null reference back, which means most of the libraries out there will throw some random null pointer exceptions, because nobody expected when writing this code that it could return null, but now it’s returning null when it’s compiled.

This also implies you cannot deploy JARs and WARs at runtime. You started with Tomcat years ago, and then you produce a WAR and you load multiple WARs, like plugins of your container, and you load these things dynamically. That’s not possible because it violates the principle that the compiler needs to be able to see all the code so that it can actually prune the dead code. It cannot know what it can remove if you are allowed to introduce additional code which might invoke methods, JDK methods, or use constants which it actually removed upfront.

Another thing as well is that the management extensions and the tooling interface are not available, which means no agents, which means no JRebel, no Byteman, no profilers, no tracers. This is killing a whole area of very interesting tools that we are used to, and there is no Java debugger either. You cannot use the JVM debugger to connect to one of these native images because, if you didn’t tell the compiler that you’re going to use these kinds of things, the infrastructure to support this stuff has been thrown out to save more memory and more disk. Now, of course, some of these things have flags that you can enable at build time and that make the compiler behave differently so that they are not thrown out, because you might want to use them later, but you have to tell it explicitly.

Other areas are like “no security manager”, and we’re like, “Finally,” because that’s very complex to handle; but also, why do you need the security manager if nobody can actually load new code in there and there is no classloader? No finalize() support, which is great because it’s been deprecated for years and now you just cannot use it, so you’ll [inaudible 00:09:30]. This is a bit of a sore point: InvokeDynamic and MethodHandles have only very limited support, so if you’re doing crazy optimizations based on InvokeDynamic and MethodHandles, there’s a good chance that this will not compile to native either. Why is that? Because you are essentially generating code at runtime, and that’s a violation of the closed world principle.

Reflection – if you allow reflection, the compiler cannot see which fields you are going to read, which methods you are going to invoke, and so on. Reflection is not allowed, except of course if you explicitly tell the compiler, “Look, when you’re compiling this, can you keep in mind that I will invoke reflectively this one constructor of this class?” If you tell it that upfront, it will just keep the infrastructure in the image so that this is then going to work at runtime. But if you forget one of these things, it’s not going to work.
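To make the reflection point concrete, here is a small sketch of the kind of call that the static analysis cannot follow on its own. `ReflectionDemo` and its nested `Fruit` class are hypothetical names used only for illustration. On the JVM this runs as-is; in a native image, the constructor looked up here is exactly the sort of element that would have to be registered for reflective access at build time.

```java
import java.lang.reflect.Constructor;

class ReflectionDemo {
    // Hypothetical entity standing in for a class that a framework would instantiate.
    static class Fruit {
        final String name;

        Fruit() {
            this.name = "unnamed";
        }
    }

    // The class name arrives as a string, so the compiler cannot tell from this
    // method alone which constructor will be invoked.
    static Object instantiate(String className) {
        try {
            Class<?> type = Class.forName(className);
            Constructor<?> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            return constructor.newInstance();
        } catch (ReflectiveOperationException e) {
            // In a native image this is roughly where an unregistered class or
            // constructor would surface as a failure at run time.
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // A real framework would read the name from configuration or metadata.
        Object fruit = instantiate(Fruit.class.getName());
        System.out.println(fruit.getClass().getSimpleName());
    }
}
```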

Of course, dynamic proxies, aspect-oriented programming, interceptors created at runtime, loading resources, using JNI or Unsafe memory: all of these things are not available either unless you hint to the compiler, “Look, I have specific needs. I need this method to work in these conditions,” and then it can fold this information into the optimization decisions. It’s an opt-in thing, you have to tell it.

Static initialization is another interesting aspect. This is a very special behavior. It applies to every block which is static in your code, which also means simple constants: you have a “public static final String” and then there is something there; that’s a static initializer. You know that when your application is initializing this class, that string is created at that point. What’s happening here is that this string is actually created during the compilation of your application. It’s not just strings; it’s every more complex static block that you might have. For every more complex object that you might be initializing, the code is run within the context of the build, so the compiler is running it.

It’s actually checking what kind of code you are including in your static initializers, because some things are not legal, and we’ll get back to that later. Essentially these things that we look at, they look like constants when you’re reading the code. We know they’re not really constants because we have the reflection API. You can say, “I said that’s a final field, but let’s make it writeable again. Remove the protections because my library needs to do something, so I’m going to change it again.” In the JVM, all these things are allowed and they kind of have back doors. You cannot allow these back doors to run in GraalVM because that would mutate the constants, which are strong assumptions that the compiler again is going to use to optimize your code like crazy.

In a way, I prefer this; if it’s a constant, it really is a constant, and the optimizations behind that are quite strong. What happens is that we take a snapshot of the memory holding the objects you created during the build, and that’s compiled into the native image, already initialized and ready to go. This goes in the constant pool of the binary.

Some things are not allowed to run in a static initializer: opening file handles, opening sockets, or starting background threads. Many things need a timer started in the background to do some periodic work; if you do anything like that, the GraalVM compiler actually sees it, and that’s a violation of the rules.

Of course, you don’t want to start these timers within the compiler. You want to start them in the final application. To be fair, this is changing right now in [inaudible 00:13:35], so I’m not sure in what direction we’re going, but it seems the idea is that rather than failing the compilation, it will automatically detect that these blocks should not be run during the build, and they will be run later. This is a change from how the current latest GraalVM release behaves, which would just fail the build: you’re not allowed to do this in static initialization, and you can explicitly opt in to deferred initialization for a specific list of classes.

These three are illegal, but you also have to be careful with other things. If you’re taking the current timestamp in a constant because you want to know when this application was booted, that constant is actually going to hold the timestamp of when the application was built, maybe on a different machine, maybe years before it was actually started on this new machine. You also don’t want to capture environment variables or things like that, system-dependent constants. If you’re doing optimizations based on the number of CPU cores your system has, you might be capturing the number of cores of the system that compiled your application, which is maybe not what you really meant to look at. You need to be careful about what these static blocks contain.
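The timestamp and CPU-core pitfalls can be sketched as follows. `StartupInfo` is a hypothetical class, and the sketch assumes its static initializer is one of those executed during the native image build; on a plain JVM both styles behave the same, which is what makes the difference easy to miss.

```java
import java.time.Instant;

class StartupInfo {
    // Evaluated once, when the class is initialized. If that happens during the
    // native image build, these hold the build machine's clock and core count.
    static final Instant CAPTURED_AT_CLASS_INIT = Instant.now();
    static final int CORES_AT_CLASS_INIT = Runtime.getRuntime().availableProcessors();

    // Evaluated on every call, so the value always comes from the machine
    // that is actually running the application.
    static int cores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        // Reading the clock inside main() ties the value to application start.
        Instant bootedAt = Instant.now();
        System.out.println("class initialized at: " + CAPTURED_AT_CLASS_INIT);
        System.out.println("application booted at: " + bootedAt);
        System.out.println("cores seen now: " + cores());
    }
}
```

Moving system-dependent reads out of static fields and into methods is one simple way to keep them out of the build-time snapshot; opting the class into deferred initialization, as mentioned above, is the other.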

I mentioned you need to disable some things, like JMX, which is not supported. Let’s look at some examples from the Hibernate library, which is what we are getting into. In Hibernate, you can enable the management bean; we have some configuration attributes that allow you to look at statistics, like what’s my slowest query, what’s the right query ratio and all kinds of other metrics that are registered in this management bean if you enable it in the configuration.

At a very high level the code will look like this. If it’s enabled, then we register some things on the management bean. This code will not compile because the registerJMX() method will do something which invokes the management beans API, which is a violation of your reachability. You’re reaching into code which is not implemented, so there is no way to compile this. The compiler cannot see that this flag is maybe off in your specific configuration, because what would this binary do when you actually enable it in your configuration? It needs to know what to do in this case. The correct way to disable the feature is to have code that looks like this. You have to have a constant which blocks any flow of code from going into that method. The compiler sees this, and since reflection is not allowed, this is a real constant. This is always off. This method will never be invoked; it’s not even being compiled. The whole code is going to disappear. The trick is you need to make sure that it looks like that, that’s what we’re getting at.
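The slide code is not part of the transcript, so the following is only a reconstruction of the two shapes being contrasted, under stated assumptions: `StatisticsBootstrap`, `startWithRuntimeFlag`, `startWithConstantFlag`, and `JMX_ENABLED` are hypothetical names; only `registerJMX()` is named in the talk, and its body here is a placeholder that merely touches the management API.

```java
import java.lang.management.ManagementFactory;

class StatisticsBootstrap {
    // Shape 1: the flag is only known at run time, so a static analysis has to
    // treat registerJMX() as reachable and pull the management API in.
    static boolean startWithRuntimeFlag(boolean jmxEnabled) {
        if (jmxEnabled) {
            registerJMX();
            return true;
        }
        return false;
    }

    // Shape 2: a real constant guards the call. The branch can never be taken,
    // so the call, and everything only reachable through it, can be dropped.
    private static final boolean JMX_ENABLED = false;

    static boolean startWithConstantFlag() {
        if (JMX_ENABLED) {
            registerJMX();
            return true;
        }
        return false;
    }

    // Placeholder body: a real implementation would register a statistics MBean.
    private static void registerJMX() {
        ManagementFactory.getPlatformMBeanServer();
    }
}
```

Both shapes sit in one class here only for comparison. In an actual port only the second would remain, since keeping the first anywhere on a reachable path would keep `registerJMX()` reachable too.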

You need to adapt your code to make sure that the code flow you have is really legal code, without these assumptions that are no longer valid. It’s not that much work for your own library. The big problem is you need to compile all your dependencies as well, because everything is going to be compiled and optimized into the single binary that represents your application. It’s not just your code. It’s really your dependencies and the dependencies of your dependencies. Everything that you have on the classpath, including the JDK, is included in this. That’s a lot of code that you probably need to verify is behaving as you expect.

What’s the impact on Hibernate? I’m taking Hibernate here first, because I ported Hibernate to Quarkus myself, so I can answer very deep questions about this. What we learned about porting Hibernate to GraalVM can be applied to all the other libraries. It’s interesting to discuss Hibernate because pretty much all the illegal things that we are seeing here, it does them somewhere. We had to figure out an alternative plan to make sure that you can have all the benefits of Hibernate, with no trade-offs, and still compile your application to a native image.

Hibernate ORM & GraalVM

Let me switch to a mind map here. These are pretty much the steps that we needed to do. Let me just select some; we don’t have to discuss all of them. For example, resource loading: you might have an import SQL script that you want to be imported into your database when your application is run. This is mostly done for development, but some people do it in production as well. You need to have this resource included in the binary, and that’s what the Quarkus Hibernate extension will do. If it sees that you’re importing this, just by parsing your configuration, it will include it as a resource in there. Easy.

No support for JMX, you’ve seen that before. It’s just disabled in this specific code, which is generating the bootstrap of your application. It’s making sure that the compiler can see that this is really off, and there is no way for you to enable it. Same for the security manager, and then of course, there were some libraries that were starting threads in the background on class initialization. That’s a bit dodgy anyway, so I just patched them to make sure that doesn’t happen, and that’s resolved.

Reflection – you might assume Hibernate does a lot of reflection. In fact, it doesn’t have to. There are alternative ways that it can enhance mostly for performance reasons, it can enhance access to your entities. It turns out that using reflection in GraalVM is actually super-efficient because the GraalVM, as soon as you tell it, “I’m going to do this,” it’s not really using the same code, it’s just shortcutting the whole operation. What’s missing here is that you need to register the reflective needs of your application. All the entities you have, all the constructors of these entities, the accessors of these collections, the getters, setters, all these methods that the framework needs to invoke or read, need to be registered in a configuration file for the compiler, which is to say, “All of this stuff needs to have reflective access.”

If you are not using Quarkus, you can use a JSON file and just list all of that stuff in there, but it’s very tedious. What Quarkus is doing is literally, “I know your application.” You’re not allowed to mutate it later, so you’re not going to deploy additional entities later. You’re not going to instrument it, so we’re just looking at your entities on the classpath. That’s the list of things that we automatically then register into the compiler. There is a callback from the GraalVM compiler into the Quarkus framework, which then delegates to all the frameworks that are integrated in the Quarkus build, and each of them can list, “I’m going to search for JPA entities. These are the classes for which I will need to have reflective access.” Then you’re good to go. The compiler knows this and it’s folding this information in.

Another thing we had to do was on dependencies; all the Hibernate dependencies have been converted to run fine on GraalVM. A big one was that I needed to convert even the JDBC drivers. The PostgreSQL driver had some issues initially. First off, it does reflection internally. We patched this to not do that anymore. It also uses phantom references, which used to be another thing that was not really allowed within GraalVM, but that is actually out of date. Now phantom references work fine within GraalVM, but there was a patch for that.

Then you discover interesting things, like the MariaDB driver; when it’s authenticating to your database, it’s actually initializing JavaFX in background. Did you know that? I had no idea, so every time I ever connected to MariaDB or MySQL, you have additional memory consumed by your JDK because it’s initializing all of these Swing, Java 2D, and JavaFX classes just because in the security options, one of the ways to connect to get the password is to open a dialogue box and show you the dialogue box. Even if you’re not using that, that code is still being initialized and triggers the JVM class initialization in a chain. That’s just an example of all the things we had to do to get there.

Hibernate also needs ANTLR to do the parsing of your queries. It needs a transaction manager – the Narayana transaction manager, which is now probably better known as the JBoss transaction manager, was also converted to have a Quarkus extension and work there. XML parsers need to work, a connection pool, and so on.

Let’s go to the more interesting things. Hibernate cannot create proxies to do lazy initialization of your entities, and it cannot enhance your entities at runtime. If you are very familiar with Hibernate, you might know there are also plugins for Maven, Gradle, and Ant to do class enhancement of your entities at build time. That’s what you’re using. Technically all your entities are being enhanced before the compilation phase to native. The interesting fact is you don’t have to set up these tools anymore. With Quarkus, we already set up all the tools automatically, so that when all the classes of your application are inspected, we know, “There are JPA entities here, let me enhance them.” Then it’s the enhanced version of the class that’s put in the JAR with all the other enhancements that Quarkus is generating, and that’s later compiled to native code. There is no dynamic classloading, there are no proxies, there is no runtime bytecode generation actually happening in your application because it was already done before that.

Another aspect of this is you don’t really need to do classpath scanning. This is probably the slowest component of booting Hibernate. When you’re starting it, it needs to find out which are the entities in your model. To find them, we need to look for JPA annotations in all your dependencies to figure out where they are. This stuff can be done at build time. Then there is no need to repeat it every time you’re starting the same application. Since we are talking here about microservices or immutable applications, and not servers that behave like a container in which you can dynamically add or remove code, everything is known during the build.

Just as you can look at your code and know which entities you have, so can the compiler look and see which entities you have, and that’s a constant. As soon as we have these constants, they are literally generated as static initializer constants in classes which are dumped on disk, and this is additional code that’s included in the compilation of your application.
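The idea of scanning once at build time and emitting the result as constants can be sketched like this. It is a toy model, not how the Quarkus extension is implemented: the `@Entity` annotation here is a local stand-in for the JPA one, the candidate classes are passed in by hand instead of being discovered on the classpath, and `GeneratedEntityIndex` is a hypothetical name for the generated class.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.List;
import java.util.stream.Collectors;

class EntityIndexGenerator {
    // Local stand-in for the JPA @Entity annotation, to keep the sketch self-contained.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Entity {}

    @Entity
    static class Fruit {}

    static class FruitResource {}

    // Build-time step: find the annotated classes among the candidates.
    static List<String> findEntities(List<Class<?>> candidates) {
        return candidates.stream()
                .filter(type -> type.isAnnotationPresent(Entity.class))
                .map(Class::getName)
                .sorted()
                .collect(Collectors.toList());
    }

    // Build-time step: turn the scan result into source code holding plain
    // constants, so nothing has to be scanned again when the application starts.
    static String generateSource(List<String> entityNames) {
        String literals = entityNames.stream()
                .map(name -> "\"" + name + "\"")
                .collect(Collectors.joining(", "));
        return "final class GeneratedEntityIndex {\n"
                + "    static final String[] ENTITIES = { " + literals + " };\n"
                + "}\n";
    }

    public static void main(String[] args) {
        List<String> entities = findEntities(List.of(Fruit.class, FruitResource.class));
        System.out.print(generateSource(entities));
    }
}
```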

When we have these constants, they can actually shrink the size of the whole application like crazy. If I can see that in your entities you’re always using a specific ID generation strategy and no other, then the other ID generation strategies which are in the Hibernate library are removed. If we see that you’re compiling your application to connect to a Postgres database, all the code that we have to support MariaDB, Oracle databases, and all the other stuff is removed. That’s because you can now rely on these constants after the build.

This area is also interesting. What we’re doing is, we can start Hibernate within the compilation. When you’re compiling and bootstrapping Hibernate up to the point that it needs to connect to the database, we don’t want it to connect to the database because you are compiling. Maybe you don’t even have the passwords to your production database, or it’s not reachable from this machine. What we did is split the phases within the framework so that we can get to a point in which all the metadata about your application has been computed. We have all the entities. We have them all enhanced and all the code that’s ready to initialize it has been run. Then we take a snapshot of that and that’s what’s included in the binary.

All that work doesn’t have to happen every time the application is run, because it’s already done. The only thing that’s missing is, “Now you really can connect to the database and we get going.” The same approach can be applied to all the other frameworks. The important thing is that you can reorganize the code into these phases, which is not what Java developers usually do. Then, we snapshot that and that’s the state of your application when it’s meant to run.
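The phase split can be pictured with a deliberately tiny model. This is not Hibernate’s actual bootstrap API: `TwoPhaseBootstrap`, `Metadata`, `buildMetadata`, and `start` are hypothetical names, and the JDBC URL is a made-up value. The point is only the shape: everything that needs no database lives in a static field that can be snapshotted, and the connection step takes its inputs at run time.

```java
import java.util.List;

class TwoPhaseBootstrap {
    // Phase 1 result: everything that can be computed without touching the database.
    static final class Metadata {
        final List<String> entityNames;

        Metadata(List<String> entityNames) {
            this.entityNames = List.copyOf(entityNames);
        }
    }

    // Computed in a static initializer, so under build-time initialization the
    // finished object is part of the snapshot rather than rebuilt on each start.
    static final Metadata METADATA = buildMetadata();

    static Metadata buildMetadata() {
        // Stand-in for the expensive work: parsing mappings, enhancing entities, and so on.
        return new Metadata(List.of("Fruit"));
    }

    // Phase 2: runs when the application starts and is the only part that needs
    // the environment (URL, credentials, a reachable database).
    static String start(String jdbcUrl) {
        return "connecting to " + jdbcUrl + " with " + METADATA.entityNames.size() + " mapped entity type(s)";
    }

    public static void main(String[] args) {
        System.out.println(start("jdbc:hypothetical://localhost/fruits"));
    }
}
```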

What Is Quarkus?

This is the introduction to Quarkus. What is it really? It’s a bit of a toolkit and a framework to start your application. We’re focusing on Java applications of course, and a little bit on Kotlin as well, so exploring it. It looks interesting; so far there is an extension for Kotlin too, if you want to play with that. It’s really designed to be really light and aiming for cloud environments. It’s designed upfront thinking about the problems of GraalVM.

What I really like is these limitations of GraalVM; they become almost like a strong point. Since we can really rely on these constants and the dynamic aspect is gone, then you can optimize even further for everything. This is what it looks like in terms of process. You have your application which gets compiled, then the Quarkus build plugins, that are for Maven and that are for Gradle, they work pretty much to inspect your application. I can see if you’re using Hibernate or not and which entities they are, and we can enhance them, apply some additional magic. That’s then a very highly optimized JAR representing that application which can run on the JVM. In fact, it runs on the JVM consuming far less memory than everything we had before, because most of the work was done at build time.

Since we’re moving into this idea that one application goes in one JVM and it’s not really changing – when you want to have a change, you just build it over and maybe you create a new container and replace the old one – it means that all of this work that we are used to doing during the initialization of the framework can really be moved into the phase in which you are creating the application. It can go the GraalVM way and build the native executable as well.

The interesting thing is, like I said before, you cannot use the Java debugger on the native executable, but it’s the same application that you can run in JVM mode as well. If you have a problem with your business logic, you just run on the JVM and you work as usual. It’s just consuming less memory and booting in way less time. It works via these extensions: each of the frameworks that we’re supporting now needs one extension to support it. All the stuff I showed you on the mind map for Hibernate, we did that for several other libraries, and you might want to add some for additional libraries that you need. They do multiple things; one of them is to make sure that the library becomes compatible with the native image of GraalVM, but that’s not the only goal. Quarkus is very interesting even if you are not planning to run it as a binary – we’ll see some more benefits now, like the development live reload capabilities – and it just takes way less memory in JVM mode.

These are some of the libraries that are supported now. There are actually many more, but these are the highlights. You might see Kafka is quite good now, Hibernate, RESTEasy, Undertow is here too although it’s not on the list, Netty, Vert.x, and we’re heavily focused on Kubernetes, going for OpenShift, and Infinispan, Prometheus, the whole thing, and there are many more coming here.

Another goal of Quarkus as a platform is ease of use; it’s going to try to give you APIs for imperative and reactive code which look similar. You can mix imperative and reactive coding, like the RESTEasy and the Vert.x things, which can be used reactively, but I’m not going to talk much about this today. With this goal of being container first, we see the size of this becomes really small and the boot time is much faster, which means you can really scale up your instances on clouds and Kubernetes without worrying too much about all the time it takes to boot the whole JVM and the warmup of the JVM, because everything was precompiled.

You can have your full application started in milliseconds. We’re looking at memory consumption, but we’re not really looking at heap sizes anymore. That’s not really interesting in Kubernetes. What we’re looking at is the total resident set size consumed by your application. What is that? That literally is the sum of all the memory regions that your application is consuming. It’s not just the heap; it’s your heap, of course, but also all the other costs that the JVM has when starting. The more threads you have, the more stacks you need, since each thread needs its own; that’s one region. Then you have the metadata of all the classes that you are including. The fewer classes you have, the less metadata. All the dead code elimination helps there too.

The compilation, the just-in-time processes, and all these things that need to happen in the JVM are consuming additional memory that you don’t necessarily see in the heap. If you’re breaching your RSS limit on the cloud, your application gets killed, so it’s important for us today to focus more on RSS than on heap consumption, even if heap consumption matters as well. This is how we measure memory – we’ll see that in a second. How much memory does it consume? In a REST application compiled to native, the total memory consumption is about 13 megabytes. This is with a one-megabyte heap. That gives you an idea that you really need to look at the whole thing, not just at the heap, which is just one part of it.
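The talk does not show how the RSS numbers were collected. As one rough way to see the gap between heap and RSS from inside a process, the sketch below reads the `VmRSS` line from `/proc/self/status`, which exists on Linux only, and prints it next to the heap usage reported by the runtime. `RssProbe` and `parseVmRssKb` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.OptionalLong;

class RssProbe {
    // Extracts the resident set size in kB from lines shaped like "VmRSS:   13312 kB".
    static OptionalLong parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                String digits = line.replaceAll("[^0-9]", "");
                if (!digits.isEmpty()) {
                    return OptionalLong.of(Long.parseLong(digits));
                }
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) throws IOException {
        Path status = Path.of("/proc/self/status");
        if (!Files.isReadable(status)) {
            System.out.println("No /proc/self/status here; RSS is not available this way.");
            return;
        }
        OptionalLong rssKb = parseVmRssKb(Files.readAllLines(status));
        Runtime runtime = Runtime.getRuntime();
        long heapUsedKb = (runtime.totalMemory() - runtime.freeMemory()) / 1024;
        // The heap is only one of the regions that add up to the RSS.
        System.out.println("heap used (kB): " + heapUsedKb);
        System.out.println("RSS (kB): " + (rssKb.isPresent() ? String.valueOf(rssKb.getAsLong()) : "unknown"));
    }
}
```

On a JVM the RSS figure is typically well above the heap figure, which is the gap the talk is pointing at.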

The whole memory consumption of the same application running on Quarkus on the JVM is about 74 megabytes. If you try to beat that with any other framework out there today, you’re unlikely to get below 140 megabytes. Both of these are running on the traditional JVM, not GraalVM, so the savings in memory are very strong. With more complex applications, which are using Hibernate as well, and which include a caching library, a connection pool, a running transaction manager, the JDBC driver and a lot more, the memory consumption is still very low and super-competitive compared to other frameworks.

Start-up time – let’s just skip this, but let’s stay with the time it takes to start the same REST demo. In native, it will start in about 14 milliseconds. On the JVM, it will start in less than one second. On a different cloud stack, it will take at least four seconds. With JPA enabled, it gets a bit slower; do you know why? The framework itself doesn’t actually get much slower at all, but it’s also connecting to the database and filling the pool: multiple connections are being authenticated to the database, the schema is being created, and all these things. You can boot it in about 50 milliseconds, including Hibernate, the transaction manager, everything. On the JVM, it’s 2 seconds; with a traditional stack, it’s getting close to 9 or 10 seconds with this specific demo we have.

Developer’s Joy

There’s this comic on the website, it’s pretty good. It feels a bit like coding in PHP. You make your code changes, and you just reload in the browser and the new code is there already. How do we do that? In practice, since it is becoming so light and fast, we can actually reboot the whole application from scratch. When you’re refreshing, everything will reboot and there is no drawback. Let’s have a look at this small demo here, just have a look at the POM file, and how it works. I’m importing the Quarkus [inaudible 00:36:08] and then I have some extensions – the ORM extension, the connection pool, RESTEasy, and the MariaDB driver. I have MariaDB running here; this is in Docker. This application has one configuration file, that’s the Quarkus configuration file. It has mostly just data source properties, telling Hibernate that it needs to drop and recreate the database every time it’s restarted.

Also, we want to log SQL statements, but just to see what it’s doing. We have default import SQL statements, you might be familiar with this. Then, there is a page and then there are just two classes in this application. This is a RESTEasy entry point, so we’re exposing a REST API to load all the fruits and then load the single fruit, create a fruit, update a fruit, and that’s it. Note that you don’t need anything else. Since we can inspect your application, then we can infer what you’re needing out of this and nothing else is needed.

Then there is one entity, the fruit entity. It uses autogeneration for IDs, it has a unique name, and that’s it. There is an Entity annotation here and a named query. Let me just show you what it looks like. I have some terminals here, “mvn package DskipTest.” Let’s package it first. It just runs a build and creates a JAR file. It ran those tests, it connected to my MariaDB locally and verified we are good to go. Let’s start Quarkus in development mode; it’s running the tests again, and then it started. This is what it looks like. There are apple, banana, and cherry in the database. We can remove some and we can add some new fruits in the database.

Let me just show you this table as well. Let me remove the sequence because it’s not there yet. There is a fruit table here. “Show create table Fruit.” Of course, it has an ID, which is the primary key, and a name, and there is a unique index on name. Now, if I want to go and make some changes here, we can say the “unique” needs to go. Let me just switch it here, and then I go here and refresh this. You see the lowest terminal had some noise here. It did a hot replace, and if we now switch to this one, a new Fruit table is shown here. The uniqueness is gone; live changes work on your entities as well.

Let me add a field; I make it public, public String. I refresh here, it’s restarting. I look at the table again, and the new column is there. It’s much more fun to work with this, and it works with anything in your application. If I want to, let’s say, add something here, “pineapple,” column four. I save this, I refresh, and pineapple is there. You can change really anything except the dependencies, because a dependency change needs a full rebuild. This is much safer than existing tools, which try to connect and use agents to replace some components because it was too heavy to reboot the application. Now it’s very light and we can reboot everything, so it’s the same as killing an application and starting it over again. That’s the LiveReload capability.
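
As a rough illustration of the restart-on-refresh idea, here is a minimal sketch in plain Java. It is not how Quarkus implements live reload; the class name DevModeReloader, the injected timestamp supplier, and the restart action are all hypothetical. The only point is that a change check runs before each request and triggers a full restart rather than patching individual components.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch, not Quarkus code: restart the whole application
// when the sources have changed since the previous request.
final class DevModeReloader {
    private final LongSupplier newestSourceTimestamp; // assumed: newest modification time of the sources
    private final Runnable restartApplication;        // assumed: full stop and start of the application
    private long seenTimestamp;
    private int restarts;

    DevModeReloader(LongSupplier newestSourceTimestamp, Runnable restartApplication) {
        this.newestSourceTimestamp = newestSourceTimestamp;
        this.restartApplication = restartApplication;
        this.seenTimestamp = newestSourceTimestamp.getAsLong();
    }

    // Called at the start of every incoming request (the browser refresh).
    // Returns true when a restart happened before the request was served.
    boolean beforeRequest() {
        long current = newestSourceTimestamp.getAsLong();
        if (current <= seenTimestamp) {
            return false; // nothing changed, serve the request as usual
        }
        seenTimestamp = current;
        restartApplication.run(); // reboot everything instead of replacing components
        restarts++;
        return true;
    }

    int restarts() {
        return restarts;
    }

    public static void main(String[] args) {
        long[] fakeClock = {100L}; // stands in for file modification times
        DevModeReloader reloader = new DevModeReloader(
                () -> fakeClock[0],
                () -> System.out.println("restarting application"));
        System.out.println("restart on first request: " + reloader.beforeRequest());
        fakeClock[0] = 200L; // simulate saving a source file
        System.out.println("restart after edit: " + reloader.beforeRequest());
    }
}
```

Such a design only makes sense when a full restart is cheap, which is the property the talk attributes to the platform.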

The other alternative is that we can build a native image. This is going to take some time. We only have minutes left and I think it will take four minutes to build. It runs the tests first, it has created the JAR version, and now it is starting the native image phase of the Quarkus build. Let’s have a look at the log here. It’s invoking native image with a ton of special flags, but that’s not all that Quarkus is doing. In previous phases, Quarkus also generated a bunch of bytecode, which is stored on disk, and these are all the callbacks that the compiler is running. This is the compiler, and here it is booting Hibernate. It has selected the dialect and started the transaction manager, XNIO, and some other technologies. They have been booted within the JVM context of the compiler phase.

In a traditional application server, when you’re deploying something, there is a lot of stuff that needs to happen. Take again the Hibernate example; you need to parse, say, the persistence XML file. It’s cheap to parse a single file, but before you can do that, you have to initialize all the classes of the JVM that enable the XML parser subsystem. That actually takes a lot of time the first time you do it.

Within Quarkus, the XML file is not really parsed at runtime; it was parsed before, which means the XML parser implementation of the JVM is not included in your final image because it’s not needed anymore. It’s the same for anything related to annotation lookups. We’re not really reading your annotations at runtime, because they have been read before, at build time, and that includes validating that your model is fine and all of these things. Since we ran Hibernate within the compiler, an invalid model would have failed the build. You would have got feedback about invalid code already.

All that we do is then recorded into bytecode, and you get a static initializer or a main, depending on what the specific framework needs to do. Let’s go back a second to see if that was done; it’s still compiling. It’s a very heavy process and it takes a lot of memory, which we can probably see here. See, this is a poor dual-core machine with hyper-threading, but it’s taking my four hyper-threaded CPUs to the extreme, and I don’t have more memory than that, so it’s trying to use all of it.
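
The split between parsing at build time and replaying at runtime can be sketched in plain Java. This is an illustration under stated assumptions, not the Quarkus mechanism: Quarkus records the result as generated bytecode, while this sketch keeps it as a list of in-memory actions. The class BuildTimeRecordingSketch, the simple key = value descriptor format, and the setting names are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of "parse at build time, replay at runtime".
final class BuildTimeRecordingSketch {

    // Build time only: stands in for an expensive parser such as the XML subsystem.
    static Map<String, String> parseDescriptor(String descriptor) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String line : descriptor.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                // Invalid input is reported during the build, not at startup.
                throw new IllegalArgumentException("Invalid descriptor line: " + trimmed);
            }
            settings.put(trimmed.substring(0, eq).trim(), trimmed.substring(eq + 1).trim());
        }
        return settings;
    }

    // Build step: parse and validate now, keep only the actions needed to boot.
    static List<Consumer<Map<String, String>>> record(String descriptor) {
        List<Consumer<Map<String, String>>> recorded = new ArrayList<>();
        for (Map.Entry<String, String> entry : parseDescriptor(descriptor).entrySet()) {
            String key = entry.getKey();
            String value = entry.getValue();
            recorded.add(runtimeSettings -> runtimeSettings.put(key, value));
        }
        return recorded;
    }

    // Runtime: replay the recorded actions; the parser is never called here.
    static Map<String, String> boot(List<Consumer<Map<String, String>>> recorded) {
        Map<String, String> runtimeSettings = new LinkedHashMap<>();
        for (Consumer<Map<String, String>> action : recorded) {
            action.accept(runtimeSettings);
        }
        return runtimeSettings;
    }

    public static void main(String[] args) {
        List<Consumer<Map<String, String>>> recorded = record("dialect = MariaDB\nlog_sql = true");
        System.out.println(boot(recorded));
    }
}
```

Because boot never touches parseDescriptor, a runtime built this way would not need the parser at all, which is the reason the talk gives for the XML parser being absent from the final image.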

Let me finish with the slides. The architecture of Quarkus works with these main components. There is the GraalSDK, which allows us to have these callbacks into the compiler and enable flags, or make sure that specific classes are initialized later, earlier, or with a special flag. Gizmo is a new library we created to generate the bytecode that’s dumped on disk and compiled later. Jandex is an indexer; it’s a super fast and efficient way of finding your annotations without initializing those classes. With Jandex, we scan and get a very good picture of what your application is meant to do, which annotations are there, and which technologies are being used, without needing to initialize it all.
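
To make “finding annotations without initializing classes” concrete, here is a small plain-Java sketch. It does not use Jandex, which indexes class files directly; it uses ordinary reflection on a class that has been loaded but not yet initialized. It only shows that looking up an annotation and running a static initializer are separate events. The names LazyInitSketch, Managed, InitLog, and ExpensiveService are hypothetical.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: annotation lookup does not run the static initializer.
final class LazyInitSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @interface Managed {}

    // Separate holder, so that reading the log does not initialize ExpensiveService.
    static final class InitLog {
        static final List<String> EVENTS = new ArrayList<>();
    }

    @Managed
    static final class ExpensiveService {
        static {
            // Stands in for costly startup work done in a static initializer.
            InitLog.EVENTS.add("ExpensiveService initialized");
        }

        static String hello() {
            return "hello";
        }
    }

    // "Scan": inspect the annotation only; no static method or field is touched.
    static boolean isManaged(Class<?> type) {
        return type.isAnnotationPresent(Managed.class);
    }

    public static void main(String[] args) {
        // A class literal loads the class but does not initialize it.
        Class<?> type = ExpensiveService.class;
        System.out.println("managed: " + isManaged(type));
        System.out.println("initialized after scan: " + !InitLog.EVENTS.isEmpty());
        ExpensiveService.hello(); // first real use triggers the static initializer
        System.out.println("initialized after first use: " + !InitLog.EVENTS.isEmpty());
    }
}
```

An indexer that reads class files goes one step further than this sketch, since it does not need to load the classes into the JVM at all.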

You have seen hot reload go into action in milliseconds as well. That’s not in native mode. Consider that during a hot reload, we also need to rescan your whole application. That’s fast because of these other technologies. Then there are extensions for all these other things; every library has its own extension. The purpose is both to work around the limitations of Graal, or let’s say, take advantage of the limitations of Graal, and to optimize for JVM mode. All the code needs to be split very cleanly between what runs during the build and what runs at runtime. They are different, to the point that we have different dependency sets. The dependencies that you need at runtime are a much-trimmed version of what you have during the build, because if you need something only for the build and not later, then it can go.

Build success. That took 3 minutes 38. What do we have now? We have the binary in my target. You see this? We have the JAR, which is the normal executable; let me start that one first. That’s the Quarkus application ready to run in a JVM. “-jar demo2, runner.jar” and it connected to the database, created the new schema and all that in less than two seconds on my very poor laptop. If we do the same but without Java, we can run the other executable we have, the runner. That’s six milliseconds boot time, and this is the same application. This is using RESTEasy, Undertow, the transaction manager, Hibernate, a connection to the database, and everything else, but I think the most interesting part really is this code. This code follows the standards and uses the libraries you are already used to.

There is not much for you to learn to get working with this technology. This is the usual RESTEasy with the standard annotations, and this is just standard JPA, and we can convert everything to native. Let’s test this application. You’ve seen here that it’s logging the queries; I can make the same changes as before. I can’t live reload now because this is a binary. It was highly optimized to do what it is doing. How much memory is this taking? It is consuming a total of less than 40 megabytes, and that’s total RSS, not just heap. In fact, how much heap is this using? It’s using one megabyte of heap, out of a total of 40 megabytes for the application. That’s the same application.
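
For readers who want to see the heap versus total-memory distinction on their own machine, here is a minimal sketch for JVM mode using the standard java.lang.management API. It is an assumption-laden illustration, not what was run in the demo: it reports only memory pools the JVM manages, whereas RSS is measured by the operating system and also covers native allocations and thread stacks. The class name HeapUsageSketch is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical sketch: heap is only one part of a process's memory footprint.
final class HeapUsageSketch {

    // Bytes currently used in the Java heap.
    static long heapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    // Bytes used in JVM-managed non-heap pools (for example class metadata and compiled code).
    static long nonHeapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getNonHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        System.out.println("heap used (bytes): " + heapUsedBytes());
        System.out.println("non-heap used (bytes): " + nonHeapUsedBytes());
        // RSS itself has to come from the operating system, not from this API.
    }
}
```

Comparing these two figures with the RSS reported by the operating system shows how much of a process’s footprint sits outside the heap.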

Questions and Answers

Participant 1: I think you mentioned at the beginning, when you start adding new libraries and dependencies, who is going to do the work of adapting to Quarkus, say, for a JSON library or something else?

Grinovero: It depends on how complex the library is. I did the work on Hibernate because I know Hibernate very well internally. Obviously, that was a difficult task; you wouldn’t have done that in a day. Also, I don’t expect most libraries to do all the crazy stuff that a framework like Hibernate is doing, right? For many libraries, if they’re just using reflection, it’s just a couple of flags. If they’re doing weird things during static initialization, then there is a single flag to say, “That class needs to be initialized at runtime.” It really depends on what the library is doing. I think the good news is that the GraalVM compiler is extremely thorough. It has to be; it’s analyzing all the code flow paths, so it knows every possible action of this library. It will just fail the build, telling you, “There is this illegal thing that I found in this method over there,” and you get an error, and then you will need to make an extension for Quarkus, or ask somebody to make an extension for Quarkus, pretty much.


Training Your Managers to Support Mental Health for Your Team

MMS Founder

Article originally posted on InfoQ. Visit InfoQ

We still do not offer clear advice to our organisations and managers on the best ways to raise awareness of and manage mental health in the workplace, according to a recent review of the literature on mental-health awareness training by the Institute for Employment Studies (IES) and UK’s Rail Safety and Standards Board (RSSB).

Compared with only a decade ago, the ease with which most modern workplaces now embrace serious and empathetic conversations about mental health is a cause for celebration. This is not to declare the struggle against ignorance and stigma to be over, by any means, and there is so much more to do.

The review discovered there is not enough research to verify the best ways to create medium to long-term improvement in mental-health awareness and the best ways for managers to triage this with their staff. This has led them to start their own randomised trial to attempt to answer this question.  Stephen Bevan, IES head of HR Research Development and Sally Wilson, IES senior research fellow, wrote an article explaining their concerns.

Agile and digital transformations have brought a focus to individual and team psychology in the workplace. Many managers have now heard of Thinking Fast and Slow and Drive, both highlighting the importance of psychology to team performance. A quick look at the front page of talks from Agile2019 quickly shows the emphasis on the physical and mental health of our people and teams in the technology industry. Three titles quickly stick out: “Empathy: A Keystone Leadership Habit“, “My So-Called Agile Life (as seen through my Fitbit)“, and “More about Thinking Fast and Slow“.

There is also a noticeable increase in mental-health focus from governments around the world, often as part of workplace health and safety legislation. Within the UK, the government commissioned an independent review of mental health in the workplace which identified effective people management as a core standard that organisations should meet.

The IES are concerned that Mental Health First Aid (MHFA), a popular mental-health training concept, did not have enough research to justify its popularity and use in the UK, especially since it might be legislated within the UK.

The IES are currently running a randomised assessment of line-manager training options to see if other forms of mental-health awareness and triage training will give better medium to long-term results.

The response from Nataly Bovopoulos, former CEO of MHFA, admits that more research is needed but recommends that MHFA can provide proven gains in the ability and knowledge of managers to deal with mental health issues, at least in the short term. She states that MHFA:

leads to participants having better knowledge, skills and attitudes and much of this is sustained six months after training. Participants also report having used their skills to help people they’ve been concerned about.

InfoQ will continue to follow this ongoing research, as we believe it is highly applicable to the Agile and digital movements’ focus on the best ways to create high-performance workspaces and processes which put teams and individuals first.


Presentation: Sink or Swim – Effective Collaboration Between Eng & Product

MMS Founder

Article originally posted on InfoQ. Visit InfoQ


Barmash: First of all, this talk is about collaboration between technical leaders and product managers. A very quick agenda – we’ll start talking about keys to effective collaboration, then we’ll talk through a few scenarios and frustrations that Khadija and myself brainstormed that are common based on our experience. We’ll do a little scene at the end about conflict resolution.

Ali: Just a little bit about myself: I’m currently a director of product at Before this, I worked at Compass for four and a half years. I was with that team for quite a long time and before that, a senior PM at Chloe and Isabel.

Barmash: I currently work as VP of engineering at Komodo Health, which is a healthcare analytics startup. Prior to that, I also worked at Compass. Together, we worked for about 18 months side by side, day to day. We shipped three new products, as well as did major revisions for two other products. We had a lot of things that we jointly owned.

Ali: Yes, this is an authentic relationship. This is a relationship where we were tech lead and PM on the same team.

Barmash: Yes. It certainly took some ups and downs for us to create that collaborative relationship. We’re going to try to share some of the lessons learned here.

Defining Keys to Effective Collaboration

First of all, what are the keys to effective collaboration? The five that we came up with are empathy, building trust, communication, role clarity and accountability around that role, and negotiation and conflict resolution. In fact, those of you who have been to a few of our sessions, you can see that this mimics a lot of the topics that we talked about in this track: communication, asking questions, empathy, negotiation, and other good stuff like that.

Let’s quickly talk about what are some of those things. Empathy is perspective taking. It’s, do I understand what the other person is going through? What are some of the pressures that they’re under, so that I can understand their perspective, and maybe come up with a solution that isn’t just “me, me,” but also takes into account what their needs are.

Building trust – hopefully, most of us know what trust is. It’s basically being able to rely on the fact that the person you’re working with will do as they say, and having a track record. One framing that I like is that there are two dimensions: there is intent, and then there’s capability. Maybe I trust your intent, but I don’t think you’re up to the job. Or maybe I think you’re very capable, but I fundamentally think you’re out to get me. Either one is obviously a barrier to trust. The goal in pretty much all collaborative relationships is, how do you create that environment of trust together?

Ali: Communication is a big part of my job, and sharing information and context with each other is critical to my relationships with my tech leads at work, and for us to actually produce great products effectively. The key is not to make assumptions; the key is to ask whenever you don’t know. I prefer for folks to actually overcommunicate. I think that goes for both sides, not just product, but also the tech lead side. Another key part of the communication skill set that I think a lot of people miss is knowing when to communicate and when not to communicate. I won’t talk a lot about when not to; I think that is very specific to each of your organizations and stakeholders.

Then role clarity and accountability. I find that as a product manager, a lot of folks, even on the engineering team, do not fully grasp my role. A big part of this is because product managers are different in many organizations. To be honest, we have to have a real “come to Jesus” moment, as I call it: the truth about product is that it’s different in most organizations, depending on their needs. With that said, it’s so important to ask questions; don’t make assumptions about what your product manager thinks their job is, and the same goes for the tech lead. Product managers who are far more technical will sometimes assume some sort of tech lead position. You’re, “No, that’s not your job,” so it’s about holding yourself and each other accountable.

Let’s just go ahead and define the roles, I mentioned a little bit about it. On the tech lead side, point blank, engineering owns the how. I really am big on respecting this. When I go into a meeting with a tech lead, I’m not going to say, “This is how we should build it,” that’s technically not my job. I’m very clear about that, I appreciate that because I did not have an engineering interview when I joined the company; I had a product interview. On the flip side, on the product side, we focus on the what and the why and part of the why is the who. That’s a big part of our job. To be honest in practice – I put it up there – but it is important to know there’s a lot of gray in between these two. That doesn’t mean that I own it, and only I can contribute to it; I think we both have some room for collaboration on both of these responsibilities.

Barmash: Yes, in practice, there are a lot of gray areas because once again, this is a partnership, so things do get blurred. A lot of this talk is trying to explore some of those blurs and how to think about different things. Let’s get a little bit deeper. Technical leads, as Khadija mentioned, they own the how. What does that mean? First of all, fundamentally, it’s about creating a technical strategy as well as delivery strategy around either the business goal or, in this case, the product goal. Tech leads are typically supposed to be focused both on the immediate delivery, “How do I get this feature out the door?”, but also thinking a little bit more medium, as well as long-term, about the health of your technological system or architecture.

You should also obviously be rigorous about engineering practices: “Do I have the right deployment, do I have the right build, are the code reviews being done?” You also care about non-functional requirements, things like performance, scalability, security. Product managers will obviously come in with a point of view on those, but a lot of times, you will be the person who brings up the issues there. In general, I very much ask my tech leads and myself to come to discussions, for example road mapping, with a point of view.

Product’s job is fundamentally to come up with a bunch of features that will bring value to customers. Our job is to think about what are the needs of the technology systems, so that we’re delivering now, as well as in the future, with a reasonable and hopefully, increasingly high velocity. Then, of course, tech leads have other jobs that start crossing over to people management, such as coach and mentor engineers on the team, even if you’re not explicitly a people manager.

Ali: In terms of product management, I get this question sometimes from engineers, usually more junior engineers: Why do PMs exist? Why do we need them? To be honest, I can see why this question sometimes arises. It’s because a lot of times, engineers don’t actually have a lot of insight into what we do day to day, who do we interface with, and what are the key parts of our job that we may not be showcasing every day to them. It’s a huge part of our job to understand what to build. That’s so important to the business. It’s also really difficult to define, to be quite honest. There’s a lot of room for failure there.

Not only that, but in determining that are also facilitating stakeholders’ needs, designs, thoughts, as well as engineering. Just sitting in the center of all that and bringing it all together from a strategic perspective is a big part of my job. Keep in mind – I think this is something that people do not also pay attention to a lot – a PM usually is not the person building the product, sometimes can’t even build the product, may not even understand how it’s built. Then they’re also the person who’s not designing the product. Typically, a good PM can wireframe, but they’re definitely not going to be better than their UX counterpart.

On top of all of that, they are working with all these folks, managing them to help deliver this product, but none of them actually report to them. Imagine, you’re working with all these people, and you’re, “We’ve got to get this done. This is the strategy. This is what’s key for the business,” but you really don’t have any authority over anyone. Going back to Roi’s [Ben-Yehuda] presentation, which was earlier if some of you were here, a big part of our job is influence for the benefit of the business and our customers.

Barmash: The fifth major aspect of great collaboration is negotiation and conflict resolution. We’re equal partners; neither one of us reports to another person. We must discuss disagreements and come to some compromise for the best thing to do for the business. This is a great opportunity for win-win. The way that I think of it is, the roles are designed to be in conflict. One role wants “More, more features.” The other one is, “I really care about architecture.” Clearly, there’s a very fundamental tension here. We need to come together and figure out the best solutions.

The reason why I think this ultimately works, and why a lot of organizations have adopted the product manager and tech lead dichotomy, is that it gets the best out of teams. If one person has one perspective, and another person comes with a different perspective, and we figure out what is the most valuable thing that we can possibly ship in a reasonable amount of time, that’s ultimately the way we deliver the most value.

Here, I wanted to remind you that the Venn diagram of the two is, yes, there are a bunch of product concerns, there are a bunch of engineering concerns, but fundamentally, we both have a very common goal, which is deliver value to the customers or deliver against the mission of the team. Therefore, we have a lot of common ground to build on. Once again, the five major keys of collaboration: empathy, building trust, communication, role clarity and accountability, as well as negotiation and conflict resolution.

PM and TL Frustration Examples

Ali: Now, we want to look at some real-world scenarios that happen all the time between product and engineering, in terms of frustrations, some of which I’ve actually experienced in my career. One is architecture. An engineer wants to spend weeks and weeks thinking through the architecture, or would rather build it in a much more scalable way, when we haven’t even effectively tested the feature. How would you deal with this?

Barmash: Well, Khadija [Ali], that’s because we do have to think about things. We want to design the product in a way that we’re not just delivering something right now, but we’ll be able to continue delivering for you when you come with additional features. Also, I know that when you design this feature, by the time you came up with the product spec, you spent weeks, maybe months, talking to different stakeholders, thinking about the different interplay there. Hopefully, I was there with you in some of that journey. You actually spend a lot of time. Now, we’re trying to spend the time on our side, to think through the problem from a technical perspective so we can come up with something that’s pretty great for you.

The engineering frustration is when a PM asks for a large feature, and it’s so large. You’re, “Ok, this is going to be many months of work, why is she asking for that? There are clearly ways that we could potentially phase it; it doesn’t seem like she thought through how we might phase it. It seems like we might be able to run some experiments to de-risk some stuff. What gives?”

Ali: Actually, I’m so guilty of this, just to put it out there. I’ll tell you why; I have a very good reason. Earlier in my career, I used to come in with very scoped features. What I realized is that my engineering partner then didn’t have a vision of what I actually wanted end to end, from an experiential perspective. This is a huge mistake that PMs make. I want to present you with the big picture and then have us work together to phase it out. It can only be done effectively that way. PMs come in with features that are super MVP, and I know there was a world where that was super celebrated, but I think the key is, “Let’s partner up to figure out what those phases are and how we actually develop those experiments.” Because then, we’re setting ourselves up not just for the product’s success, but also for how we actually intend to build it. That doesn’t necessarily mean that everything will be done from an architectural perspective. If we decide to hack, it’s a hack, but we’re on the same page and we know why.

I actually think this is something that should be encouraged. As tech leads, the one thing you need to keep in mind is: don’t make assumptions and say, “You just came in with a five-month-long feature. Is she serious?” No, actually, maybe you should ask, and say, “You do realize this is five months long? Let’s talk about how we can break this out.” The other thing is that they may not actually know how much it costs, so it’s fair for them to come in with a big feature and for you to cost it. Again, that’s responsibility or accountability, role clarity, and communication, all wrapped in one here.

Barmash: Who here experienced this frustration? Quite a lot. All right, PM frustration. Go ahead.

Ali: I get this all the time: “I want a perfect spec. I cannot start building this until you have thought through every use case, every edge case, every little thing you want.” It’s just unrealistic. I get it; you’re working with a computer, and a computer is really dumb. Essentially, in order to do your job well, you don’t want to make any assumptions. In some instances, it makes a lot of sense. This is one thing I’m really frustrated with, but I’d love to hear, Jean [Barmash], how you would go about it.

Barmash: From my perspective, an engineer on the team should, first of all, have some overall business context about the problem they’re solving. Certainly, they need to understand the product, so it’s pretty fair that you give them 85% of the solution. There are some gaps that they should be able to close by themselves. Of course, they can always reach out, ask additional questions, and point things out during sprint planning: “Have you thought about this?” No person is perfect. In terms of having empathy, your product manager did not have infinite time to write every single ticket. They’re trying to move things along.

We take some compromises, because we could spend another day on cleaning up the code and making it super elegant, same thing from the product perspective. We just need to work together to figure out what’s the best thing. Engineering frustration is when the PM does not act as the CEO of product. There’s a whole thing in the industry that the PM is the CEO, and why aren’t they just calling all the shots when it comes to product features?

Ali: I hate this. I literally think this is the biggest mistake when they coined this term as the CEO of the product, because I think a lot of people first off – actually the majority of people – do not know what it’s like to be a CEO, so you can only assume what they think of when they see this. It’s the person who comes into the door and goes, “I’m in charge now. This is what we’re going to build, no questions asked. Don’t answer back, build it now.” How many PMs do that? Very junior, hopefully. Nobody senior in your organization is doing this. But I think this was a very huge mistake.

I think we’re now figuring it out in the product industry that this is really not the way it should be. We are facilitators. We are the folks that yes, we do make the call, we take accountability when the product does not work out or is not a success, it’s on us. The business can point to me and say, “Khadija [Ali], what happened?” But at the same time, I want to make sure that I’m taking into consideration your thoughts as engineering leads, but also my stakeholders, and especially the real CEO. I always tell junior PMs, “Yes, you’re the CEO of the product, but there’s actually a real CEO here.” I think this is a really great one, I hear the frustration.

Barmash: Plus, I think the CEO obviously has an additional source of power, which is that everybody reports to them, whereas nobody reports to the product manager. It’s much more of an influencing position, through ideas.

Ali: I have had this happen where a tech lead is involving himself or herself in product work. I think product is really sexy for some people, because it looks like all I do is just say, “Yes, do that. Yes, do this. Call the shots.” It looks really glamorous if you really don’t know what’s happening. At the end of the day, I say, “I’m the janitor. I’m the person doing the QA when there is no QA. I’m the person who needs to facilitate everything and any gap that is happening in the product development process to get the job done.” That’s not really always glamorous, especially if you’re dealing with challenging stakeholders, which I’m sure we all have in our organization. How do you broker that deal? How do you negotiate? How do you manage these conflicts?

Barmash: I think this is definitely one of the gray areas because as an engineer, you’re hopefully thinking about the product. The product manager thinks about the product 80% to 90% of the time; you’re probably thinking about different product features and understanding user frustration 10% or 15% of your time, but obviously, you do have ideas. A good product manager should be humble enough to realize that great ideas come from anywhere.

Sure, maybe you shouldn’t write a 10-page spec for a new product feature unless you have specifically discussed it with your product manager, but you can certainly come up and ask, “Have you thought about it this way? What about this approach? Did you realize that, from what I heard, there seems to be a problem with customers in this area? Would this solve it?” So approach it from a very humble perspective, and that’s where some of the influencing and negotiation stuff comes in.

Ali: In some instances, there really isn’t a PM on your team. If you need to fill this gap, I totally get that too, and it happens. I just want to highlight that, that’s an important note.

Barmash: Another frustration is when PMs don’t include engineering in the planning process. Anybody here face that? Ok, quite a lot.

Ali: Yes, this is a common mistake. I think, to be honest, it’s so important to be empathetic. I know, as a PM, I’m asking for empathy on this one. The reason why is, I think, as a product manager, you make a lot of assumptions. One is, “My engineering team doesn’t really have time to be involved in planning. They’re so busy trying to get this done.” The other one is, “I don’t think my engineers are going to be so excited to be sitting in on user studies or design prototype.” There’s a lot of assuming.

I think when you see something like this, it’s about communication. “Khadija [Ali], I would like for us to be involved in planning a lot earlier.” To be honest, this is a real story between Jean [Barmash] and I. I think being heads down so much and just in that rat race of going at full speed, especially if you’re working at younger companies, you sometimes forget, “Oh, yes, I should include.” Jean actually reached out, and was, “I’d love the team to be more involved in planning up front.” I’m, “Yes, it makes total sense.”

Barmash: As an engineering leader, this is also an opportunity for you to set expectations with your team. For example, if you’re working on a product that gets shipped to users, then certainly understanding your users is hugely important for engineers. You can set the expectation that you are a product engineer, you need to at least know something about the users. Yes, you’re not going to go to every user feedback session, but maybe you’ll read the summary once every few weeks, and you should attend once every month or once every few months.

Ali: This is a big one. You come to your tech lead with a feature and they just say, “No, it can’t be done.” It happens all the time. I love it, it’s like a showdown, it can’t be done. I’m, “Ok. Do you have maybe other options, or can you explain some tradeoffs?” Then they go, “I didn’t expect you to ask that.” Jean [Barmash], you’ve been in this situation with a PM. How do you actually help to alleviate this frustration?

Barmash: Once again, this is your job as a tech lead. If you’re not offering options, then what are you doing there? If you’re there to just say no, that’s not really collaboration. I think this is an opportunity to understand where the sources of value are. What is the most valuable thing that we could be doing? Because maybe you can actually find the things that are relatively cheap to do and relatively fast to do that are the most valuable, so until you have that engagement and really understand. Thinking about options, and being able to communicate them effectively is very much one of the core aspects of the job of a tech lead. I guess another frustration is when a PM does not prioritize technical debt.

Ali: How am I supposed to know? Literally, how am I supposed to know?

Barmash: This is an opportunity for you to come in with a point of view. The product manager is thinking about what is the value on the business and customer side. Your job is to come in with a list, as well as explain what is the return on investment of the different initiatives, but have a very real conversation. Only if you put everything on the same chopping block can you then trade off technical improvements against product features. I know, at the end of the day, it’s always difficult to do something that’s maybe not clearly functional. Yet again, part of your job is to get better at explaining the return on investment.

For example, “We’re building this humongous new product, but first, we need to spend two months refactoring, and here is why, because we created some model that no longer makes sense in the new product that we’re trying to build. We can’t avoid having to basically remove it.” Everybody understands that feature exists. Everybody understands that this feature will not exist in the new product. Obviously, some work needs to be done in order to address that delta.

Ali: I think from a product manager perspective, we’ll return the favor of you actually bringing this up and making sure that it gets placement on the roadmap, and highlighting the risks of not doing it now versus later, by going in and actually defending this item on the roadmap with business stakeholders. If you can’t explain this to a PM, imagine explaining it to someone in marketing. They’re, “This doesn’t have business value right now. This is not really addressing a need. How is this serving customers?” There’s always this suspicious, “Are you guys just making stuff up?” It’s like, “No, the PM will actually be your partner and help to break that down, and really take a lot of the ambiguity around what this really means and how this will actually impact the business in the future.”

Negotiation & Conflict Resolution

Barmash: Let’s come back a little bit to negotiation and conflict resolution. We’ll basically go through a very quick scenario that I think is very common, which is product puts pressure on engineering to deliver something quickly. We’re going to do a little role play. As we do the role play, here are a few things for you guys to keep in mind.

Ali: This is the scene. I’m going to lay it out really quickly before we actually get into it. Imagine we both work at a SaaS company. I’m coming in with a request.

Barmash: Here, this is what’s happening, Khadija [Ali] is going to try to hold me accountable. I am going to try to understand a little bit more about the business. Then the product manager will explain a little bit some of the pressures that she’s facing from her side. I’m going to try to basically understand how much flexibility there is in what she is really asking me for. Then we go into solutioning mode, and hopefully come up with something useful.

Ali: Do you remember that feature I was talking to you about, the invite feature?

Barmash: Yes.

Ali: I need this in a two-week sprint.

Barmash: Yes, I talked to the engineers. We did some thinking, it’s about four weeks. Minimum of four weeks, unless we completely hack it in a super crazy way.

Ali: I’m having some difficulty with that estimate. Four weeks is a lot, that’s a whole month. To be quite honest, I just feel like maybe there’s no clarity around the business need and urgency around this feature.

Barmash: Yes, can you maybe tell me a little bit more about what are the drivers behind this feature? The way that I am understanding it, you want to invite additional users. There are internal users, external users, as well as groups of users, right?

Ali: You’re right, those are the three use cases I highlighted that are a need for us for this feature. To be honest, digging in a bit more, I can explain the drivers. We need 1,000 paying users. The urgency is really that we’re raising right now. We’re raising our D round. We need to paint this really perfect picture to investors around growth, obviously, most companies, meaning more than 1,000. For this instance, it’s 1,000.

I actually did some digging with the data. The past sales pitches that recently happened to our users – looked at about 1,000 of them – and realized that they all didn’t sign up because of not having this one feature. The ask is really to be able to share within the company between coworkers the dashboard feature. They can’t do that today. I think that makes a lot of sense, I wouldn’t pay for something if I couldn’t collaboratively work with you in the same company and give you some insight into my dashboard and work. Raising is happening in two weeks. That’s the other added pressure there.

Barmash: Sounds like you confirm that if we deliver this feature, that will move the needle?

Ali: Yes, definitely.

Barmash: Ok. Can you tell me more about what the most important things are here? For example, to give you a little bit of context, I think inviting internal and external users is actually going to be relatively easy, maybe one and a half weeks or so. The group users, that’s basically the majority of the four-week estimate, frankly.

Ali: Actually, to be honest, that’s a good point. I really don’t need group invite, that isn’t even part of the initial build. Really, again, going back to the drivers, it’s really being able to share internally with another coworker. I think just one is fine. Then they can share it with someone else, instead of doing a whole group invite together. We can leave that out of the initial build. To be honest, I don’t really need external invites either, just the internal invite feature.

Barmash: Once you do the internal, you can do external for almost no additional effort. If we’re going to do one, we might as well do the other.

Ali: Just a quick question, just to remind me, that’s not the build of the hack that you’re quoting. You’re thinking about building this in a scalable way?

Barmash: Yes. One and a half weeks is an actual build out. I think if we hacked it, it was about three days. However, we would still need to pay that one and a half weeks of full build out right afterwards because that’s something that we’re going to need to do to build the group feature.

Ali: Ok, I see. All right, this makes a lot of sense. Let’s just settle on the internal and external since they’re dependencies. I’ll go back to the business stakeholders, and just let them know. Obviously, we still need to account for bugs and things that we may not know that could come up. I’ll add some padding on to that as well.

Barmash: Ok, that sounds good. And scene. Those are some of the tensions that we observed. Once again, these are the keys to the collaborations that we have thought about: empathy, building trust, communication, role clarity. How do we make sure we understand what each other’s job is, and to make sure that we are both doing it in the appropriate way. Then of course, conflict resolution and negotiation.

Questions and Answers

Participant 1: My question is, how do you sell this relationship up the chain? I think a lot of us understand this intrinsically and how these roles work. How do you get the CEO to understand how these roles work?

Ali: That’s a really great question. I think I look to the CTO and the chief product officer to help me sell it. I think the thing is with CEOs who may not really understand this relationship – which to be honest, in New York I have encountered that – is that they need to see the proof in the pudding. You can’t go pitch something and they don’t actually see the result of it. I think the idea is having that open line of communication. I think definitely making sure you constantly say, when you have successes, “This is because I have a great tech lead, I have a great relationship, a great collaboration.”

Another thing that Jean [Barmash] and I used to do is prepare before going into rooms. We had already spoken about the agenda before we got into the room. There were no surprises, we were aligned around what we were trying to achieve in that meeting. Even with the CEO, we would say, “This is a problem I think is going to happen. He’s going to ask about this. What are your thoughts on it?” We already had a mutual understanding of where each of us was coming from before we entered that room, which meant that they met a team that was very much already aligned. It wasn’t a matter of someone just spinning off their own thing on the side.

Barmash: I think the way that I would approach selling it to somebody is, first of all, to point out that finding a person who can both think deeply about the needs of the business and is technical enough is very difficult. Those people do exist, but they’re basically founders of companies a lot of the time. I think you can point out that, if you have a working relationship, then you do get the best solutions. Once again, most value for lowest cost and the fastest.

Participant 2: I work for a team that is in the middle layer building APIs for all of our applications. I don’t have a product owner; I have all the product owners. I have good working relationships with them, but is there any way to get out of the infighting between the product owners about priority of projects?

Ali: Sorry, I want to make sure I understood that question. Is the question, how do you define the priorities for the product managers or help them understand it?

Participant 2: How do I avoid being in the middle of that priority negotiation that they’re working on?

Barmash: Sounds like you have a lot of people coming at you with different priorities, right?

Participant 2: We do. They work very hard to follow priority plans, but in the middle things get changed. My team has to switch gears, etc.

Barmash: Given that your team is very technical, unless your tech lead is able to have those conversations, which sometimes works as well, then I would encourage you to think about bringing in a technical product manager, because that’s exactly the skillset that, I think, at least at a high level would solve that problem. Somebody who’s able to talk to the different product people, somebody potentially also reporting up the chain to the product organization. Therefore, they can get high-level priorities from their leadership as well as adjudicate some of those conflicts, but then can still translate those conversations into the more technical roadmap that your team requires.

Ali: Unfortunately, it sounds like you actually may have to play that role until you get that person. It’s unavoidable, to be quite honest.

Participant 3: I think my question is similar, but from a different perspective. The role play you did was for a small feature that involves one team. Oftentimes, from what I see, there is an initiative. For example, we want to globalize our platform. That means almost the entire engineering organization is making changes across many teams and working. How do you suggest different product owners across the organization work with each other to make an initiative like that happen? Who should work along with them to make that happen, because obviously, there are many leads of different teams?

Ali: In your organization, is there a CTO and a head of product?

Participant 3: It’s not a startup, it’s a very large enterprise. You want to globalize the entire platform, something like that. That was one of the things where I’ve seen a problem.

Ali: I think this is where a tech lead is incredibly powerful. Product managers, yes, do make this mistake, but a tech lead or anyone in that role can definitely put their hand up and say, “This actually affects a lot of different places. Can you please go check with your other PMs before you come back to me about this?” I’ve actually had that happen as well. They may just not realize that it’s making changes across teams. A lot of times, they will, especially if it’s something as big as an initiative like that. If you put the onus on your PM to go do that work, they will mobilize together. You just need to tell one and just say, “We can’t start on this until you’ve talked to everyone and really highlighted the dependencies.” This is not something we can assume a PM would know, it’s something you highlight to them. It could be dependencies within the tech stack, all sorts of things that they don’t have insight into. I think that’s a good piece of advice there.

Barmash: Another additional thing to think about here is, first of all, are there executive sponsors for this initiative? Because this clearly seems like a very large initiative. Then, what is the execution mode around that initiative? It might make sense to have a product manager who’s thinking across multiple products and defining what is the work, as well as technical program manager who’s going to be in charge more of an execution piece, and tracking things down across the larger organization. I think the question is, what is the right supporting structure to facilitate that initiative? Because, yes, it does require a lot of communication, a lot of decision making, a lot of prioritization, which is its own job or multiple roles in a larger organization.

Ali: If there are gaps, don’t assume that people understand those gaps. You will see it. Just say, “These are the gaps.” I think there’s so much power in that communication and even the person who’s the tech lead of your team. I think that’s key.

Participant 4: I think I’ve been in most of the scenarios you described. You touched a little bit on technical debt. I guess my question is around, how do you negotiate or communicate beforehand and plan for when you’re trying to build for an MVP? In that case, you are going with, not the perfect architecture, not the perfect solution, because you want to get something out fast and see if there’s value to what you’re building. Then let’s say that the MVP actually goes really well, so you decide to build for that solution. You have to re-platform the entire thing. That will basically duplicate the effort, and you can’t use what you’ve built so far. How does that communication go between the tech lead and the product person?

Ali: I think in the beginning, as a PM, it’s communicating the use cases that I want to build for the need and what I think are, from a hypothesis perspective, the impact on the business. Whenever I go into a situation with a tech lead, where he or she has expressed to me that it will be a hack, I’m ok with that. I just think that we’re going into this together. I’m going to communicate that to the business stakeholders and say, “We’re going to test this out. This is just a hack. There’s a much better and scalable way to build this,” and probably Jean [Barmash] can speak to it more, “but by the way, I think right now, it’s probably not worth building the scalable version, because it’s just too costly, and we really need to test it.”

Also, let them know early on that if this actually works, we’ll have to revisit it. I think that’s the mistake that a lot of people make, that communication is not happening very early on in the beginning of a test, or a hack, or MVP. It’s like, “This is an MVP, let’s build it. Ok, let’s test it. Oh, damn, it worked. Now, we got to build on it.” But nobody was involved in actually briefing everybody else involved around exactly what decisions we’re making, and the result of that, and the impact in the future roadmaps. I think that’s when you have set yourself up for success.

Barmash: To be fair, this happens a lot. Sometimes it’s frankly unavoidable because you have an MVP that becomes a successful product, then you have to start moving fast. I think if you do communicate clearly, to Khadija’s [Ali] point, then at least you can set the expectation that, “Two months from now, I’m going to come to you. I’m going to ask for a bunch of time to catch up,” and you as a technical lead should still come up with some kind of architecture and technical vision that you want to start pursuing, because even if you don’t have time to actually execute on it fully, having a vision will allow you to start aligning around where you would like to be going. Definitely challenging situations. I recognize it is very rare – if you fail, nobody cares, you throw it out, you move on. If you succeed, that’s when it becomes more challenging. But at least at a minimum, you can say, “Remember, I warned you. We’re going to pay this down and re-platform before we move forward.”

Ali: Another thing, too, to build a case even further, and I think this is important to know, is the business might really just not have time for you to build it the right way. It’s the real world, businesses have to make money. I think sometimes in product and engineering, we forget that piece unless you’re really involved in the revenue piece, which happens sometimes for product folks. But I think that at the end of the day, there is that pressure of getting users, telling the story to the investors or literally just hitting your bottom line that month. I think you have to take into account all of the pieces of data that influence your decision and then try to build something in a scalable way, especially if an MVP is a success and compromise. Part of our job is everyday decisions that we have to make a compromise on together.

Reactive Foundation Launched Under the Linux Foundation

MMS Founder

Article originally posted on InfoQ. Visit InfoQ

On September 10, the Linux Foundation announced the launch of the Reactive Foundation, a community of leaders established to accelerate technologies for building the next generation of networked applications. The foundation is made up of Alibaba, Facebook, Lightbend, Netifi and Pivotal as initial members, and includes the successful open source Reactive Streams and RSocket specifications, along with programming language implementations.

Reactive programming uses a message-driven approach to achieve the resiliency, scalability and responsiveness that is required for today’s networked cloud-native applications, independent of their underlying infrastructure. The Reactive Foundation establishes a formal open governance model and neutral ecosystem for supporting open source reactive programming projects.

“With the rise of cloud-native computing and modern application development practices, reactive programming addresses challenges with message streams and will be critical to adoption,” said Michael Dolan, VP of Strategic Programs at the Linux Foundation. “With the Reactive Foundation, the industry now has a neutral home for supporting the open source projects enabling reactive programming.”

Arsalan Farooq, CEO of Netifi, believes the Reactive Foundation will help achieve ambitious goals for RSocket. “We hope to see the modern network protocol RSocket replace HTTP as the lingua franca of microservices and distributed systems.”

Reactive systems have become increasingly common since Lightbend published The Reactive Manifesto in 2014 and created the first JVM version of the open source Reactive Streams in 2015. Reactive Streams is an initiative to provide a standard for asynchronous stream processing with non-blocking back pressure. Reactive Streams is a set of four interfaces (publisher, subscriber, subscription, and processor), a specification for their interactions, and a technology compatibility kit (TCK) to aid and verify implementations. Crucially, it provides the assurance that connecting publishers, processors, and subscribers, no matter who implemented them, will provide the flow control needed.
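To make the flow-control idea concrete, here is a minimal sketch of demand-driven delivery, written in Python purely for illustration. It is not the Reactive Streams API: the class and method names are assumptions that loosely mirror three of the four interfaces (the processor is omitted), and everything runs synchronously. The point it shows is that the subscriber, not the publisher, decides how many items may flow.

```python
from collections import deque


class Subscription:
    """Link between one publisher and one subscriber; it carries demand."""

    def __init__(self, items, subscriber):
        self._pending = deque(items)
        self._subscriber = subscriber
        self._cancelled = False
        self._done = False

    def request(self, n):
        # Deliver at most n items: the subscriber controls the pace.
        while n > 0 and self._pending and not self._cancelled:
            n -= 1
            self._subscriber.on_next(self._pending.popleft())
        # Simplification: completion is only signalled from inside request().
        if not self._pending and not self._cancelled and not self._done:
            self._done = True
            self._subscriber.on_complete()

    def cancel(self):
        self._cancelled = True


class ListPublisher:
    """Hypothetical publisher that emits a fixed list of items."""

    def __init__(self, items):
        self._items = list(items)

    def subscribe(self, subscriber):
        subscriber.on_subscribe(Subscription(self._items, subscriber))


class CollectingSubscriber:
    """Hypothetical subscriber that stores whatever it is sent."""

    def __init__(self):
        self.received = []
        self.completed = False
        self.subscription = None

    def on_subscribe(self, subscription):
        self.subscription = subscription  # nothing flows until request()

    def on_next(self, item):
        self.received.append(item)

    def on_complete(self):
        self.completed = True


subscriber = CollectingSubscriber()
ListPublisher(range(5)).subscribe(subscriber)
subscriber.subscription.request(2)   # ask for two items only
first_batch = list(subscriber.received)
subscriber.subscription.request(10)  # ask for more than remain
```

After the first `request(2)` only two items have been delivered, however many the publisher holds; the rest arrive once more demand is signalled. A real implementation would also handle errors, asynchronous delivery, and re-entrant `request` calls.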

RSocket is an open source protocol that builds upon reactive streams to provide application flow control over the network to prevent outages and increase resiliency of applications. It is designed to support reactive programming and today’s modern microservices-based and cloud-native applications as a high-performance replacement of traditional HTTP.

RSocket allows the use of a single connection, through which messages are passed as streams of data. It enables long-lived streams across different transport connections, which is particularly useful for mobile to server communication where network connections drop, switch, and reconnect frequently.

During a joint presentation at QCon London 2019, Robert Roeser, CIO and Co-Founder at Netifi, Ondrej Lehecka, software engineer at Facebook, and Andy Shi, developer advocate at Alibaba, showed how RSocket can be used to solve real-world architectural challenges.

Roeser described how, when he was working at Netflix, they needed a protocol to make it easier to build distributed systems, and for applications to communicate in a consistent way across a network. The result was RSocket, which provides the communication model, a network protocol, and flow control.

Regarding the Reactive Foundation and the inclusion of RSocket, Shi said, “RSocket is designed to shine in the era of microservice and IoT devices. We believe the projects built on top of RSocket protocol and reactive streams in general will disrupt the landscape of microservices architecture. The Reactive Foundation is the hub of these exciting projects.”

For additional information, InfoQ has several years of news, presentations and articles about Reactive Streams and RSocket.

Presentation: Building Resilient Serverless Systems

MMS Founder

Article originally posted on InfoQ. Visit InfoQ


Chapin: My name is John Chapin, I’m a partner at a very small consultancy called Symphonia. We do serverless and cloud architecture consulting. We’re based here in New York. If you want to talk more about that, come find me after. I’m happy to chat more about that, but what we’re here for today is to talk about building resilient serverless systems. Here’s what we’re going to cover. We’re going to review a little bit about what we, as Symphonia, think serverless is. We’re going to talk about resiliency, both in terms of applications and in terms of the vendor systems that we rely on, we’re going to give a brief live demo, so I’ll be tempting the demo gods today, and then we’ll hopefully have some time for some discussion and Q&A at the end. If for whatever reason we don’t have time for that, I will be out in the hallway right after the session happy to answer all of your questions.

What is Serverless?

What is serverless? The short answer is go read the little report that we put together. We did this with O’Reilly. There’s a link down at the bottom here. This link is in the slide, you can download the free PDF. We think serverless is a combination of functions as a service, so things that you all have heard of, like AWS Lambda, Auth0 Webtask, Azure Functions, Google Cloud Functions, etc., and also backends as a service. These are things like Auth0 for authentication, DynamoDB, Firebase, Parse – it no longer exists, but that would have been one of these – S3 even, SQS, SNS, Kinesis, stuff like this. Serverless is a combination of functions as a service, these little snippets of business logic that you upload to a platform, and backends as a service. These are more complete capabilities that you make use of.

What are some attributes of serverless that are common across both of these things? We cover this in detail in this report, so I encourage you to check that out. There’s no managing of hosts or processes, these services are self-auto-scaling and provisioning. Their costs are based on precise usage, and the really important thing that we’re going to cover today related to that is that when you’re not using them, the cost is zero, or very close to zero. What we’ll also cover today is that they have this idea of implicit high availability, and I’ll talk more about what that means a little later on.
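The "cost based on precise usage" point can be illustrated with a little arithmetic. The sketch below assumes a bill made of a per-request fee plus a compute fee proportional to duration and memory; the function name and both prices are placeholders chosen for illustration, not any vendor's real rates.

```python
def monthly_function_cost(invocations, avg_seconds, memory_gb,
                          price_per_million_requests, price_per_gb_second):
    """Estimate a usage-based bill: a per-request fee plus a compute fee."""
    request_fee = invocations / 1_000_000 * price_per_million_requests
    compute_fee = invocations * avg_seconds * memory_gb * price_per_gb_second
    return request_fee + compute_fee


# Placeholder prices for illustration only, not any vendor's real rates.
PRICES = dict(price_per_million_requests=0.25, price_per_gb_second=0.00002)

idle_month = monthly_function_cost(0, 0.2, 0.5, **PRICES)
busy_month = monthly_function_cost(1_000_000, 0.2, 0.5, **PRICES)
```

With no invocations every term is multiplied by zero, so the idle month costs nothing, which is the property the talk highlights. Storage and other backend services are billed differently and are left out of this sketch.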

What are some of the benefits of serverless? You’ve heard this from a bunch of different places in the last couple of years. These are the benefits of the cloud, but just taken one more step. Further reduced total cost of ownership, much more flexibility in scaling, and much shorter lead time for rolling out new things, much lower cost of experimentation. If I want to put up a new serverless application with some Lambdas and a DynamoDB table, I can get that up and running in a couple of hours. If I don’t like it, I can tear it down in two minutes, and I haven’t invested all of this money in this persistent infrastructure.

With all of these benefits though, we give up some things as well. One thing we call out in this report, and this is the thing we’re going to talk about a lot today, is this idea of loss of control. The more we use these vendor-managed services, the more we’re giving control over to vendors like Amazon, like Microsoft, like Google, in return for them running these services on our behalf. We’re sort of saying, “Ok, Amazon, I think you’re going to do a much better job of running a functions as a service platform than John Chapin will.”

With that loss of control though, we’re also giving up our ability to configure these things at a really granular level so however that Lambda service behaves, we have pretty limited number of configuration options for how it works for us. We have far fewer opportunities for optimization, especially optimization that might be specifically related to our business case or our use case. Again, with Lambda, we have one knob to turn. We get to adjust the memory and affect the performance, and we get to optimize how we structure our code and things like that, but we don’t get to go in there and tune Linux kernel parameters or things like that.

The last part of this loss of control is this idea of hands-off issue resolution. Who here remembers when S3 went down a few years ago? Who here had a direct part in fixing that? One person here…lying. When S3 goes down, this guy right here can fix it. S3 went down, we all built systems that relied on that or relied on systems that relied on S3, and when it was down, we couldn’t do anything to bring it back up other than ring the pagers for our account managers and whatnot. We couldn’t proactively take any action to bring that back. We had to trust the vendor to get that back up and running. We lose some control when we’re using these vendor-managed services, these serverless services, or as some people like to call them now, servicefull services.


With all that in mind – we’ve established what is serverless, what are some of the benefits and tradeoffs – let’s talk about resiliency. This quote, “Failures are a given and everything will eventually fail over time.” This is a quote from Werner Vogels, who’s the CTO of Amazon, and he has this great blog post, “10 Lessons from 10 Years of AWS.” This link is on the slides as well. I highly encourage you to go read this. He’s basically talking about 10 years of running AWS at ever-increasing scale, and what’s happening as they’ve been doing that.

What does he say on embracing failure? He says systems will fail, and at scale, systems will fail a lot. What they do at AWS, is embrace failure as a natural occurrence. We know things are going to fail, so we’re just going to accept that fact. They try to architect their systems to limit the blast radius of failures, and we’ll see what some of those blast radiuses look like. They try as much as possible to keep operating, and they try to recover quickly through automation.

I throw this up here. A lot of people use this as, “We’re doing something really bad, and this is not where we want to be.” This is actually the current state of the cloud right now. This should say, “This is normal.” Something is always failing, and at scale, some things are always failing a lot. I used to joke that this was a webcam view of us-east-1 but nobody laughed. Thank you.

We talked about what is serverless, we talked about resiliency, what Werner Vogels had to say about resiliency, so what do failures in serverless land look like? Serverless or servicefull architectures are all about using these vendor-managed services. In doing that, we have two broad classes of failures. First, we have the application-level failures. Things like, “Ok, we shipped some bad code,” or, “We misconfigured our cloud infrastructure,” or, “We did something to cause our application to fail in some way.” These problems were caused by us, and they can also be resolved by us. We can fix our code, we can redo our CloudFormation or Terraform template, or whatever the case may be.

There’s that class of failures, and then there are the service failures. These are the things like S3 going down, for example. From our customer’s perspective, those failures are still our problem. If our customers can’t get to our application, or our website, or whatever, they still blame us. The resolution, like we talked about in the first section, is not within our grasp. We have to rely on our vendor to resolve that, so, again, application failures and service failures.

What can we do? Is there anything we actually can do when those vendor-managed services fail? This presentation is really about the answer to that question, being yes. What we’re going to try to do is mitigate these large-scale vendor failures through architecture. We don’t have any control of resolving the acute vendor failures. Ok, S3 goes down, none of us except for that guy in the third row can go in and fix S3 specifically, so we have to plan for failure. We have to architect and build our applications to be resilient.

The way we’re going to do that at a large-scale is we’re actually going to take advantage of the vendor-designed isolation mechanisms. Werner Vogels was saying, “You’ve got to limit the blast radius for failures.” They document those blast radiuses, and they have isolation mechanisms in place – Amazon, Microsoft, Google, all of them. We’re going to take advantage of those isolation mechanisms – in this case, it’ll be AWS regions – and we’re going to take advantage of the vendor services that are architected and built specifically to work across those regions to help us keep our application up, even when one of those regions goes down.
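The core idea, keeping the application available when one region fails, can be sketched as a simple ordered failover. This is an illustration only: the talk relies on vendor services to do the cross-region work, whereas this sketch does it in the caller, and `RegionUnavailable`, `call_with_failover`, and the simulated regional call are all hypothetical names.

```python
class RegionUnavailable(Exception):
    """Raised by a (hypothetical) regional call when that region is down."""


def call_with_failover(regions, call):
    """Try each region in order and return the first successful result."""
    failures = {}
    for region in regions:
        try:
            return region, call(region)
        except RegionUnavailable as exc:
            failures[region] = exc  # remember why, then try the next region
    raise RuntimeError(f"all regions failed: {sorted(failures)}")


# Simulated outage: us-east-1 is down, us-west-2 is healthy.
DOWN = {"us-east-1"}


def fake_regional_call(region):
    if region in DOWN:
        raise RegionUnavailable(region)
    return f"hello from {region}"


served_by, response = call_with_failover(
    ["us-east-1", "us-west-2"], fake_regional_call
)
```

The request is served by the second region even though the first is unavailable. A production setup would normally push this decision into DNS or a global routing layer and would also need the data to be present in both regions.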

This last bullet point, if I only had one thing to say, and this was an AWS-specific presentation – it’s sort of an AWS-specific presentation – it would be, go read the Well-Architected Framework reliability pillar. Amazon, and also Microsoft, and Google, they document how to architect applications to take advantage of their isolation mechanisms. They give you the answers right here, so I highly encourage you to go check that out. If you’re on one of the other clouds, seek out the same information there.

Let’s talk really quickly about AWS isolation mechanisms. I’m going over this because we’re going to see it later in the architecture diagrams and in the demo. These big circles are AWS regions, and within each geographic region is a number of, you can call them logical data centers. We have these availability zones, is what AWS calls them. Each availability zone is its own isolated little unit of maybe power, and network, and other things that a data center needs, other things that servers need, and the idea is you architect your application. If you’re running something like on EC2, for example, you might have several EC2 virtual instances in us-east-1a, you might have several in us-east-1b, and the idea is you would have a load balancer in front of those. If one availability zone goes down, you’re still up and running in another availability zone.

This is the classic regional high availability that AWS gives you. This is services running across multiple availability zones, and one quick way you can tell that you need to do this explicitly is if the service you’re using addresses resources, and it’s addressing them down to an availability zone level. When you spin up an EC2 server, it says, “What availability zone do you want to put this in?” You know that’s your clue right off the bat, “Ok, if I want to be resilient to an availability zone failure, I need to have more than one of these in different zones,” versus services like Lambda, services like Dynamo that you address on a regional level. We’ll talk about how we use those, but I just wanted to go over this real quick.

Serverless resiliency on AWS – we talked about regional high-availability, these are services running across multiple AZs in one region. EC2, that’s our problem, we have to architect our applications to do that. With serverless, again, Lambda, Dynamo, S3, SNS, SQS, AWS handles that for us. We say, “I just want Lambda in us-east-1, in one region,” or, “I want Lambda in eu-west.” AWS is handling what happens if an availability zone within that region goes down. We don’t have to worry about that.

To take another step up, global high-availability, these are services running across multiple regions, and we can actually architect our application at this level. We can architect our application to take advantage of multiple regions and have global high-availability. If an entire region of us-east-1 goes down, we can stay up and running. I mentioned it before, but that serverless cost model is one of the huge advantages. There are some other advantages to that serverless model when we’re trying to build these global applications. This idea of event-driven serverless systems, things like Lambda, things like API Gateway, these are event-driven by nature.

When you combine that with this idea of externalized state in something like DynamoDB – and I think several of the other serverless presentations at QCon this year are talking about this idea of event-driven serverless systems or serverless microservices. What that means is we have little or no data in flight when a failure does occur, and that data’s persisted to highly reliable stores. If we need to switch from one region to another, it’s really seamless. We don’t have anything in flight that we need to figure out what to do with. We can just switch from one to the other.

Another property of these serverless systems, and it’s surprising that it comes out of this, is that serverless systems tend to be continuously deployed because they’re so damn hard to manage otherwise. At least that’s something that we see. When you have that continuous deployment, you’re specifying all of your infrastructure in code or configuration. You don’t have any of this persistent infrastructure to rehydrate. If you need to have that running in many regions, and we’ll see this in the demo, it’s very straightforward to do, and oftentimes, that just comes naturally.

We talked about resiliency, and what we actually also get out of this style of architecture is not just resiliency, we actually get, again, this distributed global application that’s better for our users and may not actually cost a lot more than running in a single region. We’ll see this in the demo, and we’ll see this in the architecture diagram, but that regional infrastructure is closer to your regional user. If we have our application deployed in us-east-1 and in eu-west-2, we can actually route users in those regions to those two instances of the application.

Because serverless is pay per request, those total costs are similar. If half of our users are going to us-east-1 and half are going to eu-west-2, our total bill is about the same as if all of them were coming to us-east-1 in that first region. Infrastructure-as-code minimizes the incremental work in deploying to that new region, so if we decide, “We want to spin up in Asia-Pacific,” we use that same configuration template, send it to Asia-Pacific, and we have the capability to do automated multi-region deployments. I’ve got a link at the end of this presentation for some work that Symphonia did providing an example of how to do that, but that makes it easy to keep this multi-region infrastructure up-to-date. The premise of this talk really is that the nature of serverless systems makes it possible to architect for resiliency to vendor failures. Not easy, but possible.
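The pay-per-request argument above can be checked with a little arithmetic. This is a minimal sketch, assuming a single flat price per million requests that is identical in every region; the price and the request count are made-up numbers, and it ignores anything that is not billed per request (for example, cross-region data replication).

```python
# Hypothetical flat price per one million requests; real prices vary by service and region.
PRICE_PER_MILLION_REQUESTS = 0.20


def request_cost(requests, price_per_million=PRICE_PER_MILLION_REQUESTS):
    """Cost of a purely pay-per-request service for a given request count."""
    return requests / 1_000_000 * price_per_million


total_requests = 10_000_000  # hypothetical monthly traffic

# All traffic handled in a single region.
single_region_cost = request_cost(total_requests)

# The same traffic split evenly between two regions.
two_region_cost = request_cost(total_requests // 2) + request_cost(total_requests // 2)

print(single_region_cost, two_region_cost)
```

Under these assumptions the two totals match, because the bill depends only on how many requests are served and not on how many regions serve them.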


Let’s talk about the demo. We’re going to build a global highly-available API, and I say we’re going to build it – I’m going to show you, and make available the source code for you so you can take this home and play with it yourself. The source code is there, and the slides will be distributed after. It’s a SAM (Serverless Application Model) template, some Lambda code, and what we’ll call a basic front-end written in Elm. What does the architecture look like, though?

Here’s what this application is going to look like from the architecture side. Imagine I’m actually standing all the way to the left with a web browser, or a phone, or something like that, and I’m using the front-end app, and it’s making a network request. The first thing it’s going to do is say, “I need to take this DNS name and translate that into an IP address. I want to hit” It sends that request, that request is handled by Route 53, this is AWS’s global DNS service. Route 53 looks at where I’m coming from on the network and returns an IP address that’s closest to me based on what’s available for this application. If I’m sitting here in New York, it’s going to give me an IP address in us-east-1 down in Virginia. If I’m sitting over in London, or in Europe somewhere, it’s going to give me an IP address that’s over in Europe, actually, that’s in the London region, eu-west-2.

Based on that, at that top level, traffic is going to get routed to one of two regions. The Lambda functions deployed in that region are going to handle that request. They’re going to talk to a couple of DynamoDB tables, and the other interesting thing on the back-end of this is what AWS calls a global DynamoDB table. We’ve essentially set up two mirror DynamoDB tables that both accept writes and then propagate those writes to their peers behind the scenes. A lot of people stop me and say, “John, you’re just describing some DNS tricks, and some AWS configuration magic.” Yes, that’s totally the point here. We have to architect in this way to survive these regional failures. The point is, we can do it.

I described the request flow here. What we’re actually going to do, going back to this architecture diagram, is see that users sitting in one of these two locations get routed appropriately, so that’s better for our users, that’s that good user experience we talked about. We’re also going to see, if we simulate a failure, that traffic can get rerouted as well. We might get lucky because we’re in us-east-1, but if we don’t get lucky, basically what I’m going to do is fail the health check that Route 53 uses to determine what regions it has available, and that’ll simulate a regional failure. Or, maybe this guy in the third row can help us out too, but you could bring down S3 on purpose this time.
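The routing behavior described here, closest region first and another region when the closest one fails its health check, can be modeled in a few lines. This is a toy sketch and not how Route 53 is implemented: `pick_region` is a hypothetical helper, and the latency figures are invented for illustration.

```python
def pick_region(latency_ms, healthy):
    """Return the healthy region with the lowest latency, or None if none are healthy."""
    candidates = [region for region in latency_ms if healthy.get(region, False)]
    if not candidates:
        return None
    return min(candidates, key=lambda region: latency_ms[region])


# Hypothetical latencies as seen by a client sitting in New York.
new_york = {"us-east-1": 15, "eu-west-2": 85}

# Both regions pass their health checks: the closest region wins.
print(pick_region(new_york, {"us-east-1": True, "eu-west-2": True}))

# us-east-1 fails its health check: traffic falls back to London.
print(pick_region(new_york, {"us-east-1": False, "eu-west-2": True}))
```

The key design point is that the health check, not the client, decides which regions are candidates, so failing the check is enough to move traffic.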

We’ve talked about simulating failure. Let’s jump right into the demo. I’m just going to run this little Elm front-end. Again, the instructions for doing this are in the source repo, so I encourage you to take that and experiment with it. The only thing you’ll have to change is the DNS setup to match whatever domain name you have available because you can’t use mine, it’s taken.

I’m going to go over to Chrome here, I’m going to bring up our little front-end. This is a chat application, what we’re seeing here is nothing yet, so let me put something in here, “Hello QCon.” Obviously, on the left side, we have our message, the source and read column. That source column is telling us that’s where the message came into the system. What we’re seeing here is this message was accepted and processed by a Lambda in us-east-1, and we’re actually using WebSockets. What this is also telling us is that that message was read from the system in us-east-1. That makes sense, pretty straightforward, but I just want to explain what you’re seeing here.

Next thing we want to do, let’s go ahead and switch our network location so that the back-end, at least, thinks that we’re coming from Europe. I’m going to just travel over to Denmark via VPN. I’ve refreshed the application. Our message is there, our message was persisted, but we’re now talking to an instance of this application running in eu-west-2. That information is being read from eu-west-2. That all makes sense. We can put another message into the system, “Hello, QCon from Denmark.” We can see that this message was accepted by our application in eu-west-2 and then read back out of there as well. This is pretty easy, pretty straightforward.

Let’s jump back to the United States. I’m just going to refresh here, and now you can see that we’re reading all of these messages out of us-east-1 again. I’m back in the U.S., this application is behaving just the way we’d expect, reading data from us-east-1, and I’ll show you where that data is living on the back-end a little later on in the demo.

Now, what we’re going to do is we’re going to simulate us-east-1 going down. On the network, I’m here in New York, I should be routed to us-east-1. If we take us-east-1 down, if anybody needs to turn off their PagerDuty or something, now’s the time. The way we’re going to do that is, we’re going to go to Route 53, I’m going to my Health checks, and you can see I’ve got two health checks here – the first one is us-east-1, the second one is eu-west-2. I found, for the purposes of this demo, the quickest way to fail a health check is actually not to change the code behind it, but is to do what’s called inverting it.

We basically are telling Route 53 to treat bad as good, and good as bad, and cats as dogs, and up and down, and all of that. I’ve now inverted that health check, and we’re going to wait for that to go from healthy to unhealthy. While we do that, I’m just going to point out the DynamoDB configuration. DynamoDB is AWS’s highly available, very performant key-value store. We have a messages table, and so I’m sitting here in us-east-1. I’m looking at the messages there. This is set up to receive messages in us-east-1 for traffic routed here. These messages get copied over to the same table in eu-west-2 in London. I can jump over there and see I have the same table in London with the same messages.
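The health check inversion used at the start of this step amounts to flipping the observed result before routing sees it. A minimal sketch of that logic, where `reported_healthy` is a hypothetical name used only for illustration:

```python
def reported_healthy(observed_healthy, inverted=False):
    """Status reported to routing: the observed result, flipped when inverted."""
    return observed_healthy != inverted


# The endpoint behind the check is actually fine...
endpoint_ok = True

# ...but with inversion switched on, routing is told it is unhealthy.
print(reported_healthy(endpoint_ok, inverted=True))
```

This is why inversion is a convenient way to rehearse a regional failure: nothing in the application or the check's code has to change.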

This is DynamoDB global tables, I encourage you to check it out. There are some limitations that we’ll talk about later, but it’s a really interesting way to have these global applications that share data across different instances in the deployed application. I’m going to go back and check Route 53, see if this thing is showing up as unhealthy yet. Route 53 is telling us that that health check is unhealthy. What we would expect here, if we refresh this page, is that we’ll be reading data from eu-west-2, from London, even though on the network, we’re still here in New York.

I’m going to try to refresh this. I’ve actually hit this before, this is a Chrome DNS issue, so I’m going to do an incognito window here, and you can see that this is now reading from eu-west-2. It doesn’t seem like much, but we failed a region there, and our application stayed up. I can still interact with this application, “Where’d us-east-1 go?” Our users are still able to use this application. People sitting in New York might get routed to Europe, but it’s still up. We’ve architected successfully around a regional AWS failure. It doesn’t seem like much on the screen here, but this is huge.

If you were architecting in this way when S3 went down however many years ago, your application would’ve stayed up and your users never would have noticed. They might have hit a little bit of extra latency, but that’s it. AWS and the cloud vendors give us the capability to do this, we just have to take advantage of it and do it. I’m going to reset our health check here to bring us back to the U.S., although we may not wait on that to actually happen. That is our demo of a globally available application on AWS.

Rough Edges

Let’s talk about what some of the rough edges of this are. Many of these rough edges are specific to AWS: global tables, WebSockets, and custom domains don’t work very well with CloudFormation. What that means is that with the infrastructure-as-code approach that I really want to use for deploying this really easily to many different regions, there are some rough edges, and some pieces of that are broken. I have to perform some manual actions to actually get this completely set up. These things will be fixed over time.

That third one, if you’re using global tables, they can’t have any data in them when you link them together, so that’s a big caveat. This Stack Sets thing at the bottom – Stack Sets are just a way on AWS to deploy to multiple regions at once easily. We can actually mitigate that just with our deployment pipeline. Some other rough edges – and these are just more architectural challenges, I don’t see a slide for this, unfortunately – but your application may have special considerations around whether you can accept data written in multiple regions or not. That’s an architectural challenge for you to overcome. Your users may need to have affinity to one region or another. You may not be able to actually move them around or accept their data in multiple places for compliance reasons or for other reasons. Some rough edges there and some architectural challenges, but what we showed is that it is definitely possible.

I’ve got some links here for multi-region deployment. There are some other ways to architect different pieces of an application, so there’s this great documentation on a new feature of Amazon’s CDN, CloudFront, called Origin Failover. That basically means I can set up a CDN and have a backing store for that CDN, maybe it’s S3, or a database, or something. If that backing store becomes unavailable, then CloudFront, the CDN, will fail over to another one.

Other CDNs, Fastly, and Cloudflare, and folks like that have had those features for a while, but it’s now also available on AWS. There’s Global Accelerator, which lets you do all of this but at a much more fundamental network level. I also have some AWS resources that I just want to call your attention to. I mentioned earlier that Well-Architected Framework that you should all read if you’re running applications, building applications on Amazon.

In James Hamilton’s talk from re:Invent 2016 called “The Amazon Global Network Overview,” he goes into deep technical detail about how they build out their global infrastructure, how they structure regions, how they structure availability zones. He talks about how generator cutover switches are poorly designed. It’s super deep, and it’s super interesting, and he’s incredibly enthusiastic when he’s talking about it. If you’re using Dynamo, I strongly recommend Rick Houlihan’s talk from last year at re:Invent, “Advanced Design Patterns for DynamoDB,” and he goes into some of the technical detail around global tables as well.

Then there’s a bunch of other prior art around building these global applications. We brought some of it together in this demo today. Then Symphonia, we have some resources out there as well, which I encourage you to check out. Feel free to email or hit us up on Twitter if you have any questions or just want to talk more. I would love for you to stay in touch again. I welcome questions by email, Twitter.

Questions & Answers

Participant 1: This seems like a nice, happy path story. Can you talk about the problems, the pitfalls, the edge cases which we’re not thinking about? This is a nice, obvious happy pathway to do things, and you could potentially have routing within your region as well for local-regional failures, and it goes all the way down. What are some of the problems? What are the things we are not seeing here?

Chapin: The question, to rephrase it and condense it a little bit is, “Boy, John [Chapin], you showed a nice happy path here, surely it’s not always like this. What can go wrong? Can you do this within a region?” What I would go back to on that second part of the question, “Can you do this within a region,” is if you’re doing that, you’re probably not using serverless components anyway, and so I’m going to just not cover that because we’re focused on serverless event-driven systems in this case.

I pointed out some of the minor rough edges somewhere in here. Those are the architectural challenges. If your application is not like this – when I say, “not like this,” I mean, if your application can’t operate like this – then obviously you need to make different choices. We’re relying heavily though on globally available, or what should be globally available services like Route 53. There have certainly been cases where Route 53 has had problems. That’s a big rough edge.

That being said, I would rather that Amazon, or Microsoft, or Google, or whoever runs a global DNS system than myself. Also, you could take Route 53 out of the equation here and use a third-party DNS provider too. You’re still susceptible to some of those vendor failures and some of those service failures, but just at a much, much higher level.

Participant 2: This kind of routing is great when your service is absolutely not available, but you hinted there, what if there’s a service failure, or you have a chain in tiers, and in one of the tiers there’s a service failure. What do you have to propagate that, show that failure all the way to the routing so that it can be routed to the other region, potentially?

Chapin: The question is, what if you have multiple tiers, and within those tiers maybe, can you redirect, or can you account for failures?

Participant 2: Yes. The failure is not an availability one, perhaps a functional one. You might want to route to the other region completely.

Chapin: The way we’re doing this is with a health check that is programmatically produced. If your failure is a functional one, reflect that in your health check. Your health check could encompass not only, “Do I exist at all,” so it’s either there or it’s not. It could also encompass, “Am I getting the right kind of data back that I expect?” Or, “Am I producing the results that are needed?” Or whatever the case may be.

You can roll all of that up into a health check. There’s a lot of danger there too: as with testing sometimes, you run the risk of re-implementing a lot of your system in the health check, but it’s certainly possible. A lot of this can be sliced up into tiers. Route 53, for example. This doesn’t all necessarily have to be just at the front-end of your application. This could be different tiers within it.
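A health check that covers functional failures as well as plain availability might be sketched as follows. This is an illustration only: the two checks are hypothetical stand-ins for “do I exist at all” and “am I getting the right kind of data back,” and the 200/503 status codes are an assumed convention for what the routing layer would poll.

```python
def health_check(checks):
    """Run every named check and return (status_code, per-check results)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises counts as a failure rather than crashing the endpoint.
            results[name] = False
    status_code = 200 if all(results.values()) else 503
    return status_code, results


# Hypothetical availability check.
def service_is_reachable():
    return True


# Hypothetical functional check: does a sample record have the fields we expect?
def sample_record_has_expected_fields():
    sample = {"message": "Hello QCon", "source": "us-east-1"}
    return set(sample) >= {"message", "source"}


status, details = health_check({
    "reachable": service_is_reachable,
    "data_shape": sample_record_has_expected_fields,
})
print(status, details)
```

Keeping each check small and named makes it easier to see which tier failed, and limits how much of the system ends up re-implemented inside the health check.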

Participant 3: You mentioned that the database has to be set up with no data. How would you go about implementing or onboarding, say, another region? Is that something you’ve done, or is that something that you’re just waiting on a solution for?

Chapin: DynamoDB global tables have to be empty when you want to link them together, make them globally available, basically. The question to me was, have we onboarded applications that already have data into an architecture like this? The answer is no. It’s so new that we’re not using it heavily yet, and I’m waiting for a solution. The solution to that would be a manual cutover; dump all the data, create a backup, and then very carefully move things over. It’d be a dance though. It’d be a pain.

Participant 4: On the rough edges, you mentioned that it’s not available in CloudFormation?

Chapin: Yes.

Participant 4: Just elaborate on that. Is that just set it up manually and stuff like that?

Chapin: Yes. The comment was, some of these services cannot be configured in CloudFormation. Just to review, CloudFormation is Amazon’s infrastructure-as-code service. I have a big YAML file or a JSON file, and it has all of my resources in there, a bunch of Lambda functions, some API gateways, maybe some DNS records. Some of these pieces that we showed in this architecture cannot be configured right now using that infrastructure-as-code service.

For some of these things, in particular, the custom domain name for the WebSockets, you have to either go into the AWS console and manually configure that, or you can do it through API calls, but again, you have to write the script to do it. Terraform may support that, I haven’t tried, actually. The other piece of that is the linkage between DynamoDB tables across regions, which is the other thing that you can’t do in CloudFormation, in that infrastructure-as-code tool.

Participant 5: With the database accepting writes on both sides, there could be data discrepancies, and maybe the two sites ending up in a split-brain situation, which could lead to data ambiguity and questions about how consistent it is. If there are more writes, there may be replication lag. How do we handle all these?

Chapin: I agree with all of those things. I didn’t hear a question in there, but the comment was basically, “You have a dual master situation, or writes going to two tables in different regions at the same time. What do you do about the data?” The answer is, architect and build your application to handle that possibility. You could use convergent data structures. For example, in this case, this being a chat application, it’s pretty straightforward just to have the messages be singular, and if we have duplication, we actually handle that on the client side, which if you dig into that app is actually what it does. You design your application with that in mind if you want this kind of system. No silver bullet, no magic there. You have to put in the work.
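The client-side handling of duplicates mentioned in this answer can be sketched as a merge that keeps one copy of each message. This is an illustration in Python, not the demo’s Elm code; the `id` and `timestamp` fields are assumptions about what a message record might carry.

```python
def merge_messages(*replicas):
    """Merge message lists from several replicas, keeping one copy per message ID."""
    by_id = {}
    for replica in replicas:
        for message in replica:
            # Messages are never edited after being written, so any copy
            # with the same ID is equivalent and the first one seen is kept.
            by_id.setdefault(message["id"], message)
    # Sort by timestamp, then ID, so every client shows the same order.
    return sorted(by_id.values(), key=lambda m: (m["timestamp"], m["id"]))


# Hypothetical views of the same table as read from each region.
us_east_1 = [
    {"id": "a1", "timestamp": 1, "text": "Hello QCon", "source": "us-east-1"},
    {"id": "b2", "timestamp": 2, "text": "Hello, QCon from Denmark", "source": "eu-west-2"},
]
eu_west_2 = [
    {"id": "b2", "timestamp": 2, "text": "Hello, QCon from Denmark", "source": "eu-west-2"},
    {"id": "c3", "timestamp": 3, "text": "Where'd us-east-1 go?", "source": "eu-west-2"},
]

merged = merge_messages(us_east_1, eu_west_2)
print([m["id"] for m in merged])
```

Because the merge does not depend on the order in which replicas are read, two clients that see the same set of messages converge on the same list, which is the property the answer is pointing at.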


Python 2 End of the Line Is January 1st 2020

MMS Founder

Article originally posted on InfoQ. Visit InfoQ

After spreading the news at conferences, on the Python announcement list, and on countless blog posts and books, the Python Software Foundation has finally taken the step to formally announce Python 2 will reach end of life (EOL) on January 1st, 2020.

The message the Python Foundation is trying to make loud and clear is developers should transition to Python 3 as soon as possible without waiting any longer:

We have decided that January 1, 2020, will be the day that we sunset Python 2. That means that we will not improve it anymore after that day, even if someone finds a security problem in it. You should upgrade to Python 3 as soon as you can.

Python 3 was released at the end of 2008, nineteen years after the inception of the language by its creator, Guido van Rossum.

From the very beginning, Python 3 was meant to break away from the past, as the only way to fix a number of flaws that affected Python 2 and bring the language evolution forward. Since Python 2 had accumulated multiple ways of accomplishing the same task, one of the guiding principles in the language redesign was making sure there was only one obvious way to do one single task.

While paving the way for the future evolution of the language free of legacy constraints, breaking backward compatibility also significantly slowed down Python 3 adoption. Among the reasons that led developers to not want to transition to Python 3 were Python 3’s sub-par performance vs. Python 2, at least for some years after the initial 3.0 release and until version 3.3 hit the road; the initial lack of support for Python 3 among third-party tools and Python libraries (a case of catch-22); and its focus on features that developers did not deem relevant at the beginning. In spite of this, the language grew significantly over the years to include advanced constructs such as generators and coroutines, async/await, concurrent futures, itertools, and more. And while it is true that for some time there was a certain amount of confusion about how to deal with two source-incompatible versions of the language simultaneously, porting Python 2 code to Python 3 is now a much better understood problem, for which great tools exist, such as caniusepython3, Futurize, Modernize, and pylint.

All of this contributed to making many organizations postpone indefinitely the transition to a largely more modern and expressive language. Until now, that is, when the Python Foundation is warning them they will be soon on their own, and their only way to get support will be paying for extended support.

The news raised mixed reactions. On the one hand, many developers highlighted how straightforward it was in their case to port their code to Python 3.

Python 2 to 3 (at least by 3.3 or so) was one of the easiest transitions I’ve ever done. There’s a library (“six”) to help, and in almost all cases you can write 2-and-3 compatible code, which means you can go piece-by-piece.
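To illustrate what “2-and-3 compatible code” can look like, here is a minimal sketch that uses only the standard library. The helper names are hypothetical; libraries such as six package up shims of this kind so that projects do not have to write them by hand.

```python
# The text type has a different name on each major version.
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str  # Python 3


def to_text(data, encoding="utf-8"):
    """Return text from either bytes or text, on both Python 2 and 3."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


def average(values):
    """Avoid relying on "/" for integers, which behaves differently in 2 and 3."""
    return sum(values) / float(len(values))


print(to_text(b"hello"))
print(average([1, 2]))
```

Code written this way runs unchanged on both versions, which is what makes a piece-by-piece migration possible.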

Even more so, some argue getting ready to move to Python 3 did not require anything more than applying good engineering practices, such as keeping your unit tests in shape and making sure to keep dependencies up to date.

If you haven’t put in the priority to update your Py2 to Py3 apps by now, I really think your shop has the wrong priorities. It’s not just about Py2/3. Dependency rot is one of the worst form of technical debt. It often shows broken CI, broken security scanning, lots of generally broken processes that will just keep hurting a team further and further down the line.

Others remarked that porting may be easy or feasible in the vast majority of cases, e.g., with Django or Flask-based projects, but there are still many scenarios where porting runs into some kind of blocker, such as an incompatible C extension or an irreplaceable dependency. This is often the case in the computer graphics/visual effects and scientific computing world.

Users and studios are held back because the Python runtime is used inside major applications; Nuke, Houdini, Maya as well as libraries and APIs. None of them have released a version that runs Python 3 yet. […] Also, I’ve worked at a couple studios many people have probably heard of and none of them have unit tests covering much of their code. The focus is on tools that facilitate in-house artists where responsiveness to needs are valued over architecture and completeness.

The entire science stack is “terrible C extensions”. This is also the kind of tight-budgeted stuff that needs relatively rare Python/C developers.

Other developers complained that, in spite of all efforts to make the community at large aware of Python 2.7 EOL, Python 2 documentation still offers no hints of that to its viewers.

Contrast this with the PostgreSQL website, which tells me I’m browsing old docs (because Google still offers old links), see e.g.

This once again brings into focus the “poor” handling of the transition from Python 2 to Python 3 from its very start.

Other developers contrasted the Python approach to a sounder “don’t break the user space” approach, which gives languages and systems written using them a longer life, which is often desirable.

We’ve got a large amount of FORTRAN code from the ’70s that works fine with the latest and greatest FORTRAN compilers (barring deprecation warnings which help guide refactoring efforts). Same with C/C++ from the ’80s.

As a final note, it is worth remarking that the Python Foundation announcement does not rule out the possibility other organizations take the burden to maintain Python 2.7 and keep it up to date. For example, RHEL 7 is based on Python 2.7 and guarantees security/maintenance support till June 2024. The same holds true for Google App Engine and other enterprise service providers.


Microsoft Presents Static TypeScript, a Fast Subset of TypeScript Targeting Embedded Devices

MMS Founder

Article originally posted on InfoQ. Visit InfoQ

Microsoft recently submitted a research paper introducing Static TypeScript to the Managed Programming Languages and Runtimes 2019 (MPLR 2019) international conference. STS is a subset of TypeScript targeting low-resource embedded devices. STS programs may run on devices with only 16 kB of RAM, faster than embedded interpreters would, thus extending battery life.

Static TypeScript is a syntactic subset of TypeScript, built specifically to target microcontroller units (MCUs), and which compiles to machine code that runs efficiently on MCUs in the target RAM range of 16-256kB.

STS eliminates the most dynamic features of JavaScript, like with, eval, prototype-based inheritance, the arguments keyword, or the .apply method. The this pointer and the new syntax are not allowed outside classes or on non-class types. STS also does not implement recent additions to the JavaScript language, such as generators, the await and async function expressions, or file-based modules.

STS also departs from TypeScript typing conventions. Static TypeScript has nominal typing for classes, while TypeScript uses structural typing. This implies in particular that an interface cannot have the same name as a class, a non-class type cannot be cast to a class, classes cannot inherit from built-in types, this cannot be used outside of a method, and functions cannot be overloaded. STS in particular separates, at the type level, objects which act as key-value maps from class instances and other special-purpose JavaScript objects, like functions and arrays.
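The difference between nominal and structural typing can be illustrated with a small analogy. The sketch below is Python, not STS or TypeScript, and it uses runtime checks to stand in for what those languages decide at compile time; all class names are hypothetical.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Describable(Protocol):
    """Structural: any object with a describe() method matches."""

    def describe(self) -> str: ...


class Sprite:
    def describe(self) -> str:
        return "sprite"


class Sensor:
    def describe(self) -> str:
        return "sensor"


sensor = Sensor()

# Structural view: Sensor never declared itself Describable, but it has the right shape.
structural_match = isinstance(sensor, Describable)

# Nominal view: Sensor has the same shape as Sprite, but it is a different named class.
nominal_match = isinstance(sensor, Sprite)

print(structural_match, nominal_match)
```

Under a nominal rule, two classes with identical members remain distinct types, which is the property that lets a compiler assign each class a fixed layout.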

STS language choices allow it to efficiently compile classes with Virtual Call Table techniques. Just as importantly, the language choices facilitate type inference, resulting in code which looks like standard JavaScript. Daryl Zuniga, software engineer on MakeCode, the primary user of STS, explained on Hacker News:

Because of the heavy use of type inference, most beginner programs have no type annotations and it looks just like Javascript.

A Hacker News user expressed his enthusiasm:

Extremely interesting idea. Sounds like many of the things they get rid of are some of the more problematic features of JS anyway. The end result is fairly Swift-y; a loosely OOP static language that has closures and really convenient hash-map syntax.

The Microsoft research paper reports that the STS compiler produces efficient and compact ARM Thumb machine code. The latter point is illustrated with a platform game written with STS, which runs at 30 frames per second on the 120×160 display of a $25, 120MHz, 192kB of RAM AdaFruit device. Alternative embedded interpreters, like IoT.js, Espruino, Duktape, or MicroPython, displayed, in the tests presented by the research paper, significantly higher memory consumption and worse performance.

STS programs may be deployed on embedded devices with a web browser or manually, that is, without any app or driver installation. The research paper explains:

Compiled programs appear as downloads, which are then transferred manually by the user to the device, which appears as a USB mass storage device, via file copy (or directly through WebUSB, an upcoming standard for connecting websites to physical devices).

Zuniga emphasizes the importance of browser-based deployment and simulation for young pupils learning physical computing, an important target audience of MakeCode:

STS comes with a simulator that runs in the browser and so programs can be tested there before they run on hardware. For educational purposes, we found diagnosing issues on hardware to be really hard for students, so we try to catch and present errors as early as possible.

Going forward, Microsoft envisages STS having a significant impact on the Internet of Things (IoT):

We see statically typed languages such as STS playing an important role in the future of IoT, allowing embedded devices to run more efficiently—faster and, as a result, with reduced energy requirements—as well as providing programmers who aren’t embedded developers with an easier, higher-level alternative to the lower-level languages generally used to program these devices.

Developers may consult online the technical details of the STS language and the current draft of the Microsoft research paper.

Presentation: Getting Real About Managing Up

MMS Founder

Article originally posted on InfoQ. Visit InfoQ


Elliot-McCrea: Today we’re going to talk about a couple of things. We’re going to talk about who I am, why we’re talking about managing up at a technology conference, the basics of managing up, some advanced techniques, some things that don’t work.

My name is Kellan Elliot-McCrea. I tend to manage large groups of engineers. Cards on the table, I am the person you are managing up to. Five years running the team at Etsy, five years running a team at Flickr through the acquisition. A couple of startups, I’ve been known to manage product and design. If things are really grim, I might manage HR. That’s a little bit about who I am. I also do a fair amount of coaching and advising. I’ve worked with 30 different startups over the last few years. This talk draws on both of those experiences. When Jean asked me, “Hey, do you want to do a talk on non-technical skills for technical folks?” I said, “Yes, I know exactly what I want to talk about.”

Managing up in Tech

Why are we talking about managing up in tech? You all came, so I’m going to believe that you have some interest in this topic, but I don’t want to assume too much. You might just be here taking a break, using the Wi-Fi. Roi [Ben-Yehuda] is going to be speaking next, better ask him better questions. I’ve done some coaching with Roi [Ben-Yehuda], that will be a killer talk. If for no other reason, you should stick around for that talk. My theory about managing up in tech is a little bit like my theory about engineering management in general. It is a distinct discipline, it has a different set of skills than managing generic humans. Managing up in technical organizations has a different set of characteristics than what you’re going to learn from the million articles that you can find on the web about managing up.

There are three key contributors to that distinct characteristic: modern software development is deeply entangled, the myth of the apolitical engineer, and the growing gap between our expectations of engineering managers and their skills. Modern software development is deeply entangled. Yes, it’s complex, it’s fast moving, it’s high paced. There’s a lot of experimentation, we have small cross-functional teams, we’re building on deep abstractions that are moving underneath us. The days of, put your headphones on, code for six months, throw it over the wall to the release manager, which, by the way, were not the good old days, I promise you, those are long gone. Modern software development is incredibly complex, and that means you need to be good at collaboration, good at communication, good at coordination, planning, prioritization, focus – management skills. It’s one of the reasons we’re talking about it at a technology conference. It’s key to building software today.

The other thing is the myth of the apolitical engineer, this is the thing that we do to ourselves. I was talking with Jean a little bit before the talk, “Yes, this whole track is a little bit about things you were supposed to have learned in kindergarten.” The fact of the matter is, you did learn all of this in kindergarten and then you spent the rest of your career being told that you didn’t actually have to know it. Again, this is going back to, “We’re just hyper rational beings here. I don’t want to engage in politics, I just want to be allowed to do my work. Let the work speak for itself, let the data speak for itself. This technology is clearly superior. Why are we even having this conversation? I’ll just build a prototype to prove the point.”

Good news, bad news – bad news first. This is not how groups of humans make decisions, sorry. The good news is, if any of these tactics have ever worked for you, and for many of us they have, you’ve experienced competent management. You had a manager who was out there breaking ground for you, so good on you.

The third key contributing factor: we have a growing gap in our expectations of management skill and how many good managers we have available. Again, this is a good news, bad news situation. The good news is, we have more talented engineering leaders than we’ve ever had. We’ve figured out a few things about managing software development teams, there are a few good books on the topic. We’re investing in training, conferences like this one are actually putting some focus on these skills.

The bad news – the demand’s never been higher. There are more teams, there’s more need for management skills on software development teams. Frankly, our expectations that work should be a place that is both well-run and fulfilling are going up over time. I don’t know what we can do about that, I think we should try to bring that bar down again. Whatever it is, our need for management is going up. We have a problem, it’s going up much faster than our available supply. We’ve come up with a solution as an industry, or at least there’s a trend as an industry – influence-based leadership.

Influence-Based Leadership

If you’ve spent any of the last few years hanging around career ladders or promotion committees – who doesn’t do that for fun? – you’ve heard a phrase like influence-based leadership. Influence-based leadership is leading without authority, mentoring and coaching, all of these great sorts of words that you might have heard, they tend to be highly correlated with senior ICs. How many people are managers versus ICs in the audience? Yes, managers. For the rest of you, you’re ICs, that’s management speak for you don’t manage. It stands for individual contributor.

What does influence-based leadership mean? It’s a trap. It means, I, as your manager, am totally overwhelmed by the accelerating demands of my job, and I’m going to shift some of that load to you. You’re welcome, you’ve been deputized. Good news, it means I’m giving you some control over your destiny. This is good news, right? This is totally why you got into doing this work. That’s the context in which I want to talk about managing up at a tech conference: the real story is that to be a senior contributor, to do great work in tech, you have to take on some of these management responsibilities. Good news, you probably don’t need to do one-on-ones, though you might, or you probably don’t need to work on compensation conversations, though you might. You almost certainly don’t need to fire anybody. That’s the line.

In that context, what we mean by managing up is making it easier for your manager to support you in doing great work. That’s our definition of managing up. There are other definitions of managing up out there, but for our context in tech, that’s what I’m talking about. Your manager is overwhelmed and stretched to the limit and barely coping, and you’re going to make that a little bit better so you can get back to doing your work. That’s our conversation today. For the basics, we’re going to talk about four skills: getting curious, understanding your manager and their job, building a positive relationship, and making your manager look good.

Get Curious

First, get curious. This is your core foundational skill for managing up, also managing down, managing sideways, managing backwards. Another word for it is asking questions, but being sincere in those questions. One of the things that I encounter a lot, both as a leader and as a coach, “That jerk committed to a new deadline again without talking to me. Why are they doing that?” “They’re talking about how the new architecture is going to solve our performance problems and we’re not even focused on performance in this release. Why are they doing that?” My answer in almost all of those cases is, “What did they say when you asked them that question?”

Surprise, people have never actually asked the person in question. They just want me to read their mind somehow. I can read their mind, but I choose to make it a teaching moment instead. I say, “You should go ask them that question.” – no, obviously I can’t read their mind. Curiosity is a really powerful tool for two reasons, some of which may not be obvious. The first really powerful tool that curiosity brings to bear, the effect that it changes in the world, as it is relevant to managing up is, curiosity says, “I care about you, I care about your opinions. I want you to succeed, I want you to be successful.” There’s also a small chance that you might learn something you didn’t know. Really, we’re doing it for that first thing: “I care about you.” Managing up, get curious.

Understanding Your Manager and Their Job

Understanding your manager and their job – the single most important thing to know about your manager is that they are not thinking about you. Science has shown us that humans spend about 85% to 90% of the time thinking about themselves, understanding the world in relationship to their emotions, their actions. That leaves about 10% left over and at least half of that is thinking about their boss. You’re getting a very narrow time slice here. We’re going to come back to this a lot, because this is really one of the core insights when you think about managing up, and shifting load, and creating an environment where you will be able to more successfully do work.

There is great news though. When I piss you off, you should know it’s not personal because I’m not even thinking about you – I’m just saying. One example of how this plays out is thinking about your manager’s different perception of time. This is a typical manager schedule, I particularly like the no-meeting Wednesday there in the middle. It’s a concrete example of the asymmetry that we’re experiencing. As a manager, I’m split between a lot of different people and a lot of different things and I’m thinking about different things. Two weeks ago, you asked me for help on a hard problem. You know what? It’s two weeks later, and that’s our one-on-one, you’re, “What is the deal? You told me to bring you problems, I brought you problems an eternity ago and you have done nothing.” I’m, “What? I’ve thought about this twice for 15 minutes in the last two weeks. I don’t know what you’re talking about. Why are you so impatient?” These sorts of asymmetries show up in the relationship all the time. That’s just probably one of the most concrete, but it’s a thing to really be thinking about as you approach this work.

Questions Regarding Your Manager

Let’s dig into understanding your manager’s job. First, what exactly is their job? If you’ve never done their job, I actually really recommend getting a book on it, like Camille Fournier’s “The Manager’s Path.” It’s a great book, it takes you through the “I have a manager” to “I am CTO” and that kind of whole arc. It’s a good cheat sheet for understanding what your boss is actually supposed to be doing with their day, whether or not they’re doing it.

What do they value? Again, managers are humans, it comes as a surprise sometimes. They value different things; some of them really value technology, some of them really value product, some of them really value the business, learning. I really like operational excellence, I tend to give a little extra energy and focus for people who frame their problems in that phrase for me. Learning what your manager values can be really important for communicating with them. As distinct from what they value, how are they being evaluated? Does your organization have goals? Does your organization care what its goals are? Are they actually being held to those? Are they being held to something else? Are they being evaluated on success in hiring, or success in shipping new features, or meeting the OKRs, or saying yes to the CEO? There are a lot of different ways someone can be evaluated, and understanding how your manager is being evaluated is a really key piece of building that relationship.

Then, finally, what are they particularly good at? Again, everyone’s good at different things. Maybe they’re good at organizational politics, maybe they’re good at running meetings, or getting you promoted, or thinking through architectural roadmaps, or whatever it is. Really start to understand what are the things that you should go to your manager for help on and what are the things that maybe you should go to somebody else for help on. Because they exist.

Finally, don’t be surprised if your manager doesn’t actually know the answers to any of these questions, because if this stuff was easy, they’d already be doing it. That was understand your manager and their job. The most important takeaway was, they’re not thinking about you.

Establish a Positive Relationship

Second, establish a positive relationship. You’re getting to know them, you want to keep this conversation mostly constructive. In particular, in the beginning, keep it actionable. We are a new relationship here; you’re going to bring your challenges and problems and frustrations to me? I really want to be able to help – helping makes people feel good, people like to be able to help – on the flip side, if I can’t help, I’m going to feel bad. I’m not really going to enjoy our relationship. Just think about that. One of the things that is strange to me, and maybe it’s strange to me because I’ve been a manager for so long, but I find that there is an implicit rule against giving your manager compliments. I’m saying, it works great as long as it’s genuine. The reason it’s important that it’s genuine is, it’s going to call forward that behavior in them.

I can think of an example where I was working with the CEO, and I’m, “Look, the team just really needs to hear from you what the strategy is.” She got up and she gave a long, very detailed presentation on the strategy. The team was, “Great, we’ve got it. That was amazing.” Then she did it the next week and the next week, because you know what? She got a lot of positive feedback on it. At some point, the team was, “Please tell her to talk about something else.” My favorite compliments to give people by the way in this managing up and trying to call behavior forward, tell them about something they’ve taught you. People like hearing that, and they might teach you some new things – just a thought.

Make Your Boss Look Good

Final skill in the basics of managing up – make your boss look good. There are a few things to do to make this possible. You want to align yourself with your team’s mission. That might sound obvious, but that is a huge percentage of how people are being evaluated. It’s a huge percentage of how success is going to be evaluated for both you and your boss. It’s going to determine whether or not you get resources to do great work, which is really one of the major points of what we’re talking about here. If you don’t have the same priorities as your boss, this is one of those great opportunities for these constructive conversations we were just talking about. Do great work, relatively straightforward. Who needs help doing great work?

Equip them to speak fluently about your work. You may not know this if you’ve not been a manager, but at least half the job is being in a meeting and somebody says, “What’s the status of Project X?” and you’ve got to be able to answer that question. There are two things that are going to happen in that situation. I’m either going to use the answer that you’ve prepared me with or I’m going to make something up. It’s going to work better for both of us if I’m prepared.

Translate into their frame – this is an interesting one, and this is where we really are starting to get collaborative about this, it’s a two-way street. I have a project this quarter to figure out why the connection pool is thrashing the database to death. You have a project this quarter to deal with site reliability. These are the same projects, but we’re using different language to talk about them. One of us is going to have to translate across that barrier. If you can do that work of translating across that barrier, my job just got a little bit easier. Think about talking about why is the connection pool thrashing the database to death project in terms of site reliability.

Help Them See Around Blindspots

Finally, this is probably the most important skill of the whole section of making your boss look good and managing up, help them see around blind spots. Management breeds blind spots; it is the nature of the work that we don’t know what we don’t know. We are divided, we are scattered, we’re the boss, we make snap decisions all the time, we are often the least-informed people in the room, and we don’t know it. Our job requires us to go on making those decisions. This is one of the frames that I really like to use when I think about managing up. The best managers need to be managed up. One of the things we often talk about is managing up to bad managers. We’ll talk about bad managers briefly a little bit later.

The best managers are the ones that are inviting you to manage up to them. They’re the ones who know they have blind spots, who need help. One of the phrases that someone said to me at one point that I really liked, it has been echoing in my head now for 20 years, “It’s my job to be pushing. I need you to tell me if I’m pushing us off a cliff.” That’s a blind spot. “I am the technical expert here; you are the person who has been given the strategic marching orders. We should collaborate.” Something that’s really useful for me, a blind spot that I have is, I could really use you telling me who is killing it. Who is just doing an awesome job? Because my sample set is limited. I know who speaks up in meetings. This is going to come as a shock, but there is not a one-to-one correlation between who speaks at the meetings and who is doing great work – I know, just wild. “I need your help on that. I need your help on lots of other things, about what the lived experience on the ground is as a software developer.” This is a two-way relationship. This is really where managing up becomes a collaborative thing that the best managers are seeking.

Another great example that’s happened to me a few times is, you saw that calendar. If I’m eating lunch, I’m eating lunch in a meeting. I don’t have time to go wander around the company being curious, talking to the marketing department. They have a totally different theory of how we’re going to increase conversion this quarter and maybe we should get aligned. Help me see around blind spots.

Is it working? Good question. You never really know. There was supposed to be a build here, which is not building, so I’m just going to do it from memory. There are some things that you can look for to know whether or not your managing up to your manager is working. Are they asking you for your opinion more often? Are they sending you to speak in their stead at meetings? Do they seem a lot calmer when they come into one-on-ones with you? These are all clues that you are applying the basics of managing up successfully, congratulations. Let’s move on to advanced techniques.

Ask for Advice

We have at least five advanced techniques. I’m going to tell you don’t try these until you’ve mastered the basics, they are advanced. Ask for advice, not feedback. Your boss dreads the question, “Do you have any feedback for me?” “No, you’re doing great. Keep it up.” It’s because they’re not thinking about you. In the best case, they’re thinking about the project you’re working on. They don’t have any feedback for you. The standard advice is, ask for specific feedback. I don’t know, maybe your boss is a bit better than my boss is, even that seems like a major stretch. I like to ask for advice, people like to be asked for advice. Advice puts the attention back on them not on you. “Do you have any suggestions for working with that PM? Because I’m really having trouble about how we negotiate around deadlines” “Sure. I’ve got lots of suggestions about that.” Nothing about you, it’s all about me. Ask for advice, not feedback.

Closed Loop Communication

What do I mean by closed loop communication? I actually think this is something I learned from Roi [Ben-Yehuda]. There’s something in the psychological literature called the Zeigarnik Effect. This is that thing that we’ve all experienced; projects which aren’t done, which are incomplete, which we may forget about, those are the ones we obsess about. Our brains are in a tight loop constantly, “Don’t forget to do that. Don’t forget to do it. Don’t forget.” This is the fundamental insight behind some of the productivity systems like Getting Things Done. Write it down and forget about it, free up that brainpower to think about something else.

That also works for managing up. If I make a contract with my manager where I’m going to push them the information they need, consistently and reliably, they can stop worrying about me. If they’re spending all their time worrying about how my project is going, they’re going to want to solve that problem, and they’re going to want to solve that problem by coming down and doing my job for me. That is not going to go well for either of us. If they know that they don’t have to worry, not because things are going to go well, or there aren’t going to be hiccups, or there’s not going to be any surprises, but because they know that I’m going to make sure that they hear about the surprise first, then they can take that deep breath and think about something else and let me do my job.

Again, this is advanced skills, there’s a super pro move in here. This is one of those things where if you’re a senior leader, maybe a CTO, and there’s an outage, the cadence of a closed loop can change a lot based on how much adrenaline there is in the room. My job as a CTO, when the site is down, is to sit on the CEO. You are going to know the site is down from me and I’m going to give you updates every 10 minutes, and I really need you to stay out of the Slack war room. As long as I keep that contract with him, he mostly stays out of the war room. I say “him” because I’m thinking about a particular one, but you know, “them.” That’s closed loop communication. Closed loop communication can be over the span of months, or a weekly update that I’ve asked you for that you’re actually sending every Friday morning like I asked for it, or every 10 minutes depending on how much adrenaline there is in the room.

Your Boss is Repeating Themselves, Listen

Here’s another pro move – your boss is repeating themselves, you may want to listen. This one actually took me a long time to figure out. Maybe you are all smarter than I am. “The best boss I’ve ever had left me alone to do my work.” “That’s interesting. Can we get back to the part where I’m asking you to help me?” “It seems like we aren’t hiring fast enough.” “Nah, we’re hiring fast enough, it’s great. We don’t want to lower the bar.” “I’m worried about our July deadline.” “Don’t worry about our July deadline.” None of those are the right responses. If your boss is repeating themselves, they are trying to tell you something whether or not they’re conscious of it. One of those key managing up techniques is figuring out what they aren’t actually telling you.

Dealing with unreasonable requests for detail, this is one of those things that you run into more and more as you go up, either up as a senior IC or up as a manager. “Why is X running late?” “The spec changed, and we had that other thing, and there was a security thing, and GDPR.” “Great. Just tell me what everyone’s working on and I will help them re-prioritize their work so we can hit our deadline.” “Yes, that’s not going to happen,” but you’re not going to say, “That’s not going to happen,” to the CEO, for example, who wants to lay out everybody on the engineering team and is later in Reese’s Pieces cups, to pick out an entirely hypothetical example.

What you’re going to say is something like, “Great, I can do that. I can get you that information. It’s going to take a little time. Do you have any advice for me in the meantime? I am curious about what you think we could be doing better. Are there any particular problems that you see that you’d like to tell me about? Is there a particular format to the state that would be super useful?” Then just get them the summary that they asked for. Get curious, understand their job, close the loop: that is putting it all together to deal with this thing where someone is trying to pierce past the appropriate level of abstraction in the organization.

Give Them Something to Talk about

My final pro move is, give them something to talk about. This is a picture at Etsy, me when I was much younger, some other folks, and two of our board members, Danny Rimer and Fred Wilson. One day, instead of having the board meeting, because we didn’t want to have the board meeting because the slides weren’t ready, we’re, “You know what? We’re just going to teach you how to push the site. Come on over, we’re going to make a little code change. You’re going to deploy, it’s going to be amazing.” We got a couple of things out of that that I have since rolled forward into my practice throughout.

One, it just increased the empathy. They had a little bit more sense of what we were talking about when we talked about deploying the site, but much more importantly, if you are very senior – you’re a CEO, you’re a board member, you’re a VC, maybe you’re a director at a major company – your currency is graphs, insights, stories, and people can be really happy for a long time on one story to keep them busy while you go back to doing good work. Just say it: it’s a great, advanced technique.

Things That Don’t Work

Let’s talk about a few things that don’t work. “I got this,” drown them in detail, catastrophizing, “They should just appreciate me for me” and “That’s not my job.”

“I got this.” I hear you’re worried about the July deadline. “I got it. It’s under control.” That doesn’t work, it’s good to talk about your challenges. Not, “I’m so stressed out and I don’t know how to solve these. This is unreasonable,” but “Yes, there are challenges.” Because you know what? If you don’t talk about your challenges, your boss just assumes you don’t know about them and you’re clueless, and then you’re going to start losing authority and losing the ability to do that great work that we’re all aspiring to do.

Similarly, drowning them in details. I seem to provoke this in people, I don’t know why. If someone wants to take me aside afterwards and tell me what I’m doing wrong, I would love to know. “But why are they micromanaging me? If they really want to know, I’m just going to send them the change log, as it happens, and all the PRs, and all the notifications, and then they’ll stop asking me.” Like, “I got this,” your boss just assumes you have no idea what’s important. By the way, this is a really great thing when your boss decides you have no idea what’s important, they’re going to tell you you’re thinking tactically, not strategically. If anyone ever tells you you’re thinking tactically, not strategically, what they’re saying is, “You’re not talking my language.” Now you know.

Catastrophizing, “Oh my God, everything is broken. This sucks, this is awful. No one is writing tests, the database is on fire. What are we even doing here?” All I’m hearing is, “I can’t be transparent with you. I’m only going to tell you what’s going on when it’s set in stone.” By the way, it’s never set in stone; it’s always changing. You’re now going to be the last person to know. I figured none of us needed to see another picture of Jack Nicholson, so just imagine that you can’t handle the truth image there.

“Appreciate me for me.” Being loved and appreciated for who we are is a basic human need, we all have it. That’s not what work is for, unfortunately. I’m not thinking about you – going back to our original point. I see a lot of people run into this, it’s really hard. We think we’re worth something, we think we’re special. We think that we know something about the world and our unique values and contribution. That’s true.

I don’t mean to say that you will never be able to talk to your boss or other people at your company about your values, and about what’s special to you, and about the change you want to set. The relationship is supposed to be in give and take. You’re not going to be able to demand it. You’re not going to be able to get there if you don’t understand the asymmetry in the relationship and the asymmetry in the level of focus. This is what I’ve struggled with personally, and most other people I know struggle with this one. This is a very real thing. You need to find that validation somewhere, just maybe not your boss.

“That’s not my job.” I’ve had somebody who’s worked for me twice now. They’ve also quit on me twice, so I don’t know if that means they like working for me or don’t like working for me. I’m still trying to figure that one out. I was constantly asking them, “I need you, as the area expert, you are the expert on this topic, I need you to help us think about strategy here. I need you to step in and make sure that we’re planning and you don’t just come to me and complain after-the-fact that we’re doing the wrong thing.” They said a lot, “Sounds like you’re asking me to manage the team and that’s not my job.” I think it’s probably telling that they ended up quitting on me a couple of times because this was just a frustrating back and forth. Neither one of us ever figured out how to get past it. But if someone is asking you to do this thing, what they’re telling you is, “This is part of your job” to a certain extent. They are asking you to be a leader. It’s a trap, admittedly, but they are asking you to do that influence-based leadership and just saying “no” is not going to work.

A couple of years later, with this particular person, a very senior woman back-end engineer, I realized that part of our disconnect might have been that she was working really hard not to be labeled as non-technical, and was very sensitive to being asked to do sort of the emotional labor of managing, so that’s a thing. But in general, you need to find a way to talk about that. I wish I had been smarter to figure it out faster, but, that’s real. It doesn’t mean you can just say “no”, unfortunately.

A Stressed Boss vs. a Bad Boss

Bad bosses absolutely exist. This talk, so far, we’ve mostly assumed the bosses are doing their best. Their best may not be very good, they’re stretched thin, they’re probably doing work they were never trained to do, but they are trying. Even good bosses have bad days, and good bosses are under pressure. Let’s not say good bosses, there are bosses under pressure and there are bad bosses. When you start managing up, it’s not always going to be appreciated. They’re going to bark some days, they’re going to snap at you. You know what I’m going to say next: they’re not thinking about you. It has nothing to do with you, they’re under pressure. That doesn’t excuse bad behavior, but that’s just what’s happening.

The difference between a stressed boss and a bad boss is: does it ever work? Does it ever feel like you’re getting what you need out of the relationship? Are they ever taking your feedback? Are they ever enabling you to do great work? Do the managing up techniques ever pay off? If so, you have some hope. There is no way to manage up to bad bosses. Bad bosses you survive and leave, it’s not your job to try to fix them, but most of us who have had bosses who we didn’t click with, they weren’t bad bosses. They were stressed bosses, they were bosses trying to play a game above their level. That is a situation where managing up can really enable both you and them to do your best work.

I want to shout out a bunch of people who I’ve talked to about managing up over the years who I really appreciated giving their insights into this. Julia Evans, Doreti, Maggie, D.B. Smasher, Allison, a bunch of others. Two really good follow-up resources on this [slide]. Julia Evans’ “Help! I have a manager” is a great zine on the topic. Then, like I said, Camille Fournier’s “The Manager’s Path.” A great book on what exactly is it that you say you do here?

Questions and Answers

Participant 1: How do you know if a manager is an upward manageable boss? For example, I know a manager who is currently managing 30 people.

Elliot-McCrea: The question is, how do you know if someone is on an upward trajectory, is an upward manageable boss, with the caveat that the person they currently know is somebody who is managing 30 people. The first thing I’d say is, do they know they have a problem or are they in denial still? That’s your first clue about how supportable this person might be. Then the other thing, even though this whole talk I just gave was about how to help your boss do their job better, it’s not really your job. That’s your responsibility in the context of your trying to do great work.

One of the things that you should know is, don’t take it on if you don’t want to take it on. My general trend line is, when I try to give them feedback, when I try to use these techniques, when I try to connect with them about things that I know that they care about, how well does that land? Is it getting better over time? Micro. But if someone is managing 30 people, the only signal I’d be looking for is how quickly they’re trying to shed that. If they’re not trying to shed it, they’re a little bit delusional, probably.

Participant 2: I’ve used some of these techniques before. I find them sometimes effective and always exhausting, if you get what I mean. How do you stay resilient in the face of that?

Elliot-McCrea: It’s supposed to be exhausting, because you’ve been drafted into being a manager, and managing is exhausting. Part of it is just acknowledging that. You also can’t do it as a full-time job and also be a software engineer as a full-time job. You have to understand that there is a tradeoff there. Once you’re stepping into that ring of being a manager for some percentage of your day, then you have to start using manager survival techniques. The manager survival techniques are having people to talk to outside of work about just how nuts it’s going. Ideally, maybe a coach, a therapist, and a drinking club. All three of them are solid.

Having things that you do that actually make you feel like, “I’m good at this. Because the things that I spend most of my day doing right now I’m bad at.” Then exercise is the other one. That’s how I survived doing this work semi full-time. Some version of that for a reduced set of doing it full-time.

Participant 3: Any advice on hiring manageable managers? We have a good process of hiring manageable people, or under managers, but we don’t have a good process for hiring managers and so on.

Elliot-McCrea: Hiring managers is really hard. I don’t know if I necessarily have advice on how to hire manageable ones particularly. Some of the things I like to do when I’m hiring managers full stop, there are a few things that are high pass filters. You ask someone about a project that went well and why it went well. If it’s all about them, they’re out. Then the next one is creating role play-like situations, either explicitly, “We’re going to do a role play about this,” or sending a junior engineer in to ask some questions and see how well they do with it, which is like a role play-like situation. You’re really looking for people to actually demonstrate some of those skills.

But it’s also high false positives, high false negatives. It’s a thing. One of the things that you need to be doing as the person who is the hiring manager for those managers is making sure that they don’t create a new blind spot for you. A new manager in your organization is likely to create a new blind spot where they can control access to information for you about how good or bad they are at their job, and you need to make sure to put systems in place to avoid that.


Presentation: CockroachDB: Architecture of a Geo-distributed SQL Database

MMS Founder

Article originally posted on InfoQ. Visit InfoQ


Mattis: Today I’m going to walk you through some of the high-level components of the CockroachDB architecture, at times diving down into some of the details. I first want to give you the elevator pitch for CockroachDB: Make Data Easy. This is actually the mission statement at Cockroach Labs. Our thesis at Cockroach Labs was that too much of a burden has been placed on application developers. Traditional SQL databases didn’t provide horizontal scalability or geo-distribution of data, and NoSQL databases, which promised horizontal scalability, required you to give up transactions, and indexes, and other goodies that made application development easier. So, we came up with CockroachDB.

CockroachDB is a geo-distributed SQL database. By distributed, we mean it’s horizontally scalable to grow with your application. It’s geo-distributed to handle a data center failure. We place data near where it’s being used and we also push computation closer to your data.

SQL’s the lingua franca for rich data storage. There’s still contention of whether SQL’s hard to use or not, and yet it is known by almost everyone. SQL provides schemas, and indexes, and transactions, and these all make your life as an app developer easier.

As Wes introduced me, my name’s Peter Mattis, I’m the CTO and co-founder of Cockroach Labs, the company behind CockroachDB. I’ve been in the data storage industry for a little over 20 years, which shocks me when I think about it. Some of the highlights of my career: I built the original search and storage engine for Gmail back at Google, that’s almost a lifetime ago. I also worked on Google’s second-generation distributed file system called Colossus. I’ve been working on CockroachDB for the past five years. It’s my obsession right now, I love it.

Here’s the agenda for today. There’s actually quite a bit to cover. We’re going to go pretty fast, so let’s just jump right in.


There are many places we can start at covering the architecture of CockroachDB, and I’m going to build things from the bottom up. The bottom is a distributed, replicated transactional key-value store. I’m going to start right out today and disappoint you, this key-value store is not available for your use, it’s purely for internal use, I’m sorry. There are many reasons for this, but we really want to be able to tailor and evolve this key-value store for the SQL layer that sits on top and focus all our energies on making the SQL exceptional.

Where to start with this distributed, replicated transactional key-value store? The first thing to notice is that it contains keys and values, and keys and values are arbitrary strings. At this level, the keys and values don’t have any structure. At higher levels, I’ll talk a little bit later about how some structure is imposed upon them. Everything’s ordered by key in the key-value store. We use multi-version concurrency control, and what that means is the keys and values are never updated in place. Instead, in order to update a value, you write a newer value and that shadows the older versions. Tombstone values are used to delete values. What this provides is a snapshot view of the system for transactions. We describe CockroachDB as having a monolithic key space; that means there aren’t separate key spaces used by different tables, it uses one big key space and we just do tricks with how we structure the keys.
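To make the multi-version idea concrete, here is a minimal Python sketch of a versioned key-value store with tombstones and snapshot reads. It is only an illustration of the behavior described above: the class name `MVCCStore`, the integer timestamps, and the `TOMBSTONE` sentinel are assumptions made for this example, not CockroachDB's actual data structures.

```python
TOMBSTONE = object()  # hypothetical sentinel: marks a key as deleted at some timestamp


class MVCCStore:
    """Toy multi-version store: a write never overwrites, it adds a new version."""

    def __init__(self):
        self._versions = {}  # key -> list of (timestamp, value)

    def put(self, key, value, ts):
        # A newer version shadows older ones; older versions stay readable.
        self._versions.setdefault(key, []).append((ts, value))

    def delete(self, key, ts):
        # Deleting is just writing a tombstone version.
        self.put(key, TOMBSTONE, ts)

    def get(self, key, ts):
        """Snapshot read: return the newest version written at or before ts."""
        visible = [(t, v) for t, v in self._versions.get(key, []) if t <= ts]
        if not visible:
            return None
        _, value = max(visible, key=lambda tv: tv[0])
        return None if value is TOMBSTONE else value

    def scan(self, start, end, ts):
        """Ordered scan of live keys in [start, end) as of timestamp ts."""
        out = []
        for key in sorted(self._versions):
            if start <= key < end:
                value = self.get(key, ts)
                if value is not None:
                    out.append((key, value))
        return out


store = MVCCStore()
store.put("carl", "v1", ts=1)
store.put("carl", "v2", ts=3)
store.delete("carl", ts=5)
print(store.get("carl", ts=2), store.get("carl", ts=4), store.get("carl", ts=6))
```

A reader at timestamp 2 still sees the first version even after later writes and the delete, which is the snapshot property the talk refers to.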

That’s a little bit of a depiction of what the monolithic key-space looks like. Here, I have a set of keys, these are just dogs. Some of them are names of dogs of people who work at Cockroach Labs. It’s monolithic key-space, it’s ordered by key. For many of the remaining slides, I’m just going to be showing you keys and not keys and values. In reality, behind the scenes, there are values, the values are versioned. I’m not showing that just for clarity, reality is always more complex than presentations like this.

The key space is divided into 64-megabyte ranges. Sixty-four megabytes is chosen as an in-between size where it allows ranges to be moved around and split fairly quickly, but large enough to amortize an indexing overhead, which I will talk about shortly. I should mention that ranges don’t occupy a fixed amount of space; they only occupy the space that they’re consuming and they grow and they shrink as data is added to them and deleted from them.

There’s an indexing structure that sits on top of these ranges. The ranges don’t just float out there where we can’t find them; we have to be able to find a range. This indexing structure is actually stored in another set of ranges. These ranges are stored in this special part of the key space known as the system key space within CockroachDB. This actually presents a little bit of a chicken and egg problem; how do we find these indexing ranges? Well, there’s another index that sits on top of that. This forms this two-level indexing structure, and if any of you are familiar with Bigtable, or HBase, or Spanner, this is exactly analogous to what’s done there.

I should mention that this structure here, this order-preserving data distribution, is a very conscious design decision. It poses quite a bit of complexity on the CockroachDB internals, yet it’s extremely important. If we had chosen something like consistent hashing, we wouldn’t be able to implement full SQL. You’ll see a little bit about why that is later.

By having this fully ordered key space, we can do range scans that span ranges. Here’s a simple example of that. If I want to search for the dogs with names between Muddy and Stella, I can go to the index. I see that this corresponds to ranges two and three, and then I go right to ranges two and three and jump into the data there. Something I forgot to mention earlier is that the ranges themselves, their key-value data, it’s stored down on a local key-value store and we use RocksDB for that purpose.
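The lookup in that example can be sketched in a few lines of Python. This is a simplified, single-level version of the index held in memory; the range boundaries and most of the dog names are made up for illustration, and the real system stores the index in ranges of its own, as described above.

```python
import bisect

# Hypothetical index: the start key of each range, in sorted order.
# Range i owns every key from range_starts[i] up to (not including) range_starts[i + 1].
range_starts = ["", "lula", "peetey"]
range_data = [
    ["carl", "dagne", "figment"],          # range one
    ["lula", "muddy", "ozzie"],            # range two
    ["peetey", "stella", "sunny", "zee"],  # range three
]


def find_range(key):
    """Return the position of the range that owns `key`."""
    return bisect.bisect_right(range_starts, key) - 1


def scan(start, end):
    """Return keys in [start, end], visiting only the ranges that overlap the span."""
    first, last = find_range(start), find_range(end)
    result = []
    for idx in range(first, last + 1):
        result.extend(k for k in range_data[idx] if start <= k <= end)
    return result


print(scan("muddy", "stella"))
```

Because the ranges are ordered by key, the scan only has to touch ranges two and three. With hash-based placement, the same query would have to ask every range.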

Transactions are used to insert and delete data into ranges. I’m going to go into the transaction details more a bit later in this talk. Right now, it’s just a simple example. If we’re trying to insert data into a range, we see this key Sunny, we go into the indexing structure, we see this corresponds to range three. We go to range three and we can insert the data, and then the insert is done. If the range is full, what happens? We go to insert, we see it’s full, and we split the range. Splitting the range involves creating a new replica, a new range, moving approximately half the data from the old range into the new range, and then updating the indexing structure. The way this update is performed, it’s using the exact same distributed transaction mechanism that I was using to insert data into the range itself.
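Here is a rough Python sketch of the insert-and-split flow. It keeps everything in memory and uses a key count (`MAX_KEYS`, an assumption standing in for the 64-megabyte threshold) to decide when a range is full. In the real system the index update goes through a distributed transaction, which this sketch does not model.

```python
import bisect

MAX_KEYS = 4  # hypothetical stand-in for the 64 MB size threshold


class RangeIndex:
    """Toy ordered key space: a sorted list of range start keys plus each range's keys."""

    def __init__(self):
        self.starts = [""]  # start key of each range, sorted
        self.ranges = [[]]  # sorted keys held by each range

    def insert(self, key):
        idx = bisect.bisect_right(self.starts, key) - 1  # find the owning range
        keys = self.ranges[idx]
        bisect.insort(keys, key)
        if len(keys) > MAX_KEYS:
            self._split(idx)

    def _split(self, idx):
        keys = self.ranges[idx]
        mid = len(keys) // 2
        left, right = keys[:mid], keys[mid:]
        self.ranges[idx] = left                # old range keeps roughly half the data
        self.ranges.insert(idx + 1, right)     # new range takes the other half
        self.starts.insert(idx + 1, right[0])  # index now points at the new range


index = RangeIndex()
for name in ["carl", "lula", "muddy", "ozzie", "stella"]:
    index.insert(name)
print(index.starts, index.ranges)
```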


I said this is a distributed, replicated transactional key-value store, now let’s talk about replication. We use Raft for replication. Raft is a distributed consensus protocol similar to Paxos. Paxos is notoriously hard to implement, and I chalk some of that up to some of the early descriptions of the algorithms. Having implemented Raft and lived with Raft for the past five years, I can tell you the implementation of Raft is no walk in the park either, it’s quite challenging and complex. Each range in CockroachDB is a Raft group. I want to reiterate that. The unit of replication is a range. Sixty-four megabytes of data at a time are replicated and kept consistent. This is important. It allows easy data movement within the cluster. It also allows us to tune and control the replication factor on a per range basis.

CockroachDB defaults to three replicas per range, but it’s configurable and we actually default to a high replication factor for some important system ranges, notably the indexing ranges I talked about earlier. Finally, I should note, some people wonder, “Can I have a range that only has two replicas?” In fact, it doesn’t actually make sense. The reason it doesn’t make sense is because of how consensus works. Consensus requires a strict majority. If you see this example right here, there are three replicas depicted in this Raft group, and in order for consensus to be achieved, I need a strict majority, which would be two of them. If I only had two replicas in a range, the majority is two and I have actually achieved nothing except reducing my availability.
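The arithmetic behind that argument is easy to check. A short Python sketch (the function names are just for illustration):

```python
def quorum(replicas):
    """Smallest strict majority of a replica group."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many replicas can be lost while a strict majority is still reachable."""
    return replicas - quorum(replicas)


for n in (1, 2, 3, 5):
    print(n, quorum(n), tolerated_failures(n))
```

Two replicas need both to agree, so they tolerate zero failures, the same as a single replica, while giving you two machines that can each take the range down. Three replicas tolerate one failure and five tolerate two.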

I like to describe what Raft provides, what consensus protocols provide, as providing atomic replication. Commands are proposed into Raft, and once they are written to a majority of the replicas, they’re atomically committed. It doesn’t matter which replicas make up that majority; any majority will do. In this case, if I’m writing to three replicas here, as soon as two have written it, it is committed. If the system crashes and comes back up, we’re only going to make progress if those two are still available.

Commands in Raft are proposed by the leaseholder and distributed to follower replicas. I’ll get into the details a little bit later. They’re only accepted when a quorum of replicas has acknowledged receipt. If you’re familiar with Raft, you might be slightly confused right now because I’m using this term leaseholder, and Raft talks about having a Raft leader. They are actually separate concepts inside CockroachDB, but they’re almost the same. For most of the talk, I’m just going to talk about leaseholders. The Raft leader is the thing that actually proposes commands, and leaseholders are an optimization, which I will touch on shortly, that helps optimize reads. For the most part in CockroachDB, they’re exactly the same replica. They’re almost never different, we actually have a background process to make sure they’re the same.

Talking about range leases, this is where the concept of leaseholders comes up. Why range leases are important is, we want to be able to do reads in CockroachDB without going through a consensus round trip. Without range leases, in order to do a read, I’d have to talk to a quorum of the replicas in the range. The latency for the read would be determined by doing that quorum read. We’d also be doubling or tripling the amount of the traffic we send over the network, and all of this stinks. What we’d like to do is be able to read from a single replica. You can’t read from an arbitrary replica, though; that arbitrary replica might be behind, but there is a special replica within the Raft group and that’s essentially the range leader or the Raft leader or the leaseholder, and that leaseholder, which has been coordinating all the writes, knows which writes have been committed. We send all the reads to the leaseholder, and the leaseholder can then handle the read without consensus. In addition to performing these quorumless reads, the leaseholder coordinates writes. This is important because it actually comes up in our serializable isolation level in terms of key range locking and some other functionality.

I talked about replication, I talked about keys and values, let’s move on to distribution. In a Cockroach cluster, you have more than just three nodes or four nodes. You have many nodes and we can place these replicas wherever we want in a cluster. Where do we place them in? This is the replica placement problem. It’s actually a hard problem. CockroachDB uses four signals in order to drive the replica placement heuristics: space, diversity, load, and latency. I’m not going to talk much about space, it’s the easiest one. It can be summed up as that we try to balance space across the cluster just to get the most disk utilization.

Let’s start with diversity. We all have heard diversity and this term used throughout the industry recently. We know diversity improves teams, improves companies, it also improves availability for ranges. Diversity comes into play in replica placement in that we want to spread replicas across failure domains. What is a failure domain? It’s a unit of failure, a disk, a machine, a rack in a data center, a data center itself, a region. California can be hit by a meteor, we want to be able to survive that. I actually don’t know if we can survive that, but the idea is real. What CockroachDB tries to do is, it looks at these failure domains and it tries to just spread the replicas across them. If we didn’t take failure domains into consideration, the silly extreme is that we could end up placing all the replicas for a range on a single node and that’s obviously bad. Rather than having any sort of fixed hierarchy here, this is something that’s user-specified, so it can be customized here at deployment.

The second heuristic that we use, the second signal for replica placement, is load. The reason load is important as a heuristic is that just balancing on space, balancing on diversity, and spreading the ranges throughout the cluster doesn’t actually balance on load. The first place this comes up is leaseholders. There’s actually an imbalance between leaseholder and follower replicas. The leaseholder, by performing all the coordination for writes and performing all the reads, has significantly higher network traffic, as well as CPU usage, than the follower replicas. If we weren’t taking care to balance the leaseholders throughout the cluster, we could actually end up with one or a few nodes in the cluster having all the leaseholders and having a severe imbalance. We actually saw this in practice before we implemented this heuristic.

The second place this comes up is that all ranges aren’t created equal. I’m showing here an example where this blue range up here is having higher load than the other ranges in the system. CockroachDB notices this, it actually measures per range load, and by measuring that per range load, it can spread ranges and spread the load across the cluster. In this case, the blue range ends up on nodes by itself. An example of how this might actually occur in practice is you might have a small reference table that’s being accessed on every query, or if you have a few hot rows in your system, the ranges those rows exist on would be very hot.

Finally, CockroachDB is a geo-distributed database, and with geo-distribution comes geographic latencies. Over short distances, geographic latencies can be tens of milliseconds and over longer distances, they can be hundreds of milliseconds. We need to take this into consideration during replica placement. We actually want to place data so that it’s close to where the users are and move data around so that it’s close to where it’s being accessed. There are two mechanisms inside CockroachDB. What I’m depicting here on this slide is that we take that same dogs data that we had before, and we can actually divide it up and put a prefix on the keys saying, “Here we have European dogs, we have East Coast dogs, we have West Coast dogs.”

Once that’s done, for each of those ranges, the administrator or the application developer can apply constraints and preferences to the table, saying, “I want the European dogs, those ranges, those replicas, to be housed in a European data center. I want the West Coast dogs to be in a California data center,” etc. Those are the manual controls that we have over replica placement to improve latency. There are also automatic controls that take place. We track, on a per range basis, where in the system reads are coming from and then proactively move data towards where those reads are happening. That’s called follow the workload, because it’s meant to adjust to changes in the workload over the course of a day or over the course of a week.

What I described previously was replica placement in a static cluster, but clusters aren’t static. We add nodes to clusters, we remove nodes from clusters, nodes permanently fail or sometimes they temporarily fail. We need to make all of this easy. That’s our mission statement: make data easy. It should be seamless when a node gets added; the administrator doesn’t have to do anything. What happens when you add a new node to the cluster? The cluster automatically sees this node and starts rebalancing replicas onto it, and it just uses the previously described replica placement heuristics to decide which replicas to move. What does this look like? We first add an additional replica to the range and then we delete the old replica from a node. It’s decomposed into these two steps. In order to move a replica, it’s an add and delete operation.

Another common occurrence is there’s a permanent node failure. This happens, you run on thousands of machines and nodes will go down permanently and never come back. When this happens, the system notices that that node is gone, and the ranges which had replicas on that node will actually start to create new replicas on other nodes and remove the replicas from the failed node. We’re not able to talk to the failed node, so the failed node doesn’t even know this happened. In some sense, permanent failure is just like an excessively long temporary failure and that’s how the system treats it.

There’s also temporary node failure, and this is actually the much more common occurrence. Permanent failure, happens, but it’s rare. More commonly, you have temporary node failure. The most common reason for this might be upgrading your system, or the other reason, and this is going to be exceptionally hard for you all to believe, is that there’s a bug in CockroachDB, it crashes and the node has to restart. That is a super short temporary failure. Even during that short period of time, a minute, or whatever, the replicas that are on that node can fall behind, and when they fall behind, we have to catch them up when the node comes back up. How does this catch up occur?

There are two mechanisms inside Raft that we utilize to catch up replicas. One of them is we can send a complete snapshot of the current range data, and if the node has been down for minutes and there’s been significant churn on the range, this is the right thing to be doing. You send all the data and the replica is now up-to-date. What happens if the node was only down for a handful of seconds, and the ranges are only behind by a few records? What do you do then? The other mechanism just sends a log of the recent commands to the range and you can replay those commands, and for a handful of records, that’s the right thing to do. Now, which one do we use? That is just careful balancing of heuristics within CockroachDB. We look at how many writes occurred while the replica was unavailable, while the node was down, and we determine whether it’s better to send a snapshot or better to send a log of commands.
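The choice between the two catch-up mechanisms can be sketched as follows in Python. The threshold `LOG_REPLAY_LIMIT`, the function name, and the shape of the log entries are all assumptions for illustration; the talk only says that the decision is based on how many writes the replica missed, not how the actual heuristic is tuned.

```python
LOG_REPLAY_LIMIT = 100  # hypothetical cutoff: beyond this many missed commands, ship a snapshot


def catch_up(replica_state, replica_index, leader_state, leader_log, limit=LOG_REPLAY_LIMIT):
    """Bring a lagging replica up to date.

    leader_log is a list of (key, value) write commands in commit order;
    replica_index is how many of those commands the replica already applied.
    Returns the method used and the replica's new state.
    """
    missing = leader_log[replica_index:]
    if len(missing) > limit:
        # Far behind: copy the whole current state instead of replaying history.
        return "snapshot", dict(leader_state)
    # Slightly behind: replay just the commands the replica missed.
    state = dict(replica_state)
    for key, value in missing:
        state[key] = value
    return "log", state


log = [("sunny", 1), ("ozzie", 2), ("sunny", 3)]
leader = {"sunny": 3, "ozzie": 2}
print(catch_up({"sunny": 1}, 1, leader, log))
print(catch_up({}, 0, leader, log, limit=2))
```

Either path ends with the replica holding the same state as the leader; the heuristic only decides which path is cheaper.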


I talked about replication, and distribution, keys and values. This is a transactional system. I want to give you a bit of flavor of what transactions in CockroachDB mean. ACID, you probably all have seen this: atomicity, consistency, isolation, and durability. Transactions in CockroachDB are serializable. We implement serializable isolation. Isolation levels in databases in general are this deep and frequently confusing topic, and I’m not going to get into all the details about them today. Serializable isolation is actually an intuitive isolation level. It’s the one you might imagine: if you were to think about what the isolation between transactions should be, it would be serializable. Transactions under serializability are run as if in some serial order. We don’t actually know in advance what that serial order is, but you could lay one out. There is one. It’s the gold standard isolation level. It’s not actually the strongest isolation, but it’s right near the top. There are tons and tons of weaker isolation levels, and we embrace serializable isolation as part of our mission to make data easy.

Weaker isolation levels are a frequent source of bugs, and sometimes they’re bugs that application developers don’t even know about. You don’t even realize you’re messing up; you just have a bug that can later be exploited. Some research at Stanford actually showed this. A year or two ago, they published a paper called ACIDRain that showed how many websites have bugs due to weaker isolation levels.

Transactions in CockroachDB can span arbitrary ranges. There are no limitations; it’s not like you have microtransactions or something like that. They’re also conversational, and this is important: the full set of operations for a transaction is not required upfront. Why this is important is because frequently, application developers writing to a SQL database will start a transaction, read and write some data, apply some application logic to the data they’ve read, and then write data back out. That back and forth is called the conversation of the transaction, and some other systems don’t support that. It’s hard to do, it complicates our life, and yet, it’s important to provide a full SQL database.

I’m going to dive into some of the details of our transaction implementation. Transactions provide a unit of atomicity, and in a distributed database there is a question of how do you actually provide that atomicity for the transaction overall? We bootstrap transaction atomicity on top of the atomic replication that Raft provided. Raft, remember, provides atomic replication of a single record within a range where I can write that record, it’s atomically replicated. We’re going to use that functionality and we bootstrap our transaction atomicity on it, and that’s done by having a transaction record. Every transaction has a transaction record. That transaction record is just a key-value record like any other in the system, and we atomically flip it from pending to commit in order to commit the transaction.

Let’s walk through an example of this, to make some of this a little bit clearer. What I’m showing here is a cluster with four nodes. There are three ranges spread across these four nodes, and this is our dogs data set again. We have this query, we’re inserting two rows into this table. That SQL query will be sent to a SQL gateway, which is just any node in the system, and SQL execution will decompose the query into a set of KV operations. This starts out saying, we’re beginning a transaction and we’re writing the key Sunny. When this occurs, the gateway talks to the leaseholder of the Sunny range and the first thing it does is, it creates a transaction record. That transaction record is always created on the same range as the first key written in the transaction. That’s done for locality purposes. The transaction starts out in a pending state. Something I didn’t show there which is important is the transaction record is actually replicated just like everything else, it’s just not depicted here.

After writing the transaction record, the leaseholder then proposes a write of the Sunny key and this is done by sending that write, that command, to the followers as well as to itself, and the Sunny key is in process at this point. One of the followers replies, says, “Yes, I wrote that to disk,” as well as the leaseholder also acknowledges the write to disk, and then we move on to the next write. Now we’re writing Ozzie.

Something I should point out here is that we only required a quorum of the replicas to respond to this write to the Sunny key. Another thing I should point out here is I still have Sunny highlighted in yellow, and the reason it’s highlighted in yellow is that other transactions can’t see this key at this point. This is part of what isolation’s about. Until the transaction is committed, another transaction shouldn’t be able to read the Sunny key. This is implemented internally: there are markers associated with each key, and the marker is actually the ID of the transaction that is writing that key.

Moving on, a very similar process happens with Ozzie. We send the Ozzie to the leaseholder, the leaseholder proposes Ozzie to the followers, one of them replies, and then we acknowledge that KV operation back to the SQL gateway. At this point, the SQL gateway marks that the transaction record is committed, and this commit was a replicated commit, and we did a Raft write there. Once that’s acknowledged, the SQL statement is done, and we reply, and that’s it.
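The walkthrough above can be condensed into a small Python sketch: writes carry the ID of the transaction that wrote them, and a single flip of the transaction record makes all of them visible at once. Replication, conflicts, and older versions of a key are all left out, and the class and state names are assumptions for this illustration, not CockroachDB's internals.

```python
class TxnStore:
    """Toy store where each write is tagged with the ID of the transaction that wrote it."""

    def __init__(self):
        self.data = {}         # key -> (value, txn_id)
        self.txn_records = {}  # txn_id -> "PENDING" or "COMMITTED"

    def begin(self, txn_id):
        # The transaction record is an ordinary record that starts out pending.
        self.txn_records[txn_id] = "PENDING"

    def write(self, txn_id, key, value):
        # The marker on the key is the ID of the writing transaction.
        self.data[key] = (value, txn_id)

    def commit(self, txn_id):
        # One atomic flip commits every write the transaction made.
        self.txn_records[txn_id] = "COMMITTED"

    def read(self, key):
        if key not in self.data:
            return None
        value, txn_id = self.data[key]
        if self.txn_records.get(txn_id) == "COMMITTED":
            return value
        return None  # written by a transaction that has not committed: not visible


store = TxnStore()
store.begin("txn-1")
store.write("txn-1", "sunny", "row for Sunny")
store.write("txn-1", "ozzie", "row for Ozzie")
print(store.read("sunny"))  # still invisible to other readers
store.commit("txn-1")
print(store.read("sunny"), store.read("ozzie"))
```

Both rows become visible at the same moment, even though they may live on different ranges, because visibility depends only on the one transaction record.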

I left out a whole number of details there, and that’s because it’s just a high-level architecture talk. I didn’t talk about read-write conflicts, I didn’t talk about write-write conflicts, I didn’t talk about how we handled distributed deadlock detection, and I didn’t talk about large transactions. All that stuff is actually implemented in CockroachDB, it is just there’s only so much detail I can go into in an overview talk.

One thing I want to highlight is that you’re probably noticing there, there’s a number of round trips that take place during a transaction, handling a transaction. What I described was the original transaction protocol implemented by CockroachDB. We’ve slowly been evolving that protocol over time to make it faster and to reduce the effect of network round trips. What we call this new version is pipelined. What was previously described, that’s the serial model of transactions, but the pipelined one is what we’re currently using.

Let me just step through that same example again, but look at the round trips that are involved. With the serial operations of transactions, we wrote the transaction record, waited for it to write. Then we wrote Sunny, waited for it to write, that round trip. For the pipelined, we actually just write Sunny; we don’t even write the transaction record. The reason this is safe is a little bit subtle. We actually give transactions a grace period so that if another transaction encounters the key Sunny shortly after it’s been written, it has a small grace period where we consider it still pending.

Then we write the key Ozzie. The thing I want you to notice here is on the serial operations, we wrote Ozzie and then we waited for it. But on the pipelined ones, we actually started writing Ozzie before Sunny even returned. Sunny’s still in flight, still being written, now we’re writing Ozzie as well. Finally, we write the transaction record, we get to the commit. Something that’s subtle here is that the transaction record in both cases, we include a list of the keys that were touched by the transaction. This is necessary in order to do this cleanup operation of all the markers on those keys. It also allows something else. In pipelined transactions, those keys allow anybody who comes across the transaction record to go out and determine, is the transaction committed or aborted? When is the transaction committed or aborted? In pipelined, it’s committed once all the operations that were involved in the transaction complete.
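The commit rule for the pipelined protocol, as stated above, can be written down as a tiny Python check. This is only a sketch of that one rule under simplifying assumptions: the record layout and the function name are hypothetical, and it does not model aborts, the grace period, or how a reader actually discovers which writes have completed.

```python
def pipelined_status(txn_record, completed_writes):
    """Decide whether a pipelined transaction counts as committed.

    txn_record: None if the record has not been written yet, otherwise a dict
                with a "keys" list naming every key the transaction touched.
    completed_writes: set of keys whose writes have been acknowledged.
    """
    if txn_record is None:
        return "pending"
    if all(key in completed_writes for key in txn_record["keys"]):
        return "committed"
    return "pending"


record = {"keys": ["sunny", "ozzie"]}
print(pipelined_status(None, {"sunny"}))
print(pipelined_status(record, {"sunny"}))
print(pipelined_status(record, {"sunny", "ozzie"}))
```

This is why the key list in the transaction record matters: anyone who finds the record can check the listed keys to work out the outcome, instead of relying on a single centralized commit marker.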

Essentially, what we did is we took what was previously a centralized commit marker indicating the transaction was committed, and replaced it with a distributed one. As you might imagine, this is complex and it’s challenging to get this right. We actually took the effort to model this with TLA+ and join the cool kids club when we did that. It was a fun experience to do that, and the engineers involved felt it gave them reassurance that they got everything correct here. I should note that we’ve been evolving the transaction protocol over time, and what I just described is the protocol that’s going to be in CockroachDB in the version released in the fall. The code is written, it just hasn’t been released yet.

SQL Data in a KV World

I started out describing CockroachDB as a geo-distributed SQL database, and all I’ve been talking about is key-value stuff so far. Where does SQL come in? Well, now it’s time to talk about SQL. How many of you feel like you know SQL pretty well? Wow, that’s impressive. I had the same thought and then I got to working on and implementing a SQL database and it exposed the vast gaps in my knowledge. We’re not going to get into those gaps today, I just want to say that it’s fascinating. You get into implementation and you learn what you didn’t know – I didn’t know a lot. Let’s get started.

Those of you who are familiar with SQL will recognize this. It’s a declarative language, not imperative. What I mean by that is you specify the results you want, but not the sequence of operations for how to get those results. This is kind of great, it’s powerful. It’s also confusing because very frequently, it’s “How do I structure my SQL appropriately?” But what it does is it gives the SQL database a ton of freedom in deciding how to implement the SQL queries. That’s also a ton of burden, but we feel that’s where the burden belongs. It belongs in the SQL database, not in the application developer.

Data in SQL is relational. We have tables composed of rows and columns. The columns are typed, they have types like integer, float, and string. We have foreign keys to provide referential integrity, which is useful both for data correctness and for the optimization opportunities it gives the optimizer. I should mention that CockroachDB SQL is full SQL, it’s not half SQL or light SQL. We actually implement the PostgreSQL dialect, more or less compatible apart from a few esoteric edge cases, and we also implement the PostgreSQL wire protocol. There are drivers for pretty much every language out there in existence.

“Wait a minute,” you say, I talked about keys and values and keys and values were just strings, and columns have types. Whoa, what do we do? At first glance, this is kind of like this severe impedance mismatch, but it’s not that bad. The question though, is how are we going to store this typed columnar data in keys and values? Let me give you a flavor for how this is done. This is actually a low-level detail, but it’s useful to explain at a high level, just to make clear that this is possible. We have this example here, it’s an inventory table. The inventory table has three columns: ID, which is the primary key, name, and price. Inside CockroachDB, every table has a primary key. If you don’t specify a primary key, which is something that’s valid to do in SQL, one is created automatically behind the scenes.

Every index inside a table, including the primary index, creates another logical key space inside CockroachDB, and a single row in that index creates a single record. In this case, we just have the primary index so each row will create a single KV value. The way this works for the values is straightforward. The non-indexed columns here, name and price, get placed in the value. The way this is done isn’t super interesting. We could have used something like protocol buffers, it could be JSON – that would be somewhat expensive – it could be Avro. We actually have a custom encoding because this is performance-critical code and we want to make this very efficient. It’s not that surprising that it can be done, the surprising part is how we encode the keys. Unfortunately, I’m not going to get into the details. If you’re really interested, come talk to me afterward, I’ll point you to where this is done in the code.

Essentially, we have this problem. We have this ID column; it’s an integer and we need to encode it into a key in such a way that the integer values sort in the same order as the encoded string values. I’m just going to wave my hands at this point. This is possible to do, it’s possible to do with [inaudible 00:31:15] arbitrary tuples, and this is not something that CockroachDB invented. I first learned about this technique at Google years ago. It’s not something Google invented, I think I saw a paper from the late 1980s that describes this. I’m not even sure that’s the first time it was mentioned, but it’s generally possible to do. What you’re seeing here, though, when I’m showing /1, /2, and /3, isn’t actually the physical encoding of the key, that’s just a logical representation of the key. That’s actually what we print out in debug logs and other places.
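To give a flavor of what an order-preserving encoding looks like, here is a minimal sketch for signed 64-bit integers. It is not CockroachDB's actual key encoding, which the talk does not describe; it is just one well-known way to make byte-wise comparison agree with numeric comparison. The function name `encode_int_key` is made up for the example.

```python
import struct

def encode_int_key(n: int) -> bytes:
    """Encode a signed 64-bit integer so that comparing the bytes
    lexicographically gives the same order as comparing the integers."""
    # Adding 2**63 flips the sign bit, so negative numbers sort before
    # positive ones; big-endian puts the most significant byte first.
    return struct.pack(">Q", n + (1 << 63))

values = [3, -1, 10, 2, -50]

# Sorting by the encoded bytes matches numeric order.
print(sorted(values, key=encode_int_key) == sorted(values))  # True

# A naive decimal string does not have this property: "10" sorts before "2".
print(sorted(["10", "2"]))  # ['10', '2']
```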

We support more than just a single index and more than just a single table. How does that take place? We actually prefix these keys with the table name and the index name. What does that look like for an inventory table? We have a prefix of inventory primary on all these keys. Now, you’re probably thinking, “Oh, my goodness, that is quite expensive. We’re replicating this inventory primary, and just storing the small /1, /2, /3 in each of these keys.” Underneath the hood, we’re not actually putting names there, we actually store a table ID and an index ID. The reason we’re using IDs is A, they’re smaller and B, it allows us to do very fast renames of tables and indexes. Then down at the lowest level inside RocksDB, there is key prefix compression that takes place, which removes some of the remaining overhead.

What does this look like though, this key encoding, if we had a secondary index? Here I’ve added an index on the name column, it shows up here. Something I want you to notice is the key doesn’t just contain the name column; it actually still contains /1, /2, and /3. Why is that? The reason is this is a non-unique index, and a non-unique index means I can insert multiple rows that have the same name column. Here I’ve added another row, and this contains a duplicate of the bat name. We have two bats, one for $4.44, one for $1.11.

What does this translate into in terms of a key? Well, we have a key and the suffix of the key is /4. What you’re seeing here has actually made the keys unique, and the way I made them unique was to use the columns that are present in the primary key. We know the columns that are present in the primary key are unique because that’s by definition; primary keys are unique. It’s a unique index on the table. This is just a very quick high-level overview of how the SQL data mapping occurs.
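The mapping just described can be sketched using the logical key notation from the slides. Everything here is illustrative: the index name `name_idx`, the items other than the two bats, and the assignment of prices to IDs are assumptions, and the real keys use table and index IDs with a binary encoding rather than readable strings.

```python
# (id, name, price); only the two bats and their prices come from the talk.
rows = [
    (1, "bat", "1.11"),
    (2, "ball", "2.22"),    # hypothetical row
    (3, "glove", "3.33"),   # hypothetical row
    (4, "bat", "4.44"),
]

def primary_key(row):
    row_id, _, _ = row
    return f"/inventory/primary/{row_id}"

def name_index_key(row):
    # The primary key column is appended as a suffix, which keeps keys
    # unique even when two rows share the same name.
    row_id, name, _ = row
    return f"/inventory/name_idx/{name}/{row_id}"

# Primary index: the key holds the id, the value holds the other columns.
primary = {primary_key(r): (r[1], r[2]) for r in rows}

# Secondary index: one key per row, stored in sorted order.
name_idx = sorted(name_index_key(r) for r in rows)
for key in name_idx:
    print(key)
```

Without the `/1` and `/4` suffixes, the two bat entries would collide on the same key.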

SQL Execution

SQL execution – there’s actually a relatively small number of SQL operators, relational operators. We have projection, selection, aggregation, join, scanning from a base table, and there’s order by, which is actually technically not a relational operator, but most people think of it as one. These are specified in SQL queries via select, where, group by. We have join, union, and intersect, and you specify tables using the from clause.

These relational operators are part of relational expressions and all relational expressions have zero, one, or two inputs. There’s often a scalar expression that sits off on the side. An example of that is the filter operator; a filter operator has a child input operator feeding it rows, and it has a scalar expression, which is the filter that’s being applied to decide if a row should be emitted on the output. A query plan is just a tree of these relational expressions. What SQL execution does is it takes a query plan, and runs the operations to completion.

Let me give you a simple example, a flavor of how this works. You have an inventory table, and I’m just going to scan it, and filter, and return the names of inventory items that start with B. The way we break down and do basic SQL execution is you start with the table, that’s the base of the operation. We’re scanning the inventory table. The rows that are output from the scan operation are sent into a filter operation and the filter operation looks at each row and just is applying the scalar expression. Filter the rows, and the output from that is sent to a projection operation and that projects out the name columns so that’s all that remains in the output, and then we send the results to the user.
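The scan, filter, project pipeline can be sketched with Python generators, where each operator pulls rows from its input. This is a toy model under stated assumptions: the table contents and the function names `scan`, `filter_rows`, and `project` are made up for illustration.

```python
def scan(table):
    """Leaf operator: emit every row of the base table."""
    for row in table:
        yield row

def filter_rows(rows, predicate):
    """Apply a scalar expression to each input row; emit rows that pass."""
    for row in rows:
        if predicate(row):
            yield row

def project(rows, columns):
    """Keep only the requested columns of each row."""
    for row in rows:
        yield tuple(row[c] for c in columns)

# Hypothetical inventory data.
inventory = [
    {"id": 1, "name": "bat", "price": "1.11"},
    {"id": 2, "name": "ball", "price": "2.22"},
    {"id": 3, "name": "glove", "price": "3.33"},
]

# Roughly: SELECT name FROM inventory WHERE name LIKE 'b%'
plan = project(
    filter_rows(scan(inventory), lambda r: r["name"].startswith("b")),
    ["name"],
)
result = list(plan)
print(result)  # [('bat',), ('ball',)]
```

The nesting of the calls mirrors the query plan tree: the scan is the leaf and the projection is the root.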

If you haven’t been exposed to SQL execution before, you’re, “Shouldn’t there be more to it than that?” At some level, there isn’t anything more to it than that. It’s actually fairly straightforward. There are complexities with regards to handling the generality. You have user-defined schemas, you have arbitrary types, and whatnot, and yet I want to shrug and say at some level it’s just a small matter of programming to handle all that stuff, easy peasy. It’s a little bit more than a small matter of programming, but it’s not conceptually difficult. There is efficiency that comes into play. Something that happens here is that we don’t actually want to do a full table scan for each query. Now, I want to tie this down a little bit to how this works at the KV level. We did the scan of the inventory table, we did a whole table scan, and it would be correct SQL to always do a full table scan. It would also be incredibly inefficient, so what you actually do is you push the filter down as low as possible. Here, we can push the filter down. If we added an index on the name column, we can push it all the way down so that we’re only scanning a small subset of keys. We still send the output from the scan onto the project, and then we get the results.

There are two large areas of concern for SQL execution. The first is correctness. There’s just a ton of bookkeeping, an absolute ton of bookkeeping inside SQL to handle the generality and handle all the semantics. We have user-defined tables and indexes and foreign keys and all that lovely stuff. We also have the semantics of SQL, and some of them are well-understood and some of them are a little bit more esoteric. The biggest stumbling block in terms of semantics, both for SQL optimization and for SQL execution, is the handling of NULLs. I just want to point this out and highlight it because the bane of the existence of database implementers is NULL handling; it’s the most frequent source of bugs in SQL execution, in SQL optimization. I’m not going to get into the details.

The other area of concern is performance. I kind of touched on this already with the scan example. We have three main tactics for SQL performance. One is tight, well-written code; you try to avoid allocations on every row. You use good data structures, you try to be frugal with CPU, avoid redundant operations. The other is operator specialization, which I’m going to get into in just a moment. Where this comes up is that there’s oftentimes a general way to implement an operator, and you have to implement that general way. For example, aggregation has a hash group by operator and I’m going to walk through an example of that. There’s also a specialized operator that can be used, a streaming group by operator, and a streaming group by operator can be used if the input is sorted on the grouping columns. There’s just a ton of these operator specializations that are used in the special cases when they apply. Another good example of operator specialization is that we implement hash join, merge join, lookup join, and zig-zag join. They’re used in different cases where appropriate. Lastly, we distribute execution, push it down as close to the data as possible.

Let me work through an example of group by. Here we have a query that is reading from a customers table and doing an aggregation, a group by operation on country, and emitting a count of the number of customers per country. I have this sample data here. We have two users in the United States, two in France, and one in Germany. One thing you should notice is right now the data is sorted on name, so we must have some index on name. What the hash group by operator does is it maintains an in-memory hash table. As it consumes rows from its input source, it takes the grouping column, country in this case, looks it up in the hash table, and increments the count.

I’ll just walk through how this works, it’s really straightforward. We’re getting through the France users, we aren’t emitting France, but then we get to another United States user and we have to go back and increment the count for the United States. It’s really nothing more complicated than that. There are, again, details in terms of generality and handling arbitrary types and tuples of grouping columns, but this is more or less it. Now you might be thinking, “That seemed relatively straightforward. How do we make this better?” Hash tables are fast, it’s an O(1) lookup. What could be faster than a hash table? The answer is, don’t have a hash table at all.
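A hash group by for the count-per-country query can be sketched as follows. The customer names are hypothetical; only the country distribution (two United States, two France, one Germany, with input ordered by name) follows the example in the talk.

```python
# Input arrives sorted by name, not by country.
customers = [
    ("Alice", "United States"),
    ("Bruno", "France"),
    ("Chloe", "France"),
    ("Dieter", "Germany"),
    ("Emma", "United States"),
]

def hash_group_by_count(rows):
    """Roughly: SELECT country, COUNT(*) FROM customers GROUP BY country."""
    counts = {}  # the in-memory hash table, keyed by the grouping column
    for _name, country in rows:
        counts[country] = counts.get(country, 0) + 1
    # Nothing can be emitted before the input is exhausted, because a
    # later row may still belong to a group seen earlier.
    return counts

print(hash_group_by_count(customers))
```

Note that the hash table holds one entry per distinct group, so memory grows with the number of groups.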

How do we get rid of having a hash table? Let’s take our sample data again and let’s imagine it is sorted on the grouping column. How did it get sorted? We can sort it ourselves, but more likely, there’s an index on the country column and that index on the country column allowed us to read the data already sorted on country. Now, what happens? Something obvious but useful: all the groups are now just contiguous. As we walk through the groups, as we move from one group to the next, we can just emit those rows. We don’t have to maintain a hash table any longer, we just have to maintain a single variable, which is the name of the group we’re in.

We walk down through the data again, we get through our France users, and now we can emit the France row. We get through our German users, we can emit the German row, and we get through our United States users and we’re done. I just want to highlight, one of the big pieces of work that takes place in SQL execution is all these specialized operators. Streaming group by versus hash group by is one of the easier ones. We have the zig-zag join I mentioned earlier, which is more complex. There’s also work on future optimizations and specializations such as a loose index scan, which is something that we discovered by paying attention to what other SQL databases do.
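The streaming variant can be sketched as a generator that tracks only the current group and its running count. This is a minimal illustration assuming the input is already sorted on the grouping column; the function name is made up for the example.

```python
def streaming_group_by_count(sorted_countries):
    """Count rows per group, assuming the input is sorted on the grouping
    column so that each group is contiguous."""
    current, count = None, 0
    for country in sorted_countries:
        if country != current:
            if current is not None:
                yield current, count   # group finished, emit it right away
            current, count = country, 0
        count += 1
    if current is not None:
        yield current, count           # emit the final group

# Reading through an index on country would deliver rows in this order.
by_country = ["France", "France", "Germany", "United States", "United States"]
print(list(streaming_group_by_count(by_country)))
# [('France', 2), ('Germany', 1), ('United States', 2)]
```

Memory use is constant regardless of the number of groups, and each group is emitted as soon as the next one begins.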

Onto distributed SQL execution. CockroachDB is a geo-distributed database. In a geo-distributed database, network latencies and network throughput are important considerations. What we try to do in order to handle and avoid those network latencies, is we want to push fragments of computation for the SQL query as close to the data as possible. Let’s walk through what this looked like for the streaming group by.

Revisiting that streaming group by example, if we’re scanning over the customers index – actually the country index for the customers table – we would actually instantiate three different processors, one in each of the data centers that contain fragments of the customers table. The output from that scan would be sent to a local group by operator, [inaudible 00:42:10] data centers. What we do is we’re doing the scan, we’re doing the group by, and it’s all just local to the data. There’s nothing being sent over geographic distances. If the query had come into the East Coast, we’re then doing the final aggregation of the data in the East Coast data center. What’s being sent over the network here is vastly smaller than the original customers data. It’s only a handful of rows.
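The two-stage shape of this plan, local aggregation next to the data followed by a final merge, can be sketched as below. The data center names, the rows, and the function names are hypothetical, and a real system would stream results between processors rather than pass lists around.

```python
from collections import Counter

# Hypothetical fragments of the customers table, one per data center.
fragments = {
    "us-east": [("Alice", "United States"), ("Emma", "United States")],
    "us-west": [("Dieter", "Germany")],
    "europe": [("Bruno", "France"), ("Chloe", "France")],
}

def local_group_by(rows):
    """Runs next to the data: reduce a fragment to per-country counts."""
    return Counter(country for _name, country in rows)

def final_aggregate(partials):
    """Runs where the query arrived: sum the partial counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

# Only these small partial results need to cross the network.
partials = [local_group_by(rows) for rows in fragments.values()]
result = final_aggregate(partials)
print(dict(result))
```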

SQL Optimization

Onto SQL optimization – I mentioned this earlier, SQL’s declarative; a SQL database has many possibilities, many choices, for how it chooses to execute a SQL query. It’s the job of the SQL optimizer to choose from all these logically equivalent query plans and decide which one to execute. This is a high-level depiction of what it looks like to process the SQL query. The first step is we take the textual SQL query, we parse it. The output of parsing is an abstract syntax tree. That abstract syntax tree is then sent into the stage called prep. What prep does is it prepares the query. It translates the abstract syntax tree into a relational algebra expression tree. This is what I mentioned earlier and what essentially SQL execution can process.

At this stage we do semantic analysis such as resolving table names into table IDs, reporting semantic errors. We also fold constants; we check column types. There’s some other stuff that takes place here as well, other cost-independent transformations. The output of the prep stage is something called a memo, and a memo isn’t something I get to talk to you about today other than it is this fantastic data structure which everybody who comes across it in the course of working on a SQL optimizer is, “Holy crap, this is amazing.” Essentially, the memo is able to represent a forest of query plans. Instead of having one query plan that a query represents, it stores an entire forest of query plans and it stores it compactly so that we can explore alternative query plans efficiently. The memo is sent into the search stage, and it’s called the search stage because we’re searching for the best query plan in this large forest. What the search stage does is it repeatedly applies cost-based transformations to the query. The output of the search stage at the very end is we choose the lowest cost query plan that we’ve determined and we execute it.

Let me get into a little bit more detail about what cost-independent transformations look like. These are transformations that always make sense. Some examples of these are constant folding, that’s the canonical example. I always want to fold constants upfront instead of doing that constant folding repeatedly on every row as you execute a query. There’s also filter pushdown, decorrelating subqueries. Inside CockroachDB, we actually call these transformations normalizations because they essentially normalize fragments of the query into some common pattern.
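Constant folding is easy to show on a toy expression tree. The representation below (integers for constants, strings for column references, tuples for binary operators) and the `fold` function are assumptions made for this sketch; CockroachDB expresses such rules in its own DSL, which is not reproduced here.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(expr):
    """Fold constant subexpressions once, at planning time.

    An expression is an int constant, a column name (str), or a tuple
    (op, left, right)."""
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)   # both sides constant: evaluate now
    return (op, left, right)

# price + (2 * 3) becomes price + 6, so the multiplication is not
# repeated for every row during execution.
print(fold(("+", "price", ("*", 2, 3))))  # ('+', 'price', 6)
```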

These transformations, when they can be applied, they’re always applied. That’s what it means to be cost-independent. The way this is actually implemented inside CockroachDB is kind of cool. There is a domain-specific language for transformations. It’s a tiny language that specifies patterns to match against fragments of the query. That DSL is compiled down to code which efficiently matches the patterns within the memo and then applies the transformations. Currently, I just counted this morning, and there’s approximately just a little over 200 transformations defined in the code. We add more in every release. The vast majority of them are cost-independent, and there’s a smaller number which are cost-based.

Let’s see a simple example of a cost-independent transformation. Here we have a query; we’re joining two tables, and then filtering. The initial plan that’s constructed, the initial transformation from the AST to a relational algebra expression, has us doing a scan of the two tables, then we join them, then we filter, and then we have the results. That filtering column is actually on both sides of the join, and this is one thing that the DSL implements, it allows matching on both the shape and various properties of the tree. We can actually push the filter before the join. This is a good idea because the filtering operation is really fast, and the join operation is relatively slow. The estimated cost of a join is dependent on how large the data coming into it is. We always want to do this filter pushdown. This also allows for further transformations to be applied again. We then later try to push it down into the scan operation.
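A filter pushdown rule can be sketched on a toy plan tree built from dictionaries. This is an illustration only: the plan representation, the table and column names, and the simplifying assumption that a column name appearing on both join inputs is the equality join column (so the filter is valid on both sides) are all made up for the example.

```python
def columns(plan):
    """Columns produced by a plan node."""
    if plan["op"] == "scan":
        return set(plan["cols"])
    if plan["op"] == "filter":
        return columns(plan["input"])
    return columns(plan["left"]) | columns(plan["right"])  # join

def push_filter_below_join(plan):
    """Rewrite Filter(Join(l, r)) so the filter runs on each join input
    that exposes the filtered column. Other shapes are returned as-is."""
    if plan["op"] != "filter" or plan["input"]["op"] != "join":
        return plan
    join = dict(plan["input"])
    pushed = False
    for side in ("left", "right"):
        if plan["col"] in columns(join[side]):
            join[side] = {"op": "filter", "col": plan["col"],
                          "value": plan["value"], "input": join[side]}
            pushed = True
    return join if pushed else plan

a = {"op": "scan", "table": "a", "cols": ["x", "y"]}
b = {"op": "scan", "table": "b", "cols": ["x", "z"]}
initial = {"op": "filter", "col": "x", "value": 1,
           "input": {"op": "join", "on": "x", "left": a, "right": b}}

optimized = push_filter_below_join(initial)
print(optimized["op"], optimized["left"]["op"], optimized["right"]["op"])
# join filter filter
```

After the rewrite the join sees only rows that already passed the filter, which is why the rule is worth applying regardless of cost.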

Cost-independent transformations are no big deal. Almost every SQL database implements cost-independent transformations, but cost-based transformations are where the real money is. These are transformations that aren’t universally good. Some examples of these are index selection and join reordering. Because the transformations aren’t universally good, it brings up the question of how do we decide whether to apply the transformation or not? This is where the memo comes into play. You can’t really decide on a transformation-by-transformation basis whether to apply it. You actually have to apply it and keep both the untransformed and the transformed query. This actually creates a state explosion. This is one of the huge engineering challenges of a cost-based optimizer, how to handle that state explosion. The memo data structure keeps the memory usage under control, but then there’s also just a lot of fine engineering to make this really fast.

At the end of the cost-based transformation process, we estimate the cost of each query. I don’t have time to go into costing in detail, but basically, you look at table statistics, and using those table statistics you build from the leaves of the query up to the root, and are looking at the cardinality of the inputs to each operator. Based on that cardinality, you’re making some estimate of the cost of how long it’s going to take to execute the query, and then you choose the one with the lowest cost.

I’m going to go through an example of this for index selection. Index selection is actually affected by a number of factors. It’s affected by the filters in the query, filters that are present in join conditions. There might be a required ordering in the query that affects index selection, such as an order by. There might be an implicit ordering in the query, such as a group by, where if we read ordered data from the lower level, we can actually use a streaming group by operator. There’s covering versus non-covering indexes, which I won’t talk about, and then there’s locality as well, which comes into play, the locality of where the data’s stored.

Let me give you an example of how required orderings affect index selection and why it’s cost-based. You probably all know, sorting is relatively expensive, and yet sorting could be the better option if there are very few rows to sort. Here you have a query, it’s reading from this table, it’s filtering on a column, and it’s sorting on a different column. The naive plan is to scan the primary index, filter, then sort. If we have an index on X, we might be able to push the filter all the way down into the scan and then we can scan on index X, and sort. This is almost certainly better than the first plan. If we have an index on Y, it’s possible we can scan on Y and then filter. Now, the question comes up, which of these last two plans is better? Let me just quickly run through what this looks like. Imagine the output of the filter is only 10 rows. Sorting 10 rows is very efficient. That’s really the best thing to do, especially if there are 100,000 rows in the table. The filter might actually output 50,000 rows and sorting 50,000 rows is going to be super expensive. In that case, it’s better to actually get the sorted data and then filter.
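The trade-off can be made concrete with a deliberately crude cost model. The model is an assumption made for this sketch, not CockroachDB's: reading a row costs 1 unit and sorting n rows costs n * log2(n) units. The row counts are the ones from the example above.

```python
import math

def cost_scan_x_then_sort(matching_rows):
    """Index on X: read only the rows that pass the filter, then sort them."""
    sort_cost = matching_rows * math.log2(matching_rows) if matching_rows > 1 else 0
    return matching_rows + sort_cost

def cost_scan_y_then_filter(table_rows):
    """Index on Y: read every row already in sorted order, filter as we go."""
    return table_rows

def best_plan(table_rows, matching_rows):
    a = cost_scan_x_then_sort(matching_rows)
    b = cost_scan_y_then_filter(table_rows)
    return "scan X + sort" if a < b else "scan Y + filter"

print(best_plan(100_000, 10))      # scan X + sort
print(best_plan(100_000, 50_000))  # scan Y + filter
```

Which plan wins depends on how selective the filter is, which is why the optimizer needs table statistics to estimate the number of matching rows.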

Locality-Aware SQL Optimization

The last topic I’m going to cover is locality-aware SQL optimization. For a geo-distributed database, network latencies are important. What we can do is we can duplicate certain read-mostly data in a system, and then partition that data into various localities. What does this look like? Here I’m depicting a reference table, the postal codes table. What we have here is we actually have three indexes on this table. We have a primary index and two secondary indexes, and that storing syntax is basically saying that these secondary indexes store exactly the same data as the primary index. Then we can use replication constraints on these indexes to say all the ranges in the primary are stored in U.S. East; the European index and the U.S. West index also have their ranges constrained to different localities.

When a query comes into the system, we know which locality the query is coming in from and the cost-based optimizer takes locality into account in its cost model. It can choose any of these indexes to read from. They all contain the same data, and it’ll preferentially choose the index that is in the current locality. This is a very simple description of how this works, yet this is a general mechanism that the cost-based optimizer can use whenever you actually have data that’s replicated like this.
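The index choice can be sketched as picking the cheapest of several equivalent indexes, where the cost depends on whether the index lives in the locality the query arrived in. The index names, locality names, and cost numbers below are hypothetical; the real cost model is not described in the talk.

```python
# Hypothetical indexes on the postal codes table; all hold the same data.
INDEXES = {
    "primary": "us-east",
    "idx_eu": "europe",
    "idx_usw": "us-west",
}

def choose_index(indexes, query_locality, local_cost=1, remote_cost=100):
    """Pick the index with the lowest estimated read cost for this query."""
    def cost(name):
        return local_cost if indexes[name] == query_locality else remote_cost
    return min(indexes, key=cost)

print(choose_index(INDEXES, "europe"))   # idx_eu
print(choose_index(INDEXES, "us-west"))  # idx_usw
print(choose_index(INDEXES, "us-east"))  # primary
```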

Here’s a very quick review. We talked about all this stuff. At the base, there’s a distributed, replicated transactional key-value store which has a monolithic key space broken into 64-megabyte ranges. We use Raft for replication. There are various replica placement signals: space, diversity, load, and latency, that we utilize to decide where replicas should reside inside a cluster. Transactions implement serializable isolation. The operations are pipelined. On this key-value store, we map SQL data. SQL execution involves a lot of bookkeeping, a heavy dose of performance, and we have specialized operators. Distributed SQL execution pushes fragments of SQL computation as close to the data as possible, and finally, we have an optimizer that ties all this together. At this point, we’ve wrapped up, you guys are all certified to now go implement a distributed SQL database.

Podcast: Deborah Hartmann Preuss on Creating Joyful Workplaces

MMS Founder

Article originally posted on InfoQ. Visit InfoQ

In this podcast, recorded at the Agile India 2019 conference, Shane Hastie, Lead Editor for Culture & Methods, spoke to Deb Preuss about life coaching, creating joyful workplaces, diversity and inclusion.

Key Takeaways

  • Life coaching is a skill that helps others find their best selves 
  • Culture has shifted and people know that it is possible to find pleasure at work, so autocratic leadership styles will no longer be accepted in the workplace
  • The freedom to be your authentic self reduces your stress and frees your uniqueness to bring a richness into the workplace
  • Diversity and inclusion are two different things and we need both in our workplaces
  • By paying attention to small things done consistently with intention we can build a culture that genuinely is inclusive

Show Notes

  • 00:48 Introduction
  • 01:33 Being a life coach in the dynamic environment of today 
  • 01:57 Help people slow down just long enough so they can hear what’s going on inside themselves 
  • 02:17 The value of sitting quietly not knowing the answer 
  • 02:31 Asking questions that elicit “ha” responses
  • 02:57 Demonstrating powerful questioning 
  • 03:16 How coaching looks is different for every person 
  • 03:34 The skill of seeing what is there and reflecting it back so the coachee notices what’s there
  • 04:16 If we are stressed and overwhelmed when bringing ideas to others that becomes part of the message we bring 
  • 04:27 Trying to teach people to be aligned with what they love
  • 04:42 This matters because it enables people to be more effective 
  • 04:57 If we can let people have their “soft and fuzzy” aspects they can bring their whole selves to the work they do
  • 05:20 It’s OK not to have all the answers 
  • 05:31 The world is moving too fast to tell people what to do – we need to share leadership and change our leadership styles 
  • 05:47 Culture has shifted and people know that it is possible to find pleasure at work, so autocratic leadership styles will no longer be accepted in the workplace
  • 06:08 When leaders can design a way forward that is rewarding to them they can make where they are a satisfying part of their life 
  • 06:24 Stress does not have to be part of the package of work-life today
  • 06:32 People who aren’t stressed out bring much more to the work that they do
  • 06:40 Authenticity as an important building block of a joyful workplace
  • 06:57 Some of the practices that Menlo Innovations embrace as part of their quest for a joyful workplace
  • 07:08 You don’t have to apologise for being who you are 
  • 07:22 The freedom to be your authentic self reduces your stress and frees your uniqueness to bring a richness into the workplace
  • 07:34 Joy is not the same as happiness
  • 07:50 To enable this to happen leadership needs to create safety so people can be vulnerable 
  • 07:54 A leader who doesn’t feel safe will have trouble creating safety for others
  • 08:10 Asking people to be vulnerable has to happen inside a place of trust
  • 08:21 There are tools which can be used to help create safe environments 
  • 08:28 We build trust by talking openly about things and being accountable to others 
  • 08:40 Advice in the book The Speed of Trust on how to build trust and to repair trust 
  • 09:08 Exploring the current state of diversity 
  • 09:20 The importance of inclusion in addition to diversity 
  • 09:28 Diversity is not the same as being inclusive 
  • 09:36 Inclusion is honestly welcoming differences  
  • 09:54 The comparison of an open space event compared to a traditional conference is similar to going from a traditional workplace to an agile workplace 
  • 10:08 Creating a space where people feel free to choose to step in
  • 10:16 How do we invite diversity rather than selecting diversity? 
  • 10:28 Inviting diversity is challenging 
  • 10:58 The desire to be inclusive is not the same as accomplishing inclusiveness 
  • 11:16 The resources are starting to be available for allies 
  • 11:28 By paying attention to small things done consistently with intention we can build a culture that genuinely is inclusive 
  • 11:52 Deb’s own work with women through Ten Women Strong 
  • 12:12 The key thing Deb brings is to help women be courageous through affirming who they are, how they are and encouraging them to be bold 
  • 12:38 How Deb got connected with the Ten Women Strong program 
  • 12:57 The program targeted at women in agile which gives a common foundation for conversations and collaboration 
  • 13:20 The common set of values that all agilists share 
  • 13:33 We are socialised to help others and forget to listen to what we need 
  • 14:28 What it means to work in a totally remote way across multiple initiatives 
  • 14:44 Co-active coach training conducted entirely remotely  
  • 15:09 Being an astonishingly good listener and offering that as a service
  • 15:21 The value of coaching through exploring what the coachee wants and needs 
  • 15:49 If I can help leaders bring more joy in the workplace, my work is multiplied 
  • 16:09 An invitation for anyone who is curious about coaching to book an hour to see what it’s about 

Agile India
Richard Sheridan – Menlo Innovations
Book – Liftoff
Book – The Speed of Trust
Book – Better Allies
Ten Women Strong
Co-active coaching
Contact Deb at A Bigger Game

More about our podcasts

You can keep up-to-date with the podcasts via our RSS Feed, and they are available via SoundCloud, Apple Podcasts, Spotify, Overcast and Google Podcasts. From this page you also have access to our recorded show notes. They all have clickable links that will take you directly to that part of the audio.
