Month: September 2022
MMS • Edin Kapic
Article originally posted on InfoQ.
Microsoft announced on August 25th that the .NET 7 SDK will include support for creating containerised applications as part of the build/publish process, bypassing the need for an explicit Docker build phase.
The rationale behind this decision was to simplify boilerplate Docker code and to reduce developer cognitive load, making it possible to build containerised applications in .NET faster than before. Developers can use the generated containers in local development or leverage them to build images as part of a CI/CD pipeline. The reactions of the developer community have been cautiously positive so far.
Microsoft's Chet Husk, product manager for the .NET SDK, explains that in order to build this feature they had to add support for handling TAR files directly into .NET. This allowed them to change and update Docker image files, which are packaged as TAR files according to the Open Container Initiative specification, during the usual .NET build process. All the information needed for building a container image of the .NET app is already present at the moment of the build, so the .NET SDK was extended to include a container image build process written in C#.
A Docker image can have many configuration settings. In the .NET SDK image build process, these configurations are exposed as properties on the project level. For example, an ASP.NET Core project has a default base container image from the Microsoft container registry. If you want to change it to a different base image, you change the ContainerBaseImage property in the project file and point it to the new image. The build process will also take the assembly name of the project as the image name, which can be overridden using the ContainerImageName project property.
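As a rough sketch, the two properties sit in an ordinary project file, and publishing resembles the invocation shown in Microsoft's announcement (the base image tag and image name below are illustrative):

<PropertyGroup>
  <ContainerBaseImage>mcr.microsoft.com/dotnet/aspnet:7.0</ContainerBaseImage>
  <ContainerImageName>my-ticket-app</ContainerImageName>
</PropertyGroup>

dotnet publish --os linux --arch x64 -c Release -p:PublishProfile=DefaultContainer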
A major limitation is that Dockerfile RUN commands are not supported. This seems to be a design decision due to the scope of the .NET build process (in Husk's words: "There's no way of performing RUN commands with the .NET SDK"). Dockerfile RUN commands enable you to build intermediate images by running operating-system commands on the image being built, usually to install a tool or change a system configuration. As a workaround, Microsoft suggests building a base image with RUN commands using Docker and then specifying that image as the base image when building containers with the .NET SDK.
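A rough sketch of that workaround, with hypothetical image names: the RUN steps are baked into a custom base image built once with Docker, and the SDK then layers the application onto it.

# Dockerfile for the custom base image
FROM mcr.microsoft.com/dotnet/aspnet:7.0
RUN apt-get update && apt-get install -y curl

Once built and pushed (say, as myregistry.io/aspnet-curl:7.0), that image becomes the ContainerBaseImage value for SDK-built containers.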
In order to try this new feature in .NET 7, you will need to have Docker installed on your development machine, at least for now. The dependency exists because the SDK still relies on Docker for authentication to container registries. In addition, Docker is needed to run the generated container image.
The feature supports only Linux images for the moment. Microsoft states in the project milestone on GitHub that Windows images and container registry authentication will be addressed before the .NET 7 release, together with other image customization possibilities.
.NET 7 is currently in preview 7 and is expected to be released in November 2022.
MMS • David Manda
Article originally posted on Database Journal – Daily Database Management & Administration News and Tutorials.
Data management and storage are important considerations when weighing the software options for developing a website or e-commerce store. MySQL and MariaDB both offer support features to match a user's website design needs and to aid in performance and security. The abilities of these two relational database management systems (RDBMS) – and whether they are right for your project's specific needs – depend on what goals you have for your website and database-driven applications.
MySQL and MariaDB can support the business functions of your organization and your application goals in particular. MariaDB, for its part, is a community-developed fork of MySQL that is smaller in size, yet adds unique features of its own for database management. MySQL, meanwhile, is database software that can handle multiple performance operations simultaneously and has a reputation for being reliable from a functional and security perspective. This database programming tutorial will explore the details of these two database management systems and seek to highlight why each one might be a fit for your project.
What is MySQL?
MySQL was launched in 1995 by David Axmark, Michael Widenius, and Allan Larsson. The original function of MySQL was to organize data and store records of business queries in a database. Using a systematic procedure of data storage, MySQL is able to plot the relationships between variables (data items whose values can change) and maintain the consistency of databases through tables. Primary keys and foreign keys, meanwhile, are constraints used in MySQL tables to identify rows and relate tables to one another, supporting the quick retrieval of data upon request.
MySQL has held a significant market share since its launch; by one measure it leads relational database management systems worldwide with a 44.04% share. The United States accounts for a 31.39% stake in its use, spanning small to large-scale businesses. Other rankings place MySQL second to Oracle, which leads with 30.2% of the market share, while MySQL holds a 16.65% share among database management systems.
In a typical setting, MySQL operates with high scalability, allowing multiple users to complete web development projects through applications such as WordPress or phpBB, among several others. With additional customization, MySQL can serve the database management needs of common social media platforms like Twitter or Facebook. Marketing and software development are typical functions where MySQL handles database management.
Interested in learning MySQL? We have a tutorial listing the Top Online Courses to Learn MySQL to help get you started.
What are the Pros of MySQL?
Below is a list of the pros and benefits of using MySQL:
- MySQL is portable, allowing web applications to operate on multiple platforms – a quality not universally available in database management systems. The software supports different web development languages like Perl, PHP, C, Python, and Java.
- Server connections are reliable with MySQL because of the availability of UNIX and TCP sockets that enable data integrity, which supports continuous connectivity when transmitting across different networks.
- The security encryption of MySQL is advanced enough to protect sensitive data from being exposed when using web applications. The software has complex algorithms that prevent web applications from exposing information, even when they frequently share the same servers.
- MySQL is an affordable database management system that helps companies operating on a limited budget. The open-source software is reliable and affordable for small-scale and large-scale operations and for software development teams that want to limit their investment in web platforms.
What are the Cons of MySQL?
Below is a list of some of the cons and downsides of using MySQL as your RDBMS of choice:
- Database management for commercial services is a challenge when using MySQL because the software is not ideal for handling bulk data processing.
- MySQL is less inclined to provide security updates or publish bug reports, which limits the general development of the software. Web developers tend to prefer software with advanced features and updated support for the comprehensive functionality of new applications.
- The rise of new database management software makes MySQL less popular in the market, while its lack of customization means popular Linux distributions like Slackware or Fedora exclude it from their default offerings. Most web developers will prefer newer software to create applications, which might further reduce MySQL's market share.
What is MariaDB?
MariaDB is a fork of MySQL with enhancements in features like performance and security relating to database management systems. When the software first launched in 2009, its objective was to offer a free-licensed, drop-in replacement for MySQL users. MariaDB's first version, 5.1, was based on the corresponding MySQL release, and the software supports both small and large data processing tasks.
MariaDB ranks 14th in the relational database management market. The market share of the database software is 1.95%, and its popularity is growing. WordPress, Google, and Wikipedia are prominent organizations that use MariaDB for developing web applications. MariaDB ranks third in the USA database market, where 53,555 websites use it.
Database views and invisible columns are standout features that MariaDB offers. Storage engines like Connect, XtraDB, Aria, the Cassandra Storage Engine, and the Memory Storage Engine are available when using MariaDB. The MariaDB Foundation manages the open-source software, allowing developers to make changes based on community preferences. MariaDB can sustain over 200,000 connections, making it a favorite among e-commerce companies where online transaction processing is a frequent occurrence.
Read: Best Courses for Database Administrators
What are the Benefits of MariaDB?
MariaDB offers the following benefits for database developers and those that create database-driven web applications:
- MariaDB offers backward compatibility, allowing older platforms to function with newer applications. As open-source software, the community can contribute changes to maximize performance and provide security updates to the RDBMS.
- Commercial companies that deal with bulk online transactions benefit from new features, courtesy of Galera cluster technology, that prevent the loss of transactional records and reduce slave lag. MariaDB offers better node read scalability to help clients complete transactions smoothly.
- MariaDB is open-source software that is free and accessible to anyone. The software grants full access to its features when installed via its GPL license.
- The dynamic thread pool is a MariaDB feature that closes inactive threads and allows the server to prioritize active ones. This optimization allows for large numbers of connections where updates and replication operations occur at higher speeds.
What are the Cons of MariaDB?
Here are some of the cons and downsides to using MariaDB as your database:
- JSON data types are supported only from MariaDB 10.2 onward, which can limit functionality on older database platforms. JSON columns converted from MySQL need additional configuration, because MariaDB stores the type as LONGTEXT by default, which can create compatibility problems.
- Technical support for enterprise features and customer representatives is only accessible through a paid subscription plan. MariaDB restricts certain features to that tier, which means expert database knowledge and support through the community can be lacking.
- The stability of the MariaDB cluster version can be unreliable, meaning e-commerce platforms can experience delays when processing bulk online data. Software delays can affect server performance too, as caching struggles with large databases.
- Some of the MySQL features used in the Enterprise database application are excluded in MariaDB. The plugins used for data masking are open-source versions that are limited in compatibility. Further, MariaDB requires additional updates that are not guaranteed to be released in the future.
Database Comparison: MySQL vs MariaDB
Below, we compare key aspects of the MySQL and MariaDB relational databases, in an effort to help you choose which is better for your software development project.
User-friendliness
MariaDB edges out MySQL in terms of user-friendliness. For small businesses, mid-sized companies, and large commercial enterprises alike, MariaDB is more approachable. It is simple to download and install, which makes it easier for web developers to adopt the database into their systems. Real-time analytics are improved in MariaDB compared to MySQL, and enhancements such as extended SHOW STATUS reporting provide more accurate status values.
MySQL is user-friendly for database administrators and programmers who run the software on older database platforms. Web developers who process smaller data sets can benefit from well-optimized server performance when using MySQL. However, MariaDB supports fast data processing speeds even for beginners, helped by a community license that encourages users to share their knowledge. Because MySQL is older database management software, active community support is more limited, so beginners can struggle to find help when searching for online resources.
Features
There are 12 storage engines available in MariaDB, while MySQL has fewer storage options. This matters because large commercial enterprises need to process bulk data and need space for server storage. MySQL is also slower in data processing compared to MariaDB, whose speed advantage reduces latency and allows applications to run faster, especially when dealing with online transactions.
The connection pool in MariaDB is built for faster processing speeds and can sustain over 200,000 connections. In contrast, MySQL's connection pool is slower and cannot match that connection count in a single instance. The difference in speed and capacity of the connection pool makes MariaDB the better database management software here.
New statements such as WITH and KILL, along with additional JSON functions, are available in MariaDB; MySQL lacks some of these extensions to its data processing functions. MariaDB requires a paid license to access Enterprise Edition features; similarly, MySQL keeps the code of its Enterprise Edition private. Overall, MariaDB is better than MySQL for Enterprise Edition access because of its expert technical support.
Integrations
MariaDB is better than MySQL at integrating with other database management platforms. Its large set of storage engines supports high-performance data processing and storage, allowing quick integration with older database platforms. MySQL lacks comparable support for integration with client applications, which is a major restriction on the transmission of processed data.
MariaDB supports data processing for complex transactions and does not restrict entry of transactions based on data volume. In contrast, MySQL handles simpler transactions, with processing restricted by transaction volume. Both MySQL and MariaDB can be integrated into OLAP and OLTP systems, although MariaDB shows better performance.
Collaboration
MariaDB supports the use of third-party software, together with other servers and products relating to enterprise development. SkySQL, MariaDB's cloud database service, provides cloud offerings to web developers. The collaboration options for MariaDB are many, and companies can choose the desired package based on database needs.
In contrast, MySQL is limited to three options: MySQL Enterprise Edition, MySQL Standard Edition, and MySQL Cluster Carrier Grade Edition. This means database companies are limited to the features those editions provide. With regard to collaboration with other products and services, MariaDB is better than MySQL.
Pricing
MariaDB has a starting charge of $0.4514 per hour when selecting the SkySQL cloud option, which is attractive to many companies that seek an Enterprise Edition package. Beyond the paid tier, MariaDB is open-source software that contains most of the database management features available in MySQL.
Commercial editions of MySQL begin at $2,000 per year, which gives access to the MySQL Standard Edition. Although MySQL is open-source software, the community edition's lack of enhanced features pushes companies that seek unique functions toward the paid editions. On pricing, MariaDB is better than MySQL.
Which Database Should I Choose: MySQL or MariaDB?
In the analysis of MariaDB vs. MySQL, the two systems show comparable benefits and restrictions as database management systems. A company's choice of relational database management system depends on its resources and commercial objectives when considering MariaDB or MySQL.
MySQL is widely used software among web developers in the database market. Its popularity makes it a preferable option for commercial companies that desire proven software for processing data. MariaDB remains newer in the database market, and some of its features are less stable, which can cause concern among commercial database companies.
Most features of MariaDB are enhanced over MySQL but still require refinement to maintain functional performance. One issue with MariaDB is that updates can be delayed, and immediate implementation is not guaranteed. Even so, the decision between MariaDB and MySQL favors the former because of its flexibility in optimizing server performance, which leaves room for improvement in the future.
Read: MongoDB vs MySQL
MMS • Philip Howes
Article originally posted on InfoQ.
Transcript
Roland Meertens: Welcome to the new episode of the InfoQ podcast. Today, I, Roland Meertens, am going to interview Philip Howes. In the past, he was a machine learning engineer, and currently he is chief scientist and co-founder at Baseten. He has worked with neural networks for a long time, and we have an interesting story about that at the end of the podcast.
Because of his work at Baseten, Philip and I will talk about how to go from an idea to a deployed model as fast as possible, and how to improve that model afterwards in the most efficient way. We will also discuss what the future of engineering teams looks like and what the role of the data scientist is there. Please enjoy listening to this episode.
Minimizing time to value when deploying ML models [00:55]
Welcome, Philip, to the InfoQ podcast. The first topic we want to discuss is going from zero to one and minimizing time to value. What do you mean by that?
Philip Howes: I guess what I mean is, how do we make sure that machine learning projects actually leave the notebook or your development environment? So much of what I see in my work is these data science projects or machine learning projects that have these aspirations and they fall flat for all sorts of different reasons. And really, what we’re trying to do is get the models into the hands of the downstream users or the stakeholders as fast as possible.
Roland Meertens: So, really trying to get your model into deployment. What kind of tools do you like to use for that? Or what kind of tools would you recommend for that?
Philip Howes: I keep saying that we’re in the Wild West and I keep having to sort of temperature check. Is it still the Wild West? And it turns out from this report last week that I had read, yes, it is.
I think at least in enterprise, most people are doing everything sort of in-house. They’re sort of building their own tools. I think this is even more the case in startup land, people hiring and building rather than using that many off-the-shelf tools.
I think that there has been this good ecosystem that's starting to form around getting to value as quickly as possible. Obviously, the company I started with my co-founders is operating in this space, but there are other great ones, even in the space of just getting out of these Jupyter notebooks. There's like Voila. And then some more commonly known things like Gradio, Streamlit, Databricks, all the way up to, I guess, the big cloud players like Amazon and others.
Roland Meertens: Do you remember the name of the report? Or can we put it in the show notes somehow?
Philip Howes: I think it's just an S&P Global report on MLOps. I'll try and find a link and we can share it.
Roland Meertens: Yes, then I’ll share it at the end of that podcast or on the InfoQ website. So, if we’re talking about deploying things, what are good practices then around this process? Are there any engineering best practices at the moment?
Philip Howes: I mean, I think this is a really interesting area because engineering as a field is such a well established field. We really have, through the course of time, iterated on and developed these best practices for how to package applications, how to do separations of concerns.
And, with regards to machine learning, it's kind of like, well, the paradigm is very different. You're going from something which is very deterministic to something that's probabilistic. And you're using models in place of deterministic logic. And so, some of the patterns aren't quite the same. And the actual applications that you're building typically are quite different, as well, because you're trying to make predictions around things. And so, the types of applications that make predictions are pretty fundamentally different from applications that serve some sort of very deterministic process.
I think there’s certainly some similarities.
Involving different stakeholders [03:52]
I think it’s really important to involve all the stakeholders as early as possible. And this is why minimizing time to value is such an important thing to be thinking about as you’re doing development in machine learning applications. Because at the end of the day, a machine learning application is just a means to an end. You’re building this model because it’s going to unlock some value for someone.
And usually, the stakeholder is not the machine learning engineer or the data scientist. It’s somebody who’s doing some operationally heavy thing. It might be some toy app that is facing consumers who might be doing recommendations. But as long as the stakeholders aren’t involved, you’re really limiting your ability to close that feedback loop between, what is the value of this thing and how am I producing this thing?
And so, I think this is true in both engineering and machine learning. The best products are the ones that have very clear feedback loops between the development of the product and the actual use of the product.
And then, of course there are other things that we have to think about in the machine learning world around understanding, again, we’re training these models on large amounts of data. We don’t really have the capacity to look at every data point. We have to look at these things statistically. And because of that, we start to introduce bias. And where are we getting bias from? Where is data coming from? And the models that we’re developing to put into these operational flows, are they reinforcing existing structural biases that are inherent in the data? What are the limitations of the models?
Iterating on existing models [05:27]
And so, thinking about data is also really important.
Roland Meertens: The one thing which always scares me is that, if I have a model and I update it and put it in production again, will it still work? Is everything still the same? Am I still building on the assumptions I had in the past? Do you have some guard rails there? Or are there guard rails necessary when you want to update those machine learning models all the time?
Philip Howes: Absolutely. I mean, there’s, of course, best practices around just making sure things stay stable as you are updating. But coming from an engineering background, what is the equivalent of doing unit tests for machine learning models? How do we make sure that the model continues to behave in a way…
At the end of the day, you’re optimizing over some metric, whether it be accuracy or something a little bit more exotic. You’re optimizing over something. And so you’re following that number. You’re following the metric. You’re not really following sort of, what does that actually mean?
And so it’s always good to think about, “Okay, well, how do I think about what this model should be doing as I iterate on it?” And making sure that, “Hey, can I make sure that, if I understand biases in the data or if I understand where I need the model to perform well, and incorporating those understandings as kind of tests that I do, whether or not they’re in an automated way or an ad hoc way…”
I think obviously automation is the key to doing things in these really closed tight feedback loops. But if I understand, “Hey, for this customer segment, this model should be saying this kind of thing,” and I can build some statistics around making sure that the model is not moving too much, then I think that’s the kind of thing that you’ve got to be thinking about.
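A minimal sketch of such a guard rail, assuming scikit-learn-style models, a curated "golden" data segment, and an arbitrary drift tolerance (all of which would be project-specific choices):

import numpy as np

def check_model_stability(old_model, new_model, golden_inputs, tolerance=0.02):
    # Guard rail: reject the candidate model if its predictions drift too far
    # from the current model's on a curated "golden" segment of the data.
    old_scores = old_model.predict_proba(golden_inputs)[:, 1]
    new_scores = new_model.predict_proba(golden_inputs)[:, 1]
    drift = np.abs(old_scores - new_scores).mean()
    assert drift <= tolerance, f"mean prediction drift {drift:.3f} exceeds {tolerance}"

Run as part of CI before promoting a retrained model, this is one concrete form of the "unit test for machine learning models" idea.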
Extending your dataset in a valuable way [07:06]
Roland Meertens: I think we now talked a bit about going from zero and having nothing to one where you create some value. And you already mentioned the data a couple of times. So, how would you go at extending your data in a valuable way?
Philip Howes: I guess fundamentally we have to think about, why is data important to machine learning?
Most machine learning models are trained doing some sort of supervised learning. Without a sufficient amount of data, you're not going to be able to extract enough signal so that your model is able to perform on something.
At the end of the day, that data is also changing. The world around you is changing, and the way that your model needs to perform in that world has to also adapt to a changing world. So, we've got to think about how to evolve.
Actually, one sort of little tangent, I was reading the Chinchilla paper recently. And what was really interesting is, data is now becoming the bottleneck in improvements to a model. So, this is one of these things that I think, for a very long time, we thought, “Hey, big neural nets. How do we make them better? We add more parameters to the model. We get better performance by creating bigger models.”
And it turns out that maybe actually data is now becoming the bottleneck. This paper showed that basically, the model size… Well, I guess the loss associated with the model is linear in the inverses of both the model size and the size of the data that you use to train it. So, there is this trade off that you have to think about, at least in the forefront of machine learning, where we’re starting to get this point where data becomes a bottleneck.
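For reference, the parametric loss model fitted in the Chinchilla paper takes the form

L(N, D) = E + A/N^α + B/D^β

where N is the number of model parameters, D is the number of training tokens, and E, A, B, α, β are fitted constants, so the loss falls with the inverses of both model size and data size, and pushing it down further requires scaling both together.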
So, data’s obviously very important.
Then the question is, “Okay, how do we get data?”
Obviously, there are open data sets and that usually gives us a great place to start. But how many domain-specific data sets are there? There are not that many. So, we have to think about, how do we actually start collecting and generating data? There are a few different ways.
I think some of the more novel ways are in synthesizing data. I think that’s a whole nother topic. But I think for the majority of people, what we end up doing is, getting some unlabeled data and then figuring out, “Okay, how do we start labeling?” And there’s this whole ecosystem that exists in the labeling tools and labeling machine learning models. And if we go back to our initial discussion around, “Hey, zero to one, you’re trying to build this model,” labeling is this process in which you start with the data, but the end product is both labeled data and also the model that is able to score well on your data set, as you are labeling.
How to select your data [09:38]
Roland Meertens: I think often it’s not only the availability of data. Data is relatively cheap to generate. But having high quality labels with this data and selecting the correct data is, in my opinion, the bigger problem. So, how would you select your data, depending on what your use case is? Would you have some tips for this?
Philip Howes: Yes, absolutely. You’re presented with a large data set. And you’re trying to think, “Okay, well, what is the most efficient way for me to pull signal out of this data set in such a way that I can give my model meaningful information, so that it can learn something?”
And generally, data is somewhat cheap to find. Labels are expensive. They're expensive because it's usually very time consuming to label data, particularly since there's this time-quality trade off. The more time you spend on annotating your data, the higher value it's going to have. But also, because it's time, it's also cost, right? It's certainly something that you want to optimize over.
And so, there are lots of interesting ways to think about, how should I label in my data?
And so, let’s just set up a flow.
I have some unlabeled data. And I have some labeling interface. We can talk about, there’s a bunch of different labeling tools out there. You can build your own labeling tools. You can use enterprise labeling tools. And you’re effectively trying to figure out, “Okay, well, what data should I use such that I can create some signal for my model?”
And then once I have some initial set of data, I can start training a model. And it’s obviously going to have relatively low performance, but I can use that model as part of my data labeling loop. And this is where the area of active learning comes in. The question is, “Okay, so how do I select the correct data set to label?”
And so, I guess what we're really doing is, we're querying our data set somewhat intelligently around, where are the data points in this data set such that I'm going to get some useful information?
And we can do this. Let’s say that we have some initial model. What we can do is start scoring the data on that model and say, “Hey, what data is this model most uncertain about?” We can start sampling from our data set in terms of uncertainty. And so, through sampling there, we’re going to be able to give new labels to the next iteration of the model, such that it is now more certain around the areas of uncertainty.
Another thing which maybe creates robustness in your model is maybe that we have some collection of models that can do some sort of weak classification on our data. And they are going to have some amount of disagreement. One model says A, another model says B. And so, I want to form a committee of my models and say, "Hey, where is there disagreement amongst you?" And then, I can select data that way.
I mean, obviously there are lots of different querying strategies that we could use. We could think about maybe, how do I optimize over error reduction? Or how much it’s going to impact my model?
But I guess the takeaway is that there’s lots of intelligent ways for different use cases in data selection.
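As an illustration of the uncertainty-sampling loop described above, here is a minimal sketch in Python; the synthetic data stands in for a real unlabeled pool, and the hidden labels play the role of a human annotator:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy pool; y_hidden simulates the annotator answering labeling requests.
X_pool, y_hidden = make_classification(n_samples=2000, n_features=20, random_state=0)
rng = np.random.default_rng(0)
labeled = set(rng.choice(len(X_pool), size=50, replace=False).tolist())

for round_num in range(5):
    idx = sorted(labeled)
    model = LogisticRegression(max_iter=1000).fit(X_pool[idx], y_hidden[idx])

    # Uncertainty sampling: score every unlabeled point by predictive entropy
    # and send the most uncertain ones to the labeling interface.
    unlabeled = np.array([i for i in range(len(X_pool)) if i not in labeled])
    probs = model.predict_proba(X_pool[unlabeled])
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    most_uncertain = unlabeled[np.argsort(entropy)[-20:]]
    labeled.update(most_uncertain.tolist())  # "annotator" labels them via y_hidden

A query-by-committee variant would instead train several weak models and rank points by how much their votes disagree.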
Roland Meertens: And you mentioned initial models. What is your opinion on those large scale, foundational models, which you see nowadays? Or using pre-trained models? So, with foundational models, I mean like GPT-3 or CLIP.
Philip Howes: I think that there’s a cohort of people in the world that are going to say that, basically, it’s foundational models or nothing. It’s kind of foundational models will eat machine learning. And it’s just a matter of time.
Roland Meertens: It’s general AI.
Philip Howes: Yes, something like that.
I mean, I think to the labeling example, it’s like, “Yeah, these foundational models are incredibly good.” Think of something like CLIP that is this model, which is conditioned over text and images. And let’s say I have some image classification task. I can use CLIP as a way to bootstrap my labeling process. And then, as I add more and more labels, I can start thinking about, “Okay, I can not just use it to bootstrap my labeling process. I can also use it to bootstrap my model. And I can start fine tuning one of these foundational models on my specific task.”
And I think that there is a lot of value in these foundational models in terms of their ability to generalize and particularly generalize when you are able to do some fine tuning on them.
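As a minimal sketch of that bootstrapping idea using OpenAI's open-source CLIP package (the label set and file name here are hypothetical):

import clip
import torch
from PIL import Image

# Zero-shot classification with CLIP, used to propose provisional labels.
model, preprocess = clip.load("ViT-B/32", device="cpu")
labels = ["cat", "dog", "horse"]
text = clip.tokenize([f"a photo of a {label}" for label in labels])

image = preprocess(Image.open("example.jpg")).unsqueeze(0)
with torch.no_grad():
    logits_per_image, _ = model(image, text)
    probs = logits_per_image.softmax(dim=-1).squeeze(0)

# High-confidence predictions become provisional labels; low-confidence
# images are routed to a human annotator, bootstrapping the labeling loop.
print({label: float(p) for label, p in zip(labels, probs)})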
But I think it raises this very important question because, you mentioned GPT-3, this is a closed source model. And so, it's kind of worrying to live in this world where a few very large companies control the keys to these foundational models. And that's why I think the open science initiatives that are happening in the machine learning world, like BigScience, matter. I think, as of the time of recording this, I'm not sure when this comes out, but a couple days ago, the stable diffusion model came out, which is super exciting. It's essentially a DALL-E-type model that does image generation based off text and produces amazing, high-quality images.
Certainly, the openness around foundational models is going to be pretty fundamental to making sure that machine learning is a democratized thing.
Roland Meertens: And are you at all concerned about how well models generalize or what kind of model psychology is going on? Overall problems a model can solve? Or what abstractions it learned?
Philip Howes: Yes. I mean, it’s like just going back to stable diffusion.
Of course, obviously the first thing I did when I saw this model get released was pull down a version. And this is great because this is a model that is able to run on consumer hardware. And the classic thing that you do with this model is you say astronaut riding horse. And then, of course, it produces this beautiful image of an astronaut riding a horse. And if you stop to think about it a little bit and look at the image, it's like, "Oh, it's really learnt so much. There's nothing in reality which actually looks like this, but I can ask for a photograph of an astronaut riding a horse, and it's able to produce one for me."
And it’s not just the astronaut riding a horse. It understands the context around, there’s space in the background. And it understands that astronauts happen to live in space. And you’re like, “Oh, wow, it’s really understood my prompt in a way that it’s filled in all the gaps that I’ve left.”
And then, of course, you write, “Horse riding astronaut.” And you know what the response is from the model? It’s an astronaut riding a horse.
And so, clearly that there is some limitation in the model because it’s understood the relationship between all these things in the data distributions that it’s been trained on. And it’s able to fill in the gaps and extrapolate around somewhat plausible things. But when you ask it to do something that seems really implausible, it’s so far out of its model of the world that it just defaults back to, “Oh, you must have meant this. You must have met the inverse because there’s no such thing as a horse that rides an astronaut.”
Roland Meertens: Oh, interesting. I'm always super amazed at how, if you ask the model, for example, to draw an elephant with a snorkel, it actually understands that elephants might not breathe through their mouth. So, it draws the snorkel in a different place than you would expect. It has a really good understanding of where to put things you would put on humans when they're put on animals.
I’m always very amazed at how it gets more concepts than I could have programmed myself manually.
Philip Howes: I think it’s amazing how well these things tend to generalize in directions that kind of make sense. And I feel as though this is where a lot of the open questions exist. It’s just like, where are these boundaries around generalization?
And I don’t think that the tools really exist today that really give us some systematic way of encapsulating, what is it that this model has learned? And very often, it’s upon sort of the trainers of the model, the machine learning experts, to maybe know enough about the distributions of the data and about the architecture of the model to start poking it in the places where maybe these limitations might exist.
And this is where bias in machine learning is really frightening because you just really don’t know. How do I understand what’s being baked into this model in a way that is transparent to me as the creator of the thing?
Roland Meertens: Yes, the bias is very real. I think yesterday I tried to generate a picture of a really good wolf, like a really friendly wolf meeting the Pope. But all the images generated were of an evil-looking wolf, which I guess is the bias on the internet towards wolves. And you don’t realize it until you start generating these images.
Did you see this implicit bias from the training data come through your results in ways you don’t expect?
Philip Howes: And I think this is where AI bias matters, not just in the data sense, the technical sense, but also in the ethical sense: we have to really start thinking about how these things get used. And obviously, the world's changing very rapidly in this regard. And people are trying to understand these things as best they can, but I think it just underscores the need to involve the stakeholders in the downstream tasks of how you're using these models.
I think data scientists and machine learning engineers, they’re very good at understanding and solving technical problems. And they’ve basically mapped something from the real world into something which is inherently dataset-centric. And there’s this translation back to the real world that I think really needs to be done in tandem with people who understand how this model is going to be used and how it’s going to impact people.
How to set up your data science team [19:05]
Roland Meertens: Yes. If we’re talking about that, we already now talked about minimizing the time to value and extending your data in a valuable way. So, who would you have in a team if you are setting this up at a company?
Philip Howes: I think this is a really big question. And I think it depends on how end to end you want to talk about this.
I think machine learning projects start at problem definition to problem solution. And problem definition and solution generally operate in the real world. And the job of the data scientists is usually in the data domain. So, everything gets mapped down to this domain, which is very technical and mathematical. And there are all sorts of requirements that you have on the team there in terms of data scientists. Data scientist means so many different things. It’s like this title that means everything from doing ETL, to feature engineering, to training models, to deploying models, to monitoring models. It also includes things that happen orthogonally, maybe like business analyst.
But I think on the machine learning side of things, there’s a lot of engineering problems that’s starting to get specialized in terms of, on the data side of things, understanding how to operate over large data sets, data engineering. Then you have your data scientist who is maybe doing feature engineering and model architecture design and training these things.
And then it’s like, “Okay, well now you have this model. How do I actually operationalize this in a way that is now tapping into the inherent value of the thing?” And so, how do you tap into the value? You basically make it available to be used somewhere.
And so there’s traditional DevOps, ML ops engineering that’s required. And then, of course, at the end of the day, these things end up in products. So, there’s product engineering. There’s design. And then surrounding all of this thing is the domain in which you’re operating, so there are the domain experts.
And so, there’s all sorts of considerations in terms of the team. And what tends to happen more often than not is, in companies that are smaller than Google and Microsoft and Uber, a lot of people get tasked with wearing a lot of hats. And so, I think when it comes to building a team, you have to think about, how can I do more with less?
And I think it becomes the question around, what am I good at? And what are the battles that I want to pick? Do I want be an infrastructure engineer or do I want to train models? And so, if I don’t want to be an infrastructure engineer and learn Kubernetes and about scalability and reliability, all these kinds of things, what tools exist that are going to be able to support me for the size and the stage of the company that I’m in?
Particularly in smaller companies, there’s a huge number of skills that are required to extract value out of a machine learning project. And this is why I love to operate in this space, because I think machine learning has got so much potential for impact in the world. And it’s about finding, how do you give people superpowers and allow them to specialize in the things that create the most value where humans need to be involved and how to allow them to extract that value in the real world?
Roland Meertens: If you’re just having a smaller company, how would you deal with lacking skills or responsibilities? Can this be filled with tools or education?
Philip Howes: It’s a combination of tools and education. I think one of the great things about the machine learning world is it’s very exciting. And exciting things tend to attract lots of interest. And with lots of interest, lots of tools proliferate. And so, I think that there’s certainly no lack of tools.
I think what’s clear to me is that the space is evolving so quickly and the needs are evolving so quickly and what’s possible is evolving so quickly that the tools are always playing in this feedback loop, with research tooling and people of, what are the right tools for the right job at the right time? And I think that it hasn’t settled. There’s no stable place in this machine learning world. And I think that there are different tools that are really useful for different use cases. And lots of the time, there are different tools for different sizes and stages of your machine learning journey.
And there are fantastic educational resources out there, of course. I particularly like blogs, because I feel as though they’re really good at distilling the core concepts, but also doing exposition and some demonstration of things. And they usually end up leading you to the right set of tools.
What becomes really hard is understanding the trade offs and making sure that you straddle the build-versus-buy and hire-versus-buy lines effectively. And I don't think that there is a solution to this. I think it's about just kind of staying attuned to what's happening in the world.
New roles to add to your data science team [23:21]
Roland Meertens: And if we’re coming back to all the new AI technologies, do you think that there will be new disciplines showing up in the near future to extend on the data scientist role to be more specialist?
Philip Howes: Yes, absolutely. I mean, I think one of the things that's happened over the last few years is that specializations are really starting to solidify around data engineering, model development, ML engineers, MLOps engineers.
But I think going back to our conversation around some of these foundational models, if you are to say that these things are really going to play a pretty central role in machine learning, then what kind of roles might end up appearing here? Because model fine tuning of a foundational model is a very different kind of responsibility, maybe technically lighter but maybe requires more sort of domain knowledge. And so, it’s this kind of hybrid data scientist, domain expert kind of position.
I think tooling will exist to really give people the ability to do fine tuning on these foundational models. And so, I think maybe there is an opportunity for the model fine tuner thing.
I think going back to Stable Diffusion or DALL-E-type models, I think astronaut riding horse, you get an astronaut riding a horse. Horse riding astronaut, you get an astronaut riding a horse. But if you prompt the model in the right way, if you say maybe not horse riding astronaut, but rather horse sitting on back of astronaut, and maybe with additional prompting, you might actually be able to get what you need. But that really requires a deep understanding of the model and how the model is thinking about the world.
And so, I think what's really interesting is this idea that these models are pretty opaque. And so, I think you mentioned model psychology earlier. Is there opportunity for model psychologists? Who's going to be the Sigmund Freud of machine learning and develop this theory about, how do I psychoanalyze the model and understand what this model is thinking about the world? What are the opinions and abstractions it has learned from the data it was built on?
Philip’s early experience with neural networks [25:32]
Roland Meertens: And maybe even know that if you want to have specific outputs, you should really go for one model rather than another. I really like your example of the horse sitting on the back of an astronaut because I just typed it into DALL-E and even the OpenAI website can't create horses on the back of astronauts. So, listeners can send us a message if they manage to create one.
As a last thing, you mentioned that you have extensive experience in neural networks and horses. Can you explain how you started working with neural networks?
Philip Howes: This is a bit of a stretch. But when I grew up, my dad was, let's say, an avid investor at the horse track. And so, one of the things I remember as a child back in the early '90s was we'd go to the horse track and there'd be a little rating given to each horse, some number. And I learned that N stood for neural network. And so, these people were building these MLPs to basically score horses. And so this was a very early exposure to neural networks.
And so, I did a little digging as a kid. And obviously, it was over my head. But as I sort of progressed through teenage years and into university, I was getting exposed to these things again in the context of mathematical modeling. And this is how I entered the world of machine learning, was initially with the Netflix Prize and realizing that, “Hey, everyone’s just doing SVD to win this million dollar prize.” I’m like, “Hey, maybe mathematics is useful outside of this world.”
And yeah, I made this transition into machine learning and haven’t looked back. Neural networks.
Roland Meertens: Fantastic. I really love the story. So, yeah, thank you for being on the podcast.
Philip Howes: Thanks for having me, Roland. It’s a real pleasure.
Roland Meertens: Thank you very much for listening to this podcast and thank you Philip for joining the podcast.
If you look at the show notes on InfoQ.com, you will find more information about the Chinchilla paper we talked about and the S&P Global Market Intelligence report. Thanks again for listening to the InfoQ podcast.
Mentioned
- The Chinchilla paper. A good write-up blog post about this paper can be found here.
- The S&P Global Market Intelligence report is called "How MLOps Can Enable AI To Scale", by Nick Patience and Rachel Dunning. Their findings can be listened to in this podcast.
MMS • Steef-Jan Wiggers
Article originally posted on InfoQ.
AWS recently expanded support for manipulating input and output data by adding 14 new intrinsic functions for AWS Step Functions to simplify data processing, reduce calls to downstream services, and write less code.
Step Functions is a low-code, visual workflow service from AWS to build distributed applications, automate IT and business processes, and orchestrate AWS services with minimal code. It currently supports integrations with over 220 AWS services, 10,000 API actions, and 18 intrinsic functions – initially, there were four intrinsic functions: States.Array, States.Format, States.JsonToString, and States.StringToJson. There are now 14 more, covering arrays, universally unique identifiers (UUIDs), math, and string manipulation.
Amazon States Language (ASL) is a JSON-based, structured language that defines Step Functions workflows. Each state within a workflow receives a JSON input and passes a JSON output to the next state. The ASL provides a set of functions known as intrinsics that perform basic data transformations. Developers can apply intrinsics using ASL in Task states within the ResultSelector field, or in a Pass state in either the Parameters or Result field. Moreover, all intrinsic functions have the prefix "States." followed by the function name.
For example, the UUID intrinsic generates a unique identifier in a Pass state:

{
  "Type": "Pass",
  "End": true,
  "Result": {
    "ticketId.$": "States.UUID()"
  }
}
Earlier, when intrinsic functions were not available, the approach was different, as Jones Zachariah Noel, a developer advocate at Freshworks Inc, explains in a dev.to blog post:
Without the intrinsic functions, the need for any such data processing or manipulation required you to write code in your AWS Lambda functions which involved having a dedicated Task in the State Machine where it would invoke your Lambda function. And your architectures would have excessive resources; your workflows mostly included AWS Step Functions invoking AWS Lambda functions for all data manipulations.
Yet now, with intrinsic functions, Benjamin Smith, a senior developer advocate for Serverless at AWS, explains in a blog post:
Intrinsic functions can allow you to reduce the use of other services, such as AWS Lambda or AWS Fargate, to perform basic data manipulation. This helps to reduce the amount of code and maintenance in your application. Intrinsics can also help reduce the cost of running your workflows by decreasing the number of states, number of transitions, and total workflow duration.
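For example, a Task state could use the States.Format intrinsic in its ResultSelector to shape its output directly, where previously a dedicated Lambda function might have been invoked (a sketch; the field names and JSON paths are illustrative):

"ResultSelector": {
  "summary.$": "States.Format('Order {} has a total of {}.', $.orderId, $.total)"
}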
Similarly, Logic Apps, a competitive workflow service on Azure, offers expression functions. While executing a Logic App, some values can be set by functions provided by the Workflow Definition Language, a concept similar to intrinsic functions in AWS Step Functions.
Dave Hall, an independent solution architect, tweeted:
It’s good to see Amazon investing in intrinsic functions. They’re part of my new serverless strategy.
Lastly, AWS Step Functions’ new intrinsic functions are generally available in all regions where AWS Step Functions is available. Pricing details of the AWS Step Functions are available on the pricing page.
Java News Roundup: NetBeans 15, Jakarta EE 10, jtreg 7, Spring Cloud, Groovy, Helidon, Micronaut
MMS • Michael Redlich
Article originally posted on InfoQ.
This week's Java roundup for September 5th, 2022, features news from OpenJDK, JDK 20, Jakarta EE 10, Spring Cloud 2021.0.4, Quarkus 2.12.1, Micronaut 3.6.2 and 3.6.3, Helidon 2.5.3, important changes to the upcoming JDK 8 maintenance release, Hibernate ORM 6.1.3, React Native JHipster 4.3.0, Apache NetBeans 15, Apache Groovy 4.0.5, Apache Camel 3.18.2, Ktor 2.1.1 and the JavaZone conference.
OpenJDK
Version 7 of the Regression Test Harness for the JDK, jtreg, has been released, featuring an upgrade to JUnit 5, which provides the Jupiter API and support for running existing JUnit 4 tests, and the Tag Language Specification. The naming convention for third-party library JAR files has been changed to use the base name of the JAR file that was specified when jtreg was built. This name may depend on the version of the library. JDK tests that were affected by this change have already been updated. JDK 11 is the minimum supported version for jtreg 7.
JDK 19
JDK 19 remains in its release candidate phase with the anticipated GA release on September 20, 2022. The release notes include links to documents such as the complete API specification and an annotated API specification comparing the differences between JDK 18 (Build 36) and JDK 19 (Build 36). More details on JDK 19 and predictions on JDK 20 may be found in this InfoQ news story.
JDK 20
Build 14 of the JDK 20 early-access builds was also made available this past week, featuring updates from Build 13 that include fixes to various issues. Further details on this build may be found in the release notes.
For JDK 19 and JDK 20, developers are encouraged to report bugs via the Java Bug Database.
Jakarta EE 10
On the road to Jakarta EE 10, Ivar Grimstad, Jakarta EE developer advocate at the Eclipse Foundation, announced in his Hashtag Jakarta EE weekly blog that the ballots for the Platform Profile and Web Profile reviews of Jakarta EE 10 are now open until September 13. The Core Profile had already passed its review in August. This appears to be a good sign that Jakarta EE 10 could officially be released sometime this month. More details about the Jakarta EE specifications may be found in Grimstad’s presentation, Jakarta EE 10 – Feature-by-Feature, at the JavaZone conference this past week.
Spring Framework
Spring Cloud 2021.0.4, codenamed Jubilee, has been released featuring updates to all of the Spring Cloud sub-projects with notable changes in Config, Gateway, OpenFeign and Circuit Breaker. More details on this release may be found in the release notes.
Quarkus
Red Hat has released Quarkus 2.12.1.Final featuring a fix to a performance regression issue with the RequestContext class. There were also dependency upgrades to SmallRye OpenAPI 2.2.1 and Dekorate 2.11.2. More details on this release may be found in the changelog.
Micronaut
The Micronaut Foundation has released versions 3.6.2 and 3.6.3 of Micronaut featuring bug fixes and patch releases for a number of the Micronaut modules such as: Security, Email, Micronaut for Spring, Tracing, Flyway, AWS, Serialization, and Data. Version 3.6.2 also provides an upgraded SnakeYAML 1.31 that addresses CVE-2022-25857, a vulnerability in previous versions of SnakeYAML susceptible to a Denial of Service attack due to missing nested depth limitations for collections. More details on these releases may be found in the release notes of version 3.6.2 and version 3.6.3.
Helidon
Helidon 2.5.3 has been released featuring numerous changes that include: an upgraded protocol buffer to support the osx-aarch_64 architecture in the gRPC component; an access token refresh in the Security component; and a fix for obtaining the parent directory for the watcher service in the Config component. There were also dependency upgrades to SnakeYAML 1.31 and Oracle Cloud Integration 2.41.0.
BellSoft
BellSoft, creators of Liberica JDK, their downstream distribution of OpenJDK, discussed important changes that will affect the upcoming release of JDK 8 Maintenance Release 4 scheduled for October 2022. At the center of these changes is JDK-8202260, Reference Objects Should Not Support Cloning, defined in the JDK Bug System, that describes a critical issue identified in the Java SE 8 Platform:
The semantics of cloning a reference object is not defined clearly in the Java SE Specification. Cloning is closely related to garbage collection, so if the reachability state of a reference object changes during GC activities, the collector may enqueue the object before the code calls the clone() method on it. As a result, the clone won't be enqueued and referenced. This leads to highly unpredictable reference processing.
Changes were subsequently implemented in JDK 9 and JDK 11 that will be backported in JDK 8 Maintenance Release 4. For example, in JDK 11 the Reference.clone() method was specified to always throw a CloneNotSupportedException.
Hibernate
Hibernate ORM 6.1.3.Final has been released featuring an optimization in which strings annotated with @JdbcTypeCode(SqlTypes.JSON) and @JdbcTypeCode(SqlTypes.SQLXML) will no longer be serialized to JSON/XML. Instead, they will be interpreted as plain JSON/XML to avoid the overhead of serialization/deserialization.
React Native JHipster
Shortly after the release of JHipster 7.9.3, version 4.3.0 of JHipster React Native was made available to the Java community. Improvements include: an upgrade to Expo SDK 46 with React Native 0.69.5 and React 18; a migration to Expo Application Services; support for logout with Auth0; use Node 16 for GitHub Actions; numerous dependency upgrades; and improved quality assurance using Keycloak, Okta, and Auth0. More details on this release may be found in the release notes.
The Apache Software Foundation
Apache NetBeans 15 has been released featuring: support for JEP 413, Code Snippets in Java API Documentation (delivered in JDK 18); a clean-up of the code base to remove support for Windows 95 and Windows 98; an upgrade to Maven-Indexer 6.2.0 that included the removal of the workaround that avoided an IndexOutOfBoundsException; and an update to Oracle Cloud Integration 2.27.0. More details on this release may be found in the release notes.
Apache Groovy 4.0.5 has been released featuring 56 bug fixes, improvements and dependency upgrades such as: JUnit Jupiter 5.9.0, JUnit Platform 1.9.0, Gradle 7.5.1, Spock 2.2, and SLF4J 2.0.0. The lone new feature is an enhancement to the DateUtilExtensions class such that a subset of the static calendar constants may be retrieved. More details on this release may be found in the release notes.
Apache Camel 3.18.2 has been released, shipping with 50 bug fixes and improvements such as: support for mail attachments in the Camel Freemarker component; and handling of the NoSuchElementException thrown from the loadProperties() method defined in the CamelMicroProfilePropertiesSource class. There were also dependency upgrades that include Spring Boot 2.7.3 and gRPC 1.48.1. More details on this release may be found in the release notes.
JetBrains
JetBrains has released Ktor 2.1.1 featuring improvements that fix issues such as: an exception from Netty HTTP/2; a mismatch between JDK 8 and JDK 11 in building Ktor; and the deprecation of the receiveOrNull() method, which had been perceived as confusing. More details on this release may be found in the changelog.
JavaZone Conference
The JavaZone conference was held at the Oslo Spektrum in Oslo, Norway this past week featuring many speakers from the Java community who presented lightning talks, presentations and workshops.
MMS • Tim Berglund
Article originally posted on InfoQ. Visit InfoQ
Transcript
Berglund: It’s important whether you’re a leader or an individual contributor, a formal manager or an individual contributor, this is the thing that you can drive, and I think that that matters. This is a part of an environment that you almost certainly want. You want this to be true of the place that you work, really, regardless of what you do. Even if you’re not a developer at all, this is important for any team communication. It’s just that as developers, we spend an awful lot of our time talking to each other. It’s something that a lot of people from the outside don’t appreciate. The part of your day that is heads down coding is usually not as long as you’d like it to be. There’s a lot of talking with other people, even if it’s not pointless meetings with marketing that you get dragged into, or whatever. We work together and debate things together. Creating an atmosphere of safety is key to making that collaborative process work well.
Safety
Again, this is a thing that we want to be a part of our lives. It's a thing that makes teams perform better. If you are a manager, you have a real responsibility to make this happen. It's up to you to make sure that the environment that your team works in is a safe one. If you're not a manager, this is still a thing you can contribute to. There's a lot of things about team culture you don't set, but there's behavior you can engage in that helps make things safe or not.
Google’s Project
Google in 2012 embarked on what could only be described as a sociological research project; they wanted to know why some teams thrived and others didn't thrive so much. They wanted to know why some teams were successful and other teams weren't. They surveyed like 180 teams from all over the company. They're like Google, they think they're good at data, and they tried slicing and dicing that along as many dimensions as they could. They looked at education level, whether the teams socialized together, gender diversity, ethnic diversity, whether they had hobbies in common in the team, all this stuff, and it didn't work. They couldn't really find any of those variables that controlled things.
Google’s Frustration
This is a time to step back and think. We all do some of our work in isolation. You’re heads down coding, and sitting crunching on something, you’re in the zone. That’s an extremely enjoyable experience for everybody who’s in this profession. Again, that’s not our whole day. As developers, we spend a lot of time just debating solutions, debating what to name something, debating maybe an architectural decision, maybe debating a big choice of the structure of a system or the adoption of a framework. They are consequential decisions. When we’re working together in groups, there is magic that happens when teams work together. They exhibit a property called collective intelligence. This is this emergent phenomenon of a group of people where the group can achieve more together than its smartest individual could. Let’s just say there’s a spectrum of capability on the team. There’s decisions and code that the best player on the team can output. When you’re firing on all cylinders, you’ve got this collective intelligence thing going on, the whole team together is better than that person. Or, teams can be parasitic on intelligence. You’ve totally seen this, where a dysfunctional team actually produces worse results than its least capable member would individually. If you think about it, that’s quite a trick.
Google’s Discovery
After Google tried to find all these correlations, they did find something, but it was completely behavioral. It had nothing to do with the properties of the people on the team. You can imagine if you were embarking on this research project, there might be certain things that you wanted to see. Maybe you wanted to think, teams that just hang out together, they do better. I just want that to be true. Or teams that are more diverse are going to do better. I want that to be true. These were all dead ends. What they did find were two things, two properties that separated the effective teams from the ineffective ones. One was airtime management, which is where every team member felt like they had the ability to be heard in group discussions. It wasn't just one voice drowning out the others. It was, everybody really believed and experienced that they could speak if they wanted to.
The second was social sensitivity, which is defined as team members being able to sense how each other felt through unspoken cues. That's a tough one, because some people are naturally good at that, some people aren't naturally good at that. Teams where you had good airtime management, good social sensitivity, those were the high performing teams. All of the other things that they wanted to see or that they tried to see, didn't work out. This was a rediscovery of what a researcher named Amy Edmondson had first named in 1999; this concept is old enough to drink. She called it psychological safety. She had her own take on this, of course, on how to put this into practice. I recommend, if you're interested in this, that you follow up and you check out Amy Edmondson's approach. She's got a book on it. It's a little bit different from what I'm talking about here. She is the originator of this so she's worth knowing.
We came up with this definition that psychological safety is a team climate characterized by interpersonal trust and mutual respect, in which people are comfortable being themselves. That comfort is key to unlocking ideas to debate. In order to be creative and propose a new, potentially risky idea, people are exposing a little bit of themselves: some idea that comes from inside them. That's a little bit of the inner person that they're letting out. Normally that person is protected. They need to believe that the part of themselves that they're exposing isn't going to be attacked. Nobody really exposes all of themselves. There's always this outward person that you let people see, and then there's the inner thoughts. That boundary is in there somewhere for different groups of people and different levels of relationship, intimacy, and all that stuff. Certainly, your coworkers don't totally know the real you. If you protect it all, there's not going to be any creativity, there's not going to be any new ideas. If you let some of that be exposed in an environment of safety, then you will have better ideas.
Your Brain Under Threat
Some TED talk level neuroscience. What happens to your brain when it's under threat? There are a few things that happen. One is short term thinking. You don't think about what's going to happen long term. Your vision gets very close to you, and what you're able to imagine and see and think about gets very close to you when you're under threat. If you think about that from a survival perspective, you don't want to think about the future. You want to think about the animal that's trying to eat you or the person that's trying to put a spear in you. Those are all very short term things. There's also diminished creativity. Creativity is expensive. It's not a time to be creative. That goes away when you're under threat. Black and white thinking. You don't think about nuance when there's danger, you think about this or that. This is related to a psychological phenomenon called splitting, where maybe you think of yourself as the best in the world or the worst in the world, or other people you might think of as the best or the worst. This is an actual developmental stage that young people go through, like adolescence. That's the way you can view the world. Normally, as an adult, you develop out of that. Under threat, you fall back to that, this very black and white view of the world and of other people.
Finally, you revert to pattern or revert to training. You don’t think of new things, you fall back to exactly what you know. These are great and important, and we’re made this way to be more survivable creatures. These are great things when we’re under threat. When we’re trying to design systems or debate code or debate naming schemes, the real things that we argue about, probably nobody’s going to die. These are maladaptive to that environment. We don’t want to trigger our brains into danger mode, when we’re trying to make decisions together.
Techniques to Help Your Team Feel Safe – Managing Airtime
Let’s go over some techniques to help teams feel safe. Managing airtime. This is one of the two key things. How do you do it? Again, if you’re the boss, this is to you. If you’re the team lead, this really is for you. If you have a flatter structure, and you’re just an influential person, again, this is for you. It might be that you’ve got a place in the team where this is hard for you to steer. Let’s just take a look at what the points are. You got to make space for quiet people. I’m a professional talker. This is what I do for a living. In meetings, you don’t have to make space for me. If I have something to say, I’m going to take the microphone, I’m going to say it. Not everybody is like me. There’s no rule that says that I’m the right way, and quiet people, that’s just the wrong way to be. That’s not true. Some people are quiet, and we have to make room for them to talk.
We have to help talkative people manage themselves. I've certainly had team members who tend to grab the mic and want to talk. Sometimes you might have to take one of these people aside, maybe it's yourself. You have to develop some self-awareness about this. You have to develop the tools to know when not to talk. It's easy for you to talk, you've got good ideas. You're good at articulating them. You think the world's better off if they hear from you. Sometimes they shouldn't. You have to learn when to govern yourself. If you're a team lead, you may do some governing of people. You may have to say, "Tim, stop talking now. We've heard enough from you, other people need to talk." What you really need to help Tim do is develop the skills to govern himself. That's the growth that you want to see in talkative people.
Also, last point: we want to let people use their airtime to criticize your ideas. Speaking to you, again, as a leader or an influential person, it's important that they be able to use their time to criticize you. That needs to be a safe thing. They need to experience over time that criticizing ideas in the room, particularly the ideas of the leader, is not something that results in harm to them. They don't have negative consequences.
Social Sensitivity
Next, let’s talk about social sensitivity. Again, this may be a thing that’s completely natural for you, and nobody needs to tell you how to read the room, and you’re super good at it. I’m the person who’s good at reading a room, even to a fault. There’s some interesting dark sides to reading the room that you have to look out for. If this isn’t natural for you, here are some tips. First of all, just watch facial expressions. If it’s really hard for you to read facial expressions, and I’m saying watch facial expressions, and you say, “Yes, Tim, I know. Everybody says that. I don’t know what they mean.” Sometimes that is a thing. If that’s you, you can learn. It stinks if you have to, because it’s really hard. You kind of, as it were, have to do this in software. If you feel like other people just do this. This is a feature of the hardware, and they don’t have to think about it. They know what phases mean. You’re like, “I never know what phases mean.” You can do it. You just need to partner with a person who’s got that hardware support. Somebody that you trust, who can tell you, this is what this phase means. This is what this phase means. You can learn it.
On Zoom calls, use gallery mode. Make sure you can see everybody's face all at the same time, because you're going to miss things if you don't. If you're just looking at the speaker, the speaker might say something and somebody else might be hurt or disappointed or put off. If you're not able to see all the faces, you won't know that. Of course, if you're in the room together, just look around, but typically, we're doing a lot of online these days. There's another little cue you can see when you're looking at those faces, which is watching for people who start to talk and then censor themselves. You got to look for that and give that person space. Say, "Looks like you were going to say something, what's up? Everybody else shut up. This person is talking." You really want to clear the deck and make it welcome for them to speak.
Again, I know I’m saying stuff that you might be a total natural at this, and it is second nature and nobody ever has to tell you a thing. Or it might seem like completely impossible. I know just from friends I’ve had over the years who are in that, ‘I have no idea how to do this category,’ you can learn. It’s like me and singing. I try to ask people, how do you make your voice do a certain note? They say stuff like, you just do. That’s like completely unhelpful to me because I don’t want to do it. It’s not an ability I have. Apparently, most people can just do that. Ok, that must be nice. People like me can be taught, it’s just laborious. Or like dancing. I’ve taken some dance lessons. I’m not a natural. I’m like a block of wood, but I can learn. I just have to grind it out a step at a time. It’s never going to be great but you can get there.
Mirroring and Labeling
Here's a way to help draw out emotional responses if you think an emotion is being expressed but you aren't sure what it is. There's this really easy technique, and it sounds stupid, but it works. What you do is somebody says something and you're like, "Ok, I think that was a feeling. I just don't know what the feeling was." Then repeat back the last three words that they said, whatever those words were at the end of the sentence. You could pick three other words if you want. You could just pick the last three words they said, just repeat that back. They take that as a question. People just interpret that as, she heard me and she wants to know more about that thing. It just sounds like this really smart, subtle question. It's bizarre that it works, but it works. Then they'll say some more, and if you still don't know, then pick the last three words, again, of the last thing they said. Keep probing like that. Eventually, you'll start to get an idea. You could say something like, it sounds like you are anxious about that, or it sounds like you are really enthusiastic about that thing. You come up with a label. The first thing is called mirroring. Then you come up with a label for the emotion that you think you're seeing.
The amazing thing is you can be wrong. If you're a person who has trouble reading emotions, you just do this mirroring thing, you get an idea, maybe you're really bad at it, and it's the wrong emotion. Then they'll tell you, because you've been doing this listening, you're bouncing back ideas at them. Then they know, you've put an idea there, and they can either say, "Yes, that's right," or, "No, it's more like this." If you do this, I know it sounds contrived, but this makes you seem like the best listener they've ever known. That listening, this mirroring and labeling, is going to make people feel safe in expressing ideas. You got to trust me on this, it works. If you don't have a lot of emotion words, you can get a thing like this: just Google "emotion wheel" and you'll probably come up with this diagram. You have super simple words in the middle. Then you go out and you get slightly more detailed emotion words. The ones on the outside are so specific that it might be stuff you don't even need at work. This can help give you words for your labeling.
Handling Big Feelings
Sometimes there are big feelings. Something might go a little bit wrong in a situation, or in a debate and somebody gets upset, angry, whatever, and there’s just something big. This is key because usually safety is established in all of the little interactions, not in the big things. You have to get all these little interactions, do those well. The big ones can leave a mark too. How you handle big emotions can help establish the context of safety. There are two steps when somebody is real upset about something or just having a big and probably negative emotion. What you don’t do is say for example, calm down. Nobody has ever calmed down by being told to calm down. In fact, usually, being told to calm down makes people a lot less calm. What you do is reflect and diminish. The reflecting is if somebody comes to you and they’re angry, like, “Can you believe this? We were told that we were going to get to build this on Kafka and we’re not allowed to, and I’m mad about that.” You reflect that a little bit. You don’t tell them to calm down. You don’t tell them they’re bad to be angry. You’re like, “Yes, I can’t believe that.” Just reflect that emotion back a little bit, but a little bit less of it. If they’re up here angry, you’re here angry, a little bit less. What that does is it says, it’s safe for you to be feeling this thing that you’re feeling, and I’m going to bring us down a little bit, to a little bit more control.
Normalizing Failure
Another important thing, normalizing failure. You see this teddy bear back here? This is a real background. It's not fake. This teddy bear is from a place called Build-A-Bear Workshop. That's a business that's 24 years old now, founded in 1997. My girls are all grown up. My boys are all grown up. Spent a lot of money at Build-A-Bear over the years. The founder, Maxine Clark, created something called the Red Pencil Award. This is from something she learned from a teacher when she was a little girl. The teacher, one year, had this award called the Red Pencil Award that was a recognition for the most mistakes made in a weekly writing assignment. That was cool and all, the catch being you couldn't get credit for a mistake you'd made before. What this did was help celebrate failure where lessons are learned. She instituted this at Build-A-Bear, where you could get an award for pointing out mistakes. Not just failure, obviously, we don't want to celebrate that because that's a race to the bottom, but failure where we figured out why we failed. We can learn how not to fail that way again. You want to normalize that thing and make it, in a word, safe and even rewarded to do. Some applications: buggiest release of the year, or worst blog post of the quarter if engineering contributes to the blog. This kind of thing. Just pick something and award that thing, but the person who gets the award has to understand what the mistake was and explain how not to make it again.
Safe to Grow
Also, you want to make it safe for people to grow and not stay in the role that they’re in, especially if you’re a leader. You want people to believe that it’s good for them to outgrow you. That you want to see that happen. You don’t want to say, this is the thing that you’re good at and we’ll just keep you here because you’re good at this and you’re not good at those other things. You want a growth mindset, not a fixed mindset.
Psychological Safety Operationalized
Some points to summarize and cap this off: how do we operationalize this? You ask these questions. I actually recommend this, if you want to try to measure how safe your team is, whether you're in charge or not. If you're not, you could ask your boss. You could say, I saw this talk and I want to do this. You ask these questions with just a 1 to 5 rating, where 1 is strongly disagree and 5 is strongly agree. If I make a mistake, it's held against me. Members of this team are able to bring up problems and tough issues. People on this team sometimes reject others for being different. It's safe to take a risk. It's difficult to ask for help, and so on. No one on this team would deliberately act in a way that undermines my efforts. My unique skills and talents are valued and utilized. I'm free to grow in new areas of interest. Simply ask people what they think.
The catch here is, if it is an environment without safety, people might not be honest. If you sense that you really have a toxic, unsafe environment, asking these questions in a public and visible way is not going to work, so if you can make this anonymous, that might help. You got to watch: if you really sense it, and I think having thought about the concept of safety a little bit you'll know if you're in an unsafe environment, it can be hard to break out of one without help from the outside. That's a culture that's been established by a leader, and that leader is probably going to need a lot of help and a lot of work to be able to get out of that mode, and that might not be a thing that you as an individual contributor can do.
Questions and Answers
Van Couvering: One thing I did see as a common theme was this question of what techniques you have to help get quiet people to talk. You talked about some of them, but particular people talked about how to do this in remote scenarios, for example, managing airtime, social cues, how has that changed remotely? How do you continue to encourage quiet people to come out when you're in a remote scenario?
Berglund: Encouraging gallery mode in Zoom, or Teams, or whatever, where you’re looking at everybody’s face where everybody’s face is there, is a good idea. Encouraging video to be on. I would not recommend requiring video to be on. Make it a normal thing for that to be the case, and a good thing. You got to be careful in the way that you do that. Like I’ve heard some people encourage video to be on with very inappropriate sorts of statements. You want to make sure that it’s a thing where you’re saying, we just like seeing each other’s faces, period, because it makes it easier for us to collaborate. Yes, looking at faces is key there. Somebody has to keep tabs. This is ultimately up to whoever is leading, formally. Somebody has to keep tabs on who’s talking and who’s not. If somebody is just not talking, then what I want to do after a little while is clear the deck for that person, just say, “Everybody else, hold up a second. I haven’t heard from Neve, I would like to hear what she has to say or whatever.” Just be explicit. Again, there’s risk there. If it’s a person who really didn’t want to talk, that might be intimidating for them. To some degree, it is their job. We’re working together, so if somebody really has a hard time talking in a meeting, then you got to work with them privately on that.
Van Couvering: Annie Ruda said that they’ve used whiteboarding sites to help draw out quiet people, because they can just drop a sticky note rather than trying to share in a group. You can just make your comment by writing something and sticking a note on the whiteboard versus having to speak. At the beginning of the meeting, says James, give them a heads-up that you will be calling on them or going around the room, and then give them a concrete prompt rather than the open ended, what do you think? Actually, I’m thinking of my son who cannot bear open ended questions, does not like having video on, and loves to use chat. I can see those techniques really working for someone like him.
Any thoughts on how to start from scratch when building a completely new team?
Berglund: That’s a great opportunity, especially if you’re the one building it, you get to set the tone. If you could keep these principles in mind, and just put them into effect, you can model the appropriateness of disagreement. You’re going to find there’s going to be somebody on the team who is going to be more likely and more willing to disagree with you. Again, these things all fall on a spectrum. There might be shy people who would think I could never do that, but there’s going to be somebody who doesn’t mind so much. Who sees things differently, and they don’t mind telling you. That person is a gift. It’s your opportunity to help build this environment of safety, because you can then model when that person disagrees with you that you’re not a threat. You don’t defend yourself. That’s a good thing that there’s now a new idea to debate. Lose some arguments. Be willing to lose. People are going to see this. Don’t download these slides and go to that new team and say, “Here are all the things I’m doing, this makes you safe.” It doesn’t work that way, they have to experience it. You can be intentional about even acting out some of these things that they then get to experience, and they will intuitively know, even if they’ve never read a word that Amy Edmondson has written, or anything about Google’s Project Aristotle 10 years ago. People know and experience this and behave accordingly, so just model it.
Van Couvering: It’s funny, you personally have to feel psychologically safe to do the mirroring and labeling so that you can make the room more psychologically safe for others.
Berglund: Yes. That’s a thing that anybody can do. That is leading that anybody does, regardless of your technical chops, your juice in architectural discussions, your place in the org chart, none of it matters. If you’ve got that in you, then you can lead out of that strength of yours. Even if you’re feeling like you’re faking it, you can do that mirroring and labeling which requires you to at least act like you feel safe. If you’ve got that strength to bring, you’re leading.
Van Couvering: I’ve seen teams struggle maintaining that sense of safety when they’re under a deadline. On the opposite side is, how do you help hold teams accountable and still maintain psychological safety? When there are constraints like that, how do you maintain that space while still having to meet schedules and maintaining accountability?
Berglund: That’s when it gets tough, because that’s when you feel the danger mechanisms are kicking in. Again, that’s the responsibility of the leader. This is one reason why to lead really does require a lot of self-awareness about these things, and a lot of strength on the inside. You now are the one who’s afraid because your team is going to miss the deadline, and that impacts you in a bad way because it’s your commitment. You have to make sure that you’re aware of what that brain threat mode stuff is doing to you. You can actually stop that in software. You can be aware of it and you can say, “No, I’m going to decide on principle to act differently.” As long as you get out ahead of it and see it in yourself. You have to learn to look for those things in yourself and decide that you will behave otherwise, which is a tall order, I know. You’re not always going to do it. That’s a fact. This stuff is very real, and it will get you sometimes, but the more self-awareness you have, you have a shot at it.
Van Couvering: Do you have any resources for learning how to read facial expressions? I ask this as a neurodiverse person who has trouble with body language.
Berglund: I don’t. It’s exactly your situation that I was talking to, because I know neurodiverse people, you’re like, what? I understand the things I just am like, what about that other people find easy. I don’t have resources, but that’s a good thing that I should.
Van Couvering: My son’s 15, and he is engaging with a social coach. These social coaches are there for children and teenagers as well as adults to help them succeed better at work and relationships in general by teaching them. We’re just getting started but like how to start a conversation, how to know when a conversation is ending, how to read facial expressions, all those things that are maybe obvious for some of us. Someone does math in their head, for neurodiverse people, they’re like, “We need you to show the work on how you read that facial expression and that body language.”
Berglund: Yes, because there isn’t anything obvious about it. Those of us who are neurotypical just need to know that and be willing to be a friend and a help to people we work with that that doesn’t work. Like if you’re going to sing with me, I would need people to, number one, help me. Number two, be patient with me when I usually don’t sound good, because I’m never going to sound good.
See more presentations with transcripts
MMS • Jordan Bragg
Article originally posted on InfoQ. Visit InfoQ
Transcript
Bragg: My name is Jordan Bragg. I'm a software engineer at Castlight Health. I want to spend a little time talking about analyzing code bases and why it's important for being a better developer. Why is it important? The quicker you can jump into a code base and understand it, the quicker you can provide value. This could also lead to you making better decisions on what types of tools and libraries you adopt. It can also give you a better understanding of the risk in certain libraries or tools, or even in your internal code base: what risks exist in the domains that you're analyzing.
Providing Value
Digging a little bit deeper on the providing value aspect. As you become more senior, the more you are reading code, especially in domains that you know little about, and so being able to jump into a code base to provide guidance and support, and not purely friction for engineers, is a big win. Outside of that, sometimes you have to jump into a domain and own it or bring it forward, or deprecate it, or derisk it. Being able to quickly jump in and understand it is vital for some of these time sensitive things. Then, lastly, there are a bunch of open source libraries that we've started to use these days, so being able to understand what you're pulling into your code matters. One of the beauties around open source is that we can read the code and we can iterate and create enhancements or bug fixes, and not just submit tickets and hope somebody fixes it for us. Basically, analyzing code, reading code, writing code, it's all an acquired skill that we get better at over time, the more we do. It's something where you have to use it or lose it, and the less you do, the more you're going to lose that skill.
Analysis
Over time of reading a bunch of code bases both for different companies as well as open source libraries, I see some patterns that I generally follow when I do that. I wanted to codify or structure it a bit, and maybe provide it as a way that could be helpful for you to build yours, or at least reflect on how you do it, and maybe even share some of the things that work for you. Breaking things down into three categories is how I did mine, starting with defining the problem, and doing some planning before we actually just jump into code. Then, if needed, we can explore the code base and break it down.
Define
The define stage is really around defining your problem, which could also be, what is your goal? Are you fixing a very specific bug in some code base? Are you adopting or rolling out some new technology or tool? Or are you trying to understand some complex domain to provide value for teammates and guidance? Or, are you also taking ownership of code? There’s various goals, just make sure you’re clear. The next part is around how much time do you want to invest in this? Do you have an hour, a day, a month? This very much affects how much context you can gather, which leads to the next part of, how much context do you need? Time and the need are intertwined here. Do I need a surface level just so I can provide value and see risks, or do I need to know every component here, because I’m taking ownership and need to innovate on it?
Planning – Context Gathering
Then we move on to the planning phase here. In the planning phase, the first thing we want to do is really just gather some context. This involves seeing what documentation exists. For open source libraries that are supported, there's usually a lot. Of course, they don't list all the bad things that happen in the code, but they do give a lot. This is generally not true for your company, where documentation is either nonexistent, very limited, or stale. A lot of times, as soon as it's written, it's already stale. Then you could also do things like pair program with engineers who know the domain. They can walk you through some bugs or some of the high level flows. Other ways could also be hands-on debugging. Can you get an example, run a simple example, set up your dev environment, write a test? There's various ways to get your hands dirty and go through a debugger. The only caution here is that sometimes setting up your developer environment takes longer than reading the code, at least depending on how much context you need. Be wary.
Planning – Define Entry Points
The next part of planning is really around understanding the entry points that you want to read through. There's the general procedural stuff that we want to read, but there's also some of the "how I got here." If you're looking at a bug, you might have an error or a line number somewhere in the stack where you're interested. Then there's a question of, do you need a larger context to understand how you got here and why it happened? Your entry point might include something further up the stack. If what you're interested in involves some state that was created outside of this flow, then you might want to also identify entry points where that state was created and used. Then, in a similar way, what asynchronous things are going on in your code that you need to include because they affect the things you're mainly interested in? You can add those as entry points.
Interests for Tagging, and Deep-Dives
Once you have an idea of your entry points, and you have some context and terminology, then it gets into what you really are interested in. There are some things that I tag as interests as I read through code. The one of primary interest for me is always I/O. Anytime I'm calling a service, or inserting into a database, or reading from a cache, or even reading and writing from disk, I want to make sure I tag those as important. I also generally care about anything that's configuration or input, so things that I can provide that can change the behavior. Then we get into domains. For domains, if you inherited, let's say, a monolithic system, and you're trying to break it down into microservices or at least understand all the domains, you could, as you're going through these entry points, tag pieces where domains are intermingled, or where you see a domain that isn't part of the core responsibility. Tagging these things could aid in breaking your service up, and you could tie it with domain-driven design: understand the domains and the bounded contexts, and then use this tagging to help detangle.
Then the last three are things that I always tag. The first one is anything that I've identified as a key element or responsibility that warrants digging deeper to understand. Then, if there are other things that you think are important and you want to know about, but they're not key responsibilities of this entry point, I mark those as passive interests. Then, as I'm going through the code, I really want to maximize the value that I'm providing. Anytime I have questions about things that don't make sense, or things I don't understand, I write those down. If I see things that just seem wrong, there's bugs, inefficiencies, bad readability, make sure you mark those too. That way, you could potentially file some tickets or fix it yourself to improve the code. You leave the code better than you found it.
Explore – Breaking it down
Now that we’ve done some defining the problem, planning before we do it, we are looking at, now we’re going to dive in. What does that mean? Breaking the code down into sub-problems is a good path. For that, iterating on some of these entry points, tagging these items of interests, and potentially breaking them apart by those items of interest. Then, really, stay on the surface. Don’t let the depth really pull you away or else you’ll be 10 levels deep, 4 levels removed, very confused about why you’re there. To do this, summarize things the best you can. Make assumptions if it’s not super clear. Question things that you can dig in deeper. You can only just try to simplify how you describe each level.
Exploration Pitfalls
This leads to some pitfalls that I have fallen into. The first one is around understanding every line of code. I always want to understand everything if I'm going to read it. However, the time commitment and sanity are generally limiting factors. You have to be really clear on how much context you really need. Some of the techniques I've used to keep myself on track are, again, the surface level analysis or breadth-first analysis. Try not to go in depth before I can summarize the top level. Then if you're in the top levels, the surface, and there's complex branching logic, if you can't summarize it, then simplify it: if it's not of interest, go over it; if it is of interest, try to follow only the most common path or the path that your entry point follows. Then, if you really need to go beyond the depth to summarize the top level item, then set yourself a depth limit and time box yourself. For me, I try to keep myself at no more than four levels in depth. Then I like to use the Pomodoro Technique to keep me focused and tell me when I should come back to the surface.
A second pitfall that has two parts that I've fallen into in the past is around documenting nothing. Many times, you don't do much planning, you just jump into the code and read, and you don't take any notes or anything. There's a few problems with that. One is that you're really not maximizing the value of your time spent in there. Secondly, like for me, I don't have the memory capacity to consume all that data and be able to verbalize it all and visualize it well. However, this has led to another part of the problem, which is around documenting too much. You can't document every line of code; you have to keep it at a summary, minimal level about what you're really interested in knowing. Keep it high-level. One way I've gotten around this in the past is by taking handwritten notes. I get cramps really quickly, so I have to make sure I keep it at a summary level.
Example
Now that I’ve talked a little bit about the structure that I follow internally, I wanted to go through a little example, at least briefly. I’m going to define my problem. My problem is that I wanted to understand how Kafka consumer works. For Castlight, we started leveraging Kafka a lot more. One of the key pieces here was the consumer. I wanted to have a base knowledge of how this thing worked. Obviously, there’s a ton of other questions I have here around, how do I ensure it’s exactly once? How does rebalancing work? How chatty is the consumer? What is the system impact? These are all things I care about. I feel like there, I could build on from a base concept of, how does it work? How much time do I have to spend in this? Let’s just say I have a day.
Gathering context, I spend a good chunk of my time commitment on gathering context. There were so many good articles around that explained how a Kafka consumer works, even exactly at the code level. It's very likely you don't have to go that deep unless you're wanting to contribute or understand specific aspects within. There are great resources. Just looking at a simple example here of entry points: a simple example of a Kafka consumer involves creating a consumer, subscribing to topics, consuming records, doing something with them, and then closing. There's a lot of internal things going on, if you've read context about the consumer, around committing offsets, joining the group, metadata, and all that. I want to understand why that happens.
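For reference, a minimal sketch of those four entry points using the standard Kafka client API; the broker address, group ID, and topic name are placeholders:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ConsumerEntryPoints {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // placeholder group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // Entry point 1: the constructor -- pure instantiation, no I/O.
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Entry point 2: subscribe -- records intent, still no I/O.
            consumer.subscribe(List.of("demo-topic")); // placeholder topic
            // Entry point 3: poll -- group join, metadata, and fetch I/O all happen here.
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            records.forEach(r -> System.out.println(r.key() + " = " + r.value()));
        } // Entry point 4: close, via try-with-resources.
    }
}
```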
Looking at the first entry point here, which is just the constructor. My thought here is that it's just going to be some simple instantiation, storing some of the configuration I passed in. Yes, no I/O, hopefully. Looking at the code, it is initially a bit overwhelming. You don't have to read all this. It's around 150, 160 lines of instantiation logic. I wanted to start tagging these interests and breaking it down a bit. Doing a few passes over this code, the first pass I'm going to do is to tag interests of state that is being managed in the Kafka consumer. If I do one pass through and identify each piece of state, I now have 19 different things that this thing manages. If I do another pass, and say, this time, I want to identify interests of configuration, so what inputs that I provide change the behavior here, I can go through and list each one. You could see that just at a surface level, there's 34 different configurations. I'm listing which variables they're tied to so I can get an idea of what state exists here, what state is managed by this Kafka consumer, which is a lot.
Doing another pass, we're going to say which things are of passive interest and which things are key to the Kafka consumer. In the first page of code here, I saw mostly three things of passive interest. I'm going to list those and why. The GroupRebalanceConfig, EnableAutoCommit, and metrics: these are things that I mostly care about just because I'm passing configuration or input into them, and so, of those 34 configs I had, there are some that aren't included here. I'd like to understand those.
Going into the second page, I start to see things that are key to polling records. What topics am I subscribed to? The metadata about the Kafka cluster and topic ownership, which is the metadata. I see a piece here around how we get the bootstrap servers to resolve the hosts. There is a question here of what I/O happens, which ends up just being a DNS resolution. Then this K4 is interesting. There's this metadata bootstrap, and you're giving it the broker host. There's a question here of, does this do Kafka API calls? We could set it as a point of interest and dig deeper. Just to give you the brief answer: it does not. It just sets some internal state on nodes.
Moving on, there’s this.client on K5 down below. Just glancing at, it appears to be the main in I/O client that talks between the consumer and Kafka brokers. If we move on to the next page, you could see that that client is passed into K7 and K8. Going back a bit, K6, we’re talking about how we assign topic partitions to different consumer group instances. That’s of interest. Then the last two, the coordinator. The assumption here is it does everything minus actually get records and deserialize them. Whereas the fetcher below is what’s actually polling records from the topic and doing deserialization. One thing that I’ll pause and say that’s of interest here is on K7, there’s this group ID is present. If it’s not present, it’s null. There’s a question here of, what is the behavior of the Kafka consumer if you don’t have a coordinator? How many people now need to assume that the coordinator could be null? I’m sure there’s a good story for that.
Moving on, looking at the next section, we're looking at the subscribe. Here again, we're looking for, is there any I/O going on? Is it state based? What's going on? If we take a quick look, this piece of code is a lot more readable than the last one. There's not a lot of state being set here. Scanning through for passive and key interests, of interest here is that K1 and K5 are acquiring and releasing a light lock. Since the Kafka consumer is not thread safe, a lot of these operations lock so they can't be interleaved. The next part is around validation. There's this K2, fetcher clear buffer for unassigned topics. I don't think this applies to us. It's something that we should maybe dig into and understand why this happens. In K3 and K4, we're talking about subscribing to a topic, and then requesting a metadata update. There are questions there of, does this actually reach out to Kafka and update the metadata, or assign subscription metadata? Does it do anything with Kafka itself? The answer, if you set a depth limit of four and you go three levels deep, you'll see that the subscriptions and metadata are pretty much just updating state in instances within there, and there's no I/O. It's pretty simple.
Moving on to the last one, which is the poll. This is the beast of the three. We won't go too deep on it. Taking a quick look here, there's not a lot of state being set that we see. However, a few things of interest. Again, with K1 and K7 we're looking at obtaining the light lock and releasing it at the end. Looking at some metric of this process, there's some validation. We have a passive interest on this client TriggerWakeup. We don't need to dig into it right now, I don't think. Then K3 and K4 are of interest, because it seems like those are the two core things. UpdateAssignmentMetadataIfNeeded, my assumption here, especially given its WaitForJoinGroup, is that it does everything with the coordinator minus polling actual records. This is part of branching logic where one is in a while loop and one's a single call. Bubbling up to our initial entry point, we'll see that it always falls into the first one, and that the second one is deprecated. Then K5 and K6 are around transforming the data before you return it, and an optimization to call forward. Really, what we care about here is K3 and K4.
Kafka API Calls
Just giving you an idea here, if we dig a few levels deeper in those two methods, one thing we’ll bubble up here is what API and I/O calls are done between the consumer and Kafka. Just to give you an idea, the UpdateAssignmentMetadataIfNeeded, makes up to 11 different API calls to Kafka, depending on what exists already, the first run versus the nth run. Then pollForFetches is primarily just around fetching and getting metadata.
State and Config by Class
Stepping back a little bit, and I'll be able to scan through them quickly. One of the things to do is to create some visualizations of what you know and what's valuable. One of those is creating a simple class diagram from the Kafka consumer of the 19 items that we cared about, tying them to the main classes that we found of key interest. Then, the main piece of interest here is that I'm tying all of the configuration state to where it's being used. If I care about where request timeout milliseconds is used, I know it's in both the Kafka consumer and the consumer network client. I think that's it, yes. If I want to understand how it's used in each one, I can dig deeper. Now I can summarize my overall learnings from that reading, which is that there's a lot of state, which is risky. It's not thread safe, so be careful. There are a lot of configurations that can turn so many knobs. The logic in the creation of the consumer and in the subscribe is very minimal, just setting state. Then in the poll, there is a huge amount of logic that happens sequentially every run. We know the two pieces that we really want to dig into more if we need more context.
What’s Next?
Rolling it back. What is next? We did a little bit of verbalizing and visualizing, but continue doing more, iterate on any new entry points to get more context. Make sure that you stop when you have enough context. I think that’s pretty important, since you don’t need to read every line of code. Then, as you do this over and over, you’ll improve your own process, have better tools, remove or add steps.
Questions and Answers
Van Couvering: Definitely a theme that maybe you can talk to more is the tooling around this. How do you actually do the annotation is one big question. Then another one is about overall management of your notes, so you don’t get lost in your own notes. Maybe you can talk more to those questions.
Bragg: I think that there’s going to be no perfect answer on the tooling piece. As you were saying, David, you looked around a little bit, and there’s not a ton. There’s obviously room for improvement here. I think in the past, I’ve locked myself into writing some of these notes, and then summarizing them into more formal documentation in Lucidcharts and things like that. I think that there’s room for improvement. I would love to see more like IDE tools that you can use that add some associated metadata, and then you can upload that into Confluence or something.
Van Couvering: Yes, or even be able to annotate, and have those annotations be part of the code itself that gets checked into GitHub, some standard annotation. The closer you have is comments.
Bragg: You have comments. I think that we also try to document code with like BDD tests and things like that. As we’ve seen, those are hard to keep up to date too.
Van Couvering: One person did mention that VS Code has bookmarks they can add to the code. We talked about using breakpoints to annotate code and then doing notes associated with that.
I was going to ask you about the visual stuff. You had one example of the class diagram. Another thing that I know I’ve done is flow diagrams, whether it’s sequence diagrams or the diagram where you have different entities and they just show the steps using one through four, the step numbers, things like that, to show the flow.
Bragg: I think that depends on the type of entry point that you’re looking at. The example that we did, the first one was building up state, which I feel a little bit like a sequence diagram is not great for. I care a lot about all the configuration and so the class diagram made sense there. Then the later parts where you can have procedural, that’s where those flow diagrams and sequence diagrams can be really helpful.
Van Couvering: Did you use a tool to autogenerate the class diagram, or did you create it by hand just for the properties you note as important?
Bragg: I tried to use some of the tools, but the tools brought in a bunch of other variables that weren't part of the initial instance. In a time crunch, I manually created that one.
Van Couvering: I actually have found that tools can be almost as overwhelming as the code, because the class diagrams, they don’t know how to separate the details that are important from those that aren’t, and so you end up with a class diagram that can be as confusing as the code. I actually learn better when I draw them by hand. It helps that sink into me more when I actually draw it by hand.
Bragg: IntelliJ has an autogenerated class diagram thing. If you say like, add variables or add methods, it adds everything, and you don’t have any way of saying, I only want to see some of the stuff I care about.
Van Couvering: There’s a couple of questions related to instrumentation, tracing tools, and static analysis. Have any of these tools helped you in the past, or have you found them to be useful?
Bragg: I think that it depends on identifying some of the things you actually want to read. Sometimes I'm like, I don't even know where to start with some of these entry points, if I'm looking at a library or whatever. Observability tools can be good to point me in a certain direction, or tell me what state exists. I'm a big proponent of tracing, instrumentation, debugging, things like that to help you walk through. Even for this stuff, for example, it probably would have been good if I could have used a debugger to see what the state was, because I ended up having to spend a bunch of time looking at the instantiation, because there was that much state.
Van Couvering: How do you share what you learned with the rest of the team to make sure no one has to dive into that same code base or knowledge again?
Bragg: For me, I’m a very visual person. To me, it’s always about creating visualizations, explaining with examples. When I write these notes, a lot of times, it’s very rough for me, and that’s why the tagging is really important in summarizing what the code does, and not trying to document every line. Because then I want to take that and make it readable for everybody, whoever the audience is, and create visualizations and things like that. It’s an iterative process.
Van Couvering: Once you’ve done that, there is still the risk that it will also get out of date, like any other documentation.
Bragg: It is true. That is the documentation dilemma where, as soon as you write it, it's stale. I think that that's where it would be good to have good tooling and have a way to associate it with the code in Git, because then if they change the code in merge reviews, you could have it at least notify you that your documentation needs to be updated.
Van Couvering: I just was curious what has been one of your more challenging code bases. Then when you conquered it, what benefits did you see out of finally getting it figured out?
Bragg: Some of the most challenging code bases are usually the ones internal, because there’s zero documentation and stuff. I think that, to me, I like understanding, getting at least a base understanding of how something works. That helps me understand when I see issues or if I need to do reviews or rewrite it. It’s really useful. I’m trying to give, like you said, one big win. I think that I do this all the time. To me, it’s just like, it was codifying something that I already do all the time, because I find it necessary. I can never just jump into these code bases and just write code without understanding it somewhat. For this Kafka example, we started using it pretty heavily and there’s so many different weird issues that pop up. Having at least even like the base understanding like we walked through, is really valuable to understand like, “I understand that happens because there’s all this internal state that it managed, and the metadata is out of date,” or whatever. I feel like you save a lot of time on your day to day, once you understand it.
Van Couvering: I do want to share it, because part of the reason I invited Jordan is because he and I worked together at Castlight, and I saw incredible value when you did all that work studying Kafka, and how it enabled us to be much more successful with it. I do think it is a superpower to have these skills and be able to quickly learn a code base and then apply it to what you’re trying to build in-house. I found it really valuable. I’ve always been impressed with Jordan’s ability to quickly understand code. Now I know the secrets too.
Bragg: It seems straightforward to me. Like I said, I do wish there was more tooling. I think that where internally I find it most useful is, at Castlight, we've been going more towards microservices and breaking up our domains and stuff. Being able to jump into these big monoliths and understand the domains and break them apart is really useful, especially if you own some table and you don't know that somebody else is going directly to said table. It's good to dig in and understand those things.
See more presentations with transcripts
MMS • Renato Losio
Article originally posted on InfoQ. Visit InfoQ
Google Cloud recently announced the general availability of Certificate Manager, a service to acquire, manage, and deploy TLS certificates for use with Google Cloud workloads.
Announced in preview earlier this year, the new service supports both self-managed and Google-managed certificates, and has monitoring capabilities to alert for expiring certificates. Ryan Hurst and Babi Seal, product managers at Google Cloud, explain:
You can now deploy a new certificate globally in minutes and greatly simplify and accelerate the deployment of TLS for SaaS offerings. Coupled with support for DNS Authorizations, you can now streamline your workload migrations without major disruptions.
Google-managed certificates are certificates validated with either load balancer or DNS authorization, which Google Cloud obtains, manages, and renews automatically. Certificate Manager also supports self-managed certificates: X.509 TLS certificates that the customer obtains and uploads manually to the service.
Certificate Manager integrates with External HTTP(S) load balancers and Global external HTTP(S) load balancers, but only on the Premium Network Service Tier. After validating that the requester controls the domain, the new service can also act as a public Certificate Authority to provide and deploy widely-trusted X.509 certificates. Hurst and Seal add:
During the certificate manager private preview of the ACME certificate enrollment capability, our users have acquired millions of certificates for their self-managed TLS deployments. Each of these certificates comes from Google Trust Services, which means our users get the same TLS device compatibility and scalability we demand for our own services. Our Cloud users get this benefit even when they manage the certificate and private key themselves–all for free.
Announcing the general availability, the cloud provider added a number of automation and observability features, including previews of a Kubernetes integration and of self-service ACME certificate enrollment. A plan to leverage Terraform automation was announced too.
Per Thorsheim, founder of PasswordsCon, comments:
Very happy to see Google Trust Services being DNSSEC signed & having a proper CAA record (obviously!). Still want to nudge towards signing google.com though (…) Similarly, seeing the lack of MTA-STS & TLS-RPT records makes for sad clown GIFs, when Google themselves is (was?) promoting their use.
With Amazon offering AWS Certificate Manager (ACM) since 2016, Google is not the only cloud provider with a managed certificate service. Certificate Manager is also not the only option for managing a certificate on Google Cloud: if the deployment does not require wildcard domains and has fewer than 10 certificates per load balancer, Google suggests uploading the certificates directly to Cloud Load Balancing.
There are no additional charges to use Certificate Manager for the first 100 certificates, with a per-certificate, per-month pricing structure for further certificates.
MMS • Mingxi Wu
Article originally posted on InfoQ. Visit InfoQ
Transcript
Shane Hastie: Good day, folks. This is Shane Hastie for the InfoQ Engineering Culture podcast. Today, I'm sitting down with Mingxi Wu. Mingxi, thank you so much for taking the time to talk to us. You are the VP of engineering for TigerGraph. A useful starting point: who's Mingxi, and what's TigerGraph?
Introductions [00:39]
Mingxi Wu: Thanks, Shane. Hello, everyone. This is Mingxi Wu. I'm currently the VP of engineering at TigerGraph. I graduated from the University of Florida with a PhD specializing in databases and data mining; both topics span my research area. I stayed six years at the University of Florida, and then my first job was at the Oracle headquarters, where I worked in the query optimizer group for three years. My main job was fixing optimizer bugs and creating new features for the relational database optimizer. After that stint, I joined a startup called Turn, an advertising startup managing online users. There, I used Hadoop and Spark to manage big data. After three years at Turn, I joined TigerGraph in 2014, and I have been with TigerGraph for six years. TigerGraph is the market leader in managing graph data and providing real-time insights on connected data. That's my background.
Shane Hastie: Thank you very much. Now, in your role as VP, one of the things we were talking about was the engineering culture that you have built at TigerGraph. You want to tell us a little bit about that, please?
The TigerGraph engineering culture [02:00]
Mingxi Wu: So at TigerGraph, we really value engineering talent, and we really think the engineering culture has to come first before TigerGraph can achieve any visible success. We spend a lot of time making sure our culture is right, and we have our company core values published on our website. Also, as an engineering leader, I make sure that I provide full transparency and a learning environment for our engineers so that they can keep learning and contribute back to the product.
Shane Hastie: One of the things that you mentioned to me is that through the pandemic, you managed to actually double the size of the team. How did you do that in a remote environment and maintain the company culture and the values that you're trying to build?
Manage agreements to empower people [02:51]
Mingxi Wu: That's a very good question. When the pandemic came, we just started hiring more team members. Most of the positions were created to fit the pandemic dynamics, so most engineers were hired remotely, spread out across the country and across the globe. It was really a top priority for me and for the company to help the new hires integrate into the team, become productive contributors, and not feel isolated in their home working environments. What I found works during this transition is really focusing on managing agreements instead of managing people. I do value people, but once I started managing agreements, I found people were empowered: they had my trust, and they also had clarity on what they were expected to accomplish for the next quarter. This mentality really helped us work collaboratively, distributed across the globe. That's the first thing I did: manage agreements.
Shane Hastie: So let’s delve a little bit further into that if we may. What does it mean to manage agreements rather than managing individuals?
Mingxi Wu: We actually define a project for each team, and the projects are centered around each team's mission. The team is crystal clear about the top three missions they want to accomplish this quarter. Those missions are written down in the team's shared PowerPoint slides. Then, around the written-down missions, the manager and the engineers define or refine the projects, the concrete deliverables. We write down those concrete deliverables as agreements between managers and engineers. Then we have a process to deliver each project. We start with stage one: each agreement needs a design doc, and the design doc needs to be reviewed by the team.
Once the project passes the design review, the engineers and their teammates work on the implementation in sprints, and they show progress each week. In the end, they give a presentation and hand the user documentation to the technical writer as the final stage. This agreement across three stages works really well. Engineers are very clear on the expectations, and the product manager and the technical writer understand what is going to be delivered. That is how we manage agreements.
Shane Hastie: So one of the metaphors that you used when we were talking earlier was the driver versus the passenger mindset. How does that play out?
Driver vs passenger mindset [05:49]
Mingxi Wu: This is related to my personal experience. My first job was at Oracle, and Oracle is an established database company; it's been there for many, many years. When I first joined Oracle, my daily job was fixing bugs, and I really had a passenger mindset. Whatever my manager asked me to do, I would just accomplish, but beyond that, I didn't make any extra contribution to the company, and I felt very comfortable: work and life were very balanced. Then, coincidentally, I joined the startup world. Ever since I joined my first startup, and then TigerGraph, my second startup, my mindset has shifted. I realized I was sensing urgency every day. Even though I have stayed at TigerGraph for eight years, that sense of urgency has stayed with me for the past eight years; it never faded.
So I sat back and realized that I had shifted my mind from the passenger side to become a driver. Being a driver means there is no established business and no established product. At a startup, you are establishing a new market and creating a new product that people have not seen before; there's no established model that you can copy. That's why you feel there are new issues popping up with each new product release and new issues to fix, and new difficulties that you never encounter until your early adopters share them with you. All these challenges accompany you every day. In order to take the startup to the next stage, you have to keep a driver mindset, to direct the scarce resources at hand to solve the competing priorities and new challenges every day. That's what I have observed in my past eight years at TigerGraph.
Shane Hastie: So how do you motivate this driver mindset in the people in your team and your engineers?
Motivating the shift [07:58]
Mingxi Wu: Most people, I think including myself, join a startup with the mindset that they can make a good fortune when the company goes to IPO. I joined TigerGraph when I was younger, and I was thinking, "Oh, if this hypothesis works, then I can make a million bucks and maybe I can retire." That really was my mentality eight years ago, but later, my mindset shifted. I saw more and more Fortune 100 companies relying on our technology to maintain their daily business operations, and I feel that what really fills me with energy to come to work every day is building a sustainable business that benefits and impacts people's lives with new technology.
My mindset shifted because the environment changed: going from a hypothesis to seeing concrete deployments of TigerGraph affecting people's daily lives and businesses really made me think bigger and longer term. I share the same mindset journey with my teammates. I tell people that we are here to build something that lasts and really improves people's lives, and the IPO and the stock equity rewards will come naturally. That makes for bigger ambitions, but it is really a good vision that people buy into and then feel proud to contribute to.
Shane Hastie: How do you, in your leadership role, get close to your engineers and build trust?
Practical advice on getting close to engineers and building trust
Mingxi Wu: Basically, I found it's very necessary to have persistent, continuous weekly one-on-ones with your direct reports. I have, I think, 10 direct reports now, and I have either an in-person or a remote one-on-one every week on my calendar. For people who don't reside in the same city as me, I use Zoom for a recurring weekly meeting. In the weekly meeting, I maintain a shared Google doc with my direct report. Before each meeting, either participant, me or the other party, can write questions, suggestions, or discussion points in the Google doc. When we meet, I open the Google doc and discuss with the participant to solve the problems and reach agreements. The Google doc also records, week by week, the journey of what happens between me and my direct report.
I found it's really important to do it consistently throughout the year. At the end of the year, when I review the Google doc with my direct report, we see we really accomplished a lot together. Besides the weekly one-on-one, the second thing is that I travel a lot: I travel to the different cities where we have engineering teams and meet them quarterly, face-to-face, to discuss their project progress, their suggestions for improving our current product, and what's going on in the other teams. It's really important for me to travel to the different engineering locations to meet them quarterly.
Shane Hastie: How did you achieve that when we weren’t able to travel?
Mingxi Wu: At the beginning, when the airlines shut down, I just used Zoom, but once the vaccines rolled out and the situation got better, I traveled. I did get COVID during one of the trips, but I recovered within three days. So it's a limitation, but I think the current COVID situation still permits me to travel quarterly. If I cannot travel, I just use Zoom weekly.
Shane Hastie: One of the topics that you mentioned again in our conversation earlier was this concept of a role org chart, which you said is very different from the HR org chart. What is a role org chart, and how does it help in terms of assigning accountability?
Role org chart rather than HR org chart [12:10]
Mingxi Wu: This is one of our inventions; I call it an invention because I never saw it in any textbook or any management book. The challenge I met when we got the headcount to build a global technical support team was the question HR asked me: how many people do you need in order to support hundreds of TigerGraph customers across the globe, when we expect to double the customer base? This simple question, how many headcount do I need, stumped me; I didn't have an answer. The only resource I had was one technical support manager, who had three direct reports at that point. I discussed it with him in our one-on-one meeting, and he said, "Hey, Mingxi, if you let me do the ratio calculation, I can calculate how many technical support engineers we need based on our customer count." But obviously, we knew that naive solution would not work.
Then I sat down with him and asked: how do we divide the functional work for technical support engineers? What are the components? What are the roles you need in order for the technical support team to operate? We drew it out in a shared Google doc: we need one team function unit for on-prem customer support, another team for cloud support, a manager to review weekly tickets, another manager to build training material and continually educate the team on new product release features, and so on. So we built a role org chart, where each role has a particular, well-defined function it is responsible for. Then we drew that role org chart in PowerPoint and started assigning a job description to each role.
Looking at this role org chart, we could see the organic operating model. We then estimated how many employees we need for each role, which gave an accurate prediction of the headcount. Once we worked out this role org chart and provided an accurate forecast, we started hiring and writing job descriptions. We have since extended this role org chart methodology to our quality engineering team and also to our development team. It seems to work really well: people in a startup come and go, but the roles stay, the business operating model stays, and the job descriptions stay. Sometimes we adjust the role org chart a little based on how the business evolves. It really provides clarity and efficiency across HR and our hiring needs.
Shane Hastie: And for the people in the team who receive these role descriptions, how does that help them align their work?
Engaging new people with the role org chart [15:08]
Mingxi Wu: For each sub-engineering team, on their first day, we share the role org chart for the team that the new hire is going to join. She gets an understanding of how the team operates, and when she needs help, she knows which role is the right person to ask. She will also explore the different role descriptions and get her hands wet on one of the roles. After three or six months, they can talk to their manager about switching roles. It also provides career opportunities for the engineers, and clarity across the engineers.
Shane Hastie: What else do you do to build your team’s cohesiveness and sharing of knowledge?
Providing sustainable learning environments [15:58]
Mingxi Wu: One thing from my personal experience is that after I got out of school, I felt I didn't have guidance on what I wanted to learn, beyond doing my daily routine of fixing bugs or building features. I felt a little bit lost, because I was used to a curriculum where I could see the milestones of the knowledge I was learning, and entering industry does not come with that luxury of guidance from school professors. So I wanted to provide a sustainable learning environment for new graduates as well as for veterans. The technology landscape is changing every day, and no one can say their past knowledge will be enough for an advanced startup job. What I did is create a weekly knowledge-sharing team meeting: every Thursday, 9:30 to 10:30, I fix that timeframe for my engineering team, and one engineer will prepare and then talk about one troubleshooting technique, one algorithm complexity problem they solved, or one research paper they read that can help our product.
I have been doing that for the past two years, and if I'm traveling, I have a substitute host; we never miss it. That persistence has turned out to be a really good investment in the engineers. I try to aggregate the knowledge of 70 people through this knowledge-sharing cadence, and everyone participates. I also built it into the personal development KPIs, so everyone needs to present at least once every half year. I ask each presenter to share the slides before the meeting so people can get the background on what they are going to learn and can really digest it within that one hour, with live questions. This learning environment is part of the investment in the engineers' knowledge building, and it also helps engineers learn how to present deep technical results to their peers. I found it's very effective and people really love it, including our interns.
Shane Hastie: Thank you very much. Some really interesting topics there. If people want to continue the conversation, where do they find you?
Mingxi Wu: They can connect with me on LinkedIn, search my name, Mingxi Wu, or they can email me at mingxi.wu@tigergraph.com. My email is just my first name.my last name@tigergraph.com.
Shane Hastie: Thank you so much.
Mingxi Wu: You're welcome, and thank you for providing the opportunity.
MMS • A N M Bazlur Rahman
Article originally posted on InfoQ. Visit InfoQ
JEP 429, Extent-Local Variables (Incubator), was promoted from its JEP Draft 8263012 to Candidate status. This incubating JEP, under the umbrella of Project Loom, proposes enabling the sharing of immutable data within and across threads. This is preferred to thread-local variables, especially when using large numbers of virtual threads.
In this JEP, instead of ThreadLocal, a new type, ExtentLocal, is proposed. An extent-local variable allows data to be safely shared between components in a large program. Usually, it is declared as a final static field, so it can easily be reached from many components. It is written once, immutable, and available only for a bounded period during the thread’s execution. Consider the following example:
class Server {
    // Requires the incubating jdk.incubator.concurrent module (JEP 429)
    final static ExtentLocal<Principal> PRINCIPAL = ExtentLocal.newInstance();

    void serve(Request request, Response response) {
        var level = (request.isAdmin() ? ADMIN : GUEST);
        var principal = new Principal(level);
        // PRINCIPAL is bound only for the extent of the run() call
        ExtentLocal.where(PRINCIPAL, principal)
                   .run(() -> Application.handle(request, response));
    }
}

class DBAccess {
    DBConnection open() {
        var principal = Server.PRINCIPAL.get(); // reads the value bound by serve()
        if (!principal.canOpen()) throw new InvalidPrincipalException();
        return newConnection();
    }
}
Typically, large Java programs are composed of multiple components that share data. For example, a web framework may include server and data-access components, and the user authentication and authorization objects need to be shared across those components. The server component could create the object and then pass it as an argument to each method invocation, but this way of passing arguments is not always viable, because the server component may first call untrusted user code. The available alternative is ThreadLocal. Consider the following example using ThreadLocal:
class Server {
    final static ThreadLocal<Principal> PRINCIPAL = new ThreadLocal<>();

    public void serve(Request request, Response response) {
        var level = (request.isAuthorized() ? ADMIN : GUEST);
        var principal = new Principal(level);
        PRINCIPAL.set(principal); // writes this thread's copy of the variable
        Application.handle(request, response);
    }
}

class DBAccess {
    DBConnection open() {
        var principal = Server.PRINCIPAL.get(); // reads this thread's copy
        if (!principal.canOpen()) throw new InvalidPrincipalException();
        return newConnection();
    }
}
In the above example, PRINCIPAL is a ThreadLocal instantiated in the Server class, where the data is initially stored, and later read in the DBAccess class. By using the ThreadLocal variable, the server component avoids having to pass the Principal object as a method argument when it calls user code, and the user code in turn calls the data access component.
Although this approach looks compelling, it has numerous design flaws that are impossible to avoid (the sketch after this list makes the first two flaws concrete):
Unconstrained mutability: Every thread-local variable is mutable: its get() and set() methods can be called at any time, and the ThreadLocal API is designed to support this. This enables a general communication model in which data can flow in either direction between components, which leads to spaghetti-like data flow.
Unbounded lifetime: Memory leaks may occur in programs that rely on the unrestricted mutability of thread-local variables. Because developers often forget to call remove(), per-thread data is often retained for longer than necessary. It would be preferable if the writing and reading of per-thread data occurred within a limited timeframe during the thread’s execution, thereby eliminating the possibility of leaks.
Expensive inheritance: When utilizing a large number of threads, the overhead of thread-local variables may increase because child threads can inherit thread-local variables from a parent thread. This can add a significant memory footprint.
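To make unconstrained mutability and unbounded lifetime concrete, here is a minimal sketch, not taken from the JEP; the ThreadLocalPitfalls class, the USER variable, and the process() method are illustrative names. It runs two tasks on a single pooled platform thread:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalPitfalls {
    static final ThreadLocal<String> USER = new ThreadLocal<>();

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1); // one reused platform thread
        pool.submit(() -> {
            USER.set("alice"); // unconstrained mutability: any code on the thread can write
            process();         // a callee silently overwrites the caller's value
        }).get();
        pool.submit(() -> {
            // unbounded lifetime: the pooled thread still carries the last value,
            // because nobody called USER.remove()
            System.out.println("stale value: " + USER.get()); // prints "mallory"
        }).get();
        pool.shutdown();
    }

    static void process() {
        USER.set("mallory"); // data flows in either direction between components
    }
}

The second task observes state left behind by the first, which is exactly the leak and the spaghetti-like data flow the JEP describes.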
With the availability of virtual threads (JEP 425), the problems of thread-local variables have become more pressing. Multiple virtual threads share the same carrier threads. This allows us to create a vast number of virtual threads. This means that a web framework can give each request its own virtual thread while simultaneously handling thousands or millions of requests.
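As a rough illustration of that scale, the following sketch, assuming a JDK 19 build with preview features enabled (the class and method names are illustrative), submits a million tasks to a virtual-thread-per-task executor:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MillionRequests {
    public static void main(String[] args) {
        // JEP 425: one cheap virtual thread per submitted task
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 1_000_000; i++) {
                int requestId = i;
                executor.submit(() -> handle(requestId));
            }
        } // close() waits for all submitted tasks to complete
    }

    static void handle(int requestId) {
        // per-request work; if this code used inheritable thread-locals,
        // each of the million virtual threads could carry its own copy
    }
}

With thread-local inheritance, each of those virtual threads would pay the per-thread copying cost that the JEP calls expensive inheritance.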
In short, thread-local variables have more complexity than is usually needed for sharing data and come with high costs that cannot be avoided.
This JEP aims to solve all these problems with ThreadLocal and provide a better alternative.
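A hedged sketch of how the proposed API addresses the three flaws, written against the incubating jdk.incubator.concurrent.ExtentLocal type as described in the JEP (the demo class and the USER variable are illustrative, and incubating method names may still change):

import jdk.incubator.concurrent.ExtentLocal;

public class ExtentLocalDemo {
    static final ExtentLocal<String> USER = ExtentLocal.newInstance();

    public static void main(String[] args) {
        ExtentLocal.where(USER, "alice").run(() -> {
            System.out.println(USER.get());      // alice; there is no set(), so no mutation
            ExtentLocal.where(USER, "bob").run(() ->
                System.out.println(USER.get())); // bob, but only inside the nested extent
            System.out.println(USER.get());      // alice again: each binding's lifetime is bounded
        });
        System.out.println(USER.isBound());      // false: nothing to remove(), nothing can leak
    }
}

Rebinding in a nested extent replaces mutation, and the binding automatically disappears when run() returns, which eliminates the remove() bookkeeping that ThreadLocal requires.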
Developers interested in discussing this new ExtentLocal class may visit this Reddit thread.