Month: December 2022
The InfoQ trends reports provide a snapshot of emerging software technology and ideas. We create the reports and accompanying graphs to aid software engineers and architects in evaluating what trends may help them design and build better software. Our editorial teams also use them to help focus our content on innovator and early adopter trends.
One of the core goals of InfoQ is to spot the emerging trends in software development that have broad applicability and make our audience aware of them early. Individual editors are constantly watching for new technologies and practices and then reporting on them. In addition to writing news for the InfoQ readers, our editorial teams also regularly discuss what they’re seeing, and how companies are adopting innovative ideas.
The process to create the trends reports is extremely collaborative, and draws on the collective experience of our practitioner editors. Our real-world experience helps us look past marketing hype and look for new patterns that are actually being put to use.
Following the ideas of Geoffrey Moore in Crossing the Chasm, we categorize the trends into adoption categories. We identify what is new and innovative, and reevaluate earlier trends to see how their adoption has changed since the last report. If you’re trying to understand if a technology you just heard about is brand new, or just new to you, the trends graphs are a great tool to judge how the broader industry is using it.
This eMag is an anthology of the InfoQ Trends Reports for 2022. Each trends report focuses on one persona or category. Because software engineering is rarely confined to a single area, your individual interests likely overlap with many of the reports. The culture and methods report is clearly the most cross-cutting, as it covers how teams are building and maintaining software.
You will also find some individual trends listed on multiple reports, sometimes with differing adoption levels as they pertain to a different audience. For example, cloud-native development and gRPC are in earlier stages of adoption within the .NET community compared to software architecture and cloud development in general.
Data was a major component of several trends reports this year, beyond the AI, ML & Data Engineering report. The Architecture & Design report described “data plus architecture” as a need for software architecture to adapt to consider data. The DevOps and Cloud report had data observability as a critical need. Data mesh also showed up in multiple reports.
New trends reports are published throughout the year. You can see all the historical reports here. From there, you can follow the “InfoQ Trends Report” topic to always be informed when one is released.
With the inclusion of all the reports published in the past twelve months, we hope this eMag can serve as a helpful single reference for trends across the overall software development landscape.
We would love to receive your feedback about this eMag via editors@infoq.com or on Twitter. I hope you have a great time reading it!
Matt Campbell
AWS has released Finch, an open-source, cloud-agnostic, command-line client for building, running, and publishing Linux containers. Finch bundles together a number of open-source components such as Lima, nerdctl, containerd, and BuildKit. At the time of release, Finch is a native macOS client with support for all Mac CPU architectures.
According to Phil Estes, Principal Engineer at AWS, and Chris Short, Senior Developer Advocate at AWS, “Finch is our response to the complexity of curating and assembling an open source container development tool for macOS initially, followed by Windows and Linux in the future”. They note that the core Finch client will always be composed of curated open-source, vendor-neutral projects.
In a conversation on the CNCF Slack, Estes elaborated on the focus for Finch:
We are focused on the command line client that can help with the developer’s “inner loop” on a Mac: build, run, push/pull of Linux containers. We also are focused on being an opinionated distribution such that we have a signed .pkg installer that makes it easy for companies that need to plug in Finch to their device management suite (jamf for example).
Some questioned the rationale behind building another tool for container management. User Flakmaster92 wondered on a recent Reddit post “what Finch can— or will— do today that something like Podman can’t do?” Jeongmin Hong, on the CNCF Slack #finch channel, also wondered the same thing, “I wonder how different Docker Desktop or Podman Desktop and this project [are]. What is [the] difference between them?”
Nathan Peck, Senior Developer Advocate at AWS, explained that:
The main difference is that Podman uses CRI-O (or CRI-O libs at least) while Finch uses containerd. At AWS we have chosen containerd for operation at scale, and run incredibly large numbers of containerd tasks for customers of AWS Fargate.
Estes also stated that while Finch is based on many of the same components (Lima, containerd, nerdctl) as tools like Rancher and Docker Desktop, Finch is focused entirely on the command line client.
As Finch is based on nerdctl, most of the nerdctl commands and options work the same as if the tool was running natively on Linux. Finch allows for pulling images from registries, running containers locally, and building images using Dockerfiles. Via emulation, Finch can build and run images for either amd64 or arm64 architectures.
While the Finch core will remain focused on vendor-neutral projects, Estes and Short shared that future plans for Finch include support for downstream consumers to create their own extensions and features. AWS-specific extensions will be opt-in to not “impact or fragment the open source core or upstream dependencies that Finch depends on”. The plan is for extensions to be maintained in their own projects with distinct release cycles.
Once installed, the Finch virtual environment must be initialized via finch vm init. After that, Finch can be started using finch vm start. Running a container can be done via the run command, for example:
finch run --rm public.ecr.aws/finch/hello-finch
The run command will pull the image if it is not present, then create and start the container. The optional --rm flag will delete the container once the container command exits.
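For readers who want to go a step further, the following is a minimal sketch of a typical inner-loop workflow with Finch; the image name, tag, and registry are placeholders for illustration, not anything shipped with the release:
# build an image from a Dockerfile in the current directory (BuildKit is bundled with Finch)
finch build -t myapp:latest .
# run the freshly built image, removing the container when it exits
finch run --rm myapp:latest
# after authenticating to your registry, tag and push the image
finch tag myapp:latest registry.example.com/myapp:latest
finch push registry.example.com/myapp:latest
Because Finch is based on nerdctl, these subcommands mirror their nerdctl equivalents.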
Finch only supports macOS at this time with the team stating that there is a desire for Windows and Linux support in the future. The prerequisites for macOS include a minimum of macOS Catalina (10.15), an Intel or Apple Silicon M1 system, and a minimum configuration of two CPUs and 4 GB of memory.
More details about Finch can be found in the release blog post. The Finch team is actively soliciting feedback and roadmap ideas via GitHub and Slack.
Podcast: 2022 Year in Review: ADRs, Staff Plus, Platforms, Sustainability, and Culture Design
Thomas Betts, Srini Penchikala, Shane Hastie, Daniel Bryant, Wes Reisz
Introductions [00:24]
Daniel Bryant: Hello and welcome to The InfoQ Podcast. It’s that time of year where we do our end of year review and wrap up, looking back at what’s happened this year and also looking forward to what’s exciting in terms of topics and trends and technologies, people and processes that we are all interested in. I’m joined by my co-hosts today, Wes Reisz, Thomas Betts, Shane Hastie, and Srini Penchikala. I’ll let them all do a quick round of introductions, and then we’ll get straight into it.
Thomas Betts: I’m Thomas Betts. I’m, in addition to a co-host of the podcast, lead editor for architecture and design at InfoQ. My day job is application architect at Blackbaud, the leading software provider for social good. The trend that I’m looking forward to talking about today is the evolving role of the architect, how we’re documenting decisions and no longer just documenting designs. Srini?
Srini Penchikala: Thanks, Thomas. I am Srini Penchikala. In my day job, I work as a software architect with a focus on data and AI/ML technologies. At InfoQ, I am the lead editor for data engineering and the AI/ML community. In terms of trends, I'm looking forward to discussing how AI/ML is kind of eating the software world, as they say. We'll discuss more details later in the podcast. Next, Wesley?
Wes Reisz: My name is Wes Reisz. I am a technical principal for Thoughtworks in my day job, and then I chair QCon San Francisco and, most recently, the just-finished QCon+. I guess the thing that I want to talk about, I'm not taking a side here, but there have been some things you've seen on Twitter. DevOps is dead. Long live platform engineering. Again, not taking a side there, but I definitely would like to talk about platform engineering, things like Team Topologies and effective engineering organizations today. Then, I think that goes over to Shane.
Shane Hastie: It does. I’m Shane Hastie. I’m the lead editor for culture and methods, host of The Engineering Culture Podcast. My day job, I am the global delivery lead for SoftEd, and I want to talk about getting back in person, about avoiding hybrid hell. And, how do we maintain team cultures as so much is changing around us?
Daniel Bryant: Fantastic, thank you very much, Shane. Yes, myself, Daniel Bryant. I lead the DevRel team at Ambassador Labs, a Kubernetes tooling company. Also, I’m the news manager at InfoQ as well, a long career in software development and architecture, which I’m super excited to dive into more today. I’m interested in a similar topic to you, Wes, about internal developer platforms. I see them as the jumping off point to a lot of the other things around platforms in general, so super keen to explore that.
How is the role of the architect shifting? [02:50]
Daniel Bryant: But, as we all know, one of our key personas at InfoQ is the architecture persona, the role of the architect. And, how do we all think the role of the architect is shifting now and how it might shift perhaps next year as well?
Thomas Betts: Well, I'll take that to start. We've had some form of this on the InfoQ trends report for A&D for, I don't know, as long as I've been reading it. The architect is a technical leader. The architect is a team lead. No one is quite sure what the architect role should be, but we're watching the innovators coming up with new ways of defining what is the architect role, and how do you serve your teams around you? One of the things that's been coming up repeatedly throughout the year was how it's all about communicating decisions. Some of this came out because of the pandemic and hybrid workflows that people are having to communicate more asynchronously. They're having to write things down and finding that it's not just enough to show a picture. Here's the diagram of the architecture I want. People are asking, why was that designed in the first place?
So, documenting the why behind decisions comes out in the form of ADRs, architecture decision records. These have been around for a while, but I've seen them finally get to a point where companies are adopting them and making them standard practice. At my company, people will have conversations about new features and say, “Hey, do we need an ADR for that?” And, the architect goes off, spends some time figuring out what to do, and writes it down. Then, people can discuss, why did you make that decision? You see the pros and the cons, and then it's more of a collaborative process.
Wes Reisz: Hey, Thomas, before you keep going on that, ADR tends to … People have a different mental model for what an ADR actually is. What is an ADR in your organization?
Thomas Betts: So, the way we use it is it always starts with a question. I’m trying to solve this problem. Here’s a specific scenario. How could I go about doing this? Because, we know the answer to any question is it depends. It’s helpful to walk through what does it depend on? So, I like creating a new ADR template. There’s MADR and there’s other tools you can use that say, “Hey, here’s markdown for ADRs and command line tool.” Create a new one, give it a name, and it flushes out a template.
Then, you fill it in and say, “Okay, what is the decision you’re trying to make? What are the possible options you’re considering? What are the pros and cons of each?” Then, what’s the decision? All of that gets checked into a centralized Git repo. Other people who aren’t architects can review them. People who aren’t architects in their day job can write them. So, you can give it to a team and say, “Hey, here’s the thing that you can start thinking about, and you can start understanding the role of architecture and being an architect. Think through the decisions.” And, people start learning that decision making skill.
Wes Reisz: It’s that context, right? It’s so you can establish that shared context of why in the world did you pick that message bus? Why didn’t you do this? These are the things that were behind it. Yes, absolutely.
Thomas Betts: It’s a collaborative process, and I think that’s what this goes to, is that architects use these first, get the pattern established, figure out what works within our company. Each company, you adapt it to be what you need it to be, but for the big cross-cutting concerns, those get put in a shared repo. Then, inside one project, inside one microservice, we might have, hey, here’s how we’re going to do stuff. The team can then discuss it, but then that just gets saved in the documentation as part of the repo. New developer comes on the team. They can say, “Well, why are you doing it this way?” Well, yes, we had different options. We chose this, and now that why is written down.
It’s also living documentation. You can change your mind, and you don’t have to say, “Well, here’s the diagram that is out of date the day that you finished drawing it.” Then, no one goes back and takes the time to find the original source file to update the PDF. They’re like, “Well, I know what’s changed.” If the mental model is just in your head, then it never gets updated. This puts it in a simple text format. People can update it, and then if they want to make a diagram next to it, Mermaid diagrams fit in really nicely. You can do a Mermaid diagram. So, you can do simple sketches inside the ADR and show, here’s what I’m thinking of. So, seeing how you combine those tools is nice.
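As a rough illustration of the lightweight, MADR-style record Thomas describes, a decision file checked into a repo might look something like the sketch below; the file name, field labels, and service-bus scenario are illustrative assumptions, not an actual template from Thomas's company or InfoQ:
docs/adr/0007-use-a-service-bus-for-order-events.md
Status: Accepted (supersedes ADR 0003)
Context: Order events need to reach several downstream services; how should we distribute them?
Options considered:
1. Point-to-point HTTP calls - simple, but couples services and complicates retries.
2. Shared database table - easy to query, but creates contention and unclear ownership.
3. Service bus topic - extra infrastructure, but decouples producers from consumers.
Decision: Option 3, a service bus topic; the decoupling and retry semantics outweigh the operational cost.
Consequences: Each consuming team owns its subscription; message contracts are versioned alongside the code; a small Mermaid diagram of the flow sits next to this record.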
How do we maintain this type of living documentation and the culture that supports it? [06:27]
Shane Hastie: Thomas, I’d love to see how do we maintain this type of living documentation and the culture that supports it? Because, oh so often, we’ve tried to bring in documentation. Every microservice must have a comment explaining it, and the comment is, “This is a microservice.”
Thomas Betts: I think when you’re talking about … What are things we’ve used? Wes mentioned service bus. If we’re going to use whatever service bus technology or how we’re going to send messages across the wire, how are we going to serialize it? Things like that, if it’s a big … Here’s the company-wide decision. Sometimes, people just show up, and it’s the story of the monkeys that won’t climb up the ladder because the first monkeys got hit with the fire hose. People don’t remember why. Well, having it written down, and then two years later, you’re like, “This seems like it doesn’t make sense anymore.”
We don’t have two years worth of history, but being able to point to six months ago, what did I decide? I had that personally. A decision I made in January seemed right based on the information I had at the time. By May, new information was available, and new business priorities were available. And, we could reevaluate. Oh, option two was a sensible thing to do in January, but come May when we’ve got a different direction and that’s not the priority, reevaluate the it depends, and a different option comes out and says, “Oh, we should go a different direction.” We say, “This ADR supersedes the previous ADR, and this is the way we’re going.”
Wes Reisz: There are three things that he said. Just to tease out one, he said the word lightweight. This is very light. It’s not a novel. It’s not a huge book. It is a very lightweight … What was the decision? What’s the context? What does it mean? It’s very, very lightweight, just a few consequences. Then, the second thing he mentioned was Git, which means it’s versioned. So, if you change it, you just give it another version, and it can keep evolving. If it’s in a wiki, cool, but that doesn’t really show you the full version that changed. So, getting it into Git is really key for the versioning. Then, also the third thing is still Git. It’s right there with the code. So, when you check out, when you clone that repo, you’re looking at it. You can look at the 10 ADRs, maybe the 20 ADRs that maybe went into why some of these things happened. That’s what’s so powerful, I think, about ADRs.
Thomas Betts: The fact that it’s a readme file in a Git repo brings the barrier to entry down to the average engineer or developer, not, oh, that’s the ivory tower architect. We’re trying to get away from the architects are over there. They make decisions. They tell us what to do. This is now embedded with your project, and you see it with your code. The team feels empowered to do it. Then, if you see them as examples, the next time when you’re having a big feature discussion, and somebody is sitting there and having a half-hour discussion, or a standup is taken up with … I don’t know what to do here.
Tell them to go offline. Fill out an ADR just to walk through the process. Maybe it’ll help them think through it. That doesn’t take an architect, but it sometimes takes that architecture mindset of it depends. Think through your pros and cons and your trade-offs, and write it down. It’s like rubber duck debugging. Having to explain yourself makes you understand the problem better. You’re going to come up with a better solution than if you just say, “Oh, we’re going to go with option A, because I know it best.”
What impact is the increased visibility of “staff plus” roles having on the industry? [09:27]
Daniel Bryant: One thing I just had as well, Thomas, is that you did a fantastic interview. I think it was with Andrew Harmel-Law a while back. That is well worth referencing, right? Because, he addressed Shane’s question there. He talked a lot about how you get folks involved, who should be involved, and how you incentivize to do that kind of stuff, right? That was a fantastic podcast and fantastic article.
I think that’s a nice segue, Thomas, as well to the next thing I was going to look at was Staff Plus in terms of the role of each individual. You mentioned there, not just the ivory tower architect. I think many of us here started our careers when the ivory tower architect was very much a real thing. What do folks think now in terms of options for senior IC roles, getting folks in to contribute to things like architecture?
Wes Reisz: One thing that’s interesting is that we’re recognizing it, right? Before, it wasn’t so long ago in my own mind. It was like you got to a certain point. Okay, what am I going to do now? I’ve got to this point where I am that big A architect you mentioned. Now, I guess I’ve got to be a manager. I guess I’ve got to get into a director role. Now, with Staff Plus, it’s starting to say, “What is the roadmap beyond that staff level?” So, I think that right there, just being able to have companies intentionally looking at it is a huge point for me.
Srini Penchikala: Just to add to that, right? Architecture is getting the right focus as a discipline and a software craft, rather than just diagrams and the artifacts. That’s where the architects are becoming more valuable to the enterprise, because of what they can do and how they can contribute to the teams, rather than just creating some PowerPoint slides and throwing them over the wall. I think most of them are hands-on, and they are involved throughout the life cycle of the software development. Also, like Thomas mentioned, it’s an iterative and incremental versioning of the architecture. So, it’s going to evolve through the software product development lifecycle.
Thomas Betts: Yes, one of the things I like about the different levels of engineering once you get past, because I think you’re right, Wes. It used to be you were a junior engineer. You were an engineer, senior engineer, and then, well, the next thing up has to be architect, because that’s the only title we came up with. It was a title, whether you were doing the role or not. Well, that guy’s been around longer. The other option was management. By having the Staff Plus and having actual levels of engineering, it tends to be that more T-shaped role of you have to think about cross-cutting concerns.
You have to think about more than just the one little project you’re working on. You’re seeing companies recognize you need to move up and say, “Okay, how do you solve this problem for two teams that works well?” That gets to ideas like platform engineering that I know you wanted to talk about later. Somebody has to think about the cross-cutting concerns, the big projects, and the big picture of how does this solve multiple problems? How do I come up with new ideas? That goes up that ladder of not just three levels of engineering, and then you’re out.
Wes Reisz: Yes, that T-shaped engineer always has resonated with me, too. Be broad across many things, but deep in a particular area. Yes.
Shane Hastie: Charity Majors gave an amazing talk at QCon SF where she was talking about actually consciously, deliberately bouncing in and out of that senior Staff Plus/Architect role and into management and then bouncing back and forwards and doing this a few times and seeing that as a great way to, one, spread and to move beyond T-shaped to pi-shaped, or broken comb is the other thing where people can build many deep competencies in different spaces and moving back and forth on that and managing your career like a product and very, very deliberately making some choices there. So, I would certainly point people to that QCon San Francisco talk by Charity Majors. It was great.
Thomas Betts: She called it a pendulum, because you can swing back and forth, and it’s not a one way road that you went over to management. You can’t come back. Sometimes you go over, and I like how here, you build software. Here, you build people and teams. You’re responsible for the people you are managing and their careers and supporting them. That is a different role. It’s not a promotion. It is a different job. Just like it’s not a demotion to go back to engineering, it’s a shift in what do you want to do and having that flexibility.
Wes Reisz: Yes, and normalizing the conversation about it, too. I think that’s a really interesting thing. What I loved about Charity’s talk that you mentioned, Shane, was that it was normalizing the conversation. It’s okay. How many people have you talked to, have we all talked to who went into a management position and was like, “I just wanted to go back to being an IC,” and then maybe even went back the other way? That talk, I think it was really cool just for us to all get together and nod and go, “This is normal. This is okay. It’s not a dirty secret.”
Is it possible to move back and forth between individual contributor (IC) and management roles within the same company? [13:49]
Daniel Bryant: One thing I’d ask, do you think it’s possible to do that within a company? Or, do you maybe have to change companies if you want to go from team lead to senior IC?
Shane Hastie: I’m going to respond there with the answer, it depends. If your company is mature enough to understand, and if you can have that conversation, so maybe if you’re the first one moving there and you can influence the organization, because it’s so much better for the organization if we don’t lose people all the time. The cost of replacing a senior person is so huge in terms of the knowledge that walks out the door when they leave because we’re not giving them what they want out of their careers. We see this, and I’m segueing a little bit into the culture stuff and the organization cultures.
One of the trends that we are seeing is this huge Great Resignation, 30% of people changing jobs, and 70%, according to some studies, actively dissatisfied with their current position. Well, the cost to the organization of losing those people is phenomenal. What do we need to do at a leadership level to create the opportunities for people to move as their interests shift without losing the people? So, this requires a whole lot of different thinking at the executive leadership level, which touches on the other trend that I think is happening: business agility, the recognition of Agile ways of thinking coming into organizations at higher levels and at different levels. Definitely, I think in that space globally, we’re at the early adopter stage.
Srini Penchikala: Shane, to add to your thoughts, what I’m kind of seeing in some of the organizations is that the senior IC or Staff Plus positions are being created more as evolved opportunities, rather than appointed opportunities. So, it’s not like the senior leadership is saying, “Okay, we are going to make you a senior Staff Plus engineer.” It’s the other way around. These team members are able to contribute not only technologically, but also organizationally. They’re able to manage their own people and the stakeholders, speaking of people management. Architects and dev leads, we have our own people and stakeholders to manage. So, they’re able to do these kinds of things very effectively, and really go to the next level, to contribute 2X or 3X compared to other ICs. That’s where some of these people are getting promotions and making those promotions happen.
The role of the (mythical) 10x engineer; aim for one engineer to make 10 others better [16:17]
Shane Hastie: Yes, there was a great quote, and to my chagrin, I cannot remember who said it. But, it was in one of my podcasts, talking about how a 10X engineer is not an individual who is 10 times faster than anyone else. It is somebody who makes other people more effective, that this person makes 10 people better. That’s what a 10X engineer is.
Daniel Bryant: I think that’s actually maybe a nice segue, Shane, as well. Because, I’ve heard the same thing. It was Kelsey Hightower.
Shane Hastie: It was Kelsey Hightower. Thank you.
What do you all think about platforms, platform engineering, and the goal of reducing developer cognitive load? [16:46]
Daniel Bryant: Perfect, because I was chatting to him as well, and he was saying, “Don’t look for 10X developers. Look for someone who can create platforms, for example, that make other engineers 10 times as effective.” There’s many different takes on it, right? But, Wes, I think that’s a perfect segue into what you’re looking at there, because Kelsey was like, “Hey, the platform really is a massive lever. If you get it right, you can enable all the stream aligned teams as team topologists call them. You can enable folks actually delivering value.” I think that was a great segue, Shane. The floor to you, Wes, what do you think about what’s going on in this space?
Wes Reisz: I think you already introduced it perfectly right there. It’s that we started this conversation a while back with DevOps. If you’ve been on Twitter, if you’ve been on any of the social medias these days, you’ve seen some kind of conversation about DevOps is dead. Long live platform engineering teams. I think what that’s trying to say is that, look, we had dev. We had ops. We brought together DevOps. But, in that process, we took cognitive load. Again, this goes out to that team topology reference you just made. So, just a shout out real quick to Manuel Pais and Matthew Skelton, that book, Team Topologies, has been at the forefront of just about every conversation I’ve had in the last six months. So, well done to those two, and it’s just if you haven’t read it yet, why not? Go read it. But, we took dev. We took ops, and we brought them together into this space called DevOps.
We did amazing things, but in the process, we took cognitive load on our teams, and we went really high. It got really, really high. Burnout is an issue, right? Trying to keep your mental model together of all the things that a team has to deal with today from Kubernetes to Istio to the sidecars to your ingress, and then you’ve got to write code, is getting quite a bit. So, the idea with platform teams is how do you pull that lever that you mentioned and start to reduce that cognitive load? How do you reduce the friction, so you’re … to use team topology vocabulary, your stream aligned teams can deliver on the features, the business capabilities that they need to do? So, platform teams with some of the stuff, again, I think that you want to talk about with internal developer platforms and things along those lines, platform teams are providing the self-service capabilities, reducing friction. They’re doing all these types of things. So, keep going, Daniel. I know this is an area close and dear to your heart, as well.
Daniel Bryant: Yes, for sure, Wes. What I’m seeing as well is something emerging. There’s a distinction between internal developer platforms and internal developer portals as well, because we’ve got to mention Backstage, right? Spotify’s Backstage is seemingly everywhere at the moment. Every person I chat to, they’re sneaking Backstage into their stack. Backstage is an amazing project. It’s a CNCF open source project. There’s many other similar ones, if folks are looking as well. But, what some people are looking at Backstage as is a silver bullet. And, we all know, I think we mention silver bullets every year on the podcast, right? There are no silver bullets. Much like it depends, we always say there are no silver bullets. What folks need to think about, as you mentioned, Wes, is self-service first. That is the key thing: lower the cognitive load and enable developers to deliver value.
A portal, something like Backstage, may be part of that, but the actual platform itself is a bit deeper than that. How do I provision infrastructure? How do I push my code down a CI/CD pipeline? How do I verify the qualities? All these things like security shift left. All these things we talk about are so important, and that platform must offer you the ability to bake in all those sensible things. That’s stuff that you’ve talked about, Thomas, in terms of all that architecture, all the -ilities, right? The platform should help us as developers bake that in and definitely verify it before it ends up in the hands of users. So, I think next year, we’re going to see a lot more focus on internal developer platforms. I think within the CNCF, the Cloud Native Computing Foundation, there’s a bunch of companies, a bunch of projects popping up. That’s usually a good sign that some standardization might happen in that space, I think.
Wes Reisz: The idea is not new. Netflix talked about the paved road for a while. They were kind of saying, “Look, let’s get on this paved road. If you get off it, you’ve got the freedom and responsibility, the freedom to do it, but the responsibility to take care of it.” But, I think what was so powerful, again, with Team Topologies, was that it put a name to it. It put a conversation. It started raising the conversation about this in a way that the platform team’s job specifically is to remove the friction, to improve the velocity of those stream-aligned teams.
Data mesh and platforms as product [20:53]
Thomas Betts: One of the topics we didn’t actually have on the list was data mesh. I think one of the things that companies struggle with in implementing a data mesh is that they have to create a platform that allows them to take charge of the principles and actually live out and say, “Okay, here’s the dream of having these individual data products.” There’s a governance layer that you have to have to make sure that everybody plays by the same rules, so you then get to the idea of having that standardized mesh that everyone can then put data in and get the data out that they need, as opposed to having the bottlenecks.
It’s just like the monolith was a bottleneck, because one team had to control everything, or there was one repo everybody had to contribute to. And, it’s always a pain. We spread that out. Well, you can’t just spread it out. You need to build the platform to help you spread it out so you can then get those benefits. That’s just one example of where we’re seeing the next level of doing anything with this is going to require some investment in building the platform and coming up with the people who just want to build the platform to then enable the rest of the company to say, “We can now go to the next level.”
Daniel Bryant: I like it. There’s one thing, just riffing back to our staff engineers. I’m seeing a lot more focus on platforms as products. Actually having a product manager on a platform is a really interesting trend I’m seeing now. I think it’s quite an interesting role, because you have to be empathic. You have to be able to engage with the developers who are the customers, the users. You have to be good at stakeholder management, because often the senior folks are like, “Why am I paying for this platform? What value is it adding?” No, it’s an enabler. You’re investing in solid foundations, be it for platforms in terms of applications or platforms in terms of data. So, I think product management is something we all sort of do on the side, I think, for everyone on this call, but it’s going to be more and more important in the platform space.
Thomas Betts: Yes, I like that. It has to get past the old idea that IT is just a cost center, and people don’t see it as a benefit. They just make sure the email works. No, software is what’s enabling your business to be more productive. All of these things, you can leverage the ROI on it and say, “This is a good investment. We need to continue investing in it. This is what it takes to invest in it.” You have to have the right people, the right roles. You have to think about it the right way.
What are the interesting trends in the AI, ML, and data engineering space? [22:48]
Daniel Bryant: Yes, 100%. I think you mentioned the data mesh there. It’s probably a good segue into your area, Srini, right? You are our resident data expert here. Have you been seeing much of data mesh this year? I know you’ve been looking at the trends report, and anything interesting in that space that you’d like to comment on?
Srini Penchikala: Yes, definitely. I think data mesh is one of the several trends that are happening in the area. Just like Thomas mentioned, data, similar to architecture and security, is kind of going through the shift-left approach. Data is no longer something you store somewhere, and that’s all it is. It is becoming a first-class citizen in terms of modeling the transformation and the processing. The whole end-to-end automated data pipelines are definitely getting more attention, because you cannot have the data in silos or a duplication of the data and the quality of the data, all those problems. So, yes, definitely data mesh is one of the solutions for that, Daniel, as well as the other trends like streaming-first architectures where the data is coming in terms of data streams. How do we process that?
There’s also talk of streaming warehouses now. How do we capture and analyze the data coming in terms of streams? Not only do we have data warehouses, now we have streaming warehouses. Those are some of the trends happening there. Also, if you want to look at all the major developments in this area, there are definitely data-related trends: data management, data engineering. The whole machine learning and artificial intelligence space is the second area. The infrastructure for all of this to make it happen, platforms and everything, that’s a third area that is currently going through a lot of transformation and a lot of innovation, also.
Thomas Betts: I’m going to echo what Srini was saying. When we were talking about the architecture and design trends report, which I think was back in February or April, we spent a lot of time talking about data and architecture and how architectural decisions are being driven. Like you said, it’s not just where do we store the data? Or, do I use SQL or NoSQL? It is I have to think about data upfront as part of my entire system. So, how do I make sure we have observability, not just of the system, but of the data to make sure that the data is flowing through properly? Are we going to use AI models? Can we get our data into a way that we can feed it into a machine learning model so we can get some of those benefits? All that has to be considered. So, that’s where architecture has to start thinking a little differently, not just here’s the product, here’s the object model, but what’s the data? And, treating data separately, focusing on the data and architecting for the data, is a different way of thinking.
Srini Penchikala: Yes, it’s almost like data is a product, right, Thomas? Give it enough emphasis for that, right?
Daniel Bryant: I think more and more, we are seeing that role as a product, Srini, be it data architecture, many things. I think that role of treating things as a product, design thinking, systems thinking, to Wes’s point. A lot of that is DevOps based as well, that systems thinking, that design thinking. But, it’s newish to us, many of us I think in software engineering. We just want to write code, is the thing I hear sometimes. But, now you’ve got to be a bit more thinking of the end-to-end experience, right?
Srini Penchikala: Yes, because data, as they say, is the second most important asset of any company after the people. So, yes, we definitely need to give it as much attention. Data is definitely going through a similar evolution to what code and architecture have gone through in the past. There is a continuous CI/CD type of approach for data as well in terms of receiving the data, ingesting it, processing it, how you version the data, and all that good stuff. So, definitely the data side is seeing a lot of innovation. Machine learning, as you guys know, there are probably no other technologies that have gone through the same level of innovation as machine learning and AI, right? We can talk more about this if we have time later. We have GitHub’s Copilot, which was announced probably a year ago, I think.
It has been talked about as a tool to improve developers’ productivity. I have heard from some developers that Copilot has made them 100% more productive, so almost 2X, right? They say they don’t write any basic functions anymore. They don’t need to remember how they’re written. They just ask Copilot, and Copilot creates, generates all code for them. They don’t even use Stack Overflow anymore, because Copilot is next to Stack Overflow. With all that happening right now, we also are seeing the new technologies like ChatGPT that’s getting probably too much attention in a way and how that can change not only developers’ lives, but everybody else’s lives.
Will ChatGPT make the InfoQ editor team obsolete? [26:39]
Daniel Bryant: Do you think us InfoQ editors are going to be out of our jobs, Srini, with ChatGPT?
Srini Penchikala: Yes, it looks like we won’t have jobs, because ChatGPT can write articles, maybe even host podcasts. The third thing I just want to mention briefly is the infrastructure. We talked about platforms. This is where Kubernetes and the hybrid clouds and cloud-agnostic computing can really help in our machine learning. Also, we want to make these solutions available as a service, so machine learning as a service, so that developers and AI developers don’t have to remember what kind of image they need to manage or where to deploy, where to host, and how to scale up. The platform will take care of all those, right?
Thomas Betts: And, that’s what I would like to see coming in the next year or two: the barrier to entry for doing all these things with AI and ML has to go down. What is the platform that enables all my teams to start using it without having to become experts in it? Because, I feel like there’s just too much to learn for people to say, “I’m going to just use ML. It’s Monday. I’m going to use it on Tuesday.” I can spin up a new microservice in five seconds, but I can’t create a new machine learning model. I don’t know how to do that. If we can get a platform that can take care of that, going back to our earlier discussion of the importance of platform teams, then we can start doing 10X scale on what our internal AI models can do.
What about the legality, ethics, and sustainability challenges associated with the use of AI (and computing in general)? [27:48]
Daniel Bryant: Yes. Has anyone got any thoughts about the legality of some of these models? I’ve heard there’s some interesting challenges. I think there’s a court case now with Copilot. Just today, I saw, I think it’s a new subscription model coming out for Copilot where you can just train it on your data. To your point, Thomas, you have data sovereignty, and that doesn’t leak out anywhere. But, Yes, anyone got any opinions on the legality?
Shane Hastie: I want to tackle broader than legality, the whole ethical aspects of how do we make sure that the products we build are good? And, we’ll touch on green there, and we can touch on social good. Thomas, you’re in that social good space. How do we encourage others? Lots of questions, not a lot of answers.
Thomas Betts: We all know that there are inherent biases built into any of the models. The data that it’s sampled and read off of is what it has built in. You have to assume that. We’ve had plenty of discussions on the podcast and on InfoQ about understanding that. I think what we’re seeing, because ChatGPT took off, and mere mortals can now interact with an AI. That’s, I think, what we saw since it came out. People thought, oh, that’s those nerds over there in the corner. All of a sudden, everyone is trying it out. They’re like, “Oh, I just asked it to come up with a script for a Seinfeld episode, and it did.” Does that mean it was a good episode? No, but it did something.
People are talking about can it be used to have kids finish their homework? If it’s good enough to fool their professors, who’s going to know? There’s a lot of ethical questions, like you said, not a whole lot of answers. I think we’re just starting to see this becoming so mainstream that the accessibility of it is so easy. People are going to start using it, and like anything on the internet, it’s going to be used for good, and it’s going to be used for bad. I hope that we see more good cases come out of it, but I don’t think there’s an easy way to just ensure that it’s a good solution.
Wes Reisz: I think there’s another interesting point there that you bring up with green there, Shane. It’s a new -ility. We’re seeing green being discussed as a brand new -ility. What is the cost of running some of these models? What is the cost of running Kubernetes? I think sustainability and the green question, we’ve seen it both at QCon London. We’ve seen it at QCon San Francisco. We’ve seen people like Adrian Cockcroft who went into this space who helped define the term microservices, now taking his whole career towards sustainability. So, I think that’s a really interesting question. I’m curious what everybody else thinks about it. I know, Thomas, you did a podcast on it, didn’t you?
Thomas Betts: Yes, I talked to Marco Valtas who also spoke at QCon on the green software principles. Yes, there’s a definite link we can put in the show notes back to that podcast talking about the principles of green software development. I think, if I recall off the top of my head, it’s everybody’s problem, and everybody can contribute to it, and everybody can do something. You can’t just pass the buck and say, “That’s not my problem.” Everybody who writes software is involved in saying, “Can we make this a green solution?” The talks at QCon were very good, because they covered … Here’s what you need to do, and here’s what the big picture is, and set the whole context. Who was the keynote speaker?
Wes Reisz: Astrid Atkinson, and then the talk you were looking at was The Zen of Green Software with Lisa McNally and Marco Valtas.
Thomas Betts: Right, Astrid talked about how we have this issue with climate change, and we know the impact of carbon. You can either just go along and hope that somebody fixes it, or you can be part of the people who fix it. I think she’s shifted her entire career, founded a company focused on one aspect of the grid and how to make it … I’m going to take my knowledge of distributed computing and apply it to the distribution grid. She commented that distribution is an overloaded term in that context for sending electricity over wires, but managing it like a complex system. But, you don’t have to do the whole thing.
If I have to go into a field where that’s my focus, everyone can take their software and look at it and say, “Where am I running this? Is it using green energy, or is it running in one of the data centers that’s all coal-powered energy?” So, maybe we can move it somewhere. If it doesn’t have an impact on our system, or run my jobs off peak. I’ve got a smart meter in my house, so if I run my dishwasher during the day, it costs me more than if I run it overnight. Little decisions that I can make as a human, I can put that into my software, so my software says, “You know what? I’m going to run on a more green schedule, because no one notices the impact, and I still get the same results.”
Daniel Bryant: On that note, Thomas, then you mentioned Adrian Cockcroft already. He was a big advocate of us all as customers saying to our cloud providers, because many of us consume from the cloud, asking, exactly your point, Thomas, “What’s the impact in terms of CO2 of running this workload? What’s your green options? Can I pay a bit extra?” Adrian, I think, was on point and on the money in terms of we have to drive that change. We have to ask it as leaders. What is the impact? The cloud folks are very sensible. They’ll listen to the general trend, but they’ve got to get that feedback from customers.
ThoughtWorks Cloud Carbon Footprint [32:16]
Thomas Betts: And, Wes, I know Thoughtworks has done some sort of report that you can plug in some of your variables, and it’ll kind of estimate what’s your green impact.
Wes Reisz: Yes, Tom, there’s a tool that Thoughtworks puts out called … Again, I work for Thoughtworks, so not in this particular space. I’m a consultant working with organizational transformation, DevOps, Kubernetes, things like that. That’s what I do. However, Thoughtworks does have a tool called Cloud Carbon Footprint. What it does is it helps you estimate your carbon footprint with some of the cloud providers. You can go out to GitHub. It’s an open source tool. Download it. Actually use it to be able to see what that is. Again, that podcast that you did with Marco, I believe, dove into that a bit too, right?
Thomas Betts: Yes, and you may not get an exact answer, but every model is wrong. Some are useful. It’s helpful to start looking at it, and then you can figure out, do I need to refine this at all? Or, does it give us the general ballpark of how many tons of CO2 is our software creating? And, can we do something about that?
Wes Reisz: Adrian Cockcroft, he did a DevSusOps talk, and he talked about that the IT sector contributes 3% of global CO2 emissions, which is on par with the aviation industry. Think about the growth of the data centers just over the last three, four years. What will that look like in another four years, in another four years? So, we have to start talking about this. We have to start thinking about it, because our systems are hungry. They’re continuing to consume more and more power.
Thomas Betts: Adrian had a couple other good points. It’s always nice to have the other point of view and saying, “Yes, there are some big software concerns, and yes, there’s a lot that we can do.” Also, look at your organization. The software might not be the biggest contributing factor to your company’s CO2 footprint. I think he cited at Starbucks, their biggest one is dairy. If they got everybody to switch to non-dairy milk, that would reduce their carbon footprint more than shutting off all their servers. So, you can make a big impact, but is it the right impact? And, is it the right place for you to focus? So, that’s why those tools that can say, “Give me the estimate of how much we’re doing,” oh, well, we’re running all these servers, and no one is using them. We should shut them down. There are some easy wins, but the long-term operations is where you have to look and say, “Okay, what is this going to project, if we have super scale, and we have to have 100 times the load. Are we able to handle that in a green fashion?”
What will the future of work look like? Are we all heading back to the office? [34:24]
Daniel Bryant: Looking at the end to end there, Thomas, something I was just thinking about is a lot of us used to commute, right? Pre-pandemic, the commute itself, and then obviously, we all fly all over the world to conferences as well, so we’ve got to be careful what we say here. But, I’m kind of curious what everyone thinks in terms of the move to hybrid now. A lot more of us are working at home, cutting back emissions. So, there is that aspect of it, but it does bring in some other challenges in terms of we’re on Zoom a lot more, I’m guessing. I’d love to hear focused thoughts on what the future of hybrid will look like. Will we be flying around the world for conferences? Will we be traveling to the office every day? I’d love to get folks’ thoughts on that.
Shane Hastie: Hybrid hell is real. The stories and the reality of people commuting into the office, because there’s a mandate, they’ve now got to be there two days a week or three days a week, it’s not coordinated well. So, you come into the office, and then you spend seven hours on Zoom calls. We’ve got to start being deliberate about why we bring people together. There is huge value in coming together in person. We see this in the conferences. Being together at QCon SF was, for me, one of the highlights of this past year. It was just wonderful to be in the same place with the people sharing all of those ideas. So, there is a real value in that.
There’s also amazing value and great value in the QCon+ events and the ability to … Of course, this is what InfoQ does, is make those sessions available so we can watch them asynchronously, as well. But, then how do we help our teams, help the people in the organizations get the balance right? So, if you are going to bring people together, think about the cost for the organization, for the environment, for the people. When you’re doing so, make sure that the benefit outweighs that cost. So, if I’m coming in, and Yes, maybe it is one day a week, but it’s the same day for everybody on my team and all the stakeholders I need to work with, that’s incredibly powerful. Because, now we can have those collaborative conversations in person and leverage the humanity of it.
But, don’t bring us in to sit on Zoom calls. Have a deliberate reason. Also, let go of how we measure. It’s not about hours in front of a screen. It’s about outcomes. The other thing, if we think of the impact from a green and climate perspective, the organizations and in some places, countries, governments that are shifting to four day work weeks. What’s happening there? And, the studies are amazing. In every organization that has shifted to the four day work week, productivity has stayed the same or gone up. People are more focused, because now we’ve only got four days. We get the work done. That makes more leisure time, and we’re not chewing carbon in the leisure time. We’re actually taking the leisure.
Thomas Betts: Is the move to a four day work week becoming more palatable because people don’t have a commute? If I still have to work 40 hours, you’re basically taking 40 hours and making four 10 hour days. But, if I don’t have to drive an hour each way, my eight hour workday, I’ve got those two hours back. Now, am I willing to do those two hours if I’m at home? Is that why four day work weeks are becoming more common?
Shane Hastie: I don’t know, because the studies are largely saying, “No, this is a 32 hour, four days at eight hours, not four days at 10 hours.”
Thomas Betts: Oh, I like that even better.
Wes Reisz: I like what you said about outcome, Shane. That right there really resonated. I think as I was watching here with the screen, I know everybody else can’t see it, but everybody was nodding when you said that. It’s not about screen time. It’s about the outcome. It’s about what we’re trying to achieve, and I think when you’re focused on the outcome, sometimes you can get more done with less time.
Srini Penchikala: To add to Shane’s comments and your comments, Wes, I agree with you guys, because the in-person mandate cannot be a “come to the office Tuesday, Wednesday, and Thursday” kind of thing. It has to be context based. It has to be product lifecycle based. If we are doing PI planning or sprint planning, for example, that needs everybody preferably in person so that they can collaborate for one or two days. Then, when the development phase starts, not everybody has to be in the office. So, it has to be more context driven, rather than calendar driven, right?
Daniel Bryant: I think on that note, Srini, as well, I’ve seen a lot of companies, my team included, getting together at least, say, quarterly in departments and once a year all together. Definitely, the retrospectives and the brainstorming, you cannot do, at least in my experience, as effectively via Zoom. You get the stutters, the cutouts, or whatever. People just can’t read each other quite as well when you’re not in the same room. I think for us, the quarterly planning really works well. We rotate folks in to do cross-collaboration with the teams. But, then yes, that yearly get-everyone-together is going to be tricky with the current macroeconomic climate: can you justify getting everyone in the same place?
To Shane’s point, is it good for the climate, flying everyone to one place? But, I don’t know, a lot of teams I’ve worked on, regardless of size, we’re probably talking up to 100. Once you get past 100, it’s not logistically possible. But, a lot of startups, I think the value of getting folks together once a year and just building those bonds, because with a startup, it does tend to be quite dynamic, right? It is really valuable, I think, to get folks into the same physical space. In addition to the commutes into the office, I think that the quarterly and yearly are really valuable.
Don’t underestimate the bonds that are built in-person [39:40]
Wes Reisz: I was going to echo the bond part there. It can’t be underestimated. We’re doing amazing things in this hybrid world, trying to build bonds online, but three dimensions is important. Getting it right, bringing people together, being able to connect as human beings really allows you to be more comfortable, more safe in your environment. It allows you to be present better. So, that bond, I think, is super important.
Thomas Betts: Another callback: the whole track on remote work was fantastic. It was probably the surprise for me. I didn’t plan to go to every session, and I kept sitting in that room. Just to call one out, Jesse McGinnis from Spotify talked about how you need to be intentional in whatever you do. Whether it’s remote first or hybrid or remote only or whatever your model is, embrace it, and say, “Here’s what we’re going to do,” and make those decisions. Say, “What’s the right venue?” I think, Srini, you said, “The context is really critical.” If you’re having your daily stand up, does it even need to be a Zoom call? Can we get that done on Slack? Or, is it four days, we can do it on Slack, and one day, we actually meet just to see each other.
If it’s just to communicate the status, that’s fine. Use the in-person for not just coming to the office to be around people, but to actually take advantage of the stuff you can only do when you have that three-dimensional connection. So, if it is quarterly, don’t spend it all in planning meetings. Do some actual team building events. Go out and volunteer together, something like that, so you actually meet the people as people, not just a three by five section of your Zoom screen.
Srini Penchikala: Just to add to that, Thomas, I really like your idea. You just mentioned the volunteering, right? Any of these community outreach efforts, if they happen in person, it’s going to help with the bonding, what Wes mentioned, the productivity in the future, the safety, and the gratification. So, it’s going to make it 10X valuable in the long term.
Daniel Bryant: One thing I would say, drawing some of these things together, is all the great talks I’ve heard you mention QCon SF and QCon+ have been covered on InfoQ. Shameless plug, Shane, I know you wrote a bunch of ones. Shout out to Steve Yan. Srini, I know you’ve done some as well with QCon things. Yes, there’s a bunch of great content, because that one you mentioned, Thomas, I read the notes on InfoQ. I had to immediately tweet it, because the information about being intentional was just amazing, right?
Thomas Betts: There was a fantastic panel discussion. In some ways, it’s the things that, oh, you hear them out loud, and they sound so obvious. Yet, companies don’t always do that. You’re like, “Well, why are they successful?” It’s not that hard. You just have to think about it a little bit. It’s not a drastic shift. When we said, “Everyone has to go home, because of the pandemic,” all the companies who said, “Well, everyone has to be in the office, or they’re not going to be productive; they’re just going to slack off at home.” Then, going back to outcomes are what matter: oh, we still got our job done. Can we get our job done in four days instead of five? Focus on that, rather than I need you in the office so I can see you so that I know what you’re doing. I need you in the office with other people, because I want you to bond as a team. That’s what you’re looking for.
The role of virtual reality and (pair programming) chatbots [42:25]
Shane Hastie: I want to call out to one of my podcast episodes this year. I interviewed a software development team, and they were all using VR environments for their collaborative work. The podcast was actually released as a video as well, because we’ve got the 3D video image up there. I was not in the VR immersive space with them, but the six of them were around the table. They’re a development team, completely distributed, and they’re using Oculus. They mentioned the specific software for doing pair programming, for doing debugging sessions. It was really interesting to see, and it’s a sample size of one, but it was fascinating stuff to see. I wonder where that’s going to go in the future.
Thomas Betts: I’d like to segue from there into the … What if it’s not a virtual reality hologram, but it’s actually an AI that’s my pair programmer? Because, that was how I tested out ChatGPT. I used it instead of my little rubber duck to say, “Hey, how do I solve this problem? Here’s my scenario.” And, I had to explain it to a point that it could give an answer, and then it gave me the code. I’m like, “Oh, well, that looks similar to my code. Oh.” And, you could ask it, “What did you do wrong? What would you improve? Or, here’s what it is.” Same thing as pair programming in person. Let’s talk over the code. Let’s not talk at each other. How do I have that relationship? Let me explain my problem to you, whether you’re a person or an AI, as long as I get the response back. I’m wondering if, in VR, that’s just the seventh person in the room: the bot speaks up for you.
Hopes and wishes for 2023 [43:56]
Shane Hastie: I will do a little bit of a hint here. I’m track host for the what’s next for hybrid and remote track at QCon London. One of the talks is going to be talking about immersive VR. Okay, well, we’re coming to the end of our time together, so let’s talk about what we as a group want to see. What would our hopes and wishes be?
More deliberate culture design [44:16]
Shane Hastie: I’ll kick off: I want to see more of that deliberate culture design in organizations, and to see some of the experiments, the four-day work week, more and more organizations bringing that on, outcome focused, the humanistic workplaces. That’s what I hope to see in 2023. I also hope to see all of you physically in person. It’ll be great. Thomas, can I throw the ball to you? What do you want for next year?
The (AI-powered) architect’s assistant [45:53]
Thomas Betts: Sure. So, I went and reviewed last year when we did this podcast, and I said I was looking forward to in-person events, and that happened. So, I feel like one wish from last year got accomplished. So, now I’m going to go to what’s going to happen next year, and we’ll come back in 12 months. I think we’re right at the inflection point with artificial intelligence becoming so mainstream that I can think of … I have this role of an architect. Can I have an architect’s assistant that’s not just me, so I don’t have to reach out to a friend on Slack? I can just ask my chatbot for helpful information, and it can respond accordingly and help me think through and give me the ability to do my job 10X better than if I’m just sitting there, struggling, trying to think of what’s the next line to write.
I don’t think it’s going to replace my InfoQ writing, but it can augment it. I think that same type of thing, it can help augment all of our roles, and how is that going to affect everybody, not just engineers and architects? Our UX designers are using it now to figure out how do we design new things. So, I think we’re going to see something come about in AI in the next 12 months that we didn’t expect to see, that it just becomes very mainstream, and we all start using it. Since it’s sort of data, ML, AI, I’m going to hand it off to Srini next.
Harnessing AI/ML effectively and ethically for the individual, community, and nation [46:00]
Srini Penchikala: Yes, thanks, Thomas. I was going to say that. So, Yes, I agree with you. ChatGPT and whatever is the next AI solution, they can definitely do a better job at helping the consumers. But, I don’t think they will replace humans completely anytime soon. I was kind of joking that they would, but seriously, no, there’s always some gap between machines and programs and humans. To wrap this up, Yes, I’m kind of looking forward to how data and AI/ML technologies can help with all aspects of our lives at the individual level, as well as the community level, as well as the national and government level. So, how can they help at all the different levels, not only in our offices, at our work, in our personal lives, but also in other areas like healthcare, governance, and everything else?
At the same time, these solutions need to be ethical to individuals, without our own biases, and they need to be ethical to communities. We don’t want some communities being incorrectly suppressed, right? Also, they need to be ethical to the environment, which is where green computing comes into the picture. I can see a lot of these, pretty much all the topics we talked about today, whether it’s on the human side or on the technology side, all coming together to make our lives better overall.
Also, I want to mention a couple of things, Daniel, I guess shameless plugs, right? We have an eMag on data pipelines that we published recently. It’s a great resource, with excellent articles on what’s happening on the data engineering side. Then, we also published, back in August, the AI/ML trends report for 2022. It talks about Transformers, ChatGPT, and other models. So, for a lot of things that we didn’t have time for in this podcast, I definitely encourage our listeners to check it out. Finally, as Shane mentioned, QCon London 2023 is going to have two tracks. One is on data engineering innovations, and the other one is on AI/ML trends, again an excellent opportunity to learn what’s happening in these areas. Again, thank you all for the opportunity. So, it’s great to see you guys, and until next time. Wes is next. Go ahead, Wes.
Technical problems are often people problems; act accordingly [47:51]
Wes Reisz: Oh, Yes. That was a great summary, Srini. Thomas, you talked last year about returning to in-person events. I think I mention this every year, but Daniel, remember 2019, 2020, going, “You know what? I think we need to travel less next year. That’s going to be my goal.” That kind of rose up to bite us, but we’ve been pretty accurate with our year-end predictions of what’s coming up. I don’t know, for me, I guess it’s where I started. What I want to see is more of a focus on reducing cognitive load. I really like where we’re evolving with platform conversations. We talked about it just a minute ago, but off of the podcast.
But, the more technical I get, the more deep I get into technical issues, the more I find out it’s about people, it’s about organizations, it’s about communication. The technical stuff kind of comes along, is not the hard part, I guess. So, for me, it’s just continuing the platform conversation next year, building stronger teams, and being able to do more with less, and reduce cognitive load so that people are able to develop software and be happy and healthy doing it. I’m going to turn it over to you, Daniel. I know you’ve got some things to jump into here.
Low code, no code, and AI augmentation; bringing it all together [48:55]
Daniel Bryant: Yes, Yes, just hearing everyone talk here. One thing I was going to lean into, something we didn’t talk about today, is the low-code, no-code trend that’s going on. I think actually AI and ML have somewhat pushed that to the side. Because, I was doing a lot of work with Doug Hudgeon and some awesome folks at InfoQ, where I was learning a bunch from him and the teams around how this is going to enable citizen developers. Many of us have had the dream of business process modeling and all that kind of stuff from when I started my career 20 years ago or so. But, I think the low-code, no-code stuff, with the rise of Zapier and many other platforms out there, is going to allow folks, not just technologists, to assemble a lot more workflows and business logic.
Then, I think as we’re all concluding is that AI is probably going to augment that. I think technology is going to be more adoptable by more people. The interesting thing we’re all saying is if we are struggling with some of the ethics and the legality and a bunch of other things, imagine we push it onto folks that don’t have a CS background or haven’t been studying it quite so long. I think it opens the door to, I think as you hinted at Thomas, a lot of good stuff, but also maybe a lot of bad stuff as well. So, I’m thinking, as you alluded to, Thomas, I’m a big fan of Simon Wardley’s stuff. He talks about a punctuated equilibrium in terms of suddenly, you get this massive step change.
I think we’re kind of seeing that with ML. We’re kind of seeing that with low-code, no-code. We’re kind of seeing, as Shane has alluded to, the way the world is communicating and collaborating is changing too, the virtual and VR aspect to it as well. I just wonder if next year, we could be doing this podcast and going, “Wow, what a year?” Do you know what I mean? We’re wrapping all the things. We’ve been doing that now, but I think with the world opening back up, I wonder if next year could be a real game changer in that space.
Outro: Thanks to the listeners, and best wishes for the New Year [50:35]
I think that’s a perfect point to wrap up the podcast there. Thank you all so much for joining me. Thank you, Shane, Thomas, Srini, Wes. It’s always great to be in the same virtual room as you all. I always learn so much, and I enjoy hearing your insights and takes on all the different areas of interest that we have. I’ll also say a big thanks to you, the listener, and also readers, watchers at InfoQ and QCon. It’s always great to meet many of you at in-person conferences and also see your comments online at InfoQ.com, as well. Do be sure to pop on the website. Check out all the latest events we’ve got for you, InfoQ Live, QCon London soon, and other QCons. I’ll say, “Happy holidays,” and see you all in the new year.
MMS • Deepak Vohra
Article originally posted on InfoQ. Visit InfoQ
Key Takeaways
- PHP 8.1 simplifies the syntax for creating a callable to AnyCallableExpression(...).
- PHP 8.0 introduces named function parameters in addition to positional arguments.
- PHP 8.1 introduces Fibers as interruptible functions to facilitate multi-tasking.
- PHP 8 adds new standard library functions, and sets new requirements on the use of magic methods such as __toString().
- Private methods may be inherited and reimplemented without any restrictions, except for private final constructors that must be kept as such.
This article is part of the article series “PHP 8.x”. You can subscribe to receive notifications about new articles in this series via RSS. PHP continues to be one of the most widely used scripting languages on the web with 77.3% of all the websites whose server-side programming language is known using it according to w3tech. PHP 8 brings many new features and other improvements, which we shall explore in this article series.
PHP 8.0 adds support for several functions- and methods-related features, some of which are an improvement of existing features, while others are completely new features. The enhanced callable syntax in PHP 8.1 can be used to create anonymous functions from a callable. Named function arguments may be used along with positional arguments with the added benefit that named arguments are not ordered and can convey meaning by their name. Fibers are interruptible functions that add support for multitasking.
Inheritance on private methods is redefined
Object inheritance is a programming paradigm that is used by most object-oriented languages, including PHP. It makes it possible to override public and protected methods, as well as class properties and constants, defined in a class from any class that extends it. In PHP, public methods cannot be reimplemented with a more restrictive access, such as by making a public method private. To demonstrate this, consider a class B that extends class A and reimplements a public method from A.
<?php
class A{
public function sortArray():string{
return "Class A method";
}
}
class B extends A{
private function sortArray():string{
return "Class B method";
}
}
$b = new B();
$b->sortArray();
When run, the script generates an error message:
Fatal error: Access level to B::sortArray() must be public (as in class A)
A public method cannot be reimplemented with a more restrictive visibility.
On the contrary, private methods defined in a class are not inherited and can be reimplemented in a class that extends it. As an example, class B extends class A in the following script and reimplements a private method from A.
<?php
class A{
private function sortArray():string{
return "Class A method";
}
}
class B extends A{
private function sortArray(int $a):string{
return "Class B method";
}
}
Prior to PHP 8.0, two restrictions applied to private method redeclaration in an extending class: the final and static modifiers were not allowed to be changed. If a private method was declared final, an extending class was not allowed to redeclare the method. If a private method was declared static, it was to be kept static in an extending class. And, if a private method did not have the static modifier, an extending class was not allowed to add a static modifier. Both restrictions have been lifted in PHP 8. The following script runs ok in PHP 8.
<?php
class A {
private final static function sortArray():string{
return "Class A method";
}
}
class B extends A {
private function sortArray(int $a):string{
return "Class B method";
}
}
The only private method restriction in PHP 8 is to enforce private final constructors, which are sometimes used to disable the constructor when using static factory methods as a substitute.
<?php
class A {
private final function __construct(){
}
}
class B extends A {
private final function __construct(){
}
}
The script generates an error message:
Fatal error: Cannot override final method A::__construct()
A variadic argument may replace any number of function arguments
In PHP 8, a single variadic argument may replace any number of function arguments. Consider the following script in which class B extends class A and replaces the three-argument function sortArray with a single variadic argument.
<?php
class A {
public function sortArray($arrayToSort, $sortType, $arraySize) {
if ($sortType == "asc") {
sort($arrayToSort);
foreach ($arrayToSort as $key => $val) {
echo "$key = $val ";
}
} elseif ($sortType == "desc") {
rsort($arrayToSort);
foreach ($arrayToSort as $key => $val) {
echo "$key = $val ";
}
}
}
}
class B extends A {
public function sortArray(...$multiple) {
$arrayToSort= $multiple[0];
$sortType=$multiple[1];
if ($sortType == "asc") {
sort($arrayToSort);
foreach ($arrayToSort as $key => $val) {
echo "$key = $val ";
}
} elseif ($sortType == "desc") {
rsort($arrayToSort);
foreach ($arrayToSort as $key => $val) {
echo "$key = $val ";
}
}
}
}
The sortArray function in class B may be called using multiple arguments.
$sortType="asc";
$arrayToSort=array("B", "A", "f", "C");
$arraySize=4;
$b=new B();
$b->sortArray($arrayToSort,$sortType,$arraySize);
The output is as follows:
0 = A 1 = B 2 = C 3 = f
Simplified Callable Syntax
A callable is a PHP expression that can be called, such as an instance method, a static method, or an invocable object. A callable can be used to create a short-form expression for a method call, for example. In PHP 8.1, there is new callable syntax available:
AVariableCallableExpression(…)
The AVariableCallableExpression represents a variable callable expression. The ellipsis (...) is part of the syntax.
Why a new callable syntax? Let’s recall what the traditional callable syntax looks like with some examples:
$f1 = 'strlen'(...);
$f2 = [$someobj, 'somemethod'](...);
$f3 = [SomeClass::class, 'somestaticmethod'](...);
This has two issues:
- The syntax involves strings and arrays
- The scope is not maintained at the point at which the callable is created.
To demonstrate this, consider the following script for sorting an array, in which the getSortArrayMethod() method returns a callable for the sortArray() method with return [$this, 'sortArray'].
<?php
class Sort {
private $arrayToSort;
private $sortType;
public function __construct($arrayToSort, $sortType) {
$this->arrayToSort = $arrayToSort;
$this->sortType = $sortType;
}
public function getSortArrayMethod() {
return [$this, 'sortArray'];
}
private function sortArray() {
if ($this->sortType == "Asc") {
sort($this->arrayToSort);
foreach ($this->arrayToSort as $key => $val) {
echo "$key = $val ";
}
} elseif ($this->sortType == "Desc") {
rsort($this->arrayToSort);
foreach ($this->arrayToSort as $key => $val) {
echo "$key = $val ";
}
} else {
shuffle($this->arrayToSort);
foreach ($this->arrayToSort as $key => $val) {
echo "$key = $val ";
}
}
}
}
$sortType="Asc";
$arrayToSort=array("B", "A", "f", "C");
$sort = new Sort($arrayToSort,$sortType);
$c = $sort->getSortArrayMethod();
$c();
The script generates an error message:
Fatal error: Uncaught Error: Call to private method Sort::sortArray()
from global scope
Using Closure::fromCallable([$this, 'sortArray']) instead of [$this, 'sortArray'] would fix the scope issue, but using the Closure::fromCallable method makes the call verbose. The new callable syntax fixes both the scope and syntax verbosity issues. With the new callable syntax, the function becomes:
public function getSortArrayMethod() {
return $this->sortArray(...);
}
The array gets sorted with output:
0 = A 1 = B 2 = C 3 = f
The new syntax can be combined with the traditional syntax involving strings and arrays to fix the scope issue. The scope at which the callable is created is kept unchanged.
public function getSortArrayMethod() {
return [$this, 'sortArray'](...);
}
The new callable syntax may be used with static methods as well, as demonstrated by the following script that includes a static function.
<?php
class Sort {
private $arrayToSort;
private $sortType;
public function __construct($arrayToSort, $sortType) {
$this->arrayToSort = $arrayToSort;
$this->sortType = $sortType;
}
public function getStaticMethod() {
return Sort::aStaticFunction(...);
}
private static function aStaticFunction() {
}
}
$sortType="Asc";
$arrayToSort=array("B", "A", "f", "C");
$sort = new Sort($arrayToSort,$sortType);
$cStatic=$sort->getStaticMethod();
$cStatic();
The script runs without errors. Because aStaticFunction() has an empty body in this example, calling $cStatic() produces no output.
The following are equivalent ways for calling a method:
return $this->sortArray(...);
return Closure::fromCallable([$this, 'sortArray']);
return [$this, 'sortArray'](...);
The following are equivalent ways for calling a static method:
return Sort::aStaticFunction(...);
return [Sort::class, 'aStaticFunction'](...);
return Closure::fromCallable([Sort::class, 'aStaticFunction']);
The new callable syntax may be used even if a function declares parameters.
<?php
class Sort {
private $arrayToSort;
private $sortType;
public function __construct($arrayToSort, $sortType) {
$this->arrayToSort = $arrayToSort;
$this->sortType = $sortType;
}
public function getSortArrayMethod() {
return $this->sortArray(...);
}
private function sortArray(int $a,string $b) {
if ($this->sortType == "Asc") {
sort($this->arrayToSort);
foreach ($this->arrayToSort as $key => $val) {
echo "$key = $val ";
}
} elseif ($this->sortType == "Desc") {
rsort($this->arrayToSort);
foreach ($this->arrayToSort as $key => $val) {
echo "$key = $val ";
}
} else {
shuffle($this->arrayToSort);
foreach ($this->arrayToSort as $key => $val) {
echo "$key = $val ";
}
}
}
}
A callable must be called with its arguments if the method declares any.
$sortType="Asc";
$arrayToSort=array("B", "A", "f", "C");
$sort = new Sort($arrayToSort,$sortType);
$c = $sort->getSortArrayMethod();
$c(1,"A");
Simplified Syntax can be used with any PHP Callable expression
The simplified callable syntax may be used with any PHP callable expression. The callable syntax is not supported with the new operator for object creation because the callable syntax AVariableCallableExpression(…) does not have provision to specify constructor args, which could be required. The following is an example that is not supported:
$sort = new Sort(...);
An error message is generated:
Fatal error: Cannot create Closure for new expression
The following script demonstrates the full range of callable expressions that are supported.
<?php
class Sort {
private $arrayToSort;
private $sortType;
public function __construct($arrayToSort, $sortType) {
$this->arrayToSort = $arrayToSort;
$this->sortType = $sortType;
}
public function getSortArrayMethod() {
return $this->sortArray(...);
}
public function getStaticMethod() {
return Sort::aStaticFunction(...);
}
public static function aStaticFunction() {
}
public function sortArray(int $a,string $b) {
if ($this->sortType == "Asc") {
sort($this->arrayToSort);
foreach ($this->arrayToSort as $key => $val) {
echo "$key = $val ";
}
} elseif ($this->sortType == "Desc") {
rsort($this->arrayToSort);
foreach ($this->arrayToSort as $key => $val) {
echo "$key = $val ";
}
} else {
shuffle($this->arrayToSort);
foreach ($this->arrayToSort as $key => $val) {
echo "$key = $val ";
}
}
}
public function __invoke() {}
}
$sortType="Asc";
$arrayToSort=array("B", "A", "f", "C");
$classStr = 'Sort';
$staticmethodStr = 'aStaticFunction';
$c1 = $classStr::$staticmethodStr(...);
$methodStr = 'sortArray';
$sort = new Sort($arrayToSort,$sortType);
$c2 = strlen(...);
$c3 = $sort(...); // invokable object
$c4 = $sort->sortArray(...);
$c5 = $sort->$methodStr(...);
$c6 = Sort::aStaticFunction(...);
$c7 = $classStr::$staticmethodStr(...);
// traditional callable using string, array
$c8 = 'strlen'(...);
$c9 = [$sort, 'sortArray'](...);
$c10 = [Sort::class, 'aStaticFunction'](...);
$c11 = $sort->getSortArrayMethod();
$c11(1,"A");
$cStatic=$sort->getStaticMethod();
$cStatic();
Trailing comma and optional/required arguments order
Another new feature in PHP 8.0 is support for adding a trailing comma at the end of the list of parameters to a function to improve readability. Any trailing comma is ignored. A trailing comma may not always be useful, but could be useful if the parameters list is long or if parameter names are long, making it suitable to list them vertically. A trailing comma is also supported in closure use lists.
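For illustration, a brief sketch of both uses (the function and variable names here are just placeholders, not taken from the original article):
<?php
function logMessage(
string $message,
string $level,
?string $channel = null,  // trailing comma after the last parameter is ignored
) {
echo "[$level] $message";
}
$prefix = "app";
$suffix = "log";
$makeName = function () use (
$prefix,
$suffix,  // trailing comma in a closure use list (PHP 8.0)
) {
return "$prefix.$suffix";
};
logMessage("started", "info");
echo $makeName();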
PHP 8.0 deprecates declaring optional arguments before required arguments. Optional arguments declared before required arguments are implicitly required.
The following script demonstrates how optional parameters declared before required ones are implicitly treated as required, in addition to the use of a trailing comma.
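A minimal sketch along these lines (only the third and last parameter names appear in the notice; the others are illustrative):
<?php
function check_args(
$the_first_arg_of_this_function,
$the_second_arg_of_this_function,
$the_third_arg_of_this_function = 3,
$the_last_arg_of_this_function,  // required parameter declared after an optional one; note the trailing comma
) {
}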
The output is as follows:
Deprecated: Optional parameter $the_third_arg_of_this_function declared before required parameter $the_last_arg_of_this_function is implicitly treated as a required parameter
Nullable parameters are not considered optional parameters and may be declared before required parameters, either using the $param = null form or with an explicit nullable type.
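For example, a minimal sketch of the first form, an implicitly nullable parameter with a null default declared before a required one (the function and parameter names are illustrative), runs without raising the deprecation notice:
<?php
function find_user(int $id = null, string $name) {
echo $name . ':' . ($id ?? 'any');
}
find_user(7, "alice");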
Named Function Parameters and Arguments
PHP 8.0 adds support for named function parameters and arguments besides already supported positional parameters and arguments. Named arguments are passed in a function call with the following syntax:
Argument_name:Argument_value
Some of the benefits of named arguments are the following:
- Function parameters may be given a meaningful name to make them self-documenting
- Arguments are order-independent when passed by name
- Default values may be skipped arbitrarily.
In the following script, the array_hashtable function declares named parameters. The function may be passed argument values with or without argument names. When positional arguments are passed, the function parameters declaration order is used. When named arguments are passed, any arbitrary order may be used.
<?php
function array_hashtable($key1,$key2,$key3,$key4,$key5){
echo $key1.' '.$key2.' '.$key3.' '.$key4.' '.$key5;
echo "
";
}
// Using positional arguments:
array_hashtable(0, 10, 50, 20, 25);
// Using named arguments:
array_hashtable(key2: 0, key5: 25, key1: 10, key4: 50, key3: 20);
?>
The output is:
0 10 50 20 25
10 0 20 50 25
Named arguments and positional arguments may be used in the same function call. The mixed arguments call is used with the same example function array_hashtable.
<?php
function array_hashtable($key1,$key2,$key3,$key4,$key5){
echo $key1.' '.$key2.' '.$key3.' '.$key4.' '.$key5;
echo "
";
}
// Using mixed arguments:
array_hashtable(0, 10, 50, key5: 25, key4: 20);
?>
The output is:
0 10 50 20 25
Notice that named arguments are used only after positional arguments. The following script reverses the order and uses positional arguments after named arguments:
<?php
function array_hashtable($key1,$key2,$key3,$key4,$key5){
echo $key1.' '.$key2.' '.$key3.' '.$key4.' '.$key5;
echo "
";
}
// Using mixed arguments:
array_hashtable(0, 10, key3: 25, 50, key5: 20);
?>
The script generates an error message:
Fatal error: Cannot use positional argument after named argument
Declaring optional arguments before required arguments is deprecated even with named arguments, as demonstrated by the following script:
<?php
function array_hashtable($key1=0,$key2=10,$key3=20,$key4,$key5){
echo $key1.' '.$key2.' '.$key3.' '.$key4.' '.$key5;
echo "
";
}
// Using mixed arguments:
array_hashtable(1,2,key3: 25, key4: 1,key5: 20);
?>
The output includes the deprecation messages:
Deprecated: Optional parameter $key1 declared before required parameter $key5 is implicitly treated as a required parameter
Deprecated: Optional parameter $key2 declared before required parameter $key5 is implicitly treated as a required parameter
Deprecated: Optional parameter $key3 declared before required parameter $key5 is implicitly treated as a required parameter
When optional parameters are declared after the required parameters, named arguments may be used to skip over one or more optional parameters in a function call, as in the script:
<?php
function array_hashtable($key1,$key2,$key3=20,$key4=50,$key5=10){
echo $key1.' '.$key2.' '.$key3.' '.$key4.' '.$key5;
echo "
";
}
// Using mixed arguments:
array_hashtable(key1:1, key2:2,key4: 25);
?>
The output is:
1 2 20 25 10
You may call a function with only a subset of its optional arguments, regardless of their order.
<?php
function array_hashtable($key1=0,$key2=10,$key3=20,$key4=50,$key5=10){
echo $key1.' '.$key2.' '.$key3.' '.$key4.' '.$key5;
echo "
";
}
// Using mixed arguments:
array_hashtable(1,2,key4: 25);
?>
Output is as follows:
1 2 20 25 10
Even when calling a function with a subset of its optional arguments, positional arguments cannot be used after named arguments, as demonstrated by script:
<?php
function array_hashtable($key1=0,$key2=10,$key3=20,$key4=50,$key5=10){
echo $key1.' '.$key2.' '.$key3.' '.$key4.' '.$key5;
echo "
";
}
// Using mixed arguments:
array_hashtable(1,2,key4: 25,5);
?>
The following error message is produced:
Fatal error: Cannot use positional argument after named argument
PHP 8.1 improves on the named arguments feature by supporting named arguments after unpacking the arguments, as in the script:
<?php
function array_hashtable($key1,$key2,$key3=30,$key4=40,$key5=50){
echo $key1.' '.$key2.' '.$key3.' '.$key4.' '.$key5;
echo "
";
}
echo array_hashtable(...[10, 20], key5: 40);
echo array_hashtable(...['key2' => 2, 'key1' => 2], key4: 50);
?>
The output is as follows:
10 20 30 40 40
2 2 30 50 50
However, a named argument must not overwrite an earlier argument as demonstrated by script:
<?php
function array_hashtable($key1,$key2,$key3=30,$key4=40,$key5=50){
echo $key1.' '.$key2.' '.$key3.' '.$key4.' '.$key5;
echo "
";
}
echo array_hashtable(...[10, 20], key2: 40);
?>
Output is as follows:
Fatal error: Uncaught Error: Named parameter $key2 overwrites previous argument.
Non-static method cannot be called statically
Prior to PHP 8.0, if you called a non-static method in a static context, or statically, you only got a deprecation message. With 8.0 you now get an error message. Also, $this is undefined in a static context. To demonstrate this, consider the following script in which a non-static method aNonStaticMethod() is called with the static syntax A::aNonStaticMethod().
<?php
class A
{
function aNonStaticMethod()
{
}
}
A::aNonStaticMethod();
If you run the script, you would get an error message:
Uncaught Error: Non-static method A::aNonStaticMethod() cannot be called statically
Fibers
PHP 8.1 adds support for multi-tasking with Fibers. A Fiber is an interruptible function with a stack of its own. A Fiber may be suspended from anywhere in the call stack, and resumed later. The new Fiber class is a final class that supports the following public methods:
- __construct(callable $callback) - Constructor to create a new Fiber instance. The parameter is the callable to invoke when starting the fiber. Arguments given to Fiber::start() are provided as arguments to that callable.
- start(mixed ...$args): mixed - Starts the fiber. The method returns when the fiber suspends or terminates. A variadic list of arguments is provided to the callable used when constructing the fiber. A mixed value is returned from the first suspension point, or NULL if the fiber returns without suspending.
- resume(mixed $value = null): mixed - Resumes the fiber, returning the given mixed value from Fiber::suspend(). Returns when the fiber suspends or terminates. The returned mixed value is actually returned from the next suspension point, or NULL if the fiber returns. Throws a FiberError if the fiber has not started, is running, or has terminated.
- throw(Throwable $exception): mixed - Throws the given exception into the fiber from the current Fiber::suspend() call. Returns the value from the next suspension point, or NULL if the fiber returns. Throws a FiberError if the fiber has not started, is running, or has terminated.
- getReturn(): mixed - Gets the mixed return value of the fiber callback. Throws a FiberError if the fiber has not returned.
- isStarted(): bool - Returns a bool indicating whether the fiber has been started.
- isSuspended(): bool - Returns a bool indicating whether the fiber is currently suspended.
- isRunning(): bool - Returns a bool indicating whether the fiber is currently running.
- isTerminated(): bool - Returns a bool indicating whether the fiber has terminated.
- static suspend(mixed $value = null): mixed - Suspends the fiber. Returns execution to the call to Fiber::start(), Fiber::resume(), or Fiber::throw(). Returns the value provided to Fiber::resume(). Throws a FiberError if not within a fiber (i.e., if called from the main thread).
- static getCurrent(): ?Fiber - Returns the currently executing fiber instance, or NULL if called from outside a fiber.
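As a quick illustration of how these methods fit together (a minimal sketch; the greeting strings are arbitrary placeholders), a value passed to Fiber::suspend() is returned from start(), and a value passed to resume() is returned from Fiber::suspend():
<?php
$fiber = new Fiber(function (string $greeting): void {
// Hand a value out to the code that called start().
$reply = Fiber::suspend("$greeting from the fiber");
// $reply holds the value that was passed to resume().
echo $reply;
});
$fromFiber = $fiber->start("hello"); // receives "hello from the fiber"
echo $fromFiber;
$fiber->resume("hello back");        // prints "hello back"
var_dump($fiber->isTerminated());    // bool(true)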
A Fiber may be started only once, but may be suspended and resumed multiple times. The following script demonstrates multitasking by using a Fiber to perform different types of sorts on an array. The Fiber is suspended after each sort, and resumed later to perform a different type of sort.
<?php
$fiber = new Fiber(function ($arr) {
sort($arr);
foreach ($arr as $key => $val) {
echo "$key = $val ";
}
echo "<br/>";
Fiber::suspend();
rsort($arr);
foreach ($arr as $key => $val) {
echo "$key = $val ";
}
echo "
";
Fiber::suspend();
shuffle($arr);
foreach ($arr as $key => $val) {
echo "$key = $val ";
}
});
$arrayToSort=array("B", "A", "f", "C");
$value = $fiber->start($arrayToSort);
$fiber->resume();
$fiber->resume();
?>
The output is as follows:
0 = A 1 = B 2 = C 3 = f
0 = f 1 = C 2 = B 3 = A
0 = C 1 = f 2 = A 3 = B
If the Fiber is not resumed after first suspension, only one type of sort is made, which could be implemented by commenting out the two calls to resume().
//$fiber->resume();
//$fiber->resume();
Output is the result from the first sort:
0 = A 1 = B 2 = C 3 = f
Stringable interface and __toString()
PHP 8.0 introduces a new interface called Stringable that provides only one method, __toString(). A class that provides a __toString() method implicitly implements the Stringable interface. Consider the class A that provides a __toString() method.
<?php
class A {
public function __toString(): string {
return " ";
}
}
echo (new A() instanceof Stringable);
The script returns 1 from the type check for Stringable.
The reverse is however not true. If a class implements the Stringable interface, it must explicitly provide the __toString() method as the method is not added automatically, as in:
<?php
class A implements Stringable {
public function __toString(): string { }
}
New standard library functions
PHP 8 introduces a number of new functions belonging to its standard library.
The str_contains function returns a bool to indicate if the string given as the first argument contains the string given as the second argument. The following script returns false:
<?php
if (str_contains('haystack', 'needle')) {
echo true;
} else {
echo false;
}
And the following script returns 1, or true:
<?php
if (str_contains('haystack', 'hay')) {
echo true;
}else {
echo "false";
}
The str_starts_with function returns a bool to indicate if the string given as the first argument starts with the string given as the second argument. The following script returns false.
<?php
if (str_starts_with('haystack', 'needle')) {
echo true;
} else {
echo false;
}
And the following script returns 1, or true.
<?php
if (str_starts_with('haystack', 'hay')) {
echo true;
} else {
echo false;
}
The str_ends_with function returns a bool to indicate if the string given as the first argument ends with the string given as the second argument. The following script returns false.
<?php
if (str_ends_with('haystack', 'needle')) {
echo true;
} else {
echo false;
}
And the following script returns 1, or true.
<?php
if (str_ends_with('haystack', 'stack')) {
echo true;
} else {
echo false;
}
The fdiv function divides two numbers and returns a float value, as demonstrated by the script:
<?php
var_dump(fdiv(1.5, 1.3));
var_dump(fdiv(10, 2));
var_dump(fdiv(5.0, 0.0));
var_dump(fdiv(-2.0, 0.0));
var_dump(fdiv(0.0, 0.0));
var_dump(fdiv(5.0, 1.0));
var_dump(fdiv(10.0, 2));
The output is:
float(1.1538461538461537)
float(5)
float(INF)
float(-INF)
float(NAN)
float(5)
float(5)
The fdatasync function, aliased to fsync on Windows, synchronizes data written to a stream to the underlying file. To demonstrate its use, create an empty file test.txt in the scripts directory that contains the PHP scripts to run. Run the script:
<?php
$file = 'test.txt';
$stream = fopen($file, 'w');
fwrite($stream, 'first line of data');
fwrite($stream, "\r\n");
fwrite($stream, 'second line of data');
fwrite($stream, 'third line of data');
fdatasync($stream);
fclose($stream);
Subsequently, open the test.txt file to find the text:
first line of data
second line of data
third line of data
The array_is_list function returns a bool to indicate whether a given array is a list. An array is a list if its keys are consecutive integers starting at 0, in the correct order. The following script demonstrates the array_is_list function:
<?php
echo array_is_list([]); // true
echo array_is_list(['a', 'b']); // true
echo array_is_list([0 => 'a', 'b']); // true
echo array_is_list([1 => 'a', 'b']); // false
echo array_is_list([1 => 'a', 0 => 'b']); // false
echo array_is_list([0 => 'a', 'b' => 'b']); // false
echo array_is_list([0 => 'a', 2 => 'b']); // false
The output is:
1
1
1
Magic methods must have correct signatures
Magic methods are special methods in PHP that override default actions. They include the following methods, of which the constructor method __construct() may be the most familiar:
__construct(), __destruct(), __call(), __callStatic(), __get(), __set(), __isset(), __unset(), __sleep(), __wakeup(), __serialize(), __unserialize(), __toString(), __invoke(), __set_state(), __clone(), and __debugInfo().
As of PHP 8.0, the signatures of magic method definitions must be correct, which implies that if type declarations are used in method parameters or the return type, they must be identical to those in the documentation. The __toString() method must declare string as its return type. To demonstrate this, declare the return type as int:
<?php
class A {
public function __toString(): int {
}
}
An error message is generated:
Fatal error: A::__toString(): Return type must be string when declared
However, functions that by definition don’t declare a return type, such as the constructor, must not declare a return type, not even a void return type. The following script is an example:
<?php
class A {
public function __construct():void
{
}
}
The script returns an error message:
Fatal error: Method A::__construct() cannot declare a return type
All magic methods, with a few exceptions, e.g. __construct(), must be declared with public visibility. To demonstrate this, declare __callStatic with private visibility.
<?php
class A {
private static function __callStatic(string $name, array $arguments) {}
}
A warning message is output:
Warning: The magic method A::__callStatic() must have public visibility
Even though it is ok to omit mixed return types, the method signature must be the same. For example, in the following script, class A declares __callStatic without specifying its return type, while class B defines its first parameter as an int:
<?php
class A {
public static function __callStatic(string $name, array $arguments) {}
}
class B {
public static function __callStatic(int $name, array $arguments) {}
}
An error message is output:
Fatal error: B::__callStatic(): Parameter #1 ($name) must be of type string when declared
Return Type Compatibility with Internal Classes
With PHP 8.1 most internal methods, which are the methods in internal classes, have “tentatively” started to declare a return type. Tentatively implies that while in 8.1 only a Deprecation notice is raised, in version 9.0 an error condition message shall be output. Thus, any extending class must declare a return type that is compatible with the internal class, or a deprecation notice is issued. To demonstrate this, extend the internal class Directory and redeclare the function read() without a return type:
<?php
class A extends Directory {
public function read() { }
}
The script generates a deprecation notice:
Deprecated: Return type of A::read() should either be compatible with Directory::read(): string|false, or the #[ReturnTypeWillChange] attribute should be used to temporarily suppress the notice
The following script, however, is OK:
<?php
class A extends Directory {
public function read():string { return ""; }
}
Adding the #[ReturnTypeWillChange] attribute suppresses the deprecation notice:
<?php
class A extends Directory {
#[ReturnTypeWillChange]
public function read() { }
}
The SensitiveParameter attribute
While stack traces for exceptions that include detailed information about method parameters are quite useful for debugging, you may not want to output parameter values for some sensitive parameters such as those associated with passwords and credentials. PHP 8.2 introduces a new attribute called SensitiveParameter so that, if a method parameter is annotated with the SensitiveParameter attribute, the parameter’s value is not output in an exception stack trace.
To demonstrate this, consider the following script in which the function f1 has the $password parameter associated with the SensitiveParameter attribute.
<?php
function f1(
$param1 = 1,
#[SensitiveParameter] $password = "s@5f_u7",
$param3 = null
) {
throw new Exception('Error');
}
The function throws an Exception just to demonstrate the SensitiveParameter feature. Call the function:
f1(param3: 'a');
Notice that the exception stack trace does not include the value for the $password parameter, and instead has Object(SensitiveParameterValue) listed.
Stack trace: #0 : f1(1, Object(SensitiveParameterValue), 'a') #1 {main}
Built-in functions deprecation/enhancement
The built-in functions utf8_encode() and utf8_decode() have often been misunderstood because their names imply encoding/decoding just about any string. Actually the functions are for encoding/decoding only ISO-8859-1, aka “Latin-1”, strings. Additionally, the error messages they generate are not descriptive enough for debugging. PHP 8.2 has deprecated these functions. The following script makes use of them:
<?php
$string_to_encode = "\x7A\x6A\xdB";
$utf8_string = utf8_encode($string_to_encode);
echo bin2hex($utf8_string), "\n";
$utf8_string = "\x6A\x6B\xD3\xCB";
$decoded_string = utf8_decode($utf8_string);
echo bin2hex($decoded_string), "\n";
With PHP 8.2, a deprecation notice is output:
Deprecated: Function utf8_encode() is deprecated
Deprecated: Function utf8_decode() is deprecated
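One possible migration path is mb_convert_encoding() from the mbstring extension (a sketch, assuming mbstring is enabled; iconv() is another option):
<?php
$latin1 = "\x7A\x6A\xdB";
// Instead of utf8_encode($latin1):
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
echo bin2hex($utf8), "\n";
$utf8_input = "\x6A\x6B\xD3\xCB";
// Instead of utf8_decode($utf8_input):
$latin1_again = mb_convert_encoding($utf8_input, 'ISO-8859-1', 'UTF-8');
echo bin2hex($latin1_again), "\n";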
In PHP 8.2, the functions iterator_count and iterator_to_array accept all iterables. The iterator_to_array() function copies the elements of an iterator into an array. The iterator_count() function counts the elements of an iterator. These functions accept an $iterator as the first argument. In PHP 8.2, the type of the $iterator parameter has been widened from Traversable to Traversable|array so that any arbitrary iterable value is accepted.
The following script demonstrates their use with both arrays and Traversables.
<?php
$a = array(1 => 'one', 'two', 'three', 'four');
$iterator = new ArrayIterator($a);
var_dump(iterator_to_array($iterator, true));
var_dump(iterator_to_array($a, true));
var_dump(iterator_count($iterator));
var_dump(iterator_count($a));
The output is as follows:
array(4) { [1]=> string(3) "one" [2]=> string(3) "two" [3]=> string(5) "three" [4]=> string(4) "four" }
array(4) { [1]=> string(3) "one" [2]=> string(3) "two" [3]=> string(5) "three" [4]=> string(4) "four" }
int(4)
int(4)
Summary
In this article in the PHP 8 series, we discussed the new features related to functions and methods, the most salient being named function parameters/arguments, a simplified callable syntax, and interruptible functions called Fibers.
In the next article in the series, we will cover new features for PHP’s type system.
MMS • Ben Linders
Article originally posted on InfoQ. Visit InfoQ
Applying ideas from psychological safety can enable people to speak up in teams about what they don’t know, don’t understand, or mistakes they have made. Trust and creating safe spaces are essential, but more is needed. People need to feel that they will not be punished or embarrassed if they take interpersonal risks.
Jitesh Gosai shared his experience with psychological safety at Lean Agile Scotland 2022.
Gosai mentioned that he started noticing people being reluctant to speak up in certain situations or take interpersonal risks:
When I delved deeper into issues, I found that people seemed reluctant to speak in certain conditions, particularly in group settings, where someone might have to admit that they don’t understand or know something. People worried that revealing these things would either slow the team down or that teammates would judge them as incompetent and therefore think less of them. So they often opted to stay quiet or faked it till they made it, which indicates that the team was low in psychological safety.
People naturally look up in the hierarchy for what is and isn’t acceptable, Gosai said. One of the best ways to get people to take interpersonal risks and increase levels of psychological safety is for leaders to show the way by adopting specific mindsets:
The three core mindsets are curiosity, in that there is always more to learn; humility, as we don’t have all the answers; and empathy, because speaking up is hard and needs to be supported.
When leadership thinks through the lenses of these mindsets, it shapes their behaviour and interaction in ways that encourage and help others to take interpersonal risks, Gosai mentioned.
Many people believe that psychological safety is another word for trust and about creating safe spaces. While those aspects are essential, psychological safety is about helping people push themselves out of their comfort zones and take interpersonal risks with their work colleagues, as Gosai explained:
Psychological safety is more than trust and safe spaces; it’s a shared understanding that we don’t have all the answers, and we will get things wrong. We need to be able to share what we know and what we don’t, and people will not be punished or embarrassed if they take interpersonal risks.
InfoQ interviewed Jitesh Gosai about psychological safety.
InfoQ: What themes did you recognise talking to team members and those working closely with teams?
Jitesh Gosai: Over the last three years, I’ve worked with quite a few teams with different technologies and sizes about how they work and the problems they face. And they would all describe scenarios where the team could have resolved a problem if they had spoken to each other. But instead, the issues were left unresolved until the problems became big enough that no one could ignore them, and a manager had to intervene. Or, what happened more often, was the team slowing down over time with no one entirely understanding why.
I’d often hear team leads asking, “Why can’t people just talk to each other?”
InfoQ: What led you to psychological safety?
Gosai: It was around the time I worked with the teams that I became aware of Amy Edmondson’s work in her book The Fearless Organization. In it, she describes psychological safety and how it affects groups, and I began to understand that this was what I was seeing. Amy Edmondson defines psychological safety as the belief that the work environment is safe for interpersonal risk-taking. Interpersonal risks are anything that could make you look or feel incompetent, ignorant, negative or disruptive. The work environment can be any group you work in, for example, your team.
When I would speak with team members about interpersonal risk-taking one-on-one, they would understand the benefits and recognise why it was essential for helping people speak up. But this one-on-one approach wouldn’t work across a whole department, let alone an organisation. So we began looking into ways in which we could make it scale.
InfoQ: What have you learned?
Gosai: We will not convince people that their teams are safe for interpersonal risks through one-off gestures or training courses. It needs to be happening in all our team interactions, from group discussions to one-to-one meetings and everything in between. Psychological safety needs to become a part of your organisational culture; otherwise, it becomes one of those fads that teams try but don’t get any benefits from, so they go back to business as usual.
MMS • Filip Verloy, Pedro Aravena, Dave Malik, Elizabeth Zagroba, Renato Losio
Article originally posted on InfoQ. Visit InfoQ
Transcript
Losio: I would like just to give a couple of words about today’s topic, what we mean by best practices for API quality and security. APIs are essential in software development. It’s nothing new; they have been for many years now. They are essential to access and process cloud and third-party services. In this roundtable, we are going to explain how to protect our APIs and how to discover, for example, API-related security issues. We will discuss how, in general, we as developers and engineers can improve the quality and the security in our API design and management, as well as what tools we can use. For example, API gateways and open standards: can they help us? Let’s see how we can implement and manage API projects with a reasonable security strategy and development mindset.
Background, & Journey to API Security and Quality
My name is Renato Losio. I am the Principal Cloud Architect at Funambol. I’m an InfoQ Editor. I’m joined by four industry experts on API quality and security. I would like to give each one of them the opportunity to introduce themselves and give a short introduction on, basically, their journey to API security and quality.
Aravena: I’m Pedro. I’m a software engineer, and now developer advocate for a security company. We basically have two products right now. One of them is related to APIs. We’re working on a platform where DevSecOps teams can choose to build applications securely, and all about modern threats. That’s my API story.
Malik: I’m Dave Malik. I’m the CTO and Chief Architect at Cisco. I run the Americas CTO function. More importantly, we’re working with a lot of our clients who are building cloud native applications and driving large scale automation. APIs are the underpinnings of large scale automation and next-gen application development. We’re spending a lot of time with them, and really guiding and learning in terms of, how do we drive security, good API hygiene, and versioning? Then really taking the entire DevSecOps notion all the way to the left, where API security is embedded into the design process and the development process.
Zagroba: I’m Elizabeth. I am the quality lead at Mendix. I’m serving our data and landscape department. I support seven different teams as they build products for our low-code platform. The thing that we build allows our enterprise clients to build their own applications. My department is specifically looking at the problem of, how do you use and share APIs across a whole company? How do you catalog them? How do you drag them into our tool to be able to use them easily? What standards do we support in doing that?
Verloy: Filip Verloy. I’m the field CTO for the EMEA region at Noname Security, which is a startup that focuses purely on securing APIs. Before that I was at another startup for about six years building an API-first platform. That’s my entry to the world of APIs. I’ve been using them as a practitioner. Then I figured out that we also need to secure them. That’s what I’m doing next. To today’s point, we’re also looking at API security both from a developer perspective, so shifting left, as he pointed out rightfully, all the way into production, and how both can influence each other into a better design going forward.
API Quality and Security, the Starting Point
Losio: What do you think is your first advice to someone like myself in the dev world that has worked maybe for many years, as a developer, as an engineer, but has not paid too much attention on API quality and security? He knows that it’s out there, but it’s not really his primary focus. Where should someone start?
Verloy: I think that’s essentially a good description of how most people look at APIs. Everybody knows APIs are out there, but they have shifted in function quite a bit. If you go back a couple of decades, or maybe just a couple of years, not so long ago, APIs were mostly used for configuration. You could maybe use an API to change the settings of a network device or something like that. That’s the assumption that I feel a lot of people still have when talking about API security in general. APIs are now everywhere; they’ve become very prevalent. They’re the pipelines into the data as well these days. We need to spend a bit more time understanding really what APIs are, what the capabilities are, and especially from a security perspective, potentially what type of vulnerabilities they would enable. From a developer perspective, it’s a good first step to really try and understand the API landscape in general, like what type of APIs are there? Then, what type of coding best practices could you use from a developer perspective to build standardized and consistent APIs? I think consistency is definitely key when it comes to building secure APIs.
Losio: Pedro, I read one of your articles about API security recently. I wonder if you have anything to add and anything to share in terms of how to help a developer that is not really on the API yet, more so on the security aspect.
Aravena: Everything that Filip has said, 100%. As a developer advocate for an encryption-as-a-service company, I would add something around encryption. I think it’s pretty important to understand the concept and how to do it, not only when data is at rest, talking about APIs, but also in transit. In this scenario, the API would use TLS and authorization tokens to transmit data securely; that is pretty important to understand. The data that the API is accessing should also be encrypted. I would add that encrypting all the data is key for the API development process.
Losio: It’s easy as a developer to think, I got the basics. I know that then it’s always pretty hard to work on it. More so, it’s always pretty hard to work on the quality as well of what you provide to end users. Elizabeth, with your experience of quality and testing on the API, if you have any specific recommendation in that field?
Zagroba: I think I agree with what Filip said about consistency. I would add discoverability to that. Just naming the things in a way where you can tell what is on a deeper endpoint, or having the collection call be the same path as an individual call for a REST API. That’s something I’d recommend.
Malik: I think the other panelists are spot on. If you look at, standards are important. In the open source community, you can argue if standards are followed or not followed. The OpenAPI specification is really important, where customers are aligned to in Cisco as well. If you look at some of the tools that are in the market today, such as API Insights, APIClarity, and so on, we will really want to make sure that discovery process is there as well, as was just mentioned. We want to understand what is out there, because most customers don’t even know what’s out there: most developers. Once we understand what is actually running in our infrastructure, then we can actually start scoring those APIs and seeing if they’re secure, or not. Then as new development paradigms and new software is getting pushed out, is it conforming to our standards? One is, you have to know what the standard is. If you don’t know what’s out there, go discover it, to the point that you made, and then secure it, leveraging mechanisms such as TLS. Then authentication mechanisms, policy and identity integration. Visibility is really important. Then start integrating into your DevOps pipelines as you’re driving through VS Code and other tools.
The Shift Left Mindset
Losio: I was wondering, as I say, it’s really not my main area of competence. I’ve been hearing the word shift left for the last few years. I even think at this point it’s overused, if I’m the one who uses it so often. I wonder what experts in the field think about the concept of a shift left mindset? How do you address that with your customer? How are you engaged when you’re creating and [inaudible 00:11:27] with that?
Verloy: If you think about shifting left from a developer perspective, so I think we are asking developers to do more. There’s also now this big push that I see in the industry around making sure that we don’t burn people out. The idea is, we want developers to not only think about security early on in the design process, we also want them to think about observability, and so on. We’re trying to push a lot of these things left, hence the term shift left, so earlier in the development lifecycle. I think, at least from my perspective, something that could be really helpful is to make sure if you introduce security earlier in the development lifecycle, you do it in such a way that it doesn’t really become an additional barrier to the developer.
Dave mentioned DevOps and things like CI/CD pipelines. I think it’s important to build tooling that neatly integrates into the existing workflow of a developer today, because we cannot really expect all developers to all of a sudden be security experts. What we can do is we can give them the tools and the standards so they can leverage those in the same workflows they are using today, which will hopefully lead to a better outcome. Giving them those tools and giving them those standards, those can be influenced, of course, by security professionals and security specialists, hopefully, then working together to something that makes sense from a developer perspective. In my perspective, nothing chases the developer away more quickly than saying, here’s another tool with another web interface that you need to log into, to develop your code. People won’t accept that. It needs to be a natural part of their workflow. I think that’s the only way we’ll get there, otherwise, shift left I think is dead on paper.
Security as a Consideration, from the Inception Phase
Losio: Shouldn’t API security be considered from the inception phase, basically, in terms of design?
Malik: One is, I think we would all agree that security is important. That goes without saying. I think where folks are sometimes lacking in understanding, and at the community level we're all raising awareness, is agreeing on what the security standards are for that organization. Someone in healthcare may have different, or tighter, security standards than someone in a non-regulated industry. Setting those standards at the firm level is super important, and aligning to industry specifications, so these are not custom. Then the developers can start embedding those pieces into their code and those processes, which are really important.
The other piece is around automation. Security automation is still fairly early in the development cycle, because a lot of our developers are coding on multiple platforms, multiple languages, so on and so forth. How do you ensure we can do things at scale? That's what developers are also really considering: for any tool or process that we build, how can we do it at scale from an automation perspective? Then fix the bugs, the code, or the vulnerabilities when we find them. Sometimes they're in a backlog. Typically, backlogs take a while to groom. I think we all know that. Shifting left, finding a security vulnerability at the point of inception and automating the fixes at scale, I think that is a key area for shifting left moving forward.
Encryption at Rest and Encryption in Motion
Losio: I notice as well an interesting comment about encryption at rest and encryption in motion. There are two aspects: some people forget them, and some people take them for granted and don't mention them. Absolutely, I think that everyone knows about their importance. Sometimes I have the feeling that cloud providers and service providers maybe make them harder to use, in the sense that it's just a checkbox, but it's more expensive, or it's not the default, or whatever else. I don't know if anyone has any experience with that, or any suggestion on how to overcome any security barrier on the encryption side; it's often seen as a limitation.
Security Standards for APIs
Which security standards are available for APIs specifically?
Verloy: To Elizabeth's point earlier, she mentioned REST APIs, for example. When it comes to security standards for APIs, so many standards are available. In IT in general, it's something we like to do: we can't really agree on anything, so we just create a new standard. From a security perspective, it's hard. What I'm seeing, at least in my customer base, is that the vast majority is using RESTful APIs at the moment, probably like 70%, maybe even 70%-plus. We've seen GraphQL. We've seen some gRPC. From a security perspective, if you look at the full picture, and this goes to Dave's point as well, if you don't have the full oversight, the full inventory of everything that you're using, if you can't see it then you can't secure it. Oftentimes, we see that customers are using SOAP XML here, GraphQL there, REST APIs somewhere else. From a security standards perspective, it's a tricky scenario. Let's compare a RESTful API versus a GraphQL API, for example. With a REST API, for each and every piece of functionality you have a different endpoint, essentially, so from a network security device perspective you could protect each endpoint individually. With a GraphQL API, you typically have one endpoint. Depending on the query, you get some piece of data back, but it presents itself with all of the functionality that it has from that API perspective. If you put a security device in between, you can't really predict at all what is going to happen, because almost every use case, or every user talking to that API endpoint, is trying to generate something different in terms of output. Protecting APIs, in that sense, becomes really tricky.
Of course, from a security standards perspective, looking early on into the development lifecycle is where I think we're converging. There are many things you can do, for example, linting, making sure that you have consistency in the type of API specification that you deliver. That's something you can do on the basis of external validation, and that validation is something that you as an organization can create yourself. You can have an external rule base that says, if I'm going to design an API, I'm going to compare it with this rule base, so I'm going to do my linting operation. If it doesn't fall into that design aspect that I agreed upon before, then I'm not going to pass it down the line as a developer. There are many ways, I think, in which you can look at security standards for an API.
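As a rough illustration of that kind of external rule base, here is a minimal Java sketch, assuming the open-source swagger-parser library is on the classpath; the spec file name and the single rule (every operation must carry a security requirement) are examples only, not a recommended rule set.

import io.swagger.v3.oas.models.OpenAPI;
import io.swagger.v3.oas.models.Operation;
import io.swagger.v3.parser.OpenAPIV3Parser;

public class SpecLint {
    public static void main(String[] args) {
        // Hypothetical spec location; replace with your own OpenAPI document.
        OpenAPI api = new OpenAPIV3Parser().read("petstore.yaml");
        if (api == null || api.getPaths() == null) {
            System.out.println("Could not parse spec");
            return;
        }

        boolean hasGlobalSecurity = api.getSecurity() != null && !api.getSecurity().isEmpty();

        // Example rule: every operation must declare a security requirement,
        // either locally or via the global `security` section of the spec.
        api.getPaths().forEach((path, item) -> {
            for (Operation op : item.readOperations()) {
                boolean hasLocalSecurity = op.getSecurity() != null && !op.getSecurity().isEmpty();
                if (!hasGlobalSecurity && !hasLocalSecurity) {
                    System.out.println("Lint failure: no security requirement on " + path);
                }
            }
        });
    }
}

In practice a check like this would run in the CI pipeline, so a spec that violates the agreed rules never gets passed down the line.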
Differences between Internal and External APIs
Losio: Elizabeth, from your perspective, do you see any difference between internal and external APIs, and how to approach them? Do you have any suggestion in that sense?
Zagroba: The thing that I see with our clients is just making that first call and making sure that they have the authentication details correct, that they know where to set that up, that it is clear in the specification, and that we have examples for them of how to call the API. Just really having a clear spec or a walkthrough, something a little bit more in-depth even, is really what our customers need to be able to work with our APIs successfully. When you have an internal API, a lack of spec, or a lack of a clear spec, means you just end up going and talking to the other team. You send them a Slack message, or you send them an email; there are other ways around it. With a public API, either that feedback loop is a lot slower, or that's not the best way to get in contact with the people who own it. So yes, the documentation is where I would start for having good quality externally.
Losio: Dave, do you have anything to add, recommend in that sense? Do you see from your perspective, any major difference between internal, external, or from authentication?
Malik: Ideally, there shouldn't be much difference. Internal APIs shouldn't be treated any differently than external APIs: it's the same standards. If we follow the same standards, a lot of our products, if not all of them, use the same APIs internally that our customers use, when we make calls into our own subsystems and microservices. We treat our development environment just as if we were giving those APIs to external customers, and we build on top of them. We don't really differentiate. Again, the API standards are really important, authentication, authorization. A lot of the issues that we clearly see, whether internal or external, are these zombie APIs or deprecated APIs that folks built at some point. Externally, maybe they become less relevant to keep up to date versus internal ones. Versioning becomes very important: the version that you're running internally versus externally should be almost identical, if not maybe one release off. You maybe want to test internally before you publish those APIs externally.
Yes, versioning, and deprecated APIs that are just sitting out there for a long time, open source APIs, we tend to see a lot of folks being a little more relaxed there. That's where vulnerabilities creep in. We have tools from Cisco that we're contributing to in the open source community that really look at a lot of these problems: making sure we have versioning, do we have the right change log? Can we keep track of everything that was changed in the API? What other enhancements are coming in from a security perspective? Just keeping a closer eye on it. I think that there should hopefully be no difference between internal and external. We want our customers to use the same APIs that we do internally.
Encryption, and API Security Best Practices
Losio: Pedro, Dave just mentioned API attacks, security, and breaches. We all know, we all read, that attacks are increasing rapidly and security concerns are always there. What are some proven techniques? As a developer, where should I start, and what should I consider as best practices to improve the security of my APIs?
Aravena: I believe that all network traffic should be encrypted, particularly API requests and responses, as it likely contains sensitive credentials and data. All APIs should use and require HTTPS. Talking about REST APIs, they use HTTP, so encryption can be achieved by using TLS, which we talked about before, or its previous iteration, SSL, the Secure Sockets Layer protocol. These protocols supply the S in HTTPS. Whether or not you're using a platform that provides free SSL, your sensitive data should also be encrypted in the database layer. I would say encrypt data, use TLS or SSL, and require HTTPS whenever possible.
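To make that concrete, here is a minimal sketch, using only the JDK's java.net.http client, of refusing plaintext HTTP at the call site; the endpoint URL is a hypothetical placeholder.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class HttpsOnlyClient {
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    static HttpResponse<String> get(String url) throws Exception {
        URI uri = URI.create(url);
        // Refuse plaintext HTTP outright so credentials and data always travel over TLS.
        if (!"https".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("API calls must use HTTPS: " + url);
        }
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        return CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint for illustration only.
        System.out.println(get("https://api.example.com/v1/orders").statusCode());
    }
}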
When It’s Ok to Have Something Less Secure
Zagroba: Is there a time when it’s ok to have something that’s less secure? If it’s just an internal API or if it’s just for a demo? When would you let some of the stuff go?
Malik: I think we should not let anything go. Everything should be secure from the beginning. It should just be built in, ingrained into your DNA. If we give too many choices out there, we're humans, we may make the wrong choice at the wrong time. I think security is paramount, whether it's internal or external, because a lot of attacks, as we know, sometimes come from inside the organization as well, whether they're accidental or intentional. At Cisco, we follow a very simple philosophy: trust nothing, verify everything. I think that's getting more embedded into our customers as well, as we move toward a zero trust environment, even more on the development side than the networking side.
Why Security Practices are Sometimes Not Enforced
Losio: I was wondering as well, reading that comment about a design-first strategy: basically, build the contract, mock the contract, engage with the customer early and get to a consensus. Once the consumer agrees with the contract, the actual development with all the security best practices should be enforced. I love it. The problem is, I always wonder, I see many projects where that doesn't happen. Do you think it's because developers are fundamentally lazy, or is it a matter of time and resources, or of not fully understanding that maybe security is not as complex as we think it is?
Zagroba: If you’re working in an agile environment, or in particular, a lean environment, you’re trying to get the smallest piece of software out quickly. You’re not trying to get the whole product out all at once, necessarily. It might be that speed and time to market is a higher priority commercially than having everything secure right away, particularly if you have just a small set of beta customers. I find myself more often in a position of finding, what is the right balance? What have we promised people? How quickly can we get enough out to them, and then add some of those things on later, whether it’s refactoring or security stuff.
Losio: You're basically saying that often it's a balancing act: we need to get something out there to test it, or to prove that there is a market, or anyway because we need something live. Sometimes security is not a problem but is seen as an extra effort at the beginning. I'm not saying people are compromising, but maybe they're sometimes taking some shortcuts.
Zagroba: Or as Filip said at the beginning, not everyone is a security expert as well. Particularly if you are working remotely, and you’re working independently for a lot of that time, then building and working on something that you’re not the expert in, it may be that it’s not the highest quality. My recommendation for that is more pairing and more ensembling, just working together with people at the same time rather than doing asynchronous code review at different times, just so that you can shift some of that thought process left, as we were saying earlier.
Quality That Is Not Security Related
Losio: We've really focused a lot on security. What do we usually mean by quality that is not security related? What do you mean by API quality beyond that, whether it's the phases of a project, the contract, or any other aspect?
Zagroba: It depends on the person. Quality is value to some person who matters, said Jerry Weinberg. You have to talk to the people and decide whether the thing that is most important to them is the speed or the accuracy or the security of it. It will depend on your context.
API Gateways vs. Service Mesh
Losio: I see an interesting question about API gateways versus service mesh. North-South is API gateways, East-West is service mesh. API traffic has different security semantics, so is there any set of security best practices applicable to each arena?
Verloy: I think we do have to differentiate a little bit between infrastructure related APIs and application related APIs. If we talk about a service mesh, we're typically talking about maybe a Kubernetes environment or something like that. To Pedro's point, it's going to be using mTLS to encrypt everything between all of the nodes, which is a good best practice. From a "how can we validate API transactions and API communications" perspective, that makes our job more difficult. We do have to figure out a way to do that; eBPF could be a good way to help us out there. If you talk about a service mesh versus a gateway or a device like that, you mentioned East-West versus North-South, it does play a role there, to go back to Dave's earlier point as well. If we focus on a gateway, we're going to see only the APIs that are routed through that gateway. We're going to miss probably the majority of the actual APIs that a customer has in use. From a security perspective, that's not going to be good enough; we want to see everything.
The idea really is we have to understand all of the pieces of the network where API communication is happening. That's the only way to truly capture East-West and North-South API traffic in an environment. A service mesh can certainly help for some of that. When we're talking about application related APIs, we're typically more into the gateway piece of the house. Again, a gateway is a very good, but very focused, security control that doesn't really protect you against the most common types of API attacks. If you look at the OWASP API Top 10, number one on the list is broken object level authorization. You're going to be hard pressed to find a gateway that can protect you against those types of attacks, because those are typically related to some issue in the business logic inside of the API or the application itself. There are definitely tools that can help us. We must take care not to be too myopic, and really understand API security from a much broader perspective. Step one is to get a full inventory of all of the APIs that you have, and then we can talk about securing those. A gateway is definitely a good point, but you need to look at the broader picture.
Risk Scoring
Malik: I think the question was around risk scoring, things of that nature. That's very important. Look at the idea of how we score our vulnerabilities, how severe they are, and how we create an index for our critical APIs. I think alignment between the developer team and the security team is really important. Sometimes they can be at odds, if you have a risk, compliance, and cyber security team in the organization and you have a pure development team. We should align on how we're managing and scoring the vulnerabilities on APIs, so everybody's on the same page; it's very important that we know whether a vulnerability is a severity 1, severity 2, or severity 3. Severity 1 is obviously business impacting and mission critical, because these are external vendor APIs or partner APIs where you're doing commerce, and they could have immediate business impact. Denial of service on APIs is very important to keep track of as well.
Authorization Mechanisms in OpenAPI
Losio: I have an interesting question about authorization mechanisms in OpenAPI. Is there any benefit in terms of security in using OAuth 2 over HTTP, or JWT, or whatever, or do both offer the same?
Verloy: This goes to some extent back to, should there be a difference between internal APIs and external APIs? I do agree in theory that there shouldn't be. What we're seeing in practice, though, is that it's typically on the authentication and authorization piece that we start to see the difference. People are doing basic authentication, getting a JSON Web Token back for an internal API, and happy days. Then external APIs are using things like OAuth 2, and so on, or OpenID Connect. There is a difference in the level of trust that you can assume between those different types of authentication and authorization mechanisms. The problem is always going to be that the internal API, with good intentions, is going to escape and become an external API at some point, and now you have an external API which uses basic authentication, which might not be the best idea. I am definitely in the camp of shooting for the highest level of authentication and authorization from a security perspective. Things like OAuth 2 and OpenID Connect, I think, definitely have a role to play.
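For illustration, here is a hedged Java sketch of the standard OAuth 2 client credentials grant (RFC 6749) followed by a bearer-token API call, using only the JDK HTTP client; the token endpoint, client id, secret, and API URL are all hypothetical placeholders, and a real client would parse the JSON token response rather than stub it out.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class OAuthClientCredentials {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Client credentials grant: exchange a client id and secret for an access token.
        String form = "grant_type=client_credentials"
                + "&client_id=my-service"
                + "&client_secret=change-me";
        HttpRequest tokenRequest = HttpRequest.newBuilder(
                        URI.create("https://auth.example.com/oauth2/token"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
        HttpResponse<String> tokenResponse =
                client.send(tokenRequest, HttpResponse.BodyHandlers.ofString());
        System.out.println("Token endpoint status: " + tokenResponse.statusCode());

        // A real client would extract access_token from the JSON body; stubbed here.
        String accessToken = "...extracted from tokenResponse...";
        HttpRequest apiRequest = HttpRequest.newBuilder(
                        URI.create("https://api.example.com/v1/orders"))
                .header("Authorization", "Bearer " + accessToken)
                .GET()
                .build();
        System.out.println(
                client.send(apiRequest, HttpResponse.BodyHandlers.ofString()).statusCode());
    }
}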
OpenAPI Specs
Zagroba: The best specs are ones that are auto-generated and that people will read. Whether that’s OpenAPI, or some other spec, I think matters less than that they’re up to date. I like OpenAPI, but it’s a personal taste.
Malik: I would agree. Cisco is fully on the OpenAPI bandwagon, and we conform to it; we're actually contributing to it. We like standards. We like contributing to standards. Anything on standards is super important. We embed that into our solutions and our discussions, and obviously into the developer mindset.
Verloy: Additionally, if you look at OpenAPI, because we're talking about security, one of the major advantages there, I think, is that it allows you to use a security scheme as part of the OpenAPI spec. You can define authentication and authorization rules as part of the scheme. Just because it's there, it helps developers think about it. That's already a really good step.
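As a hedged sketch of what such a scheme can look like, the snippet below builds an OpenAPI model with a bearer-token security scheme and applies it globally, assuming the swagger-core model classes (io.swagger.v3.oas.models) and its Yaml utility are on the classpath; the API title and scheme name are arbitrary examples.

import io.swagger.v3.core.util.Yaml;
import io.swagger.v3.oas.models.Components;
import io.swagger.v3.oas.models.OpenAPI;
import io.swagger.v3.oas.models.info.Info;
import io.swagger.v3.oas.models.security.SecurityRequirement;
import io.swagger.v3.oas.models.security.SecurityScheme;

public class SecuredSpec {
    public static void main(String[] args) {
        // Declare a bearer (JWT) security scheme and apply it globally,
        // so every operation in the spec requires authentication by default.
        OpenAPI api = new OpenAPI()
                .info(new Info().title("Orders API").version("1.0.0"))
                .components(new Components().addSecuritySchemes("bearerAuth",
                        new SecurityScheme()
                                .type(SecurityScheme.Type.HTTP)
                                .scheme("bearer")
                                .bearerFormat("JWT")))
                .addSecurityItem(new SecurityRequirement().addList("bearerAuth"));

        System.out.println(Yaml.pretty(api));
    }
}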
How to Drive Open Standards in the Software Development Lifecycle
Losio: I always wonder, as Dave mentioned open standards and their importance. I think we all agree that open standards are important, even essential, to drive innovation. How important do you think it is to contribute to a standard? How does it work? Is it just a big operation, just hoping for the best? What could we do? We already mentioned a few different standards. We agree that it depends on taste, it depends on different things, but how do we drive those standards, and how do we embrace them in the software development lifecycle as well?
Malik: One is actively contributing: contributing code, contributing to specifications, and aligning to specific workstreams that we can go after. An API spec or standard is very broad. Whether you're going after security, authorization, or encryption, there are a bunch of areas where folks can contribute. As you become part of the community, and you're interacting with the community in the industry, you gain a lot of knowledge, but you're also contributing. I think it makes everybody better at the end of the day, and then there's awareness. I think education is something that we can all drive further in the industry, and some of these standards organizations where folks come together drive awareness and education, and hence better practices. I learn a lot every time I go in and start reading and looking at all the code that's being checked in and published, what folks are downloading, what's active, what's not, and how they're using it in terms of use cases. That's very important because folks are using code in different ways in their software development lifecycle. Just understanding how people are using certain APIs is very interesting, and you learn a lot from a best practices perspective as well, through being part of a lot of these standards bodies and forums.
Getting Started on Learning the API Standards
Losio: Essentially, the things that you mentioned, that bit of education and learning, are an interesting topic for developers like myself who know what is out there but don't really have direct experience with learning the standards and playing with them. Do you have any recommendation on how to start working in that sense? Where should I start? Should I start from a problem? How can I learn something new in that space, whatever it is, OpenAPI or anything else?
Zagroba: What I would want to say is, read the whole standard. I don’t think that’s something that someone just starting out is really going to be able to do.
Losio: Video, courses?
Malik: If we look at what Cisco is doing, we have an organization from a developer relations perspective that focuses strictly on the developer. If you go to developer.cisco.com, just to give you an idea, that covers infrastructure based developers and pure application, cloud native developers as well. A lot of the education doesn't have to be, to your point, reading a spec or going into a Git repo, downloading code, and understanding it. That might be a little bit daunting for folks that are starting out. Start with basic learning, getting a common understanding, and getting access to some of the free tools. There are a lot of free tools out there; take advantage of them. If you go to developer.cisco.com, you could be a novice and start slowly, get the basics and fundamentals, which are really important. Then as we build on those fundamentals, we can start building applications and learning how to interact with infrastructure, whether it is on-prem or off-prem. Free resources are super important; getting access to them is critical, in addition to reading the specification, of course.
Verloy: I definitely agree with all that's been said. I think from a developer perspective, I always find that it really helps if you have a specific goal in mind. Learning something just to learn something, I can't do it myself. If there's something that I can build, and I'm going to fail 100 times, and then the 101st time I'm going to succeed in building the small little thing that I'm trying to create, that's going to teach me so much more than watching 100 hours of YouTube video or reading all of this documentation. As I always tell my teenage son, I'm extremely jealous: if you go to YouTube today, you can find anything on anything. It's a free library to learn almost any technical topic. We do make it really easy.
I do want to give one shout-out maybe to the OWASP organization when we're talking about API security. They do publish a bunch of open source tools as well. One tool that we regularly use is called crAPI, the Completely Ridiculous API. It's essentially a broken web application that you can spin up in a Docker container or something like that. What you get is an API that's completely vulnerable. You can go in and then you can see how you can break it. You can start to understand how attackers are thinking when they see your API appear in public, and what type of techniques they will attempt to use to gain access. Things like that really get you thinking differently from a developer perspective, to also see that it's not about me sending all objects to the frontend and having the frontend deal with my stuff. It's actually, I should really set scoping in my standard and so on. I think that's super helpful as well.
Losio: I really like the idea of a tool, something to break, something that might be ridiculous by definition, or as simple as a project without a real use case, but that really helps you play with it.
API Testing, Beyond Rule Checking
Talking about OWASP rules: would you recommend any tool for automatically testing the security of an API that goes beyond checking the rules? For example, providing malformed or invalid input, or missing authorization, for a firewall setup. I see that there's been a bit of discussion there, with feedback along the lines that there is no single tool. I wonder if anyone has advice in terms of testing an API.
Zagroba: You could automate all of those things, but why would you? I think a lot of that stuff you would catch once and then you'd fix it. You wouldn't need it to be part of your regression suite and the things that you're running over and over again. I can recommend the test heuristics cheat sheet as far as coming up with a list of all the different things you might want to try. It's a really good place to start for brainstorming some of those things. Not everything that you test needs to also be automated.
Losio: If you're a member, you get access to SecureFlag, and you get a chance to play with broken web apps. Again, go there and test things, play with things and see how things happen, not just watch the video, which might be exciting but might not help you.
Verloy: From a tooling perspective, I agree that it's tricky. There's probably not a single tool that captures it all. The reason I say that is your attack surface manifests itself in different ways when you think about an API. If you're looking from the outside in, there are probably things that you're doing that make your APIs less secure that have nothing to do with your API design. For example, you're leaking API keys in your GitHub repository, as a silly example. That could make you really insecure, but it has nothing to do with how good or bad the developer was in developing the API, or how good or bad the security people were in putting in gateways and firewalls and all of that other stuff.
There are a lot of API validation and security testing tools that rely on things like fuzzing: throwing random input at an API endpoint and seeing what happens. It's a valid approach in a lot of cases. The problem with that is that the majority of real, sophisticated API breaches are business logic issues, people misusing the business logic of an API. Coinbase, in February of this year, had a good example of that, where people could trade cryptocurrencies other than the ones they held in their own wallet. Testing business logic, you can't really do with fuzzing; you really have to understand the objects that the API is working with. There, you have to look at really specialized tools, or maybe even pen testers: people who do this manually, know the design of the API, and try to figure out how to break in.
Losio: It’s not just about automated tests.
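To show both what fuzzing gives you and where it stops, here is a naive Java sketch using the JDK HTTP client; the target URL is a hypothetical placeholder, and it should only ever be pointed at systems you are authorized to test. It may surface crashes on malformed input, but, as Verloy notes, it will not find broken object level authorization or other business logic flaws.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Random;

public class NaiveFuzzer {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        Random random = new Random();
        // Hypothetical endpoint; only fuzz systems you are authorized to test.
        URI target = URI.create("https://api.example.test/v1/orders");

        for (int i = 0; i < 25; i++) {
            byte[] junk = new byte[random.nextInt(256)];
            random.nextBytes(junk);
            HttpRequest request = HttpRequest.newBuilder(target)
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofByteArray(junk))
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            // 5xx answers hint at unhandled input, but business-logic flaws
            // such as broken object level authorization will not show up here.
            if (response.statusCode() >= 500) {
                System.out.println("Server error for random payload #" + i);
            }
        }
    }
}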
Being Proactive On Threat Detection and Securing External APIs
Often, we have seen third-party customers having attacks that bleed into their use of our APIs, spam, distributed denial of service. Can you speak to being more proactive on threat detection and securing external APIs?
Malik: Live monitoring is obviously important to understand if there's a denial of service attack going on. The other area is when we interface with partners, we want to make sure their APIs are secure. One thing is you protect yourself, but you also make sure you protect yourself from others. Then if you look at tools, to hit the security topic from before, we have tools such as APIClarity, as an example, with various different modules. We're looking at tracing. We're looking at how authentication calls are happening, how authorization calls are happening. Are there too many coming at the same time from a flow perspective? Coupled with some of these tools that we have from Cisco, and also industry shrink-wrapped tools, we can really look at these denial of service attacks more closely, and then be able to put the right protection mechanisms in place. We have to be able to trace the problem. We have to be able to look at the header and make sure the header is right. Can we see inside the payload, can we see what the body looks like? Are there versioning pieces? Is somebody exploiting a denial of service attack because the API version is three releases back, as an example? Then we won't even allow them to communicate with us. Typically, what customers are starting to do, with developers, is say, this is the minimum requirement, and below this minimum requirement we won't even let you communicate with us. It's a tough call, but sometimes you have to protect yourself. I think some of these tools that we have in the market allow us to make sure that for some of these exposures, such as denial of service or misuse of credentials or encryption, we cut the communication off pretty quickly, or start absorbing them in a honeypot and see what that attacker, that threat actor, is trying to do.
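As a small illustration of the minimum-version gate Malik describes, here is a hedged Java sketch; the version header format is hypothetical, and a real deployment would enforce this in a gateway or middleware filter rather than a standalone class.

public class MinimumVersionGate {
    // Reject clients below a minimum supported API version.
    private static final int MIN_MAJOR_VERSION = 3;

    static boolean isAllowed(String versionHeader) {
        try {
            // Hypothetical header format "v<major>.<minor>", e.g. "v3.1".
            int major = Integer.parseInt(versionHeader.replaceFirst("^v", "").split("\\.")[0]);
            return major >= MIN_MAJOR_VERSION;
        } catch (RuntimeException e) {
            return false; // Malformed version headers are rejected outright.
        }
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("v3.1")); // true: meets the minimum
        System.out.println(isAllowed("v1.0")); // false: too far behind
    }
}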
Preferred API Security Approaches
Losio: For me, say I'm using a Web Application Firewall, API gateways, and other more traditional approaches against malicious activity concerning APIs. Do you think these approaches are effective? What else should a developer consider? Is that enough, or am I just burying my head in the sand and hoping for the best?
Malik: WAFs and API gateways are important in application design. You have the cloud providers that offer API gateway as a service, and obviously Cisco has WAF products as well. In general, they're not enough to completely shift left. We need a deeper level of security in the code at compile time, design time, and of course runtime as well, in addition to API gateways. An API gateway just puts a border at the edge and makes sure everybody authenticates in and out. It's not going to give us that level of detailed security at the code level that developers are asking for, but API gateways are important. They're complementary; I wouldn't say one will supersede the other. I think API gateways do a very poor job of inventorying an infrastructure: inventory tracking, API tracking, things of that nature. If you want to do risk assessment, it's very difficult for API gateways to do a full risk assessment, which security specific tools or APIs typically do today. Some other security angles are missing.
Verloy: I absolutely agree. Back to the earlier point, the API gateway is definitely a good tool to some extent, but it can only see the traffic that passes through it, and to Dave's point, most attacks happen after successful authentication and authorization. From the gateway's perspective, the attacker is behaving like a normal user, and then trying to misuse the business logic. Ultimately, these are signature-based solutions; they can block known attacks. If you look at an API, and especially in-house, custom developed APIs, these are all different, and they all have their own unique logic. A gateway is really meant to block a transaction. If you look at an API and the amount of communication that's happening across it, it's really different from a standard web application. The number of calls that you make across API endpoints from a single user perspective, that's essentially what you need to understand and track if you really want to figure out whether a particular user can be linked to malicious activity down the line. Linking that particular user then goes back to, how have we authenticated this user? Are we using JSON Web Tokens? Are we using something else? How can we link the malicious attacker to all of the activity that he or she is generating? It's more about context, to Dave's point. You need a lot more context when it comes to APIs than what these traditional tools can provide.
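A minimal sketch of that kind of per-user context, assuming the user identity has already been extracted from the token: a naive in-memory counter that flags unusually chatty callers per endpoint. Real deployments would need sliding windows, distributed state, and behavioral baselining, so treat this as an illustration of the idea only.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class PerUserCallTracker {
    // Counts calls per (user, endpoint) so unusual access patterns can be flagged.
    private final Map<String, AtomicInteger> counts = new ConcurrentHashMap<>();
    private final int threshold;

    public PerUserCallTracker(int threshold) {
        this.threshold = threshold;
    }

    /** Returns true when this user's calls to the endpoint exceed the threshold. */
    public boolean record(String userId, String endpoint) {
        String key = userId + "|" + endpoint;
        int calls = counts.computeIfAbsent(key, k -> new AtomicInteger()).incrementAndGet();
        return calls > threshold;
    }

    public static void main(String[] args) {
        PerUserCallTracker tracker = new PerUserCallTracker(100);
        // Hypothetical user id and endpoint, e.g. taken from a JWT `sub` claim.
        if (tracker.record("user-42", "GET /api/orders/{id}")) {
            System.out.println("Suspicious enumeration pattern for user-42");
        }
    }
}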
The Adoption Curve of API Quality and Security
Losio: At InfoQ we talk a lot about the adoption curve. We mentioned before that, ideally, we'd love everyone to think about API security and API quality. I was wondering, Elizabeth, what's your feeling? Do you feel at the moment it's something just a few early adopters or the early majority are doing, or is it something that everyone is doing and we shouldn't even talk about?
Zagroba: Not everyone is doing it, I can tell you that. As far as what the whole market looks like and what every developer is up to, I’m not quite sure. Just within my organization, I see a huge variance in where people are in their careers and what is the depth of knowledge they have, and how safe do they feel on their team or in their department to be able to ask for help and reach out to an architect or to a security expert when there’s something that they’re not quite sure how to build, or if it’s in line with what our company standards are. I would imagine that there are probably some of the same issues in other places.
Short-term, Practical Action Items on API Best Practices
Losio: As I said before, I'm a developer and a cloud architect. Thinking about the typical audience who joined this roundtable: you've convinced me about the importance of security and quality in API development. If you had to suggest one action item, something I can do tomorrow morning, in a few hours, what would it be? Something not long term, not redeploying my entire application or rewriting from scratch, because that's not going to happen tomorrow. Just something we can do in the short term, a first piece of advice, basically an action item on what we discussed today. I think some of them have already come up before.
Verloy: Maybe I can give a shout-out to a nice free resource by a gentleman by the name of Corey Ball, who recently wrote a very interesting book called Hacking APIs. He has this online resource, I think it's called apisecurity.com or apisecurity.org, where he actually shows you how to break APIs. Once you understand what the bad guys are trying to do, then from a developer perspective you really get why you should pay more attention to some of these things. That goes back to my earlier point: if you don't know exactly why you're working towards something, you lose interest. If you see it firsthand, I think it really drives home the message that this is extremely important. We all know APIs are extremely important, and broadly so. I would definitely recommend that.
Malik: The key thing is just to get educated. If you go to developer.cisco.com, there are plenty of resources out there; just try the free tools. I like tinkering. I'm sure most developers do as well. There are a lot of resources out there just to get started. In a real environment, you'd have to be super educated in terms of the API security space. To get started, just really learn what you have and run tools against your environment, and you'll discover things that you probably hadn't discovered. It's a good way to start to learn and then see what's out there. Then that will be the next step to move forward.
Zagroba: If you’re someone who’s already working on or with APIs, I think a small thing you could do tomorrow is delete something. Find something old in the README or in the code that’s not being maintained, just make the attack surface smaller for where stuff can go wrong and stuff that needs to be updated. It’ll be easier to build stuff in the future.
MMS • Steef-Jan Wiggers
Article originally posted on InfoQ. Visit InfoQ
Microsoft recently announced the preview of Azure Storage Mover, a fully managed, hybrid migration service.
Azure Storage Mover is a tool that helps users migrate data from on-premises storage to Azure Storage. This tool is particularly useful for users with large amounts of data that they want to move to the cloud efficiently and cost-effectively. With the preview release, the company specifically targets the migration of an on-premises network file system (NFS) share to an Azure blob container.
Azure Storage Mover is a hybrid service with migration agents that users can deploy close to their source storage. All agents can be managed from the same place in Azure, even if they are deployed across the globe. Furthermore, with the service, users can express their migration plan in Azure, and when they are ready, conveniently start and track migrations right from the Azure portal, PowerShell, or CLI.
Under the hood, Azure Storage Mover is backed by a replication engine and is best suited for fast and accurate migration scenarios. Users could choose it for short-term or occasional uses where users do not need to keep two locations in sync at any given time.
A spokesperson from Microsoft explained to InfoQ:
For example, the migration of an application workload and the associated data into the cloud can keep the source (on-premises) in sync with changes in the destination (in the cloud). It evaluates the source and target when a migration job starts, and only the differences are copied according to migration settings. The ability to accumulate data in the cloud towards one target with data ingested from different sources enables Azure Storage Mover users to restructure their cloud solution storage designs. Lastly, Azure Storage Mover can flexibly change settings according to migration needs.
In addition, the spokesperson provided some general scenarios for Azure Storage Mover, such as:
- Migrating content from a source to a target storage location in Azure. These migrations require careful planning. Users can express their migration plans, share project structures that may vary per workload, review their migration plans in Azure Storage Mover, and track migration jobs.
- In an offline bulk data ingestion, Azure Storage Mover can be used to catch up with changes that have happened since initiating the copying process to Data Box.
- Shipments of data. Azure Storage Mover is well suited for scenarios where customers generate or receive large amounts of data on-premises and need to add this new data to an existing repository in Azure. A customer can use an existing source-target plan and integrate triggers and timers to start a new migration job.
The migration service differs from other products in Microsoft’s portfolio. For instance, a solution such as Azure File Sync is backed by a synchronization engine and is better suited for long-term uses where users might need to keep two locations always in sync. For example, a hybrid file server scenario might require both the copy in the cloud and the copy on the server in an office location to be kept in sync. If a file or folder changes in one location, those changes are tracked, and only the change is transported to the other location.
And finally, the Microsoft spokesperson stated:
Microsoft meets customers wherever they are in their cloud journey. Across the planning and migration phases, Azure Storage Mover provides customers with capabilities enabling seamless and confident migrations of their most complex data estates to Azure.
More details on the service are available on the documentation landing page.
Article: GraalVM Java Compilers Join OpenJDK in 2023, Align with OpenJDK Releases and Processes
MMS • Karsten Silz
Article originally posted on InfoQ. Visit InfoQ
Key Takeaways
- The Community Editions of the GraalVM JIT and Ahead-of-Time (AOT) compilers will move to OpenJDK in 2023.
- They will align with Java releases and use the OpenJDK Community processes.
- Existing releases, GraalVM Enterprise Edition features, and other GraalVM projects will remain at GraalVM.
- GraalVM 22.3 provides experimental support for JDK 19 and improves observability.
- Project Leyden will standardize AOT compilation in Java and define far-reaching optimizations for Java applications running in a JRE with a JIT compiler.
As part of the GraalVM release 22.3, Oracle detailed the planned move of two GraalVM projects to OpenJDK. Sometime in 2023, the code for the Community Editions of the Just-in-time (JIT) compiler (“Graal Compiler”) and the Ahead-of-Time (AOT) compiler for native OS executables (“Native Image”) will move to at least one new OpenJDK project. Existing releases, GraalVM Enterprise Edition features, and other GraalVM projects will remain at GraalVM.
Oracle originally announced this move during JavaOne in October 2022 without providing specifics. In mid-December 2022, OpenJDK then proposed Project Galahad for the move.
The GraalVM project is part of Oracle Labs and, thereby, not under OpenJDK governance. GraalVM currently has four feature releases per year and follows a different development process than OpenJDK.
At OpenJDK, the GraalVM Java compilers will align with the Java release cadence of two feature updates per year, four annual patch updates, and one Long-Term Support (LTS) release every two years. The GraalVM OpenJDK project will use the OpenJDK Community processes and submit JDK Enhancement Proposals (JEP) for inclusion in OpenJDK Java releases.
The Graal Compiler is written in Java and uses the Hotspot VM in a Java Runtime Environment (JRE). It replaces the C2 JIT compiler, written in C++, which ships in most Java distributions.
The GraalVM Native Image AOT compiler produces native executables that typically start much faster, use less CPU and memory, and have a smaller disk size than Java applications running in a JRE with a JIT compiler. That makes Java more competitive in the cloud. GraalVM Native Image achieves these optimizations by removing unused code and pre-calculating the application heap snapshot, using the Graal Compiler under the hood. But that also excludes some Java applications from using GraalVM Native Image. InfoQ recently published an article series on this topic.
The Enterprise Edition provides improved Native Image performance, such as the Profile-Guided Optimizations for runtime profiling, but it will not move to OpenJDK. Neither will the other GraalVM projects, such as support for other languages like JavaScript or Python, or Java on Truffle, a Java replacement for the entire Hotspot VM.
The GraalVM Community Edition ships under the GNU General Public License, version 2, with the Classpath Exception. Many OpenJDK distributions, including Oracle’s OpenJDK builds, use that same license. Oracle’s Java distribution, on the other hand, uses the “Oracle No-Fee Terms and Conditions” license. Oracle announced the alignment of “all the GraalVM technologies with Java […] from a licensing perspective” and promised “additional details […] in the coming months.”
GraalVM Release 22.3 Supports JDK 19, Improves Observability
GraalVM 22.3 was the last feature release for 2022. It has experimental support for JDK 19, including virtual threads and structured concurrency from Project Loom. Full support for JDK 19 will come in GraalVM 23.0 at the end of January 2023.
The release contains a lot of improvements for monitoring native executables, an area that lags behind Java programs running in a JRE. The JDK tool jvmstat can now monitor the performance and resource usage of native executables and collect heap dumps for inspection with VisualVM. Native executables can also record the JavaMonitorEnter, JavaMonitorWait, and ThreadSleep events for the free Java Flight Recorder (JFR) tool.
The GraalVM Native Image compiler needs so-called hints about the usage of reflection in Java code. Working with the Spring and Micronaut frameworks, GraalVM launched a public repository of such hints for Java libraries in July 2022. That repository, called “GraalVM Reachability Metadata,” now has entries for Hibernate, Jetty, JAXB, and Thymeleaf.
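For context, here is a sketch of the kind of code that needs such a hint: the reflective lookup below is invisible to the closed-world static analysis, so without a matching metadata entry the method would not be included in the native executable. The class name and the JSON fragment in the comment are illustrative only.

import java.lang.reflect.Method;

public class ReflectiveGreeter {
    public String greet() {
        return "hello";
    }

    public static void main(String[] args) throws Exception {
        // Reflection like this is not visible to Native Image's static analysis,
        // so without a hint the method would be missing from the executable.
        Class<?> clazz = Class.forName("ReflectiveGreeter");
        Method method = clazz.getMethod("greet");
        System.out.println(method.invoke(clazz.getDeclaredConstructor().newInstance()));
        // A matching reachability-metadata entry (reflect-config.json) would look
        // roughly like: {"name": "ReflectiveGreeter", "allDeclaredMethods": true}
    }
}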
The GraalVM Native Image AOT compiler performed 2-3 times faster in select benchmarks. Native executables use less memory at runtime and run integer min/max operations much more quickly. Two optimizations, StripMineCountedLoops and EarlyGVN, added as experimental in 22.2, are now stable and enabled by default.
Native executables can now contain a Software Bill of Materials (SBOM), and the debugging experience has improved by better identifying memory usage and memory leaks.
In GraalVM ecosystem news, the current IntelliJ version 2022.3 has experimental support for debugging native executables, and JUnit 5.9.1 has annotations for including or excluding them.
The Python implementation in GraalVM changed its name to GraalPy. It saw many compatibility and performance improvements, as did the Ruby implementation TruffleRuby. Python and other languages in GraalVM benefit from the experimental availability of the LLVM Runtime on Windows.
This GraalVM release provides a new download option through a one-line shell script for Linux and macOS:
bash <(curl -sL https://get.graalvm.org/jdk)
Project Leyden Optimizes Java With Condensers
Project Leyden is an OpenJDK initiative with a mission “to improve the startup time, time to peak performance, and footprint of Java programs.” Oracle clarified GraalVM’s relationship to that project: it “plans to evolve the Native Image technology in the OpenJDK Community to track the Project Leyden specification.” The original goal of Project Leyden was to add native executables, like the ones produced by the GraalVM Native Image AOT compiler, to the Java Language Specification. After its formal creation in June 2020, the project showed no public activity for two years.
In May 2022, Project Leyden emerged again with a second goal: identifying and implementing far-reaching optimizations for Java applications running in a JRE with a JIT compiler. It did this because the GraalVM AOT compiler enforces a closed-world assumption and must have all application information, such as its classes and resources, at build time. Some Java applications and libraries use dynamic features in Java that don’t work within this constraint. At least some of the optimizations from this second goal will also require changes to the Java Language Specification. Its implementation will rely on existing tools, such as the Hotspot VM and the jlink tool.
In October 2022, Project Leyden detailed how to achieve both goals. It introduced the concept of a condenser that runs between compile time and run time. It “transforms a program into a new, faster, and potentially smaller program while preserving the meaning” of the original program. The Java Language Specification will evolve to contain condensers.
Condensers will define how AOT compilation and native executables fit into Java, fulfilling the original goal of Project Leyden. But condensers will also improve Java applications running in a JRE with a JIT compiler, serving the second goal. The GraalVM Java compilers, the JRE, and the HotSpot JIT compiler continue to receive features and updates independent of Project Leyden, so condensers provide an additional layer of optimization.
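As a purely conceptual illustration of shifting computation to an earlier phase (not of the condenser mechanism itself, which does not exist yet), the sketch below precomputes a table in a static initializer: the kind of work that build-time initialization, and eventually a condenser, could move out of application startup. The class and table are hypothetical examples.

public class StartupConfig {
    // Work done in a static initializer can be shifted out of run time if the
    // class is initialized ahead of time (e.g. at image build time).
    private static final int[] SQUARES = precompute();

    private static int[] precompute() {
        int[] table = new int[1024];
        for (int i = 0; i < table.length; i++) {
            table[i] = i * i;
        }
        return table;
    }

    public static void main(String[] args) {
        // At run time the table is simply read; no work is repeated on startup
        // when the class was initialized ahead of time.
        System.out.println(SQUARES[12]);
    }
}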
As of mid-December 2022, the development of condensers has yet to start. That makes it unlikely that Project Leyden will ship condensers in the next Java LTS release, Java 21, in September 2023. So the earliest possible Java LTS release with results of Project Leyden may be Java 25 in September 2025.
The Java Community Reacts to Oracle’s Move
Spring Boot, with the release of version 3.0 in November 2022, joins the group of Java frameworks that have already supported AOT compilation with GraalVM Native Image in production: Quarkus, Micronaut, and Helidon. InfoQ spoke to representatives from all four frameworks about Oracle’s announcement.
The first responses come from Andrew Dinn, Distinguished Engineer in Red Hat’s Java Team, and Dan Heidinga, Principal Software Engineer at Red Hat. They published an InfoQ article in May 2022, arguing for a closer alignment between OpenJDK and GraalVM.
InfoQ: Your InfoQ article from May 2022 said, “Native Java needs to be brought into OpenJDK to enable co-evolution with other ongoing enhancements.” From that perspective, how do you view Oracle’s announcement?
Andrew Dinn & Dan Heidinga: We’re really positive on Oracle’s announcement and on the recent progress on Project Leyden. Bringing the key parts of GraalVM — the JIT compiler and the Native Image code — under the OpenJDK project helps bring both development and user communities together.
Working together under the OpenJDK project using a common development model and the existing OpenJDK governance enables better communication and collaboration between the GraalVM developers and the broader OpenJDK community. And it makes it easier for these components to influence the design of other Java projects like Valhalla or Amber in ways that make them more amenable to AOT compilation.
Finally, it brings GraalVM directly under the remit of the Java specification process and ensures that Java and GraalVM evolve together in a consistent direction.
InfoQ: Oracle also announced that the GraalVM Native Image AOT compiler will implement the specifications of the OpenJDK project Leyden. Leyden recently delayed the standardization of native Java in favor of optimizing JIT compilation. How do you view that decision now, given this new development?
Dinn & Heidinga: Leyden recently laid out Mark Reinhold’s vision for how to start to address the problem space. He suggests using “Condensers” to help shift computation from one application phase to another. And he mentions both specification changes to support this approach (a necessary task to provide a stable foundation to build on!) and Java language changes, such as the lazy statics JEP draft.
None of this says “delay AOT” or “optimize JIT.” This approach enables both AOT and JIT — as well as general execution — to be optimized. His example is using an XML library to read a configuration file before run time. This approach to shifting computation helps both AOT and JIT compilation. As we’ve seen with frameworks like Quarkus, being able to initialize state at build time has been critical for faster startup. Leyden is now laying the groundwork to do that pre-initialization in a way that preserves the meaning of the program — one of the key requirements Andrew and I had called out in our article.
This is not simply a matter of ensuring that GraalVM is tightly and reliably bound to the Java specification. Mark’s proposal clarifies that Leyden will, if required, carefully and coherently update the specification to permit behavioral variations appropriate to specific Condensation steps. For example, it even mentions the need to clarify the semantics of class loader behavior when targeting a fully closed-world AOT program.
Bringing GraalVM into OpenJDK also makes mixing and matching the code from both projects easier. Maybe GraalVM’s Native Image AOT compiler becomes the final condenser in a pipeline of program transformations?
InfoQ: Your article described some Java features that don’t work in native Java. Java Agent support, which is vital for observability, is another missing feature. What should GraalVM do about these limitations — if anything?
Dinn & Heidinga: The GraalVM team will likely be fairly busy migrating their code into OpenJDK over the next year or so. Migration projects tend to take longer than expected, even when expecting them to do so.
We see the GraalVM community working to increase the tooling available for Native Image, including efforts by Red Hat to support JFR and JMX, and work on debugging support, including efforts by both Red Hat and IntelliJ, all in conjunction with the Oracle developers.
GraalVM is in a great place to continue evolving and collaborating with existing agent providers to find the right instrumentation API for Native Image. Anything learned in this process will be directly applicable to Leyden and speed up the delivery of Leyden.
That’s the end goal now: getting all the experience and learning — and hopefully some code — fed into the Leyden development process to help that project deliver the required specification updates, language changes, and condenser tools to drive broad adoption across the frameworks, libraries, and applications.
Sébastien Deleuze, Spring Framework Committer at VMware, shared the views of the Spring ecosystem.
InfoQ: How do you view the move of the GraalVM JIT and AOT compilers for Java to OpenJDK?
Sébastien Deleuze: Since our work compiling Spring applications to native executables began, we have collaborated very closely with the GraalVM team. Our goal is to limit the differences between “native Java” and “Java on the JVM” while keeping the efficiency benefits of the AOT approach. So from our point of view, the move of the GraalVM JIT and AOT compilers to OpenJDK is good news because it is a key step towards a more unified Java under the OpenJDK banner, even if some differences will remain between native and the JVM.
We are also reaching a point where more and more issues or optimizations require some changes on the OpenJDK codebase. So hopefully, having GraalVM on the OpenJDK side will help, including via closer collaboration with Project Leyden.
GraalVM has been an umbrella project with many distinct sub-projects, like polyglot technologies, with different levels of maturity and different use cases. The clear split between what moves to OpenJDK and the rest will likely help to focus more on the GraalVM Native Image support and clarify what GraalVM means for end users.
InfoQ: The GraalVM Java compilers now have four feature releases per year and will have two in the future. How does that affect Spring?
Deleuze: It is true that the four feature releases have been pretty useful for moving forward fast while we were experimenting with the Spring Native project. But Spring Boot 3, which was released in late November, started our official and production-grade support for GraalVM native. So, from that perspective, the switch to a slower pace in terms of features, synchronized with OpenJDK releases, will help handle those upgrades consistently, with less frequent risks of breaking changes and more time to work on ambitious new features. Let’s not forget there will also be four predictable quarterly Critical Patch Updates annually to fix glitches on native image support.
InfoQ: The development of the GraalVM Java compilers will be different, with at least one OpenJDK project, committers, reviewers, and JEPs. What’s the impact on Spring?
Deleuze: While the processes will change, we expect to continue the collaboration between Spring and GraalVM teams on the OpenJDK side. Also, as announced last year, we work closely with BellSoft, one of the leading OpenJDK contributors, on both JDK and native support. We will share more details on the impact in the upcoming months.
Jason Greene, Distinguished Engineer & Manager at Red Hat, responded on behalf of the Quarkus framework.
InfoQ: How do you view the move of the GraalVM JIT and AOT compilers for Java to OpenJDK?
Jason Greene: We see it as a positive change for both the GraalVM and OpenJDK communities. Bringing the two projects closer together will increase collaboration and code-sharing between the efforts, including work that advances Project Leyden. We have a positive relationship with both teams and look forward to that continuing in the new structure.
InfoQ: The GraalVM Java compilers now have four feature releases per year and will have two in the future. How does that affect Quarkus?
Greene: The change may mean waiting a little longer for a new Native Image feature to appear in a formal GraalVM release. However, GraalVM also has an extensibility SPI that we currently utilize in Quarkus. This, in combination with related improvements to Quarkus itself, allows for improvements to the Quarkus GraalVM Native Image experience within the frequent Quarkus release schedule.
InfoQ: The GraalVM Java compilers will have at least one OpenJDK project with committers, reviewers, and JEPs. What’s the impact on Quarkus?
Greene: We expect minimal impact on the Quarkus community from these changes. Even though the OpenJDK processes and tools have some differences, they share similar goals with the current model. While GraalVM did not use JEPs, they did have design discussions on issues and a PR process involving regular code reviews.
Graeme Rocher, Architect at Oracle, provided a view from the Micronaut framework.
InfoQ: How do you view the move of the GraalVM JIT and AOT compilers for Java to OpenJDK?
Graeme Rocher: This is an excellent step for the community and the broader adoption of GraalVM.
InfoQ: The GraalVM Java compilers now have four feature releases per year and will have two in the future. How does that affect Micronaut?
Rocher: The quarterly releases were helpful during the early adoption phase of GraalVM. GraalVM is mature and stable now, so moving to two releases a year is less of a problem and more of a benefit at this stage. The Micronaut team and the GraalVM team both work at Oracle Labs and will continue to collaborate and ensure developer builds and snapshots are well-tested for each release.
InfoQ: The GraalVM Java compilers will have at least one OpenJDK project with committers, reviewers, and JEPs. What’s the impact on Micronaut?
Rocher: There will undoubtedly be efforts to standardize many of the APIs and annotations for AOT, which we will gradually move to as these new APIs emerge. However, this is not a new challenge for Micronaut as it has evolved in parallel with GraalVM and adapted to improvements as they have emerged.
Tomas Langer, architect at Oracle, responded for the Helidon framework.
InfoQ: How do you view the move of the GraalVM JIT and AOT compilers for Java to OpenJDK?
Tomas Langer: The JIT compiler helps the runtime and has no impact on the sources and development process of Helidon. Any performance improvement to OpenJDK is excellent!
AOT compilation impacts the way we design (and test) our software. If Native Image becomes part of OpenJDK, complexity will decrease for any work related to AOT — we would only need a single installation of a JDK. The same would be true for our customers.
InfoQ: The GraalVM Java compilers now have four feature releases per year and will have two in the future. How does that affect Helidon?
Langer: As GraalVM Native Image becomes more mature, we should not see changes as significant as those we have seen in the past. Having fewer releases will actually make our life easier, as we should be able to support the latest release more easily than before: we currently skip some GraalVM releases, as the amount of testing and associated work makes it harder to stay on the latest one.
InfoQ: The GraalVM Java compilers will have at least one OpenJDK project with committers, reviewers, and JEPs. What’s the impact on Helidon?
Langer: I think the answer is very similar to the previous one — the more mature GraalVM Native Image is, the easier it should be for us to consume it.
GraalVM currently plans four feature releases for 2023. At least the first release will still contain the GraalVM Java compilers, as GraalVM Native Image will get full Java 19 support in version 23.0 on January 24, 2023. It’s unclear when the GraalVM Java compilers will have their last release with GraalVM and when they’ll have their first release with OpenJDK.
MMS • Sergio De Simone
Article originally posted on InfoQ. Visit InfoQ
After more than two years in development, support for writing kernel code in Rust has landed in a stable Linux release, Linux 6.1, which became available a couple of weeks ago.
Prior to its official release, Rust support had been available for over a year in linux-next, the Git tree that results from merging all of the developers’ and maintainers’ trees. With the stable release, Rust has become the second language officially accepted for Linux kernel development, alongside C.
The initial Rust support is just the absolute minimum needed to get Rust code building in the kernel, say the Rust for Linux maintainers. This possibly means that Rust support is not yet ready for prime-time development and that a number of changes at the infrastructure level are to be expected in coming releases. Still, quite some work is already going into a few actual drivers that should become available in the near future. These include a Rust NVMe driver, a 9p server, and Apple Silicon GPU drivers.
Rust for Linux is only [available on the architectures supported by LLVM/Clang](https://github.com/Rust-for-Linux/linux/blob/rust/Documentation/rust/arch-support.rst), which is required to compile Rust. Thus, LLVM/Clang must be used to build the whole Linux kernel instead of the more traditional GNU toolchain. This limits the supported architectures to a handful, including arm, arm64, x86, powerpc, mips, and others. For detailed instructions about building Linux with the appropriate flags for each supported platform, check the [official documentation](https://github.com/Rust-for-Linux/linux/blob/rust/Documentation/kbuild/llvm.rst).
One of the key parts of Rust for Linux is bridging the Rust world, where the compiler can provide memory safety guarantees, and the C world, where no such guarantees exist. To enable the use of functions and types available in C, Rust for Linux creates bindings, which are a set of Rust declarations translating their C-layer counterparts into Rust.
In addition to bindings, Rust for Linux also uses abstractions, which are Rust wrappers built around C code available in the kernel. Abstractions are meant to allow developers to write Rust drivers without directly accessing the C bindings, but they are only available for a limited number of kernel APIs at the moment. However, their number will grow as Rust for Linux is developed further, say the maintainers.
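This split between unsafe bindings and safe abstractions follows the usual Rust FFI pattern. The sketch below is a userspace analogy rather than actual kernel code: libc’s strlen stands in for a C-side kernel function so the example compiles with plain rustc, with a raw extern declaration playing the role of a binding and a safe wrapper playing the role of an abstraction.

```rust
// Userspace analogy of the bindings/abstractions split (not actual kernel code).
// In the kernel tree, the generated `bindings` crate exposes C symbols as unsafe
// extern items, and the `kernel` crate wraps them in safe APIs; here libc's
// strlen stands in for a C-side function so the example builds with plain rustc.
use std::ffi::CString;
use std::os::raw::c_char;

// "Binding": a raw declaration of the C function. Calling it is unsafe because
// the compiler cannot verify that the pointer is valid and NUL-terminated.
extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

// "Abstraction": a safe wrapper that enforces the C function's preconditions
// through Rust types, so driver-style code never touches the binding directly.
fn c_string_length(s: &CString) -> usize {
    // SAFETY: CString guarantees a valid, NUL-terminated buffer.
    unsafe { strlen(s.as_ptr()) }
}

fn main() {
    let s = CString::new("rust-for-linux").expect("no interior NUL bytes");
    println!("strlen reports {} bytes", c_string_length(&s));
}
```

In the kernel itself, this safe layer roughly corresponds to what the kernel crate (for example, its pr_info! logging macro) provides on top of the generated bindings.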
MMS • Anthony Alford
Article originally posted on InfoQ. Visit InfoQ
Meta AI Research recently open-sourced CICERO, an AI that can beat most humans at Diplomacy, a strategy game that requires coordinating plans with other players. CICERO combines chatbot-like dialogue capabilities with strategic reasoning, and recently placed first in an online Diplomacy tournament against human players.
CICERO was described in a paper published in the journal Science. CICERO uses a 2.7B parameter language model to handle dialogue between itself and other players. To determine its moves, CICERO’s planning algorithm uses the dialogue to help predict what other players are likely to do, as well as what other players think CICERO will do. In turn, the output of the planner provides intents for the dialogue model. To evaluate CICERO, the team entered it anonymously in 40 online Diplomacy games; the AI achieved a score more than double that of the human average. According to the Meta team,
While we’ve made significant headway in this work, both the ability to robustly align language models with specific intentions and the technical (and normative) challenge of deciding on those intentions remain open and important problems. By open sourcing the CICERO code, we hope that AI researchers can continue to build off our work in a responsible manner. We have made early steps towards detecting and removing toxic messages in this new domain by using our dialogue model for zero-shot classification. We hope Diplomacy can serve as a safe sandbox to advance research in human-AI interaction.
Diplomacy is a strategy board game where players must capture a majority of territories called supply centers to win. There is no random component in the game; instead, battles are determined by numerical superiority. This often requires players to cooperate, so the bulk of game play consists of players sending messages to each other to coordinate their actions. Occasionally players will engage in deceit; for example, promising to help another player, while actually planning to attack that player.
To be successful, therefore, an AI must do more than generate messages of human-level quality: the messages must make sense given the state of the game board, and they must lead other players to trust the AI. To generate the dialogue, Meta used a pre-trained R2C2 language model that was fine-tuned on a dataset of almost 13M messages from online Diplomacy games. The generated dialogue is conditioned on the intents produced by a planning module; the intents are the most likely actions that the message sender and receiver will take after reading that message.
CICERO’s planning module generates intents by predicting other players’ likely actions, given the state of the board and messages from those players, then choosing an optimal action for itself. To model the likely actions of the other players, CICERO uses an iterative planning algorithm called piKL which incorporates information from the dialogues with other players. To train the planning module, the Meta researchers used a self-play algorithm similar to that used by AlphaZero.
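To make the regularization idea behind piKL concrete, here is a toy sketch, not Meta’s code: given value estimates for a handful of candidate actions and an anchor policy that imitates human play, the KL-regularized best response simply reweights the anchor by exponentiated values, with a parameter lambda controlling how far the agent may drift from human-like behavior. The numbers below are made up, and the full algorithm iterates this kind of update across all players rather than computing it once.

```rust
// Toy illustration of a KL-regularized best response, the core idea behind piKL
// (a sketch with made-up numbers, not Meta's implementation).
// Maximizing expected value minus lambda * KL(pi || anchor) has the closed form
//   pi(a) proportional to anchor(a) * exp(Q(a) / lambda),
// so a small lambda trusts the value estimates and a large lambda stays close
// to the human-like anchor policy.
fn kl_regularized_best_response(q: &[f64], anchor: &[f64], lambda: f64) -> Vec<f64> {
    assert_eq!(q.len(), anchor.len());
    let weights: Vec<f64> = q
        .iter()
        .zip(anchor)
        .map(|(qa, pa)| pa * (qa / lambda).exp())
        .collect();
    let z: f64 = weights.iter().sum();
    weights.into_iter().map(|w| w / z).collect()
}

fn main() {
    // Three hypothetical candidate moves: the first has the highest estimated
    // value, but the human anchor policy prefers the second.
    let q = [1.0, 0.4, 0.1];
    let anchor = [0.2, 0.7, 0.1];
    for lambda in [0.1, 1.0, 10.0] {
        let pi = kl_regularized_best_response(&q, &anchor, lambda);
        println!("lambda = {lambda}: {pi:.3?}");
    }
}
```

Running the sketch shows the trade-off: with a small lambda the policy concentrates on the highest-value move, while with a large lambda it stays close to the human-like anchor.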
The Meta team entered CICERO into anonymous league play for online Diplomacy games. The AI played 40 games, including an 8-game tournament with 21 players, in which CICERO placed first. Across all 40 games, CICERO ranked in the top 10 percent of players with an average score of 25.8%, while the average score of its 82 human opponents was 12.4%.
In a Twitter thread about the work, CICERO co-author Mike Lewis replied to a question about whether CICERO would “backstab” (that is, lie to) other players:
It’s designed to never intentionally backstab – all its messages correspond to actions it currently plans to take. However, sometimes it changes its mind…
The CICERO source code is available on GitHub.