Immersive Stream for XR: Extended Reality Experiences from Google Cloud

MMS Founder
MMS Renato Losio

Article originally posted on InfoQ. Visit InfoQ

Google Cloud recently announced the general availability of Immersive Stream for XR, a managed service to host, render, and stream 3D and extended reality (XR) experiences. The new service frees 3D and augmented reality rendering from the constraints of smartphone hardware.

Using Immersive Stream for XR, the virtual experience is rendered on Google cloud-based GPUs and then streamed to a variety of devices where users can interact using touch gestures and device movement. Sachin Gupta, vice president of infrastructure at Google Cloud, writes:

With Immersive Stream for XR, users don’t need powerful hardware or a special application to be immersed in a 3D or AR world; instead, they can click a link or scan a QR code and immediately be transported to extended reality.

The general availability of the service adds new features, including support for content developed in Unreal Engine 5.0 and for landscape-mode content on tablet and desktop devices.

Immersive Stream for XR can be used for rendering photorealistic 3D digital objects and spaces. Gupta describes use cases where users can move around the virtual space and interact with objects:

Home improvement retailers can let their shoppers place appliance options or furniture in renderings of their actual living spaces; travel and hospitality companies can provide virtual tours of a hotel room or event space; and museums can offer virtual experiences where users can walk around and interact with virtual exhibits.

Google released a template to start development and an immersive stream example developed with the car manufacturer BMW. Fabian Quosdorf, managing director at mixed.world, comments:

This cloud service enables frictionless access to high-quality content to millions of mobile devices! It could be a huge competitor to Azure Remote Rendering service. With current layoffs in this sector at Microsoft, news like this gives developers like me hope that Mixed Reality still has a chance of surviving.

Paul McLeod, principal at Decision Operations, wonders if the new service might end up like Stadia, the cloud gaming service that Google shut down:

Interesting but does not seem compelling without HMD support. Meanwhile, Microsoft HL2 supports Remote Rendering and it’s at the level engineering firms need. Seems like they’re laying the ground for something. May work out terrific, but could be another Stadia.

Similarly, Amazon Sumerian, a managed service to run AR and VR applications, was recently discontinued by AWS. A common question in Reddit threads is how cloud rendering can work at all, as latency and latency jitter are critical for interactive experiences. User Hopper199 explains:

Latency for XR over the cloud is lower than it is for 2D games, which typically run at 60 Hz instead of 90 or 120 Hz for XR. But the main reason why Cloud XR works well, or even at all, is because of the last-second reprojection of the video stream on the headset. If your latency is too high, you can’t even fix that, but in practice with 50ms or less, it’s fine (…) The trick here is that the viewpoint is virtually lag-free, due to reprojection.

The pricing of Immersive Stream for XR depends on the configured streaming capacity, defined as the maximum number of concurrent users that the experience can support. Currently available in a subset of Google Cloud regions, the service charges 2.50 USD per unit per hour in the cheapest regions.
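As a rough, hedged illustration of the capacity-based pricing (the per-unit hourly rate comes from the announcement; the capacity and usage hours below are assumptions made only for this example), an estimated monthly bill is simply units × hours × rate:

```python
# Rough cost sketch for Immersive Stream for XR capacity pricing.
# Assumptions for illustration only: 5 concurrent-user units,
# 8 streaming hours per day, 30 days, cheapest-region rate of 2.50 USD/unit/hour.
units = 5            # configured streaming capacity (max concurrent users)
hours_per_day = 8    # assumed daily streaming window
days = 30
rate_usd = 2.50      # per unit per hour in the cheapest regions

monthly_cost = units * hours_per_day * days * rate_usd
print(f"Estimated monthly cost: ${monthly_cost:,.2f}")  # $3,000.00
```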




Presentation: Chaos Engineering Observability with Visual Metaphors

MMS Founder
MMS Yury Nino Roa

Article originally posted on InfoQ. Visit InfoQ

Transcript

Roa: I am Yury Niño. I’m from Colombia. I am here to speak about observability, chaos engineering, and visual metaphors. I work as a cloud infrastructure engineer at Google. Also, I am a chaos engineering advocate in my country. I am going to provide definitions for three concepts: observability, visualization, and chaos, of course. In the second part, I am going to explain the classical charts and dashboards that we are using to monitor our systems, and I am going to show you several weaknesses of these charts. With this context, I am going to explore another point of view: so-called visualization with metaphors. Finally, I am presenting the results of a survey that I applied among some colleagues. With this survey, I tried to assess the effectiveness of the classical charts and dashboards, and to identify whether visual metaphors could be useful for improving the observability of our software systems.

The Royal Botanical Expedition to New Granada

I am Colombian. People say many things about my country, such as that we grow delicious coffee, or that we have beautiful landscapes. There is another awesome thing about my country: the Royal Botanical Expedition to New Granada. It took place between 1783 and 1816 in Colombia, Ecuador, Panama, Venezuela, Peru, and the north of Brazil. The expedition was led by Jose Celestino Mutis, a botanist, mathematician, and illustrator. Jose Celestino Mutis is recognized because, over 25 years, he documented the flora and fauna of New Granada using more than 20,000 drawings. Here is Jose Celestino Mutis. His illustrations are visual treasures of the flora and fauna of our country, and the best visualization of the Royal Botanical Expedition.

Probably many of you are asking why I am speaking about illustrations of plants and insects at a software conference. The answer is because humans are highly visual creatures. Probably it was one of the reasons for Jose Celestino Mutis to draw more than 5000 flowers and insects. According to research, half the human brain is directly or indirectly devoted to processing visual information. In the brain, for example, neurons devoted to visual processing take up about 30%, compared with 8% for touch and just 3% for hearing. In this research, scientists have found that at least 65% of people are visual learners. The results also show that presentations using visual aids were found to be 43% more persuasive than unaided presentations.

Terminology

In our context, visualizations, charts, and graphics are super important. Here, you’re seeing the timeline of an incident related to a software release. It was taken from the book “Incident Management for Operations,” a really good book. The first instruction from the incident commander was to check the analytics dashboard, but access to the dashboard was not working yet. Let me now review some definitions that we should have clear before trying to learn about observability and chaos engineering. Observability is being able to fully understand our systems. In control theory, for example, observability is defined as a measure of how well the internal states of a system can be inferred from knowledge of its external outputs. For me, observability is about asking questions, providing answers, and building knowledge about our systems. Here is another important definition: for modern software systems, observability is not about mathematical equations; it is about how people interact with and try to understand complex systems.

Observability is different from monitoring, and it is super important to understand why. According to the Google SRE book, monitoring is about collecting, processing, aggregating, and displaying real-time quantitative data about our system. There are many reasons to monitor a system, including analyzing long-term trends, for example, monitoring how big my database is and how fast it is growing. Alerting, which is very common: if something is broken, somebody should be notified to fix it. Building dashboards, as we saw in the incident timeline: dashboards answer basic questions about our services, and they are our first tool for trying to understand what is happening. We monitor our system through the signals that it is sending. These signals are called metrics. A metric is a single number with tags optionally appended for grouping and searching, such as query counts, error counts, processing times, and server lifetimes. According to Jason English, data visualization is a more general concept, because it involves designing and engineering a human-computer interface to allow better human cognition and analysis of metrics like data streams and archived data. Finally, a dashboard is an application, usually web based, that provides a summary view of a service’s core metrics. A dashboard may have filters and selectors, with the objective of exposing the metrics most important to the users.

Since this talk is about graphics, dashboards, visualizations, and observability, I put those definitions in this sketch. Observability is being able to fully understand a system’s health by monitoring and analyzing metrics. Monitoring is about collecting, processing, aggregating, and displaying real-time metrics of a system. A metric, an important term here, is a single number with tags optionally appended, such as query counts, processing times, and server lifetimes. Visualization involves designing and engineering a human-computer interface or metric dashboard to support human cognition, for example. A dashboard is an application that provides a summary view of a set of metrics about a system. Finally, I would like to introduce a new concept here: chaos. This talk is about observability, but it is about chaos engineering also. Chaos is a state of turbulence in a system whose consequences are unpredictable and random.

Relation Between Observability and Chaos Engineering

What is the relation between observability and chaos engineering? According to the Principles of Chaos Engineering website, which contains a manifesto for chaos engineers, chaos engineering is the discipline of experimenting on a system in order to build confidence in the system’s capability to face turbulent conditions in production. Chaos engineering and observability are closely connected; in my view, the two concepts can be related using this expression: chaos engineering is the sum of chaos, observability, and resilience. To confidently execute a chaos experiment, observability must detect when the system is normal and how it deviates from that steady state as the experiment is executed. In this expression, there is an important concept: knowledge. Specifically, chaos plus observability gives us the parts for defining knowledge in this context. If we can identify that something is not normal with our system, and we are able to determine how our system will respond to a chaotic situation, we could say that we know the system. Precisely, knowledge is the concept that connects these two disciplines: chaos engineering and observability. Take a look at this definition of observability: observability can be defined as the sum of metrics plus questions plus answers. Observability is about having tools for asking the proper questions and providing the correct answers. In this definition the concept of knowledge is present again, considering that if you know the answers to these questions, you know the system. Here is a summary of what I was trying to explain. Both concepts are complementary, and they are bridged by an important concept, knowledge. In this sense, chaos engineering is leveraged by observability, since observability allows us to detect a deviation from the steady state of a system. Observability is leveraged by chaos engineering, since chaos engineering helps us discover and overcome the weaknesses of the system.

Signals

Let me focus on observability again. I would like to share this: observability feeds on the signals that a system emits, which provide the raw data about the system’s behavior. Observability is limited by the signals, and the quality of the signals, that a system puts out. I am talking about the four golden signals: latency, saturation, traffic, and errors. Let me recall a short definition of each one using these beautiful sketch notes from Denise Yu. Latency is defined as the time that it takes to service a request. It is a symptom of degraded performance in a system during an incident, for example. Traffic is a measure of how much demand is being placed on the system. Some examples include the number of HTTP requests, sessions, and errors. Errors are the rate of requests that fail, for example, HTTP 500 errors. Finally, saturation is about the utilization of a resource, for example, the utilization of the CPU or the memory.
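As a rough sketch of how a service might expose these four signals, here is an example using the Python prometheus_client library (a real package, though the metric names, labels, and simulated values are illustrative assumptions rather than anything from the talk):

```python
# Minimal sketch: exposing the four golden signals with prometheus_client.
# Metric names, labels, and simulated values are illustrative assumptions.
import random
import time

from prometheus_client import Counter, Gauge, Histogram, start_http_server

REQUESTS = Counter("http_requests_total", "Traffic: total HTTP requests",
                   ["method", "status"])
ERRORS = Counter("http_errors_total", "Errors: failed requests (e.g. HTTP 500)")
LATENCY = Histogram("http_request_duration_seconds",
                    "Latency: time taken to service a request")
SATURATION = Gauge("cpu_utilization_ratio", "Saturation: CPU utilization (0-1)")


def handle_request():
    start = time.time()
    time.sleep(random.uniform(0.01, 0.2))  # simulate doing some work
    status = "500" if random.random() < 0.01 else "200"  # simulate 1% errors
    REQUESTS.labels(method="GET", status=status).inc()
    if status == "500":
        ERRORS.inc()
    LATENCY.observe(time.time() - start)
    SATURATION.set(random.uniform(0.2, 0.9))  # stand-in for a real CPU reading


if __name__ == "__main__":
    start_http_server(8000)  # metrics served at :8000/metrics
    while True:
        handle_request()
```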

How are we seeing those signals? Here, I have a visualization of a set of dashboards in Google Cloud Platform that are showing the behavior of a system. Here I have a question for you: how many of you see chaos here? Chaos is a deviation from the normal state of a system. In my case, I see a problem in the dashboard QPS per region. Although the line chart is the most common visualization for these types of incidents, it is confusing because, according to the title, we are seeing a count of queries, but the y-axis maps time in seconds. It is important to mention that if we don’t use the proper colors, layers, and variables on the axes, one of the simplest charts can be transformed into one of the most confusing. Another common chart is the bar chart. A bar chart is a graph that represents categorical data with rectangular bars, with heights and lengths proportional to the values that they represent. The challenge is the same: if we don’t use the proper categories, the chart can be confusing. Considering those limitations, what about these visualizations? Are they the proper charts to visualize chaos? Remember that this talk is about chaos engineering also.

Visual Metaphors

I am going to introduce a new definition here: visual metaphor. Visual metaphors are mappings from concepts and objects of the simulated application domain to a system of similarities and analogies. A computer metaphor is considered the basic idea for a simulation between interactive visual objects and model objects of the application domain. Some examples include maps, cities, and geometric shapes. See this illustration here with a beautiful map. The city metaphor is a popular method of visualizing properties of program code.

Many projects have employed this metaphor to visualize properties of software repositories, for example. Existing research has used cities to visualize packages, classes, sizes, and cyclomatic complexities. I am going to show you more details in the next slide. Here, for example, we have a city metaphor for showing the properties of our software systems. In this case, the city metaphor represents Java packages as neighborhoods, Java classes as buildings, and building dimensions as class properties, such as cyclomatic complexity.
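To make that mapping concrete, here is a small, hypothetical sketch (not a tool from the talk) of how class properties could be translated into the visual attributes of a city metaphor, with an arbitrary complexity threshold controlling the alert color:

```python
# Hypothetical sketch of a city-metaphor encoding: map class properties
# onto building attributes. All names, thresholds, and scale factors are
# illustrative assumptions, not tooling from the talk.
from dataclasses import dataclass


@dataclass
class ClassMetrics:
    package: str                 # becomes the neighborhood
    name: str                    # becomes the building
    lines_of_code: int           # becomes the building height
    cyclomatic_complexity: int   # becomes the building color


def to_building(m: ClassMetrics, complexity_alert: int = 10) -> dict:
    """Translate one class into the visual attributes of a building."""
    return {
        "neighborhood": m.package,
        "label": m.name,
        "height": m.lines_of_code / 10,  # arbitrary scale factor
        "color": "red" if m.cyclomatic_complexity >= complexity_alert else "gray",
    }


city = [to_building(ClassMetrics("billing", "InvoiceService", 420, 14)),
        to_building(ClassMetrics("billing", "TaxCalculator", 80, 3))]
print(city)
```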

Survey

With the intention of identifying the perception of engineering teams involved in software operation activities, I ran a study consisting of specific questions about an incident, in which two visualizations were provided: one with a traditional chart, for example a line chart or a bar plot, and another view using visual metaphors. For each situation, the value of each type of visualization was analyzed. Twenty-eight engineers were surveyed regarding traditional dashboards and visual metaphors. Specifically, they were asked about an incident in four categories of metrics: errors, latency, traffic, and saturation, each visualized using classical dashboards versus visual metaphors. The backgrounds of the participants were distributed among backend, frontend, and full stack engineers, software architects, data engineers, and site reliability engineers. Most participants came from backend development, as illustrated here.

The first question was about the saturation signal. Basically, two dashboards were used here, a line chart and a city metaphor, asking about the state of five microservices: microservice authentication, microservice patients, microservice payments, microservice medications, and microservice appointments. These microservices were part of a fictional healthcare system. Specifically, the question was: using this traditional line chart, which microservice was impacted? Here, each line represents the CPU utilization of one microservice. For example, the orange line represents the utilization of the payments microservice. The correct answer for this question was microservice authentication. The chart is confusing since it is not clear which lines and colors represent each microservice. Probably this line chart was confusing for our participants, since the answers were distributed among several options; just 55% selected the correct answer. Remember, the correct answer is microservice authentication, the orange portion in the pie. On the other side, it is curious that 11.1% of participants chose payments, a service that effectively had high consumption, but in the previous day it did not [inaudible 00:17:49].

I asked the same question but now using a visual city metaphor. I used a building to represent each microservice; for example, a pharmacy represents the medications microservice. I used silhouettes of people to map the level of saturation: the number of people is proportional to the utilization of the CPU. Finally, I used color to represent alerts; in other words, if the saturation is higher than a threshold, the building is painted orange. As you see, the visual metaphor was more useful than the traditional dashboard. All participants agreed that the microservice impacted by high CPU utilization was authentication. Although this did not manage to prove my hypothesis, it is a fact that colors, shapes, and sizes change the perception of the participants. The open answers of some participants are shown here; for example, the first one says that the city metaphor was very useful to see the current state of the CPU, although they note that the city metaphor didn’t show the behavior over time. Regarding the other signals: for the second golden signal, a classical bar chart and a treemap were used to ask the participants to calculate the average number of errors for each microservice, as illustrated here. If you calculate the average, you can see that the correct answer was microservice appointments, although participants didn’t choose it; many changed their answer when they used the visual metaphor. This figure illustrates the result: the correct answer was selected by just 38% of the participants. It is very curious that 88% of the participants thought that the correct answer was authentication, just because it had more nodes, but not necessarily more errors.

With the treemap, the distribution of percentages changed, but the majority continued to think that the correct answer was authentication. Here is a summary, with a visualization of the distribution of answers for the traditional dashboard, for the visual metaphor, and the correct answer. It is interesting because it allows us to conclude that visual metaphors are not a guarantee that we are interpreting the data in a proper way. As you can see, in this second case, using a visual metaphor, just 32.1% chose the correct answer. Regarding the traffic signal, a classical bar chart and a geometric metaphor were used to ask the participants to which third-party service there is more traffic. In this case, the interaction between the original microservices and new third-party services (service LDAP, service government, service assurance, and service authentication) was analyzed. They are external or third-party services that interact with our microservices. This figure shows this integration using a bar plot and a geometric metaphor. In the metaphor, the circles represent the services and microservices, and the lines connecting them represent the relations among them. In spite of having lines and sizes to represent the connections and the traffic load between the microservices and third-party services, the metaphor was confusing for the participants, as you see here. It is possible that the size of the circle could be associated with the percentage of service LDAP, which is the correct answer, represented by the green portion in the pie. Finally, most people answered that the metaphors were more useful, as illustrated here. As you see, the majority chose visual metaphors in order to get better results.

Key Takeaway Points

For modern software systems, observability is not about mathematical equations. It is about how people interact with and try to understand complex systems. A second important point: considering that chaos engineering and observability involve humans and their individual interpretations, designers of dashboards can bias those interpretations. In this sense, visual metaphors are not a guarantee that we are interpreting the data in a proper way. Finally, it is important to keep in mind that observability feeds on the signals that a system emits, which provide the raw data about the behavior of the system.

Questions and Answers

Bangser: It’d be really interesting to hear from you about why you decided to explore the strategy around different visual metaphors when visualizing incidents. How’d you get started with this?

Roa: Regarding your question about why I decided to explore this strategy: in my experience, chaos engineering, observability, and visualization depend on individual interpretations, as I have mentioned in my presentation, and it is a fact that the designers of dashboards can bias those interpretations. That is my main motivation for this study. Bias is the main topic here. Since classical dashboards can lead to bias, I wondered whether having an alternative way to explore our dashboards could be highly valuable for the operators of our systems, for engineers, for cloud engineers. From that, I thought that dashboards based on visual metaphors could provide more useful data than classical visualizations. However, after the study that I shared, I discovered that both strategies have the same problems; for example, in the third study, related to the geometric metaphor, the participants were confused by the metaphor. For me, the main motivation is related to bias. With this study, while I was preparing this presentation, I discovered that both strategies, and really any strategy, can be biased, because we are interacting with humans. That is really challenging for dashboard designers.

Bangser: I’ve heard a saying before that there’s lying and then there’s statistics. The joke behind that saying is that, depending on what frame and what lens you put on statistics, they can really show the bias of what you want people to see. So exploring what the things you’re showing are, what has been traditional about our visualizations, and how that turns into bias is important, because we may not be as aware of the biases we’re building in, because this is just how it’s always been done. This is really important.

Do you think that the perception of understanding could be evaluated to improve the visual metaphor dashboards?

Roa: Yes, I agree, because that is an input for us as designers: the perception of the human. Probably we can provide more strategies and more metaphors that cover more perceptions. It is a fact that we have a limitation here, which is the interpretations, experience, and backgrounds of the readers of dashboards, but effectively, I think perception and understanding can be evaluated. We have some frameworks in the literature to analyze this perception and get the best input for designing more strategies. At this moment, we are limited to line charts and bar charts; those are the charts available from the cloud providers, for example, although some tools that specialize in observability and monitoring have more strategies to monitor our systems. We have a lot of possibilities. At this moment, we have few strategies for monitoring, but we have a universe of metaphors, although some of them may be related to the business, as in my study related to healthcare, where I used buildings related to healthcare: a hospital, pharmacies, medication buildings. That is a great opportunity to create many tools and share our thoughts about this topic.

Bangser: So often, it’s that cat and mouse game of: the tools exist, so people start using them more, and then the tools are encouraged to become broader and affect more things. It’s hard to get started without those tools. Do you have any suggestions of tools to create the visualizations that you did? Specifically, that city visualization, how would you suggest other people get started with that?

Roa: For my study, I designed the metaphors for this case. I used common tools to design and to produce the treemaps. There are some tools on the market; for the city metaphor, for example, there is a tool whose visualization is 3D. I am going to share the link in Slack, because as a result of a paper that I published in the past, I created a tool that provides some visualizations. These visualizations are focused on visualizing software, on visualizing the characteristics of the software. Specifically for monitoring, I don’t know of tools on the market. I am going to share with you some tools that could be extended or used here. I have to acknowledge that for this study, I drew the circles and lines myself in order to test my perception of this topic.

Bangser: How important is color when you’re looking at these visual metaphors? I would maybe extend this to ask as well how you deal with accessibility, when color is a big part of what you were trying to show at times, with red versus green, and things like that?

Roa: The color is really important. The color is very important here because, for example, in my metaphor with the hospital, I used the red color. When you see a red part or a red section in your dashboard, it takes your attention immediately, because we are familiar with these colors. Red represents fire, represents an alert. Green represents that everything is okay. It is really important to use the proper colors. For example, the third metaphor, the geometric metaphor, was really curious for me, because I was expecting that it could be more valuable for our participants. It was confusing because I used the same colors, blue and gray. I didn’t use, for example, the red or green colors. I tried to use size and shapes in this case, and it was confusing for our participants. I think color is really important, and it is really important to use it in a way that is familiar for humans, because in our understanding, in our experience, the red color, for example, represents alerts. We should take advantage of that.

Bangser: It makes sense that for a large part of the population that is the first thing we look at. I am privileged in that I do not have any colorblindness. I look at red. I grew up in a culture where red means stop, or bad, or error, and so that sits well for me. How do you think the industry can take on board making the geometric shape and size that you tried to use in the last example more common for people, so that it becomes more accessible and less dependent on color, which is something that may or may not work for everybody due to colorblindness and other aspects?

Roa: That is a fact, also considering the accessibility aspects. I am thinking at this moment that this is a great opportunity to run another experiment, because probably we are ignoring that there are people who face this challenge in accessing our tools. We need to consider the accessibility standards for them. Precisely, I was reading a study published on InfoQ related to this topic, with 10 guidelines to build applications that are accessible for our users. I think we can design an experiment for this, but I didn’t consider this topic in the study. That is a fact. I think it’s really interesting to explore these considerations also.

Bangser: Yes, there’s just so many angles. You have to try and tackle a lot of them. One side we definitely want to include is accessibility, and there’s so many others, though, that you were able to get insight into during the study, which was really interesting.

I realized this was a great question that came in around the animation, because everything that you showed us was static, even the arrows that had motion in them in the sense that they were pointing in a direction, they were just stationary arrows. Have you thought about adding animation or movement to your visualizations?

Roa: Yes, it could be great to have this option, because we could have more variables for showing the situation or the state of a system. In this case, it is important to consider that if we have a lot of variables and a lot of things in the same dashboard, it could also be confusing. For example, when there is movement, you can be distracted by those movements in the dashboard. It is a really good idea for a dashboard, but we need to consider that there is a risk: for example, if you have one dashboard with fast movement, and the relevant dashboard in another section of your page is showing a static situation, you can probably be distracted by the other dashboard. It is important to consider that. Yes, so the lack of movement: that is an interesting discussion for this topic, because movement can provide a lot more information for us, and it could be highly valuable for our readers. In the same way, I need to run experiments, and I need to go to the users in order to understand them. I think a user experience expert could be valuable or could be very useful for us here. I think we need to explore all options in order to provide the proper visualization for our users. That is a great opportunity for industry and academia. It is important for academia considering that we have, for example, people studying these topics related to visualization, related to human factors, related to accessibility; it is a great topic for a PhD thesis, for example. There is a great opportunity to explore these topics in academia also.

Bangser: You mentioned there a really interesting lever to pull on, which is the number of criteria that you can use when you add color or shape or motion. These are all ways in which you can describe different attributes. As you add more attributes, it can get more confusing.

I was just curious, you seemed almost a bit surprised by some of the study results, the things that were confusing for people, and you didn’t get the results you were expecting. What do you think maybe caused some of that surprise, or those unexpected results?

Roa: Probably the third study, related to the traffic signal, with a classical bar chart and a geometric metaphor. That was really surprising for me, considering that I was expecting that the circles and lines could provide more information for the users. The reality was different. In the metaphor, the circles represent the services and microservices, and the lines represent the relations with the third-party services, in spite of having lines and sizes. I think my problem with this metaphor was related to color, because I didn’t use the proper colors here. In the other case, for this incident, this chaos, probably the lines and charts and their simplicity could be more useful for the attendees. In conclusion, I think the main cause is related to humans, to our perceptions of the systems, because each of us is a unique universe with different experiences and different backgrounds. The main root cause of this confusion was related to background, for example. When I explored the answers in detail, I found that the backend software engineers had similar perceptions, and the frontend engineers had similar perceptions, which are different, for example, from the cloud infrastructure engineers or people who work on operations topics in an engineering team. I think background and experience, no less, is the main cause. That is the challenge for designers of these types of dashboards.

Bangser: That is a challenge. It makes sense, though, that what you’re used to seeing every day, you make assumptions about, or you start to read things in. I remember when, for example, the three bars meaning open up a sidebar in an app was new, but now it’s become something that people are aware of, and that can start to build a repertoire. When you’re dealing with such a broad base, backend, frontend, operations, all of that, that can be really hard.

Do you think that these new visual metaphors are something that can be brought into the industry in the future, despite all these challenges around different backgrounds, and all those kinds of things?

Roa: Yes, and I hope that it will be useful for them. I hope that, in the future, we have the possibility to interact with our cloud using metaphors. I think that could be great for us, and it seems valuable considering the open answers from our participants. I think the visualization of chaos, and specifically of incidents in production, represents several challenges for industry and academia. I would like to open this gate and this universe of metaphors for our industry providers. Some cloud providers work, for example, with treemaps and heat maps, in order to provide more strategies for the operators of our systems. They are working on that. At this moment, we don’t have the possibility, for example, to use a city metaphor or a geometric metaphor, considering that those metaphors are related to the business, to the organization, and to a specific business topic. I think that we could provide tools for building these metaphors, tools that allow us to draw or design our dashboards in order to connect our business, our concerns, and our priorities with our dashboards. If we had the possibility to design such dashboards in our cloud provider, it could be great; it could generate value for the operators of our systems. I think that is challenging, but there is an open gate for creating things as our imagination allows.

Bangser: As you say, we have to bring industry and academia together to solve these problems. What’s really exciting is if the cloud providers do start working in this space, they operate at such a scale that we can start to really get feedback into academia and start to actually run studies at scale and get feedback on that. That’d be a very exciting opportunity for the industry.




Presentation: Modern API Development and Deployment, From API Gateways to Sidecars

MMS Founder
MMS Matt Turner

Article originally posted on InfoQ. Visit InfoQ

Transcript

Turner: Do you find you have trouble with the APIs in your organization? Do services struggle to interact with each other because their APIs are misunderstood, or they change in breaking ways? Could you point to the definition of an API for a given service, let alone to a catalog of all of them available in your estate? Have you maybe tried to use an API gateway to bring all of this under control, but do you find it doesn’t quite feel like the right tool for this job? I’m Matt Turner. In this talk, I’m going to talk about some more modern techniques and tools for managing your APIs. We’re going to see what API gateways really do, what they’re useful for, and where they can be replaced by some more modern tooling.

Outline

In the first half of this talk, I’m going to talk about what an API is, just very briefly, to get us back on the same page. Talk about what an API gateway is, where they came from, why I’m talking about them. Then we’re going to look at sidecars, which I think are a more modern, distributed, microservice-style alternative. Then we’ll take a look at some more tooling around the lifecycle of an API. Some of these stages you might be familiar with in the design and development of an API. I think API gateways have been pressed into helping us with a lot of these things as well. Actually, rather than using an API gateway or a sidecar, rather than using a component of the network, like an online active component at runtime, there’s a bunch of more modern tooling now that enables us to do this earlier in the dev cycle, to shift these concerns left. I’m going to be covering that as well.

API Gateways and Sidecars

API gateways and sidecars. Let’s very briefly go over what I mean by API, at least for the purposes of this talk. Let’s start off easy: what isn’t an API? The definition I’ve come up with is that an API is the definition of an external interface of some service. Service is open to interpretation: your workload, endpoint, whatever you want to call it. There’s a way of interacting with it and talking to it from the outside. That’s what an API is. Wikipedia says that an API is a document or a standard that describes how to build or use a software interface. That document is called an API specification. It says that the term API may refer to either the specification or the implementation. What I really want us to agree on, for the purposes of this talk, is that an API is not a piece of code or a running process. I’ve heard people talk about deploying an API, and what they mean is deploying a pod to Kubernetes, deploying some workload to a compute environment. What they really mean is that the service is not a batch processing service but some daemon that has an API. I think we get confused if we start using the word API for that. An API, by my definition, how do we define it? An API is defined by an IDL, an Interface Definition Language. You may have come across OpenAPI Specs, formerly called Swagger, or protobuf files, which include the gRPC extensions. There’s Avro, or Thrift, and a lot of those other protocols have their own IDLs as well. This is like a C++ header file, a Java interface, a C# interface.
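A tiny sketch of the same distinction in Python (purely illustrative; none of these names come from the talk): the Protocol below plays the role of the API definition, while OrderStore is just one workload that happens to implement it, in the same way that an OpenAPI or proto file is separate from the pod or process that serves it.

```python
# Illustrative only: the Protocol is the "API" (the definition of an
# external interface); OrderStore is one possible implementation of it.
from typing import Protocol


class OrderAPI(Protocol):
    def get_order(self, order_id: str) -> dict: ...
    def cancel_order(self, order_id: str) -> None: ...


class OrderStore:
    """A concrete workload that happens to implement OrderAPI."""

    def __init__(self) -> None:
        self._orders = {"42": {"id": "42", "status": "shipped"}}

    def get_order(self, order_id: str) -> dict:
        return self._orders[order_id]

    def cancel_order(self, order_id: str) -> None:
        self._orders.pop(order_id, None)
```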

Very briefly, a workload, a running process, a pod can have an API that we can talk to. If you’ve got more than one of those, you put something in front that spreads the work between them. This is your classic load balancer. You can have more than one different API on different workloads, and then your front-facing thing, whatever it is, can expose both of them. You can have a slightly fatter, more intelligent front-facing thing that can combine those backend APIs in some way. Think of this as a GraphQL resolver, or even a load balancer, or an API gateway doing path routing. It’s worth saying, a pod, a service, a workload can actually implement two different APIs, like a class having two interfaces, or maybe two versions of the same API, which is something that’s going to be quite relevant for us.

What Is an API Gateway?

What is an API gateway? They have a set of common features. They are network gateways; the clue is in the name with gateway. They do things like terminate TLS. They load balance between services. They route between services. They do service discovery first to find out where those services are. They can do authentication with client certificates, JWTs, and whatever. They can do authorization: that’s allow listing and block listing of certain hosts, certain paths, certain headers, whatever, because it’s at layer 7. They do security-related stuff; an obvious example would be injecting CORS headers. They can rate limit and apply quotas. They can cache. They can provide a lot of observability, because we send all traffic through them, so we can get a lot of observability for free. Then, some of the common features that get more interesting or more relevant to this talk, which I’ve pulled out of a lot of those different product specifications, are things like the ability to upload one of those IDL files, like an OpenAPI Spec, and have the API gateway build an internal object model of your API: hosts, paths, methods, and even down to body schemas, enforcing the schema of request and response bodies coming in and going out. A lot of them also have some support for versioning APIs and different stages of deployment, so maybe a test, a staging, and a prod version of an API. A lot of them will also do format translations, that’s maybe gRPC to SOAP or gRPC to “REST,” JSON over HTTP. They can do schema transforms: they might take a structured body like a JSON document in one format, and rearrange some of the fields, rename some of the fields. They can manipulate metadata as well, so insert, modify, delete headers.

Why Am I Talking About This?

Why am I talking about these things? Back in the day, we had a network and we would probably need some load balancing. Since we got beyond a toy example, we’d have certain services. In this case, I’m showing a database where we need more than one copy running for redundancy, for load carrying ability. We’d have multiple clients as well that needed to talk to those things. We’d stick a load balancer in the network, logically in the middle. These were originally hardware pizza boxes, thinking F5, or a Juniper, or a NetScaler. Then there were some famous software implementations, something like HAProxy, NGINX is quite good at doing this. They balance load service to service, but you also needed one of at the edge so that external clients could reach your services at all, across that network boundary, discover your services. Then they would offer those features like load balancing between different copies of the service. Here, I’ve just shown that middle proxy copied, we’ve got another instance of it, because it’s doing some load balancing over. This is probably the ’90s or the early 2000s. This is your web tier, and then they got a load balancer in front of your database tier. Because this thing is a network gateway, because it’s exposed out on the internet, it’s facing untrusted clients on an untrusted network, it needs to get more features. Things like TLS termination, just offering a serving cert so you can do HTTPS rate limiting, authentication, authorization, and then more advanced stuff like bot blocking, and your web application firewall features like detecting injection payloads, and that kind of stuff. Although it’s a very blurry definition, I think all of these features made a load balancer into what we can nebulously call an API gateway.

Now that we’ve moved to microservices, our monoliths are becoming distributed systems, and we need a bunch of those features inside the network as well, so observability doesn’t come from attaching a debugger to the monolith, it comes from looking at the requests that are happening on the wire. Things that were a function call from one class into another, that were effectively instant and infallible, are now network transactions that can fail, that can time out, that can be slow, that can need retrying. All of this routing: yes, we used to have dependency injection frameworks; if you wanted to read different config or access a different database in test, you would just use dependency injection with a different build profile. We can’t do any of that stuff anymore in the processes themselves, because they’re too small and too simple. A lot of that stuff is now being done on the network. That fairly dumb load balancer in the middle has gained a lot of features and it’s starting to look a lot like an API gateway. In fact, a lot of API gateway software is being pressed in to perform this function.

I think the issue with this is that it can do a lot. The issue is that it can maybe do too much, and that it’s probably extensible as well: you just write a bit of Lua or something to plug in. It’s very easy for this to become an enterprise service bus thing from 2005. It’s this all-knowing traffic director that does all the service discovery, holds all the credentials for things to talk to each other. It’s the only policy enforcement point. All of those extensions, all of those things that we can plug in, even the built-in features like the ability to just add a header here or manipulate a body there, just rename a field, just so that a v1 request looks like a v2 request. All of that stuff strikes the terror of ESBs into me, and makes me think of those systems that accreted so much duct tape that nobody really understood them. Of course, that duct tape isn’t really duct tape, this is part of a running production system, so you’ve got to consider it production code.

It’s also not giving us the security we think it is, in this day and age. The edge proxy works. The edge proxy secures us from things on the internet, because it’s the only way in and out of the network. That’s fine as long as all of our threats are outside. Then that’s the only way in because the subnet is literally unroutable. Bitter experience shows us that not all threats are on the outside: compromised services attempting lateral movement, probably because they’ve had a supply chain attack or a disgruntled employee has put some bad code in, or just somebody on the internal network because they’ve plugged into an Ethernet port or somebody’s cracked the WiFi or something. There are more of these devices on our networks now with Bring Your Own Device, and with cloud computing, and with one monolith becoming 3000 microservices, there are just more units of compute, more workloads that we have to worry about that could be a potential threat. While an API gateway being used as a middle proxy might do authentication, it’s kind of opt-in: you’ve got to send your traffic through it with the right headers, and if you don’t have the right headers, what’s to stop you going sideways like this?

Sidecars

On to sidecars. The basic sidecar idea is the idea of taking this middle proxy that’s got a bunch of logic, logic that I agree we need, I just don’t think it should be here, and moving it to each of the services. Some of the features are maybe a little bit obvious, like rate limiting: each one of those sidecars can rate limit on behalf of the service it’s running alongside. Load balancing is one people often get a bit confused about. You don’t need a centralized load balancer. Each service, in this case each sidecar, can know about all of the other potential services and can choose which one to talk to. The sidecars can coordinate through something called a lookaside load balancer to make sure that their random number generators aren’t going to make them all hit the same backend, basically. They can ask the lookaside load balancer which of the potential backends currently has the least connections from all clients. Client-side load balancing is actually perfectly valid and viable for those backend services, for internal trusted services where we have control over the code.
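As a hedged sketch of that client-side, least-connections idea (not any particular mesh's implementation; the backend addresses and connection counts are made up), each client-side proxy could pick a backend like this:

```python
# Minimal sketch of client-side, least-connections load balancing.
# The backend list and connection counts would normally come from a
# lookaside load balancer or service-discovery system; here they are
# hard-coded assumptions for illustration.
import random


def pick_backend(connections_by_backend: dict[str, int]) -> str:
    """Choose the backend with the fewest active connections;
    break ties randomly so clients don't all pile onto one instance."""
    fewest = min(connections_by_backend.values())
    candidates = [b for b, n in connections_by_backend.items() if n == fewest]
    return random.choice(candidates)


lookaside_view = {"10.0.0.1:8080": 12, "10.0.0.2:8080": 7, "10.0.0.3:8080": 7}
print(pick_backend(lookaside_view))  # one of the two backends with 7 connections
```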

I’ve said sidecar a few times; I should actually talk more generally about that logic. The first thing we did with that logic, the stuff that the API gateway was doing for us on the network, retries, timeouts, caching, authentication, rate limiting, all that stuff, was factor it out into a library. Because, like that Kong slide showed earlier, we don’t want every microservice to have to reimplement that code. We don’t want the developer of each service to have to reinvent that wheel and type the code out, or copy and paste it in. We factor it out. There are a couple of early examples of this: we’ve got Hystrix and Finagle, which are both very full-featured libraries that did this stuff. The problem with those libraries is that they’re still part of the same process. They’re still part of the same build, so deployments and upgrades are coupled. They’re reams of code. They’re probably orders of magnitude more code than your business logic, so they actually have bug fix updates a lot more often, and every time they do, you have to roll your service. You also, practically speaking, need an implementation in your language. Hystrix and Finagle were both JVM-based libraries, so if you want to do Rust or something, then you’re out of luck unless a decent implementation comes along. We factored it out even further, basically, to an external process: a separate daemon that could do that stuff, a separate process that can therefore have its configuration loaded, be restarted, and be upgraded independently.

What software can we find that will do that for us? It turns out it already existed. This is basically an HTTP proxy, running as a reverse proxy. Actually, it could be running as a forward proxy on the client side and then a reverse proxy on the server side, the callee side. Even Apache can do this if we press it, but NGINX and HAProxy are better at it. The newer option is a cloud-native HTTP proxy called Envoy, which has a few advantages, like being able to be configured over its API in an online fashion. If you want to change a setting on NGINX, you have to render out a config file, put it on disk, and hit NGINX with SIGHUP, I think. I don’t think you actually have to quit it. You do have to hit it with a signal, and it will potentially drop connections while it’s reconfiguring itself. Envoy applies all of that stuff live. This is the logo for Envoy. Envoy is a good implementation of that, and we can now use nice, cool, modern programming languages, any language we want, rather than being stuck in the JVM.

This is great for security, too. I talked before about how that API gateway middle proxy was opt-in, and it could be bypassed fairly easily. If you’re running Kubernetes and each of these black boxes is a pod, then your sidecar is a sidecar container, a separate container in the same network namespace. The actual business logic, the application process, is unroutable from the outside. The only way traffic can reach it is through the pod’s single IP, and therefore through the sidecar, because that’s where all traffic is routed. These sidecars are also going to be present in all traffic flows in and out of the pod. Whatever tries to call a particular service, no matter where on the network it is, or how compromised it is, it’s going to have to go through that sidecar; it can’t opt out. We therefore apply authentication, authorization, and rate limiting to everything, even stuff that’s internal to our network that would previously have just been trusted, because it’s on the same network, in the same cluster, it came from a 192.168 address. We don’t trust any of that anymore, because it’s so easy for this stuff to be compromised these days. This is an idea called zero trust, because you trust nothing, basically. I’ve put a few links up for people who want to read more on that. The sidecars can also cover the egress case. If you’re trying to reach out to the cloud, all traffic goes through the sidecar. If you’re trying to reach out to things on the internet, again, all traffic out of any business logic has to go through the sidecar.

These sidecars can do great things, but they can be quite tricky to configure. NGINX config files can get quite long if you want to do a lot. Envoy’s config is very long and fiddly; it’s not really designed to be written by a human. Each of these sidecars is going to need a different config depending on the service that it’s serving. Each one is going to have different connections that it allows and different rate limits that it applies, or whatever. Managing that configuration by hand is a nightmare. We very quickly came up with this idea of a control plane: a daemon that provides a high-level, simple config API, where we can give it high-level notions like “service A can talk to service B at 5 RPS.” It will then go and render the long-winded, individual configs needed for the sidecars, and go and configure them. This control plane, in addition to the sidecar proxies, gives us what’s called a service mesh. Istio is probably the most famous example. There are others out there like Linkerd, or Cilium, or AWS’s native App Mesh. If you’re running your workloads in Kubernetes, then you can get this service mesh solution quite easily. Using various Kubernetes features, you can just make container images that contain just your business logic. You can write deployments that deploy just your business logic, just one container in a pod with your container image in it. Then, using various Kubernetes features, you can have those sidecars automatically injected. The service mesh gets its configuration API and storage hosted for free in the Kubernetes control plane. It can be very simple to get started with these things if you’re in a friendly, high-level compute environment like Kubernetes.
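Here is a toy sketch of what a control plane does with a high-level intent such as "service A can talk to service B at 5 RPS". The config shape is invented for illustration and is not Istio's or any other real mesh's API:

```python
# Toy control plane: expand one high-level rule into the per-sidecar
# configs that would be pushed to the proxies. The config schema is
# entirely made up for illustration.
def render_sidecar_configs(rule: dict) -> dict:
    src, dst, rps = rule["source"], rule["destination"], rule["rate_limit_rps"]
    return {
        # Client-side sidecar: allow outbound calls and enforce the rate limit.
        src: {"outbound": [{"to": dst, "rate_limit_rps": rps, "retries": 2}]},
        # Server-side sidecar: only accept traffic authenticated as the source.
        dst: {"inbound": [{"allow_from": src, "require_mtls": True}]},
    }


intent = {"source": "service-a", "destination": "service-b", "rate_limit_rps": 5}
for sidecar, config in render_sidecar_configs(intent).items():
    print(sidecar, config)
```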

Just to recap, I think we’ve seen what an API gateway is as a piece of network equipment, what features it has, and why they used to be necessary. Those features are still necessary, but in a microservice world having them all centralized in one place is maybe not the best thing, and we’ve seen how we can move a lot of those features out to individual services through this sidecar pattern.

API Lifecycle

I want to talk about some of the stuff we haven’t touched on: some of those API gateway features like enforcing request and response bodies, like doing body transformation, and all that stuff. Because, as I was saying, some API gateways do offer these features, but I don’t think that’s the right place for them. I don’t think they belong in infrastructure networking. I don’t think they should be moved to a sidecar. I think they should be dealt with completely differently. I’m going to go through various stages of an API’s lifecycle and look at different tooling that we can use to help us out in all of those stages.

We want to come along and design an API. The first thing is that you should design your API up front. This idea of schema-driven development, of sitting down and writing what the API is going to be, because that’s the service’s contract, is really powerful. I’ve found it very useful personally. It can also be great for more gated development processes. If you need to go to a technical design review meeting to get approval to spend six weeks writing your service, or you need to go to a budget holder’s review meeting to get the investment, then going with maybe a sketch of the architecture and the contract, the schema, I’ve found to be a really powerful thing. Schema-driven development I think is really useful. It really clarifies what this service is for and what it does, and what services it’s going to offer to whom. If you’re going to be writing the definitions of REST interfaces, then you’re almost certainly writing OpenAPI files. That’s the standard. You can do that with Vim, or with your IDE with various plugins. There are also software packages out there that support this first-class, things like Stoplight, and Postman, and Kong’s Insomnia. If you’re writing protobuf files describing gRPC APIs, which I would encourage you to do, I think gRPC is great; it’s got a bunch of advantages, not just around the API design, but around actual runtime network usage. Then you’re going to be writing proto files. Again, you can use Vim, you can use your IDE and some plugins. I don’t personally know of any first-class tools that support this at the moment.

Implementation

What happens when I want to come and implement the service behind this API? The first thing to do, I think, is to generate all of the things: generate stubs, generate basically an SDK for the client side, and generate stub hooks on the server side. All of that boilerplate code for hosting a REST API, where you open a socket and you attach an HTTP mux router thing, and you register logging middleware, and all that stuff. That’s boilerplate code. You can copy and paste it. You can factor it out into a microservices framework, but you can also generate it from these API definition files. Then they will leave you with code where you just have to write the handlers for the business logic. Same on the client side: you can generate your client libraries, SDKs, whatever you want to call them, library code where you just write the business logic. Then you make one function call to hit an API endpoint on a remote service. You can just assume that it works, because all of the finding the service, serializing the request body, sending it, retrying it, timing it out, all of that kind of stuff is taken care of. Again, you can just focus on writing business logic.

One of the main reasons I think for doing this is that I often see API gateways used for enforcing request schemas on the wire. Perhaps service A will send a request to service B, and the API gateway will be configured to check that the JSON document being sent has the right schema. This just becomes unnecessary if all you’re ever doing is calling autogenerated client stubs to send requests and hooking into autogenerated service stubs to send responses. It’s not possible to send the wrong body schema, because you’re not generating and serializing it yourself. Typically you’ll fill in the fields on an instance of a class, so you have to fill in all of the required fields and you can’t add any extra ones. Then the stubs take it from there: they serialize it, and they do any field validation on integer size, string length, or whatever. By using these client stubs, a whole class of errors just goes away, and what’s left gets caught a lot earlier.
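As a toy illustration of why the wrong body schema becomes hard to produce, here is the effect a typed, generated-style request class has in Python. This uses a plain dataclass rather than any particular generator’s output, but the behaviour is the same: missing or unknown fields fail at construction time, long before anything goes on the wire.

from dataclasses import dataclass

# Stand-in for a generated request type; a real generator would also emit serialization code.
@dataclass
class CreateUserRequest:
    email: str
    display_name: str

ok = CreateUserRequest(email="ada@example.com", display_name="Ada")

# Both of the following raise TypeError when constructed, not at an API gateway:
# CreateUserRequest(email="ada@example.com")                                   # missing field
# CreateUserRequest(email="ada@example.com", display_name="Ada", admin=True)   # unknown field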

For generating stubs from OpenAPI documents for REST, there are a few tools out there. There’s Azure AutoRest, which gets a fair amount of love but only supports a few languages. There’s a project called OpenAPI Generator, whose main advantage is that it’s got templates for a zillion languages; in fact, for Python, I think it’s got four separate templates, so you choose your favorite. I do have to say, from a lot of practical experience, that most of those templates aren’t very good. The code they emit is very elaborate, very complicated, very slow, and just not idiomatic at all. Your mileage may vary, and you can write your own templates, although that’s not easy. It’s a nice idea, but I’ve not had a great amount of success with that tool. Even the AWS API Gateway can do this. It’s not a great dev experience, but if you take an OpenAPI file and upload it into AWS API Gateway, which is the same thing as clicking through the UI and making paths and methods and all that stuff, there’s an AWS CLI command that’ll get you a stub. It only works for two languages, and they seem pretty basic. For doing the same for gRPC, there’s the original upstream Google protoc compiler, which has a plugin mechanism; there needs to be a plugin for your language, and there are plugins for most of the major languages. It’s fine, but there’s a newer tool called Buf, which I think is a lot better.

When we’ve done that, we hopefully get to this point of just adding business logic. We can see the service on the bottom left is a client that’s calling the service on the top right, which is a server. That distinction often becomes irrelevant in microservices, but in this case we’ve got one thing calling another to keep it simple. That server side has business logic, and it really can be just business logic, because network concerns like rate limiting and authentication are taken care of by the sidecar, and things like body deserialization and body schema validation are taken care of by the boilerplate, the “open this socket and set the buffer a bit bigger so we can go faster” stuff, which is all in the generated service stub code. Likewise, on the client, the sidecar is doing retries and caching for us. Then the business logic here calls on three separate services, and it has a generated client stub for each one.

Deployment

When we want to deploy these services, what about the schema validation that we used to configure on an API gateway? I’m going to say, don’t, because I’ve talked about how we can shift that left, and how we don’t make those mistakes if we use generated code. I’ve actually already covered it, but the IDLs tend to only be expressive to the granularity of “the field email is a string”. There are enhancements and plugins, I think for OpenAPI, certainly for proto, where you can give more validation. You can say that the field email is a string, it’s a minimum of six characters, it’s got to have an at sign in the middle, it’s got to match a certain regular expression. That fine-grained stuff can all be done declaratively in your IDL, and therefore generated into validation code automatically at build time.
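As a sketch of the idea, and not the output of any particular plugin, the validation code generated from those declarative rules might look something like this in Python, with the rules for the email field (minimum length, must contain an at sign, must match a pattern) baked in at build time rather than configured on a gateway.

import re

# Illustrative only: the kind of check a build-time plugin might generate
# from validation rules declared next to the field in the IDL.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+$")

def validate_email(value: str) -> None:
    if len(value) < 6:
        raise ValueError("email must be at least 6 characters")
    if not EMAIL_PATTERN.match(value):
        raise ValueError("email must match the declared pattern")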

Publication

What happens when we want to publish these things? Buf is just one example; Stoplight and Postman offer this as well, I think, but Buf has a schema registry. I can take my protobuf file on the left and upload it into the Buf schema registry; there’s a hosted version, or you can run your own. You can see that it’s done something like Rust docs or Go docs: it’s rendered this nicely as documentation with hyperlinks. Now I’ve got nice documentation for what this API is, what services it offers, and how I should call them, and by looking at the whole schema registry I’ve got a catalog of all the APIs available in my organization, all the services I can get from all the running microservices. This is really useful for discovering that stuff. The number of times in previous jobs I’ve had people say, “I’d love to write that code but this piece of information isn’t available,” or, “I’m going to spend a week writing the code to extract some data from the database and transform it,” when a service to do that already exists. We can find them a lot more easily now.

There’s this idea of ambient APIs. You just publish all your schemas to the schema registry, and then others can search them and find them. You can take that stub generation and put it in your CI system, so the stubs are automatically built in CI every time the IDL, the proto definition file, changes. Those built stubs are pushed to pip, or npm, or your internal Artifactory, or whatever, so that if I’m writing a new service that wants to call a service called foo, and I’m writing in Python, I just pip install foo-service-client. I don’t have to go and grab the IDL, run the tooling on it myself and do the generation, and I don’t have to copy the code into my code base. I can depend on it as a package. Then I can use something like Dependabot to automatically upgrade those stubs. If a new version of the API is published, then a new version of the client library that can call all of the new methods will be generated, and Dependabot can come along and suggest or even do the upgrade for me.

Modification

I want to modify an API; it’s going through its lifecycle. We should version APIs from day one, and we should use semver to do that. Probably teaching you to suck eggs, but it’s worth saying. When I want to go from 1.0 to 1.1, this is a non-breaking change; we’re just adding a method. Like I’ve already said, CI/CD will spot the new IDL file, generate and publish new clients, Dependabot can come along, and it should be safe to automatically upgrade the services that use them. Then, next time you’re hacking on your business logic, you can just call the new method. You type clientlibrary. in your IDE, and autocomplete will tell you the latest set of methods that are available, because they’re local function calls on that SDK. When I want to go to v2, that’s a breaking change, say I’ve removed a method or renamed a field. Again, CI/CD can spot that new IDL file, generate a new client with a v2 version on the package now, and publish that. People are going to have to do this dependency upgrade manually, because if the API changes on the wire in a breaking way, then the API of the SDK is also going to change in a breaking way. You might be calling the method on the SDK that called the endpoint on the API that’s been removed. This is a potentially breaking change, so people have to do the upgrade manually and deal with any fallout in their code. The best thing to do is to not make breaking changes: just go to 1.2 or 1.3, and never actually have to declare v2. We can do this with breaking change detection, and the Buf tool, which is one of the reasons I like it so much, can do this for protobuf files. Given an old and a new protobuf file coming through your CI system, Buf can tell you whether the difference between them is a breaking change or an additive one. That’s really nice for stopping people and making them think: I didn’t mean for that to be a breaking change. Or, yes, that is annoying, let me think whether I can do this in a way that isn’t breaking on the wire.
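To show the rule such a check enforces, here is a toy illustration in Python; this is not how Buf works internally, just the principle that removing or renaming something in the API surface is breaking, while only adding things is not.

# Toy illustration of breaking-change detection, not Buf's implementation.
def is_breaking(old_methods, new_methods):
    # Anything that existed before and is gone now breaks existing callers.
    removed = set(old_methods) - set(new_methods)
    return len(removed) > 0

print(is_breaking({"GetUser", "CreateUser"}, {"GetUser", "CreateUser", "DeleteUser"}))  # False: additive
print(is_breaking({"GetUser", "CreateUser"}, {"GetUser"}))                              # True: CreateUser removed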

Deprecation

How do we deprecate them? If I had to make a v2.0, I don’t really want v1.0 to be hanging around for a long time, because realistically I’m going to have to offer both. It’s a breaking change, and maybe all the clients aren’t up to date yet. I want to get them up to date, I want to get them all calling v2, but until they are, I’ve got to keep serving v1, because a v1-only client is not compatible with my new v2. As I say, keep offering v1. A way to do this is often to take your refactored, improved code that natively has a v2 API and write an adapter layer that keeps serving v1. If the code has to be that different, then you can have two different code bases and two different pods running. The advantage of the approach I’ve been talking about is that you can go and proactively deprecate these older clients. To do that, you need to make sure that no one’s still using them. We’ve got v2 now, we want everybody to be using v2, we want to turn off v1, so we want to delete the code in the pod that’s offering v1, or turn off the old v1 pods, or whatever it is. We can’t do that if people are still using it, obviously, or potentially still using it. The number of people I’ve seen try to work out whether v1 is still being used by just looking at logs or sniffing network traffic; that’s only data from the last five minutes, or seven days, or something. I used to work in a financial institution, and that doesn’t tell you whether, if you turn it off now, in 11 months, when it comes around to year-end, some batch process or some subroutine is going to run that expects to be able to call v1, and it’s going to blow up and you’re going to have a big problem.

If we build those client stubs into packages and push them to something like a pip registry, then we can use dependency scanners, because we can see which repos in our GitHub are importing foo-service-client version 1.x. If we insist that people use client stubs to call everything, and we insist that they get those client stubs from the published packages, then the only way anybody can possibly ever call v1, even if they’re not doing it now, is if their code imports the autogenerated foo client library v1. We can use a security dependency scanner to go and find that, and then we can go and talk to them. Or at least we’ve got visibility: even if they can’t or won’t change, we know it’s not safe to turn off v1.
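A crude sketch of that scanning idea in Python, assuming the hypothetical foo-service-client package name from earlier and a directory of repository checkouts with requirements.txt files; a real dependency scanner does the same thing with far more rigour.

from pathlib import Path

def repos_still_on_v1(checkout_root):
    # Flag any checked-out repo whose requirements.txt still pins the 1.x client.
    hits = []
    for req_file in Path(checkout_root).glob("*/requirements.txt"):
        for line in req_file.read_text().splitlines():
            if line.strip().startswith("foo-service-client==1."):
                hits.append(req_file.parent.name)
    return sorted(set(hits))

print(repos_still_on_v1("/srv/checkouts"))  # the path is an assumption; point it at your own checkouts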

Feature Mapping

This slide basically says, for each of the features that you’re probably getting from an API gateway, where should it go? There are actually a couple of cases where you do need to keep an API gateway for those kinds of features, things like advanced web application firewall stuff, or advanced AI-based bot blocking. I haven’t seen any sidecars that do that yet; that product marketplace is just less mature, and it’s full of open source software at the moment, while these are big, heavy R&D value-adds. For those you might want to keep a network gateway. For the incidental stuff, the slide is about whether you want to move that code into the service itself, into the business logic, whether you want to use a service mesh sidecar, or whether you want to shift it left.

Recap

I think API gateway is a nebulous term for a bunch of features that have been piled into what used to be ingress proxies. These features are useful, and API gateways are being used to provide them, but they’re now being used in places they’re not really suited to, like the middle of a microservices network. Service meshes, and then this shift-left API management tooling, can take on most of what an API gateway does. Like I said, API gateways still have a place, especially for internet-facing ingress, although you probably need something like a CDN and regional caching even further left than your API gateway anyway. In this day and age, you probably shouldn’t have an API gateway exposed to the raw internet. These patterns, CDNs, edge compute, and service meshes, are all standard now. I wouldn’t be afraid of adopting them; I think this is a reasonably well trodden path.

Practical Takeaways

You can incrementally adopt sidecars. The service meshes support incremental rollouts to your workloads one by one, so I wouldn’t be too worried about that. I think sidecars will get more of these API gateway features, like the advanced graph stuff, over time, so I don’t think you’re painting yourself into a corner, and you’re not giving yourself a much bigger operational overhead forever. Check out what your CDN can do when you’ve got those few features left in the API gateway; CDNs can be really sophisticated these days, and you might find that they can do everything that’s left, so you really can get rid of the API gateway. The shift-left management tooling can also be incrementally adopted. Even if you’re not ready to adopt any of this stuff, if I’ve convinced you this is a good way of doing things and you think it’s a good North Star, then you can certainly design with this stuff in mind.

Questions and Answers

Reisz: You’ve mentioned, for example, asking what problem you are trying to solve. When we’re talking about moving from an API gateway to a service mesh, if you’re in something where the network isn’t as predominant, more of a modular monolith, when do you start really thinking a service mesh is a good solution for some of your problems? At what point in a modular monolith is it a good idea to begin implementing a service mesh?

Turner: I do like a service mesh, and there’s no reason not to do it from the start. Even in the worst case, where you have just the one monolith, it still needs that ingress piece, that in and out of the network to the internet, which is probably what a more traditional API gateway or load balancer is doing. A service mesh will bring an ingress layer of its own that can do a lot of those features. Maybe not everything, if you’ve subscribed to an expensive API gateway that does AI-based bot detection and the like, but if it satisfies your needs, it can do a bunch of stuff. Then, as soon as you do start to split that monolith up, you never have to be in the position of writing any of that resiliency code, or suffering outages because of networking problems. You get it there, proxying all of the traffic in and out, you get a baseline, you can see that it works and doesn’t affect anything. Then, as soon as you make that first split, split one little satellite off, implement one new separate service, you’re already used to running this thing and operating it, and you get the advantages straight away.

Reisz: One of the other common things we hear when you first start talking about service mesh, particularly in that journey from modular monolith to microservices, is the overhead cost. We’re taking a bunch of those cross-cutting features, like retries and circuit breakers, out of libraries and putting them into reverse proxies, and that has an overhead cost. How do you answer people when they say, I don’t want to pay the overhead of having a reverse proxy at the ingress to each one of my services?

Turner: It might not be for you if you’re doing high frequency trading or something like that. I think it’s going to depend on your requirements, and knowing them, maybe, if you haven’t done it yet, going through that exercise of agreeing and writing down your SLAs and your SLOs, because this might be an implicit thing and people are just a bit scared. Write it down. Can you cope with a 500-millisecond response time? Do you need 50? Where are you at now? How much budget is there left? That code is either happening anyway in a library, in which case the cycles are being used within your process and you’re just moving them out, or maybe it’s not happening at all, so things look fast, but are you prepared to swap a few dollars of cloud compute cost for a better working product? Yes, having that as a separate process, there are going to be a few more cycles used because it’s got to do a bit of its own gatekeeping, and they do use a fair amount of RAM typically. That’s maybe a cost-benefit thing. They do have a theoretical throughput limit. What I’d say is that’s probably higher than yours: these proxies do one job and they do it well. Envoy is written in C++, the Linkerd one is written in Rust, and these are pretty high-performance systems languages. The chances of their cap on throughput being lower than your application’s, which may be written in Java, or Python, or Node, is actually fairly low. Again, it depends on your environment. Measure and test.

Reisz: It’s a tradeoff. You’re focusing on the business logic and trading off a little bit of performance. It’s all about tradeoffs. You may be trading off that performance so that you don’t have to worry about dealing with the circuit breakers and the retries; you can push that into a different tier. At least, those are some of the things that I’ve heard in that space.

Turner: Yes, absolutely. I think you’re trading a few dollars for the convenience and the features, and you’re trading some milliseconds of latency for the same thing. If it’s slowing you down a lot, that’s probably because you’re introducing this stuff for the first time, and it’s probably well worth it. The only time where it’s not a tradeoff, where it’s a straight-up hindrance, is probably that cap on queries-per-second throughput. Unless you’ve got some well-optimized C++ and you’re writing a trading system or something, the chances are it’s not your bottleneck and it won’t be, so there really isn’t a tradeoff there.

Reisz: There’s sidecar or not sidecar. We’ve been talking about Istio and its kind of Envoy sidecar model, but there are interesting things happening with sidecar-less service meshes. Any thoughts or comments there?

Turner: It’s a good point. The sidecar is the worst case, but the model we have now gets things working. The service mesh folks have been using Envoy because it’s very good at what it does and it already exists, as a separate Unix process. I think things aren’t bad at the moment. With BPF moving things into the kernel, there’s this thing called the Istio CNI: if you’re really deep into your Kubernetes, the Istio folks have written a CNI plugin which provides the interface into your pod’s network namespace, so you don’t have to use iptables to forcefully intercept the traffic, which means you save a hop into kernel space and back. Basically, technological advancement is happening in this space, and it’s only getting better. You’re probably ok with the tradeoffs now; if you’re not, watch this space. Go look at some of the more advanced technologies coming out of the FD.io VPP folks, or Cilium, or that kind of thing.

Reisz: Any thoughts on serverless? Are these types of things all provided by the provider? Is there really no service mesh that you can implement yourself, is it just at the provider? What are your thoughts for people who are in the serverless world?

Turner: If I think of a serverless product like Knative Serving or OpenFaaS, something that runs as a workload in Kubernetes, then as far as Kubernetes is concerned that’s opaque, and it may well be hosting a lot of different functions. If you deploy your service mesh in Kubernetes, then you’re going to get one sidecar that sits alongside that whole blob, so it will do something, but it’s almost like an ingress component into that separate little world, which may or may not be what you want. You may be able to get some value out of it; you’ll get the observability piece, at least. I don’t personally know of any service meshes that can extend into serverless, and I don’t know enough about serverless to know what the individual platforms like Lambda or OpenFaaS offer natively.

Reisz: Outside of Knative, for example, they can run on a cluster.

Turner: Yes, but Knative is maybe its own little world, so Kubernetes will see one pod, but that will actually run lots of different functions. A service mesh is going to apply to that whole pod. It’s going to apply equally to all of those functions. It doesn’t have too much visibility inside.




CloudNativeSecurityCon 2023: SBOMs, VEX, and Kubernetes

MMS Founder
MMS Mostafa Radwan

Article originally posted on InfoQ. Visit InfoQ

At CloudNativeSecurityCon 2023 in Seattle, WA, Kiran Kamity, founder and CEO of Deepfactor, led a panel discussion on software supply chain security, the practical side of SBOMs, and VEX.

Kamity started the talk by underscoring how some organizations are rushing to create SBOMs because they have until the beginning of June of this year to comply with the US executive order on cybersecurity.

He mentioned that the goal of this panel discussion is to bring together experts on the operational aspects of cybersecurity to answer questions such as how to create SBOMs, how to store them, and what to do with them.

After introducing the speakers, Kamity started the discussion by asking the panel what is an SBOM and why people should care about it.

Allan Friedman, Senior Advisor at CISA, referred to SBOMs as dependency treatment to help us achieve more transparency. He pointed out:

Transparency in the software supply chain is going to be a key part of how we make progress in thinking about software.

He mentioned that the log4j vulnerabilities crisis of December 2021 won’t be a one-off event and SBOMs can help developers build secure software, IT leaders select software, and end-users react faster.

Furthermore, he indicated that there are plenty of tools today, both proprietary and open source, to generate SBOMs including the two formats SPDX, and CycloneDX.

Friedman ended by referring to the Vulnerability Exploitability eXchange (VEX), which will allow organizations to assess the exploitability level of vulnerabilities in order to prioritize and focus on those that matter to them.

InfoQ sat with Chris Aniszczyk, CTO of CNCF, at CloudNativeSecurityCon 2023 and talked about the event and the relevance of SBOMs.

 I love SBOMs. It is funny that we have been excited about SBOMs for a decade at the Linux foundation and now they’re everywhere. We’ve been prototyping with SBOMs for some projects in the CNCF and based on the tool used, it generates a different type of SBOM. Eventually, the tools will converge and generate a similar thing but we are not there yet.

Next, Rose Jude, Senior Open Source Engineer at VMware and maintainer of project Tern, an open source software inspection tool for containers, discussed the storage and distribution of SBOMs.

She mentioned that the focus in the community lately has been more on generating SBOMs and less on storing them.

Also, she pointed out that the considerations for SBOMs storage are no different than other types of cloud native artifacts including lifecycle management, caching, versioning, and access control. However, SBOMs’ association with artifacts is a unique thing.

She ended by underscoring that if you’re a software vendor, sharing your software SBOMs with your customers is a way to establish trust and help them understand their exposure and risk.

Kamity wrapped up the session by asking Andrew Martin, CEO of ControlPlane, how can teams start using SBOMs and what’s the payoff.

Martin pointed out that because there’s no standard way to distribute SBOMs today, end-users should ask for or pull SBOMs from their software vendors and scan package manifests using a container vulnerability tool to assess CVEs. He added that it’s a complex problem and further automation is needed.

Kamity recommended the Graph for Understanding Artifact Composition (GUAC) developed by Google as a guide to better understand how to consume, use, and make sense of SBOMs since it covers the proactive, preventative, and reactive aspects.

A Software Bill of Materials (SBOM) is a list of the components that make up an application, including its dependencies and their relationships, during its development and delivery.

The breakout session recording is available on the CNCF YouTube channel.



How to enable MongoDB for remote access – TechRepublic

MMS Founder
MMS RSS

Posted on nosqlgooglealerts. Visit nosqlgooglealerts

Looking to use your MongoDB server from another machine? If so, you must configure it for remote access.


MongoDB is a powerful and flexible NoSQL server that can be used for many types of modern apps and services. MongoDB is also scalable and can handle massive troves of unstructured data.


I’ve outlined how to install MongoDB on both Ubuntu and RHEL-based Linux distributions, but one thing that was left out was how to configure it for remote access.

Note that the installation for RHEL-based distributions has changed to accommodate the latest version of MongoDB. The new installation requires a different repository and installation command. The repository is created with the command:

sudo nano /etc/yum.repos.d/mongodb-org-6.0.repo

The content for that repository is:

[mongodb-org-6.0]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/6.0/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-6.0.asc

Finally, the installation command is:

sudo dnf install mongodb-org mongodb-mongosh -y

Now that you have MongoDB installed and running, you need to configure it for remote access. Why? Because you might want to use the MongoDB server as a centralized location to serve data to other remote machines.

What you’ll need to enable remote access in MongoDB

To enable MongoDB for remote access, you’ll need a running instance of MongoDB and a user with sudo privileges.

How to enable remote access for MongoDB

The first thing we must do is enable authentication. To do that, access the MongoDB console with the command:

mongosh

Change to the built-in MongoDB admin with:

use admin

Create a new admin user with the following:

db.createUser(
  {
    user: "madmin",
    pwd: passwordPrompt(), // or cleartext password
    roles: [
      { role: "userAdminAnyDatabase", db: "admin" },
      { role: "readWriteAnyDatabase", db: "admin" }
    ]
  }
)

You can change madmin to any username you like. You’ll be prompted to create a new password for the user. A word of warning: You only get one chance to type that password, so type it carefully.

Next, open the MongoDB configuration file with:

sudo nano /etc/mongod.conf

Locate the line:

#security:

Change that line to:

security:
    authorization: enabled

Save and close the file.

Restart MongoDB with:

sudo systemctl restart mongod

Now, we can enable remote access. Once again, open the MongoDB configuration file with:

sudo nano /etc/mongod.conf

In that file, locate the following section:

net:
  port: 27017
  bindIp: 127.0.0.1

Change that section to:

net:
  port: 27017
  bindIp: 0.0.0.0

Save and close the file. Restart MongoDB with:

sudo systemctl restart mongod

If you’re using the firewall on your server, you’ll need to open it for port 27017. For example, on Ubuntu-based distributions, that would be:

sudo ufw allow from remote_machine_ip to any port 27017

Reload the firewall with:

sudo ufw reload

Remote access granted

At this point, you should be able to connect to your MongoDB server on port 27017 using the new admin user and password you created above. That is all there is to enabling MongoDB for remote access. When you want to use that server as a centralized DB platform, this makes it possible.
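As a quick check from another machine, the snippet below uses the PyMongo driver (pip install pymongo) to connect and run a ping. Replace server_ip and your_password with your own values; authSource=admin matches the admin database the user was created in above.

from pymongo import MongoClient

# Replace server_ip and your_password with your own values.
client = MongoClient("mongodb://madmin:your_password@server_ip:27017/?authSource=admin")

# The ping command round-trips to the server and fails if authentication or networking is wrong.
print(client.admin.command("ping"))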




The Future of Databases Is Now – Datanami

MMS Founder
MMS RSS

Posted on nosqlgooglealerts. Visit nosqlgooglealerts


Databases–those workhorses of data management–have come a long way over the past 10 years. Users no longer must accept the tradeoffs that were commonplace in 2013, and the array of features and capabilities in relational and NoSQL databases is growing every month. In some ways, we’ve already arrived at the glorious future data architects envisioned for us way back then. So what’s holding us back?

The biggest change in databases is the cloud. While you could get a DynamoDB instance from AWS as far back as 2012, the cloud was pretty much an afterthought for databases. But thanks to continued investments by database vendors, cloud database services greatly improved, and by 2018, Gartner estimated that managed cloud database services accounted for $10.4 billion of the $46.1 billion DBMS market, or about a 23% share.

One year later, Gartner went out on a limb when it declared that cloud had become the default deployment method for databases. “On-premises is the new legacy,” the analysts declared. By 2020, thanks to the COVID-19 pandemic, cloud migrations had kicked into high gear, and managed cloud deployments accounted for $39.2 billion in revenue by 2022, a whopping 49% of the total $80 billion market, Gartner found.

Today, the cloud is the default mechanism for database deployments. Database vendors work hard to eliminate as much of the complexity as possible from deployments, utilizing containerization technology to create serverless database instances that scale up and down on demand. While data modeling continues to occupy customers’ time, operating and managing the database has practically been eliminated.

A Modern NoSQL Database

Ravi Mayuram, the senior vice president of products and engineering at NoSQL database vendor Couchbase, remembers the bad old days when database administrators (DBAs) would dictate what could and couldn’t be done with the database.

“We need to go to a place where the front- and back-end friction should go away, where more of the operational tasks of the databases are hidden, automated, and made autonomous,” Mayuram says. “All that stuff should basically go away when you’re getting to a point where the database is on tap, so to say. You just have a URL end point, you start writing to it, and it takes care of the rest of the stuff.”

NoSQL databases like Couchbase, Cassandra, and MongoDB emerged in response to limitations in relational databases, specifically the schema rigidity and lack of scale-out capabilities. Developers love the schema flexibility of document databases like Couchbase and MongoDB, which store data in JSON-like formats, while their distributed architectures allow them to scale out to meet growing data needs.

Since their introduction, many NoSQL vendors have also added multi-modal capabilities, which allows the database to shapeshift and serve different use cases from the same data store, such as search, analytics, time-series, and graph. And most NoSQL vendors have even embraced good old SQL, despite having their own query language optimized for their particular data store.

That flexibility appeals to customers and prospects alike, Mayuram says. “In Couchbase, you can write the data once and I can do a key-value lookup, I can do a relational query on it, I can do a full ACID transaction on it, I can search tokens. I can do analytics,” he says. “It’s more like a smartphone. It’s about five different data services in one place.”

Like Teslas, the differences with modern databases are under the hood

While Couchbase delivers some of the same functionality as a relational database, it goes about it in a completely different manner. Newer databases, such as Couchbase’s, are completely different animals than the relational databases that have roamed the land for the past 40 years. Today’s modern databases are more complex in some ways than the old guard, and it will take some time for enterprises to adjust to the new paradigm, Mayuram says.

“Sometimes you have to go slow to go fast,” he says. “There is going to be an amount of time in which we have to carry sort of both sides, if you will, until we can sort of cut over. That is not an easy task. It’s a generational shift. It’s going to take a little bit of time before your investment that you made in the past has to be transformed to the investment that we make for the future. There is a learning curve as well as an experience curve that you will go through.”

Familiarity will be critical to giving customers a sense of comfort as they slowly swap out the old databases for the newer generation of more-capable databases, Mayuram says.

“You can say there is no difference between Tesla and a regular car because it’s got the same steering wheel, the same tires, the same gas pedal, so what’s the difference?” he says. “What we are losing is our comfort. We just need to go to the next level to tackle the problem. That doesn’t mean you break away completely. You have to have the same SQL available to you. It’s the same steering wheel. Don’t take away the steering wheel. That’s where the comfort lives. Change the gas engine, which is saving all the pollution and, you know, dependency on oil and all that stuff. Change that.”

New Relational DBs

A similar but slightly different journey has taken place in the world of relational databases, which has seen its share of new entries. Vendors like Cockroach Labs, Fauna, and Yugabyte have sought to remake the RDBMS into a scale-out data store that can provide ACID guarantees for a globally distributed cluster. And like their NoSQL brethren, the new generation of relational databases can run in the cloud in a serverless manner.

Yugabyte, for example, has found success by fitting the open source Postgres database into the new distributed and cloud-first world. “Our unique advantage is we don’t enable one feature at a time,” says Karthik Ranganathan, Yugabyte’s founder and CTO. “We enable a class of features at a time.”

By starting with Postgres, YugabyteDB ensures compatibility not only with applications that have already been built for Postgres, but also ensures that the database works with the large ecosystem of Postgres tools, Ranganathan says.

However, unlike plain vanilla Postgres, YugabyteDB is a full-fledged distributed database, providing ACID guarantees for transactions in globally distributed clusters. Not every organization needs that level of capability, but the world’s biggest enterprises certainly do.

Yugabyte wraps that Postgres compatibility and distributed capability in a cloud-native delivery vehicle, dubbed YugabyteDB Managed, enabling users to scale their database clusters up and down as needed. In addition to scaling out by adding more nodes on the fly, YugabyteDB can also scale vertically.

Yugabyte has brought together all of these features into a single package, and it’s resonating in the market, Ranganathan says.

The cloud is now driving the bulk of database revenue, per Gartner

“You need the availability, resilience, and scale in order to be cloud-native because cloud is, after all, commodity hardware and it’s prone to failures and it’s a bursty environment,” he says. “And all of the features are there and the architectural way of thinking how to build an application [is there]…because they have the ecosystem, the tooling and the feature set. So that marriage has been amazing, and we’re getting incredible pull from companies.”

Many enterprises that would have traditionally looked to the trusted relational database vendors–the Oracles, IBMs, and Microsofts of the world–are looking to open source Postgres to save money. They’re short-listing the Postgres offerings from cloud vendors, such as Amazon Aurora and Amazon RDS, and giving YugabyteDB a try in the process.

YugabyteDB is winning its share of business. Kroger, for example, relies on the database to power its ecommerce shopping cart. Another customer is General Motors, which uses YugabyteDB to manage data collected from 20 million smart vehicles. And Temenos, which is one of the world’s largest banking solutions providers, is also running core transaction processing on YugabyteDB.

Ranganathan admits that some of this success is luck. He certainly couldn’t have foreseen that Postgres would become the world’s most popular database when he and his colleagues started work on the stealth project 10 years ago. But Ranganathan and his colleagues also deserve credit for doing the hard work to create a database that contains the other features enterprises want, which is the resiliency and scale of distributed processing and the ease of use that comes with cloud.

“Sometimes we get pulled into the conversation by the research the customers do and they tell us ‘Can you help us with this?’ So we’re kind of getting the ask handed to us,” Ranganathan says. “It’s still difficult. Don’t get me wrong…But we just really love the place we’re in and where the market is pulling us.”

The times are changing when it comes to databases. Today’s cloud-native databases provide better scalability, more flexibility, and are easier to use than the relational databases of old. For customers looking to modernize their data workhorses and reap the data-driven benefits that come with it, the future has never looked brighter than it does right now.

Related Items:

Cloud Databases Are Maturing Rapidly, Gartner Says

Cloud Now Default Platform for Databases, Gartner Says

Who’s Winning the Cloud Database War



AI-Based Code-Completion Tool Tabnine Now Offers Automatic Unit Test Generation

MMS Founder
MMS Sergio De Simone

Article originally posted on InfoQ. Visit InfoQ

One of the pioneers in the field, Tabnine is a code completion assistant that uses generative AI to predict and suggest the next lines of code based on the surrounding context. Tabnine is now opening beta access to new capabilities aimed at generating unit tests.

Unit testing was proclaimed dead by Rails developer David Heinemeier Hansson in 2014. It was possibly at that moment that the developer community started to split into two camps: those who defended unit testing and those who insisted on integration testing as a better way to ensure the proper behaviour of a software system.

In fact, unit testing is often considered a tedious and time-consuming task, and is thus neglected by many developers, says Tabnine. This is why they extended their vision of an AI-based software development life-cycle by adding unit test generation to their assistant:

Our new unit test generation capability uses cutting-edge AI technology to generate unit tests for your code automatically, helping ensure that your code is rigorously tested, resulting in fewer bugs and better code stability – especially important for larger projects or projects with tight deadlines.

The tool supports several languages, including Python, Java, and JavaScript, and is integrated with Visual Studio Code and JetBrains IDEs. According to Tabnine, the tool is able not only to generate unit tests but also to learn how to match them to the developer coding style and patterns.

Tabnine is able to provide code completions on three different levels by completing a line, completing a whole function, or converting natural language comments into code. It can run either in the Cloud or on premises to match distinct privacy and compliance requirements. It must be noted that Tabnine code completion supports a larger set of languages than unit testing generation, additionally including Rust, Go, and Bash.

In the last year several services have been launched to generate code suggestions, including GitHub Copilot, AWS CodeWhisperer, OpenAI Codex and others. Tabnine is the first to also provide unit testing generation.

On a related note, Tabnine has also announced it has reached 1M+ monthly users.



How to Choose the Right Database in 2023 – The New Stack

MMS Founder
MMS RSS

Posted on nosqlgooglealerts. Visit nosqlgooglealerts


Databases are often the biggest performance bottleneck in an application. They are also hard to migrate away from once they are in production, so making the right choice for your application’s database is crucial.

A big part of making the right decision is knowing what your options are. The database landscape has been changing rapidly in the past few years, so this article will try to simplify things for you by going over the following topics:

  • An overview of the database ecosystem in 2023
  • What actually makes different types of databases perform differently from a technical perspective
  • When to use a specialized database vs. a general-purpose database

The Database Landscape in 2023

Before diving into things, let’s look at a snapshot of the current database ecosystem and the market share of various types of databases:

As you can see, relational databases are still the most used type of database despite all the hype around NoSQL databases. However, if we look at recent trends, the ranking tells a slightly different story.

This chart shows that over the past two years relational databases have been losing a bit of ground to several different types of database models. The following are some of the main database models that are gaining adoption with developers:

What Makes Databases Perform Differently?

When it comes down to database performance, there’s nothing magical that makes one perform better than another. Like all things computer science, it comes down to trade-offs that allow performance to be optimized for specific use cases. For databases specifically, CAP theorem is a good introduction to some of the possible trade-offs made to tune performance.

For example, in the early days of NoSQL databases, there was a lot of hype around their scalability, but the trade-off generally involved sacrificing data consistency guarantees provided by standard relational databases.

Some other design factors that will affect how a database performs:

  • On-disk storage format — How a database actually stores and organizes data on hard drives has a major impact on performance. As more companies begin storing huge amounts of data intended for analytics workloads, storing data on disks in a column-based format like Parquet is gaining popularity.
  • Primary index data structure — How a database indexes data will also have a major impact on performance. Databases generally have a primary index used by their storage engine and then allow users to define secondary indexes. The simplest way to think about indexing is that they will help improve read performance but add overhead to writing new data points.
  • Data compression — How data is compressed will factor into how much it costs to store your data and the query performance of the database. Some compression algorithms are designed to reduce the size of your data as much as possible. Others might have a lower compression ratio but are faster when it comes to decompressing the data, which means that you get better query performance of your data.
  • Hot and cold storage — Many database systems now allow for data to be moved between faster and more expensive “hot” storage, and cheaper but slower “cold” storage. In theory this allows for better performance for frequently queried data and for saving money on storage while still allowing the data in cold storage to be accessed rather than outright deleted.
  • Durability/disaster recovery — How a database handles disaster recovery plays a role in performance as well. Designing a database to mitigate various failures will generally decrease performance, so for some use cases where data isn’t mission critical and occasionally losing data points is fine, databases can remove some safety guarantees to squeeze out better performance.

All of these factors, as well as many others that weren’t covered, play into the performance of a database. By twisting these levers, a database can be optimized for very specific performance characteristics, and sacrificing certain things won’t actually be a problem because they aren’t needed for a certain situation.

When to Use a Specialized Database for Your Application

There are a number of factors that go into deciding which database to use for your app. Let’s take a look at some of the major things you need to consider when choosing the database for your application.

Data Access Patterns

The primary factor in choosing a database is how the data in your application will be created and used. The broadest way to start with is probably to determine whether your workload will be online analytical processing (OLAP) or online transaction processing (OLTP). OLAP workloads are analytics-focused and have different access patterns compared to the more standard OLTP workloads that relational databases were designed to handle. OLAP queries generally only hit a few columns to perform calculations and can be optimized by using a columnar database designed for this. As an example, most data warehouses are built on top of column-oriented databases due to the performance benefits.

Once you’ve broadly determined the type of workload, you now need to consider things like the latency requirements for queries and how frequently data will be written. If your use case needs near-real-time queries with low latency for tasks like monitoring, you might consider a time-series database that is designed to handle high write throughput while also allowing data to be queried soon after ingest.

For OLTP-style workloads, you’ll typically be deciding between a relational database or a document database. The key factor here will be looking at your data model and determining whether you want the schema flexibility provided by NoSQL document databases or if you would prefer the consistency guarantees provided by relational databases.

One final thing you could consider is whether you expect your workload to be fairly consistent throughout the day or if it will be “bursty” and require your database to occasionally handle far larger volumes of reads and writes. In this case, it would make sense to use a database that makes it easy to scale your hardware up and down so you aren’t facing downtime or high costs for hardware that isn’t required most of the time.

In-House Knowledge

Your team’s existing skill set should be taken into consideration when deciding on what to use for your database. You need to determine whether the potential gains of using a specialized database are worth the investment in training your team to learn how to use it and the lost productivity while learning a new technology.

If you know the service you are building won’t need to be fully optimized for performance, it would be fine to use whatever database your team is most familiar with to get the job done. On the other hand, if you know performance is critical, it may be worth the growing pains of adopting a new database.

Architectural Complexity

Keeping your software’s architecture as simple as possible is ideal, so adding another component to a system like a new database should be weighed against the additional complexity that managing the database will add to the system.

This isn’t as big of an issue if your application is such an ideal fit for a specialized database that it can act as the primary database for the application’s data. On the other hand, if you will be using a more general-purpose database as the primary storage for the app, bringing on an additional database for a subset of the data might not be worth it unless you are facing serious performance problems.

Conclusion

The database ecosystem is evolving rapidly. While going with a database you know is always a good option, it makes sense for developers to keep tabs on some of the new technologies that are being released and see if they are a good option for what you are building. Building on a specialized database can help your application succeed in a number of ways by saving you money on costs, improving performance for users, making it easier to scale and improving developer productivity.





Database Manager Job Description – Solutions Review

MMS Founder
MMS RSS

Posted on nosqlgooglealerts. Visit nosqlgooglealerts

Database Manager Job Description


Solutions Review editors assembled this resource to provide you with a comprehensive database manager job description.

A database manager is a professional responsible for designing, implementing, and maintaining databases. They are proficient in various database technologies and tools, including SQL and NoSQL databases, to help organizations manage, process, and analyze large data sets.



Key Responsibilities

  1. Design and Develop Databases: Database managers are responsible for designing and developing databases that can store and organize large volumes of data. They use their expertise in database technologies to ensure that the databases are efficient, scalable, and secure.
  2. Manage and Maintain Databases: Database managers are responsible for ensuring that the databases they develop are running smoothly. They monitor system performance, diagnose and troubleshoot issues, and make necessary changes to optimize system performance.
  3. Data Integration and Processing: Database managers are responsible for processing, cleaning, and integrating large data sets from various sources to ensure that the data is accurate, complete, and consistent.
  4. Data Modeling: Database managers are responsible for designing and implementing data modeling solutions to ensure that the organization’s data is properly structured and organized for analysis.
  5. Collaboration: Database managers work closely with cross-functional teams, including data scientists, analysts, and business stakeholders. They collaborate with these teams to ensure that the databases they develop meet the organization’s requirements and can support its goals.

Qualifications

  1. Education: A Bachelor’s or Master’s degree in computer science, data science, or a related field is required.
  2. Technical Skills: Database managers must be proficient in various database technologies, including SQL and NoSQL databases. They must also have a strong background in programming languages such as SQL, Python, or Java.
  3. Analytical Skills: Database managers must have strong analytical skills to identify patterns and insights from large and complex data sets.
  4. Collaboration Skills: Database managers must have excellent communication and collaboration skills to work effectively with cross-functional teams.

Benefits

  1. Competitive Salary and Benefits: Database managers are in high demand, and as a result, they are often compensated well.
  2. Opportunities for Career Growth and Professional Development: Database managers have ample opportunities to advance their careers and gain new skills through training and professional development programs.
  3. Dynamic and Collaborative Work Environment: Database management often involves working in cross-functional teams, collaborating with other professionals in areas such as engineering, product management, and marketing.
  4. Access to Cutting-Edge Technology and Tools: Database management requires the use of sophisticated tools and technologies, and database managers are often provided with the latest software and hardware to do their work.

In summary, database management is a rapidly growing field with an increasing demand for skilled professionals who can design, build, and maintain databases. Database managers must have a strong technical background, analytical skills, and excellent collaboration skills. They must be able to design and develop efficient, scalable, and secure databases, manage and maintain these databases, and work closely with cross-functional teams to meet the organization’s requirements and goals. If you are passionate about working with data and have the qualifications and skills required for the role, a career in database management may be an excellent fit for you.

This article on database manager job description was AI-generated by ChatGPT and edited by Solutions Review editors.



How to Lead and Manage in This Brave New Remote and Hybrid World

MMS Founder
MMS Ben Linders

Article originally posted on InfoQ. Visit InfoQ

Hybrid working is a mindset of trusting people and providing opportunities to get the best from everyone regardless of place and time. Managers have the opportunity to make people feel empowered, motivated, and productive. Alternatively, they can squash creativity, fun and psychological safety. Erica Farmer will speak about working techniques and leadership practices for hybrid and remote working at QCon London 2023. This conference is held March 27-29.

According to Farmer, people describe hybrid models of working in different ways:

The purest and most technical answer would be something along the lines of “so many days in the office and so many days in another location, such as home”, however, I don’t buy this.

Farmer considers hybrid to be a mindset which is all about personalisation. It’s about providing opportunities to get the best from your people no matter where or when they are working. It’s all about what’s best for the company AND the individual – so it’s a balance. Something which might work for one person, might not work for another – so we need to offer personalised and supported working structures which are motivational and mutually beneficial, Farmer argues.

According to Farmer, managers and leaders can support hybrid teams by providing coaching and support which starts with trust. Trust that your people are doing the right things, trust in their competence, and trust that they will come to you when you need help.

It’s all about the environment you create, Farmer argues; not the physical environment, but a psychologically safe environment. This means creating a culture where team members can raise their head above the parapet and suggest something new without fear of reprimand or embarrassment.

InfoQ interviewed Erica Farmer about leading in hybrid and remote environments.

InfoQ: What are the challenges of hybrid and remote working?

Erica Farmer: It’s a mindset thing. Some managers and leaders will automatically answer this question from a trust and behavioural perspective (which actually says more about their management style than anything else!). A great use of collaborative technology, dialling up communication, support and autonomy, can provide a fantastic working environment for team members who feel valued and treated like individuals.

We all have our own preferences based on how we like to work and our work/life balance, and to get the best from our people we need to put this at the forefront of our practice.

Don’t get me wrong; sometimes there are times when we just need to be in the office. But that’s our own decision to make and output to deliver, which is how we should be managing our people – output.

InfoQ: How can managers deal with these challenges and support hybrid teams?

Farmer: We have found that some managers have taken to hybrid and remote management like a duck to water, and some have really struggled. No longer can we as managers and leaders peer over someone’s shoulder to quickly judge the quality of their work in the moment. And arguably I would question, was this ever the best way to manage people?

The best manager I ever had was a manager who provided the platform for me to spread my wings. Just enough guidance, and plenty of room to make my own decisions (and mistakes). She always listened to my thinking about a project or output, and held me accountable for this in a fair, kind and direct way. This was through clear feedback, praise, approachability and a good listening ear. Oh, and by the way, we never really saw each other as we lived in different parts of the country!
