
MMS • RSS

MongoDB had its Relative Strength (RS) Rating upgraded from 68 to 71 Tuesday — a welcome improvement, but still shy of the 80 or better score you prefer to see.
IBD’s unique RS Rating tracks technical performance by showing how a stock’s price action over the last 52 weeks measures up against that of the other stocks in our database.
Decades of market research shows that the stocks that go on to make the biggest gains tend to have an RS Rating north of 80 in the early stages of their moves. See if MongoDB can continue to show renewed price strength and clear that threshold.
While MongoDB is not near a proper buying range right now, see if it manages to form and break out from a proper base.
MongoDB saw both earnings and sales growth accelerate last quarter. Earnings-per-share growth increased from 49% to 96%, while revenue growth rose from 20% to 22%. Keep an eye out for the company’s next round of numbers on or around Aug. 28.
The company earns the No. 2 rank among its peers in the Computer Software-Database industry group. Oracle is the No. 1-ranked stock within the group.
This article was created automatically with Stats Perform’s Wordsmith software using data and article templates supplied by Investor’s Business Daily. An IBD journalist may have edited the article.

MMS • Anthony Alford

Apple open sourced DiffuCoder, a diffusion large language model (dLLM) fine-tuned for coding tasks. DiffuCoder is based on Qwen-2.5-Coder and outperforms other code-specific LLMs on several coding benchmarks.
Unlike typical LLMs, which generate text auto-regressively and “left-to-right,” dLLMs generate text by de-noising an entire sequence in parallel, which can mean faster generation. Apple’s researchers developed DiffuCoder so they could investigate the best strategies for dLLM fine-tuning and inference. In their research, they developed a variation of the Group Relative Policy Optimization (GRPO) fine-tuning technique, called coupled-GRPO, which improves the model’s performance. On the MBPP coding benchmark, DiffuCoder outperformed Gemini Diffusion and was “competitive” with GPT-4o. According to Apple,
By using a novel coupled-sampling strategy, our method provides a more accurate likelihood estimation. Coupled-GRPO significantly boosts DiffuCoder’s performance, demonstrating the effectiveness of RL methods aligned with diffusion principles. Our work provides the community with a deeper understanding of dLLMs and lays a strong foundation for future explorations of dLLMs in complex reasoning and generation tasks.
Most LLMs, like OpenAI’s GPT models, generate text auto-regressively by predicting a single next token to append to a sequence, then feeding the new sequence back as input. dLLMs take an approach similar to image-generation models like DALL-E: they start with a noisy sequence and iteratively de-noise it. This allows dLLMs to generate output much faster than autoregressive LLMs: up to five times faster in the case of Gemini Diffusion. Furthermore, they are not constrained to produce text left-to-right. Instead, they can perform a “global planning of content,” which can be an advantage in coding tasks.
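To make the contrast concrete, the following is a minimal, framework-free sketch of the two decoding loops. The model calls (predictNextToken, predictAllMasked) are hypothetical stand-ins rather than any real dLLM API, and the confidence-based selection is simplified to illustrate filling in several masked positions per de-noising step.

```java
import java.util.*;

public class DecodingSketch {
    static final String MASK = "<mask>";

    // Autoregressive decoding: one token per model call, strictly left to right.
    static List<String> autoregressive(int length) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < length; i++) {
            out.add(predictNextToken(out)); // hypothetical model call
        }
        return out;
    }

    // Diffusion-style decoding: start fully masked and de-noise the whole
    // sequence, unmasking a few positions per step, in any order.
    static List<String> diffusionStyle(int length, int tokensPerStep) {
        List<String> seq = new ArrayList<>(Collections.nCopies(length, MASK));
        while (seq.contains(MASK)) {
            // Hypothetical model call: proposals for every masked position,
            // assumed here to already be ordered by model confidence.
            Map<Integer, String> proposals = predictAllMasked(seq);
            proposals.entrySet().stream()
                    .limit(tokensPerStep)
                    .forEach(e -> seq.set(e.getKey(), e.getValue()));
        }
        return seq;
    }

    // Toy stand-ins so the sketch runs; a real dLLM would score tokens here.
    static String predictNextToken(List<String> prefix) { return "tok" + prefix.size(); }
    static Map<Integer, String> predictAllMasked(List<String> seq) {
        Map<Integer, String> m = new LinkedHashMap<>();
        for (int i = 0; i < seq.size(); i++) {
            if (MASK.equals(seq.get(i))) m.put(i, "tok" + i);
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(autoregressive(6));    // one token appended per step
        System.out.println(diffusionStyle(6, 2)); // same tokens, filled in two per step
    }
}
```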
One key outcome of Apple’s research was the creation of autoregressive-ness (AR-ness) metrics, which measure how closely a model follows the left-to-right generation pattern of autoregressive LLMs. They found that dLLMs often do exhibit a high degree of AR-ness, likely due to the inherent nature of text generation. However, when generating code, AR-ness drops.
They also found that increasing the sampling temperature affected the model’s AR-ness by making the model more flexible in both its choice of tokens and token order. This improved its “pass@k” score on coding benchmarks. The researchers point to past work showing that an RL fine-tuned model’s reasoning ability is “bounded by the base model’s pass@k sampling capabilities,” which suggested that DiffuCoder had “substantial” potential for improvement. This led to their development of coupled-GRPO RL training, which did improve DiffuCoder’s benchmark results, by over six percentage points in some cases.
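For context, pass@k is commonly reported using the unbiased estimator popularized by the HumanEval benchmark: given n generated samples per problem, of which c pass the tests, pass@k = 1 - C(n-c, k)/C(n, k). A small sketch of the numerically stable product form (the class and method names are illustrative):

```java
public class PassAtK {
    // Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k), computed in the
    // numerically stable product form 1 - prod_{i=n-c+1}^{n} (1 - k/i).
    static double passAtK(int n, int c, int k) {
        if (n - c < k) {
            return 1.0; // every k-sized subset of the n samples contains a passing one
        }
        double failAll = 1.0;
        for (int i = n - c + 1; i <= n; i++) {
            failAll *= 1.0 - (double) k / i;
        }
        return 1.0 - failAll;
    }

    public static void main(String[] args) {
        // e.g. 200 samples per task, 50 of which pass the unit tests
        System.out.println(passAtK(200, 50, 1));   // 0.25, i.e. c/n
        System.out.println(passAtK(200, 50, 10));  // chance at least one of 10 samples passes
    }
}
```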
In a discussion on Hacker News, one user wrote:
A diffusion model comes with a lot of benefits in terms of parallelization and therefore speed; to my mind the architecture is a better fit for coding than strict left to right generation…Overall, interesting. At some point these local models will get good enough for “real work” and they will be slotted in at API providers rapidly. Apple’s game is on-device; I think we’ll see descendants of these start shipping with Xcode in the next year as just part of the coding experience.
The DiffuCoder code is available on GitHub. The model files can be downloaded from Hugging Face.

MMS • Jiangjie Qin

Transcript
Qin: I’m going to talk a little bit more about stream and batch processing convergence in Apache Flink. My name is Becket. I’m currently a software engineer at LinkedIn. I used to work on mainframes at IBM, then worked on Kafka at LinkedIn, and then moved to Flink. I’m mostly focusing on data infra development. Currently, I’m a PMC member of Apache Kafka and Apache Flink, and also a mentor of Apache Paimon and Apache Celeborn.
Outline
First of all, I’m going to introduce a little bit about what stream and batch unification is, and what the motivation and use cases for it are. Then, we’re going to look at stream and batch unification in Flink, so how exactly Flink does that. We’re going to dive deep into the computing model and the execution model of stream and batch unification. Finally, we’re going to take a look at some of the future work that we’re trying to accomplish in Apache Flink.
Motivation and Use Cases of Stream and Batch Unification
Let’s take a look at the motivation of stream and batch unification first. To explain that, imagine that you have a bunch of data in your organization or company, and, apparently, you want to build some data application around the data. You want to get some insights out of it. Your application team will probably ask for a few things, like data freshness, throughput, cost, and also, they want to be able to express their business logic, of course. In addition to that, they probably will ask for scalability, stability, operability, and so on, as well. Now the question is, how are you going to tackle all those requirements? The answer is data infrastructure. That’s the one in the middle. Data infrastructure actually consists of multiple things. Apparently, you need a computing engine.
At this point, we have many computing engines out there. You have MapReduce, which is old school, and you have Hive, Samza, Flink, Spark, Pinot, Trino, all those different engines to tackle different problems. Also, you have storage. Your data infra must have storage. Then, we’re talking about all the message queues, KV stores, file systems. To name some projects, we have Ambry, HBase, we have HDFS, Kafka, Venice, Redis, all those projects out there. Then, you also need some control plane to do orchestration and resource management and so on. We have YARN. We have Kubernetes. We have Airflow. We have Azkaban. There are some new projects like Flight going on in this domain. Also, you need data modeling.
In order to model your data, you need some format for it. There we see Avro, Parquet, ORC, and all the metadata management out there for you. Eventually, you also need some supporting tooling. For example, you will need metrics, logging, alerting, testing, and releasing, all the support tooling to help you build your data infrastructure projects and applications. Everything in the middle makes up this whole data infra domain.
Now let’s take a look at an example. If I’m provided with all those things that I just mentioned, as an application developer, how would I develop my application? One typical example use case we see is this: we have an online application generating user activity events, and I’m pumping those events into Kafka. Then, those events are processed by my streaming feature generation. I’m trying to generate some features for my machine learning model. What I can do is have a Flink job reading from Kafka, with the messages in Avro format, and I’m using Kubernetes to orchestrate this job. The resulting features are pumped into the Feature Store. Let’s assume the Feature Store is implemented on top of Venice, which is an open-source key-value store. This is just one pipeline, the streaming feature generation pipeline.
Meanwhile, you probably will ETL all the user activity to your HDFS as well. You will have your batch feature generation jobs running here, which are also pumping features into the Feature Store. We’re talking about the batch pipeline here. The projects involved probably include HDFS, Spark, Parquet, which is a columnar format. You also have Airflow here, and YARN, if it’s old school. As you can see, the interesting thing is that the streaming pipeline and the batch pipeline actually have quite different tech stacks. We’re basically talking about two completely different ecosystems when it comes to the same thing, which is feature generation. I think part of the reason is because, in the past, the batch ecosystem was built first and streaming was introduced much later. That’s why you see this divergence between the ecosystems of batch processing and stream processing.
What we want to achieve ideally is the following: we don’t want to distinguish between stream and batch, because for the same exact logic, I otherwise need to develop twice, once for streaming and once for batch. What I want is just one unified storage that can provide me with the key-value access pattern, the queue access pattern, which is like the messaging system, and also the range scan pattern.
In terms of the processing part, I don’t want to have two separate pipelines. I just want to have one tech stack that can handle both streaming and batch feature generation. Here we can imagine that I can use Flink as the convergence engine, and I’m just going to use Kubernetes across the board, use Airflow for workflow orchestration, and hopefully I can just unify all the formats with just one format. I don’t need to do all those type system conversions along the way. This would be the ideal world. We will have unified storage. We will have unified computing engine, unified orchestration control plane, unified data model, and unified tooling as well.
What’s the benefit of having that? The benefit of having that is that we’re going to dramatically lower our overall cost for data infrastructure. When we talk about cost of data infrastructure, usually we’re talking about these parts. If I’m adopting a new technology, there will be a migration cost. Whatever technology that I adopt, there will probably be a learning cost, meaning that if I have a new hire, they probably need to learn what’s the tech stack there, so that’s a learning cost. When I have my stack running there, I will probably need to maintain it just to make sure it’s running fine all the time. That’s maintenance cost. Once everybody learns the stack and my infrastructure runs fine, there will be a long-running cost, which is that I’m using this stack to develop my application. Every time I develop something, there will be a cost.
If the tech stack is super-efficient and has very good engineering productivity, in that case, my development cost will be low. If the stack is bad and really takes a lot of time for me to develop some new application, then the development cost will be high. Finally, it’s execution cost, meaning that if I run this stack, I would need to put some hardware there. That’s where usually we put the benchmark. The performance benchmark usually only benchmarks for the execution cost. Those are the five costs associated with this data infra domain. With the stream and batch unification, the target here is to lower the development cost, maintenance cost, and learning cost so that we do not need to have two ecosystems to maintain, and our end users do not need to learn different projects. They just need to learn one tech stack, and they can do both stream and batch processing.
With that, we can conclude what stream and batch unification means. It basically means a new design pattern where people do not need to distinguish between streaming and batch anymore. The goal is to have just one single project in each of those five categories, so that we do not need to distinguish between stream and batch. Today, we’re going to focus on just one part of it: the computing engine. We’re going to talk about Flink. We’re going to dive deep into Flink as a computing engine itself, and explain how exactly Flink achieves the goal of stream and batch unification.
Goals of Computing Convergence
Before we dive deep, we need to first understand, from the computing engine perspective, what exactly computing convergence, or stream and batch unification, entails. There are actually three requirements. The first is same code: if you write the exact same code, it should run as both a streaming and a batch job. The second is same result: if you write the same code and feed it the same data, then both the streaming and batch jobs should return the exact same result. Finally, it’s the same experience, meaning that whether you’re running a streaming job or a batch job, your experience should be pretty much the same in terms of logging, metrics, configuration, tuning, debugging, all the tooling, and so on. You want all three of those to be there to declare successful computing convergence.
Stream and Batch Processing Unification in Flink
With that defined, let’s take a look at how exactly Flink achieves this goal. We have been hearing this statement quite a bit: people always say batch is a special case of streaming. Conceptually, that is true. However, if you look deep into it, that’s not completely the case. When we’re talking about streaming, we’re actually talking about dealing with infinite, out-of-order, dynamic data. Stream processing is really trying to solve just two problems: one is infinite data, and the second is out-of-orderness. In order to address these challenges, we invented a bunch of mechanisms, including watermarks, event time, retraction, checkpoints, and state; all of those were invented to tackle those two challenges, infinite data and out-of-orderness.
In the batch world, the characteristics of the input dataset are completely different. We have finite, well-organized, and static data, and the goal is actually to proactively plan for execution time and resources. On the batch side, the focus would be, how can we divide all the computation into execution stages? How can we do speculative execution, adaptive execution, data skipping, all those techniques? If we want to serve both worlds with just one engine, we need to take care of query optimization, scheduling, shuffle, state backends, connectors, all those components, and make sure they actually work well in both streaming and batch. While conceptually speaking you can say batch is a special case of streaming, when you look into the actual requirements of each component, they’re actually different. The focus is different.
In order to understand what the differences are, and exactly what problems or challenges we need to tackle, we can think of a computing engine as two parts, at a high level. The first part is called the computing model. The computing model is essentially the logical computing semantics used to perform computation. It’s like the primitives this computing engine provides to its end users to describe how they want to do the computation. Usually, the computing model has an impact on the computing result. For example, event time, timers, retraction, watermarks, all those semantics or primitives provided in stream processing will actually have an impact on the end result you’re receiving. The second part is the execution model. The execution model is usually considered the physical implementation of the engine, there to fulfill the computing model or computing semantics.
Typically, the computing results are not impacted by the execution model. Examples of the execution model are shuffle, scheduling, and resource management. No matter how you do shuffle, resource management, or scheduling, your end result is unlikely to be impacted. That’s the physical part. If we look at the computing model and execution model separately, then we can understand why stream and batch unification makes sense. The Flink approach is the following: for the computing model, we’re basically adopting the streaming computing model for both streaming and batch.
That basically means, under the hood, Flink is using the exact same streaming operators to perform both streaming and batch computation. Because of this unified model, there’s actually little overhead on the batch side, even though we’re using streaming operators to do the batch job. The challenge here is, how are we going to handle streaming semantics in batch processing elegantly? We’re going to dive deep into all of that with examples. Usually, we can consider stateless processing to be identical between streaming and batch. For example, if you’re just doing a stateless transformation, there’s no difference between the streaming side and the batch side. Stateful processing is the tricky part. We do recognize that for batch processing and stream processing, the execution model has to be different. It’s not that streaming cannot produce the same result as batch. It’s just that you need a separate execution model for batch so it can run more efficiently. It’s not about correctness, it’s more about efficiency. In Flink, we do have a separate execution model for stream and batch.
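In the DataStream API, this choice surfaces as a runtime execution mode on the same pipeline definition; a minimal sketch (Flink 1.12 or later):

```java
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExecutionModeSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // The same pipeline definition can run under either execution model:
        // STREAMING schedules all tasks at once, with pipelined shuffles and
        // checkpoints; BATCH runs stage by stage with blocking shuffles and
        // no checkpointing. AUTOMATIC picks based on source boundedness.
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        // ... define sources, transformations, and sinks, then env.execute(...)
    }
}
```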
Computing Model
Let’s take a look at the computing model. We have this table here listing the differences between stream processing and batch processing in terms of the computing model. In stream processing, there are concepts called event time and watermark. There are concepts called timer, retraction, and state backend. All of those don’t exist on the batch processing side. Imagine you’re writing a program and you’re leveraging all those primitives. Now you want to run the program in batch mode. How would the batch processing elegantly handle all those semantics? That’s going to be one of the challenges.
Another thing is that, on the stream processing side, we cannot support global sort because it’s too expensive, as you can imagine, because you have infinite data. If you want to do a global sort, that’s going to be extremely expensive, and you will never be able to produce a correct result per se, because the data never ends. In batch processing, however, this is a very normal operation, so there’s also a gap here. That is the computing model difference.
In order to tackle this problem, we’ll talk about a few things: event time, watermark, timer, retraction, and state backend. Let me explain how Flink actually handles those semantics in batch. First, let’s take a look at how those semantics are handled in streaming, just to explain what those concepts mean. Let’s assume that we’re doing a count aggregation on a tumbling window of size 4, with the first window starting at timestamp 0. I think Adi actually provided a good example of the tumbling window in her talk. It’s basically a non-overlapping window of size 4. The output of the windows is triggered by timers. The way it works is that every time a window closes, a timer is triggered to emit the result for that closed window. The watermark is there to delay the result emission to handle late arrivals; it was invented to handle out-of-orderness.
If you remember, I mentioned that for stream processing, one key problem we’re trying to solve is out-of-orderness. The idea is actually very simple. Because events can arrive out of order, in order for me to emit the right result for a particular window, one simple strategy is to just wait a bit longer. Let’s say I want to emit the result for the window ending at 12 p.m. today.
By the wall clock time of 12 p.m., because of out-of-orderness, there might be some events whose timestamp is before 12 p.m. that haven’t arrived yet. In order for me to emit the right result for the window ending at 12 p.m., I need to wait a little bit longer so that the late-arriving events can come into the window before I emit the result. That’s exactly what the watermark means. It basically delays the emission of the result a little bit. A watermark equal to X basically means all the events whose timestamp is before X have arrived. Assuming the watermark equals MAX_TIMESTAMP_SEEN minus 3, that means that if the maximum timestamp across all the events I have seen is X, then I consider the watermark to be X minus 3.
With those mentioned, let’s take a look at this example. Let’s say I have a message coming in whose timestamp is 4. Apparently, this will fall into the tumbling window starting at 4 and ending at 8. Because I’m doing a count aggregation, I’ll use the window as the key and the count as the value. Here I will say there’s one event in the window starting at 4 and ending at 8. In this case, my watermark is equal to 1. The reason is that the watermark is the max timestamp I’ve seen minus 3; currently the max timestamp I see is 4, and 4 minus 3 is 1.
Now the second event arrives, with timestamp 5. Again, it falls into the window of 4 to 8, so we’re going to increase the count by 1. Now in window 4 to 8, we’re going to have the value of 2. The watermark is also increased to 2. The third event comes, and in this case, the out-of-orderness happens. We see a timestamp of 2, which is actually out of order. In this case, inside the stream processing state, we will put window 0 to 4 with a count of 1, because 2 actually falls into the window 0 to 4. We’re not going to bump up the watermark, because, so far, the maximum timestamp seen hasn’t increased yet, so the watermark stays at 2.
Moving on, we see the fourth event, which is of timestamp 3. Similar to the last event, we will have the value for window 0 to 4 bumped to 2, and the watermark still stays the same, which is 2. Now the fifth event comes, and the timestamp is 7. In this case, what happens is that, first of all, the watermark will bump to 4. Because we say the watermark is the max timestamp minus 3, and currently the max timestamp is 7, the watermark becomes 4. Because the watermark becomes 4, that basically means the assumption is that all the events before timestamp 4 have arrived, meaning that I can close the window starting from 0 and ending at 4. In this case, the output will be emitted for window 0 to 4, with a value of 2. I can then clean the state. I don’t need to keep the state of window 0 to 4 anymore because the result has been emitted. The state of window 4 to 8 is still there because the window is still open. It hasn’t been closed yet. Now the next event comes, which is timestamp 9.
In this case, again, the watermark will be bumped to 6. Because the watermark is bumped to 6, we take a look at the windows, and we don’t find any window that can be closed. The state stays the same. We have the state of window 4 to 8 with a count of 3. The reason 4 to 8 has a count of 3 is that we saw 7 earlier. When timestamp 9 comes, we will have a new window, because 9 falls into the window from 8 to 12. The value for the window from 8 to 12 will become 1. This is the state after we process the event with timestamp 9. Assume that’s the last event we process. Assuming this is the finite streaming mode, at the end of the stream we will emit a Long.MAX_VALUE watermark. The watermark in that case will be Long.MAX_VALUE, so it will basically close all the windows that are in the state. Then the remaining output is emitted: window 4 to 8 has a value of 3, and window 8 to 12 has a value of 1. Eventually, we will have the output of all three windows.
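The walkthrough above corresponds roughly to the following Flink DataStream job: a minimal sketch assuming Flink 1.19 or later (where window assigners accept a Duration), treating the talk's timestamps as milliseconds and pairing each event with a count of 1.

```java
import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;

public class TumblingWindowCount {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // (eventTime, count) pairs in the arrival order used in the walkthrough.
        DataStream<Tuple2<Long, Long>> events = env.fromElements(
                Tuple2.of(4L, 1L), Tuple2.of(5L, 1L), Tuple2.of(2L, 1L),
                Tuple2.of(3L, 1L), Tuple2.of(7L, 1L), Tuple2.of(9L, 1L));

        events
                // Watermark = max timestamp seen - 3, as in the example.
                .assignTimestampsAndWatermarks(
                        WatermarkStrategy
                                .<Tuple2<Long, Long>>forBoundedOutOfOrderness(Duration.ofMillis(3))
                                .withTimestampAssigner((event, ts) -> event.f0))
                // Non-overlapping windows of size 4 starting at timestamp 0.
                .windowAll(TumblingEventTimeWindows.of(Duration.ofMillis(4)))
                // Count per window by summing the second tuple field.
                .sum(1)
                // Expected counts: 2, 3 and 1 for windows [0,4), [4,8) and [8,12).
                .print();

        env.execute("tumbling-window-count");
    }
}
```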
This is how stream processing works. We explained event time, which is the timestamp here and can be out of order. We explained the watermark, which is the max timestamp minus some value. We also explained the state, so we’re keeping state. We explained how the state is built and how the state is cleaned in stream processing. Now imagine I have the exact same program, which has timers, event time, watermarks, all those concepts, but now instead of running it in streaming, I want to run it in batch. How can I do it? It’s exactly the same program and exactly the same semantics. We want to get the exact same result in batch. What happens is the following.
We will have 4 coming first, and we will have the state, which is window 4 to 8, count 1. 5 comes, window 4 to 8, count 2. 2 comes, and we will say window 0 to 4, count 1, while 4 to 8 is still count 2. 3 comes, and we keep building this state. As you can see, we are not emitting any watermarks. The reason is that we’re processing a finite dataset. We don’t need to emit any intermediate result. We only need to emit the final result when all the data has been processed. The way it works is that we will emit one single Long.MAX_VALUE watermark at the very end. Then, we will have the output, which is the state at the time the Long.MAX_VALUE watermark comes.
However, this looks correct. At least logically speaking, it’s going to emit the exact same result as stream processing. The problem is that you still have RocksDB here. In batch processing, you probably don’t want RocksDB because RocksDB is expensive. You’re building up a pretty big state, and it’s not going to be very memory or disk I/O friendly when you have such a big state there. How can we improve it? Can we replace it with a HashMap? We can just get rid of RocksDB and replace it with a HashMap. The natural question would be, what if the state is too large to fit into memory when you use a HashMap? You will definitely hit an OOM. How can we do it? The way we do it is basically by leveraging the fact that your data is finite. What we can do is sort the input before we feed it into the operator.
If we sort the input, let’s take a look at what happens. We will see 2 come into the picture first. Remember that previously we had 2 coming in the middle because it was out of order. Now, because we’re sorting the input by event time, we actually have this 2 come into the picture first. Then we can say the state becomes: 0 to 4 has a count of 1. 3 comes into the picture, and we say 0 to 4 has a count of 2. Then timestamp 4 comes. Timestamp 4 actually falls into a different window. In this case, because we know all the data is sorted, we know that once the key changes, or the window changes, all the data in window 0 to 4 has arrived. What we can do is just clean up the state of window 0 to 4 and emit it as output.
Then, in that case, we just need to keep the state of window 4 to 8. In order to emit 0 to 4 with the count of 2 when event 4 comes, we actually need to emit a watermark of Long.MAX_VALUE. Remember, we said the watermark is the thing that triggers the emission in stream processing. If you want to apply the same model in batch processing, you should use the same mechanism to trigger the data emission. That’s how we do it. In Flink, once we see there is a key switch in batch processing, we will immediately emit a Long.MAX_VALUE watermark. That watermark will trigger the result of the current key to be output. If we move forward, we will see 5 comes, and it also goes into the window of 4 to 8. Then 7 comes, and it also goes into the window of 4 to 8. Then 9 comes, and 9 is going to be in a different window.
The window key, again, changes. In this case, what happens is that the 4 to 8 value is going to be emitted, and 8 to 12 with a count of 1 is going to be kept in the state. We are going to use the Long.MAX_VALUE watermark to emit the 4 to 8 window result. Eventually, when everything is processed, we’re going to emit Long.MAX_VALUE again, just to trigger the emission of the last key.
As you can see, the way it works is that we only need to keep the current key’s state in memory. We don’t need to keep everything in memory in that case. We’re using the exact same mechanism that triggers the output on the streaming side to trigger the emission on the batch side. This is exactly how most batch engines work: they do a sort-merge aggregation or a sort-merge join. That’s the computing model. We explained the differences in the computing model between stream and batch processing, because there are different primitives and semantics, and how we handle those streaming primitives and semantics elegantly in batch jobs.
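For intuition, here is a framework-free sketch of that sort-based strategy, with illustrative names and outside of any Flink API: sort the finite input by event time, keep only the current window's count in memory, and flush whenever the window changes or the input ends, which is the batch counterpart of the Long.MAX_VALUE watermark.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortedWindowCount {
    public static void main(String[] args) {
        long windowSize = 4;
        List<Long> timestamps = new ArrayList<>(List.of(4L, 5L, 2L, 3L, 7L, 9L));
        timestamps.sort(Comparator.naturalOrder()); // batch: sort the finite input first

        long currentWindowStart = -1;
        long count = 0;
        for (long ts : timestamps) {
            long windowStart = (ts / windowSize) * windowSize;
            if (windowStart != currentWindowStart && count > 0) {
                // Key switch: the previous window can never receive more data,
                // so emit its result and drop its state (the batch analogue of
                // a Long.MAX_VALUE watermark for that key).
                System.out.printf("window [%d, %d): count=%d%n",
                        currentWindowStart, currentWindowStart + windowSize, count);
                count = 0;
            }
            currentWindowStart = windowStart;
            count++;
        }
        if (count > 0) { // end of input closes the last window
            System.out.printf("window [%d, %d): count=%d%n",
                    currentWindowStart, currentWindowStart + windowSize, count);
        }
        // Prints counts 2, 3 and 1 for windows [0,4), [4,8) and [8,12).
    }
}
```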
Execution Model
The second part we look at is the execution model comparison. The execution model comparison covers scheduling, shuffle, checkpointing, failure recovery, and sort-merge join and aggregation. For stream and batch processing, each has its own scheduling, shuffle, and checkpointing. The difference is that on the streaming side, all the tasks need to be up and running at the same time, while on the batch processing side, the tasks are actually executed one by one. Even if you have only one task manager and one CPU, it’s going to be time-shared execution. For shuffle, on the streaming side, it’s peer-to-peer pipelined shuffle.
On the batch processing side, it’s blocking shuffle, in the sense that you write your shuffle result to disk first, and then the next task reads from the disk. For checkpointing, stream processing uses interval-based checkpointing. On the batch processing side, we don’t do checkpointing at all; we disable it. For failure recovery, on the stream processing side, we just recover from the checkpoint. On the batch processing side, we recover to the shuffle boundary. For stream processing, at this point, we cannot support sort-merge join or aggregation, as you can imagine, because global sort is not supported. There’s also no Adaptive Query Execution (AQE) in the picture at this point. In batch processing, we have both. That is the execution model comparison. Again, those differences are for efficiency, not for correctness.
Just to explain this a little bit more, for scheduling and shuffle, on the streaming side, all the tasks must be up and running at the same time, and shuffle data does not go to disk; it just goes over the network to the downstream task managers. On the batch side, if you have just one task manager, it’s going to store its intermediate result on disk, and the next task comes and just reads from disk. In Flink, that’s blocking shuffle. For checkpointing and failure recovery, let’s assume I have a job that is doing checkpointing, basically storing all the state in a remote checkpoint store, and one of the task managers fails. What happens is that everybody will just rewind to the last successful checkpoint and continue processing. This is a global failover, sort of. All the connected tasks will be restored from the remote storage.
On the batch side, what happens is that if one of the tasks has failed while the task manager is still there, I can just restart this task, because all its input data is actually stored on disk. If the whole task manager is lost, then I actually need to rerun every task that has run on this task manager, because all the intermediate state and all the intermediate shuffle results will be lost when this task manager is gone. In order to rebuild all the shuffle results, I need to rerun all the tasks that used to be running on this task manager. This is assuming that we’re not using remote shuffle; if you’re using remote shuffle, then the story will be a little bit different. There are more differences between the streaming and batch execution models.
For example, sort-merge join and, basically, AQE are not supported on the streaming side, because they need global sort. AQE is actually based on the stats of the shuffled, finite dataset, which are not available on the streaming side either. For example, in batch, what it can do is look at the intermediate result size and then automatically adjust the downstream parallelism so that it runs with the proper resources, but this is not available on the streaming side. Also, there’s speculative execution, which is for batch only, and there are different optimization rules for streaming and batch, for example, join reordering.
The Future of Stream and Batch Unification
The last part is about the future of stream and batch unification. There are a few things that we want to do. One thing we definitely want to do is to be able to run streaming and batch stages in the exact same job. At this point, you can only submit a job either as a streaming job or as a batch job; you cannot have both streaming and batch mode running in the same job, but that would be very useful if you think about it. If I’m processing the whole dataset, I can process the historical data using batch mode very efficiently, and once I catch up to the streaming data, I can switch to streaming mode and compute on the streaming data very efficiently as well. That’s going to combine the best of both worlds. We also want to have a better user experience. Currently, there are still some semantic differences between streaming SQL and batch SQL in Flink, for example.
One thing we see a lot is that on the streaming side, your data may be stored in some KV store or online service, so you can make an RPC call to look it up, but on the batch side, such an online service doesn’t exist. Your data might be stored in some batch dataset, so this causes the semantics, or the query you’re going to write, to be a little bit different. The last thing we want to achieve is to maybe also include ad-hoc queries in the picture.
So far, we have been talking about stream and batch computing unification, but if you think about the analytics world, there’s actually a third pattern, which is the ad-hoc query. The ad-hoc query actually has the exact same computing model as batch processing, but the execution model is different, because for ad-hoc queries, the execution model tries to optimize for end-to-end response time rather than throughput. That is what causes the difference. We can apply the same approach: compare the computing model and compare the execution model, and you can actually unify them.
Let’s take a look at the current status of stream and batch unification in the industry. At this point, I think the Flink community has spent many years building this feature, and if you go look, you will see the batch performance is actually really good. It’s almost on par with open-source Spark. It’s at the point where we’re seeing early-majority adoption for the computing engine. For the rest of the domains that we mentioned in the data infrastructure world, storage, control plane, and so on, I think it’s still in the early adopter phase.

MMS • RSS
BMO Capital Markets initiated coverage on shares of MongoDB (NASDAQ:MDB – Free Report) in a research note released on Monday, Marketbeat Ratings reports. The brokerage issued an outperform rating and a $280.00 price target on the stock.
Other equities analysts also recently issued research reports about the company. Royal Bank Of Canada reiterated an “outperform” rating and issued a $320.00 price objective on shares of MongoDB in a research note on Thursday, June 5th. Stephens started coverage on shares of MongoDB in a report on Friday, July 18th. They set an “equal weight” rating and a $247.00 target price on the stock. DA Davidson reissued a “buy” rating and set a $275.00 target price on shares of MongoDB in a report on Thursday, June 5th. Rosenblatt Securities lowered their target price on shares of MongoDB from $305.00 to $290.00 and set a “buy” rating on the stock in a report on Thursday, June 5th. Finally, Needham & Company LLC reaffirmed a “buy” rating and issued a $270.00 price target on shares of MongoDB in a report on Thursday, June 5th. Nine analysts have rated the stock with a hold rating, twenty-seven have given a buy rating and one has assigned a strong buy rating to the company’s stock. According to data from MarketBeat, the stock currently has an average rating of “Moderate Buy” and an average price target of $281.31.
View Our Latest Stock Analysis on MongoDB
MongoDB Price Performance
Shares of MDB stock traded down $3.53 during trading hours on Monday, hitting $240.88. The stock had a trading volume of 1,793,244 shares, compared to its average volume of 2,009,958. The stock’s 50 day moving average is $208.62 and its 200 day moving average is $212.38. MongoDB has a twelve month low of $140.78 and a twelve month high of $370.00. The firm has a market capitalization of $19.68 billion, a PE ratio of -211.30 and a beta of 1.41.
MongoDB (NASDAQ:MDB – Get Free Report) last posted its quarterly earnings results on Wednesday, June 4th. The company reported $1.00 earnings per share (EPS) for the quarter, beating the consensus estimate of $0.65 by $0.35. The business had revenue of $549.01 million for the quarter, compared to analyst estimates of $527.49 million. MongoDB had a negative return on equity of 3.16% and a negative net margin of 4.09%. The business’s revenue was up 21.8% compared to the same quarter last year. During the same period in the prior year, the company posted $0.51 EPS. As a group, equities analysts predict that MongoDB will post -1.78 earnings per share for the current year.
Insider Activity
In other MongoDB news, CEO Dev Ittycheria sold 8,335 shares of the company’s stock in a transaction that occurred on Monday, July 28th. The stock was sold at an average price of $243.89, for a total transaction of $2,032,823.15. Following the completion of the transaction, the chief executive officer owned 236,557 shares of the company’s stock, valued at $57,693,886.73. This represents a 3.40% decrease in their position. The transaction was disclosed in a document filed with the SEC, which is available at the SEC website. Also, Director Dwight A. Merriman sold 1,000 shares of the stock in a transaction that occurred on Friday, July 25th. The stock was sold at an average price of $245.00, for a total value of $245,000.00. Following the completion of the transaction, the director directly owned 1,104,316 shares of the company’s stock, valued at $270,557,420. This represents a 0.09% decrease in their position. The disclosure for this sale can be found here. In the last quarter, insiders sold 51,416 shares of company stock valued at $11,936,656. Insiders own 3.10% of the company’s stock.
Institutional Trading of MongoDB
Several hedge funds have recently bought and sold shares of the stock. Cloud Capital Management LLC bought a new position in MongoDB in the first quarter worth approximately $25,000. Hollencrest Capital Management bought a new position in MongoDB in the first quarter worth approximately $26,000. Cullen Frost Bankers Inc. grew its stake in MongoDB by 315.8% in the first quarter. Cullen Frost Bankers Inc. now owns 158 shares of the company’s stock worth $28,000 after purchasing an additional 120 shares in the last quarter. Strategic Investment Solutions Inc. IL bought a new position in MongoDB in the fourth quarter worth approximately $29,000. Finally, Coppell Advisory Solutions LLC lifted its holdings in MongoDB by 364.0% during the fourth quarter. Coppell Advisory Solutions LLC now owns 232 shares of the company’s stock worth $54,000 after buying an additional 182 shares during the period. Institutional investors own 89.29% of the company’s stock.
About MongoDB
MongoDB, Inc., together with its subsidiaries, provides a general purpose database platform worldwide. The company provides MongoDB Atlas, a hosted multi-cloud database-as-a-service solution; MongoDB Enterprise Advanced, a commercial database server for enterprise customers to run in the cloud, on-premises, or in a hybrid environment; and Community Server, a free-to-download version of its database, which includes the functionality that developers need to get started with MongoDB.

MMS • Daniel Dominguez

Anthropic has proposed a new transparency framework designed to address the growing need for accountability in the development of frontier AI models. This proposal focuses on the largest AI companies that are developing powerful AI models, distinguished by factors such as computing power, cost, evaluation performance, and annual R&D expenditures. The goal is to establish a set of standards that ensures safety, mitigates risks, and increases public visibility into the development and deployment of these advanced AI systems.
A central aspect of the framework is the implementation of Secure Development Frameworks (SDFs), which would require large AI companies to assess and mitigate potential catastrophic risks associated with their models. These risks include chemical, biological, and radiological hazards, as well as harms caused by misaligned model autonomy. The proposal outlines that these frameworks should not only address risk mitigation but also ensure the responsible handling of AI development processes.
One of the key requirements of the framework is public disclosure. Under the proposed regulations, AI companies would be mandated to make their SDFs publicly available through a registered website, offering transparency into their safety practices. This would allow researchers, governments, and the public to access important information about the models being deployed, ensuring that safety standards are being met and that any risks are properly managed. Additionally, companies would be required to publish system cards that provide a summary of the model’s testing procedures, evaluation results, and the mitigations implemented. This documentation would need to be updated whenever a model is revised or a new capability is added.
The framework also proposes that smaller developers and startups be exempt from these requirements. Instead, the regulations would apply to large-scale AI companies whose models have the potential to cause significant harm, such as those with substantial computing power or financial resources. The exemption is designed to avoid placing an undue burden on smaller companies while still focusing regulatory efforts on the largest players in the field.
Furthermore, the proposal includes specific provisions for enforcing compliance. It would be a legal violation for AI companies to provide false or misleading statements about their adherence to the framework, ensuring that whistleblower protections can be applied if necessary. The enforcement mechanism would allow the attorney general to pursue civil penalties for violations, helping to maintain the integrity of the system.
Community reactions reflect a mix of optimism, skepticism, and practical concerns, echoing recent global discussions on AI regulation.
AI Expert Himanshu Kumar commented on X:
Isn’t fostering open-source AI development also crucial for safe innovation?
Meanwhile user Skeptical Observer commented:
Enforcement by whom? This feels very U.S.-centric. What about Chinese labs or others outside this scope? The whistleblower protections sound nice, but without global reach, it’s just a Band-Aid. Hope they clarify this at the AI Safety Summit!
Ultimately, the proposed transparency framework aims to strike a balance between ensuring AI safety and enabling continued innovation. While the framework sets minimum standards for transparency, it intentionally avoids being overly prescriptive, allowing the AI industry to adapt as the technology continues to evolve. By promoting transparency, the framework seeks to establish clear accountability for AI developers, helping policymakers and the public differentiate between responsible and irresponsible practices in the field. This could serve as a foundation for further regulation if needed, providing the evidence and insights necessary to determine if additional oversight is warranted as AI models advance.

MMS • Michael Redlich

This week’s Java roundup for July 21st, 2025, features news highlighting: a new tool to use Quarkus MCP Client to access a secure Quarkus MCP Server from the command line; the second beta release of Groovy 5.0.0; and point releases for GraalVM Native Build Tools and JHipster Lite.
JDK 25
Build 33 of the JDK 25 early-access builds was made available this past week featuring updates from Build 32 that include fixes for various issues. Further details on this release may be found in the release notes.
JDK 26
Build 8 of the JDK 26 early-access builds was also made available this past week featuring updates from Build 7 that include fixes for various issues. More details on this release may be found in the release notes.
Jakarta EE
In his weekly Hashtag Jakarta EE blog, Ivar Grimstad, Jakarta EE Developer Advocate at the Eclipse Foundation, provided an update on Jakarta EE 12, writing:
Jakarta EE 12 chugs along according to plan during the summer months. The first milestone will be in September. This milestone will mostly contain administrative tasks that are expected by the component specifications. Such as setting up the build environments, preparing the repositories with branches and such.
The anticipated GA release of Jakarta EE 12 is July 2026.
GraalVM
Oracle Labs has released version 0.11.0 of Native Build Tools, a GraalVM project consisting of plugins for interoperability with GraalVM Native Image. This latest release provides notable changes such as: a new Gradle DSL to experiment with layering native images; and a re-enabling of SBOM integration testing and improvements that include using regular expressions. Further details on this release may be found in the changelog.
Spring Framework
It was a busy week over at Spring as the various teams have delivered milestone releases of Spring Boot, Spring Security, Spring Authorization Server, Spring for GraphQL, Spring Session, Spring Integration, Spring REST Docs, Spring Batch, Spring AMQP, Spring for Apache Kafka, Spring for Apache Pulsar and Spring Web Services. More details may be found in this InfoQ news story.
The Spring Data team has announced that version 4.0.0-M4 of the Spring Data JDBC and Spring Data R2DBC sub-projects now support composite IDs (or composite keys) for improved mapping of an entity with an attribute for each column in the composite ID.
Quarkus
The Quarkus team has introduced a way for developers to use the Quarkus MCP Client to access a secure Quarkus MCP Server from the command line. Users will be able to log in to a Quarkus LangChain4j AI server application with a GitHub OAuth2 access token. Developers can learn more by following this example demo on GitHub.
Groovy
The second beta release of Groovy 5.0.0 ships with bug fixes, dependency upgrades and new features: a new subList() method, added to the DefaultGroovyMethods class, that accepts an integer range for processing; and improvements to named-capturing groups for regular expressions, which may still be accessed by index. Further details on this release may be found in the release notes.
JHipster
The release of JHipster Lite 1.34.0 provides bug fixes, dependency upgrades and new features: a distribution of OpenRewrite upgrade recipes for instances of custom-jhlite and generated projects when slugs are renamed; and a new SonarQube module for TypeScript that will ensure code quality.
A breaking change in this release, related to the SonarQube module, is the rename of slugs and features. More details on this release may be found in the release notes.

MMS • Sergio De Simone

In a recent tech report, Apple has provided more details on the performance and characteristics of the new Apple Intelligence Foundation Models that will be part of iOS 26, as announced at the latest WWDC 2025.
Apple foundation models include a 3B-parameter version optimized to run on Apple Silicon-powered devices, as well as a larger model designed to run on Apple’s Private Cloud Compute platform. Apple emphasizes that both models were trained using responsible web crawling, licensed corpora, and synthetic data. A further training stage included supervised fine-tuning and reinforcement learning.
According to Apple, the 3B parameter model is designed for efficiency, low-latency, and minimal resource usage. The larger model, by contrast, aims to deliver high accuracy and scalability. Apple notes that, given its reduced size, the on-device model isn’t intended to implement a world-knowledge chat, but can support advanced capabilities such as text extraction, summarization, image understanding, and reasoning with just a few lines of code.
On the architecture side, the 3B-parameter model uses KV-cache sharing, a technique used to reduce the time-to-first-token, and is compressed using 2-bit quantization-aware training. The model is divided into two blocks, and sharing the key-value caches between them reduces memory usage by 37.5%, says Apple. Quantization-aware training is a technique that recovers quality by simulating the effect of 2-bit quantization at training time:
Unlike the conventional quantization scheme which derives the scale from weights W, we introduce a learnable scaling factor f that adaptively fine-tunes the quantization range for each weight tensor.
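As a rough illustration only, and not Apple's implementation, the forward pass of such quantization-aware training can be pictured as fake-quantizing each weight tensor onto a 2-bit grid whose per-tensor scale would be learnable; the class and method names below are hypothetical, and the backward pass (typically a straight-through estimator) is omitted.

```java
public class FakeQuant2Bit {
    // Conceptual forward pass of 2-bit quantization-aware training: weights are
    // snapped to one of four signed levels {-2, -1, 0, 1} times a per-tensor
    // scale, then de-quantized back to floats, so the network learns to
    // tolerate the 2-bit grid. In real QAT the scale is a learnable parameter.
    static float[] fakeQuantize2Bit(float[] weights, float scale) {
        float[] out = new float[weights.length];
        for (int i = 0; i < weights.length; i++) {
            float q = Math.round(weights[i] / scale); // project onto the integer grid
            q = Math.max(-2f, Math.min(1f, q));       // clamp to the 2-bit range
            out[i] = q * scale;                       // de-quantize for the forward pass
        }
        return out;
    }

    public static void main(String[] args) {
        float[] w = { 0.07f, -0.12f, 0.31f, -0.45f };
        float[] q = fakeQuantize2Bit(w, 0.2f);
        System.out.println(java.util.Arrays.toString(q)); // [0.0, -0.2, 0.2, -0.4]
    }
}
```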
For the server-side model, Apple used a novel Parallel-Track Mixture-of-Experts (PT-MoE) transformer that combines track parallelism, sparse computation, and interleaved global-local attention. It comprises multiple transformers that process tokens independently, each with its own set of MoE layers. Apple says that the combination of parallel token processing with the MoE approach reduces synchronization overhead and allows the model to scale more efficiently.
To evaluate its foundation models, Apple researchers relied on human graders to assess each model’s ability to produce a native-sounding response. The results show that the on-device model performs well against Qwen-2.5-3B across all supported languages, and remains competitive with larger models like Qwen-3-4B and Gemma-3-4B in English. The larger server-side model performs favorably against Llama-4-Scout, but falls short compared to much larger models such as Qwen-3-235B and GPT-4o.
For image understanding, Apple followed the same approach by asking humans to evaluate image-question pairs, including text-rich images like infographics:
We found that Apple’s on-device model performs favorably against the larger InternVL and Qwen and competitively against Gemma, and our server model outperforms Qwen-2.5-VL, at less than half the inference FLOPS, but is behind Llama-4-Scout and GPT-4o.
As a final note, Apple researchers emphasize their approach to Responsible AI, which includes enforcing a baseline of safety and guardrails to mitigate harmful model input and output. These safeguards were also evaluated through a combination of human assessment and auto-grading. Apple has also published educational resources for developers to apply Responsible AI principles.
As mentioned, Apple’s AI foundation models require Xcode 26 and iOS 26 and are currently available as beta software.
The White House Releases National AI Strategy Focused on Innovation, Infrastructure, and Global Lead

MMS • Robert Krzaczynski

The White House has published America’s AI Action Plan, outlining a national strategy to enhance U.S. leadership in artificial intelligence. The plan follows President Trump’s January Executive Order 14179, which directed federal agencies to accelerate AI development and remove regulatory barriers to innovation.
The strategy identifies more than 90 federal actions to be taken in the coming months. These are organized across three primary pillars: Accelerating Innovation, Building American AI Infrastructure, and Leading in International Diplomacy and Security.
The plan outlines a range of federal initiatives aimed at strengthening the U.S. AI ecosystem and positioning the country as a global leader in the field:
- AI Exports: Support for allied nations through full-stack AI packages, including hardware and software.
- Infrastructure: Faster permitting for data centers and fabs, plus workforce training in key technical trades.
- Deregulation: Review and removal of federal rules that may hinder AI development, with industry input.
- Procurement: New guidelines favor “ideologically neutral” frontier models in federal contracts.
According to Michael Kratsios, Director of the White House Office of Science and Technology Policy, the initiative aims to align government efforts in building a stronger national AI ecosystem. Administration officials have described the effort as a response to global competition in AI development and deployment.
The plan’s focus on deregulation and rapid buildout has prompted a range of responses. Some have raised concerns about the risks of easing oversight in high-impact sectors. One Reddit user compared the approach to hypothetical deregulation in the aviation industry:
Deregulation means average citizens have to suffer in one way or another. Imagine deregulation in aviation just to compete — fewer safety measures, and who cares if more planes crash? This doesn’t happen because of strict regulation.
Security was another area of concern. While the plan references the “secure by design” principle, some observers argue that more detail is needed regarding the protection of the underlying systems running AI applications:
While one of the pillars prioritizes the ‘secure by design’ concept, a lot of the controls focus on LLM input/output validation. These applications also need hardened runtimes, like any other production software.
The strategy also emphasizes avoiding what officials describe as “Orwellian” applications of AI and ensuring that American workers are central to the AI-driven economy. Officials say the buildout will create skilled jobs and technological opportunities across sectors, from medicine to manufacturing.
The White House says implementation of the plan will begin immediately, with coordination across federal agencies. More details on timelines, specific agency responsibilities, and international partnerships are expected in the coming weeks.
Spring News Roundup: Milestone Releases of Boot, Security, Auth Server, GraphQL, Kafka, Pulsar

MMS • Michael Redlich

There was a flurry of activity in the Spring ecosystem during the week of July 21st, 2025, highlighting milestone releases of Spring Boot, Spring Security, Spring Authorization Server, Spring for GraphQL, Spring Session, Spring Integration, Spring REST Docs, Spring Batch, Spring AMQP, Spring for Apache Kafka, Spring for Apache Pulsar and Spring Web Services.
Spring Boot
The first milestone release of Spring Boot 4.0.0 delivers new features such as: the ability for types annotated with @ConfigurationProperties to refer to types that are located in a different module; and support for the new Spring Framework JmsClient interface to complement existing support for the JmsTemplate and JmsMessagingTemplate classes.
A deprecation in this release is related to the constructor parameters of the OperationMethod class. Developers are encouraged to use OperationMethod(Method, OperationType, Predicate) instead of the original OperationMethod(Method, OperationType).
More details on this release may be found in the release notes.
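As a rough sketch of the cross-module @ConfigurationProperties change, consider a properties class in the application module that binds onto a type defined in a separate shared module. The module layout, package names and property prefix below are hypothetical and only illustrate the shape of the feature.

```java
// File: shared-types/src/main/java/com/example/shared/RetryPolicy.java
package com.example.shared;

// A plain settings type that lives in a different module than the Spring Boot application.
public class RetryPolicy {

    private int maxAttempts = 3;
    private long backoffMillis = 500;

    public int getMaxAttempts() { return maxAttempts; }
    public void setMaxAttempts(int maxAttempts) { this.maxAttempts = maxAttempts; }
    public long getBackoffMillis() { return backoffMillis; }
    public void setBackoffMillis(long backoffMillis) { this.backoffMillis = backoffMillis; }
}

// File: app/src/main/java/com/example/app/ClientProperties.java
package com.example.app;

import com.example.shared.RetryPolicy;
import org.springframework.boot.context.properties.ConfigurationProperties;

// Bound from properties such as app.client.retry.max-attempts=5; register the class via
// @ConfigurationPropertiesScan or @EnableConfigurationProperties(ClientProperties.class).
@ConfigurationProperties(prefix = "app.client")
public class ClientProperties {

    private final RetryPolicy retry = new RetryPolicy();

    public RetryPolicy getRetry() { return retry; }
}
```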
Spring Security
The first milestone release of Spring Security 7.0.0 ships with bug fixes, dependency upgrades and new features such as: a new BearerTokenAuthenticationConverter class, an implementation of the AuthenticationConverter interface that converts a request into an instance of the BearerTokenAuthenticationToken class; and improvements to the UsernameNotFoundException class that add a username property and make use of it in the BadCredentialsException class.
Breaking changes in this release include the removal of deprecated elements and methods in various classes and interfaces.
Further details on this release may be found in the release notes.
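To picture what the new converter does, the sketch below hand-implements the existing AuthenticationConverter contract to extract a bearer token from a request. It is an illustration of the contract rather than the framework's own code, and the import of BearerTokenAuthenticationToken is an assumption, since that class has moved between Spring Security versions.

```java
import jakarta.servlet.http.HttpServletRequest;
import org.springframework.security.core.Authentication;
import org.springframework.security.oauth2.server.resource.BearerTokenAuthenticationToken;
import org.springframework.security.web.authentication.AuthenticationConverter;

// Illustration only: a hand-rolled converter mirroring what the new
// BearerTokenAuthenticationConverter is described as doing.
public class SimpleBearerTokenConverter implements AuthenticationConverter {

    @Override
    public Authentication convert(HttpServletRequest request) {
        String header = request.getHeader("Authorization");
        if (header == null || !header.startsWith("Bearer ")) {
            return null; // no bearer token present; let other converters handle the request
        }
        // The import above assumes the pre-7.0 package location of this token class.
        return new BearerTokenAuthenticationToken(header.substring("Bearer ".length()));
    }
}
```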
Spring Authorization Server
The first milestone release of Spring Authorization Server 2.0.0 provides dependency upgrades and new features: the addition of a Gradle testRuntimeOnly dependency for the JUnit Platform Launcher; and the removal of the specific Map from the Jackson TypeReference class in various classes. More details on this release may be found in the release notes.
Spring for GraphQL
The first milestone release of Spring for GraphQL 2.0.0 ships with dependency upgrades and new features such as: a migration of their nullability annotations to JSpecify; and the return of Jackson serializers that allow fields in the ArgumentValue class to send variables in GraphQL requests. Further details on this release may be found in the release notes.
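Several of the projects in this roundup are converging on JSpecify for nullability. As a small, purely illustrative example of what that looks like at the source level (the class below is made up and is not Spring code), @NullMarked makes unannotated types non-null by default and @Nullable opts individual type uses back in:

```java
import java.util.Map;

import org.jspecify.annotations.NullMarked;
import org.jspecify.annotations.Nullable;

// Illustrative only: under @NullMarked, the String parameter is treated as non-null,
// while the return type is explicitly marked as possibly null.
@NullMarked
public class GreetingLookup {

    private final Map<String, String> greetings = Map.of("en", "Hello", "de", "Hallo");

    public @Nullable String findGreeting(String languageTag) {
        return greetings.get(languageTag); // null when the language is unknown
    }
}
```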
Spring Session
The first milestone release of Spring Session 4.0.0 features notable dependency upgrades to Spring Framework 7.0.0-M7, Spring Data 2025.1.0-M4 and Spring Security 7.0.0-M1. More details on this release may be found in the release notes.
Spring Integration
The first milestone release of Spring Integration 7.0.0 ships with bug fixes, improvements in documentation, dependency upgrades and new features such as: an initial migration of their nullability annotations to JSpecify; and the removal of Joda-Time support from the Jackson ObjectMapper class. Further details on this release may be found in the release notes.
Spring Modulith
The first milestone release of Spring Modulith 2.0.0 provides bug fixes, dependency upgrades and new features/improvements such as: a restructuring of the event publication registry lifecycle related to the JDBC implementation; and a reduction in the visibility of internally used methods defined in the ApplicationModuleSource class. More details on this release may be found in the release notes.
Spring REST Docs
The first milestone release of Spring REST Docs 4.0.0 ships with improvements in documentation, dependency upgrades and new features such as: support for nullability with JSpecify; and support for the official HAL and HAL-FORMS media types in link extraction. Further details on this release may be found in the release notes.
Spring Batch
The first milestone release of Spring Batch 6.0.0 delivers bug fixes, improvements in documentation, dependency upgrades and new features/enhancements such as: a new CommandLineJobOperator class, a more modern replacement for the original, and now deprecated, CommandLineJobRunner class; and an update to the MapJobRegistry class so that it can now auto-register jobs defined in the application context. More details on this release may be found in the release notes.
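A hedged sketch of the MapJobRegistry change: assuming the auto-registration described above, a Job bean defined in the application context should be retrievable from the injected JobRegistry by bean name, without wiring a separate registrar. Imports follow the Spring Batch 5 package layout and may move in 6.0; the bean name is hypothetical.

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.configuration.JobRegistry;
import org.springframework.batch.core.launch.NoSuchJobException;
import org.springframework.stereotype.Component;

// Sketch only: relies on the auto-registration behavior described in the release notes,
// so no explicit registrar bean is configured here.
@Component
public class JobLookup {

    private final JobRegistry jobRegistry;

    public JobLookup(JobRegistry jobRegistry) {
        this.jobRegistry = jobRegistry;
    }

    public Job importJob() throws NoSuchJobException {
        return jobRegistry.getJob("importJob"); // "importJob" is a hypothetical Job bean name
    }
}
```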
Spring AMQP
The third milestone release of Spring AMQP 4.0.0 delivers bug fixes, improvements in documentation, dependency upgrades and new features such as: an improved shutdown phase in the BlockingQueueConsumer class; and a new getStreamName() method added to the RabbitStreamTemplate class that returns the value of the streamName variable passed into the constructor. Further details on this release may be found in the release notes.
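A minimal sketch of the new accessor, assuming the current spring-rabbit-stream constructor that takes a RabbitMQ Streams Environment and a stream name; the stream name below is hypothetical, and a broker must be reachable for the Environment to build.

```java
import com.rabbitmq.stream.Environment;
import org.springframework.rabbit.stream.producer.RabbitStreamTemplate;

public class StreamNameExample {

    public static void main(String[] args) {
        // Connects to a locally running RabbitMQ Streams broker with default settings.
        Environment environment = Environment.builder().build();
        RabbitStreamTemplate template = new RabbitStreamTemplate(environment, "orders.events");

        // New in 4.0.0-M3 per the notes above: returns the stream name passed to the constructor.
        System.out.println(template.getStreamName());

        environment.close();
    }
}
```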
Spring for Apache Kafka
The third milestone release of Spring for Apache Kafka 4.0.0 provides bug fixes, improvements in documentation, dependency upgrades and new features such as: the addition of JSpecify annotations in the batch messaging classes; and a refactored StringOrBytesSerializer class that uses pattern matching to reduce the number of conditionals. This release has been integrated into Spring Boot 4.0.0-M1. More details on this release may be found in the release notes.
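The pattern-matching refactoring mentioned for StringOrBytesSerializer reflects a general Java 21 idiom. The snippet below is not Spring's code; it only shows, on a hypothetical serializer, how a switch with type patterns collapses a chain of instanceof conditionals:

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: a switch over type patterns replaces nested instanceof checks.
public class PayloadSerializer {

    public byte[] serialize(Object payload) {
        return switch (payload) {
            case null -> new byte[0];
            case byte[] bytes -> bytes;
            case String text -> text.getBytes(StandardCharsets.UTF_8);
            default -> throw new IllegalArgumentException(
                    "Unsupported payload type: " + payload.getClass());
        };
    }
}
```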
Spring for Apache Pulsar
The first milestone release of Spring for Apache Pulsar 2.0.0 features bug fixes, dependency upgrades and notable changes such as: upgrades to Spring Java Format 0.0.47 and Checkstyle 10.25.0 in preparation for JSpecify; and the removal of the listenerScope field override in the AbstractPulsarAnnotationsBeanPostProcessor base class and its corresponding derived classes. Further details on this release may be found in the release notes.
Spring Web Services
The first milestone release of Spring Web Services 5.0.0 delivers bug fixes, improvements in documentation, dependency upgrades and new features such as: a migration of their nullability annotations to JSpecify; and an alignment with Spring Framework 7.0.0-M7 and Spring Security 7.0.0-M1. More details on this release may be found in the release notes.

MMS • Craig Risi

Vercel has enhanced its observability platform by integrating external API caching insights, enabling developers to track how many requests to third-party APIs are served from the Vercel Data Cache versus being routed to the origin server. As of May 22, 2025, users can view caching behavior at the hostname level, with those on Observability Plus gaining more granular path-level metrics in the dashboard.
Vercel’s Data Cache is a specialized layer designed to work with frameworks like Next.js, storing fetch responses per region with support for time-based and tag-based invalidation. By exposing cache hit data, Vercel helps teams pinpoint areas where they can further leverage caching to reduce latency, decrease origin requests, and enhance application performance.
This feature builds on the broader rollout of “Observability”, initially launched in beta in October 2024 and generally available since December, which offers comprehensive insights into function invocations, edge requests, build diagnostics, and external API calls. The addition aligns with Vercel’s focus on visibility and performance for frontend and serverless workloads.
On its official changelog, Vercel highlighted that its CDN now “caches proxied responses using the CDN-Cache-Control and Vercel-CDN-Cache-Control headers”, an essential precursor to the deeper caching analytics now surfaced in Observability.
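In practice, an origin opts its responses into this CDN-level caching by emitting those headers. The sketch below uses the JDK's built-in HttpServer purely for illustration and to stay consistent with the Java examples elsewhere in this roundup; a typical Vercel project would set the same headers from its framework's route handlers, and the path and max-age value here are hypothetical.

```java
import com.sun.net.httpserver.HttpServer;

import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class CacheHeaderOrigin {

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/api/prices", exchange -> {
            byte[] body = "{\"price\": 42}".getBytes(StandardCharsets.UTF_8);
            // Headers quoted from Vercel's changelog above; exact caching semantics are Vercel's.
            exchange.getResponseHeaders().set("Vercel-CDN-Cache-Control", "max-age=60");
            exchange.getResponseHeaders().set("CDN-Cache-Control", "max-age=60");
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
    }
}
```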
Developers interested in using this feature can visit the External APIs tab in the Observability dashboard to view cache hit metrics for deployed projects and evaluate opportunities for efficiency gains in API usage.
While Vercel’s native integration of external API caching metrics into its observability suite is a notable step forward, other companies offer similar capabilities, albeit often with more manual setup or third-party integrations. Netlify, for instance, supports caching strategies at the CDN level, but observability into external API calls typically requires pairing it with tools like New Relic, Datadog, or Grafana for custom dashboards and telemetry pipelines. These platforms can ingest logs and metrics from API calls, but developers must configure that ingestion manually or through SDKs.
Cloudflare, meanwhile, offers advanced caching rules and Cache Analytics as part of its Enterprise plans. While it can show cache hit ratios and performance metrics, detailed visibility into external API usage typically requires correlating data from other sources, such as log push services or API gateways like Kong or Apigee, which can add operational overhead. Cloudflare Workers users can log external fetches, but the depth of insights depends on how comprehensively telemetry is captured.
AWS and Google Cloud offer more granular API Gateway and CDN logging, with services like Amazon CloudWatch, X-Ray, and Cloud Monitoring providing observability. Still, surfacing high-level caching insights across external APIs often requires stitching together telemetry from multiple services and instrumentation layers, unlike the streamlined integration Vercel now offers out of the box.
The key difference lies in developer experience: whereas traditional approaches require integration effort and expertise in observability tooling, Vercel bundles relevant metrics into the same surface where developers already monitor functions and deployments. This simplification could serve as a model for other developer-first platforms looking to unify caching, observability, and performance optimization into a cohesive experience.