MMS • Matt Campbell
Article originally posted on InfoQ.
Grafana has released Grafana Tempo 2.0, which introduces the new TraceQL query language and support for the Apache Parquet format. Grafana Tempo is an open-source tracing backend that works with object storage. TraceQL works with the Apache Parquet format to provide faster searches and queries designed around traces.
The TraceQL query language is based on existing languages like PromQL and LogQL. It allows for selecting traces based on span and resource attributes, timing, and duration. A TraceQL query is an expression that is evaluated one trace at a time. A query is composed of a set of expressions chained together into a pipeline. Each expression in the pipeline selects or discards spansets from the results. For example, the following query will select traces where the spans have an http.status_code between 200 and 299 and the number of matching spans within the trace is greater than two:
{ span.http.status_code >= 200 && span.http.status_code < 300 } | count() > 2
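The pipeline idea described above can be illustrated with a small toy model. This is only a sketch of the semantics, not how Tempo evaluates queries internally; the Span class and the filter_spans, count_greater_than, and evaluate helpers are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Span:
    # Hypothetical, simplified span: a name plus a bag of attributes.
    name: str
    attributes: Dict[str, int] = field(default_factory=dict)


Spanset = List[Span]
Stage = Callable[[Spanset], Spanset]


def filter_spans(predicate: Callable[[Span], bool]) -> Stage:
    """Stage that keeps only the spans matching the predicate."""
    return lambda spans: [s for s in spans if predicate(s)]


def count_greater_than(threshold: int) -> Stage:
    """Stage that keeps the spanset only if it has more than `threshold` spans."""
    return lambda spans: spans if len(spans) > threshold else []


def evaluate(trace: Spanset, pipeline: List[Stage]) -> Spanset:
    """Run one trace through each stage; an empty result discards the trace."""
    spans = trace
    for stage in pipeline:
        spans = stage(spans)
        if not spans:
            break
    return spans


# Toy equivalent of: status code in the 200-299 range, more than two matches.
pipeline = [
    filter_spans(lambda s: 200 <= s.attributes.get("http.status_code", 0) < 300),
    count_greater_than(2),
]

trace_a = [
    Span("GET /cart", {"http.status_code": 200}),
    Span("GET /items", {"http.status_code": 201}),
    Span("GET /price", {"http.status_code": 204}),
    Span("POST /pay", {"http.status_code": 500}),
]
trace_b = [
    Span("GET /cart", {"http.status_code": 200}),
    Span("GET /missing", {"http.status_code": 404}),
]

selected_a = evaluate(trace_a, pipeline)  # three matching spans survive
selected_b = evaluate(trace_b, pipeline)  # only one match, so the trace is dropped
```

Each trace is evaluated on its own, which mirrors the statement that a query is evaluated one trace at a time.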
A trace represents the journey of a request through the system under observation. A trace is composed of one or more spans, each of which represents a unit of work within the trace. Spans have a start time relative to the start of the trace, a duration, and an operation name.
TraceQL differentiates between two types of span data: intrinsics and attributes. Intrinsic fields are fundamental to spans and include things like the status, duration, and name of the span. Attribute fields are derived from the span and can be customized. For example, the query { span.http.method = "GET" } uses attribute fields on the span to return traces with the HTTP GET method.
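A query such as this one is typically sent to Tempo over HTTP. The sketch below only builds the request URL and makes no network call. It assumes a Tempo instance reachable at http://localhost:3200 and that the TraceQL expression is passed in the q parameter of the /api/search endpoint; the build_search_url helper is a hypothetical name, and both the address and the endpoint details should be checked against the Tempo API documentation for the deployed version.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, traceql: str) -> str:
    """Build a search URL carrying a TraceQL query (hypothetical helper).

    Assumption: the TraceQL expression goes in the `q` query parameter
    of Tempo's /api/search endpoint.
    """
    query_string = urlencode({"q": traceql})
    return f"{base_url.rstrip('/')}/api/search?{query_string}"


# Assumed local Tempo address; adjust for a real deployment.
url = build_search_url("http://localhost:3200", '{ span.http.method = "GET" }')
```

The braces, quotes, and spaces in the query are percent-encoded by urlencode, so the expression can be passed through unchanged from wherever it is authored.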
Logical operators are supported and can be used to combine spansets. To find traces with spans that traverse two regions, the following query could be used:
{ resource.region = "eu-west-0" } && { resource.region = "eu-west-1" }
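As a rough illustration of what combining two spansets means, the following toy model treats a trace as a list of dictionaries and keeps it only when both sides of the && find at least one span. The select, spanset_and, and crosses_regions helpers are hypothetical and model the semantics only; they are not Tempo code.

```python
from typing import Dict, List

Span = Dict[str, str]


def select(trace: List[Span], key: str, value: str) -> List[Span]:
    """Return the spans in the trace whose `key` equals `value`."""
    return [span for span in trace if span.get(key) == value]


def spanset_and(left: List[Span], right: List[Span]) -> List[Span]:
    """Toy `&&`: keep the spans only if both sides matched something."""
    return left + right if left and right else []


def crosses_regions(trace: List[Span]) -> bool:
    """True when the trace has spans in both eu-west-0 and eu-west-1."""
    west_0 = select(trace, "resource.region", "eu-west-0")
    west_1 = select(trace, "resource.region", "eu-west-1")
    return bool(spanset_and(west_0, west_1))


two_region_trace = [
    {"name": "checkout", "resource.region": "eu-west-0"},
    {"name": "payment", "resource.region": "eu-west-1"},
]
one_region_trace = [
    {"name": "checkout", "resource.region": "eu-west-0"},
    {"name": "inventory", "resource.region": "eu-west-0"},
]
```

Here crosses_regions(two_region_trace) is true, while one_region_trace is rejected because the second spanset is empty.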
This release also includes the aggregators count and avg. count can be used to return a count of spans within a spanset, whereas avg is used to provide an average of a given attribute or intrinsic for a spanset. The following query could be used to find traces that have more than three spans with an HTTP status of 200:
{ span.http.status = 200 } | count() > 3
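The two aggregations described above, counting spans and averaging a field, can be sketched in a few lines. This is a toy model under stated assumptions: spans are plain dictionaries, duration_ms is a made-up field name, and the count and avg functions here are local helpers rather than Tempo's implementation.

```python
from statistics import mean
from typing import Dict, List, Optional

Span = Dict[str, int]


def count(spanset: List[Span]) -> int:
    """Number of spans in the spanset."""
    return len(spanset)


def avg(spanset: List[Span], field: str) -> Optional[float]:
    """Average of a numeric field across the spanset, or None if it is empty."""
    if not spanset:
        return None
    return mean(span[field] for span in spanset)


trace = [
    {"http.status": 200, "duration_ms": 10},
    {"http.status": 200, "duration_ms": 20},
    {"http.status": 200, "duration_ms": 30},
    {"http.status": 200, "duration_ms": 40},
    {"http.status": 500, "duration_ms": 900},
]

# Toy equivalent of the filter stage: keep spans with an HTTP status of 200.
matching = [span for span in trace if span["http.status"] == 200]

# Toy equivalent of the aggregate stage: more than three matching spans.
trace_selected = count(matching) > 3
average_duration = avg(matching, "duration_ms")
```

With four matching spans the trace is selected, and the average is computed only over the spans that passed the filter, so the slow failing span does not affect it.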
This release also introduces Apache Parquet as the default backend storage format. Apache Parquet is an open-source, column-oriented data file format. Joe Elliott, Principal Engineer at Grafana, notes that this change may have performance implications:
Previous iterations of Tempo used a format we call v2 that was incredibly efficient at storing and retrieving traces by ID. Parquet is more costly due to the extra work of building the columnar blocks, and operators should expect at least 1.5x increase in required resources to run a Tempo 2.0 cluster.
However, the Tempo release notes indicate that the Tempo team saw a substantial increase in search speed with the new Parquet format:
With our previous block format, we’d seen our ability to search trace data cap out at a rate of ~40-50 GB per second. Since switching to Parquet, we’re now hitting search speeds of 300 GB/s on common queries and doing so with less compute.
The new Parquet block format is enabled by default in Tempo 2.0 and is required to make use of the new TraceQL query language. Once enabled, Tempo will begin writing data in the Parquet format but will leave existing data as-is. The Parquet format can be disabled to instead use the original v2 block format.
Tempo 2.0 is open-source and available under the AGPL-3.0 license. Elliott notes that the best TraceQL experience is found with Grafana 9.4.