InfluxDB: Flux - Analyze Query (Profiler)
The Flux Profiler package provides performance profiling tools for Flux queries and operations.
Each enabled profiler appends a table containing its results to the query's output stream of tables.
import "profiler"

option profiler.enabledProfilers = ["query", "operator"]

// ... your query ...
query
The query profiler provides statistics about the execution of an entire Flux script. When enabled, results returned by yield() include a table with the following columns:
- TotalDuration: total query duration in nanoseconds.
- CompileDuration: number of nanoseconds spent compiling the query.
- QueueDuration: number of nanoseconds spent queueing.
- RequeueDuration: number of nanoseconds spent requeueing.
- PlanDuration: number of nanoseconds spent planning the query.
- ExecuteDuration: number of nanoseconds spent executing the query.
- Concurrency: number of goroutines allocated to process the query.
- MaxAllocated: maximum number of bytes the query allocated.
- TotalAllocated: total number of bytes the query allocated (includes memory that was freed and then used again).
- RuntimeErrors: error messages returned during query execution.
- flux/query-plan: Flux query plan.
- influxdb/scanned-values: number of values scanned by InfluxDB.
- influxdb/scanned-bytes: number of bytes scanned by InfluxDB.
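As a minimal sketch of the query profiler in use, the script below enables only the "query" profiler and runs a small aggregate. The bucket name "example-bucket" and the measurement "cpu" are placeholders, not names from this page; substitute your own.

```flux
import "profiler"

// Enable only the query profiler; one extra table with the columns
// listed above is appended to the results.
option profiler.enabledProfilers = ["query"]

// "example-bucket" and "cpu" are hypothetical names used for illustration.
from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "cpu")
    |> mean()
```

Comparing TotalDuration with ExecuteDuration in the profiler table is a quick way to see whether the time went into running the query or into compiling, queueing, and planning it.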
operator
The operator profiler outputs statistics about each operation in a query. Operations executed in the storage tier are returned as a single operation. When the operator profiler is enabled, results returned by yield() include a table with a row for each operation and the following columns:
- Type: operation type
- Label: operation name
- Count: total number of times the operation executed
- MinDuration: minimum duration of the operation in nanoseconds
- MaxDuration: maximum duration of the operation in nanoseconds
- DurationSum: total duration of all operation executions in nanoseconds
- MeanDuration: average duration of all operation executions in nanoseconds
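The following sketch enables only the "operator" profiler on a query that mixes storage-side and in-memory operations. The bucket and measurement names are placeholders. Based on the description above, the leading from(), range(), and filter() calls would probably show up as a single row if they are executed in the storage tier, while map() and sort() would each get their own row. The exact labels depend on the InfluxDB version.

```flux
import "profiler"

// Enable only the operator profiler; the appended table has one row per operation.
option profiler.enabledProfilers = ["operator"]

// "example-bucket" and "cpu" are hypothetical names used for illustration.
from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user")
    // map() runs in memory and processes every row, so it is often the
    // most expensive operation in the profile.
    |> map(fn: (r) => ({r with _value: r._value / 100.0}))
    |> sort(columns: ["_time"])
```

Sorting the profiler rows by DurationSum is usually the fastest way to find the operation worth optimizing first.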
More information: https://docs.influxdata.com/influxdb/cloud/reference/flux/stdlib/profiler/
Performance Tips
- Can you apply any groups that will reduce the number of rows in your table(s) before applying a map() function? -> This helped me a few times!
- Can you use the experimental.join() function instead of the join() function?
- Can you tune any regexes to be as specific as possible?
- Does |> sort(columns: ["_time"], desc: true) |> limit(n: 1) perform better than |> last()? (The sort must be descending for limit(n: 1) to return the most recent row, as last() does.)
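Questions like the last one can be answered with the profiler itself. The sketch below is one possible way to set up the comparison, not a benchmark result: it runs both variants over the same input and yields each under its own name, so the operator profiler table contains rows for sort, limit, and last side by side. The bucket and measurement names are placeholders.

```flux
import "profiler"

option profiler.enabledProfilers = ["operator"]

// Shared input for both variants. "example-bucket" and "cpu" are
// hypothetical names used for illustration.
data = from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "cpu")

// Variant 1: sort newest-first, then keep the first row of each table.
data
    |> sort(columns: ["_time"], desc: true)
    |> limit(n: 1)
    |> yield(name: "sort_limit")

// Variant 2: built-in selector that returns the last row of each table.
data
    |> last()
    |> yield(name: "last")
```

Sharing one input between two yields may change which operations are executed in the storage tier, so for a cleaner measurement run each variant as its own script with the "query" profiler enabled and compare ExecuteDuration.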