InfluxDB: Flux - Aggregate the Sum over Time


AggregateWindow() with sum() = unexpected behaviour

The Flux function aggregateWindow() is often used on time series data, for example in a graph visualization.
Usually you use the aggregation function mean() inside aggregateWindow() and everything behaves as you expect, but if you use the aggregation function sum() inside aggregateWindow() the behaviour is a bit surprising (at least it was for me).
The "problem", or rather the unexpected behaviour, occurs when you have irregular data points.

"Problem" explanation / Difference between mean() and sum()

For example, you want the sum of all VMs' memory over time. -> This example is simplified to a single time series.
Assume you have the following time series data (memory in GB):
_time, vm, _value

2021-08-13 08:00:00 GMT+2, vm01, 4GB
2021-08-13 08:01:00 GMT+2, vm01, 4GB
2021-08-13 08:02:00 GMT+2, vm01, 4GB
2021-08-13 08:02:30 GMT+2, vm01, 4GB
2021-08-13 08:03:00 GMT+2, vm01, 4GB

Result of AggregateWindow() with mean() "|> aggregateWindow(every: 1m, fn: mean)"

2021-08-13 08:00:00 GMT+2, vm01, 4GB
2021-08-13 08:01:00 GMT+2, vm01, 4GB
2021-08-13 08:02:00 GMT+2, vm01, 4GB
2021-08-13 08:03:00 GMT+2, vm01, 4GB

Result of AggregateWindow() with sum() "|> aggregateWindow(every: 1m, fn: sum)"

2021-08-13 08:00:00 GMT+2, vm01, 4GB
2021-08-13 08:01:00 GMT+2, vm01, 4GB
2021-08-13 08:02:00 GMT+2, vm01, 8GB
2021-08-13 08:03:00 GMT+2, vm01, 4GB

In the example above everything still looks logical, but as soon as you have multiple time series (for example multiple VMs) and want the sum of all VMs' memory over time, the behaviour gets really strange: any window that happens to contain an extra data point for one of the series counts that series twice and shows a spike.
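
For illustration, the naive query for "sum of all VMs' memory over time" looks roughly like this (a sketch only; the bucket and field are placeholders, the measurement hyperv_health is taken from the workaround below):

from(bucket: "<YOUR_BUCKET>")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) =>
    r._measurement == "hyperv_health" and
    r._field == "<YOUR_FIELD>"
  )
  |> group(columns: ["_field"]) //merge all VMs into one table so sum() adds them together
  |> aggregateWindow(every: 1m, fn: sum, createEmpty: false) //sums ALL points in each window, so a VM that reports twice within a window is counted twice

This is why some windows show an inflated value: the per-window sum depends on how many points happen to fall into each window, not only on how many VMs there are.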


Workaround to prevent the unexpected behaviour

The following workaround uses the same example: you want the sum of all VMs' memory over time.

//define variables
bucket = "<YOUR_BUCKET>"
field = "<YOUR_FIELD>"
window = 1m //<YOUR_WINDOW> - must be a duration (for example 1m), not a string

//create a custom aggregate function for this use case
customsum = (tables=<-, column="_value") =>
  tables
    |> mean(column: column) //calculate the mean first to prevent unexpected behaviour with irregular time series (a VM reporting twice within a window is no longer counted twice)
    |> drop(columns: ["host"]) //drop "host" because we want the overall sum and a VM can switch hosts; drop() is used instead of group() because group() removes some timestamps
    |> unique(column: "VMName") //remove duplicated time series (for example when a VM is on multiple hosts within the time window)
    |> drop(columns: ["VMName"]) //drop "VMName" because we want the overall sum
    |> sum(column: column) //calculate the sum over all VMs

from(bucket: bucket)
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => 
    r._measurement == "hyperv_health" and
    r._field == field
  )
  |> aggregateWindow(every: window, fn: customsum, createEmpty: false)
  |> fill(column: "_value", usePrevious: true) //fill empty windows with the previous value
  |> toInt() //convert the result to an integer (toInt() truncates, it does not round)


Important Notes (when using sum() inside aggregateWindow())

If gathering the data takes some time (a few minutes or more), your graph may look weird because your windowing time is shorter than this gathering time.
This means some windows contain more data points than others.

Solution

Increase your windowing time (it has to be larger than the gathering time).

Example where gathering the data takes on average 15 minutes:

... 
|> aggregateWindow(every: duration(v: int(v: v.windowPeriod) + int(v: 15m)), fn: customsum, createEmpty: false) //extend the dashboard window period by the average gathering time of 15 minutes