Accessing your time series data is the first thing you have to do, which is why Warp 10 offers so many options for retrieving it. Discover them in this post!
Your journey with time series data has to start somewhere. When using Warp 10, your starting point is usually a call to FETCH, which will retrieve data from the underlying Time Series DataBase (TSDB), namely the Warp 10 Storage Engine.
The FETCH function has a plethora of options. This post covers some very useful combinations, so you can rapidly access your data in ways you did not know were possible.
General syntax
The FETCH function has a simple calling signature using a list of parameters as input and an advanced one relying on a parameter map. This post will focus on the latter, as it is the one allowing the most flexibility.
Selecting Geo Time Series
In order to fetch data, you will need credentials in the form of a Read Token. This token should always be present in the parameter map under the key token.
To select which series to access, you will next need a set of parameters defining them. Those parameters will match the class and labels or attributes of the series, and can be specified in one of two ways.
Using class and labels
The class key of the parameter map can be associated with a value representing either an exact match on the Geo Time Series class or a regular expression that classes should match. An exact match is simply a STRING, which should be prefixed with = if the exact match starts with either a = or a ~. A regular expression is a STRING prefixed with ~ and followed by the regular expression to use for matching.
The labels key of the parameter map should be associated with a map whose keys are label or attribute names and values are either exact matches (prefixed with =) or regular expressions (prefixed with ~).
The following example will select all series from class foo with a label bar matching the regular expression .*matchme.*.
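A minimal sketch of such a call is shown here. READ_TOKEN is a placeholder for your own Read Token, and the end and timespan keys are only added so that the call is complete; they are assumptions of this sketch rather than part of the selection itself.

```warpscript
{
  'token' 'READ_TOKEN'                  // placeholder, replace with your own Read Token
  'class' 'foo'                         // exact match on the class name
  'labels' { 'bar' '~.*matchme.*' }     // regular expression on the label bar
  'end' NOW                             // illustrative time range: the last hour
  'timespan' 1 h
} FETCH
```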
Using selectors
A Geo Time Series (GTS) selector is a STRING of the form CLASS_SEL{LABELS_SEL} where CLASS_SEL conforms to the syntax of the class parameter above and LABELS_SEL is a comma separated list of label or attribute names, each immediately followed by an exact match (prefixed with =) or a regular expression (prefixed with ~).
The parameter map can contain a single selector under the key selector or a list of selectors under selectors. This allows you to retrieve series for which a single regular expression would prove cumbersome to craft.
The example below fetches two sets of series, based on two selectors.
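As a sketch, a call with two selectors could look like the following. The class and label names in the second selector (sensor, site, paris) are hypothetical, READ_TOKEN is a placeholder, and the time range is only there to make the call complete.

```warpscript
{
  'token' 'READ_TOKEN'          // placeholder Read Token
  'selectors' [
    'foo{bar~.*matchme.*}'      // class foo, label bar matching a regular expression
    '~sensor.*{site=paris}'     // hypothetical: classes starting with sensor, label site equal to paris
  ]
  'end' NOW                     // illustrative time range: the last hour
  'timespan' 1 h
} FETCH
```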
Note that for both approaches, STRING values should be URL encoded (using %hh) if they contain characters such as the comma (,), {, =, ~ or }.
Retrieving data within a time range
The most common pattern for accessing data is to retrieve the data within a given time range. In WarpScript this is easily done by specifying either an end timestamp and a duration (timespan), or start and end timestamps.
When retrieving data with an end timestamp and a duration, the combination of those elements will be converted to a start timestamp (start = end - timespan + 1).
Here are two examples:
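Both forms can be sketched as follows. The selector foo{} (all series of a hypothetical class foo), the dates and the READ_TOKEN placeholder are assumptions chosen for illustration.

```warpscript
// End timestamp and duration: the last hour of data
{
  'token' 'READ_TOKEN'
  'selector' 'foo{}'
  'end' NOW
  'timespan' 1 h
} FETCH

// Explicit start and end timestamps, converted from ISO 8601 strings
{
  'token' 'READ_TOKEN'
  'selector' 'foo{}'
  'start' '2020-01-01T00:00:00Z' TOTIMESTAMP
  'end' '2020-01-02T00:00:00Z' TOTIMESTAMP
} FETCH
```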
Accessing the most recent values
Another popular pattern is to retrieve N values before or at a given instant. This is easily done with FETCH by specifying an end timestamp and a number of datapoints to retrieve.
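A minimal sketch, assuming the number of datapoints is passed under the count key and using a hypothetical class foo:

```warpscript
// The 10 most recent values of each selected series, at or before now
{
  'token' 'READ_TOKEN'    // placeholder Read Token
  'selector' 'foo{}'
  'end' NOW
  'count' 10
} FETCH
```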
Selecting active or quiet Geo Time Series
Warp 10 can track what it calls activity of Geo Time Series. When this feature is enabled (via ingress.activity.window), updates and metadata changes performed on a GTS will modify its last activity timestamp.
It is then possible for a FETCH or FIND call to only select series which were active or quiet after a given timestamp.
This is simply done by using the active.after or quiet.after keys in the parameter map.
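The sketch below restricts a fetch to series that were active during the last day. It assumes the timestamp is expressed in the platform time unit; check the FETCH documentation of your Warp 10 version before relying on this. The selector and READ_TOKEN are placeholders.

```warpscript
// Last value of every series which was active during the last day
{
  'token' 'READ_TOKEN'
  'selector' '~.*{}'
  'active.after' NOW 1 d -    // assumed to be in platform time units
  'end' NOW
  'count' 1
} FETCH
```

Replacing active.after with quiet.after would instead select the series which have been quiet since that timestamp.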
Fetching boundary values
Apart from selecting active or quiet series, everything we have described so far can more or less be found in every time series database.
Boundaries are a totally different story. To the best of our knowledge, Warp 10 is the only time series database to support them, alongside some industrial data historians.
Boundaries are data points which are either before (pre boundary) or after (post boundary) a specific time range. When working with IoT data, boundaries are very useful, because they allow you to greatly limit the amount of data you need to store. How is that?
Well, imagine you have sensors whose values hardly ever change. This is rather common when a sensor reports the position of a valve, for example. You could store the value of the sensor at periodic intervals, consuming possibly costly bandwidth, or you could only store the value when it changes. The latter case is the ideal one, but without boundary support in your TSDB, you will have a hard time fetching and analyzing those values.
Indeed, imagine your valve only changes state every week, so you record one value per week for the associated sensor. How do you determine the state of the valve at any given time? Without boundaries, you have to basically guess, fetching data at random after the moment you are interested in, hoping to find the first value after that instant. The value before the moment you are interested in is usually easier to fetch, luckily.
With boundaries, the problem is easily solved: simply specify the time range you are interested in and ask FETCH to retrieve the first value just before the time range and the first value right after it. If the valve did not change state within the specified time range, it is no big deal: you will still end up with the values before and after it, and you can proceed with your analysis.
The boundaries are specified in the following manner:
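A minimal sketch, using the boundary.pre and boundary.post keys with a hypothetical valve.position class and arbitrary dates:

```warpscript
{
  'token' 'READ_TOKEN'                          // placeholder Read Token
  'selector' 'valve.position{}'                 // hypothetical class name
  'start' '2020-01-01T00:00:00Z' TOTIMESTAMP
  'end' '2020-01-02T00:00:00Z' TOTIMESTAMP
  'boundary.pre' 1                              // one datapoint just before the time range
  'boundary.post' 1                             // one datapoint just after the time range
} FETCH
```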
If you have followed the explanation carefully, you have correctly concluded that by using boundary.post you can retrieve the first data points after a given timestamp, something very few time series databases can do!
Sampling values
Sometimes you are not interested in all the values within a time range, which is why FETCH supports sampling of your data.
Simply specify the ratio of data points which should be returned, and the FETCH call will sample your data, selecting only this proportion of values.
Note that sampling is not applied to pre- and post-boundaries.
The syntax for sampling is:
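As a sketch, assuming the ratio is passed under the sample key as a value between 0 and 1:

```warpscript
// Return roughly 10% of the datapoints of the last day
{
  'token' 'READ_TOKEN'    // placeholder Read Token
  'selector' 'foo{}'      // hypothetical class
  'end' NOW
  'timespan' 1 d
  'sample' 0.1
} FETCH
```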
The sampling mechanism still has to seek to each value, so even though only a portion of the encountered data points is returned, no significant performance improvement will be noticed.
Skipping values
If you should find it useful to skip values, know this is something FETCH can do. Simply use the following syntax:
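A minimal sketch, assuming the number of datapoints to skip is passed under the skip key:

```warpscript
// Ignore the 10 most recent datapoints, then return the 5 that follow
{
  'token' 'READ_TOKEN'    // placeholder Read Token
  'selector' 'foo{}'      // hypothetical class
  'end' NOW
  'skip' 10
  'count' 5
} FETCH
```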
As with sampling, skipping still needs to seek in the data files, so performance should only be slightly better than when fetching the skipped values.
Combining it all
Yes, all the options we have described can be combined, giving you the most flexible fetching capability of all time series databases.
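For instance, a single call could mix selection, activity, time range, boundaries and sampling. This is only a sketch: the class and label names are hypothetical, READ_TOKEN is a placeholder, and active.after is assumed to be in platform time units.

```warpscript
{
  'token' 'READ_TOKEN'
  'selectors' [ 'foo{bar~.*matchme.*}' '~sensor.*{site=paris}' ]
  'active.after' NOW 7 d -    // only series active during the last week
  'end' NOW
  'timespan' 1 d              // the last day of data
  'boundary.pre' 1            // plus one datapoint before the range
  'boundary.post' 1           // and one datapoint after it
  'sample' 0.5                // keep roughly half of the datapoints in the range
} FETCH
```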
Tell us what you achieved using those fancy fetching patterns!
In case you were using a version of Warp 10 older than 2.3.0, some options described above may not be available. Please upgrade to test drive everything.