Your journey with time series data has to start somewhere. When using Warp 10, your starting point is usually a call to FETCH, which retrieves data from the underlying Time Series DataBase (TSDB), namely the Warp 10 Storage Engine.
The FETCH function has a plethora of options. This post covers some very useful combinations so you can rapidly access your data in ways you did not know were possible.
The FETCH function has a simple calling signature using a list of parameters as input, and an advanced one relying on a parameter map. This post focuses on the latter, as it allows the most flexibility.
Selecting Geo Time Series
In order to fetch data, you will need credentials in the form of a Read Token. This token should always be present in the parameter map under the key token.
The next thing you need in order to select which series to access is a set of parameters defining them. Those parameters match the class and the labels or attributes of the series. They can be specified in one of two ways.
The class key of the parameter map can be associated with a value representing either an exact match on the Geo Time Series class or a regular expression that classes should match. An exact match is simply a STRING, which should be prefixed with = if the exact match starts with either a = or a ~. A regular expression is a STRING prefixed with ~ and followed by the regular expression to use for matching.
The labels key of the parameter map should be associated with a map whose keys are label or attribute names and whose values are either exact matches (prefixed with =) or regular expressions (prefixed with ~).
The following example will select all series from class foo with a label bar matching a regular expression.
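A minimal sketch of such a parameter map is shown here. The token value, the regular expression b.* and the end/count settings are placeholders chosen for illustration, not values from the original post.

```warpscript
// Placeholder read token, replace it with a valid one.
'READ_TOKEN' 'token' STORE

{
  'token'  $token
  'class'  'foo'               // exact match on the class name
  'labels' { 'bar' '~b.*' }    // hypothetical regular expression on label bar
  'end'    NOW
  'count'  1                   // most recent datapoint of each matching series
}
FETCH
```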
A Geo Time Series (GTS) selector is a STRING of the form CLASS_SEL{LABELS_SEL}, where CLASS_SEL conforms to the syntax of the class parameter above and LABELS_SEL is a comma separated list of label or attribute names, each immediately followed by an exact match (prefixed with =) or a regular expression (prefixed with ~).
The parameter map can contain a single selector under the key selector or a list of selectors under selectors. This allows you to retrieve series for which a single regular expression would prove cumbersome to craft.
The example below fetches two sets of series, based on two selectors.
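As an illustration, a list of two selectors could look like the sketch below. The class names, label names and values are hypothetical, as is the token.

```warpscript
// Placeholder read token, replace it with a valid one.
'READ_TOKEN' 'token' STORE

{
  'token' $token
  'selectors' [
    'foo{bar~b.*}'             // class foo, label bar matching a regular expression
    '~temp.*{site=paris}'      // classes matching temp.*, label site equal to paris
  ]
  'end'   NOW
  'count' 1
}
FETCH
```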
Note that for both approaches, STRING values should be URL encoded (using %hh) if they contain characters that have a special meaning in the selector syntax, such as {, } or a comma.
Retrieving data within a time range
The most common pattern for accessing data is to retrieve the data within a given time range. In WarpScript this is easily done by specifying either an end timestamp and a duration (timespan), or start and end timestamps.
When retrieving data with an end timestamp and a duration, the combination of those elements will be converted to a start timestamp (start = end - timespan + 1).
Here are two examples:
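The two forms can be sketched as follows, using the end, timespan and start keys of the parameter map. The selector, the dates and the token are assumptions made for the example.

```warpscript
// Placeholder read token, replace it with a valid one.
'READ_TOKEN' 'token' STORE

// 1. End timestamp and duration: the last hour of data.
//    The start is derived as end - timespan + 1.
{
  'token'    $token
  'selector' 'foo{}'
  'end'      NOW
  'timespan' 1 h
}
FETCH

// 2. Explicit start and end timestamps.
{
  'token'    $token
  'selector' 'foo{}'
  'start'    '2020-01-01T00:00:00Z' TOTIMESTAMP
  'end'      '2020-01-02T00:00:00Z' TOTIMESTAMP
}
FETCH
```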
Accessing the most recent values
Another popular pattern is to retrieve N values before or at a given instant. This is easily done with FETCH by specifying an end timestamp and a number of datapoints to retrieve.
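A sketch of this pattern, assuming a hypothetical selector and token, and asking for the 10 most recent values at or before the current time:

```warpscript
// Placeholder read token, replace it with a valid one.
'READ_TOKEN' 'token' STORE

{
  'token'    $token
  'selector' 'foo{}'
  'end'      NOW      // instant at or before which values are retrieved
  'count'    10       // number of datapoints to retrieve per series
}
FETCH
```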
Selecting active or quiet Geo Time Series
Warp 10 can track what it calls the activity of Geo Time Series. When this feature is enabled (via ingress.activity.window), updates and metadata changes performed on a GTS will modify its last activity timestamp.
It is then possible for a FETCH or FIND call to only select series which were active or quiet after a given timestamp.
This is simply done by using the active.after and quiet.after keys in the parameter map.
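For instance, restricting a fetch to series that were active during the last day might look like the sketch below. The selector, the one day threshold and the token are illustrative assumptions, and activity tracking must be enabled on the instance for this to have any effect.

```warpscript
// Placeholder read token, replace it with a valid one.
'READ_TOKEN' 'token' STORE

{
  'token'        $token
  'selector'     '~.*{}'
  'active.after' NOW 1 d -   // only series with activity after this timestamp
  'end'          NOW
  'count'        1
}
FETCH

// Replacing 'active.after' with 'quiet.after' would instead select
// series with no recorded activity after the given timestamp.
```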
Fetching boundary values
Apart from selecting active or quiet series, everything we have described so far can more or less be found in every time series database.
Boundaries are a totally different story. To the best of our knowledge, apart from some industrial data historians, Warp 10 is the only time series database to support them.
Boundaries are data points which are either before (pre boundary) or after (post boundary) a specific time range. When working with IoT data, boundaries are very useful, because they allow you to greatly limit the amount of data you need to store. How is that?
Well, imagine you have sensors whose values hardly ever change. This is rather common when a sensor reports the position of a valve, for example. You could store the value of the sensor at periodic intervals, consuming possibly costly bandwidth, or you could only store the value when it changes. The latter case is the ideal one, but without boundary support in your TSDB, you will have a hard time fetching and analyzing those values.
Indeed, imagine your valve only changes state every week, so you record one value per week for the associated sensor. How do you determine the state of the valve at any given time? Without boundaries you have to basically guess, fetching data at random after the moment you are interested in, hoping to find the first value after that instant. The value before the moment you are interested in is usually easier to fetch, luckily.
With boundaries, the problem is easily solved: simply specify the time range you are interested in and ask FETCH to retrieve the first value just before the time range and the first value right after it. If the valve did not change state within the specified time range, it is no big deal. You will still end up with the value before and the value after it, and you can proceed with your analysis.
The boundaries are specified in the following manner:
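A sketch using the boundary.pre and boundary.post keys, each set to the number of datapoints wanted outside the time range. The selector name, the dates and the token are hypothetical.

```warpscript
// Placeholder read token, replace it with a valid one.
'READ_TOKEN' 'token' STORE

{
  'token'         $token
  'selector'      'valve.position{}'
  'start'         '2020-01-01T00:00:00Z' TOTIMESTAMP
  'end'           '2020-01-02T00:00:00Z' TOTIMESTAMP
  'boundary.pre'  1   // one datapoint just before the time range
  'boundary.post' 1   // one datapoint just after the time range
}
FETCH
```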
If you have followed the explanation carefully, you have correctly concluded that by using boundary.post you can retrieve the first datapoints after a given timestamp, something very few time series databases can do!
Sampling
Sometimes you are not interested in all the values within a time range. This is why FETCH supports sampling of your data.
Simply specify the ratio of data points which should be returned, and the FETCH call will sample your data, returning only that proportion of values.
Note that sampling is not applied to pre and post boundaries.
The syntax for sampling is:
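A sketch of a sampled fetch using the sample key, which takes a ratio between 0 and 1. The selector, the time range and the token are assumptions.

```warpscript
// Placeholder read token, replace it with a valid one.
'READ_TOKEN' 'token' STORE

{
  'token'    $token
  'selector' 'foo{}'
  'end'      NOW
  'timespan' 1 d
  'sample'   0.1     // return roughly 10% of the datapoints in the range
}
FETCH
```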
The sampling mechanism still has to seek to each value, so even though only a portion of the encountered data points is returned, no significant performance improvement will be noticed.
Skipping
If you find it useful to skip values, know that this is something FETCH can do. Simply use the following syntax:
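A sketch using the skip key, which takes a number of datapoints to skip. The selector, the counts and the token are illustrative. It assumes datapoints are skipped in the order FETCH encounters them, so the skipped values would be the most recent ones.

```warpscript
// Placeholder read token, replace it with a valid one.
'READ_TOKEN' 'token' STORE

{
  'token'    $token
  'selector' 'foo{}'
  'end'      NOW
  'count'    10      // return 10 datapoints per series...
  'skip'     5       // ...after skipping the first 5 encountered
}
FETCH
```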
As with sampling, skipping still needs to seek in the data files, so performance should only be slightly better than when fetching the skipped values.
Combining it all
Yes, all the options we have described can be combined, giving you the most flexible fetching capability of all time series databases.
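As an illustration of such a combination, the sketch below mixes selectors, activity filtering, a time range, boundaries and sampling in one parameter map. All names, dates and values are hypothetical, and the exact interplay of the options should be checked against the FETCH documentation for your Warp 10 version.

```warpscript
// Placeholder read token, replace it with a valid one.
'READ_TOKEN' 'token' STORE

{
  'token'         $token
  'selectors'     [ 'valve.position{site=paris}' '~temp.*{}' ]
  'active.after'  NOW 30 d -     // ignore series without recent activity
  'start'         '2020-01-01T00:00:00Z' TOTIMESTAMP
  'end'           '2020-01-02T00:00:00Z' TOTIMESTAMP
  'boundary.pre'  1              // value just before the range
  'boundary.post' 1              // value just after the range
  'sample'        0.5            // about half of the values inside the range
}
FETCH
```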
Tell us what you achieved using those fancy fetching patterns!
If you are using a version of Warp 10 older than 2.3.0, some of the options described above may not be available. Please upgrade to test drive everything.