Fetch the first datapoint

When you fetch data, the Warp 10 FETCH function always starts from the end and walks back in time. Fetching the last datapoint is easy. Fetching the first datapoint is less obvious. As you asked us for a solution, here is a way to do it.

FETCH basics

[ $ReadToken 'classname' {} NOW -1 ] FETCH will fetch the last point of a stored GTS. To do this, Warp 10 needs to open the right file and decompress the data. Fetching one point or one thousand points does not make a big difference.

The idea for finding the first point is to fetch 1000 points ending one month ago. If you get 1000 points, fetch 1000 points ending two months ago, or just before the oldest tick of the points you found, and so on until a fetch no longer returns 1000 points. Then run a dichotomy (binary search) on the end timestamp until a fetch returns at least one point but strictly fewer than 1000, and return the oldest of those points.
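The search described above can be sketched in Python against an in-memory list of ticks. This is a toy model, not the code of the actual macro: `fetch`, `first_tick` and the integer timestamps are hypothetical names and units chosen for illustration, and `fetch` only imitates a count-based fetch that walks back from an end timestamp.

```python
from bisect import bisect_right


def fetch(ticks, end, count):
    """Toy stand-in for a count-based fetch: the `count` newest ticks <= end, newest first.

    `ticks` must be a list of integers sorted in ascending order.
    """
    hi = bisect_right(ticks, end)
    return ticks[max(0, hi - count):hi][::-1]


def first_tick(ticks, now, step, batch):
    """Return (first tick or None, number of datapoints fetched)."""
    fetched, end, oldest_seen = 0, now, None

    # Phase 1: jump back in time while fetches come back full.
    while True:
        points = fetch(ticks, end, batch)
        fetched += len(points)
        if len(points) < batch:
            if points:
                return points[-1], fetched  # partial batch: it holds the first point
            break  # empty fetch: we jumped past the first point (or there is no data)
        oldest_seen = points[-1]
        end = min(end - step, oldest_seen - 1)

    if oldest_seen is None:
        return None, fetched  # nothing at or before `now`

    # Phase 2: dichotomy. Invariant: no tick <= lo, and hi is an existing tick.
    lo, hi = end, oldest_seen
    while hi - lo > 1:
        mid = (lo + hi) // 2
        points = fetch(ticks, mid, batch)
        fetched += len(points)
        if not points:
            lo = mid  # still before the first point
        elif len(points) < batch:
            return points[-1], fetched  # partial batch: done
        else:
            hi = points[-1]  # full batch: the first point is at or before this tick
    return hi, fetched


# One million synthetic ticks, searched with a 100000-unit step and 1000-point batches.
ticks = list(range(1_000, 1_001_000))
first, fetched = first_tick(ticks, now=2_000_000, step=100_000, batch=1_000)
```

In this model, `first` is the oldest tick of the list and `fetched` stays far below the size of the series, which is the point of the approach.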

Implementation

The macro takes 4 parameters:

  • The read token
  • A GTS that will be your selector (possibly empty)
  • An offset to define the time step you will jump (it depends on your data)
  • A batch size (1000 is a good one)

If you know your data arrives at around 1 million points per month and goes back a few years, a one-month time step and a batch size of 10000 datapoints are reasonable. It really depends on your data.
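To get a feel for these two parameters, here is a rough order-of-magnitude estimate in Python. It is only a sketch under stated assumptions: `estimate_fetches` is a hypothetical helper, "a few years" is taken as 36 months, the data rate is assumed constant, and the dichotomy is assumed to halve a window of about one time step until that window holds fewer points than one batch. The real macro may behave differently.

```python
import math


def estimate_fetches(history_steps, points_per_step, batch):
    """Rough count of fetches: (jumps back, dichotomy iterations)."""
    # One fetch per time step of history, plus the one that overshoots.
    jumps = math.ceil(history_steps) + 1
    # Each dichotomy iteration halves a window of about one step, until the
    # window holds fewer than `batch` points.
    halvings = max(0, math.ceil(math.log2(points_per_step / batch)))
    return jumps, halvings


BATCH = 10_000  # batch size from the example above
jumps, halvings = estimate_fetches(history_steps=36, points_per_step=1_000_000, batch=BATCH)
max_points = (jumps + halvings) * BATCH  # every fetch returns at most BATCH points
```

Under these assumptions, the search reads a few hundred thousand datapoints at most, compared with roughly 36 million stored points. A smaller time step means more jumps, and a smaller batch means more dichotomy iterations.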

@training/dataset1 // load token to access fuel prices
[ $TOKEN '~.*' {} ] FIND 1 10 SUBLIST 'gtsList' STORE // find the first 10 gts
$gtsList // iterate on each gts to find its first point
<%
  'gts' STORE
  $TOKEN $gts 30 d 1000 @senx/gts/first
  FIRSTTICK ISO8601 // leave the result on the stack
%> FOREACH

This macro is useful, so we made it available through the WarpFleet resolver. You can call it with @senx/gts/first. You can review the code here.

Going further with attributes

If you never write in the past and you often need the first datapoint for your analysis, you can store the first timestamp in a GTS attribute. It will be ready for the next time you need the first point of your GTS!

[ $READTOKEN '~.*speed' { 'id' '~.*paris.*' } ] FIND 0 GET 'myGts' STORE // take the first one
// check for attributes
<% $myGts ATTRIBUTES 'first_tick' CONTAINSKEY %>
<%
  'first_tick' GET TOLONG // leave the attribute on the stack
%>
<%
  DROP // nothing in attributes
  $READTOKEN $myGts 30 d 1000 @senx/gts/first FIRSTTICK 'result' STORE
  // store it as an attribute
  $myGts { 'first_tick' $result TOSTRING } SETATTRIBUTES $WRITETOKEN META
  $result
%>
IFTE

On a one million point series, this method fetches 4006 datapoints on the first run, and the second run is immediate because the attribute is already set.

All these operations could be done on a list of GTS, with LMAP or a FOREACH.

Alternatively, you can set an attribute after you push the first point of your GTS (for example with the meta endpoint). But beware that if you delete the first point or write in the past, the value of this attribute might not reflect what you think it should (unless you update it).
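The caveat about deletes and writes in the past comes down to keeping the cached value in sync with the data. Here is a minimal Python sketch of that bookkeeping. `Series` is a hypothetical in-memory stand-in, not the Warp 10 API. It stores the attribute as a string, as in the WarpScript above, and it recomputes the minimum directly where a real setup would run the first-point search again.

```python
class Series:
    """In-memory stand-in for a GTS with attributes (not the Warp 10 API)."""

    def __init__(self):
        self.ticks = set()
        self.attributes = {}

    def push(self, tick):
        self.ticks.add(tick)
        cached = self.attributes.get("first_tick")
        # First push, or a write in the past: move the cached first tick back.
        if cached is None or tick < int(cached):
            self.attributes["first_tick"] = str(tick)

    def delete(self, tick):
        self.ticks.discard(tick)
        # Deleting the first point: recompute, or drop the attribute if empty.
        if self.attributes.get("first_tick") == str(tick):
            if self.ticks:
                self.attributes["first_tick"] = str(min(self.ticks))
            else:
                del self.attributes["first_tick"]


series = Series()
series.push(100)
series.push(200)
series.push(50)     # write in the past: the attribute follows
series.delete(50)   # delete the first point: the attribute is recomputed
```

Without the two update branches, the attribute would keep pointing at a tick that is no longer the first one.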

Conclusion

If you know your series are small enough, FETCH all datapoints and call FIRSTTICK, then store the first tick in the series attributes. If you are not sure, use the @senx/gts/first macro.
