Comparing the data of every Sunday in a given timezone is not trivial. With the WarpScript language, it becomes a fairly straightforward time series analysis.
Level 1: compare data day to day
Customer: I want the mean hourly power consumption over the last 20 days, hour by hour. That is, the mean of the data at h, h-24, h-48, h-72, and so on. There might be missing datapoints. Can I do this in WarpScript?
Me: Sure, it could be done in a few lines, give me your data!
The first thing you need to do is align your data: one point per hour, holding the mean of the data between h and h+59 minutes. Then just use TIMEMODULO to create splits of one day:
[ $OneMeter20dGTS bucketizer.mean 0 1 h 0 ] BUCKETIZE 0 GET // align timestamps, take the first GTS in the list
1 d 'split' TIMEMODULO // create a list of GTS with a split attribute, one per day
[ SWAP [] reducer.mean.exclude-nulls ] REDUCE // compute the mean of all points for each tick
The WarpScript and the wrapped test data are playable on WarpStudio, just click this link.
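For readers who want to see the arithmetic behind these three steps outside of Warp 10, here is a minimal Python sketch. It is not how WarpScript implements BUCKETIZE, TIMEMODULO or REDUCE; the function name `hourly_profile` and the sample data are made up for illustration, timestamps are plain Unix seconds, and each bucket is labelled by its start rather than its end.

```python
from collections import defaultdict

HOUR = 3600          # seconds
DAY = 24 * HOUR

def hourly_profile(points):
    """Mean value per hour of day from (unix_seconds, value) points."""
    # Step 1: align the data, one mean per hourly bucket.
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % HOUR].append(value)
    hourly = {start: sum(v) / len(v) for start, v in buckets.items()}

    # Steps 2 and 3: fold every bucket onto a single (UTC) day with a
    # modulo, then average per hour. Hours with no data on a given day
    # simply contribute nothing, like reducer.mean.exclude-nulls.
    folded = defaultdict(list)
    for start, mean in hourly.items():
        folded[(start % DAY) // HOUR].append(mean)
    return {hour: sum(v) / len(v) for hour, v in sorted(folded.items())}

# Three days with a point in the 10:00 hour; only one day has an 11:00 point.
sample = [
    (0 * DAY + 10 * HOUR, 1.0),
    (1 * DAY + 10 * HOUR, 2.0),
    (2 * DAY + 10 * HOUR + 1800, 6.0),
    (1 * DAY + 11 * HOUR, 4.0),
]
profile = hourly_profile(sample)
print(profile)  # {10: 3.0, 11: 4.0}
```

Note that the modulo works on raw timestamps, so the resulting day is a UTC day. This is the limitation that Level 2 has to work around.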
Level 2: Compare data Sunday to Sunday
Customer: I also have a Sunday effect I want to isolate. Power consumption is higher on Sundays for some customers. Is it possible to have series for a single day of the week only, in the Paris timezone?
Me: It is slightly more complex to do because you need to play with ->TSELEMENTS, but as it is a very useful function, we put it on WarpFleet. If you're running Warp 10 2.0, it is really easy. Please show me your data!
Customer: We don't have such data. Please add 2.0 to every value on Sundays, Paris timezone.
As an experienced WarpScript programmer, you guess that it can be done in a single mapper. For each value, in a single-value window, read the day from the timestamp and add 2.0 if the day is Sunday:
[
  $OneMeter50dGTS
  <%
    'i' STORE
    <% $i 0 GET 'Europe/Paris' ->TSELEMENTS 8 GET 7 == %> // if day == 7
    <%
      $i [ 3 7 ] SUBLIST FLATTEN // original data
      $i 7 GET 0 GET 2.0 +       // get the value, add 2.0 on value
      4 SET                      // change the value
    %>
    <%
      $i [ 3 7 ] SUBLIST FLATTEN // push original data
    %>
    IFTE
  %> MACROMAPPER
  0 0 0
] MAP
Again, a fully playable example is available on this WarpStudio snapshot. Don't forget to select the right timezone in the DataViz!
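To make the timezone subtlety concrete, here is a small Python sketch of the same "add 2.0 on Sundays" rule. It is only an illustration of the logic, not a translation of the mapper: `add_on_sunday` is a hypothetical helper, and a fixed +02:00 offset stands in for Europe/Paris in summer so the snippet has no dependency on a timezone database.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed +02:00 offset stands in for Europe/Paris in summer.
# With the tz database installed, zoneinfo.ZoneInfo("Europe/Paris") would
# also handle daylight saving transitions.
PARIS_SUMMER = timezone(timedelta(hours=2))

def add_on_sunday(points, tz, bonus=2.0):
    """Return (timestamp, value) points with `bonus` added on local Sundays."""
    out = []
    for ts, value in points:
        local = datetime.fromtimestamp(ts, tz)
        # isoweekday(): Monday is 1, Sunday is 7.
        out.append((ts, value + bonus if local.isoweekday() == 7 else value))
    return out

# 2019-07-06 23:30 UTC is already Sunday 01:30 in Paris,
# while 2019-07-07 22:30 UTC is Monday 00:30 in Paris.
sat_late = int(datetime(2019, 7, 6, 23, 30, tzinfo=timezone.utc).timestamp())
sun_late = int(datetime(2019, 7, 7, 22, 30, tzinfo=timezone.utc).timestamp())
points = [(sat_late, 1.0), (sun_late, 1.0)]

in_paris = add_on_sunday(points, PARIS_SUMMER)
in_utc = add_on_sunday(points, timezone.utc)
print([v for _, v in in_paris])  # [3.0, 1.0]
print([v for _, v in in_utc])    # [1.0, 3.0]
```

The same two points get opposite treatment depending on the timezone, which is why the day has to be read from the timestamp in the customer's timezone rather than in UTC.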
One line split by day
Now that you have some fake data, it is time to cut it by day. On Warp 10 2.0, the WarpFleet repository is activated by default, which means that Warp 10 will look online for macros not found locally. Here we will use the @senx/cal/bydayofweek macro:
$OneMeter50dGTS 'Europe/Paris' @senx/cal/bydayofweek
OK, we can now TIMESPLIT each series with a one-day quiet period. The output will be a list of lists of GTS, one GTS per day:
1 d 1 'split' TIMESPLIT
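The idea of splitting on a quiet period can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the TIMESPLIT implementation: `split_on_gaps` is a hypothetical helper, ticks are Unix seconds, and the exact boundary rule (strictly greater than the quiet period) is a guess.

```python
HOUR = 3600
DAY = 24 * HOUR

def split_on_gaps(ticks, quiet_period, min_values=1):
    """Split sorted ticks wherever two neighbours are more than quiet_period apart."""
    chunks, current = [], []
    for tick in ticks:
        if current and tick - current[-1] > quiet_period:
            # A gap longer than the quiet period closes the current chunk.
            if len(current) >= min_values:
                chunks.append(current)
            current = []
        current.append(tick)
    if len(current) >= min_values:
        chunks.append(current)
    return chunks

# Hourly ticks for two days that are one week apart (for example two Sundays).
first_day = [h * HOUR for h in range(24)]
second_day = [7 * DAY + h * HOUR for h in range(24)]
chunks = split_on_gaps(first_day + second_day, quiet_period=DAY)
print(len(chunks), [len(c) for c in chunks])  # 2 [24, 24]
```

Once a series only holds one day of the week, consecutive days are at least six days apart, so a one-day quiet period is enough to isolate each of them.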
Timeshift with timezone
Now we must shift all GTS to the same day in order to REDUCE them into a mean. TIMEMODULO could do the job, but it won't take the timezone into account. For each series, we must look at the day of the year in the timezone and shift it into the past accordingly. Applying a function to each element of a list to produce a new list can be done with LMAP:
<% DROP 'gts' STORE $gts $gts FIRSTTICK 'Europe/Paris' ->TSELEMENTS 7 GET -1 * d TIMESHIFT %> LMAP
The result is really close to my first TIMEMODULO example, but:
- It takes timezones into account
- Look at the GTS labels: they are labelled with .dayofweek
The .dayofweek label allows REDUCE to work on equivalence classes. The final result just needs one more line:
[ SWAP [ '.dayofweek' ] reducer.mean.exclude-nulls ] REDUCE 'mean by day' RENAME
The playable example is available here.
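As a cross-check of what the whole pipeline computes, the net effect of the per-day split, the timezone-aware shift and the REDUCE on .dayofweek can be sketched in Python as a mean per (local day of week, local hour). This is a simplified stand-in and not the WarpScript above: `mean_by_weekday_and_hour` is a hypothetical helper, and a fixed +02:00 offset again stands in for Europe/Paris in summer.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumption: fixed +02:00 offset as a stand-in for Europe/Paris in summer.
PARIS_SUMMER = timezone(timedelta(hours=2))

def mean_by_weekday_and_hour(points, tz):
    """Mean value per (ISO weekday, local hour) from (unix_seconds, value) points."""
    groups = defaultdict(list)
    for ts, value in points:
        local = datetime.fromtimestamp(ts, tz)
        # The weekday plays the role of the equivalence class,
        # the local hour plays the role of the shifted tick.
        groups[(local.isoweekday(), local.hour)].append(value)
    return {key: sum(v) / len(v) for key, v in sorted(groups.items())}

def utc_ts(day, hour):
    """Unix seconds for a July 2019 day and hour, expressed in UTC."""
    return int(datetime(2019, 7, day, hour, tzinfo=timezone.utc).timestamp())

points = [
    (utc_ts(6, 23), 3.0),   # Sunday 7 July, 01:00 in Paris
    (utc_ts(13, 23), 5.0),  # Sunday 14 July, 01:00 in Paris
    (utc_ts(7, 23), 2.0),   # Monday 8 July, 01:00 in Paris
]
means = mean_by_weekday_and_hour(points, PARIS_SUMMER)
print(means)  # {(1, 1): 2.0, (7, 1): 4.0}
```

Grouping on the local weekday keeps the two Sundays together even though, in UTC, both points were recorded on a Saturday.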
Comparing data day by day, or year by year, is a very common request from our customers who want to extract trends from their data. We know the thinking path is not trivial. I hope this article will help you to think in WarpScript!
Explore the other posts about Thinking in WarpScript.