The WarpScript equivalent of a SELECT ... WHERE request is a basic GTS sum. You just need to build a new GTS that selects the interesting data from the original one!
Lagging effect on derivatives
Non-real-time systems may lag. This has no effect if your system just stores data: you will end up with missing data points which could be easily handled later.
Now, imagine that your system also computes the longitudinal acceleration of a vehicle. Lower layer software sends data every 100ms to the software. The software computes (wheelspeed[N] – wheelspeed[N-1])/100ms, applies a low-pass filter, then stores the result. But if there is a lag in the meantime, sample N-1 will represent data from 1s before instead of 100ms before, and the result will be 10 times too high.
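To make the factor of 10 concrete, here is a minimal Python sketch of that finite difference. The function name and the sample speeds are made up for illustration, and the low-pass filter is left out.

```python
# Hypothetical illustration: names and sample values are not from the real system.
NOMINAL_PERIOD_S = 0.1  # the software assumes one sample every 100 ms


def naive_acceleration(prev_speed, speed):
    """Finite difference that always divides by the nominal period."""
    return (speed - prev_speed) / NOMINAL_PERIOD_S


# Vehicle accelerating steadily at 2 m/s^2, starting at 10 m/s.
true_accel = 2.0
speed_at_start = 10.0

# Normal case: the previous sample really is 100 ms old.
speed_after_100ms = speed_at_start + true_accel * 0.1
normal = naive_acceleration(speed_at_start, speed_after_100ms)

# Lag case: the previous sample is actually 1 s old,
# but the computation still divides by 100 ms.
speed_after_1s = speed_at_start + true_accel * 1.0
lagged = naive_acceleration(speed_at_start, speed_after_1s)

print(normal, lagged)  # roughly 2 m/s^2 versus roughly 20 m/s^2
```

The speed difference accumulated over 1s is divided by 100ms, hence the tenfold error.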
That is what happened with the car data I am currently working with. Sometimes, in very rare circumstances, lags happen, and the longitudinal acceleration becomes wrong:
Advice: turn on the "dots" when you work with unknown data. In the VSCode WarpScript extension, there is an "always show dots" option.
Around 17:16:12, the 1.5s lag introduces an incorrect value. Moreover, since the software that produced the data applied a low-pass filter, neighboring data points are also affected.
Ideally, a few data points before the detected lag should be discarded, along with around 30 points after it (it takes about 20 samples to converge again). How could you do this in WarpScript?
First, we will detect the lags by computing the delta between ticks, then we will discard the bad data points.
Compute delta between ticks
[ $gLongGTS mapper.tick 0 0 0 ] MAP //put tick in value
[ SWAP mapper.delta 2 2 STRICTMAPPER 1 0 0 ] MAP //delta between ticks
First line: put the tick in the value. The output of mapper.tick is a GTS where each value equals its tick.
Second line: mapper.delta computes the delta between the last and first values of the sliding window (delta = last – first). For the very first tick, the sliding window will contain only one point, and the result would be zero. The best way to avoid this start condition is to wrap the mapper with STRICTMAPPER, configured here with a minimum of 2 and a maximum of 2 data points, so that the mapper is only applied when the window holds exactly two points.
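The same computation can be modelled in a few lines of plain Python. This is only a sketch of the logic, not the Warp 10 API: the function name is hypothetical and the ticks are in milliseconds for readability.

```python
def tick_deltas(ticks):
    """Return {tick: tick - previous_tick}, skipping the first tick.

    Models a window of exactly two points (previous + current): the first
    tick has no predecessor, so it produces no output instead of a 0.
    """
    ordered = sorted(ticks)
    return {cur: cur - prev for prev, cur in zip(ordered, ordered[1:])}


# One sample arrives 1.5 s after the previous one instead of 100 ms.
ticks = [0, 100, 200, 1700, 1800]
deltas = tick_deltas(ticks)
print(deltas)  # {100: 100, 200: 100, 1700: 1500, 1800: 100}
```

The lag is visible as the single large delta attached to tick 1700, and tick 0 has no entry at all.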
Build a validity indicator
Using the previous GTS, we want to obtain a GTS that only keeps the ticks that are valid. Thus, we will be able to use this GTS as a validity indicator.
If a tick has a delta above a certain threshold, we consider it to be invalid, along with the 10 ticks before it and the 30 ticks after it (since they have been impacted by the low-pass filter). So, in the following WarpScript, we propagate the bad delta value to those ticks too, using mapper.max.
[ SWAP mapper.max 30 10 0 ] MAP //maximum delta in a moving window
Reverse your brain, be the mapper: if you want to find a bad delta 10 points before, it means your moving window should look 10 points after the current tick. That’s why the "post" parameter of MAP is 10. Same logic with the "pre" parameter.
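If the pre/post reasoning still feels backwards, a small Python model of the moving maximum may help. It is a sketch with hypothetical names and a deliberately small window (3 points back, 1 point ahead) so that the result fits on one line. The 400ms threshold is applied at the end.

```python
def moving_max(values, pre, post):
    """For each index i, take the max over values[i - pre .. i + post]."""
    result = []
    for i in range(len(values)):
        start = max(0, i - pre)
        window = values[start:i + post + 1]
        result.append(max(window))
    return result


# Deltas in ms; the lag shows up at index 5.
deltas = [100, 100, 100, 100, 100, 1500, 100, 100, 100, 100, 100]

# Looking 1 point ahead flags 1 point BEFORE the lag,
# looking 3 points back flags 3 points AFTER the lag.
spread = moving_max(deltas, pre=3, post=1)
print(spread)
# [100, 100, 100, 100, 1500, 1500, 1500, 1500, 1500, 100, 100]

# Keep only the indexes whose spread delta is under the threshold.
THRESHOLD_MS = 400
kept = [i for i, d in enumerate(spread) if d < THRESHOLD_MS]
print(kept)  # [0, 1, 2, 3, 9, 10]
```

Index 4 is discarded because its window looks one point ahead and sees the lag, and indexes 6 to 8 are discarded because their windows look back and still see it.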
Now, I just want to keep the points whose delta is below a threshold of 400ms:
[ SWAP 400 ms mapper.lt 0 0 0 ] MAP //keep delta under 400ms
Keep valid points
Since Warp 10 2.0, you can do arithmetic on GTS directly:
$gtsA $gtsB +
For each tick, if both gtsA and gtsB have a value for this tick, it returns a value (the sum of gtsA and gtsB values).
Read the previous sentence again… again… and try to formulate the contraposition: to remove values from a GTS, you just need to add a GTS with value 0 on each tick you want to keep. Back to the complete example:
[ $gLongGTS 1439572577413977 12061287 TIMECLIP mapper.tick 0 0 0 ] MAP
[ SWAP mapper.delta 2 2 STRICTMAPPER 1 0 0 ] MAP //delta between ticks
[ SWAP mapper.max 30 10 0 ] MAP //maximum delta in a moving window
[ SWAP 400 ms mapper.lt 0 0 0 ] MAP //keep delta under 400ms
0 GET 0 * //take first gts in the list output, multiply all values by 0
'noLagIndicator' RENAME
'noLagGTS' STORE
The red curve shows the ticks I want to keep, with value 0. I will add it to my original data:
$gLongGTS $noLagGTS +
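The masking effect of this sum can be modelled in plain Python, with a GTS represented as a dict from tick to value. This is only a sketch of the semantics described above: the helper name and the sample values are made up.

```python
def gts_add(a, b):
    """Sum two series, keeping only the ticks present in both."""
    return {t: a[t] + b[t] for t in a if t in b}


# Original data: the value at tick 200 is corrupted by a lag.
data = {0: 1.2, 100: 1.3, 200: 9.7, 300: 1.1}

# Zero-valued mask: tick 200 is absent because it was flagged as invalid.
mask = {0: 0, 100: 0, 300: 0}

clean = gts_add(data, mask)
print(clean)  # {0: 1.2, 100: 1.3, 300: 1.1}
```

Adding zero leaves the kept values untouched, and any tick missing from the mask simply disappears from the result.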
Here is the full sequence with lagging effect ignored:
The efficient WarpScript equivalent of SELECT ... WHERE ... is just a GTS sum.
The efficient WarpScript equivalent of SELECT ... WHERE a AND b AND c is just several GTS sums:
[ $gearGTS $gear mapper.eq 0 0 0 ] MAP 0 GET //keep points with this gear ratio
0 * 'gearSelectorGTS' STORE //multiply values by zero
[ $throttleGTS 90 mapper.gt 0 0 0 ] MAP 0 GET //keep points where throttle pedal is pressed
0 * 'throttleSelectorGTS' STORE //multiply values by zero
//keep datapoints for current gear, for throttle over 90%, and no lag:
$rpmGTS $gearSelectorGTS + $throttleSelectorGTS + $noLagGTS +
'rpm' STORE
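For readers more at ease with SQL-like thinking, the AND chaining can be sketched in Python, again with dicts standing in for GTS. The helper names and sample values are hypothetical. Each selector plays the role of one WHERE condition, and each sum narrows the result further.

```python
def selector(gts, predicate):
    """Zero-valued mask holding only the ticks whose value passes the predicate."""
    return {t: 0 for t, v in gts.items() if predicate(v)}


def gts_add(a, b):
    """Sum two series, keeping only the ticks present in both."""
    return {t: a[t] + b[t] for t in a if t in b}


gear = {0: 2, 100: 3, 200: 3, 300: 3}
throttle = {0: 95, 100: 95, 200: 40, 300: 100}
rpm = {0: 3000, 100: 4200, 200: 2500, 300: 4500}

gear_selector = selector(gear, lambda g: g == 3)          # WHERE gear = 3
throttle_selector = selector(throttle, lambda p: p > 90)  # AND throttle > 90

result = gts_add(gts_add(rpm, gear_selector), throttle_selector)
print(result)  # {100: 4200, 300: 4500}
```

Tick 0 is dropped by the gear condition and tick 200 by the throttle condition, so only the ticks satisfying both survive.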
This method is the most efficient way I found in WarpScript to remove invalid data. Applied to 2500 cars, the acceleration diagram (max acceleration vs rpm for a given gear, full throttle) is now clean:
The fix took… Well… 7 lines of WarpScript!
Electronics engineer, fond of computer science, embedded solution developer.