It is time to go back to basics: extracting data within a geographical area.
We assume that we have geolocated data points and want to fetch all those that fall within an area.
First, define an area, either in WKT or in GeoJSON.
You can use Wicket for WKT and GeoJSON.io for GeoJSON.
In this post, we will use a simple GeoJSON:
{
  "type": "Polygon",
  "coordinates": [
    [
      [ 130.94261169433594, -12.162184641327928 ],
      [ 131.1822509765625, -12.162184641327928 ],
      [ 131.1822509765625, -11.982905071887549 ],
      [ 130.94261169433594, -11.982905071887549 ],
      [ 130.94261169433594, -12.162184641327928 ]
    ]
  ]
}
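If your tooling expects WKT rather than GeoJSON, the same polygon can be converted mechanically. The sketch below is a small Python helper, not part of WarpScript; the function name `polygon_to_wkt` and the constant `AREA` are made up for this illustration, and it only handles the `Polygon` type used in this post.

```python
import json

# The GeoJSON polygon used in this post.
AREA = """
{
  "type": "Polygon",
  "coordinates": [[
    [130.94261169433594, -12.162184641327928],
    [131.1822509765625, -12.162184641327928],
    [131.1822509765625, -11.982905071887549],
    [130.94261169433594, -11.982905071887549],
    [130.94261169433594, -12.162184641327928]
  ]]
}
"""

def polygon_to_wkt(geojson_text):
    """Convert a GeoJSON Polygon (given as text) into a WKT POLYGON string.

    Hypothetical helper written for this example only.
    """
    geometry = json.loads(geojson_text)
    if geometry.get("type") != "Polygon":
        raise ValueError("only GeoJSON Polygon geometries are handled here")
    rings = []
    for ring in geometry["coordinates"]:
        # GeoJSON positions are [longitude, latitude]; WKT writes "x y",
        # which is the same order, so no swap is needed.
        points = ", ".join(f"{pos[0]!r} {pos[1]!r}" for pos in ring)
        rings.append(f"({points})")
    return "POLYGON (" + ", ".join(rings) + ")"

wkt = polygon_to_wkt(AREA)
print(wkt)
```

The printed string can be pasted into any tool that accepts WKT.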
Filter with MAP
We will use the MAP framework and its mapper.geo.within function:
<'
{
  "type": "Polygon",
  "coordinates": [
    [
      [ 130.94261169433594, -12.162184641327928 ],
      [ 131.1822509765625, -12.162184641327928 ],
      [ 131.1822509765625, -11.982905071887549 ],
      [ 130.94261169433594, -11.982905071887549 ],
      [ 130.94261169433594, -12.162184641327928 ]
    ]
  ]
}
'>
'geoJson' STORE
$geoJson 0.01 false GEO.JSON 'geoshape' STORE
[ 'READ TOKEN' 'my.dataset' {} NOW 1 h ] FETCH 'gts' STORE
[
  $gts              // GTS list
  $geoshape         // from GEO.JSON
  mapper.geo.within // keep inner points
  0                 // pre
  0                 // post
  0                 // occurrence
]
MAP
NONEMPTY // exclude empty GTS
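To make the idea of "keeping inner points" concrete, here is a minimal Python sketch of a point-in-polygon filter based on ray casting. It is only a conceptual illustration: it tests points against the exact polygon, whereas the WarpScript above works on a GeoShape built by GEO.JSON with a given precision, so results near the border may differ. The sample data points and the names `point_in_polygon` and `keep_within` are hypothetical.

```python
# Exterior ring of the polygon used in this post, as (lon, lat) pairs.
RING = [
    (130.94261169433594, -12.162184641327928),
    (131.1822509765625, -12.162184641327928),
    (131.1822509765625, -11.982905071887549),
    (130.94261169433594, -11.982905071887549),
    (130.94261169433594, -12.162184641327928),
]

def point_in_polygon(lon, lat, ring):
    """Ray casting: count how many edges a ray going east from the point crosses."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def keep_within(points, ring):
    """Keep only the (timestamp, lon, lat, value) tuples located inside the ring."""
    return [p for p in points if point_in_polygon(p[1], p[2], ring)]

# Hypothetical data points: (timestamp, lon, lat, value).
points = [
    (1, 131.0, -12.0, 42.0),   # inside the rectangle
    (2, 130.0, -12.0, 17.0),   # west of the rectangle
    (3, 131.0, -13.0, 99.0),   # south of the rectangle
]

kept = keep_within(points, RING)
print(kept)
```

A series for which `keep_within` returns an empty list corresponds to the empty GTS that NONEMPTY removes.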
Going further
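FETCH and the MAP framework are both worth exploring in more depth.
The two parameters passed to GEO.JSON (0.01 and false above) control how precisely the shape approximates the area: the first is the accepted percentage of error, the second whether to keep only the cells that lie entirely inside the area.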
Going further
In the previous section, we fetched all data points and then kept only those inside our area. We can do better.
If our data points are not moving, a better alternative is to fetch only the series whose data points are in the geographical area. To do so, we first need to compute an HHCode from the location of the last known data point of each series and store it in an attribute. You only have to execute this script once:
'READ TOKEN' 'read' STORE
'WRITE TOKEN' 'write' STORE
$read AUTHENTICATE
10000000 MAXOPS // yes it could be huge
[ $read 'my.dataset' { } NOW -1 ] FETCH // fetch the last known data point
<%
  // Remove list index
  DROP
  'gts' STORE
  [
    $gts
    <%
      // Extract lat + lon
      [ 4 5 ] SUBLIST FLATTEN LIST-> DROP
      // Convert to HHCode
      ->HHCODE 'hhcode' STORE
      // Set HHCode as attribute 'loc'
      $gts { 'loc' $hhcode } SETATTRIBUTES
      // Return NO VALUE
      0 NaN NaN NaN NULL // fake data for the macro mapper
    %> MACROMAPPER
    0 0 0
  ]
  MAP
  // Discard mapped GTS as we do not need it
  DROP
%> LMAP
$write META
You can then fetch using this HHCode attribute, so that you only retrieve the series whose data points are located in the area:
[...]
$geoJson 0.01 false GEO.JSON 'geoshape' STORE
'~(' $geoshape GEO.REGEXP + ')' + 'regexp' STORE
[ $token 'my.dataset' { 'loc' $regexp } NOW 1 h ] FETCH 'gts' STORE
[...]
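Why does a regular expression on the 'loc' attribute select the right series? An HHCode written as a hexadecimal string encodes a position at increasing resolution from left to right, so every cell of the shape corresponds to a prefix. The Python sketch below illustrates that prefix-matching idea only. It does not reproduce the exact regular expression generated by GEO.REGEXP, and the cell prefixes, series names and 'loc' values are invented for the example.

```python
import re

def build_prefix_regexp(cell_prefixes):
    """Build a pattern matching any hex string that starts with one of the prefixes."""
    alternatives = "|".join(re.escape(p) for p in sorted(cell_prefixes))
    return re.compile(f"(?:{alternatives})[0-9a-f]*")

# Hypothetical cells covering the area, as HHCode hex prefixes.
cell_prefixes = ["e0a4", "e0a5c"]

# Hypothetical series with their 'loc' attribute (16 hex characters each).
series_loc = {
    "sensor-a": "e0a4f3b29c1d7a60",  # starts with the first prefix
    "sensor-b": "e0a5c00112233445",  # starts with the second prefix
    "sensor-c": "1b7d9e0a4f3b29c1",  # contains "e0a4", but not as a prefix
}

pattern = build_prefix_regexp(cell_prefixes)

# The whole attribute value must match, as with a regexp selector.
selected = sorted(name for name, loc in series_loc.items() if pattern.fullmatch(loc))
print(selected)
```

Because the selection happens on the attribute, series located outside the area are never read at all, which is where the gain over the MAP-based filter comes from.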
Going further
We used GEO.REGEXP, META, MAXOPS, SETATTRIBUTES and MACROMAPPER.
But what if my data points are moving? You can optimize your computation by using the spatio-temporal indexing technique.