Thinking in WarpScript™ – Fetch data within a geographical area

Fetch data within a geographical area

It is time to go back to basics: extracting data within a geographical area.

We assume that our data points are geolocated, and we want to fetch all the data points that fall within a given area.

First, define an area, either in WKT or in GeoJSON.

You can use Wicket to draw a WKT shape and GeoJSON.io to draw a GeoJSON one.

In this post, we will use a simple GeoJSON polygon:

{
  "type": "Polygon",
  "coordinates": [
    [
      [ 130.94261169433594, -12.162184641327928 ],
      [ 131.1822509765625, -12.162184641327928 ],
      [ 131.1822509765625, -11.982905071887549 ],
      [ 130.94261169433594, -11.982905071887549 ],
      [ 130.94261169433594, -12.162184641327928 ]
    ]
  ]
}

Filter with MAP

We will use the MAP framework and its mapper.geo.within function:

<'
{
  "type": "Polygon",
  "coordinates": [
    [
      [ 130.94261169433594, -12.162184641327928 ],
      [ 131.1822509765625, -12.162184641327928 ],
      [ 131.1822509765625, -11.982905071887549 ],
      [ 130.94261169433594, -11.982905071887549 ],
      [ 130.94261169433594, -12.162184641327928 ]
    ]
  ]
}
'> 
'geoJson' STORE
$geoJson 0.01 false GEO.JSON 'geoshape' STORE
[ 'READ TOKEN' 'my.dataset' {} NOW 1 h ] FETCH 'gts' STORE
[
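  $gts               // GTS list
  $geoshape          // shape built by GEO.JSON
  mapper.geo.within  // keep only the points inside the shape
  0                  // pre: no ticks before the current one
  0                  // post: no ticks after the current one
  0                  // occurrences: 0 processes every tick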
] MAP NONEMPTY // exclude empty GTS
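
If the area is described in WKT rather than GeoJSON, only the shape-building line changes. The snippet below is a minimal sketch of the same rectangle written as a WKT polygon, assuming GEO.WKT takes the same precision and boolean parameters as GEO.JSON. Note that WKT lists each coordinate as "longitude latitude", separated by a space.

```warpscript
// Same rectangle as the GeoJSON polygon above, expressed in WKT
'POLYGON ((130.94261169433594 -12.162184641327928, 131.1822509765625 -12.162184641327928, 131.1822509765625 -11.982905071887549, 130.94261169433594 -11.982905071887549, 130.94261169433594 -12.162184641327928))'
0.01 false GEO.WKT 'geoshape' STORE
```

The resulting $geoshape can then be passed to mapper.geo.within exactly as in the script above.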

Going further

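To go deeper, see the documentation of FETCH and of the MAP framework.
The numeric parameter of GEO.JSON (0.01 here) sets the precision of the computed shape, and the boolean (false here) indicates whether only the cells lying fully inside the polygon are kept.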

Going further

In the previous chapter, we fetched all the data points and then kept only those located in our area. We can improve on this.

If our data points are not moving, a better alternative is to fetch only the series located in the area. To do so, we first need to compute an HHCode from the location of the last known data point of each series and store it in an attribute. You only have to execute this script once:

'READ TOKEN' 'read' STORE
'WRITE TOKEN' 'write' STORE
$read AUTHENTICATE 10000000 MAXOPS // raise the operation limit, the script may need a lot of operations
[ $read 'my.dataset' { } NOW -1 ] FETCH // fetch the last known data point
<%  
  // Remove list index  
  DROP  
  'gts' STORE
  [ $gts <% 
    // Extract the latitude and longitude lists and unpack them onto the stack
    [ 4 5 ] SUBLIST FLATTEN LIST-> DROP
    // Convert to HHCode
    ->HHCODE 'hhcode' STORE
    // Set the HHCode as attribute 'loc'
    $gts { 'loc' $hhcode } SETATTRIBUTES
    // Return no value
    0 NaN NaN NaN NULL // fake data for the macro mapper
  %> MACROMAPPER 0 0 0 ] MAP 
  // Discard mapped GTS as we do not need it    
  DROP
%> LMAP  $write  META
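
To get a feel for what ends up in the 'loc' attribute, the conversion can be tried on its own. This is a small sketch using illustrative coordinates chosen inside the example polygon (they are not taken from any dataset). It assumes that HHCODE-> is available to convert the HHCode back; the decoded position may differ very slightly from the input because an HHCode has a finite resolution.

```warpscript
// Illustrative latitude and longitude inside the example polygon
-12.07 131.06 ->HHCODE 'hhcode' STORE
$hhcode            // the HHCode computed for this location
$hhcode HHCODE->   // pushes the decoded latitude and longitude back onto the stack
```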

You can then use this HHCode to retrieve only the series whose data points are located in the area:

[...]
$geoJson 0.01 false GEO.JSON 'geoshape' STORE
'~(' $geoshape GEO.REGEXP + ')' + 'regexp' STORE
[ $token 'my.dataset' { 'loc' $regexp } NOW 1 h ] FETCH 'gts' STORE
[...]
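
Before fetching any data, it can be useful to check which series the attribute selector actually matches. The following sketch reuses $token and $regexp from the snippet above and relies on FIND, which returns the matching series without their data points. The variable name 'candidates' is only illustrative.

```warpscript
// List the series whose 'loc' attribute matches the area, without reading any data point
[ $token 'my.dataset' { 'loc' $regexp } ] FIND 'candidates' STORE
$candidates SIZE // number of series located in the area
```

An empty result usually means that the 'loc' attribute has not been set yet, or that no series lies in the area.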

Going further

We used GEO.REGEXP, META, MAXOPS, SETATTRIBUTES and MACROMAPPER.

But what if my data points are moving? You can optimize your computation by using the spatio-temporal indexing technique.
