Discover why, when, and how to use WarpScript in Python, to glean the benefits of using the analytics engine of the most advanced time series platform.
Pythonistas can benefit from using WarpScript in Python. In this post, we explain why, when, and how to do that.
Some of the content in this article comes from a talk I recently gave at a PyData meetup. The slides are available here.
If this is the first time you have heard of WarpScript, I suggest reading this post first.
Why and when do you need WarpScript?
Python users already have the Pandas library to work with time-series, so when would they need WarpScript?
- WarpScript supports pickle (using the ->PICKLE and PICKLE-> functions). This means that data can flow efficiently between Python and WarpScript.
- WarpScript has built-in functions to manipulate data and metadata coming from time series databases. For example, multi-way grouping and computing the mean series in each group can be done with one line of code: [ $gts [ 'key1' 'key2' ] reducer.mean ] REDUCE
- The WarpScript library is specialized in time series (and geo time series) and contains more than 1000 functions that were written to answer common practical use cases, from time and geo manipulation to graphical content generation and more. Not reinventing the wheel will save you time!
- Some functions overlap between WarpScript and pandas, for example BUCKETIZE with .resample() and MAP with .rolling(), but they differ enough to justify using the WarpScript version of the function in practical cases (for example when there are missing data).
- The same WarpScript can be executed either on a single server or distributed with PySpark. See the doc here.
- WarpScript doesn't need a Warp 10 platform. You can use it for its library, or process any input source on-the-fly. For example, it can transform any Hadoop input format at loading time.
- You can easily include WarpScript macros from a trusted remote repository. Just use the syntax @repo/my/macro in your WarpScript to call a remote macro.
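To make the REDUCE one-liner from the list above more concrete, here is a rough pure-Python sketch of the same idea: group series by a set of label keys, then average the values tick by tick inside each group. The data layout (dicts with "labels" and "points") and the function name reduce_mean are assumptions made for this illustration; they are not how WarpScript represents or processes Geo Time Series internally.

```python
from collections import defaultdict


def reduce_mean(series_list, label_keys):
    """Group series by the given label keys, then average values tick by tick."""
    groups = defaultdict(list)
    for series in series_list:
        # Series sharing the same values for label_keys land in the same group
        group_key = tuple(series["labels"].get(k) for k in label_keys)
        groups[group_key].append(series["points"])

    result = {}
    for group_key, members in groups.items():
        sums = defaultdict(float)
        counts = defaultdict(int)
        for points in members:
            for tick, value in points.items():
                sums[tick] += value
                counts[tick] += 1
        # A tick missing from some series is averaged over the series that have it
        result[group_key] = {tick: sums[tick] / counts[tick] for tick in sorted(sums)}
    return result


# Hypothetical sample data: three series, two of which share the same labels
series = [
    {"labels": {"key1": "a", "key2": "x"}, "points": {0: 1.0, 10: 3.0}},
    {"labels": {"key1": "a", "key2": "x"}, "points": {0: 3.0, 10: 5.0}},
    {"labels": {"key1": "b", "key2": "x"}, "points": {0: 10.0}},
]
means = reduce_mean(series, ["key1", "key2"])
print(means)
```

The point of the comparison is the amount of bookkeeping: the grouping, alignment, and averaging written out here by hand are what the single REDUCE call expresses.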
How to use WarpScript in Python
Using WarpScript in Python can be done in just a few steps.
Method 1: From a Jupyter notebook
Just pip install the extension and load it in your notebook.
!pip install warp10-jupyter
%load_ext warpscript
Now you are ready to use the %%warpscript cell magic. The --local/-l flag indicates that you are using the WarpScript library locally. If you want to connect to a Warp 10 platform instead, specify the --address and --port on which its Py4J gateway runs (see this post for more information).
%%warpscript --local --stack stack
"Hello world of WarpScript!"
top: "Hello world of WarpScript!"
The WarpScript execution environment is stored in the variable stack. It is reused in subsequent %%warpscript cells, and you can also use it to execute WarpScript code directly: stack.exec("some-warpscript-code").
Method 2: Not from a Jupyter notebook
Note that the same package also provides functions to execute WarpScript code outside a notebook.
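# pip install warp10-jupyter
import warpscript

stack = warpscript.newLocalStack()
# or, to connect to an existing gateway:
# stack = warpscript.newStack(address, port, auth_token)

# The WarpScript string literal needs its own quotes inside the Python string
stack.exec("'Hello world of WarpScript!'")
...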
Note that .exec() executes one-line statements and .execMulti() executes multi-line strings.
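As a small illustration of the .exec() / .execMulti() distinction, the sketch below hides both behind one helper that picks the method depending on whether the code spans several lines. The helper name run_warpscript and the RecordingStack class are hypothetical: RecordingStack is a stand-in invented here so that the example runs without a JVM, and it only records which method was called. With the real package you would pass your actual stack object instead.

```python
class RecordingStack:
    """Stand-in for a WarpScript stack (hypothetical, for illustration only).

    It records which method was called; a real stack would execute the code.
    """

    def __init__(self):
        self.calls = []

    def exec(self, code):
        self.calls.append(("exec", code))
        return self

    def execMulti(self, code):
        self.calls.append(("execMulti", code))
        return self


def run_warpscript(stack, code):
    """Send code to .exec() if it fits on one line, otherwise to .execMulti()."""
    code = code.strip()
    if "\n" in code:
        return stack.execMulti(code)
    return stack.exec(code)


demo = RecordingStack()
run_warpscript(demo, "'Hello world of WarpScript!'")
run_warpscript(demo, "1 2 +\n3 *")
print(demo.calls)
```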
Method 3: With the Py4J library
If you want more control over your interaction with the stack and the JVM (for example to use a specific Warp 10 version, WarpScript extensions, or simply other libraries from the Java world), you can do the same with the Py4J library directly.
With this method, you need a Warp 10 jar first. You can download one from GitHub, then untar it:
wget https://github.com/senx/warp10-platform/releases/download/X.Y.Z/warp10-X.Y.Z.tar.gz
tar xvzf warp10-X.Y.Z.tar.gz
Now, launch a Py4J gateway. This gateway is responsible for creating a stack (the environment which executes WarpScript code). If you want to be connected to a Warp 10 platform, connect to its gateway rather than launching one.
from py4j.java_gateway import launch_gateway, JavaGateway, GatewayParameters
import warpscript  # optional import (in warp10-jupyter package), this overrides methods for printing stack and GTS objects

# With enable_auth=True, launch_gateway returns both the port and the auth token.
# Adjust the classpath to the version you downloaded.
port, token = launch_gateway(enable_auth=True, die_on_exit=True,
                             classpath='warp10-2.1.0/bin/warp10-2.1.0.jar')
gateway = JavaGateway(gateway_parameters=GatewayParameters(port=port,
                      auto_convert=True, auth_token=token))
Specify a WarpScript configuration and create the stack:
default_conf = {}
default_conf["warp.timeunits"] = "us"
default_conf["py4j.stack.nolimits"] = "true"
entry_point = gateway.jvm.io.warp10.Py4JEntryPoint(default_conf)
stack = entry_point.newStack()
You can now play with it!
stack.exec("'Hello world of WarpScript!'")
...
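The configuration step above can be wrapped in a small helper if you create stacks in several places. This is only a sketch: the function name make_conf and its parameters are hypothetical, the two configuration keys are the ones used in this post, and the list of accepted time units ("ns", "us", "ms") is my understanding of Warp 10's options rather than something stated here.

```python
def make_conf(time_unit="us", no_limits=True):
    """Build a configuration dict like the one passed to Py4JEntryPoint above.

    Values are kept as strings, as in the example from the post.
    """
    allowed = ("ns", "us", "ms")  # assumed set of valid platform time units
    if time_unit not in allowed:
        raise ValueError(f"time_unit must be one of {allowed}, got {time_unit!r}")
    return {
        "warp.timeunits": time_unit,
        "py4j.stack.nolimits": "true" if no_limits else "false",
    }


conf = make_conf()
print(conf)
```

The resulting dict would then be passed in place of default_conf when creating the entry point.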
Conversions
The gateway automatically converts common objects: numbers, lists, dicts, strings, bytes, and so on.
For larger objects, the principled way to transfer them between Python and WarpScript is to use the pickle representation.
For example, in what follows, we transfer some data to WarpScript:
import pickle
stack.push(pickle.dumps(ticks))
stack.push(pickle.dumps(values))
Now we use a WarpScript function (here it is TIMESPLIT):
%%warpscript --local --stack stack --not-verbose
[ "ticks" "values" ] STORE
$ticks PICKLE-> [] [] [] $values PICKLE-> MAKEGTS
1 d 2 "piece" TIMESPLIT
VALUES ->PICKLE
… and we retrieve data back in Python:
result = pickle.loads(stack.pop())
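The Python side of this exchange relies only on the standard pickle module. The sketch below shows the round trip in isolation: a plain list plays the role of the stack (an assumption made so that the example runs without WarpScript), and the ticks and values are made-up sample data. No WarpScript is executed here, so the data comes back unchanged.

```python
import pickle

# Made-up sample data: timestamps in microseconds and one value per tick
ticks = [0, 1_000_000, 2_000_000]
values = [1.5, 2.5, 3.5]

# A plain list stands in for the WarpScript stack (push = append, pop = pop)
fake_stack = []
fake_stack.append(pickle.dumps(ticks))
fake_stack.append(pickle.dumps(values))

# Each entry is a bytes object, one of the types the gateway converts
assert all(isinstance(item, bytes) for item in fake_stack)

# Popping returns the most recently pushed item first, so values come before ticks
restored_values = pickle.loads(fake_stack.pop())
restored_ticks = pickle.loads(fake_stack.pop())
print(restored_ticks, restored_values)
```

Note the order: because the stack is last-in, first-out, whatever was pushed last is what the first pop returns.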
Additional tips
- In a Jupyter notebook, you can use NOOP (the no-operation function) to initiate a stack:
%%warpscript --local --stack stack
NOOP
Local gateway launched on port 40641
Creating a new WarpScript stack accessible under variable "stack".
- You can load macros from any trusted remote repository.
For example, we can set the list of trusted repositories:
trusted_repos = stack.getAttribute("warpfleet.repos")
if trusted_repos is None:
trusted_repos = []
trusted_repos.append("https://raw.githubusercontent.com/randomboolean/shareable/master/")
stack.setAttribute("warpfleet.repos", trusted_repos)
And then call a macro from this repo:
%%warpscript --local --stack stack2 @mc2/RANDNORMAL
top: -0.9514475048807272
2: -0.6635116324344921
- You can use the --not-verbose/-v flag if your stack contains big pickle objects, to avoid the notebook representing them below the executed cell.
- Stack objects have the same public methods as defined in the WarpScript source code. Of notable use are .peek(), .pop(), .get(int), .push(), .getAttribute(key), .setAttribute(key, value), .depth() … you can list them all with dir(stack).
- The --stack/-s flag uses stack as its default, so --stack stack is in fact not needed.
- The --local/-l flag is only needed when the stack is initiated. After that, the stack variable given as argument (or the default one) is reused.
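The trusted-repository tip above can be wrapped in a small helper that avoids adding the same URL twice. This is a sketch under stated assumptions: add_trusted_repo is a hypothetical name, and FakeStack is a stand-in invented here that only mimics .getAttribute() and .setAttribute(), so that the example runs without a JVM. With a real stack, the list returned by the gateway may be a Java list proxy, which is why the helper copies it into a Python list first.

```python
class FakeStack:
    """Stand-in exposing getAttribute/setAttribute (hypothetical, for illustration)."""

    def __init__(self):
        self._attributes = {}

    def getAttribute(self, key):
        return self._attributes.get(key)

    def setAttribute(self, key, value):
        self._attributes[key] = value


def add_trusted_repo(stack, url, key="warpfleet.repos"):
    """Append url to the stack's trusted repositories unless it is already there."""
    repos = stack.getAttribute(key)
    repos = [] if repos is None else list(repos)
    if url not in repos:
        repos.append(url)
    stack.setAttribute(key, repos)
    return repos


stack = FakeStack()
url = "https://raw.githubusercontent.com/randomboolean/shareable/master/"
add_trusted_repo(stack, url)
add_trusted_repo(stack, url)  # calling it again does not duplicate the entry
print(stack.getAttribute("warpfleet.repos"))
```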
Conclusion
In this post, we reviewed why and when to use WarpScript in Python and how to do it. Happy WarpScripting in Python!
You may also be interested in the previous post on the Py4J plugin for Warp 10, the post that presented the notebook extension, and the related page in the official documentation.