Ingesting 300k data points per second in the Warp 10 time series database on a Raspberry Pi is possible with the new ingestion format of Warp 10 2.1. We also benched the PINE64 board!

Time for IoT again! Embedding Warp 10 on a Raspberry Pi or another embedded target has some advantages: you can use WarpScript everywhere to process data locally, and use REXEC and other server-to-server functions for data synchronization. But what about performance?
If you look at the Warp 10 documentation, you can read that edge deployments can handle around 10k data points/s. You can also see that the standalone version on a single computer can handle around 100k data points/s.
Time to check those figures!
Explicit Warning: Warp 10 2.1 spoilers inside.
Bench material
ARM32: Rpi3 B
Not the most recent, but so common among makers that it is a "must bench". Note that industrial-grade Raspberry Pi boards now exist.
- Quad Core 1.2GHz Broadcom BCM2837 64bit SOC (Cortex A-53)
- 1 GB RAM
- 32 bits (armv7l) Raspbian OS.
Yes, Raspbian is a 32-bit OS on a 64-bit ARM. This has an impact on the LevelDB implementation.
ARM64: PINE64 LTS
Not very common, but a very good performance/price ratio.
- Quad Core 1.15 GHz Allwinner A64 SOC (Cortex A-53)
- LPDDR3 RAM (up to 2 GB)
- Armbian 64 bits (aarch64) with a pretty old patched 3.10 kernel (from before sunxi started merging with mainline).
ARM32: Rpi4 B
The most recent release of the famous Raspberry
- 1.5-GHz, Quad-Core Broadcom BCM2711B0 (Cortex A-72)
- 2 GB RAM
- Twice the RAM bandwidth of the Rpi3.
- 32 bits (armv7l) Raspbian OS.
Again, Raspbian is a 32-bit OS on a 64-bit ARM.
A fan
Yes. I want to bench Warp 10, not the weird throttling strategies implemented in these SoCs.
SD Cards
OK, these are not the fastest on the market, but I don't care about sequential read or write. I used them for a customer because they have a SMART-like protocol which clearly tells you how many write cycles you did.
Meanwhile, during the bench, I was so impressed by the Rpi4 that I also bought an A2 class SD card. This kind of SD card sustains 2000 IOPS random write. For a database, this is more important than the sequential write performance. When you buy an SD card, forget about the "speeds up to 60 MB/s". The card I bought is A2 class (random write 10 MB/s, 2000 IOPS), and V30 (30 MB/s minimum sequential write).
Anyway, I will also do tests with a RAM drive. Again, I want to bench Warp 10, not SD cards.
Software setup
Java: OpenJDK 1.8
Warp 10: latest 2.0.3 master… And a pretty cool new branch we will merge soon. I'm going to explain that later on.
Warp 10 configuration: the out-of-the-box default configuration. There may be room for improvement.
Datasets
The dataset has a huge impact on performance. Ingesting booleans is quicker than ingesting doubles (parsing time), and there are a few tweaks you must know about in the Warp 10 input format. Warp 10 2.1 also allows something really new.
For the test, I imagine I am logging 22 channels on the CAN bus of a car.
The naive dataset format
Each line has a different classname, and the label is repeated everywhere.
The ordered dataset format
I grouped the lines by channel, but the classname and label are still repeated.
The optimized dataset format
I used the continuation syntax, with the = sign, to tell Warp 10: "go on, it is the same GTS as on the previous line".
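To make the difference between the three text formats concrete, here is a minimal Python sketch that renders the same samples in the naive, ordered, and optimized layouts. The channel names, the `vehicle=car1` label, and the helper names are hypothetical, and the location fields are left empty for brevity. Only the general line shape (`TS/LAT:LON/ELEV class{labels} value`) and the leading `=` continuation come from the Warp 10 input format.

```python
from typing import Dict, List, Tuple

# Hypothetical sample type: (timestamp in microseconds, {channel name: value}).
Sample = Tuple[int, Dict[str, float]]

LABELS = "{vehicle=car1}"  # hypothetical label set, identical for every series


def naive_lines(samples: List[Sample]) -> List[str]:
    """One full line per data point, in arrival order (classes interleaved)."""
    return [
        f"{ts}// {name}{LABELS} {value}"
        for ts, channels in samples
        for name, value in channels.items()
    ]


def by_channel(samples: List[Sample]) -> Dict[str, List[Tuple[int, float]]]:
    """Regroup the samples so that all points of one channel are contiguous."""
    grouped: Dict[str, List[Tuple[int, float]]] = {}
    for ts, channels in samples:
        for name, value in channels.items():
            grouped.setdefault(name, []).append((ts, value))
    return grouped


def ordered_lines(samples: List[Sample]) -> List[str]:
    """Full lines again, but grouped channel by channel."""
    return [
        f"{ts}// {name}{LABELS} {value}"
        for name, points in by_channel(samples).items()
        for ts, value in points
    ]


def optimized_lines(samples: List[Sample]) -> List[str]:
    """Grouped by channel; every line after the first uses the '=' continuation."""
    lines: List[str] = []
    for name, points in by_channel(samples).items():
        for index, (ts, value) in enumerate(points):
            if index == 0:
                lines.append(f"{ts}// {name}{LABELS} {value}")
            else:
                lines.append(f"={ts}// {value}")
    return lines


if __name__ == "__main__":
    demo = [
        (1000000, {"rpm": 2500, "speed": 87.5}),
        (2000000, {"rpm": 2600, "speed": 88.0}),
    ]
    for render in (naive_lines, ordered_lines, optimized_lines):
        print(render.__name__)
        print("\n".join(render(demo)))
```

The three renderings carry the same data points. The optimized one simply gives the parser less text to read, because the classname and labels appear once per channel.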
The brand new Warp 10 2.1 multi value format
Up to Warp 10 2.0, you can store booleans, longs, doubles, or string values. Warp 10 2.1 introduces a new format: the binary format. It means you can push whatever you want in the value field, starting with b64: (followed by Base64 URL content) or hex: (followed by hex encoded content), or multiple values enclosed in [ ].
In this example, I have a fixed list of 22 channels. I will use a list of values, one value per channel. The timestamp and the position appear only once per line, and my GTS is named allchannels.
The list is encoded by the Warp 10 ingestion process into a wrapped ENCODER. When you FETCH the data, you need to decode it to recover the values. This operation is really fast.
If you want an even faster ingestion rate, you can remove the compression of this ENCODER: just add ! right after the [.
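As a rough illustration of the multi-value layout, the Python sketch below builds one line per sample, with the timestamp and position written once and every channel value placed inside the brackets. The `[ ]` delimiters, the `!` flag, and the `allchannels` name come from the text above. The whitespace separator between elements, the empty label set, and the helper names are assumptions made for this sketch, so check them against the Warp 10 2.1 input format documentation before relying on them.

```python
from typing import Sequence, Union

Number = Union[bool, int, float]

GTS_NAME = "allchannels"  # name used in the article; labels left empty here


def format_value(value: Number) -> str:
    """Render one element: booleans as T/F, numbers as plain literals."""
    if isinstance(value, bool):  # bool must be tested before int
        return "T" if value else "F"
    if isinstance(value, (int, float)):
        return repr(value)
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def multivalue_line(ts_us: int, lat: float, lon: float,
                    values: Sequence[Number], compress: bool = True) -> str:
    """Build one multi-value line; '[!' asks for an uncompressed ENCODER."""
    opening = "[" if compress else "[!"
    # Assumption: elements are separated by single spaces.
    body = " ".join(format_value(v) for v in values)
    return f"{ts_us}/{lat}:{lon}/ {GTS_NAME}{{}} {opening} {body} ]"


if __name__ == "__main__":
    print(multivalue_line(1000000, 48.4, -4.5, [1, 2.5, True]))
    print(multivalue_line(1000000, 48.4, -4.5, [1, 2.5, True], compress=False))
```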
The number of lines is 57841, instead of 1272524 (22 channels, 22 times fewer lines).
Raw Bench Results
Here are the results in data points per second for each dataset. Remember, this is rather low-performance hardware:
Hardware | naïve | ordered | optimized | Warp 10™ 2.1 |
Raspberry Pi 3B, SD card | 7 200 | 13 000 | 19 000 | 77 000 (**) |
Raspberry Pi 3B, ram drive | 8 900 | 17 000 | 23 000 | 100 000 |
PINE64, SD card | 10 600 | 26 000 | 31 000 | 74 000 (**) |
PINE64, ram drive | 15 000 | 33 000 | 45 000 | 130 000 |
Raspberry Pi 4B, SD card | 11 000 | 16 500 | 32 000 | 140 000 (212 000 w/o compression) |
Raspberry Pi 4B, A2 SD card | 14 000 | 18 000 | 48 000 | 180 000 (300 000 w/o compression) |
Raspberry Pi 4B, ram drive | 17 000 | 23 000 | 55 000 | 195 000 (330 000 w/o compression) |
i7 ssd laptop | 113 000 | 340 000 | 430 000 | (*) |
(*) Ingestion time is between 1.2s and 1.7s (~1 000 000 data points/s). This 1.2M data points dataset is too small for a reliable figure.
(**) hardly repeatable results. Somewhere between 70 000 and 80 000. The Cortex A-53 SD card/CPU interface is the limit here.
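The order of magnitude quoted in note (*) can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the dataset holds the 1272524 data points given earlier as the naive line count, and it uses the two ingestion times from the note.

```python
DATA_POINTS = 1_272_524  # naive dataset: one data point per line


def rate(points: int, seconds: float) -> float:
    """Ingestion rate in data points per second."""
    return points / seconds


fastest = rate(DATA_POINTS, 1.2)  # shortest measured ingestion time
slowest = rate(DATA_POINTS, 1.7)  # longest measured ingestion time

if __name__ == "__main__":
    print(f"{slowest:,.0f} to {fastest:,.0f} data points/s")
```

The two bounds land at roughly 0.75 and 1.06 million data points/s. A spread that wide on such a short run is why the figure is only given as an approximation.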
Analysis
- We had not benched the Raspberry Pi in two years. On the Warp 10 website, we advertise an ingestion rate of 10k data points/s for edge applications, a figure obtained from a mix of hardware. We see that a Raspberry Pi 3B can now reach 20k data points/s.
- A Pine64 LTS outperforms the Raspberry Pi 3 everywhere the SD card is not limiting performance. Both are Cortex A-53 with nearly the same clock. Is it the 64 bits effect?
- I used the same SD card everywhere… The Cortex A-72 of the Raspberry 4 obviously removed the CPU/SD card bottleneck!
- With the Warp 10 2.1 multi-value ingestion format, A2 class SD card performance is really close to a RAM drive…
- SD card random access is still a bottleneck on all hardware: with the Warp 10 2.1 multi-value ingestion format, both the Raspberry Pi 3B and the PINE64 are limited to ~70k data points/s by the SD card.
Conclusion
The Warp 10 2.1 multi-value ingestion format is perfect for aligned data, typically in industrial applications. A CAN or a Modbus network can be stored "as is". Each request or frame is stored in a row. You need to keep the mapping/database/schema somewhere else for easy decoding. This mapping could be stored as JSON in an attribute of the GTS. If the mapping is subject to a major breaking change, just add the mapping major revision number in a label to create a new GTS.
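A minimal Python sketch of the decoding side, assuming the mapping is kept as a JSON document (for instance in a GTS attribute) that lists the channel names in row order. The JSON layout, the channel names, and the `decode_row` helper are hypothetical. The article only says that a mapping must exist and that it could be JSON.

```python
import json
from typing import Any, Dict, Sequence

# Hypothetical mapping document; "revision" mirrors the idea of carrying the
# mapping major revision number in a label.
MAPPING_JSON = '{"revision": 1, "channels": ["rpm", "speed_kmh", "coolant_temp_c"]}'


def decode_row(values: Sequence[Any], mapping_json: str) -> Dict[str, Any]:
    """Pair each value of a stored row with its channel name."""
    mapping = json.loads(mapping_json)
    names = mapping["channels"]
    if len(values) != len(names):
        raise ValueError(
            f"row has {len(values)} values, mapping expects {len(names)}"
        )
    return dict(zip(names, values))


if __name__ == "__main__":
    print(decode_row([2500, 87.5, 91], MAPPING_JSON))
```

Rejecting rows whose length does not match the mapping is a cheap way to notice that the frame layout changed without a new mapping revision.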
As no serious Warp 10 user relies on the naive GTS input format, we can reasonably say that 20k data points/s is now the average performance achieved for Warp 10 2.x edge applications.
By the way… This is a "one thread" benchmark. If several sources push data, multithreading will speed up ingestion!
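For readers who want to try several concurrent sources, here is a hedged Python sketch that pushes independent payloads from a thread pool. It assumes a standalone instance listening on `127.0.0.1:8080`, the standard `/api/v0/update` endpoint, and a write token passed in the `X-Warp10-Token` header. The token value, the worker count, and the helper names are placeholders. This setup was not part of the bench above, so no speed-up figure is implied.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

UPDATE_URL = "http://127.0.0.1:8080/api/v0/update"  # assumed local instance
WRITE_TOKEN = "YOUR_WRITE_TOKEN"  # placeholder, replace with a real write token


def http_send(payload: str) -> int:
    """POST one payload in GTS input format and return the HTTP status code."""
    request = urllib.request.Request(
        UPDATE_URL,
        data=payload.encode("utf-8"),
        headers={"X-Warp10-Token": WRITE_TOKEN},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def push_parallel(payloads: Sequence[str],
                  send: Callable[[str], int] = http_send,
                  workers: int = 4) -> List[int]:
    """Send each payload from its own worker thread, results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, payloads))
```

Each payload must be self-contained: a chunk that begins with an `=` continuation line has no previous line to refer to, so split the data on GTS boundaries, for example one payload per source.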
How does your time series database performance compare to that of Warp 10? Let us know!

Electronics engineer, fond of computer science, embedded solution developer.