AIS data are being integrated in many applications. Managing those data at scale can prove cumbersome. This post introduces TrueSea by SenX, a turnkey solution for AIS data built on the acclaimed Warp 10 Geo Time Series platform.
Data from the Automatic Identification System (AIS for short) allow vessels to be tracked around the world. They are being integrated into a growing number of applications used by insurance companies, defense agencies, and financial services.
These data are emitted periodically by ships at sea or at moorings and can represent millions of daily messages, bringing with them challenges of their own.
This blog post will present some of those challenges and how SenX can help you solve them, creating more value from all those AIS messages.
What is AIS?
The Automatic Identification System was created to increase the safety at sea by enabling ships to identify nearby vessels to avoid collisions and by facilitating Search and Rescue (SAR) operations. It was made mandatory by the 2002 IMO SOLAS agreement for all vessels above 300 GT.
Technically, AIS is a system that transmits short messages over VHF with information about the vessels.
There are three main types of information being transmitted.
- Static information about the ship, such as its name and IMO number, its size or type, and the location of the antenna acquiring the position, which is important on a 300 m tanker or cargo ship for example. This information is transmitted every 6 minutes.
- Voyage information, also transmitted every 6 minutes. This includes the type of cargo, the current destination of the ship and its ETA, its planned route, and some safety-related information.
- Lastly, dynamic information, which is transmitted more often when the ship is moving, at a rate that depends on its speed. This information includes the vessel's speed and course over ground, its heading, its rate of turn, navigation status, and other indicators.
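To make the speed-dependent rate of dynamic reports more concrete, here is a minimal sketch of the nominal Class A reporting intervals as they are commonly cited from ITU-R M.1371. The function name and its parameters are hypothetical, the handling of the exact 3, 14, and 23 knot boundaries is an assumption, and the values should be checked against the standard before relying on them.

```python
def class_a_report_interval_s(
    speed_knots: float,
    moored_or_anchored: bool = False,
    changing_course: bool = False,
) -> float:
    """Nominal interval, in seconds, between two dynamic reports (Class A).

    Hypothetical helper: the thresholds follow the commonly cited
    ITU-R M.1371 table, boundary handling is an assumption.
    """
    if moored_or_anchored:
        # A ship at anchor or moored reports every 3 minutes unless it drifts.
        return 180.0 if speed_knots <= 3 else 10.0
    if speed_knots > 23:
        return 2.0
    if speed_knots > 14:
        return 2.0 if changing_course else 6.0
    # 0 to 14 knots: 10 s, or 3 1/3 s while changing course.
    return 10.0 / 3.0 if changing_course else 10.0


if __name__ == "__main__":
    for speed in (0, 12, 18, 25):
        print(speed, "knots ->", class_a_report_interval_s(speed), "s")
```

A fast ship therefore produces roughly ninety times more dynamic messages than a moored one, which is one reason daily volumes grow so quickly.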
Where can AIS data be acquired?
AIS messages are transmitted via VHF radio waves in an unencrypted format. They can be picked up by any station with a line-of-sight view of the emitter's antenna. This has led to a proliferation of hobbyists in coastal areas deploying cheap SDR dongles with open-source software able to decode the AIS chatter.
This craze soon led to the creation of web services that offered to concentrate the data coming from individual stations and to redistribute the consolidated stream to all participants.
The two most popular such services are MarineTraffic and AISHub. The latter is purely a community-oriented service, while the former has turned into a commercial provider of the data while still redistributing it to participating stations.
More recently, companies such as ORBCOMM, Spire, or SpaceQuest have launched constellations of satellites carrying a payload capable of collecting AIS data. They are thus extending the available data to those coming from ships in the open sea.
AIS data can be purchased from those companies, either as a real time stream or as historical datasets. Their offerings mainly differ by the coverage they can provide and particularities of their own APIs. But the format of the data is common since it is dictated by the AIS standard.
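Because the message format is fixed by the standard, the first decoding steps look the same whatever the provider. The sketch below assumes the feed delivers raw NMEA 0183 `!AIVDM` sentences, which is a common delivery format but not a universal one (many providers send already decoded JSON or CSV). It unpacks the 6-bit "armored" payload and reads the message type and MMSI from a single-fragment sentence. The function names are ours, and the sample sentence is one widely used in AIVDM decoding documentation.

```python
def payload_to_bits(payload: str) -> str:
    """Turn the armored ASCII payload into a string of '0' and '1'."""
    bits = []
    for ch in payload:
        value = ord(ch) - 48
        if value > 40:
            value -= 8
        bits.append(format(value, "06b"))  # each character carries 6 bits
    return "".join(bits)


def decode_header(sentence: str) -> dict:
    """Read the fields shared by the position reports (types 1, 2, 3).

    Hypothetical helper: it ignores multi-fragment messages and does not
    verify the NMEA checksum.
    """
    fields = sentence.split(",")
    bits = payload_to_bits(fields[5])  # sixth field holds the payload
    return {
        "message_type": int(bits[0:6], 2),
        "repeat": int(bits[6:8], 2),
        "mmsi": int(bits[8:38], 2),
    }


SAMPLE = "!AIVDM,1,1,,B,177KQJ5000G?tO`K>RA1wUbN0TKH,0*5C"

if __name__ == "__main__":
    print(decode_header(SAMPLE))
```

For the sample sentence this yields message type 1 (a Class A position report) and MMSI 477553000. A production decoder would also validate the checksum, reassemble multi-part messages, and decode the remaining fields.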
What usage can be made of AIS data?
Over the years, AIS data have been used for a growing range of applications. The first uses were those for which AIS was invented: the safety of ships at sea. But soon after the data started being collected and consolidated by community websites, onshore uses started booming.
There are far too many uses of AIS data today to attempt to name them all. But we can list a few, such as e-ODyn computing oceanic currents, Quiet Oceans forecasting ocean noise, or Kpler providing energy market intelligence.
What is a typical pipeline around AIS data?
The provider or pool of providers you have selected will make their AIS data available to you via a real-time feed and/or sets of downloadable files. Your first task will be to listen to those feeds and retrieve those files. You will then immediately need to start storing those data to build your AIS trove.
This storage layer will need to be constructed with usability in mind. It means that the data must be accessible by the applications you intend to build.
Your real-time feed may need to be connected to a stream processing engine for on-the-fly analysis and decision making.
And lastly, your historical datasets will need to be available for large scale batch processing, whether for detecting specific situations in the past, creating custom visualizations, or determining which ships went where in order to gather intelligence from past voyages.
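The stages described above (ingestion, storage, stream processing, batch access) can be sketched in a few lines. This is a toy, in-memory illustration only: the `Position` and `MiniPipeline` names are hypothetical, and a real deployment would put a durable time series store and a proper stream processor behind the same interfaces.

```python
from bisect import insort
from collections import defaultdict
from typing import Callable, NamedTuple


class Position(NamedTuple):
    timestamp: int  # seconds since the epoch, first field so tuples sort by time
    mmsi: int
    lat: float
    lon: float


class MiniPipeline:
    """Hypothetical toy pipeline: store every position, notify stream handlers."""

    def __init__(self) -> None:
        # mmsi -> list of positions kept sorted by timestamp
        self._store: dict[int, list[Position]] = defaultdict(list)
        self._subscribers: list[Callable[[Position], None]] = []

    def subscribe(self, handler: Callable[[Position], None]) -> None:
        """Register an on-the-fly handler (the stream processing side)."""
        self._subscribers.append(handler)

    def ingest(self, position: Position) -> None:
        """Store a position, then push it to every stream handler."""
        insort(self._store[position.mmsi], position)
        for handler in self._subscribers:
            handler(position)

    def history(self, mmsi: int, start: int, end: int) -> list[Position]:
        """Batch access: positions of one ship within [start, end]."""
        return [p for p in self._store.get(mmsi, []) if start <= p.timestamp <= end]


if __name__ == "__main__":
    pipeline = MiniPipeline()
    pipeline.subscribe(lambda p: print("live:", p))
    pipeline.ingest(Position(1_600_000_060, 477553000, 48.39, -4.49))
    pipeline.ingest(Position(1_600_000_000, 477553000, 48.38, -4.50))
    print(pipeline.history(477553000, 1_600_000_000, 1_600_000_100))
```

Even this toy version shows the tension discussed next: the store is organized for one access pattern (per ship, by time), and any other question requires scanning everything.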
What are the challenges of working with AIS data?
Right after connecting to your first AIS feed and getting the first few files, you may realize that these data are more difficult to work with than you expected.
Your database of choice fills up rapidly. The developers demand more performance. This leads to the creation of more and more indices, which in turn increases the rate at which your disks fill.
The way your applications need to access data usually also evolves rapidly. You built your system with requirements for fetching data from a bounding box, but you now get requests for identifying specific types of ships which went through important trade routes such as the Panama and Suez canals over the course of the COVID-19 lockdown, or for finding how ships called at various world ports over an extended period of time.
These are new uses of the data that you had not planned for when you first designed your system. They now occupy your teams more than you had expected and send your cloud bills through the roof, as you need more and more computing power to mitigate bad design choices.
Your team will quickly need to learn more than they want about spatio-temporal indexing, space-filling curves, and efficient data representations. All of this so that you can use data that you pay for every month.
This type of situation adds a lot of stress to many teams:
- Product teams which cannot deliver the features they promised to their customers,
- Data teams which struggle to scale their databases,
- And finance teams which see costs increasing all over the place, with expensive data not being put to work to produce value as originally planned.
How can SenX help?
You can decide that your team has the time and ability to become top-notch spatio-temporal gurus. But if that is not the case, or if that objective is not aligned with your business goals, we think that SenX can help.
As it happens, we distribute Warp 10 as open source, so you can grab it, thank us for all the fish, and put it in the hands of your already stressed-out data teams, who will surely appreciate getting an extra tool to master.
Or you can decide to contact us to help you solve your AIS issues.
Over the years we have built quite an expertise in handling spatio-temporal data at scale, whether for aircraft manufacturers, offshore racing teams, or defense agencies. AIS is such a good fit for our technologies that we have created an offering dedicated to the management of those data, TrueSea.
TrueSea is a turnkey solution for AIS data pipelines, handling the ingestion of AIS feeds and files. It makes that data available through powerful APIs and to batch processing tools such as Spark to enable you to perform large scale analytics to power your applications.
TrueSea is compatible with major AIS data providers and can be easily adapted to emerging ones. The solution can also mix AIS data with weather, aeronautical, or road traffic data for a global approach.
TrueSea also comes with visualization options such as our Marauder's Map-like view, which can display millions of positions from tens of thousands of ships.
If you are interested in having a conversation about AIS data and how we can help you get the most out of your investment, send us an email and we will get back to you promptly to schedule a video call.
Download the documentation to learn more about TrueSea.
Co-Founder & Chief Technology Officer