UFO datasets, the truth is out there...

People regularly talk about UFOs, and some of us believe we have seen one. The phenomenon has persisted since World War II. UFO sightings from across the world make a nice dataset for discovering and experimenting with data manipulation and correlation in Warp 10.

There is a UFO dataset containing about 80,000 sightings from around the world, each with a date, a location, and some extra information.

Upload the UFO dataset

First import it into Warp 10 (thanks to the sandbox, you can do it easily):

  • Go to https://sandbox.senx.io/ and create a Sandbox. Save the 3 generated tokens somewhere (especially the read and write tokens)
  • Download the UFO dataset: ufo.zip
  • Unzip it and push data into the Sandbox:
curl \
  -H "X-Warp10-Token: <YOUR WRITE TOKEN>" \
  -H "Transfer-Encoding: chunked" \
  -T ufo.gts \
  'https://sandbox.senx.io/api/v0/update'

Your data is now in the Sandbox.

Here is the data model:

  • class name: sighting.ufo
  • labels:
    • state (mostly for the US, but I do not want to spoil some analysis results)
    • country
    • shape (the shape of what they see, like a triangle, a circle, a pink duck, and so on)
  • datapoint: the date of the observation as the timestamp, latitude and longitude as the location, and the duration of the observation as the value

Read more about what in the world a Time Series Database is.
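
To make the data model concrete, here is a small Python sketch that formats one datapoint as a line of the Warp 10 GTS input format (`TS/LAT:LON/ELEV CLASS{LABELS} VALUE`, with the elevation left empty). The helper name `gts_line` and every sample value are hypothetical, and the timestamp assumes the platform's default microsecond time unit; the lines in ufo.gts may differ in detail.

```python
from urllib.parse import quote

def gts_line(ts_us, lat, lon, classname, labels, value):
    """Format one datapoint as a GTS input line without elevation (hypothetical helper)."""
    # Labels are key=value pairs separated by commas; percent-encode special characters
    label_str = ",".join(
        f"{quote(k, safe='')}={quote(v, safe='')}" for k, v in sorted(labels.items())
    )
    return f"{ts_us}/{lat}:{lon}/ {classname}{{{label_str}}} {value}"

line = gts_line(
    ts_us=1_000_000_000_000_000,   # hypothetical timestamp, microseconds since the Unix epoch
    lat=47.61, lon=-122.33,        # hypothetical location
    classname="sighting.ufo",
    labels={"state": "wa", "country": "us", "shape": "triangle"},
    value=300,                     # hypothetical duration
)
print(line)
```

Each distinct combination of class name and labels forms its own series, which is why the analyses below group series by label.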

First analysis

First, I want to know which country has the most sightings. You can use either https://studio.senx.io/ or our VSCode plugin.

With WarpScript, it takes 4 steps:

  • Fetch data
  • Bucketize it in a single big bucket and compute the datapoints count
  • Reduce by country and compute the sum of datapoints counts
  • Prettify the result
[{"us":64194,"ca":2974,"":9618,"au":536,"de":105,"gb":1894}]

Ok, the winner is surprisingly the USA.

This could mean one of two things:

  1. A large number of Americans are reptilian aliens
  2. The US government is in cahoots with aliens and is using their technology to maintain their status as a global superpower
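
For readers who want to see the four steps of this first analysis spelled out, here is a minimal Python sketch on a tiny made-up sample. It is not the WarpScript used to produce the result above; the sample series and the function name `sightings_per_country` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample: one list of observation dates per series,
# keyed by its (country, state, shape) labels
series = {
    ("us", "wa", "triangle"): ["1999-07-04", "2001-12-25"],
    ("us", "ca", "circle"): ["2008-07-04"],
    ("ca", "on", "circle"): ["2003-08-15"],
    ("gb", "", "light"): ["2005-06-21"],
}

def sightings_per_country(series):
    # Step 2: a single big bucket per series, holding its datapoint count
    per_series = {labels: len(points) for labels, points in series.items()}
    # Step 3: group the series by their country label and sum the counts
    totals = defaultdict(int)
    for (country, _state, _shape), count in per_series.items():
        totals[country] += count
    # Step 4: prettify as a plain dict, most sightings first
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))

result = sightings_per_country(series)
print(result)
```

Counting per series first and summing afterwards mirrors the bucketize-then-reduce split: the first step works on each series alone, the second combines series that share a label.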

Second analysis

How do sightings evolve over time, with a monthly aggregation? It takes 3 steps:

  • Fetch data
  • Bucketize it by month and compute the datapoints count (i.e., a monthly count aggregation)
  • Reduce by country and compute the sum of datapoints counts
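
As a rough illustration of these three steps outside Warp 10, the following Python sketch counts sightings per country and per calendar month. The sample data and the function name `monthly_counts` are hypothetical, and each bucket is identified here by the first day of its month, which is a choice made for this sketch only.

```python
from collections import Counter
from datetime import date

# Hypothetical sample standing in for the fetched data: (observation date, country)
sightings = [
    (date(1999, 7, 4), "us"), (date(1999, 7, 20), "us"), (date(1999, 7, 9), "ca"),
    (date(1999, 12, 25), "us"), (date(2000, 1, 1), "gb"),
]

def monthly_counts(points):
    counts = Counter()
    for day, country in points:
        bucket = day.replace(day=1)        # monthly bucket, named after its first day
        counts[(country, bucket)] += 1     # count per bucket, summed per country
    return counts

counts = monthly_counts(sightings)
print(counts[("us", date(1999, 7, 1))])
```
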

[Figure: Sightings by month and by country]

OK, sightings happen mostly at the end of the century, and there seems to be some kind of seasonality. Now, let's focus on the USA: in which US state do UFO sightings occur the most?

{
    "ca": 8787,
    "wa": 3882,
    "fl": 3787,
    "tx": 3410,
    "ny": 2936,
    "il": 2395
}

Cross data

What are the top states for UFO sightings relative to state population?

{
    "wa": 0.052348325379498385,
    "mt": 0.044920686023398645,
    "ak": 0.04181862778606555,
    "or": 0.04153957335086212,
    "me": 0.04131968009118826,
    "vt": 0.041084898090194194
}
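
The idea behind this ranking is a simple division of each state's sighting count by its population. The post does not state the unit, but the values above look consistent with a percentage of the population. Here is a hedged Python sketch: the sighting counts come from the previous result, while the population figures are rough, hypothetical round numbers and not the population dataset used for the result above.

```python
# Sighting counts taken from the per-state result earlier in this post
sightings = {"ca": 8787, "wa": 3882, "fl": 3787}
# Hypothetical, rounded populations used only for illustration
population = {"ca": 39_000_000, "wa": 7_400_000, "fl": 21_000_000}

def per_capita_percent(sightings, population):
    # Sightings as a percentage of the state population, highest first
    ratios = {
        state: 100.0 * count / population[state]
        for state, count in sightings.items()
        if state in population
    }
    return dict(sorted(ratios.items(), key=lambda kv: kv[1], reverse=True))

ranking = per_capita_percent(sightings, population)
print(ranking)
```

Even with these approximate inputs, California drops behind Washington once population is taken into account.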

Seasonality

Is there a particular period for UFO flights? Here is the process:

  • Fetch data
  • Bucketize and fill gaps
  • Reduce
  • Split by year
  • Shift the time to 01/01/1970
  • Bucketize to align timestamps and reduce

[Figure: Sightings count by month]

Well, they like Christmas and the summer.
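
The process above folds every year onto a single reference year so that the same months line up. A minimal Python sketch of that idea follows; the sample dates and the function name `seasonal_profile` are hypothetical, and summing the aligned months is an assumption, since the post does not say which reducer is used in the last step.

```python
from collections import Counter
from datetime import date

# Hypothetical sample of observation dates
days = [
    date(1998, 7, 4), date(1998, 12, 25),
    date(1999, 7, 10), date(1999, 7, 30), date(2000, 12, 24),
]

def seasonal_profile(days):
    # Bucketize by month and count
    monthly = Counter(d.replace(day=1) for d in days)
    # Fill gaps: every month of every covered year gets a value, 0 if empty
    years = range(min(d.year for d in days), max(d.year for d in days) + 1)
    filled = {
        date(y, m, 1): monthly.get(date(y, m, 1), 0)
        for y in years for m in range(1, 13)
    }
    # Split by year, shift each year to 1970, then sum the aligned months
    profile = Counter()
    for bucket, count in filled.items():
        profile[bucket.replace(year=1970)] += count
    return dict(sorted(profile.items()))

profile = seasonal_profile(days)
print(profile[date(1970, 7, 1)], profile[date(1970, 12, 1)])
```

Filling the gaps first matters because a month with no sighting would otherwise be missing from some years and the aligned series would not have the same ticks.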

Correlation

Is there a correlation with alien movies?

Here is a dataset from TheMovieDB for the "UFO" search term: movies.zip. Download it and upload the data.

Now I would like to compare the evolution of each curve:

UFO sightings:

[Figure: UFO sightings]

UFO movie releases:

[Figure: UFO movie releases]

There seems to be something. Now, let's put both series on the same chart:

[Figure: UFO movies and UFO sightings]

Oh, they are not on the same scale. Obviously, there are fewer movie releases per year than UFO sightings. Here is a way to compare them:

[Figure: UFO movies and UFO sightings normalized]
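
One common way to bring two series onto the same scale is min-max normalization, which maps each series to the range 0 to 1. The Python sketch below assumes that technique; the post does not say which normalization the chart above uses, and the yearly counts and the function name `min_max` are hypothetical.

```python
def min_max(series):
    """Rescale the values of a {tick: value} series to the range [0, 1]."""
    lo, hi = min(series.values()), max(series.values())
    span = hi - lo
    # A constant series has no span; map it to 0.0 to avoid dividing by zero
    return {k: (v - lo) / span if span else 0.0 for k, v in series.items()}

# Hypothetical yearly counts on very different scales
sightings_per_year = {1995: 1000, 1996: 1500, 1997: 3000}
movies_per_year = {1995: 2, 1996: 3, 1997: 6}

norm_sightings = min_max(sightings_per_year)
norm_movies = min_max(movies_per_year)
print(norm_sightings, norm_movies)
```

After rescaling, only the shape of each curve remains, so the two can share one axis.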

We have to look a bit further by zooming in on two eras:

The last century:

[Figure: UFO movies and UFO sightings during the 20th century]

It seems that sightings are a consequence of Hollywood movie production. In fact, this is a common social theory about the UFO observation phenomenon. But for this century:

[Figure: UFO movies and UFO sightings during the 21st century]

It seems that Hollywood reacts to the public by producing movies about a widespread interest.
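
Reading "who follows whom" off a chart is subjective. One way to quantify it, which is not what this post does, is to compute a Pearson correlation between one series and a time-shifted copy of the other. The Python sketch below shows the idea on made-up yearly counts; the function names and the data are hypothetical, and a high lagged correlation still would not prove causality.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag], for lag >= 0."""
    if lag:
        leader, follower = leader[:-lag], follower[lag:]
    return pearson(leader, follower)

# Hypothetical yearly counts: sightings echo the movies one year later
movies = [1, 3, 2, 5, 4, 6, 8, 7]
sightings = [0, 10, 30, 20, 50, 40, 60, 80]

same_year = lagged_correlation(movies, sightings, 0)
one_year_later = lagged_correlation(movies, sightings, 1)
print(same_year, one_year_later)
```

Swapping the two arguments tests the opposite direction, where sightings lead and movies follow.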

Final thoughts about this UFO dataset

I am not a sociologist but, with the power and the simplicity of WarpScript, we have done some data manipulations.

These UFO datasets were just a pretext to introduce common time series concepts.

In a future post, I will correlate sighting locations with the positions of military bases.

Live long and prosper.

With the help of https://www.kaggle.com/hakeemtfrank/ufo-sightings-data-exploration

Part 2 of UFO sightings datasets is right here.