Geo Visualisation with ELK Stack
Would you like to know how to visualise your data on a map? You need geo-referenced data in your dataset, of course, but if you already have that kind of data, how can you show it on a geographic map? In this post, we are going to see how to build a geo visualisation for our dataset with the ELK Stack. Don’t spend more time and let’s start with it!
Preconditions
In this lab:
- We are using the free cloud instance from https://cloud.elastic.co/
- We are using a public dataset available on my personal GitHub
Step 01. Creating the Index and Index Map
In this lab, we are going to take our dataset available on GitHub and publish it to Elasticsearch with Logstash. But before that, we need to make some preparation steps. The first is to create an index to group all the records that we are going to publish, and the second is to create a “mapping rule” to identify the geo-referenced data in the dataset, in other words, “the geo point”.
So, let’s start by creating the index with the mapping in Elasticsearch, using the Dev Tools console in Kibana. With Dev Tools we can write a request against the Elasticsearch API, for example, to create an index:
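As a sketch, the request could look like this (the index name matches the one used later in the Logstash output; the essential part is mapping the “location” field as geo_point, while the exact request body may vary with your Elasticsearch version):

```
PUT uela-dataset-01-geo-all
{
  "mappings": {
    "properties": {
      "location": { "type": "geo_point" }
    }
  }
}
```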
We can see how the “uela-dataset-01-geo-all” index is created:
At this moment the index is empty; we haven’t published any records to it yet. But to be able to relate all the data we are going to publish to that index, we also need to define an “index pattern”, again from Kibana, as we can see in the following screen:
Step 02. Preparing Logstash ~ The config file
Now, we are going to prepare Logstash to run. Let’s look at the config file that we are using for the dataset.
In this config file, the idea is to ingest four different datasets, each one from a different geographic place. So, this section of the script shows how two new fields with geolocation info are added to every record, associated to each one of the four datasets:
In other words, in the filter section I add a mutate. In this simplified example, with only one dataset about the “La Boca” neighbourhood, we add the geolocation of La Boca (-34.5834529467442, -58.4053598727298) to every record of the dataset “DATASET_01_2009_2017_BO.csv”:
input {
  file {
    type => "dataset_01_2009_2017_BO"
    path => "${ELK_STACK_UELA_DATASET}/DATASET_01_2009_2017_BO.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  # Attach the fixed "La Boca" coordinates to every record of this dataset
  mutate {
    add_field => { "[location][lat]" => "-34.5834529467442" }
    add_field => { "[location][lon]" => "-58.4053598727298" }
  }
  # Convert the coordinates from string to float
  mutate {
    convert => { "[location][lat]" => "float" }
    convert => { "[location][lon]" => "float" }
  }
}

output {
  elasticsearch {
    index => "uela-dataset-01-geo-all"
    hosts => "${ELK_STACK_CLOUD_HOST}"
    user => "${ELK_STACK_CLOUD_USER}"
    password => "${ELK_STACK_CLOUD_PASS}"
  }
  stdout {}
}
It’s important to note that the index in the output, “uela-dataset-01-geo-all”, is the same one defined in Kibana in step one:
Step 03. Run Logstash to ingest data
And finally, last but not least, we will run Logstash to ingest the CSV file data with the config and the index that we have just prepared:
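Assuming the config above is saved as, say, uela-geo.conf (the filename, paths, and credential values below are all hypothetical placeholders), and remembering that the environment variables referenced in the config must be exported first, the run looks roughly like:

```shell
# Hypothetical values: set the variables the config file expects
export ELK_STACK_UELA_DATASET="/path/to/datasets"
export ELK_STACK_CLOUD_HOST="https://your-deployment.es.io:9243"
export ELK_STACK_CLOUD_USER="elastic"
export ELK_STACK_CLOUD_PASS="your-password"

# Run Logstash with our config file
bin/logstash -f uela-geo.conf
```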
After a successful run, we have the index alive:
And, not only that, but also, in the fields we have “location” with the type “geo_point” … That is great! And it’s thanks to our initial mapping config in step 1.
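We can also double-check both points from Dev Tools; a sketch of the two requests:

```
// How many documents were indexed
GET uela-dataset-01-geo-all/_count

// "location" should appear with type "geo_point"
GET uela-dataset-01-geo-all/_mapping
```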
With a field of type “geo_point”, we are now ready to make a map visualisation! So, let’s see how we can do that:
Step 04. Query the data in the geo map
With the data indexed, and thanks to the geo_point mapping that we configured in the first step, we are now able to show the data in a geo map dashboard:
On the right panel, we can see how to configure a new map visualisation, with the aggregated metrics of the data and, most important for this post, with the geohash location information.
We simply select a “Geohash” aggregation for this kind of visualisation, using the “location” field that we already mapped to the geo_point type.
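Behind the scenes, this kind of panel relies on Elasticsearch’s geohash grid aggregation; a hand-written equivalent in Dev Tools could look roughly like this (the aggregation name “points” and the precision value are assumptions):

```
GET uela-dataset-01-geo-all/_search
{
  "size": 0,
  "aggs": {
    "points": {
      "geohash_grid": {
        "field": "location",
        "precision": 5
      }
    }
  }
}
```

Each bucket in the response carries a doc_count, which is the number of documents grouped under that cell of the map.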
Let’s finish the last step from the post showing some queries that we could make with the geo referenced data:
Step 05. Sample Queries ~ How many docs per geo point?
a) How many documents are under each geo point?
All four points have the same size: 37445, and 37445 × 4 = 149780, so we can verify that all the documents are reflected on the geo graph.
Let’s add a custom visualisation with only the columns I am interested in:
Final words
Ok! We will stop the post at this point …
We have gotten a highlight of how to ingest data from a CSV file into Elasticsearch, and how to map it to geo points so we can visualise it on a map in Kibana. We saw that we first need to define a mapping to indicate the geolocation field, and that we need to define the index to group the records.
After that, we showed how to configure Logstash to add geolocation information to every record while it is ingested into Elasticsearch, and, at last, we made a geographic visualisation of the data and ran some queries against the geo-indexed dataset.
I hope you enjoyed reading and learning something new …
See you on a next one!
Regards,
Pablo
Resources
Public available
- A nice post about geolocation types in Elasticsearch
- A nice post about the basics of Elasticsearch mappings
- Our dataset available on my personal GitHub
- Elastic Cloud with a free 14-day trial of the ELK Stack: https://cloud.elastic.co/
From the Official Doc
- Official Doc about Geo-point field type
- Official Doc about Mapping Definitions on Elasticsearch
- Official Doc about Grok to transform the data