IoT Topic 1 : the №1 Time Series Database — InfluxDB

Jul 2, 2021

Why this topic?

I have been designing and building IoT solutions for a while, and it is time to write down some of what I have learned before I forget it.

I have also been asked what a Time Series Database is, why it is important, what InfluxDB is and how to use it, so here is this topic.

Who is the audience?

This article is for readers who want to understand what InfluxDB is, see how it is used, or even try it hands-on with minimum effort.

What is a Time Series Database (TSDB)?

A time series database (TSDB) is a database optimised for time-stamped or time series data. Time series data are simply measurements or events that are tracked, monitored, down-sampled, and aggregated over time. This could be server metrics, application performance monitoring, network data, sensor data, events, clicks, trades in a market, and many other types of analytics data.

A time series database is built specifically for handling time-stamped metrics, events and measurements. A TSDB is optimised for measuring change over time. The properties that make time series data very different from other data workloads are data lifecycle management, summarisation, and large range scans of many records.
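To make "summarisation" concrete, here is a minimal Python sketch of down-sampling: raw points are grouped into fixed time windows and each window is reduced to its mean. The function name `downsample` and the sensor readings are made up for illustration; a real TSDB does this work internally and at far larger scale.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_seconds):
    """Average (unix_timestamp, value) pairs over fixed-size time windows."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of the window it falls into.
        window_start = ts - (ts % window_seconds)
        buckets[window_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical sensor readings, one every 10 seconds.
readings = [(0, 20.0), (10, 22.0), (20, 24.0), (30, 30.0), (40, 32.0), (50, 34.0)]

# Six raw points become two 30-second averages.
print(downsample(readings, 30))  # [(0, 22.0), (30, 32.0)]
```

Keeping only the averaged points for older data is one simple form of the data lifecycle management mentioned above.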

Can we use SQL or NoSQL databases in place of a TSDB?

Technically, it is possible to use a traditional SQL or NoSQL database for time series data. However, once the volume of time-stamped data grows large enough, performance problems tend to appear in both kinds of database, on both the read and the write side.
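As a small illustration of the "technically possible" part, the sketch below stores time-stamped rows in an ordinary SQL table using Python's built-in `sqlite3` module, then runs a range scan with a windowed average. The table name `response_times`, its columns and the values are hypothetical, chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE response_times (ts INTEGER, server TEXT, response_ms REAL)")
# An index on the timestamp keeps range scans from reading the whole table.
conn.execute("CREATE INDEX idx_response_times_ts ON response_times (ts)")

# Hypothetical data: one point every 10 seconds for one server.
rows = [(t, "server-a", 100.0 + t) for t in range(0, 60, 10)]
conn.executemany("INSERT INTO response_times VALUES (?, ?, ?)", rows)


def mean_per_window(conn, start, stop, window):
    """Average response_ms per fixed window for timestamps in [start, stop)."""
    query = """
        SELECT (ts / ?) * ? AS window_start, AVG(response_ms)
        FROM response_times
        WHERE ts >= ? AND ts < ?
        GROUP BY window_start
        ORDER BY window_start
    """
    return conn.execute(query, (window, window, start, stop)).fetchall()


print(mean_per_window(conn, 0, 60, 30))  # [(0, 110.0), (30, 140.0)]
```

This is fine for a handful of rows. The windowing, retention and indexing all have to be written and tuned by hand, which is the work a TSDB is designed to take over as the data grows.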

Because a TSDB is built and optimised for exactly these rapidly growing data scenarios, it is generally expected to perform better, although some traditional databases challenge that claim.

Either way, a TSDB is highly recommended for many scenarios, such as IoT, web clicks, events and market trades.

What is InfluxDB?

InfluxDB was built from the ground up to be a purpose-built Time Series Database; i.e., it was not repurposed to be time series. Time was built-in from the beginning.

InfluxDB is well supported by the open source community and is ranked as the most popular TSDB by DB-Engines. Thanks to its popularity and long history, there are plenty of materials and libraries covering most programming languages and most cloud platforms.

Ranking based on social media search popularity, by DB-Engines

How to Install InfluxDB?

InfluxDB can be installed on Linux, Windows and macOS for free. It can also be deployed in Docker or Kubernetes, which is how it is usually deployed on AWS, Azure and GCP. You can find more information in the installation section of its online documentation.

How to try InfluxDB, quickly?

Besides providing InfluxDB as a single binary, InfluxData Inc., the owner of InfluxDB, also provides two commercial offerings: InfluxDB Cloud and InfluxDB Enterprise.

InfluxDB Cloud is a fast, elastic, serverless time series platform as a service that is easy to use and has usage-based pricing.
The InfluxDB Enterprise subscription turns any InfluxData instance into a production-ready cluster that can run anywhere.

If you want to try InfluxDB quickly, the ‘Free Plan’ of InfluxDB Cloud is recommended. It is usually enough for experiments and even for initial development.

How to Register in InfluxDB Cloud

You can navigate to InfluxDB Cloud 2 to sign up; registration does not require a credit card. You can sign up with either a Google/Microsoft account or an email address.

You are required to choose a cloud platform for the data store. I chose AWS, but you can choose whichever cloud provider you want; the deployment process is 100% transparent. And of course the best part is that it is completely free. Thumbs up to InfluxData Inc.

Choose the cloud platform for the data store

After about a minute, you will be taken to a deployed InfluxDB Cloud instance hosted in AWS, at a link of the form https://xxx.aws.cloud2.influxdata.com/orgs/xxx. This link is only available to your account.

To get started quickly, click ‘Explore with demo data first’ to populate a demo bucket.

Welcome page when navigating to the instance for the first time

Generate Demo data in InfluxDB Cloud Instance

You can safely click ‘Try it out’ in the Demo Data Generation wizard dialog. It creates a bucket with data related to website activity.

After a few seconds, a new bucket called ‘Website Monitoring Bucket’ is created.

Data Explorer in InfluxDB Cloud

You can use ‘Data Explorer’ to explore what this demo data looks like and how to query it.

An example is shown in the screenshot below, with the following filters:

the selected bucket is ‘Website Monitoring Bucket’
the column ‘_measurement’ must be “http_response”
the column ‘_field’ must be “response_time”
the column ‘server’ can be any of the three servers

Also remember to switch on the ‘View Raw Data’ toggle and click ‘Submit’. You will see the raw data in this demo bucket.

This raw dataset is simple: it is a time series of response times from three servers, at a frequency of one data point every 10 seconds.
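For a feel of what one such data point looks like when written to InfluxDB, here is a minimal Python sketch that formats a point in InfluxDB line protocol (`measurement,tag=value field=value timestamp`). The helper names, the response time and the timestamp are assumptions made for illustration, not values taken from the demo bucket. The sketch also assumes the measurement name contains no commas or spaces, so it is not escaped.

```python
def escape_tag(text):
    """Escape the characters line protocol treats as separators in tags and field keys."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement, tags, field, value, timestamp_ns):
    """Format one float-valued point; the timestamp is in nanoseconds."""
    tag_part = ",".join(
        f"{escape_tag(key)}={escape_tag(val)}" for key, val in sorted(tags.items())
    )
    return f"{measurement},{tag_part} {escape_tag(field)}={float(value)} {timestamp_ns}"


# Arbitrary example timestamp and a made-up response time.
line = to_line_protocol(
    "http_response",
    {"server": "https://influxdata.com"},
    "response_time",
    0.25,
    1625184000000000000,
)
print(line)
# http_response,server=https://influxdata.com response_time=0.25 1625184000000000000
```

In this layout the server is a tag, which is why it shows up as its own column to filter on, while the response time is the field value stored at each timestamp.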

You can click the ‘Customise’ button to check the X and Y axis settings.

Toggle off ‘View Raw Data’ and you will see a chart of the response times from the three servers over the past hour.

You can click ‘Query Builder’ to see the underlying query that is sent to InfluxDB.

This query uses a language called ‘Flux’, which is an alternative to InfluxQL (a query language widely used with InfluxDB) and other SQL-like query languages for querying and analysing data. See the online documentation for details about Flux.

from(bucket: "Website Monitoring Bucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "http_response")
  |> filter(fn: (r) => r["_field"] == "response_time")
  |> filter(fn: (r) => r["server"] == "https://docs.influxdata.com" or r["server"] == "https://influxdata.com" or r["server"] == "https://influxdays.com")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> yield(name: "mean")
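The `v.timeRangeStart`, `v.timeRangeStop` and `v.windowPeriod` values in the query above are variables supplied by the InfluxDB UI. If you want to reuse the query outside the UI, you have to fill in explicit values yourself. The Python sketch below assembles an equivalent Flux query string; the function name `build_flux_query` and the defaults (`-1h` range, `10s` window) are my own choices for illustration. It does no escaping, so it is only suitable for trusted inputs.

```python
def build_flux_query(bucket, measurement, field, servers, start="-1h", window="10s"):
    """Build a Flux query that averages one field per window for the given servers."""
    server_filter = " or ".join(f'r["server"] == "{server}"' for server in servers)
    lines = [
        f'from(bucket: "{bucket}")',
        f"  |> range(start: {start})",
        f'  |> filter(fn: (r) => r["_measurement"] == "{measurement}")',
        f'  |> filter(fn: (r) => r["_field"] == "{field}")',
        f"  |> filter(fn: (r) => {server_filter})",
        f"  |> aggregateWindow(every: {window}, fn: mean, createEmpty: false)",
        '  |> yield(name: "mean")',
    ]
    return "\n".join(lines)


query = build_flux_query(
    "Website Monitoring Bucket",
    "http_response",
    "response_time",
    ["https://docs.influxdata.com", "https://influxdata.com", "https://influxdays.com"],
)
print(query)
```

The resulting string can then be passed to whichever client you use to talk to InfluxDB.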

It is also believed that Flux overcomes many InfluxQL limitations by using functional language patterns. Check out the comparison of the two languages at this link.

Dashboard in InfluxDB Cloud

You can save your query by clicking the ‘save as’ button at the top right; you will be asked to save it as a dashboard cell.

Once you save it, you get a default dashboard page that directly uses the query you built in ‘Data Explorer’.

In the dashboard, you can move cells around, add new cells, delete cells and configure cells.

Each cell is configured to execute one specific query. You can change the query to change the underlying data of the cell.

After all these steps, you have a dashboard that lets you monitor the response time of the three servers in real time.

Last words about Deployment

It is still common to see InfluxDB deployed in a Docker container in the cloud, for example with AWS ECS on AWS EC2. This leaves a big burden on the cloud admin, who has to look after the server, Docker, the storage and much else.

Like other as-a-service offerings, InfluxDB Cloud and InfluxDB Enterprise provide easier and more reliable options for running InfluxDB.

Meanwhile, the cloud providers offer their own in-house time series databases, which deserve your attention as well. AWS provides Amazon Timestream, while Azure promotes Azure Time Series Insights. Choosing one of those in-house options is a safe choice.

Good luck with the journey of Time Series :)