We built DalmatinerDB for one purpose: to be able to ingest and query more metrics than any other metric store that exists. I am rather confident that we succeeded with that goal. Part of why this has been possible is that it is built for simplicity: we use the same tree structure for metrics that Graphite uses, we use flat files instead of an elaborate database to store metrics, and we leverage existing technology like ZFS and Riak Core instead of trying to roll our own tools for clustering, compression, file integrity and so on. All of that removes overhead.
Last week the team of dalaloop.io came over and we sat together to discuss and work on tags, or labels, or dimensions, whatever you want to call them. They are very nice and helpful, but they are not necessarily simple. They conflict with the file layout and data structures in the DalmatinerDB backend, which is already not fast at looking up metrics with wildcards. The good news is that the modular design of DalmatinerDB allows us to keep this kind of problem out of the backend entirely.
So, in the spirit of using proven technology, tags are implemented on top of Postgres. Relational queries like this are, after all, what Postgres is built for, and it does a damn fine job at them. Even complex relations between tags can be expressed as a SQL query, and between sprinkling some indices over the data and the optimizer doing its job, it's incredibly fast.
So what do tags look like? They are a relation between what we call a collection and a metric in Postgres and a bucket and key in the DalmatinerDB backend. Each metric can have one or more tags. A tag consists of three elements: a namespace, a tag name and a tag value. All of them are strings, and the namespace can be empty.
We chose to make namespaces first-class elements of the tag system as it allows for more powerful features down the road, such as metrics 2.0 metadata or privately prefixed tags for improving the speed of glob lookups. Even with over 100,000 tags stored in the database, queries complete within milliseconds.
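To make the tag model a bit more concrete, here is a minimal sketch of how such a relation could be laid out and queried. This is not the actual DalmatinerDB schema: the table names, column names, index and sample keys are all made up for illustration, and SQLite (from the Python standard library) stands in for Postgres so the example runs without a server.

```python
import sqlite3

# Hypothetical schema: names are illustrative only, not DalmatinerDB's own.
# SQLite is used here as a stand-in for Postgres.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metrics (
    id          INTEGER PRIMARY KEY,
    collection  TEXT NOT NULL,   -- logical grouping on the Postgres side
    metric      TEXT NOT NULL,   -- logical metric name, e.g. action.count
    bucket      TEXT NOT NULL,   -- bucket in the DalmatinerDB backend
    backend_key TEXT NOT NULL,   -- key inside that bucket
    UNIQUE (collection, metric, bucket, backend_key)
);
CREATE TABLE tags (
    metric_id INTEGER NOT NULL REFERENCES metrics(id),
    namespace TEXT NOT NULL DEFAULT '',  -- may be empty
    name      TEXT NOT NULL,
    value     TEXT NOT NULL,
    UNIQUE (metric_id, namespace, name)
);
-- One of the "sprinkled" indices: find metrics by tag quickly.
CREATE INDEX tags_lookup ON tags (namespace, name, value);
""")


def add_metric(conn, collection, metric, bucket, backend_key, tags):
    """Register a metric and attach (namespace, name, value) tags to it."""
    cur = conn.execute(
        "INSERT INTO metrics (collection, metric, bucket, backend_key) "
        "VALUES (?, ?, ?, ?)",
        (collection, metric, bucket, backend_key),
    )
    metric_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO tags (metric_id, namespace, name, value) "
        "VALUES (?, ?, ?, ?)",
        [(metric_id, ns, name, value) for ns, name, value in tags],
    )
    return metric_id


def lookup(conn, collection, metric, namespace, name, value):
    """Resolve a tag condition to the (bucket, key) pairs in the backend."""
    rows = conn.execute(
        """
        SELECT m.bucket, m.backend_key
        FROM metrics m
        JOIN tags t ON t.metric_id = m.id
        WHERE m.collection = ? AND m.metric = ?
          AND t.namespace = ? AND t.name = ? AND t.value = ?
        ORDER BY m.backend_key
        """,
        (collection, metric, namespace, name, value),
    )
    return rows.fetchall()


# Made-up sample data: three metrics, two of them tagged service=sniffle.
add_metric(conn, "fifo", "action.count", "fifo",
           "dc1.host1.sniffle.vm.create.count",
           [("", "service", "sniffle"), ("", "host", "host1")])
add_metric(conn, "fifo", "action.count", "fifo",
           "dc1.host1.snarl.user.login.count",
           [("", "service", "snarl")])
add_metric(conn, "fifo", "action.count", "fifo",
           "dc1.host2.sniffle.vm.delete.count",
           [("", "service", "sniffle")])

sniffle_keys = lookup(conn, "fifo", "action.count", "", "service", "sniffle")
print(sniffle_keys)
```

The point of the sketch is only the shape of the data: the tag lookup happens entirely in the relational store and returns plain bucket and key pairs, so the metric backend never has to know that tags exist.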
Tags follow a slightly different syntax and can be used in any place where normal lookups or globs were used before. As an example,
SELECT sum(action.count IN fifo WHERE service=sniffle) LAST 1m
will create a graph that counts all the actions performed by sniffle within the last minute at a 1 second granularity. Before tags, the same query was written as:
SELECT sum(*.*.sniffle.*.*.count BUCKET fifo) LAST 1m
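The glob form selects metrics purely by their position in the dotted path. As a rough illustration of what that means, here is a small sketch of segment-wise glob matching. It assumes that `*` matches exactly one dot-separated segment, which is how we read the query above; the function and the sample keys are made up for this example and are not DalmatinerDB code.

```python
def glob_match(pattern, key):
    """Match a dotted key against a glob where '*' stands for one segment."""
    pattern_parts = pattern.split(".")
    key_parts = key.split(".")
    # Assumption: a glob only matches keys with the same number of segments.
    if len(pattern_parts) != len(key_parts):
        return False
    return all(p == "*" or p == k for p, k in zip(pattern_parts, key_parts))


# Made-up keys for illustration.
keys = [
    "dc1.host1.sniffle.vm.create.count",
    "dc1.host1.snarl.user.login.count",
    "dc1.host2.sniffle.vm.delete.count",
    "dc1.host2.sniffle.vm.delete.time",
]

matches = [k for k in keys if glob_match("*.*.sniffle.*.*.count", k)]
print(matches)
```

With the glob, the service name has to sit in the third segment of every key for the query to find it. With a tag like service=sniffle, the same selection no longer depends on how the key happens to be laid out.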
As tags require a Postgres backend, they remain entirely optional to use, and Dalmatiner will function as before without it. Documentation on how to set up Postgres and add tags, as well as a proxy to read metrics 2.0 data, will be coming in the near future.