Project-FiFo Blog

Articles and Blog Posts related to Project FiFo


(r)vmadm – managing FreeBSD jails

August 21, 2017 By Heinz N. Gies

We are releasing the first version (0.1.0) of our clone of vmadm for FreeBSD jails today. It is not done or feature complete, but it does provide basic functionality. At this point, we think it would be helpful to get it out there and get some feedback. As of today, it allows basic management of datasets, as well as creating, starting, stopping, and destroying jails.

Why another tool to manage jails

Before we go into the details, let's talk about why we built yet another jail manager. It is not the usual case of NIH syndrome; in fact, quite the opposite. In FiFo 0.9.2 we experimented with iocage as a way to control jails. While iocage is a useful tool when used as a CLI utility, it has some issues when used programmatically.

When managing jails automatically rather than via a CLI tool, things like performance and a machine-parsable interface matter. On the command line it is acceptable if a call takes a second or two; for a tool that is consumed programmatically, that kind of delay is problematic.

Another reason for the decision was that vmadm is an excellent, very well designed tool; SmartOS has used it for years. Given all that, we opted to adopt a proven interface rather than try to create a new one. Since we already interface with vmadm on SmartOS, we can reuse the majority of our management code between SmartOS and FreeBSD.

What can we do

Today we can manage datasets, which are jail templates in the form of ZFS volumes. We can list and serve them from a dataset server and fetch the ones we want. At this point we provide datasets for FreeBSD 10.0 to 11.1, but it is very likely that the list will grow. As an idea, here is a community-driven list of the datasets that exist for SmartOS today. While those datasets will not work for BSD jails, we hope to see a similar list grow for them.

After fetching the dataset, we can define jails using a JSON file. This file is compatible with the zone description used on SmartOS; it does not provide all the same features, only a subset. Resources such as CPU and memory can be defined, networking configured, a dataset selected, and necessary settings such as the hostname set.
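
To give an idea of what such a definition looks like, here is a minimal sketch. The field names are standard SmartOS zone description keys (memory in megabytes, CPU cap as a percentage); which of them this early release actually honours is an assumption on my part, so check the repository documentation for the authoritative list.

{
  "alias": "example-jail",
  "hostname": "example",
  "image_uuid": "<uuid of a fetched FreeBSD dataset>",
  "max_physical_memory": 1024,
  "cpu_cap": 100,
  "quota": 10,
  "nics": [
    {
      "interface": "net0",
      "ip": "192.168.1.200",
      "netmask": "255.255.255.0",
      "gateway": "192.168.1.1"
    }
  ]
}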

With the jail created, vmadm allows managing its lifecycle: starting and stopping it, accessing its console, and finally destroying it. Updates to jails are supported too; however, as of today they are only taken into account after restarting the jail. This is largely not a technical impossibility, it just wasn't high up on the TODO list.
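
Since the interface is modelled on SmartOS's vmadm, day-to-day usage presumably looks roughly like the following; the subcommand names here are the SmartOS ones and may differ slightly in this early FreeBSD port:

vmadm create -f example-jail.json   # define a jail from the JSON description above
vmadm list                          # list the jails vmadm knows about
vmadm start <uuid>                  # start the jail
vmadm console <uuid>                # attach to its console
vmadm stop <uuid>                   # stop it again
vmadm delete <uuid>                 # destroy the jail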

It is worth mentioning that vmadm will not pick up jails created with other tools or by hand. Managing only jails created by vmadm was a conscious decision to prevent it from interfering with existing setups or other utilities. Conventional tools can manage jails set up with vmadm just fine, but we use some special tricks, such as nested jails, to allow for the restrictions required for multi-tenancy, which are hard or impossible to achieve otherwise.

What's next

First and foremost, we hope to get some feedback and perhaps some community engagement. In the meantime, as announced earlier this year, we are hard at work integrating FreeBSD hypervisors into FiFo, and as of this writing, the core actions work quite well.

Right now only the bare-bones functions are supported, and some of the output is not as clear as we would like. We hope to eventually add bhyve support to vmadm the same way it supports KVM on SmartOS; the groundwork for this already exists in the nested jail techniques we are using.

Other than that, we are exploring ways to allow PCI pass-through in jails, something that is not possible in SmartOS zones right now and that would be beneficial for some users.

In general, we want to improve compatibility with SmartOS as much as possible, and the features we add over time should not render the jail specifications invalid for SmartOS.

You can get the tool from GitLab.

Filed Under: Project-FiFo Tagged With: freebsd, jails, project-fifo

Testing C code with Erlang QuickCheck

June 15, 2017 By Heinz N. Gies

In Erlang, the generally accepted approach is to implement as much as possible on the BEAM. This gives every bit of code the wonderful fault-handling characteristics we love so much.

Preamble

However, there are some cases where we can't do that, be it because we rely on directly interfacing with some library or low-level code, or because we're totally bonkers. Still, the first rule of NIF-Club is: do not write NIFs. So, if you ever wonder "should I write this as a NIF?", please keep reading and answer your question with "Hell no!".

Sadly, I’m totally bonkers and rarely heed my own advice, so for the 0.3.3 version of DalmatinerDB I implemented a library in C to handle the caching of metric writes before they get serialized to disk.

Before you get the rotten tomatoes and raw eggs out, give me a chance to defend my honor. The task of write caching depends incredibly heavily on memory management and on mutating data at a very high rate, and neither of those is something Erlang gives you the best tools for. I wrote it in Erlang first, then in Erlang with some parts in C, and finally decided to go all the way to C; after getting a working version, the results are stunningly good.

Well, so let's get to the actual topic, the NIF. Writing something in C is easy! Writing something in C that compiles is still doable. Doing so in a way that does what you want it to do is a lot harder. And finally, writing something in C that does what you think it does and doesn't randomly segfault or overwrite memory is close to impossible; at least that is what I blatantly claim, without proof or citation other than an empirical study with a sample size of 1. Then again, this is my article, so I'm allowed to do that, especially if it serves as a storytelling device.

Testing with EQC

When testing my code, I heavily rely on QuickCheck. It saves me from coming up with test cases myself and instead lets me describe, at a higher level, what I want to happen. I will not go into detail about the what or the how, as there are better sources for this, but I'll sum up my approach quickly.

I implemented the logic I want in Erlang in the most straightforward (and perhaps inefficient) manner I could think of. Then I let Erlang QuickCheck (EQC) generate a random sequence of operations on the cache, with random input parameters, and in the end check whether the naïve implementation and the real cache produce the same outcome.

I really like this approach for optimizing a simple concept: the simple implementation is usually easy to reason about, and even if it is wrong, chances are the simple and the optimized implementations are wrong in different ways.

Now, with that implementation, EQC sets off to do random crazy things to the code and see if something breaks. However, that is only half the story! Once it finds something that breaks, it will try to simplify the sequence of events that led to the disaster and present me with a (hopefully) minimal test case that can trigger the problem.

And this works perfectly for Erlang code! It even works for C code, where it can find some memory corruption issues that way. However (yes, I know this is what everyone was waiting for, sorry it took me so long), the concept becomes completely useless when the C code segfaults and brutally murders the BEAM.

Resolving the segfault problem

During the EUC last week I talked to Thomas Arts, one of the brilliant people behind EQC, about the problem. He suggested something that is totally obvious once you're told about it, but that I'd never have thought of on my own. He said, in his wonderful accent, "Oh, that is not a problem, just execute the tests on a different node." The simplicity of that blew me away. Of course, it's Erlang: just run it on another node and don't be bothered whether it explodes. It's brilliant!

Now there are a few hurdles in the way: EQC, to my knowledge, has no built-in abstraction for remote execution. That said, it's easy enough to build one with Erlang.

I use the rebar_eqc plugin to run my tests, and rebar3 has a bit of an issue when it comes to hostnames. So, before you do anything else, you need to make sure epmd is running on the machine you want to test on. The simplest way is to just start an erl shell in another window. Once that is done, you can start rebar with rebar3 as eqc eqc --sname eqc.
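
In practice that boils down to two shell steps; the node name in the first command is arbitrary, it just needs to be a distributed (named) node so that epmd gets started as a side effect:

erl -sname epmd_helper          # window 1: starting any named node brings up epmd
rebar3 as eqc eqc --sname eqc   # window 2: run the properties under the eqc profile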

Erlang, or rather its Common Test framework, comes with a nice helper for starting another node for tests. So that part is easy: we can use ct_slave:start(eqc_client), which will give us a new node to test on.

Next up, the new node is started without any code paths, so we'll need to make sure it knows where to find the code to test; the simplest way I found is to just feed it the same path the main node has.

Then, since EQC does not know about the second node, we extract the body of the test into its own function. We pass it the generated values, and it returns just the information needed to decide whether the run is a success or a failure. The rpc module will automatically escalate a crash of the remote node into a test failure. And this is the big part: a segfault goes from destroying our test system to being just another type of failure we can encounter in our testing process.

%% Start the helper node, or reuse it if it is already running, and give it
%% the same code path as the main node so it can find the modules under test.
maybe_client() ->
    case ct_slave:start(eqc_client) of
        {ok, Client} ->
            rpc:call(Client, code, set_path, [code:get_path()]),
            {ok, Client};
        {error, already_started, Client} ->
            {ok, Client};
        E ->
            E
    end.

%% Run ?MODULE:Fn(Args) on the helper node; if that node dies (for example
%% because the NIF segfaults), rpc:call returns {badrpc, _} rather than
%% taking down the node running EQC.
remote_eval(Fn, Args) ->
    {ok, Client} = maybe_client(),
    rpc:call(Client, ?MODULE, Fn, Args).

%% The extracted test body; this is the part that runs on the helper node.
map_comp_body(Cache, MaxGap) ->
    {H, T, Ds} = eval(Cache),
    TreeKs = all_keys_t(T, MaxGap),
    CacheKs = all_keys_c(H, []),
    Ds1 = check_elements(Ds),
    {CacheKs, TreeKs, T, Ds1}.

%% The property itself runs locally and only ships the evaluation of the
%% generated cache to the helper node via remote_eval/2.
prop_map_comp() ->
    ?SETUP(
       fun setup/0,
       ?FORALL(
          {MaxSize, MaxGap, Opts}, {c_size(), nat(), opts()},
          ?FORALL(
             Cache, cache(MaxSize, MaxGap,  Opts),
             begin
                 {CacheKs, TreeKs, T, Ds1} =
                     remote_eval(map_comp_body, [Cache, MaxGap]),
                 ?WHENFAIL(io:format(user, "Cache: ~p~nTree:~p / ~p~nDs: ~p~n",
                                     [CacheKs, TreeKs, T, Ds1]),
                           CacheKs == TreeKs andalso
                           Ds1 == [])
             end))).

And really, that's it! There are a few gotchas, such as the fact that you can't pass NIF references over RPC, or that sometimes, when cancelling a test run, you need to shut down the client node manually. Still, all in all, this worked incredibly well.

The code and tests of the library can be found here: https://github.com/dalmatinerdb/mcache

Filed Under: Project-FiFo

FiFo in 2017

January 12, 2017 By Heinz N. Gies

It's now 2017, and to celebrate the new year we wanted to give you a little glimpse of what's in store for the coming year.

FiFo 0.9.0

We aim to release the 0.9.0 version of FiFo towards the second half of January. With 0.9.0 we add a number of small changes and improvements, most notably a considerable rework of the multi-datacenter support and a lot of small improvements to make use of the newer features in DalmatinerDB. We also have a few nice cleanups and improvements for the Cerberus UI.

DalmatinerDB 0.3.0

Along with the 0.9.0 release of FiFo, we aim to publish DalmatinerDB 0.3.0. It includes better indexing and a vast number of new functions, along with experimental support for events. Furthermore, we added more protocols to the Dalmatiner Proxy, which now includes InfluxDB, Prometheus (including the Prometheus persistent data storage protocol), OpenTSDB, Graphite, and Metrics 2.0.

The road to the 1.0.0 release

This is the big news! A 1.0.0 release is probably the most important version of any piece of software. If released too early, you end up breaking backward compatibility quickly, risk a version explosion, and lose credibility. If released too late, you're stuck in pre-1.0.0 land forever, which does not inspire much confidence either.

Project-FiFo has been in pre-1.0.0 for almost three years. That sounds like a long, long time, but we are confident that it was the right decision. We are extremely careful about, and averse to, introducing breaking changes. While we tried to minimise breaking changes, we unfortunately did have a few of them. After the 1.0.0 release we are committed to not introducing any breaking changes for the foreseeable future. Last but not least, we've learned a lot, as a system as big as FiFo is hard to get right.

As time progressed, our experience building, running, and supporting FiFo in large and small installations has reached the point where we are confident that it's an appropriate time for a big 1.0.0 release. The design is stable, the concept sound. We will therefore be working towards a 1.0.0 release around the third quarter of 2017. That also means that in the interim we'll focus on stability and code cleanup on the way to 1.0.0, in order to make your experience as nice as possible.

We hope you are as excited about 1.0.0 as we are!

ZTC & 1.0.0

Of course ZTC will also get some additional love and ironing out. A more streamlined UI is in the works, and while we don't want to spoil everything just yet, we're really happy to introduce a proper audit log. This means that for a managed FiFo ZTC instance, all actions are audited, allowing us to pinpoint who did what and when. Just like the rest of FiFo, this new audit log will be properly distributed.

The future!

The improvements don't stop after the 1.0.0 release. However, we'll do our best to keep anything after that compatible with the 1.0.0 API. With a stable starting point we'll explore more features and plan to expand hypervisor support. Most notably, we'll take a look at expanding support for OmniOS and adding FreeBSD as potential hypervisor targets.

Filed Under: DalmatinerDB, Project-FiFo

The lies we tell

September 7, 2016 By Heinz N. Gies

Over the last week or so I’ve been helping Steve from dataloop.io with some ranking and benchmarking of different time series databases. Of course I’m biased on that topic given I’ve written DalmatinerDB and believe it’s awesome, but I’ve tried my best to be objective and scientific in the approach.

One of the more interesting parts is the benchmarks. As part of filling the ranking sheet I’ve read and analyzed about a dozen different benchmarks. They ranged from good ones that I would call scientific to bollocks marketing claims.

The truth and nothing but the truth

The truth is, all benchmarks lie, and that includes the ones I've made for DalmatinerDB. Not necessarily intentionally, don't get me wrong (even though some are pure marketing). It's as simple as workloads differing so drastically in real production systems that it's close to impossible to design a [Read more…]

Filed Under: Project-FiFo

Building a Cloud with Erlang and SmartOS

August 22, 2016 By Mark Slatem

Heinz Gies: Erlang User Conf 2014

Building a Cloud with Erlang and SmartOS

How Hard Could it Possibly Be?

Project FiFo has been around for two years now and has grown from a for-fun project to run my hobby server into powering private installations at multiple universities and, last but not least, multiple datacenters for the Lucera public cloud.

Between the start and what we have today, FiFo has become a rather complex distributed system and it was not always a smooth ride to get here. This talk will go over the biggest issues and failures and what I have learned from them.

Talk objectives:

Sharing some of the issues throughout FiFo's history, giving an idea of how they were solved and what I learned from them. Also, if possible, giving everyone a good laugh at my stupidity.

Target audience:

People interested in distributed systems, failure stories, how ‘clouds’ are built and/or SmartOS.

Filed Under: Project-FiFo Tagged With: conference, erlang, video

Querying a metric with confidence!

April 29, 2016 By Heinz N. Gies

DalmatinerDB stores metrics in a periodic fashion, so for every second that passes (assuming your buckets are set up to have a 1s resolution) either a “received” metric’s value or a “non-received” metric is written to the database.

When you miss writing a metric, DalmatinerDB does not write a '0'; it writes an explicit 'non-value'. This has multiple advantages: not only is it very nice for data compression (ZFS handles zero blocks very well), but it also allows us to differentiate between the two for a specific period.

Version 0.2.0 of DalmatinerDB (the current development branch) will include a confidence function which helps to examine how confident DalmatinerDB can be about a given value or aggregate. Let's look at it using an [Read more…]

Filed Under: DalmatinerDB Tagged With: dalmatinerdb, database, metrics

DalmatinerDB metrics get tags

April 25, 2016 By Heinz N. Gies

We built DalmatinerDB for one purpose: to be able to ingest and query more metrics than any other metric store in existence. I am rather confident that we succeeded in that goal. Part of why this has been possible is that it is built for simplicity: we use the same tree structure for metrics that Graphite uses, we use flat files instead of an elaborate database to store metrics, and we leverage existing technology like ZFS and Riak Core instead of trying to roll our own clustering, compression, and file-integrity tools. All of that removes overhead.

Last week the team from dataloop.io came over and we sat together to discuss and work on tags, or labels, or dimensions, whatever you want to call them. They are very nice and helpful; however, they are not necessarily simple. They conflict with the file layout and data structures in the DalmatinerDB backend, which is already not fast at looking up metrics via wildcards. The good news is that the modular design of DalmatinerDB allows us to keep these kinds of problems out of the backend. [Read more…]

Filed Under: Project-FiFo Tagged With: dalmatinerdb, erlang, project-fifo

DNS support in FiFo 0.8.0

March 3, 2016 By Heinz N. Gies

With the 0.8.0 release, FiFo's ecosystem will grow with the addition of a dynamic DNS service. The functionality is rather simple: it makes it possible to assign each interface a hostname. When queried, this hostname will return the IP of the interface. It is also possible to give multiple interfaces (on different machines) the same hostname, effectively creating a DNS round robin where all available DNS records get returned.

A simple example of where this functionality can be used is a multi-node FiFo installation. Each zone running Howl (the API endpoint) gets the hostname 'fifo'. From that point forward, fifo.<org uuid>.vms.cloud.fifo will resolve to all of those IPs, which provides simple load balancing and, to a certain degree, fault tolerance.
[Read more…]

Filed Under: Project-FiFo Tagged With: fifo

Docker comes to FiFo – 0.7.1 released

December 16, 2015 By Heinz N. Gies

Why Docker?

These days, to say that Docker adoption has proceeded at a remarkable rate would be an understatement. There is a lot of buzz about Docker, and it has gained a huge following and a lot of momentum. This is especially true for folks who have come from the Linux world and have not previously been exposed to the "magic and wonder" of the native OS containers we Solaris and BSD users have been enjoying for so many years.

When any new technology garners mass appeal, it is imperative from a business perspective to give the people what they want. Joyent has rightly recognized this fact and has integrated Docker support in the most sensible way [Read more…]

Filed Under: Project-FiFo Tagged With: docker, fifo

The art of writing a ticket

October 20, 2015 By Heinz N. Gies

There are a multitude of ways of contributing to open source: slinging code, writing documentation, improving features, research, web design, helping newcomers, keeping the community engaged. However, there is one part that is often overlooked, and it is the most fundamental thing that every user can do: reporting problems and logging tickets.

A problem not reported is a problem not fixed, and that hurts everyone: the user, who has a problem; the project, which has a problem it doesn't know about; and the community, because someone else might run into the same problem.

That said, writing tickets isn't trivial; there is a big difference between 'a ticket' and 'a good ticket'. Writing a good ticket requires skill and consideration, but it is worth [Read more…]

Filed Under: Project-FiFo
