The tree class is the in-memory representation of a configuration file, and is the data structure passed around to methods and algorithms to tune their behavior. It replaces the previous static config structure.
The following highlights describe the tree class:
- Keys are (and thus the tree layout is) pre-registered. One side-effect of moving away from a static C++ structure as the representation of the configuration to a dynamic structure such as a tree is that the compiler can no longer validate the names of the configuration settings when they are queried. In the past, doing something like config.architecture would only compile if architecture was a member defined in the structure... but now, code like config["architecture"] cannot be validated during the build.
In order to overcome this limitation, trees must have their keys pre-defined. Pre-defining the keys declares their type within the tree. Accesses to unknown keys result in an error right away, and accesses to pre-defined keys must always happen with their pre-recorded types.
Note that pre-defined nodes may or may not hold a value. The concept of "being set" is different from "being defined".
- Some nodes can be dynamic. Sometimes we do not know what particular keys are valid within a context. For example, the test_suites subtree of the configuration can contain arbitrary test suite names and properties within it, and there is no way for Kyua (at the moment) to know what keys are valid or not.
As a result, the tree class allows defining a particular node as "dynamic", at which point accesses to any undefined keys below that node result in the creation of the node.
- Type safety. Every node has a type attached to it. The base configuration library provides common types such as bool_node, int_node and string_node, but the consumer can define its own node types to hold any other kind of data type. (It'd be possible, for example, to define a map_node to hold a full map as a tree leaf.)
The "tricky" (and cool) part of type safety in this context is to avoid exposing type casts to the caller: the caller always knows what type corresponds to every key (because, remember, the caller had to predefine them!), so it knows what type to expect from every node. The tree class achieves this by using template methods, which just query the generic internal nodes and cast them out (after validation) to the requested type.
- Plain string representations. The end user has to be able to provide overrides to configuration properties through the command line... and the command line is untyped: everything is a string. The tree library, therefore, needs a mechanism to internalize strings (after validation) and convert them to the particular node types. Similarly, it is interesting to have a way to export the contents of a tree to strings so that they can be shown to the user.
config::tree tree;
// Predefine the valid keys.
tree.define< config::string_node >("kyua.architecture");
tree.define< config::int_node >("kyua.timeout");
// Populate the tree with some sample values.
tree.set< config::string_node >("kyua.architecture", "powerpc");
tree.set< config::int_node >("kyua.timeout", 300);
// Query the sample values.
const std::string architecture =
    tree.lookup< config::string_node >("kyua.architecture");
const int timeout =
    tree.lookup< config::int_node >("kyua.timeout");
Yep, that's it. Note how the code just knows about keys and their types, but does not have to mess around with type casts or tree nodes. And, if there is any typo in the property names or if there is a type mismatch between the property and its requested node type, the code will fail early. This, coupled with extensive unit tests, ensures that configuration keys are always queried consistently.
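To make the "fail early" part more concrete, here is a minimal, self-contained sketch of how template methods can hide the casts while still validating every access. This is not the actual Kyua code: the sketch namespace, the typed_node template, the is_set flag, the check() helper and the flat std::map keyed by dotted names are all assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration only; not the real Kyua classes.
namespace sketch {

// Common base so that the tree can store nodes of any type.
struct base_node {
    virtual ~base_node() {}
};

// A leaf that remembers both its value and whether it has been set.
template <typename T>
struct typed_node : base_node {
    typedef T value_type;
    bool is_set = false;
    T value = T();
};

typedef typed_node<int> int_node;
typedef typed_node<std::string> string_node;

class tree {
    // Assumption: a flat map from dotted key to node keeps the sketch short.
    std::map<std::string, std::unique_ptr<base_node>> _nodes;

    // Finds a pre-defined node and validates its type before casting.
    template <typename NodeType>
    NodeType& find(const std::string& key) {
        const auto iter = _nodes.find(key);
        if (iter == _nodes.end())
            throw std::runtime_error("Unknown key: " + key);
        NodeType* node = dynamic_cast<NodeType*>(iter->second.get());
        if (node == nullptr)
            throw std::runtime_error("Type mismatch for key: " + key);
        return *node;
    }

public:
    // Registers a key and fixes its type; the node starts out unset.
    template <typename NodeType>
    void define(const std::string& key) {
        _nodes[key] = std::unique_ptr<base_node>(new NodeType());
    }

    // Stores a value in a pre-defined node of the matching type.
    template <typename NodeType>
    void set(const std::string& key,
             const typename NodeType::value_type& value) {
        NodeType& node = find<NodeType>(key);
        node.value = value;
        node.is_set = true;
    }

    // Returns the native value; the caller never sees a cast.
    template <typename NodeType>
    const typename NodeType::value_type& lookup(const std::string& key) {
        const NodeType& node = find<NodeType>(key);
        if (!node.is_set)
            throw std::runtime_error("Key is defined but not set: " + key);
        return node.value;
    }
};

// Runs a lookup and reports "ok" or the reason why it failed.
template <typename NodeType>
std::string check(tree& config, const std::string& key) {
    try {
        config.lookup<NodeType>(key);
        return "ok";
    } catch (const std::runtime_error& e) {
        return e.what();
    }
}

// Usage: a defined-but-unset key, a typo and a wrong type all fail.
inline void example() {
    tree config;
    config.define<string_node>("kyua.architecture");
    config.define<int_node>("kyua.timeout");
    config.set<string_node>("kyua.architecture", "powerpc");

    assert(config.lookup<string_node>("kyua.architecture") == "powerpc");
    assert(check<int_node>(config, "kyua.timeout") != "ok");
    assert(check<int_node>(config, "kyua.timeuot") != "ok");
    assert(check<int_node>(config, "kyua.architecture") != "ok");
}

}  // namespace sketch
```

The dynamic_cast inside find() is the only cast in the sketch, and it is guarded by a check, which is one plausible reading of "cast them out (after validation)". The is_set flag is how this sketch keeps "being defined" separate from "being set".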
Note that we could also have set the two keys from the first example as follows:
tree.set_string("kyua.architecture", "powerpc");
tree.set_string("kyua.timeout", "300");
... which would result in the validation of "300" as a proper integer, its conversion to a native integer, and the storage of the resulting number in the integer node it corresponds to. This is useful, again, when reading configuration overrides from the command line: types are not known in that context, yet we want to store their values in the same data structure as the values read from the configuration file.
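One way to picture the string path is to let every node type own its parsing and formatting, so that the tree only has to dispatch to the node registered for the key. The sketch below is self-contained and does not use the real library: the sketch namespace, the virtual set_string()/to_string() pair on the nodes and the to_strings() export method are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration only; not the real Kyua classes.
namespace sketch {

// Every node knows how to import from and export to a plain string.
struct base_node {
    virtual ~base_node() {}
    virtual void set_string(const std::string& raw) = 0;
    virtual std::string to_string() const = 0;
};

struct string_node : base_node {
    std::string value;
    void set_string(const std::string& raw) override { value = raw; }
    std::string to_string() const override { return value; }
};

struct int_node : base_node {
    int value = 0;
    void set_string(const std::string& raw) override {
        std::istringstream input(raw);
        int parsed = 0;
        // Reject empty input, non-numbers and trailing garbage like "300x".
        if (!(input >> parsed) || !input.eof())
            throw std::invalid_argument("Not an integer: " + raw);
        value = parsed;
    }
    std::string to_string() const override { return std::to_string(value); }
};

class tree {
    std::map<std::string, std::unique_ptr<base_node>> _nodes;

public:
    // Registers a key and fixes its type.
    template <typename NodeType>
    void define(const std::string& key) {
        _nodes[key] = std::unique_ptr<base_node>(new NodeType());
    }

    // Untyped entry point: the node registered for 'key' picks the parser.
    void set_string(const std::string& key, const std::string& raw) {
        const auto iter = _nodes.find(key);
        if (iter == _nodes.end())
            throw std::runtime_error("Unknown key: " + key);
        iter->second->set_string(raw);
    }

    // Exports every defined node as text, e.g. to show it to the user.
    std::map<std::string, std::string> to_strings() const {
        std::map<std::string, std::string> result;
        for (const auto& entry : _nodes)
            result[entry.first] = entry.second->to_string();
        return result;
    }
};

// Usage: both keys are set from strings; the timeout is stored as an int.
inline void example() {
    tree config;
    config.define<string_node>("kyua.architecture");
    config.define<int_node>("kyua.timeout");
    config.set_string("kyua.architecture", "powerpc");
    config.set_string("kyua.timeout", "300");
    assert(config.to_strings().at("kyua.timeout") == "300");
}

}  // namespace sketch
```

Because the key was pre-defined as an integer node, a command-line override such as kyua.timeout=300x can be rejected at the moment it is stored and not later when it is queried. For brevity this sketch exports unset nodes with their default values; a fuller version would likely skip them.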
Let's now see another very simple example showcasing dynamic nodes (which is a real-life example from the current Kyua configuration file):
config::tree tree;
// Predefine a subtree as dynamic.
tree.define_dynamic("test_suites");
// Populate the subtree with fictitious values.
tree.set< config::string_node >("test_suites.NetBSD.ffs", "ext2fs");
tree.set< config::int_node >("test_suites.NetBSD.iterations", 5);
// And the querying would happen exactly as above with lookup().
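For the curious, here is a rough, self-contained sketch of how "dynamic" subtrees could coexist with strict key checking. It is only an approximation under loud assumptions: the sketch namespace and typed_node are hypothetical, keys are kept in a flat map rather than in a real tree of inner nodes, and leaves are created on set() only (lookups of missing keys still fail).

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real Kyua classes.
namespace sketch {

struct base_node {
    virtual ~base_node() {}
};

template <typename T>
struct typed_node : base_node {
    typedef T value_type;
    T value = T();
};

typedef typed_node<int> int_node;
typedef typed_node<std::string> string_node;

class tree {
    std::map<std::string, std::unique_ptr<base_node>> _nodes;
    std::vector<std::string> _dynamic_roots;

    // True if 'key' lives strictly below a subtree marked as dynamic.
    bool below_dynamic_root(const std::string& key) const {
        for (const std::string& root : _dynamic_roots) {
            const std::string prefix = root + ".";
            if (key.size() > prefix.size() &&
                key.compare(0, prefix.size(), prefix) == 0)
                return true;
        }
        return false;
    }

public:
    // Marks a subtree whose children are not known in advance.
    void define_dynamic(const std::string& root) {
        _dynamic_roots.push_back(root);
    }

    template <typename NodeType>
    void set(const std::string& key,
             const typename NodeType::value_type& value) {
        auto iter = _nodes.find(key);
        if (iter == _nodes.end()) {
            if (!below_dynamic_root(key))
                throw std::runtime_error("Unknown key: " + key);
            // First write below a dynamic node: create the leaf on the fly.
            iter = _nodes.emplace(
                key, std::unique_ptr<base_node>(new NodeType())).first;
        }
        NodeType* node = dynamic_cast<NodeType*>(iter->second.get());
        if (node == nullptr)
            throw std::runtime_error("Type mismatch for key: " + key);
        node->value = value;
    }

    template <typename NodeType>
    const typename NodeType::value_type& lookup(
        const std::string& key) const {
        const auto iter = _nodes.find(key);
        if (iter == _nodes.end())
            throw std::runtime_error("Unknown key: " + key);
        const NodeType* node =
            dynamic_cast<const NodeType*>(iter->second.get());
        if (node == nullptr)
            throw std::runtime_error("Type mismatch for key: " + key);
        return node->value;
    }
};

// Usage mirroring the dynamic-node example from the post.
inline void example() {
    tree config;
    config.define_dynamic("test_suites");
    config.set<string_node>("test_suites.NetBSD.ffs", "ext2fs");
    config.set<int_node>("test_suites.NetBSD.iterations", 5);
    assert(config.lookup<string_node>("test_suites.NetBSD.ffs") == "ext2fs");
    assert(config.lookup<int_node>("test_suites.NetBSD.iterations") == 5);
}

}  // namespace sketch
```

Note that type safety is not relaxed below a dynamic node in this sketch: the first set() fixes the type of the new leaf, and later accesses with a different node type are rejected just like for pre-defined keys.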
Indeed, it'd be very cool if this tree type followed more standard STL conventions (iterators, for example). But I didn't really think about this when I started writing this class and, to be honest, I don't need this functionality.
Now, if you paid close attention to the above, you can start smelling the relation of this structure to the syntax of configuration files. I'll tell you how this ties together with Lua in a later post. (Which may also explain why I chose this particular representation.)