Writing migration scripts (and manipulating VyOS config files outside VyOS) just got easier
Long story short
VyOS 1.2.0-rolling (starting with the next nightly build) includes a library for parsing and manipulating config files without loading them into the system config. It can be used for automatically converting configs from old versions in case an incompatible change was made, and for standalone utilities. Motivation and history are discussed below.
Here is an example of interacting with the new library:
>>> from vyos import configtree
>>> c = configtree.ConfigTree("system { host-name vyos \n } interfaces { dummy dum0 { address 192.0.2.1/24 \n address 192.0.2.20/24 \n disable \n } } /* version: 1.2.0 */")
>>> print(c.to_string())
system {
host-name vyos
}
interfaces {
dummy dum0 {
address 192.0.2.1/24
address 192.0.2.20/24
disable { }
}
}/* version: 1.2.0 */
>>> c.set(['interfaces', 'dummy', 'dum0', 'address'], value='203.0.113.3/32', replace=False)
>>> c.delete_value(['interfaces', 'dummy', 'dum0', 'address'], '192.0.2.1/24')
>>> c.delete(['interfaces', 'dummy', 'dum0', 'disable'])
>>> c.is_tag(['interfaces', 'dummy'])
True
>>> c.exists(['interfaces', 'dummy', 'dum0', 'disable'])
False
>>> c.list_nodes(['interfaces', 'dummy'])
['dum0']
>>> print(c.to_string())
system {
host-name vyos
}
interfaces {
dummy dum0 {
address 192.0.2.20/24
address 203.0.113.3/32
}
}/* version: 1.2.0 */
As you can see, it largely mimics the API you get for the running config. The only notable differences are that the "set" method requires you to specify the path and the value separately, and that nodes are formatted as tag nodes (i.e. "ethernet eth0 { ..." as opposed to "ethernet { eth0 { ...") only if you mark them as such with "set_tag", unless they were already formatted that way in the config you parsed.
Incompatible syntax changes and migration scripts
Have you ever noticed the mysterious line at the end of saved VyOS configs?
/* Warning: Do not remove the following line. */
/* === vyatta-config-version: "cluster@1:config-management@1:conntrack-sync@1:conntrack@1:cron@1:dhcp-relay@1:dhcp-server@4:firewall@5:ipsec@4:nat@4:qos@1:quagga@2:system@6:vrrp@1:wanloadbalance@3:webgui@1:webproxy@1:zone-policy@1" === */
/* Release version: VyOS 1.1.8 */
What is it for? Think about what happens if the old CLI syntax design proves suboptimal, and the only way to seriously improve some feature requires an incompatible syntax change. One such change you may be aware of is the new vs. old NAT syntax, even if just because EdgeOS decided to keep the old one. The old syntax, with a single "service nat" subtree where source NAT rules had to be numbered above 5000 and yet you also had to specify the rule type, was unwieldy, and pretty much everyone agrees that the new one is better.
Despite the incompatible change, old configs from Vyatta Core 6.3 and older still load in modern VyOS versions just fine. There is a migration mechanism that looks at that version string at the end of the config, and runs appropriate scripts. For example, if you copy a config from some incredibly old version with "nat@1" in the version string, and do "load /config/config.boot.ancient", the system will look up scripts in /opt/vyatta/etc/config-migrate/migrate/nat/ and run scripts 1-to-2, 2-to-3, and 3-to-4, until the current version (though it's better to call it compatibility level) is reached.
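To make the lookup concrete, here is a minimal Python sketch of the idea, not the actual VyOS migration code. The function names parse_versions and migration_steps are made up for illustration; only the version string format and the "N-to-M" script naming come from the description above.

```python
import re

def parse_versions(config_text):
    """Return {component: version} parsed from the vyatta-config-version line."""
    match = re.search(r'vyatta-config-version:\s*"([^"]*)"', config_text)
    if match is None:
        return {}
    versions = {}
    for item in match.group(1).split(":"):
        if not item:
            continue
        # Each item looks like "nat@4" or "conntrack-sync@1"
        name, _, number = item.partition("@")
        versions[name] = int(number)
    return versions

def migration_steps(current, target):
    """List the script names needed to get from one version to another."""
    return ["{}-to-{}".format(v, v + 1) for v in range(current, target)]

footer = '/* === vyatta-config-version: "nat@1:system@6" === */'
versions = parse_versions(footer)
print(versions)
print(migration_steps(versions["nat"], 4))
```

With "nat@1" in the config and a current NAT version of 4, this yields the same three steps as in the example above: 1-to-2, 2-to-3, and 3-to-4.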
So far so good. Why does so much legacy syntax remain in VyOS, then? Take "system gateway-address" for example, the source of confusion as to where that default route comes from and the leading cause of unintended equal cost multipath setups. Why is that stuff still there?
The mechanism for running the migration scripts is simple, but actually manipulating configs is not. While some scripts are nothing more than a single sed line (for example, the script that changes "smp_affinity" to "smp-affinity"), in most cases you need to go inside nodes. Basically, you need the set/delete/return_values/etc. API that you get for the running config, but without any validation and safeguards, since configs that need migration wouldn't validate by definition (otherwise no migration would be needed).
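For the simple case, a plain text substitution really is enough. The following is a rough Python equivalent of such a one-line rename, written for illustration rather than taken from the actual migration script; rename_option is a hypothetical helper name.

```python
import re

def rename_option(config_text, old, new):
    """Rename an option, but only where it appears as a node name at the start of a line."""
    pattern = re.compile(r'^([ \t]*)' + re.escape(old) + r'(?=\s|$)', re.MULTILINE)
    # Keep the original indentation (group 1) and swap only the name
    return pattern.sub(lambda m: m.group(1) + new, config_text)

old_config = (
    "interfaces {\n"
    "    ethernet eth0 {\n"
    "        description smp_affinity\n"
    "        smp_affinity auto\n"
    "    }\n"
    "}\n"
)
new_config = rename_option(old_config, "smp_affinity", "smp-affinity")
print(new_config)
```

This works only because the node is renamed in place. As soon as a migration has to move a node elsewhere, merge values, or look at sibling nodes, text substitution stops being practical.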
The parser used by the load command, which is also run non-interactively on boot, is tightly coupled with validation and is hard to decouple from it. It is likely going to stay this way until the entire config backend is replaced. To get around it, people back at Vyatta wrote a library for sort of parsing config files into sort of in-memory data structures. The only problem is that it's badly broken, fragile, and very annoying to use. This is the reason people have avoided writing migration scripts at all costs, including keeping the worst kinds of legacy syntax until it became absolutely unbearable.
The old parser
The library in question is called XorpConfigParser. XORP was a routing protocol stack used by early versions of Vyatta, and they also used its shell and config format for their own features unrelated to routing protocols. That made adding new features hard, and the routing protocol stack itself suffered from serious performance problems, so it was replaced by Quagga and a new CLI was built on bash, but the name stuck. And not only the name. The library still makes assumptions about the old format, where names and values were separated by colons. When the config format changed along with the CLI, the library was no longer capable of manipulating node values: it returns, say, "address 192.0.2.1/24" as a single string, so the NAT script, for example, makes extensive use of regex match and replace to rename the options. Without normal set and delete operations, manipulating the config becomes an exercise in juggling eggs in variable gravity.
It has always been clear that it needs a replacement, but the problem of correctly parsing and manipulating configs is not a simple one. In addition to the inherent complexity of a multi-way tree, the current VyOS syntax itself adds a whole bunch of problems. For example, leaf nodes (e.g. "address 192.0.2.1/24", or "reboot-on-panic true", or "disable") are terminated by newlines, while newlines are not significant anywhere else. Using the same /* */ syntax for node comments that are supposed to stay in the config, for nodes that are simply commented out, and for version metadata makes it even worse: the grammar is left-recursive and highly ambiguous. If you see a comment, you cannot be sure what follows (a node, and which kind of node, another comment, or the end of the file), and all cases need to be handled differently.
Putting Vyconf to use
A new config backend for VyOS, to be used in the future VyOS 2.0, has been in development for a while, but remained separate from any of the current VyOS code.
While replacing the config backend, and the old config format along with it, still needs a lot of work and cannot be completed until all config and op mode scripts are rewritten in the new style, a lot of work has already been done in the new backend, Vyconf. It was designed from the start to be layered and to rely on in-memory data structures: the code that is aware of user sessions and commits is built on code that is only aware of the data structures and of node name and value validation, which in turn builds on code that is only aware of the data structures. The problem of a set/delete interface to multi-way trees is already solved there, so why not reuse it?
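To show what the lowest layer has to do, here is a toy Python sketch of a set/delete interface over a multi-way tree with no validation at all. It is not Vyconf's code and not the new library's API; the class names and the exact behaviour of "replace" are assumptions made for the example.

```python
class Node:
    """One tree node: a name, a list of leaf values, and ordered children."""
    def __init__(self, name):
        self.name = name
        self.values = []
        self.children = []  # a list, so that insertion order is preserved

    def child(self, name):
        for c in self.children:
            if c.name == name:
                return c
        return None

class ToyTree:
    def __init__(self):
        self.root = Node("")

    def _find(self, path):
        node = self.root
        for name in path:
            node = node.child(name)
            if node is None:
                return None
        return node

    def set(self, path, value=None, replace=True):
        # Create any missing nodes along the path, then store the value
        node = self.root
        for name in path:
            nxt = node.child(name)
            if nxt is None:
                nxt = Node(name)
                node.children.append(nxt)
            node = nxt
        if value is not None:
            if replace:
                node.values = [value]
            elif value not in node.values:
                node.values.append(value)

    def delete(self, path):
        if not path:
            raise ValueError("cannot delete the root node")
        parent = self._find(path[:-1])
        target = parent.child(path[-1]) if parent is not None else None
        if target is None:
            raise KeyError("no such node: " + " ".join(path))
        parent.children.remove(target)

    def exists(self, path):
        return self._find(path) is not None

    def return_values(self, path):
        node = self._find(path)
        return list(node.values) if node is not None else []

tree = ToyTree()
addr = ["interfaces", "dummy", "dum0", "address"]
tree.set(addr, value="192.0.2.1/24", replace=False)
tree.set(addr, value="192.0.2.20/24", replace=False)
tree.set(["interfaces", "dummy", "dum0", "disable"])
tree.delete(["interfaces", "dummy", "dum0", "disable"])
print(tree.return_values(addr))
```

Everything else (validation, sessions, commits) can then be layered on top of operations like these without the tree code knowing about it.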
It needed support for the old config format input and output of course, and to make it accessible from Python scripts, the functionality needed to be made available through a shared library, but I managed to make a working version, which will make it to the next nightly build.
Technical details
The library is somewhat hacky for now. The config formatter, for example, doesn't attempt to sort nodes, and likely cannot handle all possible cases of node nesting. The Python bindings are based on ctypes and dlopen(). The shared library they link with is unnecessarily large due to linking with all libraries that Vyconf needs. Building the library assumes a per-user OPAM setup and has an implicit dependency on Vyconf (and its build dependencies). To avoid the issue with ambiguity introduced by trailing comments, I opted for separating them with a simple finite state machine matcher before passing the actual config parts to the real parser.
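The trailing comment separation can be pictured with a small state machine like the one below. This is a Python sketch of the general technique, not the matcher used in libvyosconfig; the function name, the three states, and the handling of quoted values are all assumptions.

```python
def split_trailing_comments(text):
    """Split config text into (body, trailing comments) with a small state machine."""
    TEXT, QUOTED, COMMENT = range(3)
    state = TEXT
    candidate = None  # start of a comment run that may turn out to be trailing
    i = 0
    while i < len(text):
        ch = text[i]
        if state == TEXT:
            if text.startswith("/*", i):
                if candidate is None:
                    candidate = i
                state = COMMENT
                i += 2
                continue
            if ch == '"':
                state = QUOTED
                candidate = None
            elif not ch.isspace():
                # Real config content follows, so earlier comments were not trailing
                candidate = None
        elif state == QUOTED:
            if ch == "\\":
                i += 2  # skip the escaped character
                continue
            if ch == '"':
                state = TEXT
        else:  # COMMENT
            if text.startswith("*/", i):
                state = TEXT
                i += 2
                continue
        i += 1
    if state != TEXT or candidate is None:
        return text, ""
    return text[:candidate], text[candidate:]

config = "system { host-name vyos \n } /* note */ interfaces { } /* version: 1.2.0 */"
body, trailer = split_trailing_comments(config)
print(repr(body))
print(repr(trailer))
```

Only the body would then go to the real parser, which no longer has to guess whether a comment at the end belongs to a node that follows it.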
You can find the source code of the shared library in libvyosconfig, and the Python bindings in vyos-1x.
Conclusion
Just like any new feature, it needs testing. I'll be testing for missing cases and preparing an API reference, as small as it is, but everyone is invited to play with it in the next 1.2.0 nightly build and send their feedback.