
Installing/Starting Systemd Services Using Cloud-Init


Using cloud-init to bootstrap cloud instances and install custom software/services is common practice today. One thing you often want to do is install the software, enable it to start on boot, and then start it right away so that you don't have to reboot before you can use it.

The Problem

Actually starting a service can be tricky, though. cloud-init configuration and scripts are themselves executed from within a systemd unit, so you end up trying to start one systemd unit from inside another.

To illustrate this, I started a Fedora 22 cloud instance and tried to install and start docker as part of bring-up. The instance I started had the following user-data:

#cloud-config
packages:
  - docker
runcmd:
  - [ systemctl, daemon-reload ]
  - [ systemctl, enable, docker.service ]
  - [ systemctl, start, docker.service ]

After the system came up and some time had passed (it takes a minute for the package to get installed), here is what we are left with:

[root@f22 ~]# pstree -asp 925
systemd,1 --switched-root --system --deserialize 21
  `-cloud-init,895 /usr/bin/cloud-init modules --mode=final
      `-runcmd,898 /var/lib/cloud/instance/scripts/runcmd
          `-systemctl,925 start docker.service
[root@f22 ~]# systemctl status | head -n 5
● f22
    State: starting
     Jobs: 5 queued
   Failed: 0 units
    Since: Tue 2015-08-04 00:49:13 UTC; 30min ago

The systemctl start docker.service command has been launched, but it blocks until the start job finishes, and the job never finishes. As the output above shows, 30 minutes have passed and the system is still in the "starting" state with 5 jobs queued.

I suspect this is because the start command queues the start of the docker service which then waits to be scheduled. It doesn't ever get scheduled, though, because the cloud-final.service unit needs to complete first.
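To spot this condition without reading the status output by eye, the header fields can be pulled out with a couple of small filters. This is only a sketch: the function names `system_state` and `queued_jobs` are made up for illustration, and they assume the `State:` and `Jobs:` header layout shown in the output above.

```shell
# Read `systemctl status` header text on stdin and print the State: value.
system_state() {
    awk '$1 == "State:" { print $2; exit }'
}

# Read the same text on stdin and print the number of queued jobs.
queued_jobs() {
    awk '$1 == "Jobs:" { print $2; exit }'
}

# Example use on a live system (not executed here):
#   systemctl status | head -n 5 | system_state
#   systemctl status | head -n 5 | queued_jobs
#   systemctl list-jobs    # shows which jobs are waiting or running
```

A state that stays at "starting" long after boot, together with a non-zero job count, matches the hang described here.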

The Solution

Is there a way to get the desired behavior? There is an option to systemctl that will cause it to not block during an operation, but rather just queue the action and exit. This is the --no-block option. From the systemctl man page:

    Do not synchronously wait for the requested operation
    to finish. If this is not specified, the job will be
    verified, enqueued and systemctl will wait until it is
    completed. By passing this argument, it is only
    verified and enqueued.
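As a rough sketch of the sequence this implies, the three systemctl calls could be wrapped in one shell function. The function name `enable_and_queue_start` and the `SYSTEMCTL` override variable are hypothetical, introduced here only so the function can be dry-run; the systemctl subcommands and the `--no-block` option are the ones discussed in this post.

```shell
# SYSTEMCTL is a hypothetical override (e.g. SYSTEMCTL=echo for a dry run);
# by default the real systemctl binary is used.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Reload unit files, enable the unit for future boots, then only enqueue
# the start job instead of waiting for it to complete.
enable_and_queue_start() {
    unit="$1"
    "$SYSTEMCTL" daemon-reload &&
    "$SYSTEMCTL" enable "$unit" &&
    "$SYSTEMCTL" start --no-block "$unit"
}

# Example (not executed here):
#   enable_and_queue_start docker.service
```

Because the last call returns as soon as the job is enqueued, a zero exit status only means the request was accepted, not that the service is already running.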

To test this out I just added --no-block to the user-data file that was used previously:

  - docker
  - [ systemctl, daemon-reload ]
  - [ systemctl, enable, docker.service ]
  - [ systemctl, start, --no-block, docker.service ]

After booting the instance, we get a running service:

[root@f22 ~]# systemctl is-active docker
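Since --no-block returns before the service is up, anything that depends on the service has to check for it separately. Below is a minimal polling sketch built on `systemctl is-active --quiet`. The function name `wait_for_active`, the `IS_ACTIVE_CMD` override, and the default of 30 one-second polls are assumptions made for illustration. It is meant to be run after boot has finished. Calling it from runcmd would presumably block in the same way as the original start command.

```shell
# IS_ACTIVE_CMD is a hypothetical override used only for dry runs;
# by default the real `systemctl is-active --quiet` is used.
IS_ACTIVE_CMD="${IS_ACTIVE_CMD:-systemctl is-active --quiet}"

# Poll once per second until the unit is active or the tries run out.
# Returns 0 if the unit became active, 1 otherwise.
wait_for_active() {
    unit="$1"
    tries="${2:-30}"    # assumed default: 30 polls
    while [ "$tries" -gt 0 ]; do
        # Left unquoted on purpose so the command string splits into words.
        if $IS_ACTIVE_CMD "$unit"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Example (not executed here):
#   wait_for_active docker.service 60 && docker info
```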



Capture Elusive cloud-init Debug Output With journalctl

Recently I have been trying to debug some problems with cloud-init in the alpha versions of cloud images for CentOS 7 and Fedora 21. What I have found is that it's not so straightforward to figure out how to set up debug logging.

The defaults (defined in /etc/cloud/cloud.cfg.d/05_logging.cfg) for some reason don't really capture the debug output in /var/log/cloud-init.log. Luckily, though, on systemd-based systems we can get most of that output by using journalctl. There are several services related to cloud-init, and if you want to get the output from all of them you can just use wildcard matching in journalctl (freshly added in ea18a4b) like so:

[root@f21test ~]# journalctl --unit cloud-*
...debug...debug...blah...blah

This worked great in Fedora 21, but in CentOS/RHEL 7 it won't work because wildcard matching is too new for the journalctl shipped there. As a result I found another way to get the same output. It just so happens that the services all use the same executable (/usr/bin/cloud-init), so I was able to match on that instead:

[root@c7test ~]# journalctl /usr/bin/cloud-init
...debug...debug...blah...blah
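The two approaches can be put behind one small helper so the same command works on either kind of system. This is a sketch under a few assumptions: the function name `cloud_init_journal` and the `JOURNALCTL` override variable are hypothetical, and the `-b` (current boot only) and `--no-pager` options are additions of mine rather than part of the commands above. The glob is quoted so the shell passes it to journalctl instead of expanding it against files in the current directory.

```shell
# JOURNALCTL is a hypothetical override (e.g. JOURNALCTL=echo for a dry run);
# by default the real journalctl binary is used.
JOURNALCTL="${JOURNALCTL:-journalctl}"

# Print cloud-init journal entries for the current boot.
#   unit: match all cloud-* units (needs wildcard support in journalctl)
#   exe:  match on the shared executable path (the default)
cloud_init_journal() {
    case "${1:-exe}" in
        unit) "$JOURNALCTL" --no-pager -b --unit 'cloud-*' ;;
        exe)  "$JOURNALCTL" --no-pager -b /usr/bin/cloud-init ;;
        *)    echo "usage: cloud_init_journal [unit|exe]" >&2
              return 2 ;;
    esac
}

# Example (not executed here):
#   cloud_init_journal exe > cloud-init-debug.log
```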

I hope others can find this useful when debugging cloud-init.