Intro
Using cloud-init to bootstrap cloud instances and install custom software and services is common practice today. A typical goal is to install the software, enable it to start on boot, and then start it right away so that you don't have to reboot before using it.
The Problem
Actually starting a service can be tricky, though, because cloud-init configuration and scripts themselves run inside a systemd unit. You are essentially trying to start one systemd unit from within another.
To illustrate this I decided to start a Fedora 22 cloud instance and install/start docker as part of bringup. The instance I started had the following user-data:
#cloud-config
packages:
  - docker
runcmd:
  - [ systemctl, daemon-reload ]
  - [ systemctl, enable, docker.service ]
  - [ systemctl, start, docker.service ]
After the system came up and some time had passed (it takes a minute for the package to get installed), here is what we are left with:
[root@f22 ~]# pstree -asp 925
systemd,1 --switched-root --system --deserialize 21
  `-cloud-init,895 /usr/bin/cloud-init modules --mode=final
      `-runcmd,898 /var/lib/cloud/instance/scripts/runcmd
          `-systemctl,925 start docker.service
[root@f22 ~]# systemctl status | head -n 5
● f22
    State: starting
     Jobs: 5 queued
   Failed: 0 units
    Since: Tue 2015-08-04 00:49:13 UTC; 30min ago
The systemctl start docker.service command has been started but is blocked, waiting for the start job to finish, and it never does. As the output above shows, 30 minutes have passed and the system is still in the starting state with 5 jobs queued.
I suspect this is because the start command queues a start job for the docker service, which then waits to be scheduled. It never gets scheduled because the cloud-final.service unit needs to complete first, and cloud-final.service cannot complete because its runcmd script is still waiting on that very systemctl call.
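One way to check this suspicion on a stuck instance is to look at systemd's job queue. The following is a minimal sketch, not something from the original test run: waiting_jobs is a hypothetical helper name, and it assumes that systemctl list-jobs --no-legend prints the columns JOB, UNIT, TYPE and STATE in that order.

```shell
# Hypothetical helper: print the units whose jobs are queued but not yet running.
# Assumes each line of `systemctl list-jobs --no-legend` is: JOB UNIT TYPE STATE
waiting_jobs() {
    systemctl list-jobs --no-legend --no-pager |
        awk '$4 == "waiting" { print $2 }'
}
```

Running waiting_jobs from a second shell on the hung instance lists the units that are still queued. If the explanation above is right, docker.service should be among them.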
The Solution
Is there a way to get the desired behavior? There is an option to systemctl that will cause it to not block during an operation, but rather just queue the action and exit. This is the --no-block option. From the systemctl man page:
--no-block
    Do not synchronously wait for the requested operation to finish. If this is not specified, the job will be verified, enqueued and systemctl will wait until it is completed. By passing this argument, it is only verified and enqueued.
To test this out I just added --no-block to the user-data file that was used previously:
#cloud-config
packages:
  - docker
runcmd:
  - [ systemctl, daemon-reload ]
  - [ systemctl, enable, docker.service ]
  - [ systemctl, start, --no-block, docker.service ]
And after booting the instance we get a running service:
[root@f22 ~]# systemctl is-active docker
active
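Because --no-block returns before the service is actually up, anything that depends on docker has to check for itself. Below is a minimal sketch of a polling helper. The name wait_for_active and the 60-second default are hypothetical, not part of the original setup. It is meant to be run after boot, for example from an SSH session or a provisioning script, since waiting inside runcmd would presumably run into the same blocking described earlier.

```shell
# Hypothetical helper: poll until a unit is active or a timeout (in seconds) expires.
# Usage: wait_for_active docker.service 120
wait_for_active() {
    unit=$1
    timeout=${2:-60}   # assumed default, not taken from the original post
    elapsed=0
    # `systemctl is-active --quiet` exits 0 only when the unit is active.
    until systemctl is-active --quiet "$unit"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "$unit not active after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

The function returns 0 as soon as the unit reports active and 1 if the timeout passes first, so it can gate later steps, as in wait_for_active docker.service 120 && docker info.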