Sharing a Go library with Python using CFFI

Disclaimer

I am not a Go expert, so I may not be able to answer questions you may have about this process. I am simply trying to reproduce and document what I saw. Additionally, it was recommended not to do this for long-running processes, because you'll end up with two runtimes (Go's and Python's) that may eventually work against each other (garbage collection, etc.). Be wary.

Introduction

Back in July I went to Go Camp, which was part of Open Camps in NYC. It was an informal setting, which actually made it much more comfortable for someone relatively new to the Go programming language.

At the conference Filippo Valsorda (@FiloSottile) did an off-the-cuff session where he took a Go function and created a C shared library (.so) out of it (a capability added in Go 1.5). He then took the C shared library and generated a shared object that can be imported directly into Python and executed.

Filippo actually has an excellent blog post on this process already. The one difference between his existing blog post and what he presented at Go Camp is that he used CFFI (as opposed to defining CPython extensions) to generate the shared object files that could then be imported directly into Python.

In an attempt to recreate what Filippo did, I created a git repo and I'll use that to demonstrate the process that was shown to us at Go Camp. If desired, there is also an archive of that git repo here.

System Set Up

I ran this on a Fedora 24 system. To set a Fedora 24 system up from scratch I ran:

$ sudo dnf install golang git python2 python2-cffi python2-devel redhat-rpm-config

I then set my GOPATH. On your system you may already have your GOPATH set:

$ export GOPATH=~/go

Then I cloned the git repo with the example code for this blog post and changed into that directory:

$ go get github.com/dustymabe/go2python-example
$ cd $GOPATH/src/github.com/dustymabe/go2python-example/

Hello From Go

Now we can look at the file that contains the function we want to export to Python:

$ cd _Go
$ cat hello.go 
package main

import "C"
import "fmt"
import "math"

//export Hello
func Hello() {
   fmt.Printf("Hello! The square root of 4 is: %g\n", math.Sqrt(4))
}

func main() {
    Hello()
}

In this file you can see that we are importing a few standard libraries and then defining a function that prints some text to the terminal.

The import "C" is part of cgo, and the //export Hello comment right before the function declaration is where we tell Go that we want to export the Hello function into the shared library.

Before we generate a shared library, let's test to see if the code compiles and runs:

$ go build -o hello .
$ ./hello 
Hello! The square root of 4 is: 2

Now let's generate a shared library:

$ go build -buildmode=c-shared -o hello.so .
$ ls
hello hello.go  hello.h  hello.so  README.md  vendor
$ file hello.so
hello.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=ecf1770f0897ca064aab8dacbcb5f7c2f688f34d, not stripped

The hello.so and hello.h files were generated by the Go build. The .so is the shared library that we can now use in C.

Hello From C

We could jump directly to Python now, but we are going to take a detour and just make sure we can get the shared object we just created to work in a simple C program. Here we go:

$ cd ../_C
$ tree .
.
├── hello.c
├── hello.h -> ../_Go/hello.h
├── hello.so -> ../_Go/hello.so
└── README.md

0 directories, 4 files
$
$ cat hello.c
#include "hello.h"

void main() {
     Hello();
}
$ gcc hello.c hello.so 
$ LD_LIBRARY_PATH=$(pwd) ./a.out                                                                                                                                                           
Hello! The square root of 4 is: 2

What did we just do there? Well, we can see that hello.h and hello.so are symlinked to the files that were just created by the Go compiler. Then we have a simple C program that includes the hello.h header file and calls the Hello() function. We compile that C program and run it.

We set LD_LIBRARY_PATH to $(pwd) so that the runtime shared library loader can find the shared object (hello.so) when the program runs.

So... It worked! Everything looks good in C land.
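
As an aside, you can also call into hello.so directly from Python without compiling anything, using the standard library's ctypes module. This wasn't part of the session, so treat it as just a quick sanity check sketch (run it from the directory that contains hello.so):

#!/usr/bin/python
# Quick sanity check (not the CFFI approach used below): load the
# Go-built shared library directly with ctypes and call Hello().
import ctypes

hello = ctypes.CDLL("./hello.so")   # assumes hello.so is in the current directory
hello.Hello()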

Hello From Python

For Python we'll first generate the shared object that can be imported directly into Python (just like any .py file). To do this we are using CFFI. A good example that is close to what we are doing here is in the CFFI API Mode documentation.

Here is the file we are using:

$ cd ../_Python/
$ tree .
.
├── hello_ffi_builder.py
├── hello.h -> ../_Go/hello.h
├── hello.py
├── hello.so -> ../_Go/hello.so
└── README.md

0 directories, 5 files
$
$ cat hello_ffi_builder.py 
#!/usr/bin/python
from cffi import FFI
ffibuilder = FFI()

ffibuilder.set_source("pyhello",
    """ //passed to the real C compiler
        #include "hello.h"
    """,
    extra_objects=["hello.so"])

ffibuilder.cdef("""
    extern void Hello();
    """)

if __name__ == "__main__":
    ffibuilder.compile(verbose=True)

The ffibuilder.set_source("pyhello", ... call sets the name of the module that will get created (pyhello.so) and defines the C code that gets passed to the real C compiler. Additionally, it specifies extra objects to link in (hello.so). The ffibuilder.cdef call declares the functions (and types) that will be accessible from Python; in this case extern void Hello();, so we are simply re-declaring what is already defined in hello.so.

Let's run it and see what happens:

$ ./hello_ffi_builder.py 
running build_ext
building 'pyhello' extension
gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python2.7 -c pyhello.c -o ./pyhello.o
gcc -pthread -shared -Wl,-z,relro -specs=/usr/lib/rpm/redhat/redhat-hardened-ld ./pyhello.o hello.so -L/usr/lib64 -lpython2.7 -o ./pyhello.so
$ ls pyhello.* 
pyhello.c  pyhello.o  pyhello.so
$ file pyhello.so 
pyhello.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=9a2670b5d287fe80180b158a61ea3e35086e89d7, not stripped

OK. A program (pyhello.c) was generated and then compiled into a shared object (pyhello.so). Can we use it?

Here is the hello.py file that imports the library from pyhello.so and then runs the Hello() function:

$ cat hello.py 
#!/usr/bin/python

from pyhello import ffi, lib
lib.Hello()

Does it work?:

$ LD_LIBRARY_PATH=$(pwd) ./hello.py                                                                                                                                                   
Hello! The square root of 4 is: 2

You bet!
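
As a closing aside, and similar in spirit to the ctypes check earlier, CFFI also has an ABI mode (ffi.dlopen) that skips the C compiler entirely. It isn't what Filippo demonstrated, and API mode is generally the more robust choice, but a minimal sketch looks something like this (again assuming hello.so is in the current directory):

#!/usr/bin/python
# CFFI ABI mode: declare the function and dlopen the shared object directly;
# no generated pyhello.c/pyhello.so is needed.
from cffi import FFI

ffi = FFI()
ffi.cdef("extern void Hello();")
lib = ffi.dlopen("./hello.so")
lib.Hello()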

- Dusty

Booting Lenovo T460s after Fedora 24 Updates

Introduction

I recently picked up a new Lenovo T460s work laptop. It is fairly light and has 20G of memory, which is great for running Virtual Machines. One of the first things I did on this new laptop was install Fedora 24 onto the system. After installing from the install media I was up and running and humming along.

Soon after installing I updated the system to get all bugfixes and security patches. I updated everything and rebooted and:

Probing EDD (edd=off to disable)... OK

Yep, the system wouldn't boot. The GRUB menu would appear and I could select different entries, but no matter which one I chose the system got no further than the message above.

The Problem

It was time to hit the internet. After some futile Google searches I decided to ask on Fedora's devel list. I got a quick response that led to two bug reports (1351943, 1353103) about the issue. It turns out that the newer kernel + microcode I had updated to had some sort of incompatibility with the existing system firmware. According to the comments from 1353103, the microcode was most likely making an assumption about the firmware that wasn't true.

The Solution

The TL;DR on the fix for this is that you need to update the BIOS on the T460s. I did this by going to the T460s support page (at the time of this writing the link is here) and downloading the ISO image with the BIOS update (n1cur06w.iso).

Now what was I going to do with an ISO image? The laptop doesn't have a cd-rom drive, so burning a cd wouldn't help us one bit. There is a helpful article about how to take this ISO image and update the BIOS.

Here are the steps that I took.

  • First, boot the laptop into the older kernel so you can actually get to a Linux environment.
  • Next, install the geteltorito software so that we can create an image to dd onto the USB key.
$ sudo dnf install geteltorito 
  • Next, use the software to create the image and write it to a flash drive. Note: /dev/sdg was the USB key on my system. Please be sure to change that to the device corresponding to your USB key.
$ geteltorito -o bios.img n1cur06w.iso 
Booting catalog starts at sector: 20 
Manufacturer of CD: NERO BURNING ROM
Image architecture: x86
Boot media type is: harddisk
El Torito image starts at sector 27 and has 47104 sector(s) of 512 Bytes

Image has been written to file "bios.img".

$
$ sudo dd if=bios.img of=/dev/sdg bs=1M
23+0 records in
23+0 records out
24117248 bytes (24 MB, 23 MiB) copied, 1.14576 s, 21.0 MB/s
  • Next, reboot the laptop.
  • After the Lenovo logo appears press ENTER.
  • Press F12 to make your laptop boot from something other than your HDD.
  • Select the USB stick.
  • Make sure your laptop has its power supply plugged in. (It will refuse to update otherwise.)
  • Follow the instructions.
  • Select number 2: Update system program
  • Once finished reboot the laptop and remove the USB key.

Done! I was now happily booting Fedora again. I hope this helps others who hit the same problem!

- Dusty

Non Deterministic docker Networking and Source Based IP Routing

Introduction

In docker 1.9 the open source docker engine introduced a new networking model that enabled the creation of separate "networks" for containers to be attached to. This, however, can lead to a nasty little problem where a port that is supposed to be exposed on the host isn't accessible from the outside. There are a few bug reports related to this issue.

Cause

This problem happens because docker wires up all of these containers to each other and the various "networks" using port forwarding/NAT via iptables. Let's take a popular example application which exhibits the problem, the Docker 3rd Birthday Application, and show what the problem is and why it happens.

We'll clone the git repo first and then check out the latest commit as of 2016-05-25:

# git clone https://github.com/docker/docker-birthday-3
# cd docker-birthday-3/
# git checkout 'master@{2016-05-25}'
...
HEAD is now at 4f2f1c9... Update Dockerfile

Next we'll bring up the application:

# cd example-voting-app/
# docker-compose up -d 
Creating network "examplevotingapp_front-tier" with the default driver
Creating network "examplevotingapp_back-tier" with the default driver
Creating db
Creating redis
Creating examplevotingapp_voting-app_1
Creating examplevotingapp_worker_1
Creating examplevotingapp_result-app_1

So this created two networks and brought up several containers to host our application. Let's poke around to see what's there:

# docker network ls
NETWORK ID          NAME                          DRIVER
23c96b2e1fe7        bridge                        bridge              
cd8ecb4c0556        examplevotingapp_front-tier   bridge              
5760e64b9176        examplevotingapp_back-tier    bridge              
bce0f814fab1        none                          null                
1b7e62bcc37d        host                          host
#
# docker ps -a --format "table {{.Names}}\t{{.Image}}\t{{.Ports}}"
NAMES                           IMAGE                         PORTS
examplevotingapp_result-app_1   examplevotingapp_result-app   0.0.0.0:5001->80/tcp
examplevotingapp_voting-app_1   examplevotingapp_voting-app   0.0.0.0:5000->80/tcp
redis                           redis:alpine                  0.0.0.0:32773->6379/tcp
db                              postgres:9.4                  5432/tcp
examplevotingapp_worker_1       manomarks/worker              

So two networks were created and the containers running the application were brought up. Looks like we should be able to connect to the examplevotingapp_voting-app_1 application on the host port 5000 that is bound to all interfaces. Does it work?:

# ip -4 -o a
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
2: eth0    inet 192.168.121.98/24 brd 192.168.121.255 scope global dynamic eth0\       valid_lft 2921sec preferred_lft 2921sec
3: docker0    inet 172.17.0.1/16 scope global docker0\       valid_lft forever preferred_lft forever
106: br-cd8ecb4c0556    inet 172.18.0.1/16 scope global br-cd8ecb4c0556\       valid_lft forever preferred_lft forever
107: br-5760e64b9176    inet 172.19.0.1/16 scope global br-5760e64b9176\       valid_lft forever preferred_lft forever
#
# curl --connect-timeout 5 192.168.121.98:5000 &>/dev/null && echo success || echo failure
failure
# curl --connect-timeout 5 127.0.0.1:5000 &>/dev/null && echo success || echo failure
success

Does it work? Yes and no?

That's right. There is something complicated going on with the networking here. I can connect from localhost but can't connect to the public IP of the host. Docker wires things up in iptables so that things can go into and out of containers following a strict set of rules; see the iptables output if you are interested. This works fine if you only have one network interface per container but can break down when you have multiple interfaces attached to a container.

Let's jump into the examplevotingapp_voting-app_1 container and check out some of the networking:

# docker exec -it examplevotingapp_voting-app_1 /bin/sh
/app # ip -4 -o a
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
112: eth1    inet 172.18.0.2/16 scope global eth1\       valid_lft forever preferred_lft forever
114: eth0    inet 172.19.0.4/16 scope global eth0\       valid_lft forever preferred_lft forever
/app # 
/app # ip route show
default via 172.19.0.1 dev eth0 
172.18.0.0/16 dev eth1  src 172.18.0.2 
172.19.0.0/16 dev eth0  src 172.19.0.4

So there is a clue. We have two interfaces, but our default route goes out of eth0 on the 172.19.0.0/16 network. It just so happens that our iptables rules (see the linked iptables output above) performed DNAT for tcp dpt:5000 to:172.18.0.2:80. So traffic from the outside is going to come into this container on the eth1 interface but leave via the eth0 interface, which doesn't play nice with the iptables rules docker has set up.

We can prove that here by asking what route we will take when a packet leaves the machine:

/app # ip route get 10.10.10.10 from 172.18.0.2
10.10.10.10 from 172.18.0.2 via 172.19.0.1 dev eth0

That basically means reply traffic will leave via eth0 even though the request came in on eth1. The Docker documentation was updated to try to explain the behavior when multiple interfaces are attached to a container in this git commit.

Test Out Theory Using Source Based IP Routing

To test out this theory we can use source-based IP routing (some reading on that here). Basically the idea is that we create policy rules that make IP traffic leave on the same interface it came in on.

To perform the test we'll need our container to be privileged so we can add routes. Modify the docker-compose.yml to add privileged: true to the voting-app:

services:
  voting-app:
    build: ./voting-app/.
    volumes:
     - ./voting-app:/app
    ports:
      - "5000:80"
    networks:
      - front-tier
      - back-tier
    privileged: true

Take down and bring up the application:

# docker-compose down
...
# docker-compose up -d
...

Exec into the container and create a new policy rule for packets originating from the 172.18.0.0/16 network. Tell packets matching this rule to look up routing table 200:

# docker exec -it examplevotingapp_voting-app_1 /bin/sh
/app # ip rule add from 172.18.0.0/16 table 200

Now add a default route for 172.18.0.1 to routing table 200. Show the routing table after that and the rules as well:

/app # ip route add default via 172.18.0.1 dev eth1 table 200
/app # ip route show table 200
default via 172.18.0.1 dev eth1
/app # ip rule show
0:      from all lookup local 
32765:  from 172.18.0.0/16 lookup 200 
32766:  from all lookup main 
32767:  from all lookup default

Now ask the kernel where a packet originating from our 172.18.0.2 address will get sent:

/app # ip route get 10.10.10.10 from 172.18.0.2
10.10.10.10 from 172.18.0.2 via 172.18.0.1 dev eth1

And finally, go back to the host and check to see if everything works now:

# curl --connect-timeout 5 192.168.121.98:5000 &>/dev/null && echo success || echo failure
success
# curl --connect-timeout 5 127.0.0.1:5000 &>/dev/null && echo success || echo failure
success

Success!!

I don't know if source-based routing can be incorporated into docker to fix this problem or if there is a better solution. I guess we'll have to wait and find out.

Enjoy!

Dusty

NOTE I used the following versions of software for this blog post:

# rpm -q docker docker-compose kernel-core
docker-1.10.3-10.git8ecd47f.fc24.x86_64
docker-compose-1.7.0-1.fc24.noarch
kernel-core-4.5.4-300.fc24.x86_64

Fedora BTRFS+Snapper – The Fedora 24 Edition

History

In the past I have configured my personal computers to be able to snapshot and roll back the entire system. To do this I am leveraging the BTRFS filesystem, a tool called snapper, and a patched version of Fedora's grub2 package. The patches needed for grub2 come from the SUSE guys and are documented well in this git repo.

This setup is not new. I have fully documented the steps I took in the past for my Fedora 22 systems in two blog posts: part1 and part2. This is a condensed continuation of those posts for Fedora 24.

NOTE: I'm using Fedora 24 alpha, but everything should be the same for the released version of Fedora 24.

Setting up System with LUKS + LVM + BTRFS

The manual steps for setting up the system are detailed in the part1 blog post from Fedora 22. This time around I have created a script that will quickly configure the system with LUKS + LVM + BTRFS. The script will need to be run in an Anaconda environment just like the manual steps were done in part1 last time.

You can easily enable ssh access to your Anaconda booted machine by adding inst.sshd to the kernel command line arguments. After booting up you can scp the script over and then execute it to build the system. Please read over the script and modify it to your liking.

Alternatively, for an automated install I have embedded that same script into a kickstart file that you can use. The kickstart file doesn't really leverage Anaconda at all because it simply runs a %pre script and then reboots the box. It's basically like just telling Anaconda to run a bash script, but allows you to do it in an automated way. None of the kickstart directives at the top of the kickstart file actually get used.

Installing and Configuring Snapper

After the system has booted for the first time, let's configure it for doing snapshots. I still want to be able to track how much space each snapshot takes, so I'll go ahead and enable quota support on BTRFS. I covered how to do this in a previous post:

[root@localhost ~]# btrfs quota enable /
[root@localhost ~]# btrfs qgroup show /
qgroupid         rfer         excl 
--------         ----         ---- 
0/5           1.08GiB      1.08GiB

Next up is installing/configuring snapper. I am also going to install the dnf plugin for snapper so that rpm transactions will automatically get snapshotted:

[root@localhost ~]# dnf install -y snapper python3-dnf-plugins-extras-snapper
...
Complete!
[root@localhost ~]# snapper --config=root create-config /
[root@localhost ~]# snapper ls
Type   | # | Pre # | Date | User | Cleanup | Description | Userdata
-------+---+-------+------+------+---------+-------------+---------
single | 0 |       |      | root |         | current     |         
[root@localhost ~]# snapper list-configs
Config | Subvolume
-------+----------
root   | /        
[root@localhost ~]# btrfs subvolume list /
ID 259 gen 57 top level 5 path .snapshots

So we used the snapper command to create a configuration for the BTRFS filesystem mounted at /. As part of this process we can see from the btrfs subvolume list / command that snapper also created a .snapshots subvolume. This subvolume will be used to house the COW snapshots that are taken of the system.

Next, we'll work around a bug that is causing snapper to set the wrong SELinux context on the .snapshots directory:

[root@localhost ~]# restorecon -v /.snapshots/
restorecon reset /.snapshots context system_u:object_r:unlabeled_t:s0->system_u:object_r:snapperd_data_t:s0

Finally, we'll add an entry to fstab so that regardless of what subvolume we are actually booted in we will always be able to view the .snapshots subvolume and all nested subvolumes (snapshots):

[root@localhost ~]# echo '/dev/vgroot/lvroot /.snapshots btrfs subvol=.snapshots 0 0' >> /etc/fstab

Taking Snapshots

OK, now that we have snapper installed and the .snapshots subvolume in /etc/fstab we can start creating snapshots:

[root@localhost ~]# btrfs subvolume get-default /
ID 5 (FS_TREE)
[root@localhost ~]# snapper create --description "BigBang"
[root@localhost ~]# snapper ls
Type   | # | Pre # | Date                            | User | Cleanup | Description | Userdata
-------+---+-------+---------------------------------+------+---------+-------------+---------
single | 0 |       |                                 | root |         | current     |         
single | 1 |       | Sat 23 Apr 2016 01:04:51 PM UTC | root |         | BigBang     |         
[root@localhost ~]# btrfs subvolume list /
ID 259 gen 64 top level 5 path .snapshots
ID 260 gen 64 top level 259 path .snapshots/1/snapshot
[root@localhost ~]# ls /.snapshots/1/snapshot/
bin  boot  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var

We made our first snapshot, called BigBang, and then ran btrfs subvolume list / to verify that a new snapshot was actually created. Notice at the top of the output that we also ran btrfs subvolume get-default /, which shows the currently set default subvolume for the BTRFS filesystem. Right now we are booted into the root subvolume, but that will change as soon as we decide we want to use one of the snapshots for rollback.

Since we took a snapshot let's go ahead and make some changes to the system by updating the kernel:

[root@localhost ~]# dnf update -y kernel
...
Complete!
[root@localhost ~]# rpm -q kernel
kernel-4.5.0-0.rc7.git0.2.fc24.x86_64
kernel-4.5.2-300.fc24.x86_64
[root@localhost ~]# snapper ls
Type   | # | Pre # | Date                            | User | Cleanup | Description                   | Userdata
-------+---+-------+---------------------------------+------+---------+-------------------------------+---------
single | 0 |       |                                 | root |         | current                       |         
single | 1 |       | Sat 23 Apr 2016 01:04:51 PM UTC | root |         | BigBang                       |         
single | 2 |       | Sat 23 Apr 2016 01:08:18 PM UTC | root | number  | /usr/bin/dnf update -y kernel |

So we updated the kernel and the snapper dnf plugin automatically created a snapshot for us. Let's reboot the system and see if the new kernel boots properly:

[root@localhost ~]# reboot 
...
[dustymabe@media ~]$ ssh root@192.168.122.188 
Warning: Permanently added '192.168.122.188' (ECDSA) to the list of known hosts.
root@192.168.122.188's password: 
Last login: Sat Apr 23 12:18:55 2016 from 192.168.122.1
[root@localhost ~]# 
[root@localhost ~]# uname -r
4.5.2-300.fc24.x86_64

Rolling Back

Say we don't like that new kernel. Let's go back to the earlier snapshot we made:

[root@localhost ~]# snapper rollback 1
Creating read-only snapshot of current system. (Snapshot 3.)
Creating read-write snapshot of snapshot 1. (Snapshot 4.)
Setting default subvolume to snapshot 4.
[root@localhost ~]# reboot

snapper created a read-only snapshot of the current system and then a new read-write subvolume based on the snapshot we wanted to go back to. It then set the default subvolume to the newly created read-write subvolume. After reboot you'll be in that read-write subvolume, exactly back in the state your system was in at the time the snapshot was created.

In our case, after reboot we should now be booted into snapshot 4 as indicated by the output of the snapper rollback command above and we should be able to inspect information about all of the snapshots on the system:

[root@localhost ~]# btrfs subvolume get-default /
ID 263 gen 87 top level 259 path .snapshots/4/snapshot
[root@localhost ~]# snapper ls
Type   | # | Pre # | Date                            | User | Cleanup | Description                   | Userdata
-------+---+-------+---------------------------------+------+---------+-------------------------------+---------
single | 0 |       |                                 | root |         | current                       |         
single | 1 |       | Sat 23 Apr 2016 01:04:51 PM UTC | root |         | BigBang                       |         
single | 2 |       | Sat 23 Apr 2016 01:08:18 PM UTC | root | number  | /usr/bin/dnf update -y kernel |         
single | 3 |       | Sat 23 Apr 2016 01:17:43 PM UTC | root |         |                               |         
single | 4 |       | Sat 23 Apr 2016 01:17:43 PM UTC | root |         |                               |         
[root@localhost ~]# ls /.snapshots/
1  2  3  4
[root@localhost ~]# btrfs subvolume list /
ID 259 gen 88 top level 5 path .snapshots
ID 260 gen 81 top level 259 path .snapshots/1/snapshot
ID 261 gen 70 top level 259 path .snapshots/2/snapshot
ID 262 gen 80 top level 259 path .snapshots/3/snapshot
ID 263 gen 88 top level 259 path .snapshots/4/snapshot

And the big test is to see if the change we made to the system was actually reverted:

[root@localhost ~]# uname -r
4.5.0-0.rc7.git0.2.fc24.x86_64
[root@localhost ~]# rpm -q kernel
kernel-4.5.0-0.rc7.git0.2.fc24.x86_64

Enjoy!

Dusty

Vagrant: Sharing Folders with vagrant-sshfs

cross posted from this fedora magazine post

Introduction

We're trying to focus more on developer experience in the Red Hat ecosystem. In the process we've started to incorporate Vagrant into our standard offerings. As part of that effort, we're seeking a shared folder solution that doesn't require a bunch of if/else logic to figure out exactly which type you should use based on the OS/hypervisor you run under Vagrant.

The current options for Vagrant shared folder support can make you want to tear your hair out when you try to figure out which one you should use in your environment. This led us to look for a better answer for the user, so they no longer have to make these choices on their own based on their environment.

Current Synced Folder Solutions

"So what is the fuss about? Is it really that hard?" Well it's certainly doable, but we want it to be easier. Here are the currently available synced folder options within vagrant today:

  • virtualbox
    • This synced folder type uses a kernel module from the VirtualBox Guest Additions software to talk to the hypervisor. It requires you to be running on top of the VirtualBox hypervisor and to have the VirtualBox Guest Additions installed in the Vagrant box you launch. Licensing can also make distribution of the compiled Guest Additions problematic.
    • Hypervisor Limitation: VirtualBox
    • Host OS Limitation: None
  • nfs
    • This synced folder type uses NFS mounts. It requires you to be running on top of a Linux or Mac OS X host.
    • Hypervisor Limitation: None
    • Host OS Limitation: Linux, Mac
  • smb
    • This synced folder type uses Samba mounts. It requires you to be running on top of a Windows host and to have Samba client software in the guest.
    • Hypervisor Limitation: None
    • Host OS Limitation: Windows
  • 9p
    • This synced folder implementation uses 9p file sharing within the libvirt/KVM hypervisor. It requires the hypervisor to be libvirt/KVM and thus also requires Linux to be the host OS.
    • Hypervisor Limitation: Libvirt
    • Host OS Limitation: Linux
  • rsync
    • This synced folder implementation simply syncs folders between host and guest using rsync. Unfortunately this isn't actually shared folders, because the files are simply copied back and forth and can become out of sync.
    • Hypervisor Limitation: None
    • Host OS Limitation: None

So depending on your environment you are rather limited in which options work. You have to choose carefully to get something working without much hassle.

What About SSHFS?

As part of this discovery process I had a simple question: "why not sshfs?" It turns out that Fabio Kreusch had a similar idea a while back and wrote a plugin to do mounts via SSHFS.

When I first found this I was excited because I thought I had the answer in my hands and someone had already written it! Unfortunately the old implementation wasn't written as a synced folder plugin like all of the other synced folder types within Vagrant; in other words, it didn't inherit the synced folder class and implement its functions. It also, by default, mounted a guest folder onto the host rather than the other way around, as most synced folder implementations do.

One goal I have is to actually have SSHFS be a supported synced folder plugin within Vagrant and possibly get it committed back up into Vagrant core one day. So I reached out to Fabio to find out if he would be willing to accept patches to get things working more along the lines of a traditional synced folder plugin. He kindly let me know he didn't have much time to work on vagrant-sshfs these days, and he no longer used it. I volunteered to take over.

The vagrant-sshfs Plugin

To make the plugin follow the traditional synced folder plugin model I decided to rewrite it. I based most of the rewritten code on the NFS synced folder plugin code. The new code repo is here on GitHub.

So now we have a plugin that will do SSHFS mounts of host folders into the guest. It requires no setup on the host other than having the sftp-server software available. sftp-server is usually provided by OpenSSH and thus is easily available on Windows/Mac/Linux.

To compare with the other implementations on environment restrictions here is what the SSHFS implementation looks like:

  • sshfs
    • This synced folder implementation uses SSHFS to share folders between host and guest. The only requirement is that the sftp-server executable exist on the host.
    • Hypervisor Limitation: None
    • Host OS Limitation: None

Here are the overall benefits of using vagrant-sshfs:

  • Works on any host platform
    • Windows, Linux, Mac OS X
  • Works on any type-2 hypervisor
    • VirtualBox, Libvirt/KVM, Hyper-V, VMWare
  • Seamlessly Works on Remote Vagrant Solutions
    • Works with vagrant-aws, vagrant-openstack, etc..

Where To Get The Plugin

This plugin is hot off the presses, so it hasn't quite made it into Fedora yet. There are a few ways you can get it though. First, you can use Vagrant itself to retrieve the plugin from rubygems:

$ vagrant plugin install vagrant-sshfs

Alternatively you can get the RPM package from my copr:

$ sudo dnf copr enable dustymabe/vagrant-sshfs
$ sudo dnf install vagrant-sshfs

Your First vagrant-sshfs Mount

To use the plugin, you must tell Vagrant what folder you want mounted into the guest and where, by adding it to your Vagrantfile. An example Vagrantfile is below:

Vagrant.configure(2) do |config|
  config.vm.box = "fedora/23-cloud-base"
  config.vm.synced_folder "/path/on/host", "/path/on/guest", type: "sshfs"
end

This will start a Fedora 23 base cloud image and will mount the /path/on/host directory from the host into the running vagrant box under the /path/on/guest directory.

Conclusion

We've tried to find the option that is easiest for the user to configure. While SSHFS may have some drawbacks as compared to the others, such as speed, we believe it solves most people's use cases and is dead simple to configure out of the box.

Please give it a try and let us know how it works for you! Drop us a mail or open an issue on GitHub.

Cheers!
Dusty

802.11ac on Linux With NetGear A6100 (RTL8811AU) USB Adapter

NOTE: Most of the content from this post comes from a blog post I found that concentrated on getting the driver set up on Fedora 21. I did mostly the same steps with a few tweaks.

Intro

Driver support for 802.11ac in Linux is spotty, especially if you are using a USB adapter. I picked up the NetGear A6100, which has the Realtek RTL8811AU chip inside of it. Of course, when I plug it in I can see the device, but there is no support for it in the kernel I'm using (kernel-4.2.8-200.fc22.x86_64).

On my system I currently have the built-in wireless adapter as well as the USB adapter plugged in. From the output below you can see only one wireless NIC shows up:

# lspci | grep Network
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
03:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 3e)
$ lsusb | grep NetGear
Bus 001 Device 002: ID 0846:9052 NetGear, Inc. A6100 AC600 DB Wireless Adapter [Realtek RTL8811AU]
# ip -o link | grep wlp | cut -d ' ' -f 2
wlp3s0:

The wlp3s0 is my built-in wifi device.

The Realtek Driver

You can find the Realtek driver for the RTL8811AU on the sites of several of the many vendors that incorporate the chip.

These drivers vary in version and often don't compile on newer kernels, which can be frustrating when you just want something to work.

Getting The RTL8811AU Driver Working

Luckily some people in the community unofficially manage code repositories with fixes to the Realtek driver to allow it to compile on newer kernels. From the blog post I mentioned earlier there is a linked GitHub repository where the Realtek driver is reproduced with some patches on top. This repository makes it pretty easy to get set up and get the USB adapters working on Linux.

NOTE: In case the git repo moves in the future I have copied an archive of it here for posterity. The commit id at HEAD at the time of this writing is 9fc227c2987f23a6b2eeedf222526973ed7a9d63.

The first step is to set up your system for DKMS so that you don't have to recompile the kernel module every time you install a new kernel. To do this, install the following packages and set the dkms service to start on boot:

# dnf install -y dkms kernel-devel-$(uname -r)
# systemctl enable dkms

Next, clone the git repository and observe the version of the driver:

# mkdir /tmp/rtldriver && cd /tmp/rtldriver
# git clone https://github.com/Grawp/rtl8812au_rtl8821au.git
# cat rtl8812au_rtl8821au/include/rtw_version.h 
#define DRIVERVERSION   "v4.3.14_13455.20150212_BTCOEX20150128-51"
#define BTCOEXVERSION   "BTCOEX20150128-51"

From the output we can see this is the 4.3.14_13455.20150212 version of the driver, which is fairly recent.

Next let's create a directory under /usr/src for the source code to live and copy it into place:

# mkdir /usr/src/8812au-4.3.14_13455.20150212
# cp -R  ./rtl8812au_rtl8821au/* /usr/src/8812au-4.3.14_13455.20150212/

Next we'll create a dkms.conf file which will tell DKMS how to manage building this module when builds/installs are requested; run man dkms to view more information on these settings:

# cat <<'EOF' > /usr/src/8812au-4.3.14_13455.20150212/dkms.conf
PACKAGE_NAME="8812au"
PACKAGE_VERSION="4.3.14_13455.20150212"
BUILT_MODULE_NAME[0]="8812au"
DEST_MODULE_LOCATION[0]="/kernel/drivers/net/wireless"
AUTOINSTALL="yes"
MAKE[0]="'make' all KVER=${kernelver}"
CLEAN="'make' clean"
EOF

Note one change from the earlier blog post: I include KVER=${kernelver} on the make line. Without it, the Makefile detects the kernel by calling uname, which is wrong when the module is built during a new kernel installation because the new kernel is not yet running; every time a new kernel was installed the driver would get compiled for the previous kernel (the one that was running at the time of installation).

The next step is to add the module to the DKMS system and go ahead and build it:

# dkms add -m 8812au -v 4.3.14_13455.20150212

Creating symlink /var/lib/dkms/8812au/4.3.14_13455.20150212/source ->
                 /usr/src/8812au-4.3.14_13455.20150212

DKMS: add completed.
# dkms build -m 8812au -v 4.3.14_13455.20150212

Kernel preparation unnecessary for this kernel.  Skipping...

Building module:
cleaning build area...
'make' all KVER=4.2.8-200.fc22.x86_64......................
cleaning build area...

DKMS: build completed.

And finally install it:

# dkms install -m 8812au -v 4.3.14_13455.20150212

8812au:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/4.2.8-200.fc22.x86_64/extra/
Adding any weak-modules

depmod....

DKMS: install completed.

Now we can load the module and see information about it:

# modprobe 8812au
# modinfo 8812au | head -n 3
filename:       /lib/modules/4.2.8-200.fc22.x86_64/extra/8812au.ko
version:        v4.3.14_13455.20150212_BTCOEX20150128-51
author:         Realtek Semiconductor Corp.

Does the wireless NIC work now? After connecting to an AC-only network, here are the results:

# ip -o link | grep wlp | cut -d ' ' -f 2
wlp3s0:
wlp0s20u2:
# iwconfig wlp0s20u2
wlp0s20u2  IEEE 802.11AC  ESSID:"random"  Nickname:"<WIFI@REALTEK>"
          Mode:Managed  Frequency:5.26 GHz  Access Point: A8:BB:B7:EE:B6:8D   
          Bit Rate:87 Mb/s   Sensitivity:0/0  
          Retry:off   RTS thr:off   Fragment thr:off
          Encryption key:****-****-****-****-****-****-****-****   Security mode:open
          Power Management:off
          Link Quality=95/100  Signal level=100/100  Noise level=0/100
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

Sweet!!

Keeping it Working After Kernel Updates

Let's test to see whether updating the kernel leaves us with an updated driver or not. Before the kernel update:

# tree /var/lib/dkms/8812au/4.3.14_13455.20150212/
/var/lib/dkms/8812au/4.3.14_13455.20150212/
├── 4.2.8-200.fc22.x86_64
│   └── x86_64
│       ├── log
│       │   └── make.log
│       └── module
│           └── 8812au.ko
└── source -> /usr/src/8812au-4.3.14_13455.20150212

5 directories, 2 files

Now we update the kernel and view the tree again afterwards:

# dnf -y update kernel kernel-devel --enablerepo=updates-testing
...
Installed:
  kernel.x86_64 4.3.3-200.fc22
  kernel-core.x86_64 4.3.3-200.fc22
  kernel-devel.x86_64 4.3.3-200.fc22
  kernel-modules.x86_64 4.3.3-200.fc22

Complete!
# tree /var/lib/dkms/8812au/4.3.14_13455.20150212/
/var/lib/dkms/8812au/4.3.14_13455.20150212/
├── 4.2.8-200.fc22.x86_64
│   └── x86_64
│       ├── log
│       │   └── make.log
│       └── module
│           └── 8812au.ko
├── 4.3.3-200.fc22.x86_64
│   └── x86_64
│       ├── log
│       │   └── make.log
│       └── module
│           └── 8812au.ko
└── source -> /usr/src/8812au-4.3.14_13455.20150212

9 directories, 4 files

And from the log we can verify that the module was built against the right kernel:

# head -n 4 /var/lib/dkms/8812au/4.3.14_13455.20150212/4.3.3-200.fc22.x86_64/x86_64/log/make.log
DKMS make.log for 8812au-4.3.14_13455.20150212 for kernel 4.3.3-200.fc22.x86_64 (x86_64)
Sun Jan 24 19:40:51 EST 2016
make ARCH=x86_64 CROSS_COMPILE= -C /lib/modules/4.3.3-200.fc22.x86_64/build M=/var/lib/dkms/8812au/4.3.14_13455.20150212/build  modules
make[1]: Entering directory '/usr/src/kernels/4.3.3-200.fc22.x86_64'

Success!

The CentOS CI Infrastructure: A Getting Started Guide

Background

The CentOS community is trying to build an ecosystem that fosters and encourages upstream communities to continuously perform integration testing of their code running on the CentOS platform. The CentOS community has built out an infrastructure that (currently) contains 256 "bare metal" servers that are pooled together to run tests orchestrated by a frontend Jenkins instance located at ci.centos.org.

Who Can Use the CentOS CI?

The CentOS CI is primarily targeted at open source projects that use CentOS as a platform in some way. If your project meets those two requirements (it's open source and it uses CentOS), then check out our page for Getting Started and look at the "Asking for your project to be added" section.

What Is Unique About the CentOS CI?

With many test infrastructures that exist today you are given a virtual machine. With the CentOS CI, when you get a test machine you are actually getting a "bare metal" machine, which allows for testing of workloads that may not have been possible otherwise. One specific example of this is testing out virtualization workloads. The RDO and libvirt projects both use the CentOS CI to do testing that wouldn't be possible on an infrastructure that didn't provide bare metal.

The CentOS CI also offers early access to content that will be in a coming release of CentOS. If there is a pending release, then the content will be available for testing in the CI infrastructure. This allows projects to do testing and find bugs early (before release).

I Have Access. Now What?

Credentials/Environment

Now that you have access to the CentOS CI you should have a few things:

  • Username/Password for the ci.centos.org Jenkins frontend
  • An API key to use with Duffy
  • A target slave type to be used for your testing

The 2nd item from above is unique. In order to provision the bare metal machines and present them for testing, the CentOS CI uses a service known as Duffy (a REST API). The Jenkins jobs that run must provision machines using Duffy and then execute tests on those machines; in the future there may be a Jenkins plugin that takes care of this for you.

The 3rd item is actually specific to your project. The slave machines that are contacted from Jenkins have a workspace set up (like a home directory) for your project. These slaves are accessible via SSH and you can put whatever files you need here in order to orchestrate your tests. When a command is executed in a Jenkins job, these machines are the ones that it is run on.

What you really want, however, is to run tests on the Duffy instances. For that reason the slave is typically just used to request an instance from Duffy and then ssh into the instance to execute tests.
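
To make that flow concrete, here is a rough Python sketch of what a job might do from the slave: ask Duffy for a machine, run the test command on it over ssh, and hand the machine back when done. The Duffy URL, query parameters, and JSON fields below are assumptions based on this era of the service (and the key is a placeholder), so treat it as illustrative rather than as the official client code; the real run.py linked later in this post is what actually gets used:

#!/usr/bin/python
# Illustrative sketch only: the Duffy endpoint, parameters, and response
# fields below are assumptions; consult the real run.py for working code.
import json
import subprocess
import sys
import urllib

API_KEY = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"   # placeholder key
DUFFY_URL = "http://admin.ci.centos.org:8080"      # assumed Duffy endpoint

# Ask Duffy for one CentOS 7 x86_64 bare metal machine.
resp = urllib.urlopen("%s/Node/get?key=%s&ver=7&arch=x86_64" % (DUFFY_URL, API_KEY))
data = json.loads(resp.read())
host, ssid = data["hosts"][0], data["ssid"]        # assumed response fields

# Clone the test repo on the Duffy instance and kick off the tests over ssh.
cmd = "git clone https://github.com/dustymabe/centos-ci-example && cd centos-ci-example && ./run_tests.sh"
rc = subprocess.call(["ssh", "root@%s" % host, cmd])

# Return the machine to Duffy when we are finished with it.
urllib.urlopen("%s/Node/done?key=%s&ssid=%s" % (DUFFY_URL, API_KEY, ssid))
sys.exit(rc)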

A Test To Run

Even though we've brought the infrastructure together we still need you to write the tests! Basically the requirement here is that you have a git repo that can be cloned on the Duffy instance and then a command to run to kick off the tests.

A very simple example of this is my centos-ci-example repo on GitHub. In this repo the run_tests.sh script executes the tests. So for our case we will use the following environment variables when defining our Jenkins job below:

GIT_REPO_URL=https://github.com/dustymabe/centos-ci-example
TEST_CMD='./run_tests.sh'

Your First Job: Web Interface

So you have access and you have a git repo that contains a test to run. With the username/password you can log in to ci.centos.org and create a new job. To create a new job select New Item from the menu on the left hand side of the screen. Enter a name for your job and select Freestyle Project as shown below:

image

After clicking OK, the next page that appears is the page for configuring your job. The following items need to be filled in:

  • Check Restrict where this project can be run
    • Enter the label that applies to environment set up for you

As you can see below, for me this was the atomicapp-shared label.

image

  • Check Inject environment variables to the build process under Build Environment
    • Populate the environment variables as shown below:

image

  • Click on the Add Build Step Dropdown and Select Execute Python Script

image

  • Populate a python script in the text box
    • This script will be executed on the jenkins slaves
    • It will provision new machines using Duffy and then execute the test(s).
    • This script can be found on GitHub or here

image

Now you are all done configuring your job for the first time. There are plenty more options that Jenkins gives you, but for now click Save and then run the job. You can do this by clicking Build Now and then viewing the output by selecting Console Output as shown in the screenshot below:

image

Your Next Job: Jenkins Job Builder

All of those steps can be done in a more automated fashion by using Jenkins Job Builder. You must install the jenkins-jobs executable for this and create a config file that holds the credentials to interface with Jenkins:

# yum install -y /usr/bin/jenkins-jobs
# cat <<EOF > jenkins_jobs.ini
[jenkins]
user=username
password=password
url=https://ci.centos.org
EOF

Update the file to have the real user/password in it.

Next you must create a job description:

# cat <<EOF >job.yaml
- job:
    name: dusty-ci-example
    node: atomicapp-shared
    builders:
        - inject:
            properties-content: |
                API_KEY=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
                MACHINE_COUNT=1
                TEST_CMD='./run_tests.sh'
                GIT_REPO_URL='https://github.com/dustymabe/centos-ci-example.git'
        - centos-ci-bootstrap
- builder:
    name: centos-ci-bootstrap
    builders:
        - python:
            !include-raw: './run.py'
EOF

Update the file to have the real API_KEY.

The last component is run.py, which is the python script we pasted in before:

# curl http://dustymabe.com/content/2016-01-23/run.py > run.py

Now you can run jenkins-jobs and update the job:

# jenkins-jobs --conf jenkins_jobs.ini update job.yaml
INFO:root:Updating jobs in ['job.yaml'] ([])
INFO:jenkins_jobs.local_yaml:Including file './run.py' from path '.'
INFO:jenkins_jobs.builder:Number of jobs generated:  1
INFO:jenkins_jobs.builder:Reconfiguring jenkins job dusty-ci-example
INFO:root:Number of jobs updated: 1
INFO:jenkins_jobs.builder:Cache saved

NOTE: This is all reproduced in the centos-ci-example jjb directory. Cloning the repo and executing the files from there may be a little easier than running the commands above.

After executing all of the steps you should now be able to execute Build Now on the job, just as before. Take Jenkins Job Builder for a spin and consider it a useful tool when managing your Jenkins jobs.

Conclusion

Hopefully by now you can set up and execute a basic test on the CentOS CI. Come join our community and help us build out the infrastructure and the feature set. Check out the CI Wiki, send us a mail on the mailing list, or ping us on #centos-devel on Freenode.

Happy Testing!
Dusty

Running Nulecules in Openshift via oc new-app

Intro

As part of the Container Tools team at Red Hat I'd like to highlight a feature of Atomic App: support for execution via OpenShift's cli command oc new-app.

The native support for launching Nulecules means that OpenShift users can easily pull from a library of Atomic Apps (Nuleculized applications) that exist in a Docker registry and launch them into OpenShift. Applications that have been packaged up in a Nulecule offer a benefit to the packager and to the deployer of the application. The packager can deliver one Nulecule that supports many different platforms to all users, and the deployer gets a simplified delivery mechanism; deploying a Nulecule via Atomic App is easier than trying to manage provider definitions.

DEMO Time

OK. Let's do a demo. I'll choose the guestbook example application developed by Kubernetes. The Nulecule definition for this example lives here. To start the container in OpenShift via oc new-app you simply run the following command from the command line:

# oc new-app projectatomic/guestbookgo-atomicapp --grant-install-rights

This will run a pod using the container image projectatomic/guestbookgo-atomicapp. The Atomic App software that runs inside the container image will evaluate the Nulecule in the image and communicate with OpenShift in order to bring up the guestbook application, which also leverages redis.

You should now be able to see the redis and guestbook replication controllers and services:

# oc get rc
CONTROLLER     CONTAINER(S)   IMAGE(S)                  SELECTOR                REPLICAS   AGE
guestbook      guestbook      kubernetes/guestbook:v2   app=guestbook           3          3m
redis-master   redis-master   centos/redis              app=redis,role=master   1          3m
redis-slave    redis-slave    centos/redis              app=redis,role=slave    2          3m
# oc get svc
NAME           CLUSTER_IP      EXTERNAL_IP   PORT(S)    SELECTOR                AGE
guestbook      172.30.24.168                 3000/TCP   app=guestbook           3m
redis-master   172.30.210.63   <none>        6379/TCP   app=redis,role=master   3m
redis-slave    172.30.62.63    <none>        6379/TCP   app=redis,role=slave    3m

As well as the pods that are started as a result of the replication controllers:

# oc get pods
NAME                 READY     STATUS    RESTARTS   AGE
guestbook-6gujf      1/1       Running   0          3m
guestbook-m61vq      1/1       Running   0          3m
guestbook-otoz4      1/1       Running   0          3m
redis-master-wdl80   1/1       Running   0          3m
redis-slave-fbapw    1/1       Running   0          3m
redis-slave-oizwb    1/1       Running   0          3m

If you have access to the instance where the pods are running you can reach the application via a Node IP and NodePort; however, this is not common in a hosted environment. In a hosted environment you need to expose the service in OpenShift via a route:

# oc expose service guestbook
route "guestbook" exposed
# oc get route guestbook
NAME        HOST/PORT                                       PATH      SERVICE     LABELS          INSECURE POLICY   TLS TERMINATION
guestbook   guestbook-proj1.e8ca.engint.openshiftapps.com             guestbook   app=guestbook

Now you should be able to access the guestbook service via the provided hostname; in this case it is guestbook-proj1.e8ca.engint.openshiftapps.com. A quick visit with Firefox gives us:

image

And we have a guestbook up and running!

Give oc new-app on an Atomic App a spin and send us some feedback.

Dusty

Archived-At Email Header From Mailman 3 Lists

By now most Fedora email lists have been migrated to Mailman3. One little (but killer) new feature that I recently discovered was that Mailman3 includes the RFC 5064 Archived-At header in the emails.

This is a feature I have wanted for a really long time; being able to find an email in your Inbox and copy and paste a link to it for anyone, without having to hunt the message down in the online archive, is going to save a lot of time and decrease some latency when chatting on IRC or some other form of real-time communication.

OK, how do I easily get this information out of the header and use it without having to view the email source every time?

I use Thunderbird for most of my email viewing (sometimes mutt for composing). In Thunderbird there is the Archived-At plugin. After installing this plugin you can right-click on any message from a capable mailing list and select "Copy Archived-At URL".
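
If you are not a Thunderbird user, pulling the header out yourself takes only a few lines of Python with the standard library's email module. Here is a small sketch (the message path is a made-up example; point it at a real saved message):

#!/usr/bin/python
# Print the Archived-At header from a saved email message.
# The path below is a made-up example; point it at a real message file.
import email

with open("/home/dusty/Mail/devel/some-message.eml") as f:
    msg = email.message_from_file(f)

# The raw value looks like <https://lists.fedoraproject.org/archives/...>
archived_at = msg.get("Archived-At", "").strip("<>")
print(archived_at)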

Now you are off to share that URL with friends :)

Happy New Year!
Dusty

Fedora Cloud Vagrant Boxes in Atlas

Cross posted with this fedora magazine post

With the release of Fedora 22, Fedora began creating Vagrant boxes for its cloud images in order to make it easier to set up a local environment for development or testing. In the Fedora 22 release cycle we worked out quite a few kinks, and we are again releasing libvirt and virtualbox Vagrant boxes for Fedora 23.

Additionally, for Fedora 23, we are making it easier for users to grab these boxes by having them indexed in Hashicorp's Atlas. Atlas is essentially an index of Vagrant boxes that makes it easy to distribute them (think of it like a Docker registry for virtual machine images). By indexing the Fedora boxes in Atlas, users now have the option of using the vagrant software to download and add the boxes automatically, rather than having to go grab the boxes directly from the mirrors first (although this is still an option).

In order to get started with the Fedora cloud base image, run the following command:

# vagrant init fedora/23-cloud-base && vagrant up

Alternatively, to get started with Fedora Atomic host, run this command:

# vagrant init fedora/23-atomic-host && vagrant up

The above commands will grab the latest indexed images in Atlas and start a virtual machine without the user having to go download the image first. This will make it easier for Fedora users to develop and test on Fedora! If you haven't delved into Vagrant yet then you can get started by visiting the Vagrant page on the Fedora Developer Portal. Let us know on the Fedora Cloud mailing list if you have any trouble.

Dusty