s3blkdev is a gateway between S3-compatible storage and Linux network block devices. On its frontend side it acts as an NBD server, which serves block devices (read: virtual hard disks) to the Linux kernel. On its backend side it synchronizes these virtual hard disks to an S3-compatible storage. Virtual hard disks are split into small, compressed pieces called chunks. The most recently used chunks are kept locally. When this local cache grows too big, the least recently used chunks are evicted to the S3 storage. If the kernel asks for a chunk which is not in the local cache, s3blkdev transparently downloads it. Additionally, the cache is synchronized to the S3 storage on a daily or weekly basis.
One s3blkdev instance can present multiple block devices to the kernel, but needs just one S3 bucket. The chunks of each block device are saved to a different subfolder (yes, S3 doesn't know anything about subfolders, but for the moment let's refer to the name part between two forward slashes in an S3 URL as a subfolder). If your S3 storage is reachable on different IP addresses and/or ports, you can configure s3blkdev to use them in a round-robin fashion. Both HTTP and HTTPS are supported.
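For illustration, if the bucket is named mybucket and the S3 storage is reachable as s3.example.com (both placeholders), the chunks of two devices named device1 and device2 end up under per-device prefixes roughly like this (the actual chunk file names are an implementation detail and are not spelled out here):

https://s3.example.com/mybucket/device1/<chunk>
https://s3.example.com/mybucket/device2/<chunk>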
The development platform is a Gentoo Linux box running a Linux kernel version >= 4 and the latest packages from Portage.
Both s3blkdevd and s3blkdev-sync expect the configuration file /usr/local/etc/s3blkdev.conf unless overridden by the command line option -c. An example configuration file named s3blkdev.conf.dist has been copied into the same directory during installation:
listen /tmp/s3blkdevd.sock
port 10809
workers 8
fetchers 2
s3host
s3port
s3ssl 1
s3accesskey
s3secretkey
s3bucket
s3timeout 10000
# s3name
# s3maxreqsperconn 100
# [device1]
# cachedir /ssd/device1
# size 200000000000
Option | Description |
---|---|
listen | IPv4/IPv6 address or local Unix socket to listen on |
port | port to listen on, ignored if listen is a Unix socket |
workers | number of server threads, max. count of simultaneous I/O requests |
fetchers | max. number of simultaneous downloads from S3 storage |
s3host | IPv4/IPv6 address or hostname of the S3 storage, without leading bucket name; up to 4 s3host statements may be specified |
s3port | TCP port number of the S3 storage; up to 4 s3port statements may be specified, resulting in s3host x s3port connections |
s3ssl | use HTTPS instead of HTTP to connect to S3 storage |
s3accesskey | user name for S3 storage |
s3secretkey | password for S3 storage |
s3bucket | name of bucket |
s3timeout | timeout of S3 operations in milliseconds |
s3name | put this name in the Host: header when talking to S3 backends; by default, the Host: header will contain the value of the current s3host |
s3maxreqsperconn | close a backend connection after this many requests; defaults to 100 requests |
The following options can be set per device section (such as the commented-out [device1] section in the example above):

Option | Description |
---|---|
cachedir | local directory where s3blkdev caches chunks for the current device; place this on an SSD, preferably |
size | size of block device in bytes |
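To make the options more concrete, a filled-in configuration might look like the following. The host names, access key, secret key, and bucket name are placeholders; the two s3host and two s3port statements illustrate the round-robin setup described above (resulting in 2 x 2 = 4 endpoint combinations):

listen /tmp/s3blkdevd.sock
port 10809
workers 8
fetchers 2
s3host s3-a.example.com
s3host s3-b.example.com
s3port 443
s3port 8443
s3ssl 1
s3accesskey MYACCESSKEY
s3secretkey MYSECRETKEY
s3bucket my-s3blkdev-bucket
s3timeout 10000

[device0]
cachedir /cache0
size 2000000000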
s3blkdevd should be started as an unprivileged user. As it increases its stack size to 32 MB, you might want to check your ulimits. If you send SIGTERM to s3blkdevd, it will exit nicely, but it will most likely leave any mounted filesystem on top of its exported devices in an undesirable state. It accepts the following options. Note that the pid file has to be specified:
s3blkdevd V0.6

Usage:

s3blkdevd [-c <config file>] [-p <pid file>]
s3blkdevd -h

-c <config file>   read config options from specified file instead of /usr/local/etc/s3blkdev.conf
-p <pid file>      daemonize and save pid to this file
-h                 show this help ;-)

Let's assume you configured a device named foobar in s3blkdev.conf, and s3blkdevd is running. To create an ext4 filesystem on that device, run the following commands:
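A minimal sketch, assuming the nbd-client tool is installed, /dev/nbd0 is unused, and the Unix socket path from the example configuration; adjust device names and paths to your setup:

nbd-client -N foobar -b 4096 -p -u /tmp/s3blkdevd.sock /dev/nbd0
mkfs.ext4 /dev/nbd0

The device0 examples further below show a more elaborate variant with an external ext4 journal and tuned stripe settings.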
s3blkdev-sync has two operational modes: eviction and synchronization. Eviction ensures that the local cache directories have enough free space. Synchronization simply copies any changed chunks to the S3 storage. Thus, you should create two cron jobs:
* * * * * /usr/local/sbin/s3blkdev-sync -p /var/run/s3blkdev-evict.pid 90 80
0 23 * * * /usr/local/sbin/s3blkdev-sync -p /var/run/s3blkdev-sync.pid 36000

The first cron job runs every minute. If any local cache directory has more than 90% of its disk space in use, chunks will be evicted until disk usage drops below 80%. Eviction itself consists of two rounds: first, any chunk that has been uploaded to the S3 storage, but has not been changed locally, is deleted. Second, the remaining chunks (starting with the least recently used ones) are uploaded and deleted until disk usage drops below 80%. The second cron job starts every evening at 11 p.m. in synchronization mode and copies changed chunks to the S3 storage for at most 36000 seconds (10 hours).
Both eviction and synchronization can be spread over multiple instances by assigning each instance a percentage range of all chunks, for example:

* * * * * /usr/local/sbin/s3blkdev-sync -p /var/run/s3blkdev-evict0.pid 90 80 0 50
* * * * * /usr/local/sbin/s3blkdev-sync -p /var/run/s3blkdev-evict1.pid 90 80 50 100
0 23 * * * /usr/local/sbin/s3blkdev-sync -p /var/run/s3blkdev-sync0.pid 36000 0 33
0 23 * * * /usr/local/sbin/s3blkdev-sync -p /var/run/s3blkdev-sync1.pid 36000 33 66
0 23 * * * /usr/local/sbin/s3blkdev-sync -p /var/run/s3blkdev-sync2.pid 36000 66 100

Two instances will start in eviction mode every minute; each instance will handle 50% of all chunks. Every evening at 11 p.m., three instances will start in sync mode; each one will run for 10 hours and will handle 33% of all chunks (to be more precise, the last one will handle 34%). Don't forget to specify different pid files when using multiple instances.
[device0]
cachedir /cache0
size 2000000000
nbd0 /tmp/s3blkdevd.sock device0 unix,persist,bs=4096
nbd
lvcreate -L 4G -n lvcache0 ubuntu-vg
lvcreate -L 128M -n lvjrnl0 ubuntu-vg
/dev/ubuntu-vg/lvcache0 /cache0 ext4 discard,nodiratime 1 2
/dev/nbd0 /device0 ext4 _netdev,journal_async_commit,stripe=2048,noatime,nodiratime,x-systemd.requires=nbd@nbd0.service 1 2
mkfs.ext4 /dev/ubuntu-vg/lvcache0
mkdir /cache0
mount /cache0
systemctl start s3blkdevd.service
nbd-client -N device0 -b 4096 -p -u /tmp/s3blkdevd.sock /dev/nbd0
mkfs.ext4 -b 4096 -O journal_dev /dev/ubuntu-vg/lvjrnl0
mkfs.ext4 -J device=UUID=11111111-2222-3333-4444-555555555555 -E stride=2048,stripe_width=2048 /dev/nbd0
nbd-client -d /dev/nbd0
[Unit]
Requires=s3blkdevd.service
After=s3blkdevd.service
[Unit]
ConditionPathIsMountPoint=/device0
systemctl daemon-reload
systemctl start device0.mount

or start any service depending on /device0, e.g. netatalk:
systemctl start netatalk.service
[device0]
cachedir /cache0
size 2000000000
NBD_DEVICE[0]=/dev/nbd0
NBD_TYPE[0]=f
NBD_HOST[0]=/tmp/s3blkdevd.sock
NBD_PORT[0]=
NBD_NAME[0]=device0
NBD_EXTRA[0]="-b 4096 -n -p -u"
lvcreate -L 4G -n lvcache0 ubuntu-vg
lvcreate -L 128M -n lvjrnl0 ubuntu-vg
/dev/ubuntu-vg/lvcache0 /cache0 ext4 discard,nodiratime 1 2
/dev/nbd0 /device0 ext4 noauto,journal_async_commit,stripe=2048,noatime 1 2
mkfs.ext4 /dev/ubuntu-vg/lvcache0
mkdir /cache0
mount /cache0
systemctl start s3blkdevd.service
nbd-client -N device0 -b 4096 -p -u /tmp/s3blkdevd.sock /dev/nbd0
mkfs.ext4 -b 4096 -O journal_dev /dev/ubuntu-vg/lvjrnl0
mkfs.ext4 -J device=UUID=11111111-2222-3333-4444-555555555555 -E stride=2048,stripe_width=2048 /dev/nbd0
nbd-client -d /dev/nbd0
systemctl start nbd@0.service
[Unit]
Requires=nbd@0.service
After=nbd@0.service
ConditionPathIsMountPoint=/device0
systemctl daemon-reload
Starting with version 0.6, s3blkdev comes with an HTML frontend based on Node.js. It shows current disk usage and network throughput. Here's a screenshot.
Bleeding edge: https://github.com/felixjogris/s3blkdev