Near Realtime File Replication With Lsyncd

Ever wanted real-time backups of directories on your server? Or to replicate a website's static media files to a separate box to reduce load? An easy solution lies with lsyncd, which lets you watch a directory structure on your file system and replicate any changes to a remote system.

How It Works:
inotify is a Linux subsystem, available since the 2.6.13 release of the kernel, that monitors changes made to a file system and reports them to interested applications. lsyncd is an application written in Lua that uses inotify to learn of changes made to monitored directories and, when notified of a change, uses rsync to replicate it on a remote system.
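
Because inotify watches are not recursive, every directory in the monitored tree needs a watch of its own, and the kernel caps how many watches a user may hold. The lsyncd author notes in the comments below that 2.0.4 silently ignores directories once that cap is exceeded, so it can be worth comparing the two numbers before you start. A minimal sketch, assuming the limit is exposed at /proc/sys/fs/inotify/max_user_watches; the function names are my own and not part of lsyncd:

```shell
# Count the directories under a tree, including the root itself.
# Each one needs its own inotify watch.
# (Directory names containing newlines would be over-counted.)
count_dirs() {
    find "$1" -type d | wc -l | tr -d ' '
}

# Print the kernel's per-user inotify watch limit,
# or "unknown" if the file cannot be read.
watch_limit() {
    limit_file=/proc/sys/fs/inotify/max_user_watches
    if [ -r "$limit_file" ]; then
        cat "$limit_file"
    else
        echo unknown
    fi
}

# Succeed when the directory count ($1) is below the limit ($2).
watches_fit() {
    [ "$1" -lt "$2" ]
}
```

For example, watches_fit "$(count_dirs /home/mirror)" "$(watch_limit)" || echo "raise max_user_watches first" would warn you before lsyncd starts dropping directories.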

Getting lsyncd
The lsyncd source code can be downloaded from Google Code and compiled, but depending on the distribution your server is running, the install process may be even easier, as lsyncd is included in the repositories of many popular Linux distros.

Installing On Debian / Ubuntu
lsyncd is included in the repositories of both Debian and Ubuntu, but unfortunately it's an outdated version, so check which version you are going to get beforehand with the command:

apt-cache show lsyncd

If apt-cache tells you that only one of the 1.x versions is available, I would recommend grabbing one of the 2.x versions from the Debian testing repository instead; packages are available for both the amd64 and i386 architectures. Then install using the dpkg command, for example:

cd /root
wget http://ftp.au.debian.org/debian/pool/main/l/lsyncd/lsyncd_2.0.4-1_amd64.deb
apt-get install lua5.1 rsync
dpkg -i lsyncd_2.0.4-1_amd64.deb

CentOS 5 Installation

yum install lsyncd.x86_64

Preparation
The first step for getting started with lsyncd over two hosts is to create an SSH key pair to allow authentication without a password.

ssh-keygen

Now transfer the server's newly generated public key to the second server you are going to mirror content to.

scp ~/.ssh/id_rsa.pub root@my2ndserver:/tmp

After copying the file, log in to the second server and add the public key to the ssh authorized_keys file.

cat /tmp/id_rsa.pub >> ~/.ssh/authorized_keys
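
If key authentication still prompts for a password afterwards, overly loose permissions on the second server are a common cause: with OpenSSH's StrictModes setting (on by default), sshd can ignore authorized_keys when it or ~/.ssh is writable by other users. A small sketch of tightening them, where secure_ssh_dir is a helper name I made up for illustration:

```shell
# Restrict an SSH directory and its authorized_keys file to the owner.
# Takes the directory as an optional argument, defaulting to ~/.ssh.
secure_ssh_dir() {
    ssh_dir=${1:-$HOME/.ssh}
    chmod 700 "$ssh_dir"
    chmod 600 "$ssh_dir/authorized_keys"
}
```

Run secure_ssh_dir on the second server, then confirm from the first server that ssh -o BatchMode=yes root@my2ndserver true returns without asking for a password.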

While you are logged into the second machine also double check that rsync is installed and install it if it isn’t. Now decide the directory you would like to mirror across your machines. In this example I am just going to keep it simple by starting fresh and creating a new directory in the same location on both of my servers:

mkdir /home/mirror

Quick Test Run
cd /home/mirror
touch test_file
lsyncd -log all -nodaemon -rsyncssh /home/mirror sync@my2ndserver /home/mirror

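(Note: Unlike the scp command, the target host and directory are given as two separate arguments to lsyncd at the command line. The user named in the target, sync in this run, must be an account on the second server that accepts your SSH key.)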

This should produce output similar to:

kernels clocks_per_sec=100
Call: configure()
Inotify: inotify fd = 3
Call: initialize()
10:08:17 Function: Inotify.addWatch(/home/mirror/, (true), (nil), (nil))
10:08:17 Inotify: addwatch(/home/mirror/)->1
10:08:17 Normal: recursive startup rsync: /home/mirror/ -> sync@192.168.20.127:/home/mirror/
10:08:17 Exec: /usr/bin/rsync [--delete] [-r] [-lts] [/home/mirror/] [sync@192.168.20.127:/home/mirror/]
10:08:17 Call: getAlarm()
10:08:17 Debug: getAlarm returns: (false)
10:08:17 Masterloop: going into select (no timeout).
10:08:17 Call: collectProcess()
10:08:17 Delay: collected an event
10:08:17 Normal: Startup of '/home/mirror/' finished.
10:08:17 Normal: Finished Blanket on /home/mirror/ = 0
10:08:17 Delay: Finish of Blanket on /home/mirror/ = 0
10:08:17 Call: cycle()
10:08:17 Function: invokeActions('Sync1',(Timestamp: 17184171.29))
10:08:17 Call: getAlarm()
10:08:17 Debug: getAlarm returns: (false)
10:08:17 Masterloop: going into select (no timeout).

Look in the mirror directory on the second server and with any luck you should see your test file has migrated across from the first server.

Potential Issues
When performing the test run you may get some output that appears similar to the text below:

kernels clocks_per_sec=100
Call: configure()
Inotify: inotify fd = 3
Call: initialize()
10:14:39 Function: Inotify.addWatch(/home/mirror/, (true), (nil), (nil))
10:14:39 Inotify: addwatch(/home/mirror/)->1
10:14:39 Normal: recursive startup rsync: /home/mirror/ -> root@192.168.20.131:/home/mirror/
10:14:39 Exec: /usr/bin/rsync [--delete] [-r] [-lts] [/home/mirror/] [root@192.168.20.131:/home/mirror/]
10:14:39 Call: getAlarm()
10:14:39 Debug: getAlarm returns: (false)
10:14:39 Masterloop: going into select (no timeout).
rsync: on remote machine: -sltre.iLsf: unknown option
rsync error: syntax or usage error (code 1) at main.c(1231) [server=2.6.8]
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [sender=3.0.7]
10:14:39 Call: collectProcess()
10:14:39 Delay: collected an event
10:14:39 Error: Failure on startup of '/home/mirror/'.

This tends to happen if you are using an older version of rsync on the second server; in the output above the remote end reports rsync 2.6.8 while the sender is 3.0.7. No need to stress, though: I found that removing the -s rsync option when creating a config file for lsyncd seemed to solve this issue.
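
To find out in advance whether you will hit this, you can compare the rsync versions on both machines. According to the lsyncd author's comment below, rsync releases before 3.0 do not recognise -s. The sketch below parses the first line of rsync --version output; both function names are hypothetical helpers, and the parsing assumes the "rsync  version X.Y.Z  protocol version N" banner format:

```shell
# Read `rsync --version` output on stdin and print the version number
# found on its first line (an optional leading "v" is ignored).
parse_rsync_version() {
    sed -n '1s/^rsync  *version  *v\{0,1\}\([0-9][0-9.]*\).*/\1/p'
}

# Succeed if the given version (e.g. "3.0.7") is 3.0 or newer,
# i.e. new enough to understand the -s option.
rsync_has_s_option() {
    major=${1%%.*}
    [ "$major" -ge 3 ] 2>/dev/null
}
```

Locally you would run rsync --version | parse_rsync_version, and for the second server ssh root@my2ndserver rsync --version | parse_rsync_version, then pass each result to rsync_has_s_option.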

Creating A Config File
Fire up your preferred editor and create a new file to store your configuration:

nano /etc/lsyncd.conf

With the contents:

settings = {
   logfile    = "/var/log/lsyncd.log",
}

sync{default.rsync, source="/home/mirror", target="192.168.20.131:/home/mirror", rsyncOps="-rltvu"}

Then start the daemon with the command:

lsyncd /etc/lsyncd.conf

This will start the daemon in the background. Go create a new file in your mirror directory, then watch the log with the tail command:

tail -f /var/log/lsyncd.log

With any luck you should be able to see your newly created file getting picked up by lsyncd and mirrored on your second server.

Sun Jul 17 11:03:09 2011 Normal: Calling rsync with filter-list of new/modified files/dirs
/myfile.txt
/
building file list ... done
./
myfile.txt

sent 139 bytes  received 48 bytes  374.00 bytes/sec
total size is 0  speedup is 0.00
Sun Jul 17 11:03:09 2011 Normal: Finished a list = 0

As you can see, the process is not instantaneous: there is a delay of around 5 to 15 seconds before a transfer is initiated, plus the time the file itself takes to transfer. It is still handy for backups and other situations where a small delay is not too much of a concern.
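
Once the daemon has been running for a while, scrolling through the whole log by hand gets tedious. Judging by the output shown earlier, lsyncd labels each message with a category such as Normal: or Error:, so a rough way to surface problems is to filter on that label. This is only a sketch based on the log lines in this article; log_errors is a name I chose, and other lsyncd versions may format their logs differently:

```shell
# Print every line of an lsyncd log file that carries the "Error:" category.
# Prints nothing (and still succeeds) when the log contains no errors.
log_errors() {
    awk '/ Error: /' "$1"
}
```

For example, log_errors /var/log/lsyncd.log would list any failed startups or syncs recorded so far.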

Further Reading:

The lsyncd manual

2 thoughts on “Near Realtime File Replication With Lsyncd”

  1. Axel Kittenberger

    Hi, Lsyncd Author here. Thank you for this great introductory article!

    The -s option for rsync is for files with spaces in their name to be correctly transferred through ssh connections. As you said, unfortunately, rsync <3.0 does not recognize it, and one still happens to encounter machines with lower rsync versions. We just figured it would be better to error right away if -s doesn't work than to have some files missed later just because they have a space in their name.

    You shouldn't add -r to the options. Lsyncd will add it to the rsync calls where appropriate.

    You can change the wait delay via 'settings.delay = VALUE'. It's 15 (seconds) by default so that multiple events are packed together in a single rsync call. The way many applications work, when something happens on the filesystem, multiple things happen in a short timespan. If you set it to 0, rsync will be called immediately for every little event, but since e.g. Create, Close-Write and Change Attributes are 3 events that often happen within a short timeframe, that would result in 3 rsync calls. A minimum of 1 second is recommended.

    An important bug still present in 2.0.4 for big directory structures or old kernels: Lsyncd silently ignores directories if the kernel exceeds max_user_watches. So one might want to ensure max_user_watches is high, and create a lsyncdStatus file, to see if it didn't hit the limit. I'll change 2.0.5 so Lsyncd will cleanly terminate if it exceeds the set kernel watch limit.
