Wednesday, September 12, 2007

Linux backup powered by RDiff-Backup

I have been covering Linux backup tools in my recent blog entries. Here's another alternative for doing Linux backups: rdiff-backup.

Rdiff-backup is similar to the rsync command. It uses librsync, a library that implements the rsync algorithm, but it never calls rsync itself to do its backup functions. Rdiff-backup is also similar to rsnapshot in that it creates a backup copy, an actual mirror/clone of the data being backed up, in a separate location. The difference is that rdiff-backup also keeps statistics, session data and metadata to identify changed data for later incremental backups. For remote backups, rdiff-backup uses an ssh connection to transfer data between the source and destination hosts.

Rdiff-backup also does incremental backups effectively, and a separate web interface, rdiffWeb, can be used to browse and restore backup data and files from a remote host. This nice backup tool is available from the Fedora repo and can be installed using yum like any other open source Linux backup tool.

The rdiff-backup man page says:
rdiff-backup is a script, written in python(1) that backs up one directory to another. The target directory ends up a copy (mirror) of the source directory, but extra reverse diffs are stored in a special subdirectory of that target directory, so you can still recover files lost some time ago. The idea is to combine the best features of a mirror and an incremental backup.

rdiff-backup also preserves symlinks, special files, hardlinks, permissions, uid/gid ownership, and modification times.

rdiff-backup can also operate in a bandwidth efficient manner over a pipe, like rsync. Thus you can use ssh and rdiff-backup to securely back a hard drive up to a remote location, and only the differences will be transmitted. Using the default settings, rdiff-backup requires that the remote system accept ssh connections, and that rdiff-backup is installed in the user’s PATH on the remote system.

This blog entry covers data backup with rdiff-backup in a Linux-based server environment.


RDIFF-BACKUP INSTALLATION
=========================

Installation is done the usual way using yum, like so

# yum -y install rdiff-backup
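
Before going further it does not hurt to confirm that the binary really is in the PATH. Below is a minimal sketch, assuming a POSIX shell; the helper name have_command is made up for this example and is not part of rdiff-backup.

```shell
#!/bin/sh
# have_command is a hypothetical helper, not part of rdiff-backup.
# It succeeds when the named program can be found in the PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Example use:
#   have_command rdiff-backup || echo "rdiff-backup is missing, install it with yum first"
```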


RDIFF SAMPLE USAGE
==================

Rdiff-backup can create a local mirror copy of a source folder. The destination can be a directory on a mounted device, partition, external USB device, or even a separate hard disk. The destination can also be a remote host or a separate server location.

Below is how to create a data backup locally on a storage device.

To recursively back up a folder using rdiff-backup, the commands would be

# rdiff-backup source destination
# rdiff-backup /home /mnt/usbdisk
# rdiff-backup /home /home2
# rdiff-backup /home/user1 /mnt/newharddisk
# rdiff-backup /home /mnt/mounted/partition

Make sure the destination is a directory (for a disk or partition, its mount point rather than the device node under /dev) and that it is writable and accessible from the local host. Rdiff-backup stores its session data and statistics in the rdiff-backup-data folder inside the mirror copy.
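
The writable-destination check can be scripted so that a forgotten mount is caught before the backup starts. This is only a sketch under a few assumptions: the destination directory already exists, the function names dest_ready and safe_backup are invented for this example, and DRY_RUN is a made-up switch that prints the command instead of running it.

```shell
#!/bin/sh
# dest_ready and safe_backup are hypothetical helpers for illustration.

# Succeeds only if $1 is an existing directory that we can write to.
dest_ready() {
    [ -d "$1" ] && [ -w "$1" ]
}

# usage: safe_backup SOURCE DEST
# Set DRY_RUN=1 to print the rdiff-backup command instead of running it.
safe_backup() {
    if ! dest_ready "$2"; then
        echo "destination $2 is not a writable directory" >&2
        return 1
    fi
    ${DRY_RUN:+echo} rdiff-backup "$1" "$2"
}
```

With DRY_RUN=1 set, safe_backup /home /mnt/newharddisk only prints the command it would run, which is handy while testing.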

To force backup mode even when the first argument appears to be an increment or mirror file

# rdiff-backup -b /home /backup/folder

If you wish to exclude files and folders from being processed by rdiff-backup, this can be done in several ways.

To exclude a list of files kept in a text file using rdiff-backup
# cat exclude.txt
~~~~~~~~~~~~~~~~~~~~~
/home/abc.txt
/home/folder1/hey.mp3
/home/user1/big-ISO.iso
~~~~~~~~~~~~~~~~~~~~~

# rdiff-backup --exclude-filelist exclude.txt /home /backup/folder
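
A typo in exclude.txt is easy to miss, so it can help to check the list before a run. The sketch below assumes the list holds one plain path per line, as in the sample above; missing_entries is a made-up helper name.

```shell
#!/bin/sh
# missing_entries is a hypothetical helper for illustration.
# usage: missing_entries LISTFILE
# Prints every path from LISTFILE that does not exist on disk.
missing_entries() {
    while IFS= read -r path; do
        # Skip blank lines.
        [ -z "$path" ] && continue
        [ -e "$path" ] || printf '%s\n' "$path"
    done < "$1"
}

# Example use:
#   missing_entries exclude.txt
```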

To exclude files matching a shell pattern, pass the pattern as a parameter to rdiff-backup like so
# rdiff-backup --exclude "/home/user/*.mp3" /home/user/ /backup/destination

Alternatively, the list of files to exclude can be read from stdin. As with --exclude-filelist, each line is taken as a literal file name, not a pattern
# echo "/home/user1/big-ISO.iso" | rdiff-backup --exclude-filelist-stdin /home/ /mnt/backup/partition

File glob exclusion from stdin is also supported, using the globbing variant of the option like so
# echo "/home/user1/*.mp3" | rdiff-backup --exclude-globbing-filelist-stdin /home/ /mnt/destination/disk

If you wish to exclude socket files, special files, device files and symbolic links from the backup process, this can be done using rdiff-backup like so

# rdiff-backup --exclude-special-files --exclude-sockets --exclude-device-files --exclude-symbolic-links / /mnt/sata/harddisk

If you wish to exclude files that live on filesystems other than the one holding the source directory, that would be like so

# rdiff-backup --exclude-other-filesystems /home /mnt/backup/destination

If you do not want hard links replicated on the destination side, which can noticeably reduce memory usage when many hard-linked files are present, this can be specified like so

# rdiff-backup --no-hard-links /home/ /mnt/destination/device

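Each of the exclude parameters above has a matching include parameter. Since everything under the source folder is included by default, include parameters only make a difference when followed by a broader exclude such as --exclude '**'. Rdiff-backup supports file list inclusion from a file or from stdin, as shown in the samples below

# rdiff-backup --include-filelist include.txt --exclude '**' /home /backup/folder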

To back up multiple folders in one run, include each of them, exclude everything else and use / as the source, like so

# rdiff-backup --include /usr/local --include /home --include /data --include /config --exclude '**' / /backup-destination
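
Typing one --include per folder gets tedious, so the command line can be assembled by a small function. This is a sketch assuming a POSIX shell; backup_selected and the DRY_RUN switch are names invented for this example.

```shell
#!/bin/sh
# backup_selected is a hypothetical helper for illustration.
# usage: backup_selected DEST DIR...
# Backs up only the listed directories (taken relative to /) into DEST.
# Set DRY_RUN=1 to print the rdiff-backup command instead of running it.
backup_selected() {
    dest=$1
    shift
    n=$#
    # Append an "--include DIR" pair for every directory given.
    for dir in "$@"; do
        set -- "$@" --include "$dir"
    done
    # Drop the original bare directory names, keeping only the pairs.
    shift "$n"
    # Everything not included above is excluded by the final '**'.
    ${DRY_RUN:+echo} rdiff-backup "$@" --exclude '**' / "$dest"
}

# Example use:
#   backup_selected /backup-destination /usr/local /home /data /config
```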

To include only files matching a shell pattern, follow the include with a catch-all exclude like so
# rdiff-backup --include "/home/user/*.mp3" --exclude '**' /home/user/ /backup/destination

Another usage of shell patterns, this time ignoring case, would be

# rdiff-backup --include ignorecase:'/home/[a-z]/*/*.txt' --exclude '**' /home /backup/points

Alternatively, the list of files to include can be read from stdin. Each line is taken as a literal file name, not a pattern
# echo "/home/user1/big-ISO.iso" | rdiff-backup --include-filelist-stdin --exclude '**' /home/ /mnt/backup/partition

File glob inclusion from stdin is also supported, using the globbing variant of the option like so
# echo "/home/user1/*.mp3" | rdiff-backup --include-globbing-filelist-stdin --exclude '**' /home/ /mnt/destination/disk

If you wish to include special files and symbolic links in the backup process, this can be done using rdiff-backup like so

# rdiff-backup --include-special-files --include-symbolic-links / /mnt/sata/harddisk


For a summary of what a backup session did, you can add the --print-statistics argument like so

# rdiff-backup --print-statistics /home /backup/folder

which reports how many files and bytes were new, deleted or changed during the session, like so
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--------------[ Session statistics ]--------------
StartTime 1189595421.00 (Wed Sep 12 12:10:21 2007)
EndTime 1189595421.88 (Wed Sep 12 12:10:21 2007)
ElapsedTime 0.88 (0.88 seconds)
SourceFiles 8
SourceFileSize 11602 (11.3 KB)
MirrorFiles 8
MirrorFileSize 11602 (11.3 KB)
NewFiles 0
NewFileSize 0 (0 bytes)
DeletedFiles 0
DeletedFileSize 0 (0 bytes)
ChangedFiles 0
ChangedSourceSize 0 (0 bytes)
ChangedMirrorSize 0 (0 bytes)
IncrementFiles 0
IncrementFileSize 0 (0 bytes)
TotalDestinationSizeChange 0 (0 bytes)
Errors 0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
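
The statistics are plain "Name value" lines, so a script can pick single values out of them. Here is a rough sketch, assuming the output looks like the sample above; stat_value and session_clean are made-up helper names.

```shell
#!/bin/sh
# stat_value and session_clean are hypothetical helpers for illustration.

# usage: stat_value KEY < statistics-output
# Prints the first value that follows KEY, e.g. 11602 for SourceFileSize.
stat_value() {
    awk -v key="$1" '$1 == key { print $2; exit }'
}

# usage: session_clean < statistics-output
# Succeeds only when an Errors line is present and reports 0.
session_clean() {
    errors=$(stat_value Errors)
    [ -n "$errors" ] && [ "$errors" -eq 0 ]
}

# Example use:
#   rdiff-backup --print-statistics /home /backup/folder | session_clean \
#       || echo "backup reported errors"
```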

If you are backing up a very large amount of data and want rdiff-backup to run slightly faster and use less disk space, you can keep per-file statistics from being written to the rdiff-backup-data folder by specifying --no-file-statistics as an argument.


RDIFF-BACKUP DATA RESTORE
=========================

Since the samples above cover the usual backup parameters of rdiff-backup, here are several ways to restore rdiff-backup data.

Let us assume that you have been backing up /home to /backup/home with rdiff-backup and you wish to restore the files as they were 7 days ago to a different destination folder like /home7daysago. This can be done like so

# rdiff-backup -r 7D /backup/home /home7daysago
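
When restores are scripted, it is worth making sure the target folder is not already there, so nothing gets mixed up with older files. Here is a small sketch of that check; restore_as_of and the DRY_RUN switch are names invented for this example.

```shell
#!/bin/sh
# restore_as_of is a hypothetical helper for illustration.
# usage: restore_as_of TIME BACKUP_DIR TARGET
# Restores BACKUP_DIR as of TIME into TARGET, but only if TARGET is new.
# Set DRY_RUN=1 to print the rdiff-backup command instead of running it.
restore_as_of() {
    if [ -e "$3" ]; then
        echo "restore target $3 already exists, pick another folder" >&2
        return 1
    fi
    ${DRY_RUN:+echo} rdiff-backup -r "$1" "$2" "$3"
}

# Example use:
#   restore_as_of 7D /backup/home /home7daysago
```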

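Alternatively, you can point rdiff-backup directly at an increment file under /backup/home/rdiff-backup-data/increments. The time stamp in the increment's file name decides which version gets restored to /home7daysago, like so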

# rdiff-backup /backup/home/rdiff-backup-data/increments/home..dir /home7daysago

The -r parameter of rdiff-backup accepts time strings like

a. now
b. epoch time in seconds, like "123456890"
c. a date and time stamp, like "2007-09-12T01:00:00+01:00"
d. a number followed by m, h, D, W, M or Y for minutes, hours, days, weeks, months and years respectively, or a series of such pairs.
6h55m means 6 hours and 55 minutes ago; 1h2m3D means 3 days, 1 hour and 2 minutes ago.
e. a date format like YYYY/MM/DD, YYYY-MM-DD, MM/DD/YYYY or MM-DD-YYYY
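
The interval form in item d is the one most likely to be mistyped in scripts, so here is a sketch of a sanity check for it. It only tests the shape described above (number and unit pairs using m, h, D, W, M, Y) and is not how rdiff-backup itself parses time strings; is_interval is a made-up helper name.

```shell
#!/bin/sh
# is_interval is a hypothetical helper for illustration.
# Succeeds when $1 is one or more NUMBER+UNIT pairs, e.g. 7D or 6h55m.
is_interval() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]+[mhDWMY])+$'
}

# Example use:
#   is_interval "$age" || echo "bad interval: $age"
```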


DATA RESTORATION WITH RDIFF-WEB WEB INTERFACE
=============================================

Here's a nifty web interface that provides smooth restoration of data backed up with rdiff-backup.
Rdiff-web offers a web-based interface, with RSS feeds, for restoring rdiff-backup data.

rdiffWeb is a web interface for browsing and restoring from rdiff-backup repositories. It is written in Python and is distributed under the GPL license.

To see more rdiff-web screenshots in action, click here.

For more download info and documentation, click here.

USING SSH
=========

Optionally, when backing up to a remote location or server, it is advisable to do the backup over a compressed, encrypted or tunnelled connection such as SSH.

Transferring rdiff-backup data from a source host to a destination host requires a user account and/or access on both ends. When doing this, you can follow either a pull model or a push (dump) model, combined with the security and connectivity features of SSH.

Rdiff-backup supports this directly, which can be done like so:

# rdiff-backup /home/ backupuser@123.123.123.123::/home/backup/

This backs up /home to /home/backup/ on the remote host with IP address 123.123.123.123. The data transfer runs under the authorized backupuser account.

SSH access lists and account restrictions are not covered here, unfortunately.
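
The push and pull models mentioned above differ only in which side carries the user@host:: prefix. Here is a sketch of both; the function names and the DRY_RUN switch are invented for this example, and working SSH access for the given account is assumed.

```shell
#!/bin/sh
# remote_spec, push_backup and pull_backup are hypothetical helpers.

# usage: remote_spec USER HOST PATH   (prints USER@HOST::PATH)
remote_spec() {
    printf '%s@%s::%s\n' "$1" "$2" "$3"
}

# Push model: local SOURCE goes to a folder on the remote host.
# usage: push_backup SOURCE USER HOST REMOTE_PATH
push_backup() {
    ${DRY_RUN:+echo} rdiff-backup "$1" "$(remote_spec "$2" "$3" "$4")"
}

# Pull model: a folder on the remote host is backed up to local DEST.
# usage: pull_backup USER HOST REMOTE_PATH DEST
pull_backup() {
    ${DRY_RUN:+echo} rdiff-backup "$(remote_spec "$1" "$2" "$3")" "$4"
}

# Example use:
#   push_backup /home/ backupuser 123.123.123.123 /home/backup/
```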

DATA BACKUP LISTINGS
====================

To see which backup increments are available for restoration, for instance before restoring several files from different increments

# rdiff-backup -l /mnt/backup/destination


BACKUP DATA WITH RDIFF-BACKUP AUTOMATICALLY USING CRON
=======================================================

Linux scheduling with crontab and the at utility has been covered further here.

Crontab and at are a great help to most users managing Linux boxes; they are working server robots that do your job very well when configured properly.
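
As a rough idea of how rdiff-backup and cron fit together, here is a sketch of a nightly job. The script path, log file, schedule and lock folder are all assumptions made up for this example; the lock simply keeps a slow backup from overlapping with the next run.

```shell
#!/bin/sh
# Hypothetical crontab entry (edit with "crontab -e"), 02:30 every night:
#   30 2 * * * /usr/local/sbin/nightly-rdiff.sh >> /var/log/nightly-rdiff.log 2>&1

# nightly_backup is a hypothetical helper for illustration.
# usage: nightly_backup SOURCE DEST
# Set DRY_RUN=1 to print the rdiff-backup command instead of running it.
nightly_backup() {
    lock="${LOCKDIR:-/tmp/nightly-rdiff.lock}"
    # mkdir is atomic, so it doubles as a simple lock.
    if ! mkdir "$lock" 2>/dev/null; then
        echo "previous backup still running, skipping this run" >&2
        return 1
    fi
    ${DRY_RUN:+echo} rdiff-backup --print-statistics "$1" "$2"
    status=$?
    rmdir "$lock"
    return "$status"
}

# In the real script the last line would be something like:
#   nightly_backup /home /backup/home
```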



FINAL NOTE:
===========

Do backups regularly, as this reduces the chance of losing vital data and important company files, as discussed earlier in the first Linux backup entry taken here. Backup destination, strategic approach, source file(s), storage capacity, scheduled backup interval and expected time vary from system to system, company to company and from ground to ground.

Bottom line: make sure you have incremental backups of the past for proper filing and reference.

Joe, take a look at the black box around.

Cheers



