Friday, August 31, 2007

Linux backups powered by Tar

Make backups of your data.

Anyone who manages servers usually backs them up on a regular schedule. There are several broad approaches to backing up data, and which files and folders you back up depends on why you need the backup in the first place.

There are plenty of storage options and backup destinations to choose from nowadays. The choice of storage type comes down to where the backups should live and what is available on the ground. People back up their work, regularly or occasionally, to external tape or zip drives, large-capacity portable USB drives, available backup servers, removable SCSI hard disks, network SAN drives or even flash drives. How often to back up depends on how both the server and its end users work with the data.

Backup procedures range from simple Linux backup commands and non-interactive shell scripts up to commercial backup products.

The bottom line is that backup techniques and strategies depend on what you have and what is available on the ground.

This entry covers the basic approach and the most commonly used arguments for backing up data with the Linux tar command.

=============================================
Data backup samples using Tar Linux command
=============================================

A very simple usage of tar

# tar cvf backup.tar *

Legend:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-c create a new tar file
-v verbose mode
-f use the named archive file (here backup.tar)
* shell glob, matches the files in the current directory
By default, tar descends into subdirectories
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Turning the archive into a backup copy is as simple as moving it to a different location, preferably on another storage device. As discussed earlier, the backup strategy depends on what is on the ground. If you have no spare host or external backup device but do have a removable or separate hard disk, you can copy the backup file there.

If you wish to copy it to a separate disk mounted on its own mount point, a plain file copy does the job like so

# cp backup.tar /mounted/separate/harddisk/backup.tar

If you wish to copy your backup file to a mounted external USB or flash device, similarly it would be done like so

Assuming that the external drive is already mounted
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# cp backup.tar /mnt/usb

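If the drive is not mounted yet, mount its partition first and then copy.
The device name varies; /dev/sdb1 is only an example. Never cp to the
device node itself, as that would overwrite the filesystem on it.
# mount /dev/sdb1 /mnt/usb
# cp backup.tar /mnt/usb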
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Copying data from host to host can be done in many ways. One of them is secure copy, scp. If you wish to copy a tar file between two hosts, this is the likely approach

# scp -C backup.tar username_from_other_host@other_host:destination_location
# scp -Cpv backup.tar username_from_other_host@other_host:destination_location

Remember, Linux gives us total control by chaining commands. Note that scp cannot read the archive from a pipe or from command substitution, so create the archive first and copy it only if tar succeeded, like so

# tar cvf backup.tar *.mp3 && scp -C backup.tar remoteuser@remotehost:~remoteuser

Alternatively, stream the archive over ssh without keeping a local copy

# tar cf - *.mp3 | ssh remoteuser@remotehost 'cat > backup.tar'

The above commands prompt for the user's password on the destination host.

Legend for scp:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-C enable compression during data transfer
-p preserve modification times, access times and modes of the original file
-v verbose mode
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
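
When the archive is only needed on the other side, tar can write to standard output (-f -) and the stream can be piped straight into a second tar that unpacks it. The sketch below is only an illustration: it runs locally between two throwaway directories made with mktemp, and the file names are made up.

```shell
# Sketch only: src and dest are temporary stand-ins for two locations.
src=$(mktemp -d)
dest=$(mktemp -d)
echo "song one" > "$src/one.mp3"
echo "song two" > "$src/two.mp3"

# Left side writes the archive to stdout (-f -),
# right side reads it from stdin and unpacks it under $dest.
tar -C "$src" -cf - . | tar -C "$dest" -xf -

# Show what arrived on the other side.
ls "$dest"
```

For a host-to-host copy the right-hand side would become something like ssh remoteuser@remotehost 'tar -xf - -C /some/folder', assuming ssh access to that host and an existing destination folder.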

These backup files may eventually grow large. If you wish to burn them onto CD/DVD discs, you can do so.

However, if you would rather wrap the backup file in an ISO image before any host-to-host or disk-to-disk transfer, this could be done like so

# mkisofs -o myhostname-backup-image01.ISO backup.tar.gz

Then proceed with data transfer of backup files.

For more information on creating an ISO image from a large backup archive, see the recent entry on creating ISO images. DVD/CD burning applications were also discussed in an earlier entry.

Tar comes with many arguments. Some of them cannot be combined, such as creating a new tar file together with updating or listing an archive. Below are more working tar samples.

Another tar backup example using more tar arguments:

# tar cvfp backup.tar --exclude='*.mp3' --no-recursion *
# tar zcvfp backup.tar.gz * --no-selinux
# tar zcvfp backup.tar.gz *
# tar jcvfp backup.tar.bz2 *

Legend:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--exclude skip files matching a glob pattern; place it before the file list
-z filter the archive through gzip, compressing on the fly
-j filter the archive through bzip2, compressing on the fly
-p preserve permissions
-c create a new tar file
-v verbose mode
-f use the named archive file
* shell glob, matches the files in the current directory
--no-selinux do not store SELinux security context information
--no-recursion do not recurse into subdirectories

By default, tar descends into subdirectories recursively
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
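
To see the exclude and compression arguments working together, here is a small sketch that can be run safely. It assumes GNU tar, works inside a throwaway directory from mktemp, and uses made-up file names.

```shell
# Sketch only: a temporary tree with one text file and one mp3.
work=$(mktemp -d)
mkdir -p "$work/data/music"
echo "notes" > "$work/data/notes.txt"
echo "audio" > "$work/data/music/track.mp3"

# Filtering options go before the names they should apply to.
# Quoting the pattern keeps the shell from expanding it.
tar -C "$work" -zcpf "$work/backup.tar.gz" --exclude='*.mp3' data

# List the compressed archive to confirm what was stored.
tar -tzf "$work/backup.tar.gz"
```

Listing the archive right after creating it is a cheap way to confirm that the pattern excluded what was intended.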

If you wish to exclude a directory from being archived using tar

# tar cvfp backup.tar --exclude=FOLDERNAME *

The --exclude argument also matters when you work in the same folder where the tar file itself is being saved and processed, since the archive should not be added to itself.

If you wish to update an existing tar file with changed files or newly added folders, follow like so

# tar uvfp backup.tar --exclude=backup.tar *
# tar uvfp backup.tar --exclude='*.tar' *

In the samples above, files that are new, or newer than the copies already in the archive, are appended to the existing tar file, leaving out the backup file itself. The existing archive is not overwritten; the newer copies are added at the end and replace the older ones on extraction. Update mode works only on uncompressed tar files.

Legend:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-t list down archived files and folder
-u update an existing tar file
--exclude exclude file globbing pattern
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The --exclude argument skips any file matching the given glob pattern. In the example above, backup.tar, and in the second command any file ending with the .tar extension (*.tar), is left out of the update.

If you wish to exclude a batch of files that share no filename pattern, you can create a text file listing the filenames to be excluded from the tar operation like so

# cat except.txt
~~~~~~~~~~~~~~~~~~~~~~
filename1.mp3
batibot.ISO
sesame.tar.gz
....
snipped
....
~~~~~~~~~~~~~~~~~~~~~~

and feed it to tar with the -X argument, placed before the file list, like so

# tar cvfp backup.tar -X except.txt *
# tar cvfp backup.tar -X except.txt -X except2.txt *
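
As a quick check of the -X approach, the sketch below builds a small exclude list and lists the resulting archive. It assumes GNU tar and runs in a throwaway directory; keep.txt is a made-up name, the other names follow the sample list above.

```shell
# Sketch only: work in a temporary directory with empty sample files.
work=$(mktemp -d)
cd "$work" || exit 1
touch keep.txt filename1.mp3 batibot.ISO

# One file name (or glob pattern) per line.
printf '%s\n' filename1.mp3 batibot.ISO > except.txt

# -X comes before the file list so it applies to every name that follows.
tar -cpf backup.tar -X except.txt *

# Only the files that are not in except.txt should be listed.
tar -tf backup.tar
```

The exclude list itself ends up in the archive here, which is handy as a record of what was left out.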





If you wish to append a new set of files or folders to an existing, uncompressed tar archive, the tar command would be

# tar rvfp /home/oldfolder/backup.tar /home/newfolder
# tar rvfp /home/oldfolder/backup.tar /home/newfolder/folder1/*

If you wish to backup a floppy disk using tar, you could do like so

Mount the floppy disk first like so
# mkdir /mnt/floppy
# mount /dev/floppy /mnt/floppy

And tar all contents from floppy disk like so
# tar cvfp floppy.tar /mnt/floppy/.

If you wish to backup a CD disk using tar, you could do like so

Mount the CD disk first like so
# mount /dev/cdrom /mnt/cdrom
# tar cvf CDfiles.tar /mnt/cdrom

If you wish to backup the entire harddisk partition

# tar zcvfp home_partition_backup.tar.gz /mounted/partition
# tar zcvfp home_partition.tar.gz /mnt/home

A tape backup with tar is much the same, except that the tape device takes the place of the archive file

# tar cvfp /dev/st0 /mnt/home

A USB flash drive can hold a small tar backup like so (docs.tar.gz is just an example name)

# tar zcvf /mnt/usb1/docs.tar.gz *.doc

Several tar files can also be concatenated with tar. The command below appends the contents of home2.tar to home1.tar; both must be uncompressed tar files

# tar Afvp home1.tar home2.tar

If you wish to archive files matched across multiple directories, or files picked dynamically by the find command, it would be like so. Note that this form breaks on file names containing spaces

# tar zcvf backup.tar.gz `find /home -name '*.txt'`
# tar zcvf backup.tar.gz `find /etc -name '*.conf'`
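
A sturdier variant, sketched below, hands the names from find to tar as a null-separated list, so spaces in file names do no harm. It assumes GNU find and GNU tar (--null and -T are GNU options), and the directory layout is made up for the demonstration.

```shell
# Sketch only: a temporary tree with awkward file names.
work=$(mktemp -d)
mkdir -p "$work/home/user one"
echo "a" > "$work/home/user one/my notes.txt"
echo "b" > "$work/home/plain.txt"
echo "c" > "$work/home/skip.log"

cd "$work" || exit 1

# find prints null-terminated names; tar reads them from stdin (-T -).
# --null must come before -T so the list is parsed correctly.
find home -name '*.txt' -print0 | tar -zcf backup.tar.gz --null -T -

# Confirm that only the .txt files were stored.
tar -tzf backup.tar.gz
```

Reading the list from a pipe also avoids the command-line length limit that backticks can hit when find returns thousands of files.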

Incremental Backup or Archiving using Tar
-----------------------------------------

# tar cvfGp incremental-backup.tar *.txt
# tar cvfGp incremental-backup.tar /home/foldername

Alternatively, using a reference date for the incremental backup would be like so

# tar cvfGp incremental-backup.tar -N '1 Sep 2007' *.dat

The above command archives only those *.dat files whose contents or status (permissions, ownership) changed after the 1st of September 2007.

# tar cvfGp incremental-backup.tar --newer-mtime '1 Sep 2007' *.dat

The above command archives only those *.dat files whose modification time is newer than the 1st of September 2007.
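
GNU tar can also keep track of the previous run by itself through a snapshot file given with -g (--listed-incremental), so no reference date has to be typed by hand. The sketch below is a minimal illustration under that assumption; the directory, archive and snapshot names are made up.

```shell
# Sketch only: a temporary folder stands in for the data to back up.
work=$(mktemp -d)
mkdir "$work/data"
echo "first" > "$work/data/old.txt"
sleep 1   # keep old.txt clearly older than the first backup run

# Level 0: full backup; tar records what it saw in backup.snar.
tar -C "$work" -cf "$work/level0.tar" -g "$work/backup.snar" data

sleep 1
echo "second" > "$work/data/new.txt"

# Level 1: same snapshot file, so only changes since level 0 are stored.
tar -C "$work" -cf "$work/level1.tar" -g "$work/backup.snar" data

# The second archive should list the folder and the new file only.
tar -tf "$work/level1.tar"
```

To restore, extract level0.tar first and then each later level in order, passing -G to tar each time so the incremental directory information is honored.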

Verification Listings of Archived Tar file
------------------------------------------

If you wish to list down or verify files inside an archived backup tar file

# tar tfv backup.tar

If you wish to check whether a specific file, folder or set of files is listed in a particular tar archive

# tar tfvp backup.tar | grep specificfile.txt
# tar tfvp backup.tar | grep 'file1\|file2'
# tar tfvp backup.tar | grep 'folder1'

With thousands of files listed from a single tar file, you may wish to leave a folder out of the listing. The --exclude argument belongs to tar, not grep, so it goes before the pipe like so

# tar tfvp backup.tar --exclude=THISFOLDER | grep 'folder1'
# tar tfvp backup.tar --exclude='*.mp3' --exclude=FOLDERNAME | grep 'folder1'

Extraction of files from an Archived Tar File
---------------------------------------------

For basic extraction of an archived tar file
# tar xvf backup.tar

For extraction of a gzip-compressed tar file
# tar zxvf backup.tar.gz

For extraction of a bzip2-compressed tar file
# tar jxvf backup.tar.bz2

For extraction of an archived tar file to a specific destination folder
# tar xvf backup.tar -C /tmp
# tar jxvf backup.tar.bz2 -C /tmp/test
# tar zxvf backup.tar.gz -C /home/sesame
# tar zxvf backup.tar.gz -C /home/sesame --overwrite --overwrite-dir

Legend:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--overwrite overwrite existing files when extracting
--overwrite-dir overwrite the metadata of existing directories when extracting
--delete be careful with this one: it does not extract anything, it removes the named members from the tar file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
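
A backup is only as good as its restore, so it is worth trying a round trip now and then. The sketch below is only an illustration: it archives a made-up folder, extracts it into a separate folder with -C, and compares the two trees with diff.

```shell
# Sketch only: temporary source and restore folders.
work=$(mktemp -d)
mkdir -p "$work/src/docs" "$work/restore"
echo "hello" > "$work/src/docs/readme.txt"

# Create a gzip-compressed archive of the src folder.
tar -C "$work" -zcpf "$work/backup.tar.gz" src

# Extract it somewhere else; the -C folder must already exist.
tar -zxpf "$work/backup.tar.gz" -C "$work/restore"

# diff -r is silent and returns 0 when both trees are identical.
diff -r "$work/src" "$work/restore/src" && echo "restore matches source"
```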

Integration of tar backup procedure with shell scripts
-------------------------------------------------------

Backups can run non-interactively, in automatic mode. The first step is to create a shell script. In it you list all the necessary tar backup commands, one per line, and then make the script executable. A very basic backup script is simply a combination of tar commands covering the files and folders you need to back up.

Sample simple backup script
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#!/bin/bash
# backup scripts basic examples

# Go to backup or temp folder; stop if it is not there
cd /tmp || exit 1

# what is happening?
echo "Starting the backup process..."

# Do all tar backup lines below, one archive name per line
# (pages-backup.tar.gz is an example name so the second line
# does not overwrite the first archive)
tar zcvfp backup.tar.gz --exclude='*.ISO' --exclude='*.tar' /home
tar zcvfp pages-backup.tar.gz /home/www/pages

# Make a copy from here to there

cp backup.tar.gz pages-backup.tar.gz /mnt/separate/hardisk/or/any/mounted/device

# Transfer the backup files from host to host
# Remember an entry with passwordless / passphraseless ssh connection discussed recently?
scp -Cp backup.tar.gz pages-backup.tar.gz user@hostname:~user

# Gone with the wind, now delete footprints in the sand to save local disk space
rm -f backup.tar.gz pages-backup.tar.gz

# I see
echo Done
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


File Naming Conventions
-----------------------

One good practice in administering backup files is a good filename convention. This applies to handling files in most read/write operations.

Yes, we need a unique filename. A unique file name is required for repeated backup routines. Linux can generate random characters and numbers, but the problem with that approach is that random names tell you nothing about which backup is which.

What else could we use as a unique filename for our backup files?

Ding! A date string is unique for daily backup operations, and adding a time string makes the name unique for hourly backups as well. Below are examples of a unique file naming convention based on the current system date and time. This approach makes each file in a group of backups easy to identify.

This date string command was also recently discussed from INQ7.

Example of how to use from shell script

# cat shellscript.sh
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#!/bin/bash

ID=`date "+%m-%d-%Y"`

# Daily backup routine with daily unique file name
tar zcvf webpages-$ID.tar.gz /home/www
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# chmod 700 shellscript.sh

A unique filename based on the date, done from the command line terminal with tar, would be like

# tar zcvf home1-folders-`date "+%m-%d-%Y"`.tar.gz /home/www

Another unique filename based on time and date values done from terminal would be like

# tar zcvf www-2-pages-`date "+%m-%d-%Y-%I-%M-%S-%P"`.tar.gz /var/www/html2
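
One variation worth considering, sketched below, is to put the year first in the stamp. Names built that way sort chronologically in a plain directory listing. The folder and archive names here are made up, and the sketch runs inside a throwaway directory.

```shell
# Sketch only: a temporary folder stands in for the web pages.
work=$(mktemp -d)
mkdir "$work/www"
echo "<html></html>" > "$work/www/index.html"

# Year-month-day stamps sort in date order when listed by name.
stamp=$(date "+%Y-%m-%d")
archive="$work/webpages-$stamp.tar.gz"

tar -C "$work" -zcf "$archive" www

# The archive name carries the date it was made.
ls "$work"
```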

Backup Scheduling Using Crontab Utility
---------------------------------------

Putting your backup commands or scripts into the Linux crontab utility is required for nicely scheduled backup procedures that run non-interactively and automatically on a regular basis. There was a recent crontab howto discussion as well.

Some companies run their backups in the morning because most of their data transactions happen at night. Several run them during sleeping hours, around midnight. Usually, though, the schedule depends on each data backup requirement and on the cut-off dates and times set by company policy.
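
As an illustration of what such a schedule could look like, the sketch below builds a crontab entry that runs a backup script at 02:30 every night. The script path, the log file and the time are all assumptions; pick whatever fits the policy on the ground.

```shell
# Hypothetical script path and log file; adjust both to the local setup.
entry='30 2 * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1'

# The five leading fields are minute, hour, day of month, month, day of week.
printf '%s\n' "$entry"

# To install it for the current user (left commented out on purpose):
# ( crontab -l 2>/dev/null; printf '%s\n' "$entry" ) | crontab -
```

Redirecting the output to a log file keeps a record of each run, which helps when a backup silently fails.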

Final Note:

Data backups are SOPs, not only because they serve as a fallback data source but also because they serve as history records and future references.

Backup data are just dead 1's and 0's of the past, but in one Linux snap they can be brought back to life when we need them most.

So good luck, that's it for now.

Unfortunately, screenshots of how the commercial Tivoli Storage backup system works with Linux will not be shown here.

The next entry will cover backing up files using Linux rsync and rsnapshot.

Have a nice weekend!
