Now that you’ve got your ZFS Home Fileserver up and running, with your file systems created and shared to other machines on your home network, it’s time to think about putting a backup policy in place.
I’ll show a few different possibilities open to you.
It’s all very well having a RAIDZ setup on your fileserver, with single-parity redundancy and block checksums to protect you from a single failed drive, and snapshots to guard against accidental deletions, but you still need backups, just in case something really awful happens.
I built myself a backup machine for around 300 euros, using similar hardware to that described in the Home Fileserver: ZFS hardware article, but reduced the cost by choosing cheaper components and reusing old SATA drives that were lying unused on the shelf. I will describe the components for this backup machine in more detail elsewhere.
For the purposes of this article, we’ll perform a backup from the fileserver to the backup machine.
The backup machine has Solaris installed on an old Hitachi DeathStar IDE drive I had lying around. These drives don’t have a particularly stellar reliability record, but I don’t care too much as nothing apart from the OS will be installed on this boot drive. All ZFS-related stuff is stored on the SATA drives that form the storage pool and this will survive even if the boot drive performs its ‘click of death’ party trick 🙂
The SATA drives I had lying around were the following: a 160GB Maxtor, a 250GB Western Digital, a 320GB Seagate and a 500GB Samsung. In total these drives yielded about 1.2TB of storage space when a non-redundant pool was created with them all. I chose to have no redundancy in this backup machine to squeeze as much capacity as possible from the drives; after all, the data is on the fileserver. In a perfect world I should probably have redundancy on this backup machine too, but never mind, this setup already gives us pretty good defences against data loss.
So let’s create the ZFS storage pool now from these disks. First let’s get the ids of the disks we’ll use:
# format < /dev/null
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c0d0
          /pci@0,0/pci-ide@4/ide@0/cmdk@0,0
       1. c1t0d0
          /pci@0,0/pci1043,8239@5/disk@0,0
       2. c1t1d0
          /pci@0,0/pci1043,8239@5/disk@1,0
       3. c2t0d0
          /pci@0,0/pci1043,8239@5,1/disk@0,0
       4. c2t1d0
          /pci@0,0/pci1043,8239@5,1/disk@1,0
Specify disk (enter its number): #
Disk id 0 is the boot drive — the IDE disk. For our non-redundant storage pool, we’ll use disks 1 to 4:
# zpool create backup c2t0d0 c2t1d0 c1t1d0 c1t0d0
#
# zpool status backup
  pool: backup
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        backup      ONLINE       0     0     0
          c2t0d0    ONLINE       0     0     0
          c2t1d0    ONLINE       0     0     0
          c1t1d0    ONLINE       0     0     0
          c1t0d0    ONLINE       0     0     0

errors: No known data errors
#
# zpool list
NAME     SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
backup  1.12T   643G   503G   56%  ONLINE  -        <-- here's one I already used a bit
#
# zfs list
NAME     USED  AVAIL  REFER  MOUNTPOINT
backup  1.07T  28.1G    19K  /backup
#
This created a storage pool with around 1.12TB of capacity. The output shown above is actually from a pool I had already been using for a while, which is why 56% of its capacity shows as used.
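For comparison, if I had decided I did want redundancy on this backup box, a mirrored layout using the same four disks might look something like the sketch below. This is just an illustration, and the pairing of disks is arbitrary: each mirror is limited by its smaller member, so with drives of such different sizes a fair chunk of capacity would be sacrificed.

# zpool create backup mirror c2t0d0 c2t1d0 mirror c1t1d0 c1t0d0

With that layout, ZFS could repair any corrupt blocks it finds in one half of a mirror from the other half, at the cost of giving up roughly half the raw capacity.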
Let's try out iSCSI
As I'd heard that iSCSI performs well, I thought it should make a good choice for performing fast backups across a Gigabit switch on a network.
Quoting from the Wikipedia article on iSCSI:
iSCSI is a protocol that allows clients (called initiators) to send SCSI commands (CDBs) to SCSI storage devices (targets) on remote servers. It is a popular Storage Area Network (SAN) protocol, allowing organizations to consolidate storage into data center storage arrays while providing hosts (such as database and web servers) with the illusion of locally-attached disks. Unlike Fibre Channel, which requires special-purpose cabling, iSCSI can be run over long distances using existing network infrastructure.
Sounds good, let's try it between these two Solaris boxes: (1) the fileserver, and (2) the backup machine.
At this point, I needed to work out from the notes I kept of my various iSCSI experiments which commands I had actually run. But here is another nice feature of ZFS: it keeps a record of all major actions performed on storage pools. So I'll ask ZFS to tell me which incantations I performed on this pool previously:
# zpool history backup
History for 'backup':
2008-02-26.19:40:29 zpool create backup c2t0d0 c2t1d0 c1t1d0 c1t0d0
2008-02-26.19:43:24 zfs create backup/volumes
2008-02-26.20:07:24 zfs create -V 1100g backup/volumes/backup
2008-02-26.20:09:16 zfs set shareiscsi=on backup/volumes/backup
#
So we can see that I created the ‘backup’ pool without redundancy, then created a file system called ‘backup/volumes’, then created a 1100GB (1.1TB) volume called ‘backup’. Finally, I set the ‘shareiscsi’ property of the ‘backup’ volume to the value ‘on’, meaning that this volume will become an iSCSI target and other interested machines on the network will be able to access it.
Let’s take a look at the properties for this volume.
# zfs get all backup/volumes/backup
NAME                   PROPERTY        VALUE                  SOURCE
backup/volumes/backup  type            volume                 -
backup/volumes/backup  creation        Tue Feb 26 20:07 2008  -
backup/volumes/backup  used            1.07T                  -
backup/volumes/backup  available       485G                   -
backup/volumes/backup  referenced      643G                   -
backup/volumes/backup  compressratio   1.00x                  -
backup/volumes/backup  reservation     none                   default
backup/volumes/backup  volsize         1.07T                  -
backup/volumes/backup  volblocksize    8K                     -
backup/volumes/backup  checksum        on                     default
backup/volumes/backup  compression     off                    default
backup/volumes/backup  readonly        off                    default
backup/volumes/backup  shareiscsi      on                     local
backup/volumes/backup  copies          1                      default
backup/volumes/backup  refreservation  1.07T                  local
#
Sure enough, you can see that it’s shared using the iSCSI protocol and that this volume uses the whole storage pool.
This iSCSI shared volume is known as an ‘iSCSI target’. In iSCSI parlance there are iSCSI targets (the servers) and iSCSI initiators (the clients).
Now let’s enable the Solaris iSCSI Target service:
# svcadm enable system/iscsitgt
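Before going further, it's worth a quick sanity check that the service really did come online (the first command should show a STATE of 'online'; if it doesn't, the second will usually explain why):

# svcs system/iscsitgt
# svcs -x system/iscsitgt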
Now let’s verify that the system indeed thinks that this volume is an iSCSI target before we proceed further:
# iscsitadm list target -v
Target: backup/volumes/backup
    iSCSI Name: iqn.xxxx-xx.com.sun:xx:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    Alias: backup/volumes/backup
    Connections: 1
        Initiator:
            iSCSI Name: iqn.xxxx-xx.com.sun:0x:x00000000000.xxxxxxxx
            Alias: fileserver
    ACL list:
    TPGT list:
    LUN information:
        LUN: 0
            GUID: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
            VID: SUN
            PID: SOLARIS
            Type: disk
            Size: 1.1T
            Backing store: /dev/zvol/rdsk/backup/volumes/backup
            Status: online
#
This was performed after the iSCSI initiator was configured and connected, so you see ‘Connections: 1’ and the initiator’s details.
Now we’re done with setup on the backup server. We’ve created a backup volume with 1.1TB of storage capacity from a mixture of old, disparate drives that were lying around, and we’ve made it available as an iSCSI target to machines on the network, which is exactly what we need, since the fileserver has to be able to write to it to perform a backup.
Time to move on now to the client machine — the fileserver, which is known as the iSCSI initiator.
Let’s do the backup
Now that we’re back on the fileserver, we need to configure it so it can access the iSCSI target we just created. Luckily, with Solaris that’s simple.
iSCSI target discovery is possible in Solaris via three mechanisms: iSNS, static and dynamic discovery. For simplicity, I will only describe static discovery — i.e. where you specify the iSCSI target’s id and the IP address of the machine hosting the iSCSI target explicitly:
# iscsiadm modify discovery --static enable
# iscsiadm add static-config iqn.xx-xx.com.sun:xx:xx-xx-xx-xxxx-xxxx,192.168.xx.xx
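At this point you can confirm that the initiator has actually discovered the target, and ask Solaris to create device nodes for any newly visible iSCSI LUNs, with something along these lines:

# iscsiadm list target
# devfsadm -i iscsi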
Now that we’ve enabled our fileserver to discover the iSCSI target volume called ‘backup’ on the backup machine, we’ll try to get hold of its ‘disk’ id so that we can create a local ZFS pool with it. After all, it’s a block device just like any other disk, so ZFS can use it just like a local, directly-attached physical disk:
# format < /dev/null
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c0d0
          /pci@0,0/pci-ide@4/ide@0/cmdk@0,0
       1. c1t0d0
          /pci@0,0/pci1043,8239@5/disk@0,0
       2. c1t1d0
          /pci@0,0/pci1043,8239@5/disk@1,0
       3. c2t0d0
          /pci@0,0/pci1043,8239@5,1/disk@0,0
       4. c3t0100001E8C38A43E00002A0047C465C5d0
          /scsi_vhci/disk@g0100001e8c38a43e00002a0047c465c5
Specify disk (enter its number): #
The disk id of this backup volume is the one at item number 4 — the one with the really long id.
Now let’s create the storage pool that will use this volume:
# zpool create backup c3t0100001E8C38A43E00002A0047C465C5d0
#
# zpool list
NAME     SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
backup  1.07T   623G   473G   56%  ONLINE  -
tank    2.03T  1002G   1.05T  48%  ONLINE  -
test    3.81G   188K   3.81G   0%  ONLINE  -
#
Voilà, the pool ‘backup’, which uses the iSCSI target volume ‘backup’ hosted on the backup machine, is now usable, so now let’s do the backup — finally! 🙂
For demo purposes I created a 4GB folder of video content to back up. We’ll time it being sent over a Gigabit network to see how fast it gets transferred — gotta have some fun after all this aggro, haven’t we? 🙂
# du -hs ./test_data
4.0G   ./test_data
#
# date ; rsync -a ./test_data /backup ; date
Thursday, 13 March 2008 00:20:55 CET
Thursday, 13 March 2008 00:21:50 CET
#
OK, so 4GB was copied from the fileserver to the backup machine in 55 seconds, which works out to a sustained rate of roughly 73 MBytes/second (about 4,000 MB divided by 55 seconds), not bad at all! 🙂
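One more thing worth doing once real data is sitting on the backup pool: run an occasional scrub so that the block checksums actually get verified. On a non-redundant pool like this one a scrub can only detect corruption, not repair it, but at least you'll know when something needs copying over again from the fileserver:

# zpool scrub backup
# zpool status backup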
That’s all folks!
I’ll tackle other subjects soon like incremental backups using ZFS commands and also using good old ‘rsync’.
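As a small taster of the ZFS approach, an incremental backup boils down to snapshotting a file system and sending only the difference between two snapshots to the backup machine. The sketch below is purely illustrative: the file system ‘tank/data’, the snapshot names and the host name ‘backupserver’ are made up, and you’d need ssh access and a suitable destination pool on the receiving side.

# zfs snapshot tank/data@mon
# zfs send tank/data@mon | ssh backupserver zfs receive backup/data
(later, after more changes...)
# zfs snapshot tank/data@tue
# zfs send -i tank/data@mon tank/data@tue | ssh backupserver zfs receive backup/data

The first full send creates ‘backup/data’ on the receiving machine; each subsequent incremental send only transfers the blocks that changed between the two snapshots.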