Backups are critical to keeping your data protected, so let’s discover how to use ZFS snapshots to perform full and incremental backups.
In the last article on ZFS snapshots, we saw how to create snapshots of a file system. Now we will use those snapshots to create a full backup and subsequent incremental backups.
Performing backups
Obviously we only created a small number of files in the previous ZFS snapshots article, but we can still demonstrate the concept of using snapshots to perform full and incremental backups.
We’ll write our backups to a backup target file system called ‘tank/testback’.
This backup target could live in the same pool, as in our simple example, but it would more likely live in another pool, either on the same physical machine or on a remote machine reachable over the network, for example via iSCSI or ssh.
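For the remote case, the usual approach is to pipe the send stream through ssh. Here is a minimal sketch, assuming a hypothetical host 'backuphost' with a hypothetical pool 'backuppool', and an ssh login that is allowed to run 'zfs receive' there. The function only prints the pipeline so you can review it before running it yourself:

```shell
#!/bin/sh
# remote_full_backup_cmd SNAPSHOT HOST TARGET
# Prints (does not run) a full send of SNAPSHOT to TARGET on HOST over ssh.
# "backuphost" and "backuppool" below are made-up names for illustration.
remote_full_backup_cmd() {
    snapshot=$1
    host=$2
    target=$3
    printf 'zfs send %s | ssh %s zfs receive %s\n' "$snapshot" "$host" "$target"
}

remote_full_backup_cmd tank/test@1 backuphost backuppool/testback
```

Once the printed line looks right, it can be pasted into a root shell on the source machine.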
Full backup
Now let’s do a full initial backup from the ‘tank/test@1’ snapshot:
# zfs send tank/test@1 | zfs receive tank/testback
Let’s take a look at the file systems to see what’s happened:
# zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
tank              766G   598G  28.0K  /tank
tank/test        94.6K   598G  26.6K  /tank/test
tank/test@1      23.3K      -  26.6K  -
tank/test@2      21.3K      -  26.0K  -
tank/test@3      21.3K      -  26.0K  -
tank/test@4          0      -  26.6K  -
tank/testback    25.3K   598G  25.3K  /tank/testback
tank/testback@1      0      -  25.3K  -
Well, the command not only created the file system ‘tank/testback’ to contain the files from the backup, but it also created a snapshot called ‘tank/testback@1’. The reason for the snapshot is so that you can get the state of the backups at any point in time.
As we send more incremental backups, new snapshots will be created, enabling you to restore a file system from any snapshot. This is really powerful!
Let’s just take a look at the files in the full backup — it should contain the original files referenced from our initial snapshot ‘tank/test@1’.
# ls -l /tank/testback
total 4
-rw-r--r-- 1 root root 15 May 12 14:50 a
-rw-r--r-- 1 root root 15 May 12 14:50 b
# cat /tank/testback/a /tank/testback/b
hello world: a
hello world: b
As we expected. Good 🙂
Incremental backups
Now let’s do an incremental backup, that will only transmit the differences between snapshots ‘tank/test@1’ and ‘tank/test@2’:
# zfs send -i tank/test@1 tank/test@2 | zfs receive tank/testback
cannot receive incremental stream: destination tank/testback has been modified since most recent snapshot
Oh dear! For some reason, doing the ‘ls’ of the directory, when we inspected the backup contents, has actually modified the file system.
I have no idea how this happens or why, but I have seen this problem, or phenomenon, mentioned elsewhere.
It appears that the solution is to set the backup target file system to be read only, like this:
# zfs set readonly=on tank/testback
Another possibility is to use the ‘-F’ switch with the ‘zfs receive’ command, which rolls the target file system back to its most recent snapshot before receiving the stream. I don’t know which is the recommended solution, but I will use the switch for now, rather than make the file system read only, as we have several incremental backups to perform:
# zfs send -i tank/test@1 tank/test@2 | zfs receive -F tank/testback
Let’s take a look at the files in the backup now. They should match the state captured in the ‘tank/test@2’ snapshot, i.e. just file ‘b’:
# ls -l /tank/testback
total 2
-rw-r--r-- 1 root root 15 May 12 14:50 b
# cat /tank/testback/b
hello world: b
Good, as expected 🙂
Now let’s send all the remaining incremental backups:
# zfs send -i tank/test@2 tank/test@3 | zfs receive -F tank/testback
# zfs send -i tank/test@3 tank/test@4 | zfs receive -F tank/testback
# zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
tank              766G   598G  29.3K  /tank
tank/test        94.6K   598G  26.6K  /tank/test
tank/test@1      23.3K      -  26.6K  -
tank/test@2      21.3K      -  26.0K  -
tank/test@3      21.3K      -  26.0K  -
tank/test@4          0      -  26.6K  -
tank/testback    93.2K   598G  26.6K  /tank/testback
tank/testback@1  22.0K      -  25.3K  -
tank/testback@2  21.3K      -  26.0K  -
tank/testback@3  21.3K      -  26.0K  -
tank/testback@4      0      -  26.6K  -
Here is the final state of the backup target file system after sending all the incremental backups.
As we would expect, it matches the source file system contents:
# cat /tank/testback/b /tank/testback/c
hello world: b modified
hello world: c
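With more than a handful of snapshots, typing each incremental send by hand gets tedious. The sketch below is a hypothetical helper (the function name is my own, not a ZFS command) that takes a source file system, a backup target and an ordered list of snapshot names, and prints one ‘zfs send -i’ pipeline per consecutive pair. It assumes the first snapshot in the list has already been received as the full backup, and it only prints the commands rather than running them:

```shell
#!/bin/sh
# incremental_chain SRC DST SNAP1 SNAP2 [SNAP3 ...]
# Prints one incremental send/receive pipeline for each consecutive
# pair of snapshots. Nothing is executed; the output is meant for review.
incremental_chain() {
    src=$1
    dst=$2
    shift 2
    prev=$1          # snapshot already present on the target
    shift
    for snap in "$@"; do
        printf 'zfs send -i %s@%s %s@%s | zfs receive -F %s\n' \
            "$src" "$prev" "$src" "$snap" "$dst"
        prev=$snap
    done
}

incremental_chain tank/test tank/testback 1 2 3 4
```

For the four snapshots used here this prints the same three incremental commands that were run above. The ‘-F’ switch is kept because that is what this article uses; with the target set read only it could presumably be dropped.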
Restore a backup
Now let’s restore all of our four backup target snapshots into four separate file systems, so we can demonstrate how to recover any or all of the data that we snapshotted and backed up:
# zfs send tank/testback@1 | zfs recv tank/fs1
# zfs send tank/testback@2 | zfs recv tank/fs2
# zfs send tank/testback@3 | zfs recv tank/fs3
# zfs send tank/testback@4 | zfs recv tank/fs4
Let’s look at the file systems:
# zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
tank              766G   598G  33.3K  /tank
tank/fs1         25.3K   598G  25.3K  /tank/fs1
tank/fs1@1           0      -  25.3K  -
tank/fs2         26.0K   598G  26.0K  /tank/fs2
tank/fs2@2           0      -  26.0K  -
tank/fs3         26.0K   598G  26.0K  /tank/fs3
tank/fs3@3           0      -  26.0K  -
tank/fs4         26.6K   598G  26.6K  /tank/fs4
tank/fs4@4           0      -  26.6K  -
tank/test        94.6K   598G  26.6K  /tank/test
tank/test@1      23.3K      -  26.6K  -
tank/test@2      21.3K      -  26.0K  -
tank/test@3      21.3K      -  26.0K  -
tank/test@4          0      -  26.6K  -
tank/testback    93.2K   598G  26.6K  /tank/testback
tank/testback@1  22.0K      -  25.3K  -
tank/testback@2  21.3K      -  26.0K  -
tank/testback@3  21.3K      -  26.0K  -
tank/testback@4      0      -  26.6K  -
Let’s check ‘tank/fs1’ – it should match the state of the original file system when the ‘tank/test@1’ snapshot was taken:
# ls -l /tank/fs1
total 4
-rw-r--r-- 1 root root 15 May 12 14:50 a
-rw-r--r-- 1 root root 15 May 12 14:50 b
# cat /tank/fs1/a /tank/fs1/b
hello world: a
hello world: b
Perfect, now let’s check ‘tank/fs2’ – it should match the state of the original file system when the ‘tank/test@2’ snapshot was taken:
# ls -l /tank/fs2
total 2
-rw-r--r-- 1 root root 15 May 12 14:50 b
# cat /tank/fs2/b
hello world: b
Perfect, now let’s check ‘tank/fs3’ – it should match the state of the original file system when the ‘tank/test@3’ snapshot was taken:
# ls -l /tank/fs3
total 2
-rw-r--r-- 1 root root 24 May 12 17:35 b
# cat /tank/fs3/b
hello world: b modified
Perfect, now let’s check ‘tank/fs4’ – it should match the state of the original file system when the ‘tank/test@4’ snapshot was taken:
# ls -l /tank/fs4
total 4
-rw-r--r-- 1 root root 24 May 12 17:35 b
-rw-r--r-- 1 root root 15 May 12 18:58 c
# cat /tank/fs4/b /tank/fs4/c
hello world: b modified
hello world: c
Great!
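Checking restored files by eye works for two or three files, but not for a real file system. As a rough sketch, a recursive ‘diff’ can confirm that a restored file system holds the same file contents as the source. The helper name ‘same_tree’ is my own invention, and the check only compares file contents, not permissions or timestamps:

```shell
#!/bin/sh
# same_tree DIR_A DIR_B
# Succeeds (exit status 0) only if both directory trees contain the same
# files with the same contents. Ownership and timestamps are not compared.
same_tree() {
    diff -r "$1" "$2" >/dev/null
}

# Compare the live source with the newest restore, if both are mounted.
if [ -d /tank/test ] && [ -d /tank/fs4 ]; then
    if same_tree /tank/test /tank/fs4; then
        echo "restore matches source"
    else
        echo "restore differs from source"
    fi
fi
```

This comparison is only meaningful if the source has not changed since the ‘tank/test@4’ snapshot was taken.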
Conclusion
Hopefully, you’ve now seen the power of snapshots. In future posts, I will show what else can be done with snapshots.
The reason why an ls changes the file system is probably because access times or other file / directory attributes are being modified… That would also explain the “growth” in a snapshot in the previous article even though you hadn’t changed the files or added any – by looking at the directories you are affecting them.
Hi Marc, thanks a lot for solving that mystery. I was wondering what caused it 🙂
Very nice article!
I was wondering if it is possible to directly mount a filesystem containing only the files changed between two snapshots!
Yeah, it is the access time that causes the warning. I hit the same thing myself. This can be turned off with:
# zfs set atime=off tank/testback
Those interested in doing backups with ZFS might want to take a look at Zetaback (available from https://labs.omniti.com/trac/zetaback ), which is a tool to help manage and automate backups using zfs snapshots.
Thanks a lot Mark.
Also, it looks like the next versions of the ‘Time Slider’ software in OpenSolaris 2008.11 will soon come with the ability to send incremental backups between two snapshots across the network to a second box, like a backup server. See:
http://blogs.sun.com/erwann/entry/zfs_on_the_desktop_zfs : look under the ‘What’s next?’ heading where it says ‘- network based backups’
I already have this functionality working on an ad hoc basis, but it will be nice to set this up once and have it automated when this new functionality becomes available.
There have been some comments on the zfs-discuss mailing list and OpenSolaris forum that appear relevant to people intending to use ZFS send/recv for backups (especially long term backups between different builds).
On http://opensolaris.org/os/community/on/flag-days/pages/2008042301/
Sun’s Matthew Ahrens says:
‘We have always disclaimed backwards compatibility of the “zfs send” stream format in the zfs(1m) manpage:
The format of the stream is evolving. No backwards compatibility is guaranteed. You may not be able to receive your streams on future versions of ZFS.
However up until now we have maintained compatibility regardless.
All Solaris 10 releases use the new, post-build-35 stream format, so no incompatibility will be introduced to Solaris 10.
We plan to maintain backwards compatibility of the “zfs send” stream format throughout all releases of Solaris, and intend to commit to this backwards compatibility at some point in the future.’
On http://opensolaris.org/jive/thread.jspa?messageID=389857&tstart=0
or http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg26438.html
Daniel Carosone says:
“The zfs send/recv format is not warranted to be compatible between revisions.
I’m concerned that, despite clear recommendations and advice against it, there seem to be a number of solutions appearing (like automated backup to cloud, via the auto-snapshot hooks) that use the stream format for long term storage, even from those within Sun. I think the message needs to be clear, either way – either endorse stream-format compatibility, or discourage such usage.”
Hi again Simon,
It’s funny, I’m looking to do something similar to this, but on FreeBSD. However, since I don’t trust zfs send/receive, I’ll be doing it with rsync as I want verification that the bits that end up on the target are the same as those that were read from the source. I’m pretty sure that rsync does that. I think this means that in order to get the snapshot goodness on the backup (assuming a different pool), you have to clone the snapshot, rsync it across, and take a snapshot on the target.
I suppose you could do it with zfs send provided that you do some sort of MD5 check yourself afterwards.
Actually Simon, I think I’m wrong. Sorry. ZFS send and receive should be fine so long as you aren’t just storing ZFS send streams, to be received at a later date. Doh.
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg19239.html
Simon, I can’t thank you enough for your blog post. I spent more than a month trying to identify what was happening. Is there a way around this? I mean, can I leave the ZFS file system in a read-only state and/or not mounted at startup on the system where it receives the snapshots?
I’ve been in this situation quite a few times now. The issue is that my external HDDs go bad within 2-3 years without warning, and my data is toast. The file system I have been using until now is NTFS, for maximum compatibility, but I don’t trust it any more.
And I am fed up now. Hardware keeps improving and expanding in terms of storage size, but these file system problems are a severe headache.
I just wanted to ask for a suggestion from you. Is it much better and less error-prone to use ZFS for all my partitions instead of NTFS? As NTFS is closed source, won’t the extent to which any recovery tool can help always be far less than for an open file system like ZFS?
I would have to deal with incompatibility on Windows, I know that. But the integrity of my data is much more important to me, and of course if this could give me fewer headaches it would be worth it.
What do you suggest?
You could also use this script to automate zfs backups: https://github.com/psy0rz/zfs_autobackup/blob/master/README.md
Simon,
Nine years or so after your post, I still find it very useful, like the rest of the series.
After over a year using ZFS on Ubuntu (currently 16.04), I am finally looking into backups using snapshots, and I am really wondering why I used rsync all this time.
Thank you very much for posting this series.