Home Fileserver: Backups from ZFS snapshots

Backups are critical to keeping your data protected, so let’s discover how to use ZFS snapshots to perform full and incremental backups.

In the last article on ZFS snapshots, we saw how to create snapshots of a file system. Now we will use those snapshots to create a full backup and subsequent incremental backups.

Performing backups

Obviously we only created a small number of files in the previous ZFS snapshots article, but we can still demonstrate the concept of using snapshots to perform full and incremental backups.

We’ll write our backups to a backup target file system called ‘tank/testback’.

This backup target could exist within the same pool, as in our simple example, but it would more likely live in another pool, either on the same physical machine or on any reachable host, accessed over iSCSI or ssh, for example.
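
For example, a full send to a remote machine over ssh might look something like this (the host ‘backuphost’ and pool ‘backuppool’ are hypothetical names, just for illustration):

# zfs send tank/test@1 | ssh backuphost zfs receive backuppool/testback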

Full backup

Now let’s do a full initial backup from the ‘tank/test@1’ snapshot:

# zfs send tank/test@1 | zfs receive tank/testback

Let’s take a look at the file systems to see what’s happened:

# zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
tank                 766G   598G  28.0K  /tank
tank/test           94.6K   598G  26.6K  /tank/test
tank/test@1         23.3K      -  26.6K  -
tank/test@2         21.3K      -  26.0K  -
tank/test@3         21.3K      -  26.0K  -
tank/test@4             0      -  26.6K  -
tank/testback       25.3K   598G  25.3K  /tank/testback
tank/testback@1         0      -  25.3K  -

Well, the command not only created the file system ‘tank/testback’ to contain the files from the backup, it also created a snapshot called ‘tank/testback@1’. That snapshot records the state of the backup at this point in time.

As we send more incremental backups, new snapshots will be created on the target, enabling us to restore the file system as it stood at any snapshot. This is really powerful!
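
As an aside, you can also browse the contents of any snapshot in place, without restoring anything, through the hidden ‘.zfs’ directory at the root of the file system. It is typically hidden from directory listings, but you can enter it directly, and the snapshot contents are read-only. For our example, something like:

# ls /tank/testback/.zfs/snapshot/1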

Let’s just take a look at the files in the full backup — it should contain the original files referenced from our initial snapshot ‘tank/test@1’.

# ls -l /tank/testback
total 4
-rw-r--r--   1 root     root          15 May 12 14:50 a
-rw-r--r--   1 root     root          15 May 12 14:50 b

# cat /tank/testback/a /tank/testback/b
hello world: a
hello world: b

As we expected. Good 🙂

Incremental backups

Now let’s do an incremental backup, which will transmit only the differences between snapshots ‘tank/test@1’ and ‘tank/test@2’:

# zfs send -i tank/test@1 tank/test@2 | zfs receive tank/testback
cannot receive incremental stream: destination tank/testback has been modified
since most recent snapshot

Oh dear! For some reason, simply running ‘ls’ on the directory when we inspected the backup contents has modified the file system.

I have no idea how or why this happens, but I have seen the problem mentioned elsewhere.

It appears that the solution is to set the backup target file system to be read only, like this:

# zfs set readonly=on tank/testback

Another possibility is to use the ‘-F’ switch with the ‘zfs receive’ command. I don’t know which is the recommended solution, but I will use the switch for now, as I don’t want to make the file system read-only while we still have several incremental backups to perform:

# zfs send -i tank/test@1 tank/test@2 | zfs receive -F tank/testback

Let’s take a look at the files in the backup, which should now match the state captured in snapshot ‘tank/test@2’, i.e. just file ‘b’:

# ls -l /tank/testback
total 2
-rw-r--r--   1 root     root          15 May 12 14:50 b
# cat /tank/testback/b
hello world: b

Good, as expected 🙂

Now let’s send all the remaining incremental backups:

# zfs send -i tank/test@2 tank/test@3 | zfs receive -F tank/testback
# zfs send -i tank/test@3 tank/test@4 | zfs receive -F tank/testback

# zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
tank                 766G   598G  29.3K  /tank
tank/test           94.6K   598G  26.6K  /tank/test
tank/test@1         23.3K      -  26.6K  -
tank/test@2         21.3K      -  26.0K  -
tank/test@3         21.3K      -  26.0K  -
tank/test@4             0      -  26.6K  -
tank/testback       93.2K   598G  26.6K  /tank/testback
tank/testback@1     22.0K      -  25.3K  -
tank/testback@2     21.3K      -  26.0K  -
tank/testback@3     21.3K      -  26.0K  -
tank/testback@4         0      -  26.6K  -

That’s the final state of the backup target file system after sending all the incremental backups.

As we would expect, its contents match the source file system:

# cat /tank/testback/b /tank/testback/c
hello world: b
modified
hello world: c
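
By the way, the sequence of incremental sends is easy to script. Here’s a minimal sketch, assuming a POSIX shell and the simple numeric snapshot names used in this example:

# for n in 2 3 4; do zfs send -i tank/test@$((n-1)) tank/test@$n | zfs receive -F tank/testback; done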

Restore a backup

Now let’s restore all of our four backup target snapshots into four separate file systems, so we can demonstrate how to recover any or all of the data that we snapshotted and backed up:

# zfs send tank/testback@1 | zfs recv tank/fs1
# zfs send tank/testback@2 | zfs recv tank/fs2
# zfs send tank/testback@3 | zfs recv tank/fs3
# zfs send tank/testback@4 | zfs recv tank/fs4

Let’s look at the file systems:

# zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
tank                 766G   598G  33.3K  /tank
tank/fs1            25.3K   598G  25.3K  /tank/fs1
tank/fs1@1              0      -  25.3K  -
tank/fs2            26.0K   598G  26.0K  /tank/fs2
tank/fs2@2              0      -  26.0K  -
tank/fs3            26.0K   598G  26.0K  /tank/fs3
tank/fs3@3              0      -  26.0K  -
tank/fs4            26.6K   598G  26.6K  /tank/fs4
tank/fs4@4              0      -  26.6K  -
tank/test           94.6K   598G  26.6K  /tank/test
tank/test@1         23.3K      -  26.6K  -
tank/test@2         21.3K      -  26.0K  -
tank/test@3         21.3K      -  26.0K  -
tank/test@4             0      -  26.6K  -
tank/testback       93.2K   598G  26.6K  /tank/testback
tank/testback@1     22.0K      -  25.3K  -
tank/testback@2     21.3K      -  26.0K  -
tank/testback@3     21.3K      -  26.0K  -
tank/testback@4         0      -  26.6K  -

Let’s check ‘tank/fs1’ – it should match the state of the original file system when the ‘tank/test@1’ snapshot was taken:

# ls -l /tank/fs1
total 4
-rw-r--r--   1 root     root          15 May 12 14:50 a
-rw-r--r--   1 root     root          15 May 12 14:50 b
# cat /tank/fs1/a /tank/fs1/b
hello world: a
hello world: b

Perfect, now let’s check ‘tank/fs2’ – it should match the state of the original file system when the ‘tank/test@2’ snapshot was taken:

# ls -l /tank/fs2
total 2
-rw-r--r--   1 root     root          15 May 12 14:50 b
# cat /tank/fs2/b
hello world: b

Perfect, now let’s check ‘tank/fs3’ – it should match the state of the original file system when the ‘tank/test@3’ snapshot was taken:

# ls -l /tank/fs3
total 2
-rw-r--r--   1 root     root          24 May 12 17:35 b
# cat /tank/fs3/b
hello world: b
modified

Perfect, now let’s check ‘tank/fs4’ – it should match the state of the original file system when the ‘tank/test@4’ snapshot was taken:

# ls -l /tank/fs4
total 4
-rw-r--r--   1 root     root          24 May 12 17:35 b
-rw-r--r--   1 root     root          15 May 12 18:58 c
# cat /tank/fs4/b /tank/fs4/c
hello world: b
modified
hello world: c

Great!
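
As one last sanity check, we can compare the newest restore against the live source. Nothing has changed in ‘tank/test’ since snapshot ‘tank/test@4’ was taken, so a recursive diff should report no differences:

# diff -r /tank/fs4 /tank/test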

Conclusion

Hopefully, you’ve now seen the power of snapshots. In future posts, I will show what else can be done with snapshots.

For more ZFS Home Fileserver articles see here: A Home Fileserver using ZFS. Alternatively, see related articles in the following categories: ZFS, Storage, Fileservers, NAS.

Comments

  1. The reason why an ls changes the file system is probably because access times or other file / directory attributes are being modified… That would also explain the “growth” in a snapshot in the previous article even though you hadn’t changed the files or added any – by looking at the directories you are affecting them.

  2. Hi Marc, thanks a lot for solving that mystery. I was wondering what caused it 🙂

  3. Very nice article!
    I was wondering if there is a way to directly mount a file system containing only the files changed between two snapshots!

  4. Yeah, it is the access time that causes the warning; I hit the same thing myself. This can be turned off with:

    # zfs set atime=off tank/testback

  5. Thanks a lot, Mark.

    Also, it looks like the next versions of the ‘Time Slider’ software in OpenSolaris 2008.11 will soon come with the ability to send incremental backups between two snapshots across the network to a second box, like a backup server. See:

    http://blogs.sun.com/erwann/entry/zfs_on_the_desktop_zfs : look under the ‘What’s next?’ heading where it says ‘- network based backups’

    I already have this functionality working on an ad hoc basis, but it will be nice to have this set up once and then automated when the new functionality becomes available.

  6. There have been some comments on the zfs-discuss mailing list and OpenSolaris forum that appear relevant to people intending to use ZFS send/recv for backups (especially long term backups between different builds).

    On http://opensolaris.org/os/community/on/flag-days/pages/2008042301/
    Sun’s Matthew Ahrens says:
    ‘We have always disclaimed backwards compatibility of the “zfs send” stream format in the zfs(1m) manpage:

    The format of the stream is evolving. No backwards compatibility is guaranteed. You may not be able to receive your streams on future versions of ZFS.

    However up until now we have maintained compatibility regardless.

    All Solaris 10 releases use the new, post-build-35 stream format, so no incompatibility will be introduced to Solaris 10.

    We plan to maintain backwards compatibility of the “zfs send” stream format throughout all releases of Solaris, and intend to commit to this backwards compatibility at some point in the future.’

    On http://opensolaris.org/jive/thread.jspa?messageID=389857&tstart=0
    or http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg26438.html
    Daniel Carosone says:
    “The zfs send/recv format is not warranted to be compatible between revisions.

    I’m concerned that, despite clear recommendations and advice against it, there seem to be a number of solutions appearing (like automated backup to cloud, via the auto-snapshot hooks) that use the stream format for long term storage, even from those within Sun. I think the message needs to be clear, either way – either endorse stream-format compatibility, or discourage such usage.”

  7. Hi again Simon,

    It’s funny, I’m looking to do something similar to this, but on FreeBSD. However, since I don’t trust zfs send/receive, I’ll be doing it with rsync as I want verification that the bits that end up on the target are the same as those that were read from the source. I’m pretty sure that rsync does that. I think this means that in order to get the snapshot goodness on the backup (assuming a different pool), you have to clone the snapshot, rsync it across, and take a snapshot on the target.

    I suppose you could do it with zfs send provided that you do some sort of MD5 check yourself afterwards.

  8. Simon, I can’t thank you enough for your blog post. I spent a month and then some trying to identify what was happening. Is there a way around this? I mean leaving the ZFS file system read-only and/or not mounted at startup on the system where it receives the snaps.

  9. I’ve been in this situation quite a few times now. The issue is that my external HDDs go bad within 2-3 years without warning, and my data becomes toast. The fs I have been using till now is NTFS, due to maximum compatibility, but I don’t trust it any more.

    And I am fed up now. The hardware tech is continuously improving and expanding in terms of storage size, but these fs problems are a severe headache.

    I just wanted to ask for a suggestion from you. Is it much better and less error-prone to use ZFS for all my partitions instead of NTFS? As NTFS is closed source, the extent to which any recovery tool can help will always be far less than with an open fs like ZFS?

    I would have to deal with incompatibility on Windows, I know that, but integrity of data is much more important to me. And of course, if this gives me fewer headaches, it will be worth it.

    What do you suggest?

  10. Simon,
    9 years or so after your post, I find it very useful, like the rest of the series.
    After over a year of using ZFS on Ubuntu (currently 16.04), I am finally looking into backups using snapshots, and I am really wondering why I used rsync all this time.

    Thank you very much for posting this series.
