Home Fileserver: I’ll use ZFS

After reading an article that turned out to be pure gold, namely: A Conversation with Jeff Bonwick and Bill Moore – The future of file systems, I felt a warm glow inside and realised that ZFS seemed to be the best solution currently available for a whole range of storage problems, and I wanted to use it.

These two guys work for Sun and have been responsible for designing and implementing the ZFS file system, which is incorporated within Sun Solaris. Fortunately, Sun have made it open source, so it should eventually become available in complete form on other operating systems. Currently (March 2nd 2008), Solaris has the only complete implementation.

Apple OS X 10.5, AKA Leopard, ships with a read-only implementation of ZFS. They also have an incomplete read/write version available from developer.apple.com for free. I tried the Apple developer version of ZFS, but after a couple of kernel panics, I didn’t need much convincing that it was not ready to be trusted with my data quite yet. However, I chucked a couple of spare SATA disks into the Mac Pro case and experimented a little to whet my appetite.

Also, at the time of writing, FreeBSD 7.0 has an incomplete port available whose status you can see here. But why should I choose an incomplete implementation which has been ported from the real deal (Solaris’ ZFS)? How many bugs were created in the porting process? Apparently a lot of things had to be changed in FreeBSD due to the ‘impedance mismatch’ between the two kernels. And there will always be some lag between Solaris ZFS fixes/updates and their finally landing in FreeBSD.

Linux has problems with ZFS because ZFS is released under the CDDL licence, which is incompatible with the GPL, so it cannot be included in the kernel. However, there is a FUSE version of ZFS available for Linux, albeit with degraded performance.

Then my choice of ZFS implementation was a no-brainer: Sun Solaris, the reference ZFS implementation. Bring it on! 😉

To whet your appetite and help convince yourself that ZFS is cool and should be used, take a look at the resources available here. Check out the videos under the ‘demos’ section, and especially the reasoning behind the design of ZFS here: ZFS: The Last Word in File Systems.

When you’ve had a chance to digest all that, you’ll no doubt realise the beauty of ZFS and what makes it so great. To name but a few great features:

  1. Simple administration
  2. Ability to create large, redundant data storage pools with one command
  3. Built-in data scrubbing to enable ZFS to self-heal ‘latent failures’ (bit rot etc)
  4. Built-in 256-bit checksumming used for every block
  5. For redundant data pools you choose from mirror, single-parity RAIDZ1 (a la RAID level 5) or double-parity RAIDZ2 (a la RAID level 6)
  6. Transactional file system that guarantees a consistent state of the data, even when catastrophic failures like power loss occur
  7. High availability: data scrubbing can occur without taking the storage offline — unlike ‘fsck’ in Linux
  8. Designed on the assumption that disk hardware should never be trusted, so strong checksumming and transactional updates are used throughout
  9. Designed to use cheap, commodity SATA disks, not expensive SAS disks
  10. RAIDZ1 can survive 1 drive failure, RAIDZ2 can survive 2 drive failures
  11. Hot spares can be specified when the data pool is created, or added to the data pool later
  12. Hot spares are used automatically if drive failure is detected
  13. File systems and snapshots can be sent and received (zfs send/receive), to allow easy replication/migration of data when upgrading disks
  14. Failed disks can be replaced and substituted with one command (if no hot spares are available)
  15. Regular snapshots can be made to allow easy file system state rollbacks, or retention of deleted/changed files – they are cheap in storage and fast to create, thanks to ZFS’s copy-on-write design
  16. ZFS file systems can be shared via NFS and Samba/CIFS, and ZFS volumes can be exported as iSCSI targets
  17. For super valuable data, you can create a ZFS file system within a data pool that stores multiple copies of each block, spread widely across the disk(s) (known as ditto blocks): 2 or 3 copies instead of just one
  18. Sun Solaris OS and ZFS are free and open source

That’s all I could think of for now 🙂 But I think you’ll agree, that’s a pretty impressive feature set.
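To give a rough flavour of a few of the features above (pool creation, scrubbing, snapshots, sharing, ditto blocks, disk replacement and send/receive), here is a quick sketch of what the commands look like. Note that the pool name ‘tank’, the file system names and the Solaris disk device names below are just hypothetical examples:

    # Create a redundant single-parity RAIDZ1 pool from four disks, plus a hot spare
    zpool create tank raidz1 c1t0d0 c1t1d0 c1t2d0 c1t3d0 spare c1t4d0

    # Start a scrub to detect and repair latent errors, then check on its progress
    zpool scrub tank
    zpool status tank

    # Take a cheap snapshot of a file system, and roll back to it later if needed
    zfs snapshot tank/photos@2008-03-02
    zfs rollback tank/photos@2008-03-02

    # Share a file system over NFS, and keep two ditto copies of newly written blocks
    zfs set sharenfs=on tank/photos
    zfs set copies=2 tank/photos

    # Replace a failed disk by hand if no hot spare is available
    zpool replace tank c1t2d0 c1t5d0

    # Replicate a snapshot to another pool, e.g. when migrating to bigger disks
    zfs send tank/photos@2008-03-02 | zfs receive bigtank/photos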

And then the best thing is that all this is free and it’s open source! Sun Solaris is free, and ZFS is part of Solaris, so it’s free too. Of course, companies wanting support can buy that in the usual way from Sun, but for enthusiasts who don’t need 24/7 engineer support, it’s free for the taking. Just download it and install. Enjoy 🙂

NetApp, a company that makes a living selling expensive storage boxes, is understandably quite worried about ZFS destroying that lucrative business model, so they decided to sue Sun when Sun refused to comply with their demands to restrict sales of ZFS solutions. Take a look at the ping pong of public insults here:

I don’t know who is ‘right’ here, but all I do know is that NetApp’s Dave Hitz chose to file the case in a Texas court well-known to be sympathetic to patent trolls, despite both companies having their headquarters in Silicon Valley, California.

You can read more about that here: Sun sues NetApp, California style. When I last checked, it looked like Sun were ‘winning’ this row, but I haven’t looked recently. You can draw your own conclusions 😉

For more ZFS Home Fileserver articles see here: A Home Fileserver using ZFS. Alternatively, see related articles in the following categories: ZFS, Storage, Fileservers, NAS.

15 Comments

  1. Hi Tony. I tried the ZFS r/w version that was available from developer.apple.com in December — it was a free account I used to get it. I slapped a couple of old SATA drives in the Mac Pro case and did a ‘zpool create tank mirror disk1 disk2’ command to set up a mirror to play around with. It gave a couple of kernel panics while writing to the pool, so I decided to get some ‘real’ ZFS in Solaris 🙂 I’ve been running ZFS on Solaris for 2 months now — rock solid and dependable. As CIFS and iSCSI support are still a bit new, I think there are a couple of minor inconveniences here and there, but on the whole, it works very well.

  2. FreeBSD 7.0 RELEASE now includes ZFS in the base system.

    There don’t seem to be a whole lot of missing features other than ACLs (ZFS uses WinNT/NFSv4-style ACLs instead of POSIX ones … not sure why) and out-of-the-box iSCSI target sharing. You can still use FreeBSD’s own iSCSI target to share a ZFS pool, and/or Samba or NFS.

    While the current Solaris ZFS version is 9-something, FreeBSD’s “production” version is 6 … but FreeBSD’s CURRENT branch is pretty much up to date. Anyway, I’ve been storing movies, pics and music on FreeBSD ZFS for over a year without incident. The box has over 2TB of storage and I don’t know when I’ll ever back it up 😀

  3. I’ve downloaded the latest ZFS stuff from the macosforge project. The versions available from developer.apple.com are very out of date and (as noted) buggy as hell. However, the macosforge ones are much more recent and I’ve found them to be OK since the latest (102A) revision came out.

    I gather there are some issues relating to Spotlight on the ZFS drive, and you don’t get some free stuff like zfs share. Also, the Mac systems don’t use NFSv4, so you lose the extended attribute stuff if you’re sharing via NFS (though AFP should be OK).

  4. @luvbsd:

    Yes, I’m sure FreeBSD 7.0’s ZFS implementation is very capable, but when I looked at the list of ‘incomplete/in progress’ items a month or so ago, I decided to take the ZFS implementation which I thought would be likely to give me the least trouble and, rightly or wrongly, finally decided on Solaris. I just wanted, as much as possible, to avoid the ‘bleeding edge’ for the moment 🙂

    For backing up 2TB of data on your fileserver, from my experience, you could build a separate backup box for around 600 euros: 300 euros for the system unit and 300 euros for the disks (no redundancy). E.g. Antec NSK6580 case + 430W PSU, Asus M2N-E motherboard, 2GB Kingston non-ECC DDR2 800 RAM, AMD BE-2350 or cheaper processor, 3 disks: WD7500AAKS, giving ~2TB non-redundant backup storage. Or, of course, if you have space, you could just chuck some more drives in your existing case and create a new backup pool 🙂

  5. @Alex Blewitt:

    I’m glad to hear that the macosforge ZFS implementation is working better than the Apple version I used from developer.apple.com a few months ago!

  6. As an FYI, one of the lead programmers on ZFS from Sun is now working for Apple. I forget her name, but I expect extremely cool things on OS X/ZFS now. And with OpenSolaris now capable of booting from ZFS, hopefully when 10.6 gets released, OS X will boot off ZFS as well.

  7. @goodb0fh: OK, I didn’t know that. Yes, I’m hoping OS X 10.6 will use ZFS as standard too. Then they can reduce the Time Machine code to about 20 lines from a lot more, and hopefully make it open so that the proprietary Time Capsule is not the only option in town for doing off-box backups 🙂

  8. Interesting… I shall read the other blog posts in this series.

    But, maybe someone can answer me this. I have an old, unused Pentium 4 box with around (I think) 512MB of RAM (maybe even 1GB – it’s been a while since I even powered it up) that I haven’t had the heart to toss out. Last time I played with it, it seemed to run Ubuntu OK. Is this likely to be enough for a ZFS server, or should I just go out and buy a cheap(ish) tower with a reliable motherboard and space for 4+ SATA drives?

  9. Cheers Bogey!

    All I can say is this: ZFS will work on a 32-bit processor like the Pentium 4, with limited RAM. But for around $300 or so, excluding disks, you can have a superb machine — see the bottom of the ZFS hardware page. Basically, ZFS loves 64-bit processors and lots of RAM, which costs around $20 per gigabyte these days. So in the end, the solution you choose is down to how much you want to spend, and how much you value your data and access to it. Only you can make that decision. Either way, I’d love to hear what you decide and how it turns out. Good luck!

  10. Hi uman,

    From memory, with Mac OS X 10.5.1 it was possible to use non-Apple devices (i.e. something other than a Time Capsule) with Time Machine, but once Apple released Time Capsule and 10.5.2, this was no longer possible. Since I discovered that unfortunate fact, I have not tried again to get Time Machine writing to my ZFS storage pool across the network.

    Until this changes, one possibility, of course, is to just use the Mac for running applications, and write all your user data to the storage available on your ZFS NAS, which is mounted as a share on your Mac.
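
    For example (a rough sketch only; the file system name ‘tank/users’ and the server name ‘fileserver’ are just made-up examples), you could share a file system from the Solaris box and mount it on the Mac like this:

        # on the Solaris fileserver: share the user data file system over NFS
        zfs set sharenfs=on tank/users

        # on the Mac: mount the exported file system
        sudo mkdir -p /Volumes/users
        sudo mount -t nfs fileserver:/tank/users /Volumes/users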

  11. Hi Simon,
    can ZFS handle different sizes of hard disk (SATA), e.g. 2 x 1TB WD and 2 x 750GB Seagate?

    1. Hi Alfie. Yes, you could create a pool composed of two mirror vdevs, one of 1TB (the two WD drives) and the other of 750GB (the two Seagates), giving a total of 1.75TB of usable storage space.

      Alternatively, you could create a pool composed of one RAID-Z1 vdev across all four drives, but in that case each drive would only contribute 750GB of raw space (the size of the smallest drive), and as this is RAID-Z1, the capacity of one drive (750GB) would be used for parity, leaving 3 x 750GB, i.e. 2.25TB, of usable storage space.
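
      Roughly, the two layouts would be created like this (the disk device names are just placeholders, and the -f is needed on the RAID-Z1 command because the drives differ in size):

          # two mirrored pairs: (1TB + 1TB) + (750GB + 750GB) => ~1.75TB usable
          zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0

          # or one RAID-Z1 vdev across all four drives => 3 x 750GB = ~2.25TB usable
          zpool create -f tank raidz1 c1t0d0 c1t1d0 c1t2d0 c1t3d0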

      Cheers,
      Simon

  12. Hi Simon,

    Kind of late to contribute here, but I have been searching for this information for some time now without a definitive answer. I am mostly interested in using ZFS against bit rot. My problem (of understanding, I guess 😉 is: which of the pool layouts would be best against bit rot?

    As I understood it, just a simple RAIDZ1 would be enough to correct simple bit flips, since it can reconstruct the correct bit value from the checksum(s). Can you confirm this?

    Then again, the examples on the web don’t use RAIDZ1 to illustrate the correction of bit flips; they all use mirroring. Here is my misconception, I guess; maybe you can explain it. Of course, a mirror can always reconstruct a faulty bit, given that the mirror’s copy is not affected in the first place. Hence, the difference between a RAIDZ1 and a mirrored pool in terms of bit flips is just that the former is much more space-conserving while the latter would be faster. In terms of data integrity concerning bit flips, both would be equal. Is that correct?

    Then, is there any reason one would use RAIDZ1 together with a mirror against bit rot?

    Marc
