If you are ever unlucky enough to find that your OpenSolaris NAS won’t boot, these notes, taken from a real restoration test, might help you get back up and running quickly.
After setting up a supposedly robust mirrored ZFS root boot pool here using two SSDs, I decided to give it a system test by completely destroying the boot pool. I did this to understand and verify the process of restoring a boot environment from scratch. I have documented the process here in case I need it one day, and you’re welcome to use it too. 🙂
Before deliberately zapping my OS boot environment, I backed it up to a remote machine by following the steps detailed here under the ‘Archiving streams of the file systems in the existing root pool’ section.
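For reference, the backup side can be sketched roughly as follows. This is a minimal sketch and not the exact procedure from the linked guide: it assumes the remote share is already mounted at /mnt, and the function name backup_commands is made up for illustration. It only prints the commands, so you can review them before running anything as root.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the commands for archiving a
# recursive snapshot of a root pool to a file on an already-mounted share.
#   $1 = pool name, $2 = snapshot label, $3 = directory to write the stream to
backup_commands() {
    pool=$1
    label=$2
    dest=$3
    # Snapshot the pool and every descendant file system in one go.
    printf 'zfs snapshot -r %s@%s\n' "$pool" "$label"
    # Send the whole hierarchy (snapshots and properties included) to a file.
    printf 'zfs send -R %s@%s > %s/%s.recursive.%s\n' \
        "$pool" "$label" "$dest" "$pool" "$label"
}

# Prints the two commands for a stream named like the one restored later on.
backup_commands rpool 20090827 /mnt
```

The file name produced here matches the rpool.recursive.20090827 stream that is fed to zfs receive during the restore.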
Once the boot environment was backed up, it was time to create the disaster. I don’t take any responsibility for any loss you may incur by doing this, but all I can say is that it worked for me.
Boot the OpenSolaris 2009.06 installation CD, select keyboard and language, get to the desktop, and then su to root using ‘opensolaris’ as the password. Then destroy the root boot pool and remove all partitions on the drives it contained:
# zpool import -f rpool
# zpool destroy rpool
# format
In the format command, I selected the disk I wanted to zap, selected ‘fdisk’, deleted the single partition, and exited after saving the partition changes.
As I had a redundant mirror-based boot pool, I repeated this for both drives, just to make sure the OS had gone. I could have formatted them too if I was being really professional. 🙂
Boot the OpenSolaris 2009.06 Live CD.
When you reach the desktop, double-click on the Install icon found on the desktop to kick off the installation process.
In the installation program, select the disk to install onto; in my case this was the first of the two disks I would use in the mirror. Select the whole disk as the installation target. Reboot when finished, and leave the OpenSolaris Live CD in the drive.
After reboot, su to root and restore the backed-up root boot pool filesystem stream to the local root boot pool:
# zpool import -f rpool
# mount -F nfs 192.168.0.45:/backup/snaps /mnt
# cat /mnt/rpool.recursive.20090827 | zfs receive -Fdu rpool
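Because zfs receive -F overwrites what is already in the target pool, it can be worth checking that the stream file is really there before piping it in. The following is a minimal sketch of such a pre-flight check; the function name stream_ok is my own invention, and the path is simply the one used above.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only if the stream file exists,
# is readable and is not empty.
stream_ok() {
    [ -r "$1" ] && [ -s "$1" ]
}

stream=/mnt/rpool.recursive.20090827
if stream_ok "$stream"; then
    echo "stream looks usable: $stream"
else
    echo "stream missing or empty: $stream" >&2
fi
```

This only confirms that a non-empty file is present. It says nothing about whether the stream itself is intact.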
Create a single 100% Solaris partition on drive 2, which will be used for the other half of the mirror: run the format command, select the second drive to be used in the boot mirror, select fdisk, create a default 100% Solaris system partition, and quit out of the format command back to the command line.
Now we need to give the second drive the same slice layout (VTOC) as the first drive. Note the ids of the drives you have and substitute them for the ones shown in the following line:
# /usr/sbin/prtvtoc /dev/rdsk/c11t6d0s2 | /usr/sbin/fmthard -s - /dev/rdsk/c11t7d0s2
fmthard: New volume table of contents now in place.
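Since it is easy to mix up the drive ids at this point, here is a small sketch that builds the command from two disk names and refuses to proceed if they are the same. The function name vtoc_copy_command is hypothetical, and it only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the VTOC copy command for a pair of disks.
#   $1 = source disk (already installed), $2 = target disk (new mirror half)
# Slice 2 (s2) is used on both sides, as in the command shown above.
vtoc_copy_command() {
    if [ "$1" = "$2" ]; then
        echo "source and target disk must differ" >&2
        return 1
    fi
    printf '/usr/sbin/prtvtoc /dev/rdsk/%ss2 | /usr/sbin/fmthard -s - /dev/rdsk/%ss2\n' \
        "$1" "$2"
}

# Prints the command for the two drive ids used in this post.
vtoc_copy_command c11t6d0 c11t7d0
```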
Now, very importantly, set the boot file system to use the ‘be2’ boot environment, or whichever one you used:
# zpool set bootfs=rpool/ROOT/be2 rpool
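If you can’t remember which boot environments the restored pool contains, you can filter the dataset names down to the direct children of rpool/ROOT. This is a minimal sketch: the function name list_boot_envs is made up, and the sample input is just the dataset names from this post, fed in by hand so the example runs without a pool.

```shell
#!/bin/sh
# Hypothetical filter: read dataset names on stdin (one per line) and print
# only the boot environments, i.e. direct children of <pool>/ROOT.
#   $1 = pool name
list_boot_envs() {
    grep "^$1/ROOT/[^/][^/]*\$"
}

# Sample input: the dataset names that appear in this post.
printf '%s\n' rpool rpool/ROOT rpool/ROOT/be2 rpool/ROOT/be3 \
    rpool/ROOT/opensolaris rpool/dump rpool/export rpool/export/home \
    rpool/swap | list_boot_envs rpool
```

On a real system the input would come from something like zfs list -H -o name -r rpool/ROOT piped into the filter.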
Now let’s attach the second drive to the first drive to form the boot pool mirror:
# zpool attach -f rpool c11t6d0s0 c11t7d0s0
Please be sure to invoke installgrub(1M) to make 'c11t7d0s0' bootable.
Now install the GRUB bootloader to the second SSD so that the BIOS can boot the second drive in the event that the first becomes unreadable (don’t forget to enable the two SSDs in the BIOS as bootable drives afterwards!):
# installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c11t7d0s0
Updating master boot sector destroys existing boot managers (if any).
continue (y/n)?y
stage1 written to partition 0 sector 0 (abs 16065)
stage2 written to partition 0, 271 sectors starting at 50 (abs 16115)
stage1 written to master boot sector
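If you ever need to do this for more than one slice, a loop saves some typing. The sketch below only prints one installgrub command per slice, using the same options as above; the function name installgrub_commands is hypothetical, and whether you need to run it for the first disk as well is something I did not test.

```shell
#!/bin/sh
# Hypothetical helper: print one installgrub command per slice given.
# Nothing is executed, so the output can be reviewed first.
installgrub_commands() {
    for slice in "$@"; do
        printf 'installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/%s\n' \
            "$slice"
    done
}

# Prints the command for the second half of the mirror used in this post.
installgrub_commands c11t7d0s0
```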
Now inspect the results:
# zpool list rpool
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
rpool  29.8G  15.1G  14.7G    50%  ONLINE  -
#
# zfs list -r rpool
NAME                     USED  AVAIL  REFER  MOUNTPOINT
rpool                   17.1G  12.2G  81.5K  /rpool
rpool/ROOT              12.4G  12.2G    19K  legacy
rpool/ROOT/be2          30.6M  12.2G  6.95G  /
rpool/ROOT/be3          9.55G  12.2G  7.02G  /
rpool/ROOT/opensolaris  2.82G  12.2G  2.82G  /
rpool/dump              2.00G  12.2G  2.00G  -
rpool/export             578M  12.2G    21K  /export
rpool/export/home        578M  12.2G   527M  /export/home
rpool/swap              2.10G  14.2G   101M  -
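If you would rather script the health check than eyeball the table, the HEALTH value can be tested directly. This is a minimal sketch: the function name pool_is_online is made up, and the sample value is simply the HEALTH column from the output above, echoed by hand so the example runs without a pool.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the health value read from stdin
# is exactly ONLINE.
pool_is_online() {
    read -r health
    [ "$health" = "ONLINE" ]
}

# Sample value taken from the HEALTH column shown above.
if echo ONLINE | pool_is_online; then
    echo "rpool is healthy"
fi
```

On a real system the value would come from something like zpool list -H -o health rpool piped into the check.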
Check the snapshots restored from the stream:
# zfs list -t snapshot
NAME                                  USED  AVAIL  REFER  MOUNTPOINT
rpool@20090829-migrated              23.5K      -  81.5K  -
rpool@20090830                           0      -  81.5K  -
rpool/ROOT@20090829-migrated             0      -    19K  -
rpool/ROOT@20090830                      0      -    19K  -
rpool/ROOT/be2@20090830                  0      -  6.95G  -
rpool/ROOT/be3@20090829-migrated     58.9M      -  6.14G  -
rpool/ROOT/be3@2009-08-29-23:45:50   44.9M      -  6.95G  -
rpool/ROOT/be3@20090830                  0      -  7.02G  -
rpool/ROOT/opensolaris@install       3.10M      -  2.82G  -
rpool/dump@20090829-migrated           16K      -  2.00G  -
rpool/dump@20090830                      0      -  2.00G  -
rpool/export@20090829-migrated         16K      -    21K  -
rpool/export@20090830                  16K      -    21K  -
rpool/export/home@20090829-migrated  51.1M      -   485M  -
rpool/export/home@20090830               0      -   527M  -
rpool/swap@20090829-migrated             0      -   101M  -
rpool/swap@20090830                      0      -   101M  -
Now you should be able to reboot and successfully boot your previously backed-up OS boot environment. So the key here is to ensure that you have regular backups of your boot environments so that you can restore when you need to.
This post was written from notes taken while backing up, destroying and restoring my NAS’s root boot pool file systems. If you find any errors, let me know with a comment below.