Multiple Personality ZFS Part 0: A Questionable Idea

Introduction

I have a computer that I use for a hobby of mine. And it runs a decent operating system for what I want to accomplish with it. But it’s not my preferred operating system. I wonder if I can share my home directory between two operating systems that never run concurrently. That sounds a lot better than copying files back and forth in an effort to keep things in sync.

The hobby is kind of niche. And the people who use UNIX-like operating systems for that hobby are a sub-niche. And the people who use BSD-based operating systems for that hobby are a sub-sub-niche. So I’m not expecting any sort of revolution to spawn out of this. In fact, this is arguably a useless exercise. But why let that stop us from exploring? Maybe we’ll learn something useful along the way.

I have spread the story of this journey across a series of posts:

Part 0: A Questionable Idea ⇐ you are here
Part 1: Switching Personalities
Part 2: The Export/Import Business
Part 3: No Special Snowflakes

Background

I’ve built up a fair amount of data that I want to preserve on this computer. And I want to access it from either of the operating systems on it, without copying it back and forth and letting the two halves get out of sync. So the files they share in common would have to use the same on-disk formats. I’ll just choose a file system that both operating systems can read and write, right? FAT32 and Bob’s your uncle.

There’s a problem with that idea. Consider that the tiny matter of syncing a couple of terabytes of data to an external SSD formatted with that file system took days. When I reformatted it to something more reasonable (i.e. a UNIX-like file system) it only took hours. And that’s to say absolutely nothing about data integrity or any of that other stuff. (I know I don’t like it when my data randomly changes on me because a mosquito on the other side of the planet sneezed. Do you?)

Last I checked, the various BSDs and Linux didn’t really support each other’s native file systems very well. I hear there are facilities like FUSE which can bridge the gap, but I’m really looking for something that feels native in both worlds: implemented in the kernel or in loadable kernel modules from a reputable source.

ZFS is an awesome way to manage computer storage. It has too many features for me to describe here, but the killer feature for me is how little space it wastes. You don’t have to guess how much space a particular file system will take in several years’ time. You won’t run out of space until your entire pool of storage runs out of space.

ZFS has some terminology which I will be using here:

storage pool: A collection of one or more devices (entire disks or partitions) providing storage.
dataset: A file system, snapshot, or volume that consumes storage from one storage pool.
property: Metadata about a storage pool or dataset, such as where it appears within the directory tree structure presented to users.
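
To make those three terms a little more concrete, here is a small sketch of how each one can be inspected from a shell. The helper name show_zfs_terms and the names tank and tank/home are placeholders of mine, not anything from a real system; the zpool and zfs subcommands themselves are the standard OpenZFS ones.

```shell
#!/bin/sh
# Sketch: look at a storage pool, its datasets, and one property.
# show_zfs_terms is a hypothetical helper name; pass your own pool and dataset.
show_zfs_terms() {
    pool=$1
    dataset=$2
    # Storage pool: size, allocation, and health at a glance.
    zpool list "$pool" || return 1
    # Datasets: every file system, snapshot, and volume drawing from the pool.
    zfs list -r -t all "$pool" || return 1
    # Property: where this dataset wants to be mounted.
    zfs get mountpoint "$dataset"
}

# Example (needs an imported pool; "tank" is a placeholder name):
#   show_zfs_terms tank tank/home
```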

FreeBSD has been happily using ZFS for quite a while now, and they have adopted the OpenZFS project’s implementation of ZFS. OpenZFS also works well on Ubuntu and other distributions of Linux.

Well, isn’t that interesting! My hobby PC is primarily running Ubuntu, but I’d rather be running FreeBSD for as much of it as I can. (There may be some applications that are just too gnarly to work equally well on both; I’m willing to put up with that.) Maybe I don’t have to copy files back and forth after all. And I do have the ability to create virtual machines pretty easily. Perhaps an experiment is in order!

The standard test bed

Since I don’t want to trash my hobby PC, I’ll create a virtual machine and trash that instead. In terms of its virtual hardware, I’ll give it:

8 GB of memory
2 virtual CPUs
an ethernet network interface
a 40 GB SCSI hard drive for operating systems
a 20 GB SCSI hard drive for the shared data
no sound card
no camera
UEFI firmware

This experiment should work with any hypervisor that supports booting guests from UEFI. I’m assuming I can do this because the physical machine I’m trying to imitate boots from UEFI. The physical machine does not use SCSI hardware, but the hypervisor available to me offered it as a suggestion for the machine I was trying to build, so I took it. The device names may change a bit for SATA or NVMe drives, but the concepts should still apply.

What would FreeBSD do?

I’ll start by installing FreeBSD 13.1-RELEASE in the standard way from the usual downloadable ISO image, telling it to use the entirety of the first disk as a ZFS storage pool. Then I’ll see what the resulting GPT looks like.¹

FreeBSD calls the first SCSI disk da0, and the various GPT partitions within it da0p1, da0p2, etc. (For SATA the device names would be ada0 for the disk itself and ada0p1 for the first GPT partition on it.)

# gpart show da0
=>       40  83886000  da0  GPT (40G)
         40    532480    1  efi  (260M)
     532520      1024    2  freebsd-boot  (512K)
     533544       984       - free -  (492K)
     534528   4194304    3  freebsd-swap  (2.0G)
    4728832  79155200    4  freebsd-zfs  (38G)
   83884032      2008       - free -  (1.0M)

# cat /etc/fstab
# Device Mountpoint FStype Options Dump Pass#
/dev/da0p1 /boot/efi msdosfs rw 2 2
/dev/da0p3 none swap sw 0 0

ZFS does not need the traditional /etc/fstab method to get everything mounted. Every dataset in a storage pool can declare its preferred mount point through a property named mountpoint, and ZFS reads these properties to get things mounted. A storage pool has its usual mount point set when it is initially created, but one can use the pool property altroot to temporarily prepend an alternate root directory to the mount points within the pool. This can be very useful when attempting storage shenanigans (like this experiment), or when you want to tell your operating system’s installer that yes, you really want all of your file systems to be ZFS datasets!
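
As a rough illustration of how altroot and mountpoint combine, here is a sketch in plain shell. It does not talk to ZFS at all: effective_mountpoint is a hypothetical helper of mine that mimics the documented behaviour of altroot being prepended to mount points within the pool, and the paths /mnt and /home are only examples.

```shell
#!/bin/sh
# Sketch: where a dataset lands once altroot is taken into account.
# effective_mountpoint is a hypothetical helper, not a ZFS command.
effective_mountpoint() {
    altroot=$1      # e.g. /mnt, or empty (or "-") when altroot is not set
    mountpoint=$2   # value of the dataset's mountpoint property
    # Datasets set to "none" or "legacy" are not mounted by ZFS itself.
    case $mountpoint in
        none|legacy) printf '%s\n' "$mountpoint"; return 0 ;;
    esac
    case $altroot in
        ''|-|/) printf '%s\n' "$mountpoint" ;;
        *)      printf '%s%s\n' "${altroot%/}" "$mountpoint" ;;
    esac
}

effective_mountpoint ""   /home   # prints /home
effective_mountpoint /mnt /home   # prints /mnt/home
```

On a live system, altroot is usually set while importing, for example with zpool import -R /mnt followed by the pool name.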

What would Ubuntu do?

Let’s repeat the exercise with Ubuntu 22.04 LTS “Jammy Jellyfish,” using their server install media. Linux calls the first SCSI disk sda, with the partitions within it being sda1, sda2, etc. (The device names appear to be the same for SATA disks.)

# fdisk -l /dev/sda
# ...
Disklabel type: gpt
# ...
Device       Start      End  Sectors  Size Type
/dev/sda1     2048  2203647  2201600    1G EFI System
/dev/sda2  2203648  6397951  4194304    2G Linux filesystem
/dev/sda3  6397952 83884031 77486080 36.9G Linux filesystem
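
The Sectors and Size columns can be cross-checked with a little arithmetic. This is only a sketch: the sector size line was elided from the listing above, so the 512-byte sector size is my assumption, and sectors_to_mib is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: convert a sector count into MiB.
# ASSUMPTION: 512-byte logical sectors (not shown in the listing above).
SECTOR_BYTES=512

sectors_to_mib() {
    echo $(( $1 * $SECTOR_BYTES / 1048576 ))
}

sectors_to_mib 2201600    # sda1: prints 1075  (about 1G)
sectors_to_mib 4194304    # sda2: prints 2048  (2G)
sectors_to_mib 77486080   # sda3: prints 37835 (about 36.9G)
```

The same helper works on the gpart output from FreeBSD: 532480 sectors comes out as 260 MiB, matching the size shown for the efi partition there.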

Adding ZFS

I assumed that creating a new storage pool under Ubuntu would be more likely to produce compatible results. So away we go! Running with superuser permissions, either via sudo or from a proper root shell, I’ll dedicate that entire second SCSI disk (sdb in Linux-speak, da1 in FreeBSD-speak) to it.

apt install zfsutils-linux
zpool create zdata /dev/sdb
zfs create -o mountpoint=/zhome zdata/home

Now how does the partition table on sdb look?

# fdisk -l /dev/sdb
# ...
Disklabel type: gpt
# ...
Device        Start      End  Sectors Size Type
/dev/sdb1      2048 41924607 41922560  20G Solaris /usr & Apple ZFS
/dev/sdb9  41924608 41940991    16384   8M Solaris reserved 1

It built a GPT for us. How considerate!

One of the neat things about storage pools is that you can mount them on any system that understands them and you should be able to pick up where you left off. This is called importing a storage pool, which implies that the pool must first be exported, even if you don’t move physical disks around. Such a concept does exist: it is the act of logically detaching the storage pool from the system and marking it as not currently in use by that system.

Before I shut down, I’ll export zdata so that I can later try importing it from FreeBSD.

zpool export zdata

Note that if I had any mounted file systems (datasets) from zdata, zpool export would unmount them immediately before export. I’ll remember that as something I’d like to perform automatically upon every shutdown.
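
Since this is something I’d like to happen automatically, here is a minimal sketch of a helper that exports a pool only when it is actually imported. The function name export_pool_if_imported is hypothetical, and where to hook it into each operating system’s shutdown sequence is deliberately left out.

```shell
#!/bin/sh
# Sketch: export a pool only if it is currently imported.
# export_pool_if_imported is a hypothetical helper name.
export_pool_if_imported() {
    pool=$1
    # -H drops the header and -o name prints only pool names, one per line.
    if zpool list -H -o name | grep -qxF "$pool"; then
        # zpool export unmounts the pool's datasets before detaching it.
        zpool export "$pool"
    else
        echo "$pool is not imported; nothing to export" >&2
    fi
}

# Example (run as root): export_pool_if_imported zdata
```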

I booted from the FreeBSD install media and intentionally chose the most difficult partitioning option so as not to disturb the Ubuntu install. It was a lot of typing, based on research I had done a while ago into automated customized FreeBSD installs.

I had arrived at this GPT:

# gpart show da0
=>      34  83886013  da0  GPT  (40G)
        34      1024    4 freebsd-boot  (512K)
      1058       990      - free -  (495K)
      2048   2201600    1 efi  (1.0G)
   2203648   4194304    2 linux-data  (2.0G)
   6397952  37748736    3 linux-data  (18G)
  44146688   4194304    5 freebsd-swap  (2.0G)
  48340992  35543040    6 freebsd-zfs  (17G)
  83884032      2015      - free -  (1.0M)
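
After that much hand typing, a quick sanity check seems worthwhile. The sketch below verifies that every row starts exactly where the previous one ended, so there are no accidental gaps or overlaps. check_contiguous is a hypothetical helper name; the numbers are the start and size columns copied from the listing above.

```shell
#!/bin/sh
# Sketch: confirm that each row of a "start size" listing begins exactly where
# the previous row ended. check_contiguous is a hypothetical helper name.
check_contiguous() {
    awk '
        NR > 1 && $1 != expected {
            printf "gap or overlap before sector %d\n", $1
            bad = 1
        }
        { expected = $1 + $2 }
        END { exit bad }
    '
}

# Start and size columns from the gpart listing, free-space rows included.
check_contiguous <<'EOF' && echo "layout is contiguous"
34 1024
1058 990
2048 2201600
2203648 4194304
6397952 37748736
44146688 4194304
48340992 35543040
83884032 2015
EOF
```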

I had assumed at this point that I was doing quite well. It wasn’t a terrible assumption, but it wasn’t that great either. Why? Because I hadn’t yet wrestled with the elephant in the room: easily booting one computer into either operating system without relying upon install media.

I’ll start that wrestling match in the next post in the series.

¹ “GPT partition table” is a redundant phrase.

tnalpgge.github.io