The murky depths of GPT
January 30, 2017, BarryK
I will post this while it is fresh in my mind!
For years I have been providing Quirky as a ready-made image for an 8GB (or greater) USB Flash stick or SD-card.
To cater for the fact that "8GB" sticks actually have quite different amounts of memory, I create two partitions a bit smaller than the capacity of the drive. The first is a 512MB fat32 partition and the second is a f2fs or ext4 partition that does not quite fill the drive.
After having written Quirky to the stick, I then copy it back as an image file, only copying to the end of the second partition.
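The copy-back step can be sketched in a few lines of Python. This is only an illustration, not necessarily how the Quirky images are actually built. It assumes 512-byte sectors and that the last sector of the second partition is already known (for example from "fdisk -l"). The function names are invented for the example.

```python
def image_length(last_partition_end_lba, sector_size=512):
    """Bytes from the start of the drive up to and including the given sector."""
    return (last_partition_end_lba + 1) * sector_size


def copy_prefix(src_path, dst_path, length, chunk_size=1024 * 1024):
    """Copy the first `length` bytes of src_path to dst_path.

    Returns the number of bytes actually copied, which is less than
    `length` if the source ends early.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while copied < length:
            buf = src.read(min(chunk_size, length - copied))
            if not buf:  # source ended before `length` bytes were read
                break
            dst.write(buf)
            copied += len(buf)
    return copied
```

With a hypothetical end sector of 15000000, the call would be copy_prefix("/dev/sdb", "quirky.img", image_length(15000000)). Everything past that sector, including the secondary GPT in the drive's last sectors, is left out of the image.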
Now for the first problem. I am using a GUID partition table (GPT), which keeps a primary GPT at the start of the drive and a secondary (backup) GPT at the very physical end of the drive.
In my scenario, I am creating an image file with the secondary GPT missing.
Now, a user downloads my image file and writes it to an 8GB stick. If it is a new stick, or one that has been wiped, no problem: Linux will see only the primary GPT and use that.
The problem arises if the stick has been used before, for Quirky or some other Linux distro, in which case it will have a GPT at the end of the drive, or rather most likely will. Note, GPT is usually required for booting on computers with UEFI-firmware.
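One rough way to tell whether a stick (or an image file) still carries a GPT header at its very end is to look for the "EFI PART" signature in the last sector. Below is a minimal Python sketch, assuming 512-byte sectors; the function name is made up for the example. It only looks at the signature, so it says nothing about whether that header is valid or stale.

```python
import os

GPT_SIGNATURE = b"EFI PART"  # first 8 bytes of every GPT header


def has_gpt_header_at_end(path, sector_size=512):
    """Return True if the last sector of `path` starts with the GPT signature."""
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)  # total size of the file or device
        if size < sector_size:
            return False
        f.seek(size - sector_size)  # start of the last sector
        return f.read(len(GPT_SIGNATURE)) == GPT_SIGNATURE
```

Run against an image that was cut off at the end of the last partition, this should return False. Run against a previously used stick (for example "/dev/sdb", as root), it will most likely return True.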
Case A
Linux, and indeed the 'fdisk' utility, both get confused here. This is where it gets murky. Today I discovered that if the image file is written to a drive that has larger capacity than the one I created the image from, all is well. Linux, and fdisk, determine that the secondary GPT is faulty, and use the first one. Fine, that is what we want.
Case B
The murkiness comes in when I write the image file to a smaller drive, perhaps still a nominal 8GB but with less capacity than mine.
When I did this, and replugged the drive, no partition icon showed up on the desktop. Hmmm, I looked in /sys/block/sdb (my flash stick was sdb) and there was an sdb1, but it was reported as not having a filesystem, and the size was completely wrong.
"fdisk -l /dev/sdb" reported that the primary GPT is faulty, and it is using the secondary GPT!
Why? The primary GPT has a pointer to where the secondary GPT is supposed to be. If that pointer is somewhere within the drive, that will be case A. If that pointer is beyond the physical end of the drive, that will be case B. In the latter case, the Linux kernel and fdisk then conclude that the primary GPT is invalid.
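To make the two cases concrete, here is a minimal Python sketch that reads that pointer and compares it with the size of the drive. In the GPT header layout the pointer is the 64-bit little-endian "alternate LBA" field at byte offset 32 of the header, and the primary header lives in sector 1. The sketch assumes 512-byte sectors and the function names are invented. It models only the explanation given above. The kernel and fdisk also validate checksums and other header fields, so this is not a reimplementation of their logic.

```python
import struct


def read_primary_header(path, sector_size=512):
    """Read the sector that holds the primary GPT header (LBA 1)."""
    with open(path, "rb") as f:
        f.seek(sector_size)
        return f.read(sector_size)


def backup_lba(header):
    """Return the alternate (backup) header LBA stored in a GPT header."""
    if header[:8] != b"EFI PART":
        raise ValueError("no GPT signature found")
    # Alternate LBA: little-endian unsigned 64-bit value at byte offset 32.
    return struct.unpack_from("<Q", header, 32)[0]


def classify(header, drive_bytes, sector_size=512):
    """Return "A" if the pointer lies within the drive, "B" if beyond its end."""
    last_lba = drive_bytes // sector_size - 1
    return "A" if backup_lba(header) <= last_lba else "B"
```

For a real stick, the header would come from read_primary_header("/dev/sdb") and drive_bytes from the size the kernel reports for the drive.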
This is incredible, but does seem to be the situation.
The new Easy Linux that I am developing has only one 519MB fat32 partition, for a total image size of 520MB. If I create this on a flash stick that is smaller than any that a user will have, all will be well. I could create it on a 1GB drive, if I had one. I will have to start asking around, to see if someone has an old one that is not yet broken.
Easy Linux is actually intended to run on a Flash stick as small as 2GB (linuxcbon will be happy!), though 4GB or more is more useful.
The way I am designing it, at first bootup it will automatically create an ext4 partition to fill the drive, and at the same time create a correct secondary GPT.
That's the plan anyway.
Comments
Apart from avoiding Case B, the missing secondary GPT still has to be fixed. There have been various posts on the Puppy Forum and elsewhere on the Internet about this. Many people have run Gparted, which offers to fix it, but I don't know what Gparted actually does.
The way to fix it, that works for me, is to use the 'gdisk' utility, like this:
# gdisk /dev/sdb
x
e
w
Y
The "x" is expert-mode, the "e" is to copy the primary GPT to the physical end of the drive and fix the pointers. "w" is write to disk and quit, "Y" is to confirm.
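For scripted use, the same four keystrokes can be fed to gdisk on its standard input. Here is a minimal Python sketch of that idea. The function name and the "run" parameter are invented for the example, and "run" exists only so that the call can be tried without touching a real drive. It assumes gdisk asks no extra questions along the way. If it does, the fixed key sequence will be wrong, so treat this as an illustration rather than a robust tool.

```python
import subprocess

# The interactive answers from above: expert mode, relocate the backup
# data structures to the end of the drive, write, confirm.
GDISK_KEYS = "x\ne\nw\nY\n"


def relocate_backup_gpt(device, run=subprocess.run):
    """Feed the x/e/w/Y key sequence to gdisk for the given device."""
    return run(["gdisk", device], input=GDISK_KEYS, text=True, check=True)
```

Calling relocate_backup_gpt("/dev/sdb") as root would then do the same as typing the keys by hand. The gptfdisk project also ships sgdisk, whose -e (--move-second-header) option is intended for this kind of non-interactive use.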
I want to do this in the initrd. Unfortunately gdisk, which is part of the 'gptfdisk' project, is written in C++ and is hard to compile statically.
I read on your blog that gdisk seems hard to compile statically. I did a test in my xwoaf-toolchain, with which I have compiled other C++ apps statically. After installing libuuid-1.0.3, disabling the need for libicu in the Makefile for gptfdisk-1.0.1, and adding "double log2( double n ) { return log( n ) / log( 2 ); } " at the top of gpt.cc, it built without problems to a 357K statically linked gdisk.
That is great news! I have asked goingnuts to post it to me, and I will test it in my initramfs.
I have got gdisk, with shared-libs, to run in my initramfs, by loading an overlay filesystem inside the initramfs, using 'q.sfs' as bottom ro layer. It works, but is a kludge.
Note: yes, I have come in a circle, the "next generation" Quirky is back to using SFS files and a layered filesystem, similar to Puppy, instead of a full install. I have determined this is the best way to support containers. That's why q.sfs exists, which is the entire Quirky filesystem in a squashfs.
I have got it in the initramfs in Easy Linux, it is used to fix the secondary header/GPT.
The reason for this is to overwrite any pre-existing start of a second partition on the flash stick.
Part of the reason Linux and fdisk choose an old secondary GPT on the disk as being valid, and not the primary one in my image file, is the secondary GPT reporting the existence of both part#1 and part#2, whereas my primary GPT reports only part#1 exists.
The problem is that earlier Quirkies have the same starting-point and f.s. for part#1 and part#2, so if you are re-using such a stick for Easy Linux, that old part#2 is still there.
This is where that extra 1MB size of my Easy Linux image comes in. It overwrites the start of any pre-existing part#2 with zeroes, thus eliminating it and making Linux/fdisk less likely to think the old secondary GPT is the correct one.
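As an illustration of the padding idea, here is a minimal Python sketch that appends a run of zero bytes to an image file. It assumes the extra space is simply zeroes at the tail of the image, which may not match how the Easy Linux image is really assembled. The function name and the default of 1MB are choices made for the example.

```python
import os


def pad_image_with_zeros(image_path, pad_bytes=1024 * 1024):
    """Append `pad_bytes` zero bytes to the image; return the new file size."""
    with open(image_path, "ab") as f:
        f.write(bytes(pad_bytes))  # bytes(n) is n zero bytes
    return os.path.getsize(image_path)
```

When such a padded image is written to a stick, the zero tail lands on whatever used to sit just past the end of part#1, which is where the old part#2 started on the earlier Quirkies.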
Tags: linux