Payload, singleton, and stride lengths
Once again I’m inventing terms for useful distinctions that programmers need to make and sometimes get confused about because they lack precise language.
The motivation today is some issues that came up while I was trying to refactor some data representations to reduce reposurgeon’s working set. I realized that there are no fewer than three different things we can mean by the “length” of a structure in a language like C, Go, or Rust – and no terms to distinguish these senses.
Before reading these definitions, you might to do a quick read through The Lost Art of Structure Packing.
The first definition is payload length. That is the sum of the lengths of all the data fields in the structure.
The second is stride length. This is the length of the structure with any interior padding and with the trailing padding or dead space required when you have an array of them. This padding is forced by the fact that on most hardware, an instance of a structure normally needs to have the alignment of its widest member for fastest access. If you’re working in C, sizeof gives you back a stride length in bytes.
I derived the term “stride length” for individual structures from a well-established traditional use of “stride” for array programming in PL/1 and FORTRAN that is decades old.
Stride length and payload length coincide if the structure has no interior or trailing padding. This can sometimes happen when you get an arrangement of fields exactly right, or your compiler might have a pragma to force tight packing even though fields may have to be accessed by slower multi-instruction sequences.
“Singleton length” is the term you’re least likely to need. It’s the length of a structure with interior padding but without trailing padding. The reason I’m dubbing it “singleton” length is that it might be relevant in situations where you’re declaring a single instance of a struct not in an array.
Consider the following declarations in C on a 64-bit machine:
struct {int64_t a; int32_t b} x;
char y
That structure has a payload length of 12 bytes. Instances of it in an array would normally have a stride length of 16 bytes, with the last two bytes being padding. But in this situation, with a single instance, your compiler might well place the storage for y in the byte immediately following x.b, where there would trailing padding in an array element.
This struct has a singleton length of 12, same as its payload length. But these are not necessarily identical, Consider this:
struct {int64_t a; char b[6]; int32_t c} x;
The way this is normally laid out in memory it will have two bytes of interior padding after b, then 4 bytes of trailing padding after c. Its payload length is 8 + 6 + 4 = 20; its stride length is 8 + 8 + 8 = 24; and its singleton length is 8 + 6 + 2 + 4 = 22.
To avoid confusion, you should develop a habit: any time someone speaks or writes about the “length” of a structure, stop and ask: is this payload length, stride length, or singleton length?
Most usually the answer will be stride length. But someday, most likely when you’re working close to the metal on some low-power embedded system, it might be payload or singleton length – and the difference might actually matter.
Even when it doesn’t matter, having a more exact mental model is good for reducing the frequency of times you have to stop and check yourself because a detail is vague. The map is not the territory, but with a better map you’ll get lost less often.
Eric S. Raymond's Blog
- Eric S. Raymond's profile
- 140 followers
