This was initially published as a section of one of my substack posts.
You might be wondering why the term algebraic data types has the word algebra in it. Don’t sweat it if you hated algebra in high school, I will not call on you to solve simultaneous equations or prove by induction on the board in front of the class.
Algebra
The basic idea is that through a little algebra we can determine the cardinality of an algebraic data type which represents how many possible values that could inhabit it.
Consider the following sum type:
data Color
-- | RGB uses 8 bits in three color channels with values of 0–255
= RGB { red :: Word8, green :: Word8, blue :: Word8 }
Here we have created a type Color
and a data constructor RBG
which
represents a color as 8 bits in three color channels with values of
0–255 (red, green, blue). This is basically a record with three fields
and each field can hold 256 different values which gives us a
cardinality of 256^3 = 256*256*256 = 16777216. This means we can
represent 16,777,216 (>16 million) different colors with this
representation. Record types are sometimes called “product types” for
this reason.
Datatypes defined in terms of mutually exclusive data constructors are called “sum types” for a similar reason because you can “sum up” the cardinalities of all constructors to find the cardinality of the datatype.
Color Example
Consider our Color
datatype from above where now our application needs
to denote colors using the subtractive CMY representation because
sometimes we want to be able to use our library or application for dye
colors (rather than for light-based usages like photography,
television, digital screens, etc.).
data Color
-- | RGB uses 8 bits in three color channels with values of 0–255
= RGB { red :: Word8, green :: Word8, blue :: Word8 }
-- | CMY uses 8 bits in three other color channels w/ subtractive model
| CMY { cyan :: Word8, magenta :: Word8, yellow :: Word8 }
Now for every value of the type Color
, we can choose between RGB
and CMY
constructions. Since RGB
denotes an additive model which is
fundamentally different from the CMY
subtractive model the data
constructor acts as a way to differentiate between the
representations, therefore the number of colors with different
representations we can construct with this new datatype definition is
the sum of the cardinality of both constructions (RBG
, CMY
) which
happens to be 16777216 + 16777216 = 33,554,432.
This post was just a teaser to show how the algebra of algebraic data types start off, but it continues even further than what we have covered with recursive algebraic data types. More details can be found in these blog posts or talks:
- The algebra (and calculus!) of algebraic data types (post)
- The algebra of algebraic data types (talk)
That zippers are derivations of the original type is wild and fun for the math inclined (see the links above)! :)
I hope you enjoy learning more about the algebra of ADTs as much as I did when I first learned about it.
Useful links for the domain we explore with datatypes include:
If you enjoyed this content, please consider sharing this link with a friend, following my GitHub, Twitter/X or LinkedIn accounts, or subscribing to my RSS feed.