Projects / Rainbow arrays

After a lot of work (and software engineering) I finished up the first two episodes in what is hopefully a longer series of illustrated blogs. The theme is multidimensional arrays and how we manipulate them conceptually. I know the deep learning crowd like to call them tensors, but this brazen theft of terminology from mathematical physics has some some big downsides. First, it makes everyone assume that array programming is just linear algebra, which it definitely is not. Second, it means that searching for the term “tensor” will yield two largely non-overlapping magisteria that should not (and cannot) be unified.

Classical array algebra

In the first post, classical array algebramath.tali.link, I start with some real-world examplesmath.tali.link of arrays of different arity (=rank). They’re obviously pretty familiar territory for a lot of people, but it’s good to set the stage with some concrete examples – and some of those examples will be fodder for future posts in the series.

More importantly, I introduce the intuition that arrays are memorized functions. This is important, because it translates array manipulation into higher order functional programming. If you’re used to NumPy, this might come as a surprise. If you’re a Haskeller (or better yet a Dexer), perhaps less so.

Then I describe a common set of primitivesmath.tali.link that all array programming environments use. There are obviously a lot more to be found, but the majority of them boil down to combinations or elaborations of these.

Along the way I introduce a graphical languagemath.tali.link for describing these operations. It amounts to nothing more than using monoidal string diagrams to build new arrays/functions out of old ones. A slight novelty here is a kind of bubble notationmath.tali.link (presumably applicable to any cartesian closed category) that lets us express currying and uncurrying, which cleanly resolves the age old question of “should I consider this vector to be a row vector or a column vector in this context?”. The answer, if you are wondering, is “no” 🙂.

Rainbow array algebra

In the second post, rainbow array algebramath.tali.link, I switch from the common ordered axes approach to the more modern named axes one. With our functional hat on, this has a very straightforward interpretation: we are moving from tuples to records. The resulting string diagram language acquires color.

I’m not sure exactly what the corresponding categorical story is. That’ll be fun to figure out.

With all this in hand I move on to representing some neural networks. At least the forward part. I start with perceptronsmath.tali.link. We now have to confront the fact that there are two kinds of function involved: array-functions, and function-functions. This happens because neural nets distinguish the arrays that are learned from the arrays that get transformed from input to output. This is currently dealt with in category theory with the Para construction. But from a type theory point of view, my claim (yet to be proved) is that an architectural shape, like a perceptron, is really a polymorphic type.

For the pot of gold under the rainbow, I work through a fairly challenging example: a treatment of transformersmath.tali.link. It’s understandably tricky. But then the traditional explanations are, arguably, much worse. Named axes / colored wires helps, I hope. I hope you make it all the way, and will venture further when I take you into the categorical underpinnings in a future post!