Math 4 Wisdom. "Mathematics for Wisdom" by Andrius Kulikauskas.

Sheffer Polynomials: Combinatorial Space for Quantum Physics

Sheffer polynomials construct a combinatorial space which has meaning for quantum physics. I will show you the details of this combinatorial interpretation, how I came up with it and what it means for my investigation of wisdom. I am Andrius Kulikauskas and this is Math 4 Wisdom. In the description of this video, you can find a link to the transcript and all of the slides, as well as how to contact me and support me through Patreon.

What information is encoded by orthogonal polynomials when they appear as solutions of the Schroedinger equation? Two examples to consider are the Hermite polynomials, which appear in solutions for the quantum harmonic oscillator, and the Laguerre polynomials, which appear in solutions for the hydrogen atom. These polynomials {$P_n(x)$} are variants of instances of Sheffer polynomials {$S_n(x)=c_nP_n(rx)$}. Do the coefficients of such polynomials encode anything of physical significance?

In this video, I will define what it means for an infinite family of polynomials {$\{S_n(x)\}$} to be a Sheffer sequence and explain how such a family encodes the construction of a combinatorial space. Each polynomial {$S_n(x)$} will have degree {$n$} and will encode a space containing {$n$} elements. In future videos, I will explore the additional constraint of orthogonality, which I will argue encodes two kinds of causal links between the elements.

In what follows, I will first show you what the combinatorial space looks like. Next, I will explain what Sheffer polynomials are. Then I will show you how Sheffer polynomials encode that combinatorial space. Finally, I will discuss the physical significance, the cognitive significance and some questions I am exploring further.

Combinatorial space: Pointed partitions

We'll be constructing a notion of space that is encoded by the Bell numbers - 1, 2, 5, 15, 52 and so on - which count the number of ways to partition a set of size n. For example, these are the 5 ways of partitioning a set with 3 elements.

[012], [01][2], [02][1], [12][0], [0][1][2]
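As a quick check of these counts, here is a minimal Python sketch that lists the partitions of a small set by brute force. The function name `set_partitions` is a hypothetical helper introduced only for this illustration.

```python
def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or give it a block of its own.
        yield [[first]] + partition


# The partitions of a set with three elements, labeled 0, 1, 2.
partitions_of_three = list(set_partitions([0, 1, 2]))

# The number of partitions of sets of size 1 through 5.
bell_numbers = [sum(1 for _ in set_partitions(list(range(n)))) for n in range(1, 6)]
```

Here `partitions_of_three` holds the five partitions listed above, and `bell_numbers` reproduces the sequence 1, 2, 5, 15, 52.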

How do these partitions grow as we add new elements one-by-one? Let us start with what I will call the zeroth element and then add a first element and then add a second and third and so on. I am going to interpret these partitions in a peculiar way which you will appreciate when we consider the calculation of the Sheffer polynomials and which I imagine has physical significance. The point is to realize that the initial element and the initial part are very special. The initial element must go into the initial part and there is no choice about that. So let us think of the initial part as free space and distinguish it from all of the other parts, which I will call compartments. We know that the initial element must be in this free space and since we have no choice where to put it, let us call it the zeroth element, let us think of it as emptiness and simply ignore it. We can always recover it from the free space, which we have distinguished. Since we have this distinguished part, I will call these pointed partitions of a set, with a base part, by analogy with a pointed set, which has a base point, and likewise other pointed objects.

____

So we have this curious way of thinking of free space, that it is an initial compartment with an initial element, the zeroth element, a special element, which we can call the vacuum. Let us see what happens further as we add elements one-by-one, which I will call the first, second, third elements and so on.

We can add the first element to the free space directly or we can add it along with a new compartment which contains it. These are the two possibilities.

__1__ , ____[1]

How can we add a second element? We need to consider three cases. We can add it to the free space, or we can add it to an existing compartment, or we can add it along with a new compartment. Adding it to the free space yields two possibilities; adding it to an existing compartment yields another possibility; adding it along with a new compartment yields two more possibilities. Thus the total is five possibilities.

__12__ , __2__[1] , ____[12] , __1__[2] , ____[1][2]

If we like, then we can compare this with the five ways of partitioning a set with three elements, which becomes obvious if we conceive of the free space as a compartment and we write in it the zeroth element.

But let us go back to our preferred interpretation and consider what happens when we add a third element. Here again we need to consider three cases. We can add it to the free space, or we can add it to an existing compartment, or we can add it along with a new compartment. This yields a total of fifteen possibilities. The three cases make perfect sense in our interpretation. We will get the same three cases if we keep adding elements.
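The three cases can be sketched in Python as follows. This is only an illustration: the representation of a pointed partition as a pair `(free, compartments)` and the names `extend`, `pointed_partitions` and `show` are assumptions made for this sketch.

```python
def extend(pointed, new):
    """Return all pointed partitions obtained by adding the element `new`.

    A pointed partition is a pair (free, compartments): `free` is a tuple of
    the elements in the free space, `compartments` is a tuple of tuples.
    """
    free, compartments = pointed
    results = []
    # Case 1: add the new element to the free space.
    results.append((free + (new,), compartments))
    # Case 2: add it to one of the existing compartments.
    for i, comp in enumerate(compartments):
        grown = compartments[:i] + (comp + (new,),) + compartments[i + 1:]
        results.append((free, grown))
    # Case 3: add it along with a new compartment that contains it.
    results.append((free, compartments + ((new,),)))
    return results


def pointed_partitions(n):
    """Build all pointed partitions of {1, ..., n} one element at a time."""
    stage = [((), ())]  # stage 0: empty free space, no compartments
    for new in range(1, n + 1):
        stage = [q for p in stage for q in extend(p, new)]
    return stage


def show(pointed):
    """Write a pointed partition in the notation used above, e.g. __1__[2]."""
    free, compartments = pointed
    parts = "".join("[" + "".join(map(str, c)) + "]" for c in compartments)
    return "__" + "".join(map(str, free)) + "__" + parts


counts = [len(pointed_partitions(n)) for n in range(5)]
```

With these definitions, `counts` comes out as 1, 2, 5, 15, 52, and `[show(p) for p in pointed_partitions(2)]` gives the five configurations written out earlier.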

{$J_{k,n+1}=J_{k,n}+kJ_{k,n}+J_{k-1,n}$}

We can, for the sake of rigor, define the number {$J_{k,n}$} of possibilities with n elements and k compartments, and write down a recursion relation, just to emphasize these three cases, which will seem very natural when we now look at Sheffer polynomials.
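A short Python sketch of this recursion, assuming the starting value {$J_{0,0}=1$} for the empty configuration (the function name `compartment_counts` is hypothetical):

```python
def compartment_counts(n_max):
    """Return rows[n][k] = J_{k,n} for 0 <= k <= n <= n_max."""
    rows = [[1]]  # n = 0: a single configuration, with k = 0 compartments
    for n in range(n_max):
        prev = rows[-1] + [0]  # pad so that prev[k] is defined for k = n + 1
        row = []
        for k in range(n + 2):
            # New element goes into the free space, or into one of k compartments.
            stay = prev[k] + k * prev[k]
            # New element arrives along with a new compartment.
            grow = prev[k - 1] if k >= 1 else 0
            row.append(stay + grow)
        rows.append(row)
    return rows


rows = compartment_counts(4)
totals = [sum(row) for row in rows]
```

Summing each row over {$k$} recovers the totals 1, 2, 5, 15, 52. For {$n=3$} the row is 1, 7, 6, 1, which is to say one configuration with no compartments, seven with one compartment, six with two and one with three.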

The generating function for Sheffer polynomials

{$\sum_{n=0}^{\infty}S_n(x)t^n=A(t)e^{xu(t)}$} where {$a_0=1$}, {$u_0=0$}, {$u_1=1$}

There are various equivalent ways of defining Sheffer polynomials of type zero as he called them. For us today we want to consider them as polynomial sequences whose generating function can take the form {$A(t)e^{xu(t)}$}, where {$A(t)$} and {$u(t)$} are infinite series. In this equation, we assume the coefficients {$a_0=1$}, {$u_0=0$}, {$u_1=1$}.

Let us briefly see how this relates to the generating functions of the Hermite polynomials and of the Laguerre polynomials which I mentioned earlier. If we multiply the Hermite polynomials by {$\frac{1}{n!}$}, and replace {$x$} with {$-x$} in the Laguerre polynomials, then they have generating functions as required for Sheffer sequences of polynomials. In this video, we're focusing on the combinatorial implications of {$S_n(x)$} being a Sheffer sequence but ultimately we'll be interested in describing and classifying what happens when it is also a sequence of orthogonal polynomials.
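For reference, here are the standard generating functions written in the form above. This assumes the physicists' convention for the Hermite polynomials {$H_n(x)$} and the normalization {$L_n(0)=1$} for the Laguerre polynomials; other conventions change the constants.

{$\sum_{n=0}^{\infty}\frac{H_n(x)}{n!}t^n = e^{-t^2}e^{2xt}$}, so that {$A(t)=e^{-t^2}$} and {$u(t)=2t$}

{$\sum_{n=0}^{\infty}L_n(-x)t^n = \frac{1}{1-t}e^{x\frac{t}{1-t}}$}, so that {$A(t)=\frac{1}{1-t}$} and {$u(t)=\frac{t}{1-t}$}

In the Hermite case {$u_1=2$}, so meeting the condition {$u_1=1$} requires rescaling {$x$} by a factor of {$\frac{1}{2}$}, which is an instance of the variants {$c_nP_n(rx)$} mentioned at the start.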

Combinatorially, I want to think of {$A(t)$} and {$u(t)$} as encoding building blocks and I want to explain how the Sheffer polynomials {$S_n(x)$} encode configurations assembled from these building blocks. Physically, the expression {$A(t)e^{xu(t)}$} brings to mind Lie theory, the elements of a Lie group, where {$u(t)$} plays the role of the Lie algebra, acting locally, and {$A(t)$} manifests the Lie group's real form, how it looks globally, whether it is compact or not. In our interpretation of combinatorial space, {$A(t)$} will play the role of free space, whereas {$e^{xu(t)}$} will contribute the compartments, with each compartment having weight {$x$}. This means that a Sheffer polynomial {$S_n(x)$} will have terms of degree n which will encode {$n$} compartments but will also have terms of lower degree {$k$} which will encode {$k$} compartments and even constant terms of degree zero which will encode no compartments but simply the free space and whatever elements it has.

How can we calculate the Sheffer polynomial? We are given the generating function and we would like to isolate a particular {$S_n(x)$}.

A straightforward way to do that is simply to calculate the coefficient of the power {$t^n$} on either side of our equation. On the left hand side we have {$S_n(x)$} and on the right hand side we have a product of generating functions. We can write out {$A(t)=1 + a_1t + a_2t^2 + \dots$} and {$u(t)=t+u_2t^2+u_3t^3 + \dots$}, and we have the Taylor expansion {$e^{xu(t)}=1 + xu(t) + \frac{1}{2!}(xu(t))^2 + \frac{1}{3!}(xu(t))^3 + \dots$}. When we multiply together {$A(t)$} and {$e^{xu(t)}$}, each term with power {$t^n$} combines two contributions, one of order {$t^{n-k}$} from {$A(t)$} and the other of order {$t^{k}$} from {$e^{xu(t)}$}. The latter contribution will come from a term of the form {$\frac{1}{m!}(xu(t))^m$}, which means that it will bring together {$m$} contributions, each with at least one power of {$t$}, and those {$m$} powers will add up as {$k=l_1+l_2+\dots+l_m$}. We can see that we are getting a partition of {$n$} into a special part and {$m$} additional parts. The special part has size {$n-k$}, which is zero when {$n=k$}. The additional parts all have size greater than zero. We will see that the special part plays the role of free space, not contributing any power of {$x$}, and the other parts are the compartments, each contributing a power of {$x$}. We would like to think of the special part as having {$n-k$} elements and the {$i$}th additional part as having {$l_i$} elements. However, it's not yet clear how to define and justify the notion of element. Furthermore, we have a factor of {$\frac{1}{m!}$} to worry about.
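The coefficient extraction described above can be carried out mechanically. The following Python sketch multiplies truncated series, storing each as a dictionary from (power of {$t$}, power of {$x$}) to a coefficient. The names `multiply` and `sheffer_by_coefficients` are hypothetical, and the sample input at the end, with all {$a_i=1$} and all {$u_i=1$} for {$i\geq 1$}, is an assumption chosen for illustration; it corresponds to {$A(t)=\frac{1}{1-t}$} and {$u(t)=\frac{t}{1-t}$}.

```python
from fractions import Fraction
from math import factorial


def multiply(f, g, n):
    """Multiply two series in t and x, dropping powers of t above n.

    A series is a dict mapping (power of t, power of x) to a coefficient.
    """
    product = {}
    for (i, a), c in f.items():
        for (j, b), d in g.items():
            if i + j <= n:
                key = (i + j, a + b)
                product[key] = product.get(key, 0) + c * d
    return product


def sheffer_by_coefficients(a, u, n):
    """Coefficient of t^n in A(t) * exp(x * u(t)), as {power of x: coefficient}.

    `a` and `u` list the coefficients a_0..a_n and u_0..u_n, with u_0 = 0.
    """
    A = {(i, 0): Fraction(a[i]) for i in range(n + 1)}
    xu = {(i, 1): Fraction(u[i]) for i in range(1, n + 1)}
    exp_xu = {(0, 0): Fraction(1)}  # running sum of (x u)^m / m!
    power = {(0, 0): Fraction(1)}   # (x u)^m, starting with m = 0
    for m in range(1, n + 1):       # (x u)^m has no terms below t^m
        power = multiply(power, xu, n)
        for key, c in power.items():
            exp_xu[key] = exp_xu.get(key, 0) + c / factorial(m)
    full = multiply(A, exp_xu, n)
    return {b: c for (i, b), c in full.items() if i == n and c != 0}


# Sample input (an assumption for illustration): a_i = 1 and u_i = 1 for i >= 1.
S2 = sheffer_by_coefficients([1, 1, 1], [0, 1, 1], 2)
```

For this sample input, `S2` represents {$1 + 2x + \frac{1}{2}x^2$}. The fraction {$\frac{1}{2}$} on the leading term is the factor {$\frac{1}{m!}$} with {$m=2$}.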

Typically, the factor {$\frac{1}{m!}$} would serve to convert {$m!$} permutations into a single way of ordering elements. For example, if we have one ordering {$t^2\times t^4$} and we have another ordering {$t^4\times t^2$}, then dividing by {$2!$} would reduce that to a single ordering. However, in our product we can also get terms such as {$t^3\times t^3$} where both factors are identical and their product appears only once, so it doesn't make sense to divide by {$2!$}. As things stand, the algebraic expression doesn't have a form that I would say is completely intelligible and truly meaningful. And yet the information in this expression could have other forms which could be much more intelligible, informative and meaningful.

This is a key lesson of algebraic combinatorics which is very relevant for Math 4 Wisdom. I suggest it's also important for physics. If mathematics is the language of nature, and we want to understand nature, then we need to actively listen to nature, we need to express what nature is telling us in a form that we can make sense of. There are many ways to fiddle with the coefficients of the orthogonal polynomials, and we can write them in certain ways for historical reasons, but ultimately we need to focus on those expressions which could semantically be meaningful, which is to say, those which encode structures. We have to realize that many of the distinctions that we make, such as defining Sheffer polynomials with an ordinary generating function {$\sum_{n=0}^{\infty}S_n(x)t^n$} rather than an exponential generating function {$\sum_{n=0}^{\infty}P_n(x)\frac{1}{n!}t^n$}, have consequences semantically and can keep us from observing how our minds work and also keep us from hearing what nature is saying.

Calculating the kth derivative

One important thing to be aware of is that we can use the Taylor series, or more specifically, the Maclaurin series, to appreciate the role of derivatives and factorials in the information that the functions {$A(t)$} and {$u(t)$} encode.

{$u(t)= u_0 + u_1t + u_2t^2 + u_3t^3 + \cdots = u(0)+u'(0)t + \frac{u''(0)}{2!}t^2 + \frac{u'''(0)}{3!}t^3 + \cdots$}

We see that {$u_i = \frac{u^{(i)}(0)}{i!}$}. This means that there are factorials lurking throughout our algebraic expression for the Sheffer polynomials. If we multiply both sides by {$n!$}, then on the right hand side we will be getting fractions of the form {$\frac{n!}{(n-k)!l_1!l_2!\cdots l_m!}$}, which is the multinomial coefficient {$\binom{n}{n-k,l_1,l_2,\dots,l_m}$}. We can go back to the algebraic expression that we have been analyzing and reinterpret each {$u_i$} as a product {$\frac{u^{(i)}(0)}{i!}$}. The {$i!$} is an integer weight associated with the power {$t^i$} that allows products of these powers to be interpreted as enumerating choices. Importantly, each choice that distributes {$n$} elements across the parts of size {$n-k$} and {$l_1$} through {$l_m$} will make those parts distinct. In particular, now that the {$m$} so-called additional parts are distinct, then they can arise in {$m!$} ways, and we see that the factor {$\frac{1}{m!}$} allows us to reduce them to a single case. Which is to say, the {$m!$} lists of compartments become a single set of compartments. We are very near to claiming that the Sheffer polynomials are encoding partitions of sets, which is to say, ways of partitioning the elements of a set into compartments. However, this approach has become very subtle and abstract and contrived. Let us redo this all in a way that is more concrete and natural.
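Before moving on, the role of the factorials can be checked in the smallest interesting case, {$n=2$}. This is a sketch that uses only the expansions above together with {$a_i = \frac{A^{(i)}(0)}{i!}$}, {$u_i = \frac{u^{(i)}(0)}{i!}$} and {$u_1=1$}. Reading off the coefficient of {$t^2$} gives

{$S_2(x) = a_2 + (a_1 + u_2)x + \frac{1}{2!}x^2$}

and multiplying by {$2!$} gives

{$2!S_2(x) = \frac{2!}{2!}A''(0) + \left(\frac{2!}{1!1!}A'(0) + \frac{2!}{2!}u''(0)\right)x + \frac{1}{2!}\cdot\frac{2!}{1!1!}x^2 = A''(0) + (2A'(0) + u''(0))x + x^2$}

Each fraction is a multinomial coefficient, and in the last term the factor {$\frac{1}{2!}$} cancels the two orderings of the two single-element compartments.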

How can we calculate the Sheffer polynomial? We are given the generating function and we would like to isolate a particular {$S_n(x)$}.

We can do this on the left hand side by taking the {$n$}th derivative. This will eliminate all of the terms {$S_k(x)t^k$} for {$k<n$} as their {$n$}th derivative is {$0$}. The {$n$}th derivative of the higher terms where {$k\geq n$} will be {$\frac{k!}{(k-n)!}S_k(x)t^{k-n}$}. Then if we set {$t=0$} all of these terms will go to {$0$} except for the one term when {$k=n$}, which has no power of {$t$}. Thus on the left hand side we are left with the single term {$n!S_n(x)$}.

What happens on the right hand side when we take the {$n$}th derivative? We will have to make repeated use of the chain rule and the product rule and then look for a meaningful pattern. At first glance, the results may seem haphazard, unwieldy, incomprehensible. But I want to emphasize that taking the derivative is an action which evokes the symmetry that is inherent in this situation. I think of this as analytic symmetry. By that I mean the symmetry whereby the derivative of the exponential function {$e^x$} is that very same function. Or likewise the fourth derivative of {$\textrm{sin}\; x$} is that very same function. The Taylor series makes this symmetry explicit. I imagine that Sophus Lie, the inventor of Lie theory, was thinking of such symmetries as the key to understanding differential equations in the way that Évariste Galois uncovered the symmetries in polynomial equations. All of this to say that it is meaningful to investigate the combinatorial effects of taking a derivative.

After we take the {$n$}th derivative we will set {$t=0$}. We will make use of our initial conditions, {$a_0=1$}, which means {$A(0)=1$}; and {$u_0=0$}, which means {$u(0)=0$} so that {$e^{xu(0)}=1$}; and {$u_1=1$}, which means {$u'(0)=1$}. From these we see that the very first polynomial {$S_0(x)$} is simply {$A(0)e^{xu(0)}$} which is the constant {$1$}. Let us do the first few examples and we will appreciate the combinatorics and how it relates to the construction of space that we considered.

To simplify our notation, let us write {$E(t)=e^{xu(t)}$}. What is the derivative with respect to {$t$}? Note that {$x$} is just a constant with respect to {$t$}. By the chain rule, {$E'(t) = xu'(t)e^{xu(t)}$} which is {$xu'(t)E(t)$}. I think of {$E(t)$} as a mother function in that each time we apply a derivative we get back {$E(t)$} but multiplied with a child {$xu'(t)$}. Which is to say that the mother function never goes away, is always with us, but its children proliferate and further evolve. The mother function is a center of analytic symmetry. Note also that {$E(0)=1$} which means that the mother function itself will have no effect at the end of the calculation.

With all that in mind, let us take the derivative of {$A(t)E(t)$}. By the product rule we have two terms

{$\frac{d}{dt}A(t)E(t) = A'(t)E(t) + A(t)E'(t) = A'(t)E(t) + A(t)xu'(t)E(t)$}

When we set {$t=0$} we get an expression for the polynomial {$1!S_1(x)$}.

{$A'(0)E(0)+A(0)xu'(0)E(0) = A'(0) + x = 1!S_1(x)$}.

Let us take the second derivative. Here we will make use of the product rule for a product of three factors, which is

{$\frac{d}{dt}A(t)B(t)E(t) = A'(t)B(t)E(t) + A(t)B'(t)E(t) + A(t)B(t)E'(t)$}

which you can derive from the product rule for two factors. As you see, the product rule has us take the derivative of a single factor in the product and do that for each of the three factors and then add the results together. Indeed, by induction we can show that the product rule for n factors will be the sum of n terms, where each term is a product of n factors where we have taken the derivative of one of those factors but kept the other factors the same.
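Written out as a formula, with {$f_1,\dots,f_n$} standing for arbitrary differentiable factors (a notation introduced only for this remark), the general product rule reads:

{$\frac{d}{dt}\prod_{i=1}^{n}f_i(t) = \sum_{i=1}^{n}f_i'(t)\prod_{j\neq i}f_j(t)$}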

The second derivative gives us five terms, which I will call five pedigrees to emphasize that each has its own line of ancestors, each was derived in its own unique way.

{$\frac{d^2}{dt^2}A(t)E(t) = A''(t)E(t) + A'(t)xu'(t)E(t) + A'(t)xu'(t)E(t) + A(t)xu''(t)E(t) + A(t)(xu'(t))^2E(t)$}

We see that the second and third pedigrees yield the same result so we can combine them, although from the combinatorial point of view it helps to think of them as two distinct pedigrees, two distinct paths in the derivation, so that there are five pedigrees in all.

{$\frac{d^2}{dt^2}A(t)E(t) = A''(t)E(t) + 2A'(t)xu'(t)E(t) + A(t)xu''(t)E(t) + A(t)(xu'(t))^2E(t)$}

When we set {$t=0$} then we get an expression for the polynomial {$2!S_2(x)$}.

{$A''(0) + (2A'(0) + u''(0))x + x^2 = 2!S_2(x)$}.

We want to find a pattern. Let's take the third derivative.

{$\frac{d^3}{dt^3}A(t)E(t) = A'''(t)E(t) + A''(t)xu'(t)E(t) + 2A''(t)xu'(t)E(t) + 2A'(t)xu''(t)E(t) + 2A'(t)(xu'(t))^2E(t) $}

{$ + A'(t)xu''(t)E(t) + A(t)xu'''(t)E(t) + A(t)xu''(t)xu'(t)E(t) + A'(t)(xu'(t))^2E(t) + 2A(t)(xu'(t))xu''(t)E(t) + A(t)(xu'(t))^3E(t)$}

Combining terms we are left with

{$\frac{d^3}{dt^3}A(t)E(t) = A'''(t)E(t) + 3A''(t)xu'(t)E(t) + 3A'(t)xu''(t)E(t) + 3A'(t)(xu'(t))^2E(t) $}

{$ + A(t)xu'''(t)E(t) + 3A(t)xu'(t)xu''(t)E(t) + A(t)(xu'(t))^3E(t)$}

We have 7 distinct terms, but if we add the coefficients 1 plus 3 plus 3 plus 3 plus 1 plus 3 plus 1, then we see that we have 15 pedigrees.

When we set {$t=0$} then we get an expression for the polynomial {$3!S_3(x)$}.

{$A'''(0) + (3A''(0) + 3A'(0)u''(0) + u'''(0))x + (3A'(0) + 3u''(0))x^2 + x^3 = 3!S_3(x)$}
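These derivatives can be generated mechanically, which is a useful guard against slips in the hand calculation. In the Python sketch below a term is stored as a pair `(j, babies)`, meaning the {$j$}th derivative of {$A(t)$} times one factor of {$x$} times the {$l$}th derivative of {$u(t)$} for each {$l$} in `babies`, times {$E(t)$}. This encoding and the names `differentiate`, `pedigrees` and `combined` are assumptions made for the sketch.

```python
from collections import Counter


def differentiate(term):
    """Apply d/dt to one term, returning one new term per factor."""
    j, babies = term
    # Product rule acting on the factor A^(j)(t).
    children = [(j + 1, babies)]
    # Product rule acting on one of the factors x u^(l)(t).
    for i, l in enumerate(babies):
        children.append((j, babies[:i] + (l + 1,) + babies[i + 1:]))
    # Chain rule acting on E(t), which contributes a new factor x u'(t).
    children.append((j, babies + (1,)))
    return children


def pedigrees(n):
    """All terms of the n-th derivative of A(t)E(t), one per pedigree."""
    terms = [(0, ())]  # the undifferentiated product A(t)E(t)
    for _ in range(n):
        terms = [child for term in terms for child in differentiate(term)]
    return terms


def combined(n):
    """Collect like terms: the order of the factors x u^(l) does not matter."""
    return Counter((j, tuple(sorted(babies))) for j, babies in pedigrees(n))


pedigree_counts = [len(pedigrees(n)) for n in range(4)]
third = combined(3)
```

Here `pedigree_counts` is 1, 2, 5, 15, and the multiplicities in `third` are 1, 3, 3, 3, 1, 3, 1. For instance `third[(0, (1, 2))]` is 3, the coefficient of {$A(t)xu'(t)xu''(t)E(t)$}.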

At this point, we can start identifying patterns and formulating claims about the terms of the {$n$}th derivative of {$A(t)E(t)$} and the pedigrees of those terms.

A) Each term consists of factors and possibly an integer coefficient.
B) Each term arises from differentiation {$n$} times.
C) In differentiating, we apply the product rule and/or the chain rule.
D) The chain rule is only applied to the mother factor {$E(t)$}, as I call it, and that introduces a new baby factor {$xu'(t)$}. This is the only way that a factor {$x$} is introduced.
E) The product rule does not affect the number of factors in a term. The product rule has us differentiate precisely one factor in a term. Repeated use of the product rule is responsible for the number of times that a factor gets differentiated.
F) Each term is the outcome of a pedigree, the result of a sequence of {$n$} differentiations, where at each stage differentiation acts on precisely one factor in the term. Thus in each term the factors have been differentiated a total of {$n$} times.
G) The chain rule and the product rule do not yield negative coefficients.
H) The same term can have different pedigrees, which is to say, like terms can be combined, yielding positive integer coefficients.
I) Each term is made up of three kinds of factors, which I call the mother factor, the father factor and the baby factors. The mother factor is {$E(t)$}, the father factor is the {$k$}th derivative of {$A(t)$} for some {$k\geq 0$}, which simply equals {$A(t)$} when {$k=0$}. The baby factors are {$x$} times the {$l_j$}th derivative of {$u(t)$} for some {$l_j\geq 1$}. A term has one and only one mother factor, one and only one father factor, and zero or more baby factors.

In searching for a combinatorial interpretation of the {$n$}th derivative, I got about this far. I was afraid that it could take days or weeks for me to figure out what exactly is going on here. I am not actually interested in the math for its own sake. Rather I am interested in Math 4 Wisdom, which is to say, math for the sake of wisdom. I realized that I could take a shortcut, which is to do a search at the On-Line Encyclopedia of Integer Sequences. The sequence that I entered is the number of pedigrees. We started with 1, then 2, then 5, and then 15. When we enter that into the encyclopedia's search box, then the first result it gives is the Bell numbers, 1, 1, 2, 5, 15, 52, 203 and so on, which is the number of ways to partition a set of {$n$} labeled elements. That is the hint we need and that brings us to the peculiar combinatorics of the pointed partitions with which I started this video.

Bijection and induction

What we can do now is to use induction to construct a bijection for each nonnegative integer {$n$}. The bijection will associate the pedigrees of the terms in the {$n$}th derivative of {$A(t)E(t)$} with pointed partitions of the set {$\{1,\dots , n\}$}.

We proceed by stages. Trivially, at stage {$0$}, we have the empty partitioning, which we associate with the {$0$}th derivative of {$A(t)E(t)$}, in other words, with {$A(t)E(t)$}. Next, for the sake of clarity, at stage {$1$}, if the letter {$1$} is in the free space, then we will differentiate the factor {$A(t)$} in {$A(t)E(t)$} yielding the term {$A'(t)E(t)$}. Whereas if the letter {$1$} is in a compartment, then we will differentiate the factor {$E(t)$} in {$A(t)E(t)$} yielding the term {$A(t)xu'(t)E(t)$}. Thus we get the two terms in the {$1$}st derivative of {$A(t)E(t)$}, and more precisely, we get the two pedigrees, one associated with each term.

At any subsequent stage {$n+1$}, let us suppose that we have constructed a bijection between the pointed partitions of the letters {$1,...,n$} and the pedigrees of the terms in the {$n$}th derivative of {$A(t)E(t)$}. Furthermore, let us suppose that by our construction each pedigree results in a term that is the product of three kinds of factors:

the father factor, which is the {$j$}th derivative of {$A(t)$}, where {$j\geq 0$};
the product, possibly empty, of {$m$} baby factors of the form {$x$} times the {$l_i$}th derivative of {$u(t)$}, where {$i$} ranges from {$1$} to {$m$};
the mother factor {$E(t)$}.

More precisely, each pedigree results in a term of the form {$A^{(j)}(t)xu^{(l_1)}(t)xu^{(l_2)}(t)\cdots xu^{(l_m)}(t)E(t)$}, where {$n=j+l_1+l_2+\dots +l_m$}.

Let us suppose that to a pedigree of such a term the bijection we have constructed so far associates a pointed partition such that the base part, what I call the free space, contains {$j$} elements and there are {$m$} additional parts, which I call compartments, and if we order those compartments by their smallest element, then the {$i$}th compartment will have {$l_i$} elements. And let us suppose, vice versa, that given any pointed partition of {$n$} elements, the bijection we have constructed so far likewise associates a pedigree for some term in the {$n$}th derivative, where that term will be the product of the {$j$}th derivative of {$A(t)$} and then {$m$} factors of the form {$x$} times the {$l_i$}th derivative of {$u(t)$} and finally {$E(t)$}.

Broadly speaking, taking the derivative of {$A(t)$} corresponds to inserting an element in the free space, and taking the derivative of some factor {$u(t)$} corresponds to placing an element in the associated compartment, and each such compartment has weight {$x$}. Supposing all of that for stage {$n$}, we can now define how to extend our bijection so that it holds at stage {$n+1$}, and that will not only establish our bijection but also make it most precise and rigorous.

We examine three cases that we considered earlier in this video. Let us suppose we are given the pedigree of a term in the {$n$}th derivative of {$A(t)E(t)$}. Let us take the derivative of that term. Then by the product rule we will be differentiating one of the factors in that term. Each factor will thus yield a new pedigree.

Differentiating the father factor, the {$j$}th derivative of {$A(t)$}, yields the {$j+1$}th derivative of {$A(t)$}, and corresponds to adding the element {$n+1$} to the free space in the associated pointed partition.
Differentiating a baby factor, {$x$} times the {$l_i$}th derivative of {$u(t)$}, yields {$x$} times the {$l_i + 1$}th derivative of {$u(t)$}. This corresponds to adding the element {$n+1$} to the {$i$}th compartment.
Differentiating the mother factor, {$E(t)=e^{xu(t)}$}, yields by the chain rule {$xu'(t)E(t)$}. This corresponds to adding a new compartment with weight {$x$} and a single element {$n+1$}.

And vice versa, suppose that we have a pointed partition of the set of letters {$1,...,n$}. Then we extend this with a letter {$n+1$} by either adding that to the free space, or adding it to one of the compartments, or adding it as the element in a brand new compartment. These three cases correspond to the differentiations of the factors in the term for the associated pedigree.

Thus we see that given a pedigree we have a term which we extend by differentiation, and that term corresponds to a pointed partition which we extend by adding a letter. And given a pointed partition which we extend by adding a letter, we can assign the corresponding term and the factor in it which gets differentiated.

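Here's the subtle part. We need to assign not just the term but the pedigree. Different pedigrees can give the same term. For example, in the second derivative the term {$A'(t)xu'(t)E(t)$} arises from two pedigrees: we can differentiate {$A(t)$} first and {$E(t)$} second, or {$E(t)$} first and {$A(t)$} second.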

In the induction, which pedigree do we choose? We know what pedigree to choose because by induction we already have a bijection that works for stage {$n$}. This part of the proof is nonconstructive. But it lets us see that by induction the construction that is taking us from stage {$n$} to stage {$n+1$} can hold for all stages. Which is to say, the histories match up on both sides. Choosing what factor to differentiate at stage {$n$} corresponds to choosing the part, whether free space or compartment, in which to place the letter {$n$}. So we have actually proved not just that the bijection between pedigrees and pointed partitions holds for each stage {$i$} but that the bijection respects and identifies the constructions on both sides.

Alternatively, we could have proven this all constructively. But that would have required us to develop a more explicit notation. For example, we could use letters {$1,\dots,n$} to indicate the order in which we have differentiated the factors in the term for each pedigree. Then it would be crystal clear how the two sides are saying exactly the same thing. But that would require creating a lot of new symbols and tedious terminology. So this is one trade-off between nonconstructive and constructive approaches to be aware of. In general, this is a big problem in combinatorics, that we have simple things to say but we may need a lot of verbiage. In this sense, combinatorics is anti-mathematical, even while it is arguably the foundation for mathematical constructions.
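The explicit bookkeeping described above is tedious on paper but short in code. The Python sketch below records a pedigree as a list of choices, one per differentiation, and follows it on both sides at once. The encoding of choices as `"A"`, `"E"` or an integer index, and the names `build` and `all_choice_sequences`, are assumptions made for this illustration.

```python
def build(choices):
    """Follow one pedigree on both sides at once.

    Each choice is "A" (differentiate the father factor), "E" (differentiate
    the mother factor) or an integer i (differentiate the i-th baby factor,
    counting from 0 in order of creation).
    """
    j, babies = 0, []             # the term A^(j) * prod(x u^(l_i)) * E
    free, compartments = [], []   # the pointed partition
    for letter, choice in enumerate(choices, start=1):
        if choice == "A":
            j += 1
            free.append(letter)
        elif choice == "E":
            babies.append(1)
            compartments.append([letter])
        else:
            babies[choice] += 1
            compartments[choice].append(letter)
    return (j, babies), (free, compartments)


def all_choice_sequences(n):
    """Every pedigree of length n, as a list of choices."""
    sequences = [([], 0)]  # (choices so far, number of baby factors so far)
    for _ in range(n):
        longer = []
        for choices, m in sequences:
            longer.append((choices + ["A"], m))
            longer.extend((choices + [i], m) for i in range(m))
            longer.append((choices + ["E"], m + 1))
        sequences = longer
    return [choices for choices, _ in sequences]


results = [build(choices) for choices in all_choice_sequences(3)]
```

For {$n=3$} the 15 choice sequences yield 15 different pointed partitions, and in each case the derivative orders {$j, l_1, \dots, l_m$} of the term equal the sizes of the free space and of the compartments, listed in order of their smallest element. For example, `build(["E", "A"])` and `build(["A", "E"])` both give the term {$A'(t)xu'(t)E(t)$} but give the distinct pointed partitions written above as `__2__[1]` and `__1__[2]`.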

At this point, we're at the top of the mountain that we set out to climb, even if we've used induction nonconstructively to drive up most of the way to the top. We have established our bijection but in order to isolate {$S_n(x)$} we still need to set {$t=0$}. From our initial conditions, we know that {$A(0)=1$}, {$u(0)=0$}, {$u'(0)=1$} and {$E(0)=e^{xu(0)}=1$}. But the higher derivatives of {$A(t)$} and {$u(t)$} persist. And the power of {$x$} keeps track of the number of compartments. Finally, we will need to divide by {$n!$} but I will refrain from that to keep the expression simpler.

The upshot is that {$S_n(x)$} does not keep track of situations which yield no information. It never keeps track of the mother function, and it does not keep track of the father function when the free space has no elements, and it does not keep track of compartments if they only have a single element. It does keep track of additional elements in the form of higher derivatives. The {$j$}th derivative of a factor corresponds to the free space or a compartment having {$j$} elements as we have seen.

Thus, in particular, the leading term of {$n!S_n(x)$} is {$x^n$}, which is monic, because there is a single pedigree, a single history, by which there are {$n$} compartments, each with a single element. The coefficient for the {$k$}th power of {$x$} encodes the ways of arriving at {$k$} compartments. And the coefficient of the constant term of {$n!S_n(x)$}, the lowest term, of zero degree, is given by the {$n$}th derivative of {$A(t)$}, which corresponds to free space containing all of the {$n$} elements.

If you've made it this far, then you can see that the result of the calculation is about as easy and simple as it could be. You should be able to write out the terms of the {$n$}th derivative of {$A(t)E(t)$}, either by hand or with a computer program, and then set {$t=0$}. It is simply a matter of listing the pointed partitions, which are basically the partitions of a set, for we simply erase the very first element and thereby distinguish the very first compartment as being free space.
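Such a computer program can be sketched in a few lines of Python. The functions `shapes` and `n_factorial_S` are hypothetical names, and the caller supplies the derivative values {$A^{(j)}(0)$} and {$u^{(l)}(0)$} as functions `dA` and `du`. The sample values at the end, {$A^{(j)}(0)=j!$} and {$u^{(l)}(0)=l!$}, are an assumption for illustration; they correspond to {$A(t)=\frac{1}{1-t}$} and {$u(t)=\frac{t}{1-t}$}.

```python
from math import factorial


def shapes(n):
    """Return (j, sizes) for every pointed partition of {1, ..., n}.

    j is the number of elements in the free space and `sizes` lists the
    sizes of the compartments; each pointed partition appears exactly once.
    """
    stage = [(0, ())]
    for _ in range(n):
        longer = []
        for j, sizes in stage:
            longer.append((j + 1, sizes))                 # into the free space
            for i, l in enumerate(sizes):                 # into a compartment
                longer.append((j, sizes[:i] + (l + 1,) + sizes[i + 1:]))
            longer.append((j, sizes + (1,)))              # new compartment
        stage = longer
    return stage


def n_factorial_S(n, dA, du):
    """Coefficients [c_0, ..., c_n] of n! S_n(x).

    dA(j) should return A^(j)(0) and du(l) should return u^(l)(0).
    """
    coeffs = [0] * (n + 1)
    for j, sizes in shapes(n):
        weight = dA(j)
        for l in sizes:
            weight *= du(l)
        coeffs[len(sizes)] += weight  # one power of x per compartment
    return coeffs


# Sample values (an assumption for illustration): A^(j)(0) = j!, u^(l)(0) = l!.
example = n_factorial_S(3, factorial, factorial)
```

For these sample values `example` is `[6, 18, 9, 1]`, which is to say {$3!S_3(x) = 6 + 18x + 9x^2 + x^3$}. The leading coefficient is always 1 and the constant coefficient is always {$A^{(n)}(0)$}, as described above.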

Significance

In conclusion, I want to share what I learned in discovering and presenting this result, and I want to mention how it relates to my further investigations which will be subjects of future videos.

The terms in the {$n$}th derivative of {$A(t)E(t)$} had three kinds of factors: a father factor, a mother factor and baby factors. This brings to mind the cognitive framework one - all - many which I talked about in my introductory video to Math 4 Wisdom, where I show how it is inherent in the minimization operator {$\mu$}. The father factor is the first compartment, a special compartment, a unique compartment, and in that sense, may be thought of as the "one". The mother factor is the source of all of the other compartments, those that are not special, and in that sense, may be thought of as "all". The baby factors exist in parallel with each other. Each of them, through an act of differentiation, links a template within the mother function with a manifestation outside of the mother function, and so they may be thought of as the "many". So I think of this as an intriguing example of "one, all, many" and it is curious how terms factor accordingly, which for me brings to mind divisions of everything.

Furthermore, in my understanding of the language of wisdom, or wondrous wisdom, as I call it, "one - all - many" is a conception of the threesome, the learning cycle of taking a stand, following through and reflecting. Such a three-cycle is conspicuous in the Jacobi identity for Lie algebras and I think it is plausible that Lie algebras describe the avenues for learning within Lie groups. The generating function for the Sheffer polynomials resembles the exponential expression of Lie groups in terms of Lie algebras, where the function {$u(t)$} would be the element of the Lie algebra. So in my beautiful mind I register this example for future study of the link between one-all-many and the learning cycle.

Philosophically, cognitively, metaphysically, mathematically, this is all evidence of the significance of the pointed partitions, which are just a slight reformulation of the usual partitions of sets, which themselves are one of the very simplest of combinatorial spaces. Given that the number of pointed partitions of {$\{1,\dots ,n\}$} is the same as the number of partitions of {$\{1,\dots ,n+1\}$}, how can it be more natural to use the pointed partitions? Note that a polynomial of degree {$n$} actually has {$n+1$} terms with powers that range from {$x^0$} to {$x^n$}. The counting includes a distinguished lowest term with no power of {$x$}, or as we are saying, with no explicit compartment. The peculiarity of the pointed partitions is that we have a distinguished part, the factor {$A^{(k)}(t)$}, which we can think of as free space, without compartment, without a weight {$x$}. We can think of this distinguished part as introducing subjectivity into objectivity, in that a subjective account is accorded special status amongst objective accounts. Indeed, the free space can be empty, with no elements, and analogously, the factor {$A(t)$} need not be differentiated. Whereas any baby factor must be differentiated at least once, and analogously, its compartment must have at least one element. Otherwise we would be dealing with infinitely many empty compartments.

Recall that the reason that the initial part was distinguished was that it provided only one choice where to place the first element. This is an example of choosing one out of one, which qualitatively is very different from choosing one out of several. I look forward to making a video explaining how this distinction is relevant in the {$q$}-analogue of the binomial theorem. When we work over a finite field {$F_q$}, we have a formula for counting the subspaces of a vector space, and when we let {$q$} go to one, then this degenerates into the usual binomial formula for counting the subsets of a set, which is qualitatively of a simpler nature. This degeneracy arises when the choice of one out of several possible elements for a basis becomes the choice of one out of one. And this yields an insight into the nature of the mythical field with one element, which I imagine is relevant for modeling God and God's situation and their inherent contradictions.
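The degeneration at {$q=1$} can be seen numerically. The sketch below is an illustration rather than the derivation promised for a future video. It computes the Gaussian binomial coefficient, which counts the {$k$}-dimensional subspaces of an {$n$}-dimensional vector space over {$F_q$}, using the standard recurrence {$\binom{n}{k}_q = \binom{n-1}{k-1}_q + q^k\binom{n-1}{k}_q$}. The function name `gaussian_binomial` is a hypothetical choice.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def gaussian_binomial(n, k, q):
    """Gaussian binomial coefficient [n choose k]_q via a Pascal-like recurrence."""
    if k < 0 or k > n:
        return 0
    if k == 0 or k == n:
        return 1
    # At q = 1 the factor q**k is 1 and this is exactly Pascal's rule.
    return gaussian_binomial(n - 1, k - 1, q) + q ** k * gaussian_binomial(n - 1, k, q)


# Subspace counts over F_2 next to subset counts (the q = 1 case).
for k in range(5):
    print(k, gaussian_binomial(4, k, 2), gaussian_binomial(4, k, 1), comb(4, k))
```

Using the recurrence instead of the quotient formula avoids the {$0/0$} that the quotient produces at {$q=1$}, so the same function covers both the finite field case and the limiting case of subsets.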

More and more, I have been finding hints that the concept of choice, as in probability, is key to how math expresses the divisions of everything, which in wondrous wisdom are the holistic cognitive frameworks that model our states of mind, or equivalently, our global workspace in our brains. I note that in this video the multinomial coefficients came up in a very significant way when we switched over from ordinary generating functions to exponential generating functions. That is what allowed us to reinterpret the coefficients {$u_i$} as {$\frac{u^{(i)}(0)}{i!}$}, which is to say, each coefficient came to represent a subset of elements distinct from the subsets of elements represented by the other coefficients. And then the fact that the coefficients {$u_1,\dots,u_m$} were thereby all given distinct meaning, and no two had the same meaning, allowed us to interpret {$\frac{1}{m!}$} as selecting for them a single, definite order. This is another example to keep in mind where choice and probability are at play.

The multinomial coefficients also speak to the importance of teasing out the semantics as to which mathematical expressions are relevant for wisdom and which are not. In future videos I will be showing how the combinatorial space constructed by the Sheffer polynomials {$S_n(x)$} gets constrained and defined further when they are orthogonal. I am personally finding it a challenge to deal with the many versions that a sequence {$P_n(x)$} of orthogonal polynomials can take. Indeed, the sequence {$c_nP_n(x)$}, where the {$c_n$} are nonzero constants, will likewise consist of orthogonal polynomials. The consequence is that my quest for combinatorial, cognitive, metaphysical meaning may often lead me to focus on variants whose coefficients and generating functions are different from the ones which mathematicians arrived at for historical reasons.

I am investigating the fivefold classification of orthogonal Sheffer polynomials, which I am relating to the fivesome, the division of everything into five perspectives for decision making, which we conceive either spatially or temporally. In probability and statistics, there is a related classification of probability distributions, namely of the natural exponential families with quadratic variance functions, which are central to the generalized linear model, which is a generalization of ordinary linear regression. There is a similar classification by Pearson of continuous probability distributions. This all brings us again to the concept of choice. My working hypothesis is that the orthogonal Sheffer polynomials give rise to precisely those solutions of the Schroedinger equation which have physical expression. Each of the five kinds is orthogonal by way of its own space-time wrapper and so I expect there are five kinds of physical space for measurements and for causal relations. But there is one underlying discrete combinatorial space, as established by the generating function for Sheffer polynomials, which is what we considered today.

In solutions of the Schroedinger equation, of special importance are the powers of {$x$}, which indicate the number of times the graph of a polynomial crosses the {$x$}-axis, and thus the variability, which can relate to the energy level, and is given by the number of compartments in the combinatorial space. It is interesting that the breakdown of the physical space into regions will be given by the roots of the polynomials, whereas the nature of the combinatorial space is encoded by the coefficients of the polynomial.

In preparing this video, I was struck by the importance of combinatorial histories, what I called pedigrees. We can arrive at the same term following different combinatorial paths. This got me thinking about the Fundamental Theorem of Covering Spaces, a key result in algebraic topology, which I have been studying at the New York Category Theory Meetups led by Wenbo Gao. Given a topological space X, which typically may have loops, we can construct its universal covering space as consisting of all paths through X where we unfold loops and rename points so that there is basically one way to get from one point to another. That is very much the case in our combinatorial space, where we can think of a term in the {$k$}th derivative of {$A(t)E(t)$} as a combinatorial object in its own right but we can also think of it as composed of pedigrees, which is to say, different combinatorial paths, different derivations, different combinatorial histories.

In general, this suggests a picture of combinatorial objects as living within a universal covering space that can be folded up in a variety of ways to yield simpler objects, which identify similar objects as being the same, which is to say, the combinatorial objects can be counted. Indeed, the kinds of universal covering spaces which express combinatorial histories may be very restricted. A knowledge of such restrictions might provide a map of how all of mathematics unfolds. I have been watching online the lectures by John Baez, my favorite math educator, who is visiting Tom Leinster at the University of Edinburgh. His lecture about Lie theory made me realize how compact Lie groups, which are folded up along every dimension, are related to their unfolded versions, their universal coverings, by way of root lattices. These root lattices and their root systems are highly constrained, yielding four classical families of Lie groups, and a handful of exceptions. I know from my own investigations that each of the classical root systems establishes in its own way the duality inherent in counting forwards and counting backwards. That will be the subject of a future video. I appreciate your help in collecting examples of combinatorial histories and related algebraic structures. For example, the evolution of Young diagrams is captured by the representations of the symmetric group and various classical groups, which is another subject of John Baez's lectures.

I think that this combinatorial world within the coefficients of the polynomials could encode all manner of evolving information. It's a world that could be inhabited by cellular automata as in Stephen Wolfram's A New Kind of Science. It would be fantastic to think through how the learning cycle in the Jacobi identity could model the culling that is needed for evolutionary models. My colleague John Harland thinks a lot about an evolutionary basis for physics and so that's on my mind.

If you're interested in such research, then please leave comments, contact me, and sign up for the Math 4 Wisdom discussion group. This bijection which I have shared is my own original research and I have not yet encountered it in the existing literature, although it seems that Tian-Xiao He gave a more abstract talk about this in 2006. I will mention here that the Bell number {$B_n$} is a sum of Stirling numbers of the second kind, which count the number of ways of partitioning a set of size {$n$} into {$k$} parts, which in our context is the number of pedigrees associated to the power {$x^k$}. Our vision of Sheffer polynomials as space builders should relate to existing combinatorics of Stirling numbers and Stirling polynomials.
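The relation between Bell numbers and Stirling numbers of the second kind can be checked with a short Python sketch. It uses the standard recurrence {$S(n,k) = k\,S(n-1,k) + S(n-1,k-1)$}. The last function is an assumption for illustration only: it supposes that a pointed partition of {$\{1,\dots ,n\}$} with {$k$} compartments is a possibly empty free space plus a partition of the remaining elements into {$k$} non-empty parts. All function names are hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind: partitions of an n-set into k non-empty parts."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # The last element either joins one of k existing parts or forms a new part.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def bell(n):
    """Bell number B_n as a sum of Stirling numbers of the second kind."""
    return sum(stirling2(n, k) for k in range(n + 1))


def pointed_with_k_compartments(n, k):
    """Assumed count: choose j elements for the free space, split the rest into k parts."""
    return sum(comb(n, j) * stirling2(n - j, k) for j in range(n + 1))


print([bell(n) for n in range(6)])
print([pointed_with_k_compartments(3, k) for k in range(4)])
print([stirling2(4, k + 1) for k in range(4)])
```

Under the stated assumption, the count with {$k$} compartments agrees with {$S(n+1,k+1)$}, which is consistent with the total number of pointed partitions of {$\{1,\dots ,n\}$} being {$B_{n+1}$}.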

I am further investigating how the combinatorics of the orthogonal Sheffer polynomials expresses the fivesome of wondrous wisdom, the framework for double causality, whereby "every effect has had its cause, but not every cause has had its effects, and there is a critical point for deciding". At my website Math4Wisdom.com you can find my research notes and links to resources that I have been studying, including the original paper by Isador Sheffer, the superb combinatorics by Dongsu Kim and Jiang Zeng, much appreciated derivations by Daniel Galiffa and Tanya Riston, enlightening videos by Xavier Viennot, and the useful book by Theodore Chihara. I am grateful to Tom Copeland for his blog posts and his helpful answers at MathOverflow.

Thank you for supporting me through Patreon. Please like, subscribe, leave comments and join our community. I am Andrius Kulikauskas and this is Math 4 Wisdom.