Möbius Inversion and Beyond

Cover image
🏛 A Project Cauchy Op-ed

Named after French mathematician Augustin-Louis Cauchy, Project Cauchy column is where I invite some of the HFI Programming club members to provide neat proofs or explanations about some number theory puzzles. I highly suggest that you read these articles with a pencil and paper so you can sketch things out and scribble solutions to exercises as you come across them. This time, we have Travor Liu, one of the great mass communicators of Maths in our school. If you want good, honest worked examples of practice problems, especially for group theory, it’s hard to find a deeper well of examples than what Travor provides. In this article, he will elegantly lead us to the algebraic properties of multiplicative functions, using Möbius inversion formula and Dirichlet convolutions.

In number theory, Möbius inversion is a common technique to study the properties of arithmetic functions (i.e. those that map N+\mathbb N^+ to C\mathbb C), and all of these brilliant things are derived from the following formula:

dnμ(d)=1n={1n=10otherwise(1)\sum_{d|n}\mu(d)=\left\lfloor\frac1n\right\rfloor= \begin{cases} 1 & n=1 \\ 0 & \text{otherwise} \end{cases} \tag1

To understand this formula, let's first understand what it says:

dn\sum_{d|n} Summation over Divisors

The first part of (1) is the summation symbol: dn\sum_{d|n}. Instead of summing over all positive integers within nn, it is summing over the divisors of nn. For instance, if n=10n=10, then dn\sum_{d|n} sums over n=1,2,5,10n=1,2,5,10. Pedantically, we write

d10f(d)=f(1)+f(2)+f(5)+f(10)\sum_{d|10}f(d)=f(1)+f(2)+f(5)+f(10)

To enhance your understanding of this operator, try these exercises:

  1. Calculate d15d2\sum_{d|15}d^2
  2. Explain the meaning of dn1\sum_{d|n}1

Formulating μ(n)\mu(n), the Möbius Function

Usually, the Möbius function is defined as

μ(n)={1n has even distinct prime factors1n has odd distinct prime factors0n is not square-free\mu(n)= \begin{cases} 1 & \text{$n$ has even distinct prime factors} \\ -1 & \text{$n$ has odd distinct prime factors} \\ 0 & \text{$n$ is not square-free} \end{cases}

This standard definition may appear to be strange: why would people care whether some number is square-free or not? To address this, let's turn to a totally different perspective: to expand the following product that runs over all prime numbers:

F(s)=p prime(1ps)F(s)=\prod_{p\text{ prime}}(1-p^{-s})

By some combinatoric skills, we are able to expand it like this:

F(s)=(12s)(131)=112s13s15s+12s3s17s+13s5s7s+n=1a(n)ns\begin{aligned} F(s) &=(1-2^{-s})(1-3^{-1})\cdots \\ &=1-{1\over2^s}-{1\over3^s}-{1\over5^s}+{1\over2^s\cdot3^s}-{1\over7^s} \\ &+\cdots-{1\over3^s\cdot5^s\cdot7^s}+\cdots \\ &\triangleq\sum_{n=1}^\infty{a(n)\over n^s} \end{aligned}

By observing the expansion, we discover that a(n)a(n) must satisfy

  • when the nn is composed of odd number of prime factors a(n)=1a(n)=-1
  • when the nn consists of even number of prime factors a(n)=1a(n)=1.
  • Because each prime only appears only once in the product, a(n)=0a(n)=0 for all nn that contains square factors

As a result, a(n)a(n) is the Möbius function μ(n)\mu(n).

The Formula

With our understanding of the symbols, we can now be capable of understanding the implication of (1). That is, (1) declares that the summation of the Möbius function over divisors of some certain number nn is one for n=1n=1 and zero for all n1n\ne1. Obviously, when n=1n=1, we have

d1μ(d)=μ(1)=1\sum_{d|1}\mu(d)=\mu(1)=1

Did you understand the definition of μ(n)\mu(n)? Try these problems!

  1. μ(20)\mu(20)
  2. μ(5)μ(2)\mu(5)\mu(2)
  3. μ(10)\mu(10)

Properties of μ(n)\mu(n)

Generally, we say some arithmetic function f(n)f(n) to be multiplicative when for all coprime positive integers aa and bb, f(ab)=f(a)f(b)f(ab)=f(a)f(b). Now, let's show the following fact of μ(n)\mu(n):

Theorem: μ(n)\mu(n) is multiplicative

Proof. For coprime positive integers aa and bb, we may divide this proof into two situations:

  1. aa and/or bb contains square factors, their product abab would also have square factors. As a result, μ(a)μ(b)=μ(ab)=0\mu(a)\mu(b)=\mu(ab)=0
  2. For aa and bb being square free, let's denote rnr_n be the number of prime factors in nn, so we have
μ(a)μ(b)=(1)ra+rb=μ(ab)\mu(a)\mu(b)=(-1)^{r_a+r_b}=\mu(ab)

which completes the proof. \square

With these tools being prepared, we can delve into proving (1).

Proof of (1)

Let n=abn=ab where aa and bb are coprime positive integers, then

dnμ(d)=d1a,d2bμ(d1d2)=d1aμ(d1)d2bμ(d2)\sum_{d|n}\mu(d)=\sum_{d_1|a,d_2|b}\mu(d_1d_2) =\sum_{d_1|a}\mu(d_1)\sum_{d_2|b}\mu(d_2)

Now, let's plug prime powers n=pkn=p^k into (1), so we have

dpkμ(d)=r=0kμ(pr)=μ(1)+μ(p)=11=0\sum_{d|p^k}\mu(d)=\sum_{r=0}^k\mu(p^r)=\mu(1)+\mu(p)=1-1=0

Due to the fact that dnμ(d)\sum_{d|n}\mu(d) is multiplicative, we conclude (1) is true.

Application of Möbius Inversion: Euler's Totient Function φ(n)\varphi(n)

In fact, (1) can help us find a definition for Euler's totient function φ(n)\varphi(n), i.e. number of positive integers within nn that are coprime to nn:

First, we write down φ(n)\varphi(n) in terms of summation:

φ(n)=kngcd(k,n)=11\varphi(n)=\sum_{k\le n\atop\gcd(k,n)=1}1

Then, using the identity given by (1)(1), we have

φ(n)=kndgcd(k,n)μ(d)=kndn[dk]μ(d)=dnμ(d)kn,k=qd=dnμ(d)qn/d1\begin{aligned} \varphi(n) &=\sum_{k\le n}\sum_{d|\gcd(k,n)}\mu(d)=\sum_{k\le n}\sum_{d|n}[d|k]\mu(d) \\ &=\sum_{d|n}\mu(d)\sum_{k\le n,k=qd}=\sum_{d|n}\mu(d)\sum_{q\le n/d}1 \\ \end{aligned}

Eventually, we obtain the formula for Euler's totient function:

φ(n)=dnμ(d)nd(2)\varphi(n)=\sum_{d|n}\mu(d)\frac nd\tag2

As we can see, (1) actually helps us give definition to other arithmetic functions. Now, it is your job to discover the properties of this function:

Show that φ(n)\varphi(n) is multiplicative and, in addition, φ(n)\varphi(n) can be expressed by

φ(n)=np primepn(11p)\displaystyle{\varphi(n)=n\prod_{p\text{ prime}\atop p|n}\left(1-\frac1p\right)}

Divisor sum of Euler's totient function

If we were to sum φ(n)\varphi(n) over divisors of nn, we could magically obtain nn:

dnφ(d)=dnφ(nd)=dnkn/dkμ(ndk)=dknkμ(ndk)=knkdn/kμ(d)=knkkn=n\begin{aligned} \sum_{d|n}\varphi(d) &=\sum_{d|n}\varphi\left(\frac nd\right) =\sum_{d|n}\sum_{k|n/d}k\mu\left(n\over dk\right) \\ &=\sum_{dk|n}k\mu\left(n\over dk\right) =\sum_{k|n}k\sum_{d|n/k}\mu(d) \\ &=\sum_{k|n}k\left\lfloor\frac kn\right\rfloor=n \end{aligned}

This identity can also be seen by listing fractions. For instance let's consider the case for n=20n=20

120,220,320,420,,1820,1920,2020{1\over20},{2\over20},{3\over20},{4\over20},\dots, {18\over20},{19\over20},{20\over20}

In total, there are nn fractions. If we were to simplify these fractions, we get

120,110,320,15,,910,1920,11{1\over20},{1\over10},{3\over20},{1\over5},\dots, {9\over10},{19\over20},\frac11

Particularly, the denominators in these simplified fractions are always the divisor of nn. Moreover, for each dnd|n there are exactly φ(d)\varphi(d) simplified fractions with denominator dd. As a result, we can also observe that

dnφ(d)=n(3)\sum_{d|n}\varphi(d)=n\tag3

Dirichlet Convolution

If we juxtapose (2) and (3), we can see that φ(n)\varphi(n) and nn are closely related to each other, particularly if we define (4) as Dirichlet convolution, then we can say that φ(n)\varphi(n) can be obtained by convolving Möbius function with nn. Similarly, nn can be obtained by convolving φ(n)\varphi(n) with 11.

(fg)(n)dnf(d)g(nd)=dng(d)f(nd)(4)(f*g)(n)\triangleq\sum_{d|n}f(d)g\left(\frac nd\right) =\sum_{d|n}g(d)f\left(\frac nd\right)\tag4
  • Commutativity and associativity: Obviously this operation commutative. Moreover, it can be easily verified that Dirichlet convolution is also associative using similar techniques to prove (3).
  • Identity element: There is also an identity function for Dirichlet convolution. That is, Dirichlet convolution between any arithmetic function and 1/n\lfloor1/n\rfloor gives the original function:
dnf(d)dn=f(n)\sum_{d|n}f(d)\left\lfloor\frac dn\right\rfloor=f(n)
  • Dirichlet inverse: If g(n)g(n) and f(n)f(n) are Dirichlet inverse to each other, then
dnf(d)g(nd)=1n\sum_{d|n}f(d)g\left(\frac nd\right)=\left\lfloor\frac1n\right\rfloor

Although μ(n)\mu(n) and 11 are inverses to each other, not every arithmetic function has Dirichlet inverse. Hopefully, the following theorem helps us determine whether some arithmetic function is Dirichlet-invertible or not:

Theorem: An arithmetic function f(n)f(n) has Dirichlet inverse if and only if f(1)0f(1)\ne0

Proof. For convenience, suppose f(n)f(n) has a Dirichlet inverse g(n)g(n), so we need to ensure

dnf(d)g(nd)=1n\sum_{d|n}f(d)g\left(\frac nd\right)=\left\lfloor\frac1n\right\rfloor

For n=1n=1, we have g(1)=1/f(1)g(1)=1/f(1), so f(1)f(1) must be non-zero in order for its Dirichlet inverse to exist. In addition, for n>1n>1 we have

0=dnf(d)g(nd)0=dn,d<nf(d)g(nd)+f(1)g(n)g(n)=1f(1)dn,d<nf(d)g(nd)\begin{aligned} 0&=\sum_{d|n}f(d)g\left(\frac nd\right) \\ 0&=\sum_{d|n,d<n}f(d)g\left(\frac nd\right)+f(1)g(n) \\ g(n)&=-{1\over f(1)}\sum_{d|n,d<n}f(d)g\left(\frac nd\right) \end{aligned}

which also implies the theorem. \square

Algebraic Properties of Multiplicative Functions

Let GG be the set containing all multiplicative functions and * be the Dirichlet convolution operator, then we can verify that

  • For all f,gGf,g\in G, we have fg=gfGf*g=g*f\in G
  • For all f,g,hGf,g,h\in G, (fg)h=f(gh)(f*g)*h=f*(g*h)
  • For all fGf\in G, (f1/n)(n)=f(n)(f*\lfloor1/n\rfloor)(n)=f(n)
  • For all fGf\in G, there exists gGg\in G such that (fg)(n)=1/n(f*g)(n)=\lfloor1/n\rfloor

In order for the last condition to hold, readers could consider proving that every multiplicative function f(n)f(n) satisfies f(1)=10f(1)=1\ne0.

As a result, we conclude that all multiplicative functions form an abelian group under Dirichlet convolution.

Conclusion

In a nutshell, we begin our discussion with the explanation and proof Möbius inversion formula as in (1), and then we present Dirichlet convolution, a generalization of the sum-of-divisor operation. At last, we discover an algebraic property in multiplicative functions: that is, all multiplicative functions form an abelian group under Dirichlet convolution.

Norm: A Brief Introduction to the "Size" of Vectors in Machine Learning ▶