Here we will implement an algorithm for computing the 'inverse' of the Euler totient function $\phi(n)$. $\phi(n)$ is defined as the number of positive integers not exceeding $n$ that are relatively prime to $n$. The inverse function for a given $m$ returns the list of all the numbers $n$ such that $\phi(n)=m$.
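To make the definition concrete, here is a minimal sketch that counts coprime integers directly and inverts the function by exhaustive search. The names phiByDefinition and invphiBrute and the search limit nmax are illustrative assumptions, not part of the original program.

```mathematica
(* Hypothetical helper: count the k in 1..n with GCD[k, n] == 1 *)
phiByDefinition[n_Integer?Positive] := Count[Range[n], k_ /; GCD[k, n] == 1]

(* Hypothetical helper: brute-force inverse over a caller-supplied range 1..nmax *)
invphiBrute[m_Integer?Positive, nmax_Integer] :=
  Select[Range[nmax], phiByDefinition[#] == m &]

invphiBrute[4, 20]  (* {5, 8, 10, 12} *)
```

This approach is only practical for tiny arguments, since it needs an upper bound on the solutions and tests every integer below it.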

How can all the solutions to the equation $\phi(n)=m$ be found? Naturally, let's start by looking at the decomposition of $m$ and $n$ into primes. Let

$$n=\prod_{k=1}^{r}p_k^{a_k},\qquad m=\prod_{k=1}^{s}q_k^{b_k},\qquad \phi(n)=m.$$

The Euler function can be expressed explicitly via the prime factors of $n$:

$$\phi(n)=\prod_{k=1}^{r}p_k^{a_k-1}(p_k-1)=m=\prod_{k=1}^{s}q_k^{b_k}\qquad(1)$$
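Formula (1) can be checked numerically. The sketch below evaluates the product over the prime factorization; the name phiFromFactors is a hypothetical helper, and the definition is restricted to n > 1 because FactorInteger[1] returns {{1, 1}}.

```mathematica
(* Hypothetical helper: formula (1), the product of p^(a-1) (p-1) over the factorization of n *)
phiFromFactors[n_Integer /; n > 1] :=
  Times @@ ((#1^(#2 - 1) (#1 - 1)) & @@@ FactorInteger[n])

phiFromFactors[360]  (* 360 = 2^3 3^2 5, so (2^2 * 1) (3 * 2) (1 * 4) = 96 *)
```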

It immediately follows from (1) that $p_k^{a_k-1}$ and $(p_k-1)$ cannot contain any prime factors that are not in the set $Q=\{q_k\}_{k=1}^{s}$. Hence we can rewrite their decompositions thus:

$$\forall k=1,\dots,r:\quad a_k=1\ \lor\ \exists\, q_i=p_k$$

and

$$\forall k=1,\dots,r:\quad p_k=2\ \lor\ p_k=\prod_{i=1}^{t_k}q_{k,i}^{b_{k,i}}+1$$

where each $q_{k,i}$ belongs to $Q$. Allowing the powers $b_{k,i}$ to be zero, we can rewrite the last formula including all the $q_i$ in the product:

$$\forall k=1,\dots,r:\quad p_k=\prod_{i=1}^{s}q_i^{b_{k,i}}+1$$

Another fact that follows from (1) is that $\prod_{k=1}^{r}(p_k-1)$ divides $m$. Therefore, for each $i,k$ we have $b_{k,i}\le b_i$. This means that we can find all the $p_k$ that can possibly occur in $n$ by simply trying all possible products of $q_i$ raised to any power not greater than $b_i$. And by transforming (1) we get

$$\phi(n)=\prod_{k=1}^{r}p_k^{a_k}\,\frac{p_k-1}{p_k}=n\prod_{k=1}^{r}\left(1-\frac{1}{p_k}\right)=m$$

$$n=m\prod_{k=1}^{r}\left(1-\frac{1}{p_k}\right)^{-1}\qquad(2)$$

so we can get all the solutions $n$ by substituting all possible combinations of $p_k$ into (2). Not all $n$ that we get this way will satisfy the equation $\phi(n)=m$, but right now it is important to us that no solutions will be lost.
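As a small illustration of the two steps above, the following sketch lists the candidate primes for a given m and applies formula (2) to a chosen set of primes. The names candidatePrimes and nFromPrimes are hypothetical. Using Divisors[m] + 1 is an assumption that is equivalent to trying all products of the prime factors of m with exponents bounded by their powers in m.

```mathematica
(* Hypothetical helper: primes p such that p - 1 divides m *)
candidatePrimes[m_Integer?Positive] := Select[Divisors[m] + 1, PrimeQ]

(* Hypothetical helper: formula (2), n = m * product of p/(p - 1) over the chosen primes *)
nFromPrimes[m_, primes_List] := m Times @@ (primes/(primes - 1))

candidatePrimes[12]      (* {2, 3, 5, 7, 13} *)
nFromPrimes[12, {3, 7}]  (* 21, and EulerPhi[21] is 12 *)
nFromPrimes[12, {5}]     (* 15, but EulerPhi[15] is 8, so this is not a solution *)
```

The last line shows why the values produced by (2) still have to be filtered: the formula loses no solutions, but it also produces numbers that are not solutions.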

Another simple way to find $\phi^{-1}(m)$ would be to estimate the range containing all the solutions. First, $\left(1-\frac{1}{p_k}\right)^{-1}>1$, and second, if $p_i<p_j$, then $\left(1-\frac{1}{p_i}\right)^{-1}>\left(1-\frac{1}{p_j}\right)^{-1}$. If the total number of $p_k$ is $R$ and they are sorted from smallest to largest, $p_1<p_2<\dots<p_R$, then

$$m\left(1-\frac{1}{p_R}\right)^{-1}\le n\le m\prod_{k=1}^{R}\left(1-\frac{1}{p_k}\right)^{-1}$$

Now we need to estimate the values of $p_k$. The largest is not greater than $(m+1)$ because each $(p_k-1)$ divides $m$ (see above). And if we denote the $k$th prime in the row of natural numbers as $\pi_k$, then $p_k\ge\pi_k$. Additionally, for all $k$ this inequality holds: $\pi_k>k\ln(k)$. For $k$ equal to one the right side is zero, so we use this inequality for $k\ge 2$ (and explicitly substitute $\pi_1=2$), and get

$$m\prod_{k=1}^{R}\left(1-\frac{1}{p_k}\right)^{-1}\le m\prod_{k=1}^{R}\left(1-\frac{1}{\pi_k}\right)^{-1}\le 2m\prod_{k=2}^{R}\left(1-\frac{1}{k\ln(k)}\right)^{-1}$$

The constant $R$ is not greater than $\chi(m)$, the number of divisors of $m$ (again, because $m$ is divisible by each $(p_k-1)$), and finally we get

$$m+1\le n\le 2m\prod_{k=2}^{\chi(m)}\left(1-\frac{1}{k\ln(k)}\right)^{-1}$$

invphidemo1[m_] := Module[{Lp, Lq, Lb, Lpow, Lfact, n},
  {Lq, Lb} = Transpose[FactorInteger[m]];
  Lpow = MapThread[Table[#1^i, {i, 0, #2}] &, {Lq, Lb}];
  Lp = Outer[Times, Sequence @@ Lpow];
  Lp = Select[Flatten[Lp] + 1, PrimeQ];
  Lfact = Table[(Lp[[i]]/(Lp[[i]] - 1))^j, {i, Length[Lp]}, {j, 0, 1}];
  n = m Outer[Times, Sequence @@ Lfact];
  Select[Flatten[n], EulerPhi[#] == m &]]

invphidemo2[m_] := Select[
  Range[m + 1, 2 m Product[(1 - 1/(k Log[k]))^-1, {k, 2, DivisorSigma[0, m]}]],
  EulerPhi[#] == m &]

invphidemo1[12]
invphidemo2[12]

{13,21,26,28,36,42}

Of course, these are bad solutions. They are very slow, and the first one requires prodigious amounts of memory. The reason is that if the powers $b_k$ are large, the set of $p_k$ can be many times larger than the set $Q$, and creating the table of all the possible products of $p_k$ is nearly impossible. The function invphi will still involve a search for all combinations of $p_k$ that yield solutions, but our goal is to make this search more efficient.

Let's denote the set of all the $p_k$ that can possibly occur in a solution as $P=\{p_k\}_{k=1}^{R}$. We want to create a list $L_r=\{p_{i_1},p_{i_2},\dots,p_{i_r}\}$, with elements from $P$, corresponding to one of the solutions $n$. This means that we'll get a solution by applying formula (2) to $L_r$. Note that $L_r$ denotes the list with $r$ elements and not the $r$th element. This notation will be handy for us because $r$ is different for each $n$ and is not known in advance. Suppose that at some step of the algorithm we already have a list $L_t$ of $t$ elements. Then the problem is reduced to deciding which $p_k$ can be added to $L_t$ and still yield a valid solution, and which $p_k$ needn't be tried, being 'incompatible' with those $p_k$ that are already in the list.

First, as was already shown, the product $\prod_{k=1}^{r}(p_{i_k}-1)$ divides $m$, so their quotient $d_r=m\prod_{k=1}^{r}(p_{i_k}-1)^{-1}$ is an integer. Therefore, before adding $p_k$ to $L_t$ we should check that the resulting quotient $d_{t+1}$ is an integer, and if it's not, $p_k$ is not added.
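The integrality check can be sketched in a couple of lines. The names quotient and canAddQ are hypothetical helpers introduced only for this illustration; the real program keeps the quotient incrementally instead of recomputing it.

```mathematica
(* Hypothetical helper: the quotient d = m / product of (p - 1) over the listed primes *)
quotient[m_, primes_List] := m/(Times @@ (primes - 1))

(* Hypothetical helper: may prime p be appended to the current list? *)
canAddQ[m_, primes_List, p_] := IntegerQ[quotient[m, Append[primes, p]]]

quotient[12, {3, 7}]  (* 12/(2*6) = 1 *)
canAddQ[12, {3}, 7]   (* True *)
canAddQ[12, {3}, 5]   (* False, since 12/(2*4) = 3/2 *)
```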

Second, $d_r$ contains only prime factors from $Q$ (because $d_r$ is $m$ divided by something). But by formula (1),

$$d_r=\prod_{k=1}^{r}p_{i_k}^{a_{i_k}-1}$$

It follows that only those primes can occur in $d_r$ which belong to $P\cap Q$ and are also members of $L_r$. So on the step where we have $t$ elements we should check if the 'current' quotient $d_t$ contains a factor $q_k$ which is not a member of $L_t$. If such a $q_k$ exists, then it cannot appear in the 'final' quotient $d_r$ unless it belongs to $L_r$. Therefore, if we want to arrive at a solution, we must either add $q_k$ to $L_t$ (possible only if $q_k$ belongs to $P$) or eliminate $q_k$ from the quotient on the next steps. The elimination can be done by selecting $p_{i_{t+1}}$ such that $(p_{i_{t+1}}-1)$ is divisible by $q_k$, so in the next quotient $d_{t+1}=\dfrac{d_t}{p_{i_{t+1}}-1}$ the power to which $q_k$ appears will be lowered (though $q_k$ may still be present in $d_{t+1}$). The benefit for the algorithm is that instead of trying to add all the possible $p_k$ that are not in the list yet, we can find a much smaller subset of $P$ such that at least one of its elements must necessarily be appended to $L_t$ in order to eliminate a $q_k$. Then we may try building new lists from $L_t$ by adding only $p_k$ from that subset first (adding them separately, thus creating several new 'candidates' from $L_t$). This way we can significantly reduce the amount of search without losing any of the solutions.

Another important consequence is that if we want to know whether $L_t$ corresponds to a valid solution or not, it can be checked without computing EulerPhi. The criterion of its validity is exactly the condition discussed above, namely that $d_t$ can contain only those $q_k$ that appear in $L_t$.
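The validity criterion can be tried out on the example m = 12 with candidate primes {2, 3, 5, 7, 13}. The following is a sketch only: validListQ and solutionFromList are hypothetical names, and enumerating all subsets is used here just to exercise the criterion, not as the search strategy of invphi. The extra {1} in Complement is there because FactorInteger[1] returns {{1, 1}}.

```mathematica
(* Hypothetical helper: the quotient d must be an integer whose prime factors all lie in the list *)
validListQ[m_, primes_List] := Module[{d = m/(Times @@ (primes - 1))},
  IntegerQ[d] && Complement[FactorInteger[d][[All, 1]], primes, {1}] === {}]

(* Hypothetical helper: formula (2) applied to a list of primes *)
solutionFromList[m_, primes_List] := m Times @@ (primes/(primes - 1))

Sort[solutionFromList[12, #] & /@
  Select[Subsets[{2, 3, 5, 7, 13}], validListQ[12, #] &]]
(* {13, 21, 26, 28, 36, 42} *)
```

No call to EulerPhi is needed; only the comparatively small quotient is factored.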

The program invphi first finds $P$. $P$ and $Q$ are rearranged so that all their common elements, the number of which is denoted $r_0$, come at the beginning and have the same positions in both lists. The main structure wrk is a list of solution candidates. Each element of wrk has in turn two elements. The first one defines $L_t$: it contains an indicator for each $p_k$ from $P$ showing whether that $p_k$ is already in $L_t$, can still be added to it, or cannot be added to it (1, 0, -1 respectively). The second element is $d_t$. On each step we check if there are any factors $q_k$ in the quotient $d_t$ that should be eliminated. If there are, then we see how many $p_i$ can reduce the power to which $q_k$ occurs in $d_t$, i.e. how many $p_i$ have the property that $(p_i-1)$ is divisible by $q_k$. (As discussed above, another possibility is $p_i=q_k$.) We always select the $q_k$ for which the number of those $p_i$ is the smallest, thus trying to minimize the number of new candidates that will be added. If no $q_k$ to be eliminated is found, then each member of $P$ that is allowed for the current $L_t$ is added to it in turn, giving one new candidate per member. For each new element of wrk thus created we find the new quotient $d_{t+1}$, and mark as 'incompatible' those $p_k$ that would make $d_{t+2}$ fractional. Also, if we are adding several $p_k$ to $L_t$, say $p_1$ and $p_2$, creating two new lists, $L_{t+1}^{(1)}=\mathrm{Append}[L_t,p_1]$ and $L_{t+1}^{(2)}=\mathrm{Append}[L_t,p_2]$, then in $L_{t+1}^{(2)}$ the element $p_1$ is marked as 'incompatible' too, in order to avoid possible repetitions of the same sets of $p_k$ later. When nothing further can be added to wrk, the final list of solutions is generated by selecting those elements from wrk that yield correct values of $n$. This is done in a similar way: we consider the corresponding quotient, and if no $q_k$ needs to be eliminated, then we have a solution.
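For orientation, here is a deliberately simplified depth-first sketch of the same search. It is not the invphi program: it uses only the integrality test for pruning, it does not choose which factor to eliminate first, and it visits primes in increasing index order to avoid repeated sets. The name invphiSketch and the use of Divisors[m] + 1 to build the candidate primes are assumptions made for this illustration.

```mathematica
(* Hypothetical simplified search: depth-first over increasing candidate primes *)
invphiSketch[m_Integer /; m > 1] := Module[
  {lp = Select[Divisors[m] + 1, PrimeQ], out = {}, visit},
  visit[start_, chosen_, d_] := (
    (* a list is a solution when every prime factor of the quotient d is in it *)
    If[Complement[FactorInteger[d][[All, 1]], chosen, {1}] === {},
      AppendTo[out, d Times @@ chosen]];
    (* extend only with primes that keep the next quotient an integer *)
    Do[
      If[Mod[d, lp[[i]] - 1] == 0,
        visit[i + 1, Append[chosen, lp[[i]]], d/(lp[[i]] - 1)]],
      {i, start, Length[lp]}]);
  visit[1, {}, m];
  Sort[out]]

invphiSketch[12]  (* {13, 21, 26, 28, 36, 42} *)
```

The sketch builds each solution as the quotient times the product of the chosen primes, which agrees with formula (2). The full program improves on it mainly by selecting the factor with the fewest eliminating primes and by flushing finished candidates from wrk.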

There are some other ways to improve the performance. It can be easily noticed that if we add $p_k=2$ to the list, then the quotient doesn't change on the next step. We can speed up the search by never considering $p_k=2$, and only taking into consideration the possible presence of this factor later, when generating solutions from the lists of $p_k$. Also, for selecting the $p_i$ that can eliminate a given $q_k$, the structure Mdiv is created in advance, in which Mdiv[[k]] is the list of indices $i$ such that $q_k$ divides $(p_i-1)$. Finally, after we have created new lists from $L_t$, we'll never return to $L_t$ again. When the structure wrk grows large, we can convert those elements to the final result $n$ and remove them from wrk. This not only reduces the memory usage, but also makes large computations significantly faster, because operations will be performed with smaller lists.
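The lookup structure described above can be illustrated for m = 12. This is a sketch with hypothetical lowercase names lq, lp and mdiv standing in for the program's internal lists, and the values of lq and lp are written out by hand for this example.

```mathematica
lq = {2, 3};            (* prime factors of m = 12 *)
lp = {2, 3, 5, 7, 13};  (* candidate primes for m = 12, common elements first *)

(* for each q in lq, the indices i such that q divides lp[[i]] - 1 *)
mdiv = Function[q, Select[Range[Length[lp]], Mod[lp[[#]] - 1, q] == 0 &]] /@ lq
(* {{2, 3, 4, 5}, {4, 5}} *)
```

Here the factor 3 can be eliminated only by the primes 7 and 13 (indices 4 and 5), so it has fewer eliminating candidates than the factor 2.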

In Mathematica, $\phi(0)=0$ and $\phi(1)=1$; invphi complies with these conventions. Also note that only the list of nonnegative solutions is returned, while in Mathematica $\phi(-n)=\phi(n)$.

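invphi[m_Integer] := Module[
  {main, init, genp, bestcand, gencand, addcand, genans, Lp, Lq, r, s, r0, Mdiv},

  (* main loop: expand candidates in wrk, flushing processed ones into ans *)
  main[] := Module[{ans = {}, wrk, threshold = 100, Lstate, Ladd, quo, indx, i},
    wrk = {{Table[0, {r}], m}};
    wrk[[1, 1, 1]] = -1;  (* Lp[[1]] is 2 for even m; it is never added during the search *)
    For[i = 1, i <= Length[wrk], i++,
      If[i == threshold + 1,
        ans = {ans, genans[Take[wrk, threshold]]};
        wrk = Drop[wrk, threshold]; i = 1];
      {Lstate, quo} = wrk[[i]];
      indx = bestcand[Lstate, quo];
      Ladd = gencand[Lstate, indx];
      wrk = Join[wrk, addcand[Lstate, quo, Ladd]]];
    Flatten[{ans, genans[wrk]}]];

  (* build Lq, Lp (common elements first) and the divisibility table Mdiv *)
  init[] := Module[{Lb, Lpq},
    {Lq, Lb} = Transpose[FactorInteger[m]];
    genp[Lb];
    Lpq = Intersection[Lp, Lq];
    {Lp, Lq} = Join[Lpq, Complement[#, Lpq]] & /@ {Lp, Lq};
    {r, s, r0} = Length /@ {Lp, Lq, Lpq};
    Mdiv = Cases[Range[r], x_ /; Mod[Lp[[x]] - 1, #] == 0] & /@ Lq;];

  (* collect all primes of the form (divisor of m) + 1 *)
  genp[Lb_] := Module[{Lpow, tmp},
    Lpow = MapThread[Table[#1^i, {i, 0, #2}] &, {Lq, Lb}];
    Lp = {};
    Outer[If[PrimeQ[tmp = Times[##] + 1], Lp = {Lp, tmp}] &, Sequence @@ Lpow];
    Lp = Flatten[Lp];];

  (* index of the q to eliminate that has the fewest eliminating primes, or 0 *)
  bestcand[Lstate_, quo_] := Module[{len = Infinity, indx = 0, cur, i},
    For[i = 1, i <= s, i++,
      If[((i <= r0 && Lstate[[i]] != 1) || i > r0) && Mod[quo, Lq[[i]]] == 0,
        cur = Length[Mdiv[[i]]];
        If[cur < len, len = cur; indx = i]]];
    indx];

  (* indices of the primes that may be appended on this step *)
  gencand[Lstate_, indx_] := Module[{Ladd},
    Ladd = If[indx != 0,
      If[indx <= r0, Prepend[Mdiv[[indx]], indx], Mdiv[[indx]]],
      Range[r]];
    Select[Ladd, Lstate[[#]] == 0 &]];

  (* create one new candidate per added prime and mark incompatible primes *)
  addcand[Lstate_, quo_, Ladd_] := Module[{ans = {}, Lstate2, quo2, len, i},
    len = Length[Ladd];
    For[i = 1, i <= len, i++,
      Lstate2 = ReplacePart[Lstate, 1, Ladd[[i]]];
      quo2 = quo/(Lp[[Ladd[[i]]]] - 1);
      (Lstate2[[Ladd[[#]]]] = -1) & /@ Range[i - 1];
      If[Lstate2[[#]] == 0 && Mod[quo2, Lp[[#]] - 1] != 0, Lstate2[[#]] = -1] & /@ Range[r];
      AppendTo[ans, {Lstate2, quo2}]];
    ans];

  (* turn candidates into solutions n, without and with the factor 2 *)
  genans[L_] := Module[{ans = {}, Lstate, quo, res, add2, i, j},
    For[i = 1, i <= Length[L], i++,
      {Lstate, quo} = L[[i]];
      For[add2 = 0, add2 <= 1, add2++,
        If[add2 == 1, Lstate[[1]] = 1];
        For[j = 1, j <= s, j++,
          If[((j <= r0 && Lstate[[j]] != 1) || j > r0) && Mod[quo, Lq[[j]]] == 0, Break[]]];
        If[j != s + 1, Continue[]];
        res = Cases[Transpose[{Lp, Lstate}], {x_, 1} -> x];
        res = m Times @@ res/Times @@ (res - 1);
        ans = {ans, res}]];
    ans];

  Switch[m,
    0, Return[{0}],
    1, Return[{1, 2}],
    _?(OddQ[#] || Negative[#] &), Return[{}]];
  init[];
  main[]]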

Here finding all 847 solutions takes only a few seconds.

Timing[Length[invphi[2^5 3^5 5]]]

{6.37Second,847}

In this case, factoring each of the resulting integers would take about 150 seconds, so checking the answer takes many times longer than finding it.

Timing[invphi[2^3 31^3 313^3 619^3 1511^3 1733^3]]

{6.48Second,{62244230888910733677707758056446080408564,46683173166683050258280818542334560306423,93366346333366100516561637084669120612846}}