Solving Knapsack Problems with Mathematica
Daniel Lichtblau
Wolfram Research, Inc., 100 Trade Centre Dr., Champaign IL USA, 61820
danl@wolfram.com
1. Abstract
Knapsack problems and variants thereof arise in several different fields, from operations research to cryptography to really, really serious problems for hard-core puzzle enthusiasts. We discuss some of these and show ways in which one might attempt to solve them using Mathematica.
2. Introduction
A knapsack problem is described informally as follows. One has a set of items. One must select from it a subset that fulfills specified criteria. A classical example, from cryptosystems, is what is called the "subset sum" problem. From a set S of numbers, and a given number k, find a subset of S whose sum is k. A variant is to find a subset whose sum is as close as possible to k. Another variant is to allow integer multiples of the summands, provided they are small. That is, we are to find a componentwise "small" vector v such that v.S≈k (where we regard S as being an ordered set, that is, a vector). More general knapsack problems may allow values other than zero and one (typically selected from a small range), inequality constraints, and other variations on the above themes.
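For small instances one can of course attack subset sum by brute force; a minimal sketch (the helper name is illustrative, and the approach is practical only for small sets since the number of subsets grows exponentially):

bruteSubsetSum[set_List, k_] := Select[Subsets[set], Apply[Plus, #] == k &]
bruteSubsetSum[{1, 3, 5, 7}, 8]
(* {{1, 7}, {3, 5}} *)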
Of note is that the general integer linear programming problem (ILP) can be cast as a knapsack problem provided the search space is bounded. Each variable is decomposed into new variables, one for each "bit"; they are referred to as 0-1 variables because these are the values they may take. One multiplies these new variables by appropriate powers of two in reformulating the problem in terms of the new variables. We will use this tactic in an example below. An advantage (as we will see) is that often one need not strictly enforce the 0-1 requirement.
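For example, a variable x known to satisfy 0 ≤ x < 16 can be replaced by four 0-1 variables via its binary expansion (the names here are purely illustrative):

bits = Array[b, 4, 0];        (* four 0-1 variables b[0], ..., b[3] *)
x -> bits.2^Range[0, 3]       (* x = b[0] + 2 b[1] + 4 b[2] + 8 b[3] *)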
Applications of knapsack problems are manifold. The approximate knapsack with small multipliers variant is used, for example, to find a minimal polynomial given an approximation to a root [Lenstra 1984]. The knapsack approximation problem is also used in a more efficient algorithm for univariate factorization from [van Hoeij 2002]. Applications to cryptosystems are discussed in [von zur Gathen and Gerhard 1999, chapter 17] and [Nguyen 1999].
Among the tools one might use for knapsack problems are brute force search, smart combinatorial search techniques, integer programming optimization methods, algebraic solvers with appropriate constraints, and lattice methods. We will show in the examples below several ways one might attempt to solve knapsack problems and their relatives using some of these approaches in Mathematica [Wolfram 2003]. The methods we discuss are not new and most have previously appeared in various venues as cited. The object of this paper is to gather together several convenient examples, references, applications, and Mathematica methods under the unifying theme of knapsack solvers. We should mention from the outset that some of these methods are, at first glance, rabbits pulled from hats, and an amount of work remains to make them fully algorithmic. That said, we feel they are of interest at least insofar as they seem to do useful things and moreover point to directions for further study.
Timings, where indicated, are performed on a 1.4 GHz machine using the development kernel for Mathematica.
3. A simple subset sum
The example below arose in the Usenet news group comp.soft-sys.math.mathematica. A response indicating the method we illustrate appears in [Lichtblau 2002a]. It is the very classical subset sum problem: we are given a set of rational numbers and a desired value, and seek a subset that sums to the value.
In[]:=
vec={1/2,1/3,1/4,1/8,3/10,12/79,13/38};val=2509/2280;
In a way this is a linear diophantine equation but we seek a particular type of solution, wherein all components are zero or one. As such solutions are small in Euclidean norm, a common method by which to attempt such problems involves lattice reduction [Lenstra, Lenstra, and Lovász 1982]. The idea is to set up a matrix wherein the last column is precisely the vector augmented at the bottom with the negated value, and we augment to the left with an identity matrix (next to the vector), and a row of zeros preceding the negated value. We then reduce this lattice. The elements of the identity matrix serve as recorders of the multiples we use in (attempting to) zero the last column, in much the same way that augmenting may be used to form a matrix inverse. If we obtain a row with a zero in the last entry and all ones and zeros preceding, then the columns with ones correspond to set elements we take for our summands. We will rescale to make the last column large though this is not always necessary. We will show the lattice below so one might see why such a "reduction" can give something useful.
In[]:=
lattice1=Transpose[Append[IdentityMatrix[Length[vec]],10^10*vec]];lattice2=Append[lattice1,Append[Table[0,{Length[vec]}],-10^10*val]]
Out[]=
{{1,0,0,0,0,0,0,5000000000},
 {0,1,0,0,0,0,0,10000000000/3},
 {0,0,1,0,0,0,0,2500000000},
 {0,0,0,1,0,0,0,1250000000},
 {0,0,0,0,1,0,0,3000000000},
 {0,0,0,0,0,1,0,120000000000/79},
 {0,0,0,0,0,0,1,65000000000/19},
 {0,0,0,0,0,0,0,-627250000000/57}}
We will reduce and then select those rows with last element zero and all preceding elements either zero or one. We check that such candidate solutions actually work.
In[]:=
lr=LatticeReduce[lattice2];
result=Map[Most,Select[lr,Last[#]==0&&Apply[And,Map[#==1||#==0&,Most[#]]]&]]
Map[#.vec==val&,result]
Out[]=
{{0,1,0,1,1,0,1}}
Out[]=
{True}
We thus see that the second, fourth, fifth, and seventh elements sum to the desired value.
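A quick arithmetic check of that claim:

Apply[Plus, vec[[{2, 4, 5, 7}]]] == val
(* True *)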
We should note that the method illustrated above is by no means guaranteed to give a solution if one exists. The useful fact is that, for a large class of problems, it does. This method and a variant thereof are discussed in [Schnorr and Euchner 1991]. The variant uses an encoding that frequently works better when there might be small vectors in the lattice other than the solution vector.
Variations on this lattice technique have applications to polynomial factorization methods already noted in the introduction, finding small solutions to linear diophantine equations (the "closest vector problem" in lattices; see e.g. [Nguyen 1999], [Matthews 2001], or [Lichtblau 2003]), and simultaneous diophantine approximation ([Lenstra, Lenstra, and Lovász 1982], [von zur Gathen and Gerhard 1999]; this is in essence the method behind the function AffineRationalize in the Mathematica standard add-on package NumberTheory`Rationalize`).
4. Numeric solvers and knapsacks
Another method for the previous example, workable for problems of small size, is to use a numeric solver with equations in place to insure that all variables take on only the values 0 or 1. This method was first brought to my attention by [Boneh 1997] (though not indicated in [Lichtblau 2002a], I used it on this example in private follow-up correspondence).
In[]:=
vars=Array[x,Length[vec]];polys=Append[vars*(vars-1),vars.vec-val];NSolve[polys]
Out[]=
{{x[1]->0.,x[2]->1.,x[3]->0.,x[4]->1.,x[5]->1.,x[6]->0.,x[7]->1.}}
It must be mentioned that this method is, for practical purposes, generally no better than brute search, and sometimes considerably worse. The reason, roughly, is that we have an overdetermined nonlinear system wherein all equations but one are quadratic. Such systems are easily overwhelmed by computational complexity when one applies the methods of NSolve [Lichtblau 2000]. The nice feature of the method lies in the simplicity of code. More importantly, and the reason we showed it above, is that it may be used to advantage in larger knapsack-style problems when there are more linear equations, as these have an effect of reducing the complexity (for one they reduce the a priori bound on the size of the solution set; indeed even nonlinear equations may help to reduce complexity since the system is overdetermined). In [Rasmusson 2003] we see a nice application to solving a type of classic puzzle. The particular example is known as "Johnny's Ideal Woman" and the problem statement is as follows (see the URL to the Georgia College & State University BITS & BYTES electronic journal given in the Rasmusson reference below).
Johnny's ideal woman is brunette, blue eyed, slender, and tall. He knows four women: Adele, Betty, Carol, and Doris. Only one of the four women has all four characteristics that Johnny requires.
1. Only three of the women are both brunette and slender.
2. Only two of the women are both brunette and tall.
3. Only two of the women are both slender and tall.
4. Only one of the women is both brunette and blue eyed.
5. Adele and Betty have the same color eyes.
6. Betty and Carol have the same color hair.
7. Carol and Doris have different builds.
8. Doris and Adele are the same height.
Which one of the four women satisfies all of Johnny's requirements? Rasmusson tackled this as follows (I have made modest alterations to the code but the essentials remain unchanged).
In[]:=
gals={Adele,Betty,Carol,Doris};
features={blueeye,slender,brunette,tall};
solvvars=Map[perfectgirl,Range[4]];
elimvars=Flatten[Outer[Operate[##,0]&,features,Range[4]]];
galnumbers=Thread[{Adele,Betty,Carol,Doris}->{1,2,3,4}];
constraints=Map[#^2==#&,Flatten[Join[elimvars,solvvars]]];
problem=Join[constraints,
  {Sum[blueeye[i]*slender[i],{i,4}]==3, Sum[brunette[i]*tall[i],{i,4}]==2,
   Sum[slender[i]*tall[i],{i,4}]==2, Sum[blueeye[i]*brunette[i],{i,4}]==1,
   blueeye[Adele]==blueeye[Betty], brunette[Betty]==brunette[Carol],
   slender[Carol]==1-slender[Doris], tall[Adele]==tall[Doris]},
  Table[brunette[i]*blueeye[i]*slender[i]*tall[i]==perfectgirl[i],{i,4}]]/.galnumbers;
We solve the system, eliminating all variables other than the perfectgirl set. We then do some replacement postprocessing to make it clear who is Johnny's favorite (at least this should make it clear to Johnny!).
In[]:=
sol=(NSolve[problem,solvvars,elimvars]/.feature_[j_Integer]:>feature[gals[[j]]])/.{1.->True,0.->False,Rule->Equal}
Cases[sol,perfectgirl[gal_]==True:>gal,{2}]
Out[]=
{{perfectgirl[Adele]==True,perfectgirl[Betty]==False,perfectgirl[Carol]==False,perfectgirl[Doris]==False}}
Out[]=
{Adele}
As problems of that particular type often contain several equations (some of them linear) the computational complexity tends to be more modest than is the case for a subset sum problem of comparably many variables. Hence one can have more variables and still hope for a result in reasonable time.
5. Linear diophantine equations
We now show a simple example from [Lichtblau 2002a]. Given 143267 coins (pennies, nickels, dimes, and quarters) of total value $12563.29, how many coins might be of each type? In general one might expect many different solutions; we will be happy to obtain any one of them. In the reference we show an easy way to obtain results using an optimization method (this amounts to a constraint satisfaction ILP). Here we will attempt a lattice-based method quite similar to what was shown above for the subset sum. We require nonnegative integer solutions to the pair of equations p+5n+10d+25q=1256329 and p+n+d+q=143267. We use these as columns of our lattice, again augmenting by an identity matrix. To assist the process of obtaining a small vector that really gives a solution we will multiply the equations by a large value.
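As an aside, the constraint-satisfaction route mentioned above can be sketched directly, assuming a solver such as FindInstance that handles linear integer constraints (this is only an illustration, not the formulation used in the reference); the lattice construction follows.

FindInstance[
 p + 5 n + 10 d + 25 q == 1256329 && p + n + d + q == 143267 &&
  p >= 0 && n >= 0 && d >= 0 && q >= 0, {p, n, d, q}, Integers]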
In[]:=
vecs=10^8*{{1,5,10,25,-1256329},{1,1,1,1,-143267}};lattice=Transpose[Join[IdentityMatrix[Length[vecs[[1]]]],vecs]];redlat=LatticeReduce[lattice]
Out[]=
{{0,3,-4,1,0,0,0},{-5,3,4,-2,0,0,0},{41749,39185,35978,26355,1,0,0},{0,-2,1,0,0,0,-100000000},{1,-2,1,0,0,100000000,0}}
The third row gives a solution: 41749 pennies, 39185 nickels, 35978 dimes, and 26355 quarters.
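A quick check that these counts satisfy both requirements:

{41749, 39185, 35978, 26355}.{1, 1, 1, 1} == 143267 &&
 {41749, 39185, 35978, 26355}.{1, 5, 10, 25} == 1256329
(* True *)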
Generally speaking it is not difficult to find "solutions" when we allow negative values. Making this method work in the presence of many small null vectors becomes problematic precisely because it then becomes all the more likely that small solutions will have components of both signs.
It is of some interest to pursue this as a 0-1 problem. From each variable we make new variables corresponding to each bit. Note that, as above, we will work with vectors of numbers and not form actual variables at all.
In[]:=
base=2;
sizes1=Ceiling[Log[base,1256329/{1,5,10,25}]];
size2=Ceiling[Log[base,143267]];
sizes=Map[Min[#,size2]&,sizes1];
expandedvec=Map[base^Range[0,#-1]&,sizes];
mult=10^3;
vec1=mult*Append[Flatten[expandedvec*{1,5,10,25}],-1256329];
vec2=mult*Append[Flatten[expandedvec*{1,1,1,1}],-143267];
lattice=Transpose[Join[IdentityMatrix[Length[vec1]],{vec1,vec2}]];
What we seek is a solution vector containing all zeros and ones, with one in the third to last position (denoting that we utilized the last row but not with a multiplier), and zeros in the last two slots (indicating that we satisfied both linear relations).
In[]:=
redlat=LatticeReduce[lattice];solvec=First[Cases[redlat,{a___,1,0,0}:>{a}]]
Out[]=
{1,0,0,0,1,0,-1,0,1,0,0,1,0,1,0,0,0,0,-1,0,0,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,1,0,1,1,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
This, in the words of television's Maxwell Smart, is "not quite what I had in mind." No matter; we can use it anyway. We simply reform our solution values as coefficients times powers of two. For this we must first break apart the solution vector into components that correspond to each of the original variables (pennies, nickels, dimes, and quarters). So long as the largest nonzero component in each is positive we will be able to get a valid solution.
In[]:=
start=1;solcomponents=Table[res=Take[solvec,{start,start+sizes[[j]]-1}];start+=sizes[[j]];res,{j,Length[sizes]}]
Out[]=
{{1,0,0,0,1,0,-1,0,1,0,0,1,0,1,0,0,0,0},{-1,0,0,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0},{0,1,0,0,0,0,0,1,0,1,1,0,0,0,1,1,1},{1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}}
In[]:=
values=Map[#.(2^Range[0,Length[#]-1])&,solcomponents]
Out[]=
{10449,16463,116354,1}
We check that this satisfies the two equations.
In[]:=
values.{1,1,1,1}==143267&&values.{1,5,10,25}==1256329
Out[]=
True
6. A more challenging subset sum
The next example is from [Reinholtz 2003]. We are given the first forty reciprocals of squares of integers larger than one. The goal is to find a subset whose sum is 1/2. We show Reinholtz' method below. We begin with the vector itself, clear denominators, and compute the target value in terms of the new vector.
In[]:=
vec=Table[1/n^2,{n,2,40}];len=Length[vec];lcm=Apply[LCM,Denominator[vec]];ivec=lcm*vec;s=lcm/2;
Now find partial sums of all smallest elements, as we decrease the number of such elements. A placeholder table is defined along with a pair of predicates that in effect allow us to tell when to call recursively, and what to test. The bounding feasibility test, used recursively, acts as a sort of backtracking mechanism (one might also view it as a branch-and-bound tactic).
In[]:=
tottbl=Apply[Plus,ivec]-FoldList[#1+#2&,0,ivec];
b=Table[0,{len}];
feasibleQ[i_,tot_]:=tot≤s≤tot+tottbl[[i]];
solutionQ[i_,tot_]:=tot==s;
In[]:=
try[i_?(#1≤len&),tot_]:=With[{rtot=tot+ivec[[i]]},
  b[[i]]=1;
  solutionQ[i,rtot]&&Throw[b];
  If[feasibleQ[i+1,rtot],try[i+1,rtot]];
  b[[i]]=0;
  try[i+1,tot];];
Timing[result=Catch[try[1,0]];]
result.vec==1/2
Select[result*vec,#≠0&]
Out[]=
{20.55Second,Null}
Out[]=
True
Out[]=
{1/4, 1/9, 1/16, 1/25, 1/49, 1/144, 1/225, 1/400, 1/784, 1/1225}
With some work this can also be attacked as a lattice problem. Specifically we can set this up as a linear diophantine problem. There is a hitch: we typically have many null vectors and these tend to give solutions that do not have all ones. We can attempt to alleviate this by adding extra linear relations. For example, suppose we know or at least suspect that a solution using exactly 10 values exists. We can set up a lattice as follows.
In[]:=
size=40;squares=Table[n^2,{n,2,size}];lcm=Apply[LCM,squares];vec=lcm/squares;vec=Append[vec,-lcm/2];vec2=Table[1,{size}];vec2[[size]]=-10;lattice=Transpose[Join[IdentityMatrix[Length[vec]],{vec,vec2}]];
The last columns contain the two linear relations we need to enforce; the sum of square reciprocals must equal 1/2 and the number of values in the sum must be 10. As in earlier examples, the augmented identity matrix records the multiples used in satisfying these relations.
We still face the same difficulty; if we reduce this lattice we can readily make zeros in the last two columns, using the last row multiplied by one, but the multiples of the other rows are not all likely to be one. This in turn is because there are still many small "null" vectors that give rise to many small solutions, from which lattice reduction may not find the sort we require (using all ones). In order to further coerce the lattice reduction to yield a good vector we will alter the default behavior. Specifically we change what is often called the δ parameter, typically set to 3/4 in the literature (see e.g. [Lenstra, Lenstra, and Lovász 1982]), to something much closer to 1.
In[]:=
Developer`SetSystemOptions["LatticeReduceOptions"->{"LatticeReduceRatioParameter"->.99}];
Timing[redlat=LatticeReduce[lattice];]
Out[]=
{0.36Second,Null}
We now see if we have any solutions. These will have final three components of a one (signifying that the last row is multiplied by one) followed by two zeros (indicating that we successfully satisfied the two linear relations). Actually these might all be negated but we ignore that possibility for now. Moreover even if the last value is not zero (signifying that the number of elements used was not 10) that would not be cause for concern.
In[]:=
solns=Cases[redlat,{a___,1,0,_}/;Max[a]==1&&Min[a]==0]
Out[]=
{{1,1,1,1,0,1,0,0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0}}
Our solution is given by this vector with the last three elements removed. We see that it agrees with the previous one.
In[]:=
soln=Drop[First[solns],-3];
soln.(1/squares)==1/2
Select[soln*(1/squares),#≠0&]
Out[]=
True
Out[]=
{1/4, 1/9, 1/16, 1/25, 1/49, 1/144, 1/225, 1/400, 1/784, 1/1225}
Obviously we made use of the fact that we expected the solution to use 10 values. In reality we would loop over the possible lengths of a solution subset. It remains to ponder what to do in cases where we do not get a solution vector in the reduced lattice. One method I have used with modest success involves adding one to three more columns of random small integers (perhaps all multiplied by some constant larger than unity). The last element in each column will be the negative of a possible sum of a subset of the values above it. This forces solutions to satisfy more relations, thus often removing the small null vectors. We now must iterate over all possible values of sums of subsets of the new columns. If the random values are drawn from a small pool then typically there are not many such values for each column. This is important because we want to arrange matters so that we do no more than some low degree polynomial number of iterations in the number of rows of the lattice.
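A sketch of the first idea, looping over candidate subset sizes (makeLattice is an illustrative helper name wrapping the construction shown earlier; lcm and squares are as defined above):

makeLattice[count_] := Module[{v1, v2},
  v1 = Append[lcm/squares, -lcm/2];
  v2 = Append[Table[1, {Length[squares]}], -count];
  Transpose[Join[IdentityMatrix[Length[v1]], {v1, v2}]]]
Do[
 solns = Cases[LatticeReduce[makeLattice[count]],
   {a___, 1, 0, _} /; Max[a] == 1 && Min[a] == 0];
 If[solns =!= {}, Print[{count, Drop[First[solns], -3]}]; Break[]],
 {count, 2, 20}]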
An alternative is to put those problematic null vectors to work for us (in modern parlance one might regard this as "embracing our null vectors"). We will show an example of this in a later section when we attempt to surpass a computational number theory record.
7. Covering a set by subsets
Subset covering is an important task that appears, for example, in the Quine-McCluskey algorithm for finding an optimal disjunctive normal form for a boolean expression [McCluskey 1956]. We give an example that arose in the Usenet news group comp.soft-sys.math.mathematica. The approach we use appeared previously in [Lichtblau 2002b]. We are given a set of sets, each containing integers between 1 and 64. Their union is the set of all integers in that range, and we want to find a set of 12 subsets that covers that entire range (we are given in advance that that number can be achieved).
In[]:=
subsets={{1,2,4,8,16,32,64},{2,1,3,7,15,31,63},{3,4,2,6,14,30,62},{4,3,1,5,13,29,61},{5,6,8,4,12,28,60},{6,5,7,3,11,27,59},{7,8,6,2,10,26,58},{8,7,5,1,9,25,57},{9,10,12,16,8,24,56},{10,9,11,15,7,23,55},{11,12,10,14,6,22,54},{12,11,9,13,5,21,53},{13,14,16,12,4,20,52},{14,13,15,11,3,19,51},{15,16,14,10,2,18,50},{16,15,13,9,1,17,49},{17,18,20,24,32,16,48},{18,17,19,23,31,15,47},{19,20,18,22,30,14,46},{20,19,17,21,29,13,45},{21,22,24,20,28,12,44},{22,21,23,19,27,11,43},{23,24,22,18,26,10,42},{24,23,21,17,25,9,41},{25,26,28,32,24,8,40},{26,25,27,31,23,7,39},{27,28,26,30,22,6,38},{28,27,25,29,21,5,37},{29,30,32,28,20,4,36},{30,29,31,27,19,3,35},{31,32,30,26,18,2,34},{32,31,29,25,17,1,33},{33,34,36,40,48,64,32},{34,33,35,39,47,63,31},{35,36,34,38,46,62,30},{36,35,33,37,45,61,29},{37,38,40,36,44,60,28},{38,37,39,35,43,59,27},{39,40,38,34,42,58,26},{40,39,37,33,41,57,25},{41,42,44,48,40,56,24},{42,41,43,47,39,55,23},{43,44,42,46,38,54,22},{44,43,41,45,37,53,21},{45,46,48,44,36,52,20},{46,45,47,43,35,51,19},{47,48,46,42,34,50,18},{48,47,45,41,33,49,17},{49,50,52,56,64,48,16},{50,49,51,55,63,47,15},{51,52,50,54,62,46,14},{52,51,49,53,61,45,13},{53,54,56,52,60,44,12},{54,53,55,51,59,43,11},{55,56,54,50,58,42,10},{56,55,53,49,57,41,9},{57,58,60,64,56,40,8},{58,57,59,63,55,39,7},{59,60,58,62,54,38,6},{60,59,57,61,53,37,5},{61,62,64,60,52,36,4},{62,61,63,59,51,35,3},{63,64,62,58,50,34,2},{64,63,61,57,49,33,1}};
In[]:=
Union[Flatten[subsets]]==Range[64]
Out[]=
True
To cast this as a standard knapsack problem we first transform our set of subsets into a "bit vector" representation; each subset is represented by a positional list of zeros and ones. We will show the first such bit vector.
In[]:=
densevec[spvec_,len_]:=Module[{vec=Table[0,{len}]},Do[vec[[spvec[[j]]]]=1,{j,Length[spvec]}];vec]
mat=Map[densevec[#,64]&,subsets];
mat[[1]]
Out[]=
{1,1,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1}
To form a knapsack problem, the idea is to add component-wise as few bitvectors as possible, subject to the constraint that each component sum be greater than zero (indicating we have "covered" that position). We remark that this example is in a sense harder than might otherwise be the case due to the presence of inequalities. This also provides a clue that it might be wise to employ discrete optimization methods. We thus code it as such and use the Mathematica function NMinimize in order to solve it.
In[]:=
spanningSets[set_,iter_,sp_,seed_,cp_:.5]:=Module[{vars,rnges,max=Length[set],nmin,vals},
  vars=Array[xx,max];
  rnges=Map[(0≤#≤1)&,vars];
  {nmin,vals}=NMinimize[{Apply[Plus,vars],
     Join[rnges,{Element[vars,Integers]},Thread[vars.set≥Table[1,{max}]]]},vars,
    MaxIterations->iter,
    Method->{DifferentialEvolution,CrossProbability->cp,SearchPoints->sp,RandomSeed->seed}];
  vals=vars/.vals;
  {nmin,vals}]
{min,sets}=spanningSets[mat,2000,100,0,.9]
Out[]=
{12.,{0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,1}}
Information regarding NMinimize, and in particular the selection and setting of its various options, may be found in advanced documentation for Mathematica, or in [Champion 2002]. The fascinating engine behind the optimizer utilized in the example above is described in [Price and Storn 1997]; while it is primarily intended for continuous optimization, one lesson is that applications of evolutionary methods can themselves evolve. Several related discrete optimization examples are attacked using this functionality in [Lichtblau 2002b], where further mention is made of option settings for NMinimize appropriate for such problems.
8. Random determinants
It is sometimes important to understand extremal behavior of random polynomials or matrices comprised of elements from a given set. Below we apply knapsack-style optimization to study determinants of 7x7 matrices of integers taken from the set {-1,0,1}, with diagonal elements all set to 1. We arrived at the option settings utilized below after some trial and error experimentation.
In[]:=
n=7;
mat=Array[x,{n,n}];
func[a:{{_?NumberQ..}..}]/;Length[a]==Length[First[a]]:=Det[a]
vars=Flatten[mat];
problemlist={func[mat],Flatten[{Element[vars,Integers],Map[-1≤#≤1&,vars],Table[x[j,j]==1,{j,n}]}]};
In[]:=
Timing[{min,vals}=NMinimize[problemlist,vars,MaxIterations->50,Method->{"DifferentialEvolution",CrossProbability->1/50,SearchPoints->50}];]
Out[]=
{37.34Second,Null}
In[]:=
{min,mat/.vals}
Out[]=
{-576.,{{1,-1,-1,1,1,-1,-1},{1,1,1,-1,1,1,-1},{-1,-1,1,1,1,1,-1},{1,1,-1,1,-1,1,-1},{1,1,1,1,1,1,1},{1,-1,-1,-1,-1,1,1},{-1,1,-1,-1,1,1,1}}}
Note that the Hadamard bound claims the minimum must be no smaller than -7^(7/2), or approximately -907. A random search that took approximately twice as long as the code above found nothing smaller than -288. We'll now try with dimension increased by one.
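The bound itself is easy to compute numerically (the second value anticipates the 8x8 case below):

N[{-7^(7/2), -8^(8/2)}]
(* {-907.493, -4096.} *)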
In[]:=
Timing[{min,vals}=NMinimize[problemlist,vars,MaxIterations->200,Method->{"DifferentialEvolution",CrossProbability->1/50,SearchPoints->100}];]
Out[]=
{464.97Second,Null}
In[]:=
{min,mat/.vals}
Out[]=
{-4096.,{{1,1,1,-1,1,-1,-1,-1},{1,1,-1,1,1,1,1,-1},{1,1,1,1,-1,1,-1,1},{-1,1,1,1,-1,-1,1,-1},{1,-1,1,1,1,-1,1,1},{-1,1,1,-1,1,1,1,1},{1,-1,1,-1,-1,1,1,-1},{1,1,-1,-1,-1,-1,1,1}}}
This is the Hadamard bound, by the way.
9. In search of those elusive Keith numbers
Keith numbers are defined as follows. Suppose we are given a number s of n digits (we work in base 10, but these can be defined with respect to arbitrary bases). Form a sequence in Fibonacci style as follows. The first n elements are the digits themselves. The (n+1)th element is the sum of the first n digits. Subsequent elements are the sums of the preceding n elements. Then s is called a Keith number (for Mike Keith, who first discussed these), if it appears in this sequence. For example, the sequence for 197 is {1,9,7,17,33,57,107,197,...} and so 197 is a Keith number. Keith originally referred to these as repfigits, for "replicating Fibonacci digits".
Keith numbers tend to be quite rare (there are only 71 of them less than 10^19), and known methods for finding them, while flawless (in the sense that they find all of them), are limited in range due to algorithmic complexity and memory requirements. The latest I have seen on the topic, [Keith 1998], claims that while all such numbers up to 19 digits have been found, none larger have yet been discovered. We will remedy that situation.
We begin with some background remarks on the nature of two methods on which we have relied thus far. One, lattice reduction, can be used to find small integer solutions to diophantine linear problems. It is particularly useful for finding small null vectors to a given homogeneous integer equation. The other tool, differential evolution, can frequently enforce "reasonable" linear inequality constraints if provided with input that is not too far from satisfying the constraints, especially when that input contains small components and the constraints are directly influenced by (integer) perturbations in the optimization variables. Both methods in a sense form new vectors from old by their respective means of recombination. The fact that one tries to find small things, and the other can more readily impose constraints on sets comprised of small values (as well as the fact that we discuss them together in this section), suggests that it might be profitable to use the two methods in tandem (our own style of recombination, so to speak). That is exactly what we will do.
To begin we must find equations to describe these things. If the digits are {d_0, d_1, ..., d_(n-1)} then the number is d_0*10^(n-1) + d_1*10^(n-2) + ... + d_(n-1), that is, the sum of d_j*10^(n-1-j) for j from 0 to n-1. Meanwhile we form the sequence using a Fibonacci matrix of dimension n. This is simply a matrix that, when operating on a vector, replaces each element up to the last by its successor, and replaces the last by the sum of the elements. For n=2 this is simply {{0,1},{1,1}}. For, say, n=4, it is {{0,1,0,0},{0,0,1,0},{0,0,0,1},{1,1,1,1}}.
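As a small illustration (the helper name fibMatrix is ours), such a matrix reproduces the sequence for 197 shown earlier:

fibMatrix[n_] := Append[Table[If[k == j + 1, 1, 0], {j, n - 1}, {k, n}], Table[1, {n}]]
NestList[fibMatrix[3].# &, {1, 9, 7}, 5]
(* {{1,9,7},{9,7,17},{7,17,33},{17,33,57},{33,57,107},{57,107,197}} *)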
If we multiply this matrix by itself k-1 times then the dot product of the bottom row with the digit sequence will give the (n+k)th term in the sequence. Some simple inequality considerations will give fairly tight bounds on how many such multiples can possibly work for a given number of digits n. We will use each possibility to form a homogeneous linear diophantine equation (that is, the sum will be zero). For efficiency, in the actual code we take advantage of the structure of the matrix to avoid forming explicit matrix products.
In[]:=
keithEquations[len_Integer/;len>0]:=Module[{matrow,n,list,res,vecs},res=list[];Do[matrow[j]=Table[KroneckerDelta[k,j+1],{k,len}],{j,len-1}];matrow[len]=Table[1,{len}];n=len;While[9*Apply[Plus,matrow[n]]<10^(len-1),n++;matrow[n]=Sum[matrow[k],{k,n-len,n-1}];];While[First[matrow[n]]≤10^(len-1),res=list[res,matrow[n]];n++;matrow[n]=Sum[matrow[k],{k,n-len,n-1}];];vecs=Apply[List,Flatten[res,Infinity,list]];Map[(#-10^Range[len-1,0,-1])&,vecs]]
Next we need to solve such systems. This is really just an integer null space computation. For convenience we strip down code from [Lichtblau 2003]. Note that there are computationally more efficient ways to do this. But nonetheless this step will not prove to be a bottleneck.
In[]:=
Needs["LinearAlgebra`MatrixManipulation`"]integerNullSpace[vec:{_Integer..}]:=Module[{mat,hnf,j=1},mat=Prepend[Transpose[{vec}],{0}];mat=AppendRows[mat,IdentityMatrix[Length[mat]]];hnf=Last[Developer`HermiteNormalForm[mat]];While[j≤Length[hnf]&&hnf[[j,1]]!=0,j++];LatticeReduce[Map[Drop[#,2]&,Drop[hnf,j]]]]
We demonstrate with a short example. We start by obtaining the set of candidate equation vectors for 5 digit examples.
In[]:=
k5=keithEquations[5];
We find the small null vectors for one of these candidates.
In[]:=
nulls[5,2]=integerNullSpace[k5[[4]]]
Out[]=
{{-3,-1,-3,-3,-1},{-2,-4,-3,3,-3},{1,6,-5,6,-2},{7,-3,-15,5,26}}
Notice that for any solution vector, its negative is also a solution vector. Thus we see that 31331 is a Keith number of five digits. That was not too difficult. We now look at examples with 6 digits.
In[]:=
k6=keithEquations[6];
We find the small null vectors for one of these candidates.
In[]:=
nulls[6,2]=integerNullSpace[k6[[2]]]
Out[]=
{{0,3,-3,0,4,0},{-1,1,-3,-2,-4,-4},{0,-6,-3,1,0,-2},{0,-1,1,5,0,-6},{0,4,-12,15,-12,9}}
We arrive at a set of small null vectors, none of which have entirely nonnegative or entirely nonpositive values (with the first being nonzero, in order that they give a legitimate six digit number). It turns out that this will be the case for all possible equations given by k6. We now need a way to recombine these so that the first component is positive and the rest are nonnegative. This is a job for differential evolution with integer variables. The idea is quite simple. For each null vector we create an integer-valued variable. We allow arbitrary linear combinations of these vectors subject to the constraints that all resulting components be nonnegative, and the first be positive. This is simply a constraint satisfaction problem. Thus we can either use a constant objective function or else use some function that would have the effect of imposing a penalty on combinations that violate the constraints. As the NMinimize constraint handler already does this, we will opt for the former (actually, since the zero vector is close to satisfying the constraints, one should perhaps add emphasis to the constraint that the first value is nonzero). The code for this is below.
In[]:=
keithSolution[nulls_,iters_:Automatic]:=Module[{len=Length[nulls],vars,x,vecs,constraints,program,min,vals},
  vars=Array[x,len];
  vecs=vars.nulls;
  constraints=Join[{Element[vars,Integers],1≤First[vecs]≤9},Map[0≤#≤9&,Rest[vecs]]];
  program={1,constraints};
  {min,vals}=NMinimize[program,vars,MaxIterations->iters];
  vecs/.vals]
In[]:=
keithSolution[nulls[6,2]]
Out[]=
{1,4,7,6,4,0}
We note from [Keith 1998] that 147640 is in fact in the list. Now we will try for something more ambitious. As there are no known Keith numbers of 20 digits, we will attempt to find one.
In[]:=
k20=keithEquations[20];nulls[20,3]=integerNullSpace[k20[[3]]];Timing[keithSolution[nulls[20,3],200]]
During evaluation of In[]:=
NMinimize::incst : NMinimize was unable to generate any initial points satisfying the inequality constraints .... The initial region specified may not contain any feasible points. Changing the initial region or specifying explicit initial points may provide a better solution.
Out[]=
{46.29Second,{2,7,8,4,7,6,5,2,5,7,7,9,0,5,7,9,3,4,1,3}}
We have found the first known example of a 20 digit Keith number (hooray for us!). It is, moreover, the first pandigital example (that is, containing all 10 digits). The warning message tells us, not surprisingly, that none of the initial combinations satisfied the constraints. Letting differential evolution work its magic over the course of 200 generations sufficed to overcome that defect.
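One can confirm the result independently with the digit recurrence from the definition (a sketch; it should return True precisely when the number is a Keith number):

digits = {2, 7, 8, 4, 7, 6, 5, 2, 5, 7, 7, 9, 0, 5, 7, 9, 3, 4, 1, 3};
num = FromDigits[digits];
Last[NestWhile[Append[Rest[#], Apply[Plus, #]] &, digits, Last[#] < num &]] == num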
Utilizing a clever search algorithm that relies on large tables, Mike Keith found all examples up to 19 digits using about 500 hours of computation time with hardware of mid-to-late 1990's vintage. To be fair in a comparison, the method we show above is by no means guaranteed to work. It just happens to do a nice job, and in the example above required no tuning beyond setting the MaxIterations (though possibly other option tuning would make it more effective). But clearly it is much faster than the direct search and by no means requires much memory.
We conclude that for some types of knapsack problem, the tandem of lattice and optimization tools can be quite powerful.
10. Notes on related work and implementations
There is a large body of literature regarding attacks on knapsacks via lattice reduction methods. Applications to cryptosystems are discussed in [Schnorr and Euchner 1991], [von zur Gathen and Gerhard 1999], and [Nguyen 1999]. It seems that many such cryptosystems were vanquished in the 1980s and 1990s due to lattice methods. In [Schnorr 1993] there is moreover an attempt to apply lattice methods to integer factorization and computation of discrete logarithms (which could have the effect of breaking RSA-type cryptosystems). This has not yet been successful (to my knowledge!).
Computation of reduced lattices may be done in various ways. The original method of [Lenstra, Lenstra, and Lovász 1982] utilized rational arithmetic. It was recognized even then that integer arithmetic sufficed. A nice exposition may be found in [Cohen 1993, chapter 2]. Efficient variations using floating point (machine and higher precision) arithmetic and integer arithmetic appear in [Schnorr and Euchner 1991] and [Storjohann 1996] respectively. At various times in the past the default Mathematica implementation has utilized approximate arithmetic but as of this writing it uses techniques from the latter paper. A considerable amount of experience (my own) indicates that an approximate arithmetic version can be difficult to keep both fast and free of bugs; other programmers/implementations may have fared better in this regard. One might experiment with LatticeReduce in Mathematica using approximate arithmetic by doing
In[]:=
Developer`SetSystemOptions["LatticeReduceOptions"->{"LatticeReduceArithmetic"->ApproximateNumbers}];
For restoring default behavior one sets it to Integers.
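That is, one would evaluate:

Developer`SetSystemOptions["LatticeReduceOptions"->{"LatticeReduceArithmetic"->Integers}];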
There are many methods for ILPs and knapsack problems that we did not show herein. One with origins in computational commutative algebra is done via Gröbner bases. A nice Mathematica demonstration notebook (with some nontrivial examples) may be found in [Kapadia 2003].
Another example discussed in [Lichtblau 2002b] is as follows. Take the set of reciprocals of the first 100 integers. Divide it into two subsets each of size 50 in such a way that the difference between the sums is minimized (that is, they are the closest pair to half the total). Clearly this can be set up as an approximate subset sum problem; in the reference it is handled in two ways (one as a knapsack), both utilizing the Mathematica function NMinimize as the underlying solver. It would be interesting to see a successful attack based on lattice reduction. We remark that it is effectively a high density subset sum problem and this alone makes for trouble with lattice methods. But the real problem seems to be the presence of large null sets containing many small vectors. Possibly the tandem of lattice reduction and differential evolution might be of use?
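A sketch of one plausible NMinimize formulation (not necessarily either of the formulations used in the reference, and with option settings that are merely illustrative):

recips = 1/Range[100];
vars = Array[y, 100];
half = Apply[Plus, recips]/2;
{min, vals} = NMinimize[{Abs[vars.recips - half],
    Join[{Element[vars, Integers], Apply[Plus, vars] == 50}, Map[0 <= # <= 1 &, vars]]},
   vars, MaxIterations -> 500,
   Method -> {DifferentialEvolution, CrossProbability -> 1/50, SearchPoints -> 100}];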
This sort of usage of NMinimize brings up another topic. As one might notice, the example involving the coin problem is not terribly exciting in and of itself. It is included because it was the first sort of discrete optimization problem we successfully made to work at WRI using differential evolution. Some remarks on the history are in order. In 1999 I had the good fortune to teach a course on nonlinear programming as a visitor in the mathematics department at the University of Illinois. In the class were two graduate students who went on to do work at Wolfram Research, Inc.: Serguei Chebalov and Brett Champion (the official spelling of the former has since changed twice and is now Sergey Shebalov; I'd not venture to guess what it might be by the time this is published). Sergey spent that summer and the next as an intern at WRI working primarily on what would become NMinimize. By the end of that first summer we had the coin example working using a sort of penalty method to enforce integrality. (This is similar in spirit to a method from [Gisvold and Moe 1972]. We had the advantage that we worked with a method that made no requirement of smoothness and hence we were free to use much harsher penalties.) As Sergey wrapped up his work the next summer, Brett joined the company and jumped (fell?) right into the project. We decided after some experimentation that enforcing integrality could better be accomplished by judicious use of Round in function evaluation. A year or so later we obtained a copy of [Storn 1999] and learned that the inventors of differential evolution had had much the same experience with this manner of discrete optimization (and in the book containing that reference some of the immediately following chapters discuss examples where differential evolution works well with discrete and mixed optimization problems). That said, in private correspondence the inventors were mildly (and pleasantly) surprised that we had, with modest success, applied differential evolution to nontrivial examples that are entirely discrete and in some cases combinatorial in nature.
11. References
[Boneh 1997] D. Boneh. Private communication, 1997.
[Champion 2002] B. Champion. Numerical Optimization in Mathematica: An Insider's View of NMinimize. In SCI2002, Proceedings of the 6th World Multiconference on Systemics, Cybernetics, and Informatics, Volume 16, N. Callaos, T. Ebisuzaki, B. Starr, J. M. Abe, D. Lichtblau, eds., pages 136-140. International Institute of Informatics and Systemics, 2002. A Mathematica notebook version may be found at http://library.wolfram.com/infocenter/Conferences/4311/ and up to date documentation regarding NMinimize at http://documents.wolfram.com/v5/Built-inFunctions/AdvancedDocumentation/Optimization/NMinimize/
[Cohen 1993] H. Cohen. A Course in Computational Algebraic Number Theory. Graduate Texts in Mathematics 138. Springer-Verlag, 1993.
[von zur Gathen and Gerhard 1999] J. von zur Gathen and J. Gerhard. Modern Computer Algebra. Cambridge University Press, 1999.
[Gisvold and Moe 1972] K. M. Gisvold and J. Moe. A method for nonlinear mixed-integer programming and its application to design problems. Journal of Engineering for Industry, pages 353-364, 1972.
[van Hoeij 2002] M. van Hoeij. Factoring polynomials and the knapsack problem. Journal of Number Theory 95:167-181, 2002.
[Kapadia 2003] D. Kapadia. Integer Programming with Groebner Bases. Mathematica demo notebook, 2003. Available electronically at http://library.wolfram.com/infocenter/Demos/4825/
[Keith 1998] M. Keith. Determination of all Keith Numbers up to 10^19. Electronic manuscript, 1998. Available electronically at http://users.aol.com/s6sj7gt/keithnum.htm ; see also http://users.aol.com/s6sj7gt/mikekeit.htm
[Lenstra 1984] A. K. Lenstra. Polynomial factorization by root approximation. In EUROSAM 84, Proceedings of the International Symposium on Symbolic and Algebraic Computation, Lecture Notes in Computer Science 174, pages 272-276. Springer, 1984.
[Lenstra, Lenstra, and Lovász 1982] A. Lenstra, H. Lenstra, and L. Lovász. Factoring polynomials with rational coefficients. Mathematische Annalen 261:515-534, 1982.
[Lichtblau 2000] D. Lichtblau. Solving finite algebraic systems using numeric Gröbner bases and eigenvalues. In SCI2000, Proceedings of the World Conference on Systemics, Cybernetics, and Informatics, Volume 10 (Concepts and Applications of Systemics, Cybernetics, and Informatics), M. Torres, J. Molero, Y. Kurihara, and A. David, eds., pages 555-560. International Institute of Informatics and Systemics, 2000.
[Lichtblau 2002a] D. Lichtblau. "Re: need a function for sums of subsets", Usenet news group comp.soft-sys.math.mathematica communication, 2002. Available electronically at http://library.wolfram.com/mathgroup/archive/2002/Feb/msg00410.html and http://groups.google.com/groups?q=+%22Re:+need+a+function+for+sums+of+subsets%22+group:comp.soft-sys.math.mathematica&hl=en&lr=&ie=UTF-8&group=comp.soft-sys.math.mathematica&safe=off&selm=a5flu1%24e10%241%40smc.vnet.net&rnum=2&filter=0
[Lichtblau 2002b] D. Lichtblau. Discrete optimization using Mathematica. In SCI2002, Proceedings of the World Conference on Systemics, Cybernetics, and Informatics, Volume 16, N. Callaos, T. Ebisuzaki, B. Starr, J. M. Abe, D. Lichtblau, eds., pages 136-140. International Institute of Informatics and Systemics, 2002. A Mathematica notebook version may be found at http://library.wolfram.com/infocenter/Conferences/4317/
[Lichtblau 2003] D. Lichtblau. Revisiting strong Gröbner bases over Euclidean domains. Submitted, 2003.
[Matthews 2001] K. R. Matthews. Short solutions of A X=B using a LLL-based Hermite normal form algorithm. Manuscript, 2001.
[McCluskey 1956] E. J. McCluskey, Jr. Minimization of boolean functions. Bell System Technical Journal 35:1417-1444, 1956.
[Nguyen 1999] P. Nguyen. Cryptanalysis of the Goldreich-Goldwasser-Halevi cryptosystem from Crypto '97. Advances in Cryptology, Proceedings of CRYPTO 1999, Santa Barbara, CA, 1999. Available electronically at http://www.di.ens.fr/~pnguyen/pub.html#Ng99
[Price and Storn 1997] K. Price and R. Storn. Differential evolution. Dr. Dobb's Journal, April 1997, pages 18-24, 78.
[Rasmusson 2003] L. Rasmusson. "Re: Mathematica commands needed to solve problem in Set Theory!", Usenet news group comp.soft-sys.math.mathematica, 2003. Available electronically at http://forums.wolfram.com/mathgroup/archive/2003/Sep/msg00258.html and http://groups.google.com/groups?q=Rasmusson+group:comp.soft-sys.math.mathematica&hl=en&lr=&ie=UTF-8&group=comp.soft-sys.math.mathematica&safe=off&selm=bl3klo%24eqa%241%40smc.vnet.net&rnum=2 . The original statement of the problem is from Georgia College & State University BITS & BYTES 3:3, February 1998: http://www.gcsu.edu/acad_affairs/coll_artsci/mathcomp_sci/bits/V3N3Feb98.html
[Reinholtz 2003] K. Reinholtz. "Re: Notebook for low density subset sum?", Usenet news group comp.soft-sys.math.mathematica, 2003. Available electronically at http://forums.wolfram.com/mathgroup/archive/2003/Apr/msg00409.html and http://groups.google.com/groups?q=Reinholtz+group:comp.soft-sys.math.mathematica&start=10&hl=en&lr=&ie=UTF-8&group=comp.soft-sys.math.mathematica&safe=off&selm=b7gemn%24dkv%241%40smc.vnet.net&rnum=11
[Schnorr 1993] C. P. Schnorr. Factoring integers and computing discrete logarithms via diophantine approximation. Advances in Cryptology, Eurocrypt '91, Lecture Notes in Computer Science 547, pages 171-182. Springer-Verlag, 1993.
[Schnorr and Euchner 1991] C. P. Schnorr and M. Euchner. Lattice basis reduction: improved practical algorithms and solving subset sum problems. In Proceedings of the 8th International Conference on Fundamentals of Computation Theory, L. Budach, ed., Lecture Notes in Computer Science 529, pages 68-85. Springer-Verlag, 1991.
[Storjohann 1996] A. Storjohann. Faster algorithms for integer lattice basis reduction. Technical Report 249, Departement Informatik, ETH Zürich, 1996.
[Storn 1999] R. Storn. An introduction to differential evolution. Chapter 6 (pages 79-108) of New Ideas in Optimization, D. Corne, M. Dorigo, and F. Glover, eds. Advanced Topics in Computer Science Series, McGraw-Hill, 1999.
[Wolfram 2003] S. Wolfram. The Mathematica Book (5th edition). Wolfram Media, 2003.
Cite this as: Daniel Lichtblau, "Solving Knapsack Problems with Mathematica" from the Notebook Archive (2006), https://notebookarchive.org/2018-07-6z255zh