Data structures and efficient algorithms in Mathematica
Author
Daniel Lichtblau
Title
Data structures and efficient algorithms in Mathematica
Description
Data structures and efficient algorithms in Mathematica
Category
Academic Articles & Supplements
Keywords
URL
http://www.notebookarchive.org/2018-07-6z257dd/
DOI
https://notebookarchive.org/2018-07-6z257dd
Date Added
2018-07-15
Date Last Modified
2018-07-15
File Size
168.62 kilobytes
Supplements
Rights
Redistribution rights reserved



Data structures and efficient algorithms in Mathematica
Daniel Lichtblau
Wolfram Research, Inc.
October, 1999
Abstract
Certain data structures and algorithms for their manipulation pervade computer science. It seems that many of these are not commonly encountered in Mathematica programming. In this talk we show how many data structures may be implemented and used effectively, giving simple Mathematica examples.
Initialization
Off[General::spell];Off[General::spell1];
Introduction
We will discuss definitions (briefly) of sparse arrays, stacks, queues, trees (especially the binary flavor), and bitvectors. We will show how these may be implemented in Mathematica and will give simple but effective examples of each in use. In many cases we will also see, by way of contrast, equivalent simple code that is ineffective on sizable problems precisely because it ignores these basic techniques; think of this as a sort of morality play.
One will observe that some examples in subsequent sections make use of common techniques. For example, one sparse array example also uses a stack, and the queue implementation uses sparse arrays. This gives further credence to the claim that these simple methods are in some sense fundamental.
References for the techniques presented herein include:
(1980) Data Structure Techniques, Standish, Thomas. Addison Wesley.
(1974) The Design and Analysis of Computer Algorithms, Aho, Alfred, Hopcroft, John, Ullman, Jeffrey. Addison Wesley.
Tables (lists)
The villain in many examples below is the garden-variety Mathematica list. This is of course unfair. Lists are the most useful of data structures in Mathematica. They are actually fixed-length vectors (often called vectors or arrays elsewhere in computer science, as distinct from what are called linked lists), and this is their chief drawback. Specifically, one cannot extend them efficiently. Indeed, much of this notebook is devoted to examples where extensible data structures are a must. That said, applications for which a fixed size table may be used are typically the most efficient.
There are several reasons for this. First, element retrieval is constant time. Second, all functional programming in Mathematica is geared toward list structures. Third, when elements may all be represented by machine numbers of the same type, Mathematica can use packed arrays internally, which are both less memory-consumptive and faster for element access and other operations. Indeed, for many purposes one can use the Compile function, getting still further improvement in speed.
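One can observe packing directly with the functions in the Developer` context (available as of version 4). The sketch below is ours, not part of the talk, but uses only documented functions.

```mathematica
(* A table of machine reals is stored as a packed array. *)
packed = Table[N[j], {j, 10^5}];
Developer`PackedArrayQ[packed]            (* True *)

(* The same data stored unpacked occupies considerably more memory. *)
unpacked = Developer`FromPackedArray[packed];
ByteCount[packed] < ByteCount[unpacked]   (* True *)
```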
To give some indication of relative speed of building a fixed size table vs. a nested list, we show simple examples below.
Timing[t1=Table[N[j],{j,10^6}];]
t2={};
Timing[Do[t2={t2,N[j]},{j,10^6}]]
Timing[t3=Flatten[t2];]
t3===t1
{1.02Second,Null}
{21.01Second,Null}
{2.37Second,Null}
True
While it is clear that tables are faster to build for such an example, one can show that nested list construction at least has the appropriate linear complexity, hence is not a bad alternative for applications where fixed-length tables are inappropriate. We will see an example of this sort presently. It should also be noted that much of the speed advantage of Table in the above example is due to use of packed arrays. For, say, tables of symbolic expressions, this is greatly diminished.
Timing[t1=Table[a[j],{j,10^5}];]
t2={};
Timing[Do[t2={t2,a[j]},{j,10^5}]]
Timing[t3=Flatten[t2];]
t3===t1
{0.99Second,Null}
{1.66Second,Null}
{0.49Second,Null}
True
Sparse arrays (hash tables)
Sparse arrays are quite simple, and are among the most heavily used of data structures in Mathematica (often without giving that name to them). One uses them to store values associated to "indices" attached to a given "head." But this head need not be an atom, and, unlike Mathematica part specifications, the indices need not be positive integers. When left-hand-sides of such indices are free of patterns, Mathematica will compute what is called a "hash value" for them; think of these as positive integers that can be used as an index into an actual table (more on this below).
Below we use the symbol a as a head for a sparse array defined for three particular indices.
Clear[a]
a[4][5]=3;
a["string index",Pi]={y7,1};
a[{1,2,q}]=xyz;
??a
"Global`a"
One might (and frequently will) choose to view such definitions as something other than a sparse array. For example, one often defines Mathematica functions in this way. But most often function definitions will rely on Mathematica patterns, and this disqualifies them from being sparse arrays in an important respect. Specifically, it is only families of associated values free of patterns that may be hashed. This is both because pattern matching relies heavily on technology that is intrinsically non-hashable and because the rule ordering necessary for pattern matching precludes look-up based on hashed values.
What is the benefit of these sparse arrays? It all boils down to efficiency. The Mathematica kernel will compute what is called a "hash value" for each left-hand-side, and use that to place it into an internal data structure with associated right-hand-value. While multiple left-hand-sides might have the same hash value and hence be placed in the same bin, the Mathematica kernel checks that bins do not grow too large, rehashing to a larger space when need be. The upshot is that if we assume the computation of a hash value is constant time, then:
(i) Adding new elements to such a table is, on average, constant time.
(ii) Element look-up is, on average, constant time.
Hence we may use such arrays to store values and retrieve them quickly. Below we show simple implementations of a union that does not sort its elements. One uses Mathematica lists and the other uses a sparse array.
unionNoSort1[ll__List]:=Module[{all,result={}},
  all=Join[ll];
  Do[If[!MemberQ[result,all[[j]]],AppendTo[result,all[[j]]]],
    {j,Length[all]}];
  result]
unionNoSort2[ll__List]:=Module[{htable,newlist,all,result},
  result=newlist[];
  all=Join[ll];
  Do[If[!TrueQ[htable[all[[j]]]],
      htable[all[[j]]]=True;
      result=newlist[result,all[[j]]]],
    {j,Length[all]}];
  result=Apply[List,Flatten[result,Infinity,newlist]];
  Clear[htable];
  result]
First we check that these give the same results.
uni1=unionNoSort1[Range[50],Reverse[Range[100]],{300,400,399}];
uni2=unionNoSort2[Range[50],Reverse[Range[100]],{300,400,399}]
uni1===uni2
{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,100,99,98,97,96,95,94,93,92,91,90,89,88,87,86,85,84,83,82,81,80,79,78,77,76,75,74,73,72,71,70,69,68,67,66,65,64,63,62,61,60,59,58,57,56,55,54,53,52,51,300,400,399}
True
Now we check speed and algorithmic complexity.
Timing[u1=unionNoSort1[Range[1000],Reverse[Range[1100]]];]
Timing[u2=unionNoSort1[Range[2000],Reverse[Range[2200]]];]
Timing[u1b=unionNoSort2[Range[10000],Reverse[Range[11000]]];]
Timing[u2b=unionNoSort2[Range[20000],Reverse[Range[22000]]];]
It is quite clear that unionNoSort1 has quadratic complexity whereas unionNoSort2 has only linear complexity.
There are two further observations to be made about this implementation of unsorted union. First, the semantics are not entirely the same as in Union, because the latter will test equivalence using SameQ whereas unionNoSort2 relies on hash look-up. Second, in addition to hashing we make rudimentary use of a sort of "stack" by nesting the result we construct. We will cover this in more detail in the next section.
Application: sparse sets
One might use sparse arrays to represent sets with cardinality relatively small by comparison to the universe of allowable elements. We show a simple implementation of such sets, including functions to find unions, intersections and complements of pairs of such sets.
setLength[_]:=0
containsElement[set_,elem_]:=Head[set[elem]]=!=set
addElement[set_,elem_]:=If[Head[set[elem]]===set,set[elem]=elem;setLength[set]++;]
removeElement[set_,elem_]:=If[containsElement[set,elem],set[elem]=.;setLength[set]--;]
displaySet[set_]:=Map[#[[2]]&,DownValues[set]]
setClear[set_]:=(Clear[set];setLength[set]=0;)
setCopy[set1_,set2_]:=Module[{elems=displaySet[set1],j,len},
  setClear[set2];
  len=Length[elems];
  For[j=1,j≤len,j++,addElement[set2,elems[[j]]]];]
setUnion[set1_,set2_,result_]:=Module[{elems,j,len},
  setCopy[set1,result];
  elems=Map[#[[2]]&,DownValues[set2]];
  len=Length[elems];
  For[j=1,j≤len,j++,addElement[result,elems[[j]]]];]
setIntersection[set1_,set2_,result_]:=Module[{elems,j,len},
  setClear[result];
  elems=Map[#[[2]]&,DownValues[set2]];
  len=Length[elems];
  For[j=1,j≤len,j++,
    If[containsElement[set1,elems[[j]]],addElement[result,elems[[j]]]]];]
setComplement[set1_,set2_,result_]:=Module[{elems,j,len},
  setCopy[set1,result];
  elems=Map[#[[2]]&,DownValues[set2]];
  len=Length[elems];
  For[j=1,j≤len,j++,removeElement[result,elems[[j]]]];]
Here we show some simple examples.
SeedRandom[1111]
Timing[Do[addElement[set1,Random[Integer,10000]],{3000}]]
Timing[Do[addElement[set2,Random[Integer,10000]],{3000}]]
setLength[set1]
setLength[set2]
{0.63Second,Null}
{0.65Second,Null}
2598
2595
We will use Mathematica built-in logical operations on the set elements in order to check our set functions for correctness.
elems1=displaySet[set1];
elems2=displaySet[set2];
elemsu=Union[elems1,elems2];
elemsi=Intersection[elems1,elems2];
elemsc=Complement[elems1,elems2];
Timing[setUnion[set1,set2,setu]]
Timing[setIntersection[set1,set2,seti]]
Timing[setComplement[set1,set2,setc]]
Length[elemsu]==setLength[setu]
Length[elemsi]==setLength[seti]
Length[elemsc]==setLength[setc]
{1.62Second,Null}
{0.55Second,Null}
{1.53Second,Null}
True
True
True
It is easy to demonstrate that the average complexity for adding or removing elements is O(1) (that is, constant time), as expected.
SeedRandom[1111]
Timing[Do[addElement[seta3K,Random[Integer,5000]],{3000}]]
Timing[Do[addElement[setb3K,Random[Integer,5000]],{3000}]]
Timing[Do[addElement[seta30K,Random[Integer,50000]],{30000}]]
Timing[Do[addElement[setb30K,Random[Integer,50000]],{30000}]]
{0.68Second,Null}
{0.71Second,Null}
{7.13Second,Null}
{7.38Second,Null}
Timing[setUnion[seta3K,setb3K,r3k]]
Timing[setUnion[seta30K,setb30K,r30k]]
{1.47Second,Null}
{14.88Second,Null}
Stacks
Stacks are data structures with a top, wherein we remove the element most recently placed thereon; the descriptive phrase is "last in, first out," or LIFO for short. Stacks may be represented conveniently in Mathematica as nested lists. We push a new element onto the top by forming the nested list
newstack = {new,oldstack}.
One finds stack uses throughout computer science algorithms. For example they are extremely useful for walking through expression trees. We give a simple example below where we flatten an expression.
First we give simple Mathematica code to implement a stack.
SetAttributes[push,HoldFirst]
push[stack_,elem_]:=stack={elem,stack}
SetAttributes[pop,HoldFirst]
pop[{}]:=Null
pop[stack_]:=With[{elem=First[stack]},stack=Last[stack];elem]
emptyStack[stack_]:=stack=={}
Next we show a stripped-down Mathematica implementation of Flatten. It relies on recursion to flatten sublists and then joins the results.
myFlatten1[xx_List]:=With[{flatparts=Map[myFlatten1,xx]},
  Join[Apply[Sequence,flatparts]]]
myFlatten1[xx_]:={xx}
test={};
Do[test={j,test},{j,250}]
Attempting to do this without an adequate $RecursionLimit setting can be hazardous to the health of your Mathematica session.
myFlatten1[test] (* Kids, don't try this at home! *)
$RecursionLimit::"reclim":"Recursion depth of \!\(256\) exceeded."
$RecursionLimit::"reclim":"Recursion depth of \!\(256\) exceeded."
$RecursionLimit::"reclim":"Recursion depth of \!\(256\) exceeded."
General::"stop":"Further output of \!\($RecursionLimit :: \"reclim\"\) will be suppressed during this calculation."
Join::"heads":"-- Message text not found -- (\!\(List\)) (\!\(Hold\)) (\!\(1\)) (\!\(2\))"
Join::"heads":"-- Message text not found -- (\!\(List\)) (\!\(Hold\)) (\!\(1\)) (\!\(3\))"
Next we show how to implement this without recursion. We initialize the length of our result to zero. We then walk through our expression left-to-right, depth first. Every element that is a list has its sub-elements pushed onto the main stack. Each non-list element gets put onto a secondary stack containing the result elements, and we increment the length of that result. When the main stack is empty we have put all non-list elements onto the secondary stack, so we simply pop off all elements therefrom and put them into a table.
myFlatten2[xx_List]:=Module[{stack1={},len=0,elem,stack2={}},
  Map[push[stack1,#]&,xx];
  While[!emptyStack[stack1],
    elem=pop[stack1];
    If[ListQ[elem],
      Map[push[stack1,#]&,elem],
      push[stack2,elem];len++];
    ];
  Table[pop[stack2],{len}]]
myFlatten2[test]===Flatten[test]
True
test2={{a,b,{c,d},{e,{f,g,h,{i,j}}}},{{k,l},m},n,{{{o,p,{q}},{{r,s},t},u},{v,w}},x};
myFlatten2[test2]===Flatten[test2]
True
It should be mentioned that this simple example illustrates a very common use of stacks, as a tool for algorithms that require walks through tree structures.
Queues
A queue is like a stack except that we now must retain a pointer to the "front" and another to the "back" because we put new elements in one end and remove from the other; this is "first in, first out," or FIFO for short. We add at the front and remove at the back. We require routines to (i) get a new queue, (ii) check the length and in particular check if it is empty (could also have a check-for-full-queue function), (iii) add elements, and (iv) remove elements. One method is to use a fixed size table and have front and back indices that wrap around.
qelems=1;qlen=2;qsize=3;qfront=4;qback=5;
newQueue[len_Integer]/;len>0:={Table[0,{len}],0,len,1,1}
emptyQueue[Q_]:=Q[[qlen]]==0
queueLength[Q_]:=Q[[qlen]]
SetAttributes[enQueue,HoldFirst]
enQueue[Q_,elem_]:=If[Q[[qlen]]==Q[[qsize]],
  Print["Queue is full"],
  Q[[qlen]]++;
  Q[[qelems,Q[[qfront]]]]=elem;
  Q[[qfront]]=Mod[Q[[qfront]]+1,Q[[qsize]],1];]
SetAttributes[deQueue,HoldFirst]
deQueue[Q_]/;emptyQueue[Q]:=Null
deQueue[Q_]:=(Q[[qlen]]--;
  Q[[qback]]=Mod[Q[[qback]]+1,Q[[qsize]],1];
  Q[[qelems,Mod[Q[[qback]]-1,Q[[qsize]],1]]])
Here is a simple example.
myQ=newQueue[7];
Map[enQueue[myQ,#]&,{"Hello Dolly","West Side Story"}];
queueLength[myQ]
show1=deQueue[myQ]
2
"Hello Dolly"
Map[enQueue[myQ,#]&,{"My Fair Lady","Les Miz","Phantom","Fiddler","Mouse Trap","Rent","Sleuth"}];
"Queue is full"
We ran out of space. We'll remove an element and then we can put on that last one.
show2=deQueue[myQ]
"West Side Story"
enQueue[myQ,"Sleuth"]
If one does not want to worry about running out of space, or does not want to preallocate a lot of memory in order to avoid this, then it is convenient to represent queues in Mathematica using a sparse array. We show a way to implement this below. We require a symbol to give to newQueue. It determines where we attach down-values for queue entries. A small subtlety is due to the desire to remove from memory elements no longer on the queue. To this end we keep a storage slot for one element in the queue structure. When we de-queue an element, we temporarily house it in that slot, remove the sparse array reference for that queue member, then return the warehoused value. An alternative would be to use Module and declare a local variable, but this adds overhead in execution time.
newQueue[sym_Symbol]:={Unique[sym],0,Null,0,0}
Clear[queueLength,emptyQueue,enQueue,deQueue];
qptr=1;qlen=2;qelem=3;qfront=4;qback=5;
queueLength[Q_]:=Q[[qlen]]
emptyQueue[Q_]:=Q[[qlen]]==0
SetAttributes[enQueue,HoldFirst]
enQueue[Q_,elem_]:=(Q[[qlen]]++;Q[[qfront]]++;Q[[qptr]][Q[[qfront]]]=elem;)
SetAttributes[deQueue,HoldFirst]
deQueue[Q_]:=(Q[[qlen]]--;
  Q[[qback]]++;
  Q[[qelem]]=Q[[qptr]][Q[[qback]]];
  Q[[qptr]][Q[[qback]]]=.;
  Q[[qelem]])
Let us see how this process works. We repeat the previous example. This time we'll not run out of space.
myQ=newQueue[qq];
Map[enQueue[myQ,#]&,{"Hello Dolly","West Side Story"}];
queueLength[myQ]
show1=deQueue[myQ]
2
"Hello Dolly"
Map[enQueue[myQ,#]&,{"My Fair Lady","Les Miz","Phantom","Fiddler","Mouse Trap","Rent","Sleuth"}];
show2=deQueue[myQ]
queueLength[myQ]
"West Side Story"
7
We can see exactly what is in our queue using the Information function. This is in some sense cheating, because queues do not have a function to return all elements. Regard this as a useful debugging tool to check the correctness of our queue. Notice that we now have no sparse array reference to the original first two queue elements, so we are indeed recycling memory.
One might hybridize these approaches to get the faster access associated with a fixed-size array, combined with the memory management associated with sparse functions. This hybridization can be accomplished by, say, doubling the size of the table every time the queue gets full, copying over the entries already on it. Again we sometimes have an O(n) time for an en-queue operation, but the average is O(1). If so desired, one can get more subtle with memory management and allow for the queue table to shrink by some factor, say 1/2, whenever the fraction of entries falls below some threshold, e.g., 1/3, of the full capacity. This again can help to avoid wasting space in cases of once-large queues that have shrunk considerably. So long as fixed ratios are used it is not hard to show that average time for en-queue and de-queue will be O(1) even though occasionally such an operation is O(n).
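A minimal sketch of this doubling strategy, assuming the {elements, length, size, front, back} layout of the fixed-table queue above; the function names here are our own, and the shrinking refinement is omitted for brevity.

```mathematica
(* A circular-table queue that doubles its table when full. *)
newDQueue[len_Integer/;len>0]:={Table[0,{len}],0,len,1,1}

SetAttributes[enDQueue,HoldFirst]
enDQueue[Q_,elem_]:=Module[{elems,size=Q[[3]],back=Q[[5]]},
  If[Q[[2]]==size, (* full: copy entries, oldest first, into a table twice the size *)
    elems=Table[Q[[1,Mod[back+j-1,size,1]]],{j,size}];
    Q={Join[elems,Table[0,{size}]],size,2 size,size+1,1}];
  Q[[2]]++;
  Q[[1,Q[[4]]]]=elem;
  Q[[4]]=Mod[Q[[4]]+1,Q[[3]],1];]

SetAttributes[deDQueue,HoldFirst]
deDQueue[Q_]/;Q[[2]]==0:=Null
deDQueue[Q_]:=(Q[[2]]--;
  Q[[5]]=Mod[Q[[5]]+1,Q[[3]],1];
  Q[[1,Mod[Q[[5]]-1,Q[[3]],1]]])
```

For example, a queue created with newDQueue[2] silently grows to capacity 4 on the third enDQueue, and deDQueue still returns the oldest element.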
Information[Evaluate[myQ[[qptr]]]]
"Global`qq$13"
The advantage to this second approach is we only use the memory we need, and we do not run out (subject to machine limitations). The disadvantage is that, while complexity of each operation is constant time, the constants can be high for adding and removing elements. This is because hash table look-up is happening behind the scenes, and this is a lot slower than simple array indexing. Also note that since sparse arrays are implemented using hashing, occasionally adding elements will be O(n) because crowding in the table can force it to double in size. Thus it is the average time for adding that is constant.
Here is an application of queueing. Suppose we take the stack-based flattening code and everywhere replace stack uses with their queue analogues. What might be the result?
funnyFlatten[xx_List]:=Module[{q1=newQueue[s1],len=0,elem,q2=newQueue[s2]},
  Map[enQueue[q1,#]&,xx];
  While[!emptyQueue[q1],
    elem=deQueue[q1];
    If[ListQ[elem],
      Map[enQueue[q1,#]&,elem],
      enQueue[q2,elem];len++];
    ];
  Table[deQueue[q2],{len}]]
funnyFlatten[test2]
{n,x,a,b,m,c,d,e,k,l,u,v,w,f,g,h,o,p,t,i,j,q,r,s}
One checks that in addition to flattening, we have reordered the non-list elements based on nesting depth: the queue gives a breadth-first walk through the expression tree, where the stack gave a depth-first walk. For comparison, here is a functional programming approach to the same problem.
e1=MapIndexed[C,test2,{-1}]
e2=Cases[e1,_C,Infinity]
e3=Sort[e2,(Length[#1[[2]]]≤Length[#2[[2]]])&];
Map[First,e3]
{n,x,a,b,m,c,d,e,k,l,u,v,w,f,g,h,o,p,t,i,j,q,r,s}
It unfortunately only works if the elements in the (nested) lists are atomic. It is a more challenging problem to find a clever functional program that does not have this limitation.
While queues are of interest in their own right, it is important to realize that the method of implementation in this section is quite general. One can use similar structures, implemented as sparse arrays with other information e.g. length also available, for a variety of purposes. For example, sets that can evolve over time, with members being added, removed, or merged in from other sets, can be handled in this way.
Trees
Among the most pervasive data structures in computer science are trees. A tree consists of a root containing some atomic value, and children which are themselves either atomic values or subtrees. For example, the expression (a+b)*c*Sin[x] can be regarded as a tree with Times as its root and children that are respectively the subtree Plus[a,b], the atom c, and the subtree Sin[x] .
Trees are used in a variety of algorithms both for reasons of efficiency and because many data abstractions are best modelled via trees. Indeed, all Mathematica expressions fall into the class of structures known as "directed acyclic graphs" (DAGs), and these are a small generalization of trees.
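One can inspect the tree structure of the example expression directly; this small demonstration is ours, using only built-in functions.

```mathematica
(* The expression from the text, viewed as a tree: the head Times is
   the root, and its arguments are the children. *)
expr = (a + b) c Sin[x];
FullForm[expr]   (* shows Times with Plus[a, b], c, and Sin[x] as children *)
Depth[expr]      (* 3: the atoms a, b, x sit at level 2 *)
TreeForm[expr]   (* displays the tree explicitly *)
```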
A particularly simple flavor is known as a binary tree. Each subtree has at most two children, referred to by the terms "left" and "right." Such trees may be used to implement fast average-time sorting algorithms. The simplest example is shown below. We build a tree by placing new elements to the left or right of a given node according to ordering between node and new element. Once all are in the tree we flatten it; the result is a sorted version of the original set. If we have n elements then the average-case complexity is O(n*log(n)). This is because each atom (typically) will be at depth not much more than log(n), hence each of n insertions averages that many steps.
Below we implement an ordered binary tree. We use Mathematica canonical ordering to compare elements. We also use List as our head for each triple {left,val,right}. This effectively precludes use of lists for element values. One can use a different head to get around this drawback, but for our purposes this suffices.
leftsubtree[{left_,_,_}]:=left
rightsubtree[{_,_,right_}]:=right
nodevalue[{_,val_,_}]:=val
emptyTree={};
treeInsert[emptyTree,elem_]:={emptyTree,elem,emptyTree}
treeInsert[tree_,elem_]/;SameQ[nodevalue[tree],elem]:=tree
treeInsert[tree_,elem_]/;OrderedQ[{nodevalue[tree],elem}]:=
  {leftsubtree[tree],nodevalue[tree],treeInsert[rightsubtree[tree],elem]}
treeInsert[tree_,elem_]:=
  {treeInsert[leftsubtree[tree],elem],nodevalue[tree],rightsubtree[tree]}
Now we run a few examples and demonstrate that this does indeed sort the original list of elements. We gather timings to check algorithmic complexity.
len1k=1000;
ee1k=Table[Random[],{len1k}];
ff1k={};
t1k=First[Timing[Do[ff1k=treeInsert[ff1k,ee1k[[j]]],{j,len1k}];]]/Second
gg1k=Flatten[ff1k];
gg1k===Sort[ee1k]
1.54
True
len5k=5000;
ee5k=Table[Random[],{len5k}];
ff5k={};
t5k=First[Timing[Do[ff5k=treeInsert[ff5k,ee5k[[j]]],{j,len5k}];]]/Second
gg5k=Flatten[ff5k];
gg5k===Sort[ee5k]
11.46
True
len10k=10000;
ee10k=Table[Random[],{len10k}];
ff10k={};
t10k=First[Timing[Do[ff10k=treeInsert[ff10k,ee10k[[j]]],{j,len10k}];]]/Second
gg10k=Flatten[ff10k];
gg10k===Sort[ee10k]
24.53
True
With this data we can now see that the algorithmic complexity is roughly as billed, O(n*log(n)).
{(t5k/t1k)/(5000.*Log[5000.]/(1000.*Log[1000.])),
 (t10k/t1k)/(10000.*Log[10000.]/(1000.*Log[1000.])),
 (t10k/t5k)/(10000.*Log[10000.]/(5000.*Log[5000.]))}
{1.20708,1.19464,0.9897}
It should be noted that this sorting algorithm takes a minor shortcut. In ordering elements in a tree node as {left,val,right} we may simply recover the sorted list using Flatten. A different internal node element ordering would not support this. We could more generally use a stack and walk the tree, as was done in the code to emulate Flatten, and this would still give the appropriate algorithmic complexity.
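Such a stack-based walk can be sketched as follows; this is our own illustration, using the push/pop routines from the Stacks section and the tree accessors above, with the result accumulated as a nested list in the spirit of the earlier examples.

```mathematica
(* In-order traversal of the {left, val, right} trees with an explicit
   stack.  Unlike plain Flatten, this works for any internal node
   ordering; node values must still be non-lists. *)
inorderList[tree_] := Module[{stack = {}, node = tree, result = {}},
  While[node =!= emptyTree || !emptyStack[stack],
    While[node =!= emptyTree,      (* descend along left children *)
      push[stack, node];
      node = leftsubtree[node]];
    node = pop[stack];             (* visit the node, then turn right *)
    result = {result, nodevalue[node]};
    node = rightsubtree[node]];
  Flatten[result]]
```

Applied to the trees built above, e.g. inorderList[ff1k] === Sort[ee1k], this reproduces the sorted list.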
Bitvectors
Sometimes one needs to represent elements of the power set of a given set of n elements, for n not too large. For such purposes one may represent elements of the power set by integers in the range from 0 to 2^n - 1. A value m in this range can represent a given subset by its binary string, where 1's stand for those elements present in the subset.
For example, say we have a set of 100 elements. We will use integers in the range from 0 to 2^100 - 1 to represent subsets, and take a random subset of, say, 128 elements in the power set (that is, a subset whose elements are subsets of the original set). Actually, there is nonzero probability that my subset contains strictly fewer than 128 elements, but this is still okay for the examples below.
k=100;
len=128;
subs=Union[Table[Random[Integer,{0,2^k-1}],{len}]];
A reasonable question might be: "How many elements of subs contain the third element of the original set?" To answer this we represent the elements of the original set as "bit masks", that is, integer powers of two.
masks=Table[2^m,{m,0,k-1}];
k=.
So the third element of the set is masks[[3]], which is 2^2, or 4. We form the function that takes an integer and computes its bitwise AND with masks[[3]]. We Map this function over the subset list subs, then count the number of elements in the result that are positive. The bitwise operation has masked out all but the third binary digit of each member of subs, so the positive values resulting (all will be 4) are from those members that have a third binary digit set.
func=BitAnd[masks[[3]],#]&;
Count[Map[func,subs],_?Positive]
70
Okay, easy enough. How about "How many members of subs contain the sixth and eleventh elements of the original set, but not the fourteenth?" (one might expect on average that this would be about an eighth of the number of elements in subs). Same idea.
func2=BitAnd[masks[[6]]+masks[[11]]+masks[[14]],#]&;
criterion=(#==masks[[6]]+masks[[11]])&;
Count[Map[func2,subs],_?criterion]
11
What about "How many elements contain at least two from among the fourth, tenth, twenty-eighth, and fifty-first elements of the original set?" As before, we first mask appropriately.
func3=BitAnd[masks[[4]]+masks[[10]]+masks[[28]]+masks[[51]],#]&;
Now we simply Count all those masked values whose binary DigitCount is at least two.
Count[Map[func3,subs],_?(DigitCount[#,2,1]>=2&)]
79
It is a simple matter to perform the basic set operations on sets represented as bitvectors. Set union is obtained with BitOr, intersection with BitAnd, and complement is only a little more complicated.
bitUnion[a_,b_]:=BitOr[a,b]
bitIntersection[a_,b_]:=BitAnd[a,b]
bitComplement[a_,b_]:=BitAnd[a,BitNot[b]]
Application: working with a matrix modulo 2
For some number-theoretic algorithms such as factoring integers, one wants to find relations among a set of vectors modulo 2. Specifically, we suppose we have a matrix (regard it as a set of row vectors) where all elements are zeros and ones. We want to find elements of the null space. A simple way to get a generating set of this space is via NullSpace[Transpose[mat], Modulus->2].
One problem is that even with packed arrays we still require four bytes per entry. Clearly we can pack bits tighter than that! Moreover it is not hard to see that we can avoid some work if we are willing to settle for, say, just one null vector. To this end, another method for finding these null vectors will prove useful. We augment on the right of the matrix by an identity matrix and then row reduce modulo 2. The values in the augmented columns record the operations that were done. Hence any row that has zeroes in the original columns tells us how to obtain a null vector using the information in the augmented columns. If any row suits the purpose then the last one must do so, because pivots are in increasing columns. So we simply look at that last column. Code to do this is as follows. First we set up an example. It is "typical" for the type of application mentioned in that it is approximately square.
len=400;
ncols=390;
SeedRandom[1111];
mat=Table[Random[Integer],{len},{ncols}];
augmat=Transpose[Join[Transpose[mat],IdentityMatrix[len]]];
Now we do the augmentation and row reduction. We ascertain that the original columns in the last row all have zeroes, and that the augmented part of that row gives a null vector.
Timing[rows=RowReduce[augmat,Modulus->2];]
first=Take[rows[[len]],ncols];
Max[first]
last=Drop[rows[[len]],ncols];
Max[Mod[last.mat,2]]
{17.8Second,Null}
0
0
Let us compare to finding a null space directly.
Timing[nulls=NullSpace[Transpose[mat],Modulus->2];]
Max[Mod[nulls[[1]].mat,2]]
{5.89Second,Null}
0
Using NullSpace was a lot faster, by a factor of about three, which is no surprise if you think about the underlying Gaussian elimination that is done. Thus far we have neither used bit vectors nor seen an advantage to the augmented-matrix row reduction method. Here it is: we can treat each row as a bit vector, thus saving a lot of space. As we are doing Gaussian elimination modulo 2, the only operation consists of adding vectors to one another elementwise. This is easily seen to be equivalent to bitwise exclusive-or. A second area for speed improvement is that we need not do full row reduction, but only triangulate until we have a row of zeroes in the original columns (unless one wants more independent generators of the null space).
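The equivalence of elementwise addition modulo 2 and bitwise exclusive-or is easy to check on a small example; the vectors below are arbitrary.

```mathematica
(* Add two 0-1 vectors mod 2, then do the same computation on their
   packed bitvector representations via BitXor. *)
v1 = {1, 0, 1, 1, 0, 1};
v2 = {0, 1, 1, 0, 0, 1};
Mod[v1 + v2, 2]                                     (* {1, 1, 0, 1, 0, 0} *)
IntegerDigits[BitXor[FromDigits[v1, 2], FromDigits[v2, 2]], 2, 6]
                                                    (* {1, 1, 0, 1, 0, 0} *)
```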
Below is code to implement this matrix triangulation method using bitvectors.
getMaxPos[mat_,j_]:=Module[
  {big=Developer`BitLength[mat[[j]]],indx=j,len=Length[mat],nbits},
  Do[nbits=Developer`BitLength[mat[[k]]];
    If[nbits>big,big=nbits;indx=k],
    {k,j+1,len}];
  indx]
getNullVec[mat_]:=Module[
  {len=Length[mat],new=mat,j,k,indx,nbits,tmp1,tmp2,jth,kth,bigrow,blen,bigblen},
  bigrow=getMaxPos[new,1];
  tmp1=Do[
    If[j≠bigrow,new[[{j,bigrow}]]=new[[{bigrow,j}]]];
    jth=new[[j]];
    nbits=Developer`BitLength[jth];
    bigblen=0;
    tmp2=Do[
      kth=new[[k]];
      blen=Developer`BitLength[kth];
      If[blen==nbits,
        kth=BitXor[kth,jth];
        blen=Developer`BitLength[kth];
        If[blen≤len,Return[IntegerDigits[kth,2,len]]];
        new[[k]]=kth;];
      If[blen>bigblen,bigrow=k;bigblen=blen],
      {k,j+1,len}];
    If[tmp2=!=Null,Return[tmp2]],
    {j,1,len}];
  tmp1]
We illustrate on the example above. We must first translate the matrix to a list of bitvectors (nonnegative integers).
zerovec=Table[0,{len}];
SeedRandom[1111];
numvec1=Table[
  ii=Table[Random[Integer],{ncols}];
  augvec=zerovec;
  augvec[[j]]=1;
  FromDigits[Join[ii,augvec],2],
  {j,len}];
Now we obtain a null vector for this matrix and check that it is indeed such.
Timing[nullvec=getNullVec[numvec1];]
Max[Mod[nullvec.mat,2]]
{8.05Second,Null}
0
Still slower than using NullSpace directly. But the interesting fact is that the asymptotic complexity is better. If we double the size of the input then the NullSpace or RowReduce methods will take around eight times longer, as they are O(n^3) complexity algorithms. In contrast, for inputs in the size range up to several thousands, the bitvector formulation is only O(n^2). This is because the bitwise operations are sufficiently fast as to be treated as constant speed up to a large size, so in effect other code in the loops dominates. We illustrate by multiplying the size of the example by four.
len=1600;
ncols=1560;
SeedRandom[1111];
mat=Table[Random[Integer],{len},{ncols}];
augmat=Transpose[Join[Transpose[mat],IdentityMatrix[len]]];
zerovec=Table[0,{len}];
SeedRandom[1111];
numvec1=Table[
  ii=Table[Random[Integer],{ncols}];
  augvec=zerovec;
  augvec[[j]]=1;
  FromDigits[Join[ii,augvec],2],
  {j,len}];
Timing[nulls=NullSpace[Transpose[mat],Modulus->2];]
Max[Mod[nulls[[1]].mat,2]]
{341.03Second,Null}
0
Timing[nullvec=getNullVec[numvec1];]
Max[Mod[nullvec.mat,2]]
{131.16Second,Null}
0
At this size the bitvector approach is substantially faster than using NullSpace. A reasonable question is whether one might in effect emulate NullSpace but using bit vectors. This can be done, but in order to be as efficient as possible one would need the ability to check or set a given bit in a bignum in constant time, and this capability is not present in version 4 of Mathematica.
Tree application: A rudimentary parser
In this section we present a not-too-complicated grammar and a random sentence generator for that grammar. Then we reverse this process and "parse" the sentences according to that grammar. Strictly speaking there is no issue of computational complexity in this section. Nonetheless it is shown because it is a cute example of tree construction, because it is generally relevant to computer science (where parsing theory is a common theme), and because more complicated parsers will frequently work in similar manner but require, say, stacks as intermediate data structures. Indeed, more complicated grammars generally require preprocessing that is itself computationally intensive and in need of useful data structures. A good reference for all this is
(1977) Principles of Compiler Design, Aho, Alfred, and Ullman, Jeffrey. Addison Wesley.
(aka "The Dragon Book"). In particular see chapter 5.
First we give our grammar. It is quite capable of producing a variety of nonsense sentences, as we will soon see. With minor modifications one could probably use it to turn out scripts for bad movies. We use Hold to prevent infinite recursion. We also weight the probabilities of "empty" productions in adjective and adverb clauses so that they will not be terribly long.
NIL = "";
sentence := Hold[Or[declarative, interrogative, imperative]]
declarative := {subject, predicatepast}
interrogative := {qverb, subject, predicatepresent}
imperative := {actverb, subject}
subject := Hold[nounclause || {nounclause, prepositionclause}]
nounclause := Hold[{adjectiveclause, noun}]
noun := Or["skyscraper", "ball", "dog", "cow", "shark", "attorney", "hatter", "programmer", "city", "village", "buffalo", "moon", "librarian", "sheep"]
adjectiveclause := {article, adjectivelist}
adjectivelist := Hold[myOr[.6, Hold[tmpadjlist = adjective; NIL],
    Hold[With[{tmp = randomPart[tmpadjlist]},
      tmpadjlist = Complement[tmpadjlist, Unevaluated[Or[tmp]]]; {tmp, adjectivelist}]]]]
article := Or["a", "the", "this", "that"]
adjective := Or["big", "wet", "mad", "hideous", "red", "repugnant", "slimy", "delectable", "mild-mannered", "lazy", "silly", "crazy", "ferocious", "cute"]
tmpadjlist = adjective;
prepositionclause := {preposition, nounclause}
preposition := Or["in", "above", "under", "from", "near", "at", "with"]
predicatepresent := {verbpresent, subject}
predicatepast := {verbclause, subject}
verbclause := Hold[{adverblist, verbpast}]
adverblist := Hold[myOr[.8, Hold[tmpadvlist = adverb; NIL],
    Hold[With[{tmp = randomPart[tmpadvlist]},
      tmpadvlist = Complement[tmpadvlist, Unevaluated[Or[tmp]]]; {tmp, adverblist}]]]]
adverb := Or["swiftly", "unflinchingly", "smugly", "selflessly", "oddly", "mightily"]
tmpadvlist = adverb;
verbpast := Or["ate", "threw", "gnashed", "boiled", "grated", "milked", "spanked", "jumped"]
verbpresent := Or["eat", "throw", "gnash", "boil", "grate", "milk", "spank", "salivate", "jump"]
qverb := Or["did", "will", "could", "should"]
actverb := Or["break", "fix", "launch", "squeeze", "fetch"]
Now generate random sentences from that grammar. We also provide a means to format the results.
randomPart[type_] := Switch[Head[type],
  Hold, randomPart[type[[1]]],
  String, type,
  List, If[Length[type] == 0, "", Flatten[Map[randomPart, type]]],
  myOr, With[{rnd = Random[]}, randomPart[If[rnd < type[[1]], type[[2]], type[[3]]]]],
  Or, randomPart[type[[Random[Integer, {1, Length[type]}]]]]]

randomSentence[] := Apply[sentenceType, randomPart[sentence] /. "" -> Sequence[]]

isQ[type_, a_] := MemberQ[type, a]

Format[sentence_sentenceType] := Module[{word = First[sentence], words, punc},
  words = Map[StringJoin[#, " "] &, sentence];
  punc = If[isQ[qverb, word], "?", If[isQ[actverb, word], "!", "."]];
  words[[Length[words]]] = StringReplacePart[Last[words], punc, -1];
  words[[1]] = StringReplacePart[First[words], ToUpperCase[StringTake[First[words], 1]], 1];
  Apply[StringJoin, words]]
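The core of the generator, recursive expansion with a weighted "empty" production that keeps adjective clauses short, can be sketched in Python as follows. The grammar here is a tiny illustrative fragment of our own (the names and word lists are not the full grammar above), and the 0.6 early-stop probability mirrors the `myOr[.6, ...]` weighting.

```python
import random

# Toy weighted grammar: nonterminals map to lists of productions.
# "adjlist" is handled specially, stopping with probability 0.6,
# which mirrors myOr[.6, ...] in the Mathematica grammar.
grammar = {
    "sentence":   [["nounclause", "verb", "nounclause"]],
    "nounclause": [["article", "adjlist", "noun"]],
    "article":    [["a"], ["the"]],
    "adjective":  [["big"], ["wet"], ["mad"]],
    "noun":       [["dog"], ["cow"], ["shark"]],
    "verb":       [["ate"], ["milked"]],
}

def expand(symbol):
    """Recursively expand a grammar symbol into a list of words."""
    if symbol == "adjlist":               # weighted "empty" production
        if random.random() < 0.6:
            return []
        return expand("adjective") + expand("adjlist")
    if symbol not in grammar:             # terminal word
        return [symbol]
    words = []
    for s in random.choice(grammar[symbol]):
        words += expand(s)
    return words

print(" ".join(expand("sentence")))
```

Without the weighting, the recursive `adjlist` production would have no bias toward termination and could produce arbitrarily long (if eventually finite) adjective runs.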
Here is a set of random sentences. Note that our random generator is rigged to go slightly beyond strict syntax, in that adjectives and adverbs are never reused within the same clause.
Table[randomSentence[],{20}]//TableForm
"A red shark grated a sheep above this crazy sheep."
"That wet hideous mad red hatter gnashed the delectable village in that cow."
"Fix that shark in this red big ball!"
"Launch the village!"
"Could the repugnant ball salivate that cow from this wet delectable buffalo?"
"Did the librarian grate the slimy hatter?"
"Break a slimy village from that repugnant village!"
"The hideous librarian jumped a programmer at a ball."
"Squeeze the programmer in that city!"
"Could the mad librarian with the silly city salivate the ball?"
"Did the attorney jump that ball?"
"A attorney at this attorney milked a village."
"Will the wet lazy hideous mild-mannered mad red attorney near a lazy programmer eat a dog?"
"This city threw that librarian from a delectable slimy village."
"Squeeze the attorney!"
"Fix the crazy programmer under the red crazy repugnant wet silly cow!"
"Will a red librarian milk that big moon?"
"A city with the hideous wet big city ate a sheep."
"Fix this mad librarian in a programmer!"
"Should that shark under that red crazy skyscraper salivate this city near that lazy silly hatter?"
Now for the parser. We will break down each sentence part and indicate this by encapsulating it in a special head. When we encounter empty subexpressions, e.g. adjective lists with no adjectives, we do not return them; for many purposes this is acceptable, and of course one could easily modify the parsing code below to indicate where these lists would be. Note that this parser assumes the input is valid, and makes no attempt to detect errors.
getNextWord[{}] := ""
getNextWord[a_] := First[a]

atomParse[hed_, a_] := {hed[getNextWord[a]], 1}

sentenceParse[a_sentenceType] := With[{b = Apply[List, a]},
  If[isQ[qverb, First[b]], interrogativeParse[b],
    If[isQ[actverb, First[b]], imperativeParse[b], declarativeParse[b]]]]

declarativeParse[a_] := Module[{b = subjectParse[a], c},
  c = predicatepastParse[Drop[a, b[[2]]]];
  "DECLARATIVE SENTENCE"[b[[1]], c[[1]]]]

interrogativeParse[a_] := Module[{b = atomParse["QUESTION VERB", a], c, d},
  c = subjectParse[Drop[a, b[[2]]]];
  d = predicatepresentParse[Drop[a, b[[2]] + c[[2]]]];
  "INTERROGATIVE SENTENCE"[b[[1]], c[[1]], d[[1]]]]

imperativeParse[a_] := Module[{b = atomParse["ACTION VERB", a], c},
  c = subjectParse[Drop[a, b[[2]]]];
  "IMPERATIVE SENTENCE"[b[[1]], c[[1]]]]

subjectParse[a_] := Module[{b = nounclauseParse[a], c},
  c = Drop[a, b[[2]]];
  If[!isQ[preposition, getNextWord[c]],
    {"SUBJECT"[b[[1]]], b[[2]]},
    c = prepositionclauseParse[c];
    {"SUBJECT"[b[[1]], c[[1]]], b[[2]] + c[[2]]}]]

predicatepastParse[a_] := Module[{b = verbclauseParse[a], c},
  c = subjectParse[Drop[a, b[[2]]]];
  {"PREDICATE"[b[[1]], c[[1]]], b[[2]] + c[[2]]}]

predicatepresentParse[a_] := Module[{b = atomParse["VERB (PRESENT TENSE)", a], c},
  c = subjectParse[Drop[a, b[[2]]]];
  {"PREDICATE"[b[[1]], c[[1]]], b[[2]] + c[[2]]}]

verbclauseParse[a_] := Module[{b = adverblistParse[a], c},
  c = atomParse["VERB (PAST TENSE)", Drop[a, b[[2]]]];
  If[b[[2]] == 0, c, {"VERB CLAUSE"[b[[1]], c[[1]]], b[[2]] + c[[2]]}]]

nounclauseParse[a_] := Module[{b = adjectiveclauseParse[a], c},
  c = atomParse["NOUN", Drop[a, b[[2]]]];
  {"NOUN CLAUSE"[b[[1]], c[[1]]], b[[2]] + c[[2]]}]

adjectiveclauseParse[a_] := Module[{b = atomParse["ARTICLE", a], c},
  c = adjectivelistParse[Drop[a, b[[2]]]];
  If[c[[2]] == 0, b, {"ADJECTIVE CLAUSE"[b[[1]], c[[1]]], b[[2]] + c[[2]]}]]

adjectivelistParse[a_] := Module[{b = a, c, result, len = 0},
  result = "ADJECTIVE LIST"[];
  While[isQ[adjective, getNextWord[b]],
    c = atomParse["ADJECTIVE", b];
    len += c[[2]];
    result = "ADJECTIVE LIST"[result, c[[1]]];
    b = Drop[b, c[[2]]]];
  {Flatten[result, Infinity, "ADJECTIVE LIST"], len}]

prepositionclauseParse[a_] := Module[{b = atomParse["PREPOSITION", a], c},
  c = nounclauseParse[Drop[a, b[[2]]]];
  {"PREPOSITION CLAUSE"[b[[1]], c[[1]]], b[[2]] + c[[2]]}]

adverblistParse[a_] := Module[{b = a, c, result, len = 0},
  result = "ADVERB LIST"[];
  While[isQ[adverb, getNextWord[b]],
    c = atomParse["ADVERB", b];
    len += c[[2]];
    result = "ADVERB LIST"[result, c[[1]]];
    b = Drop[b, c[[2]]]];
  {Flatten[result, Infinity, "ADVERB LIST"], len}]
Table[sentenceParse[bb[j]=randomSentence[]],{j,5}]
{"DECLARATIVE SENTENCE"["SUBJECT"["NOUN CLAUSE"["ADJECTIVE CLAUSE"["ARTICLE"["a"],"ADJECTIVE LIST"["ADJECTIVE"["mild-mannered"],"ADJECTIVE"["mad"]]],"NOUN"["sheep"]],"PREPOSITION CLAUSE"["PREPOSITION"["from"],"NOUN CLAUSE"["ARTICLE"["this"],"NOUN"["village"]]]],"PREDICATE"["VERB (PAST TENSE)"["grated"],"SUBJECT"["NOUN CLAUSE"["ARTICLE"["the"],"NOUN"["librarian"]]]]],"INTERROGATIVE SENTENCE"["QUESTION VERB"["did"],"SUBJECT"["NOUN CLAUSE"["ARTICLE"["this"],"NOUN"["attorney"]]],"PREDICATE"["VERB (PRESENT TENSE)"["throw"],"SUBJECT"["NOUN CLAUSE"["ADJECTIVE CLAUSE"["ARTICLE"["a"],"ADJECTIVE LIST"["ADJECTIVE"["wet"],"ADJECTIVE"["slimy"]]],"NOUN"["programmer"]]]]],"IMPERATIVE SENTENCE"["ACTION VERB"["squeeze"],"SUBJECT"["NOUN CLAUSE"["ADJECTIVE CLAUSE"["ARTICLE"["that"],"ADJECTIVE LIST"["ADJECTIVE"["slimy"]]],"NOUN"["attorney"]]]],"DECLARATIVE SENTENCE"["SUBJECT"["NOUN CLAUSE"["ARTICLE"["this"],"NOUN"["village"]],"PREPOSITION CLAUSE"["PREPOSITION"["in"],"NOUN CLAUSE"["ARTICLE"["a"],"NOUN"["village"]]]],"PREDICATE"["VERB (PAST TENSE)"["ate"],"SUBJECT"["NOUN CLAUSE"["ADJECTIVE CLAUSE"["ARTICLE"["that"],"ADJECTIVE LIST"["ADJECTIVE"["delectable"],"ADJECTIVE"["lazy"]]],"NOUN"["moon"]]]]],"DECLARATIVE SENTENCE"["SUBJECT"["NOUN CLAUSE"["ADJECTIVE CLAUSE"["ARTICLE"["this"],"ADJECTIVE LIST"["ADJECTIVE"["crazy"]]],"NOUN"["shark"]],"PREPOSITION CLAUSE"["PREPOSITION"["under"],"NOUN CLAUSE"["ARTICLE"["the"],"NOUN"["hatter"]]]],"PREDICATE"["VERB (PAST TENSE)"["jumped"],"SUBJECT"["NOUN CLAUSE"["ARTICLE"["this"],"NOUN"["buffalo"]],"PREPOSITION CLAUSE"["PREPOSITION"["with"],"NOUN CLAUSE"["ADJECTIVE CLAUSE"["ARTICLE"["this"],"ADJECTIVE LIST"["ADJECTIVE"["repugnant"]]],"NOUN"["ball"]]]]]]}
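The recursive-descent protocol above, where each parse function returns a pair of the subtree built so far and the number of words consumed, can be sketched in Python. This is an illustrative fragment of our own (a much smaller grammar, and the function names are not from the code above); like the article's parser, it assumes the input is valid.

```python
# Each parse function returns (tree, words_consumed), mirroring the
# {tree, count} pairs in the Mathematica parser; a caller drops the
# consumed words before invoking the next parse function.

ADJECTIVES = {"big", "wet", "mad", "slimy"}

def parse_adjlist(words):
    """Consume a (possibly empty) run of adjectives."""
    adjs, n = [], 0
    while n < len(words) and words[n] in ADJECTIVES:
        adjs.append(("ADJECTIVE", words[n]))
        n += 1
    return adjs, n

def parse_nounclause(words):
    """article, adjective list, noun."""
    tree, n = [("ARTICLE", words[0])], 1
    adjs, k = parse_adjlist(words[1:])
    tree += adjs
    n += k
    tree.append(("NOUN", words[n]))
    return ("NOUN CLAUSE", tree), n + 1

def parse_sentence(words):
    """noun clause, verb, noun clause."""
    subj, n = parse_nounclause(words)
    verb = ("VERB", words[n])
    obj, _ = parse_nounclause(words[n + 1:])
    return ("SENTENCE", [subj, verb, obj])

tree = parse_sentence("the big wet dog ate a slimy cow".split())
print(tree)
```

The consumed-word counts play the same bookkeeping role as the second components returned by `subjectParse`, `nounclauseParse`, and friends: they tell each caller where in the word list the next constituent begins.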


Cite this as: Daniel Lichtblau, "Data structures and efficient algorithms in Mathematica" from the Notebook Archive (2017), https://notebookarchive.org/2018-07-6z257dd


