Implementation and Analysis of the Todd-Coxeter Algorithm
= ede^-1a^-1da = ea^-1bcbae^-1b^-1a^-1c^-1ab^-1
(a) the comparative behaviour of the three most commonly programmed forms
of the algorithm, and
(b) the behaviour of the algorithm with respect to different classes of defining
relations and different choices of subgroup.
In Section 2, we discuss various forms of the TC algorithm, while Section 3
contains a formal description of the lookahead TC algorithm. Section 4 contains
the results of our investigation of the TC algorithm.
A general account of earlier TC programs, together with some applications,
can be found in Leech [11].
2. Algorithm Design. Suppose
G = gp(g_1, · · · , g_r | R_1(g_1, · · · , g_r) = · · · = R_s(g_1, · · · , g_r) = 1)
and let H = (h_1, · · · , h_t) be a subgroup of G, where the h_i's are words in the g_j's.
If [G : H] = n, the right cosets of H in G will be denoted by the integers 1 to n. The
coset consisting of H itself will always be represented by the integer 1. We shall
write (i)g_j to denote the coset obtained upon multiplying coset i on the right by
generator g_j.
Definition 1. If m is the number of cosets of H currently defined and r is the
number of generators in the given presentation for G, then the partial coset table T
is an m × 2r array whose (i, j)th element is defined as follows. Suppose i is a coset
of H and s_j = g_l or g_l^-1 for some generator g_l. Then, T(i, j) = k, if (i)s_j is known
to be coset k (i.e., if (i)s_j = k), and T(i, j) = 0, if (i)s_j is unknown. Thus, the columns
of T correspond to the generators of G and their inverses, while the rows of T cor-
respond to the currently defined cosets.
In practice, if a generator is an involution, then T contains a single column for
both the generator and its inverse. If 1, · · · , m are precisely the complete set of
cosets of H and no entry of T is zero, then T is called the coset table for H in G.
Since it will be clear from context when we are discussing a partial coset table rather
than a coset table, we shall use the term coset table in both cases.
Let w = s_1 · · · s_k, where s_i = g_j or g_j^-1 for some generator g_j, be a word and
let i be a coset. Assuming that the coset (i)s_1 · · · s_l is defined, the coset (i)s_1 · · · s_l s_{l+1}
is said to be defined if and only if T((i)s_1 · · · s_l, s_{l+1}) is nonzero, in which case
(i)s_1 · · · s_l s_{l+1} = T((i)s_1 · · · s_l, s_{l+1}).
If w is the empty word, then (i)w is defined to be i. Sometimes, we shall indicate
that (i)w is not defined by writing (i)w = 0.
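By way of illustration, the following Python sketch (ours; the flat array layout, the column numbering and the name trace are assumptions, not part of any program described in this paper) computes (i)w from a partial coset table stored row by row, returning 0 as soon as an undefined entry is met.

    # Illustrative sketch only; names and data layout are assumptions.
    # T[i][j] is the coset (i)s_j, or 0 if undefined; cosets are numbered
    # from 1, so row 0 of the table is unused.

    def trace(T, i, word):
        """Return (i)w for 'word' (a list of column indices), or 0 as soon
        as some intermediate coset is undefined."""
        for col in word:
            i = T[i][col]
            if i == 0:              # (i)s_1 ... s_l is not defined
                return 0
        return i

    # Example: columns 0, 1, 2, 3 stand for a, a^-1, b, b^-1.
    T = [None,
         [2, 0, 0, 0],              # coset 1: (1)a = 2, everything else unknown
         [0, 1, 0, 0]]              # coset 2: (2)a^-1 = 1
    print(trace(T, 1, [0, 1]))      # (1)a a^-1 = 1
    print(trace(T, 1, [2]))         # (1)b is undefined, prints 0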
We shall usually denote a word in the generators of G by a subscripted w and
a single generator or its inverse by a subscripted s.
Definition 2. Let R be a relator in G and let i be some coset of H. We now proceed
to define the process of applying relator R to coset i with respect to some partial
coset table T by defining a transformation of T corresponding to each of the following
four possibilities:
(a) R = w and (i)w = i. Here, T is left unchanged.
(b) R = w_1 s w_2, where w_1 is a subword of R, s is not the identity and w_2 is a subword
of R right adjacent to s, such that (i)w_1 = j, (i)w_2^-1 = k but T(j, s) = T(k, s^-1) = 0.
Note that w_1 or w_2 may be empty. We then set T(j, s) = k and T(k, s^-1) = j. This is
called a deduction.
is trivial but min(p, q) cosets must be defined before the relation a = 1 is discovered.
Indeed, given any integer m, one can produce a presentation for the trivial group
which requires the definition of at least m cosets before the TC algorithm is able to
deduce that the group is trivial. Then, by adding such a presentation of the trivial
group to some presentation of a group G, we can produce an arbitrarily bad pre-
sentation for G.
Thus, it is fruitless to expect to be able to produce a version of the TC algorithm
which performs well even for all presentations of a single group. However, the
majority of presentations which arise in practice turn out to be reasonable from the
point of view of the TC algorithm and so we concentrate our efforts on producing
forms of the algorithm which operate as efficiently as possible on this class of presen-
tations.
The two most popular strategies for introducing new cosets are the Felsch method
[8], which defines fewer redundant cosets at the expense of execution time, and the
Haselgrove-Leech-Trotter method [11], [21] which usually defines more redundant
cosets but which typically runs faster than the Felsch method. The Haselgrove-
Leech-Trotter method was developed by Haselgrove in 1953 and later adapted by
Leech, Trotter and others. For brevity, we shall refer to it simply as the HLT method
or HLT algorithm.
Suppose that the definition (i)s_j = k has just been made. The Felsch procedure
is to apply all significantly different cyclic permutations of relators beginning with
s_j to coset i. This process is repeated with any deductions, (i')s_j' = k', which may
have been discovered, until all possible consequences of the original definition have
been discovered. Only at this stage will a new coset be introduced, if necessary,
and then by defining it so that the first vacant position in the coset table is filled.
In the HLT method, relators are applied to the cosets in the order in which the
cosets were introduced. If for some coset i and relator R, the R-cycle at coset i is
incomplete, sufficient new cosets are immediately introduced so as to complete the
R-cycle at i.
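A minimal sketch (ours, in Python) of the HLT step just described: the relator is scanned forward from coset i and, wherever a table entry is missing, a new coset is defined at once, so that the R-cycle at i always closes; only the final entry can yield a deduction or a coincidence. The names and the simplified one-directional scan are our assumptions, not the Haselgrove-Leech-Trotter program itself.

    # Illustrative sketch only; names and data layout are assumptions.
    # T[i][col] = 0 means "undefined"; INV[col] is the column of the
    # inverse symbol; cosets are numbered from 1.

    def apply_relator_hlt(T, INV, i, relator):
        def new_coset():
            T.append([0] * len(INV))            # allocate a fresh, empty row
            return len(T) - 1
        cur = i
        for col in relator[:-1]:
            nxt = T[cur][col]
            if nxt == 0:                        # define a new coset at once
                nxt = new_coset()
                T[cur][col] = nxt
                T[nxt][INV[col]] = cur
            cur = nxt
        last = relator[-1]
        j, k = T[cur][last], T[i][INV[last]]
        if j == 0 and k == 0:
            T[cur][last], T[i][INV[last]] = i, cur   # deduction closes the cycle
            return None
        if j == 0:
            return (cur, k)       # cycle forces cur = (i)last^-1: coincidence
        if j != i:
            return (j, i)         # cycle forces (cur)last = i: coincidence
        return None               # the R-cycle at i was already complete

    # Columns 0, 1 = a, a^-1; enforce a^3 = 1 at coset 1 of gp(a | a^3).
    INV = [1, 0]
    T = [None, [0, 0]]
    print(apply_relator_hlt(T, INV, 1, [0, 0, 0]), T)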
Another form of the TC algorithm has been discussed by Mendelsohn [17].
However, it is much too inefficient to be considered for machine implementation
and so it will not be discussed further.
In order to be able to compare various forms of the TC algorithm, we introduce
some measures which are associated with the enumeration of the cosets of a subgroup
H in a group G by a specific form of the TC algorithm. Suppose
I is the index of H in G;
M is the maximum number of cosets defined at any instant during the enumeration;
T is the total number of cosets defined in the enumeration;
t is the execution time;
r = M/I,
e = T/I.
We shall indicate the form of the TC algorithm to which one of the above measures
refers by a subscript: F for the Felsch procedure and H for the Haselgrove-Leech-
Trotter procedure.
The number r can be interpreted as measuring the amount of space required
to perform the enumeration, over and above that required to store the coset table
for H in G. As noted above, coset enumerations with respect to certain presentations
will necessarily involve redundancies, so that it is not possible to design a coset
enumeration algorithm having r = 1 for all possible coset enumerations. So, current
efforts are directed towards producing fast programs for which r is close to 1 for
large classes of presentations. It is obvious that usually r_H ≥ r_F and it is observed
in practice that for a large proportion of presentations r_H is significantly greater
than r_F.
Of the machine versions of the TC algorithm which have thus far been proposed,
the Felsch method usually defines the fewest cosets as it tries to extract the maximum
amount of information each time a new coset is defined. However, a form of the
algorithm has been developed which performs the majority of enumerations very
quickly but which rarely requires significantly more space than the Felsch algorithm
in order to complete an enumeration. As this form of the Todd-Coxeter algorithm, which we
shall call the lookahead method, gives the best all round performance, we shall describe
it in some detail in this and the next section.
A type of lookahead was used by Leech in 1959 (Leech [11, p. 265, last paragraph])
but this form of the TC algorithm did not really begin to develop until Guy [9] wrote
his lookahead TC program for the ATLAS at Cambridge in 1967.
The lookahead method operates in two distinct phases: a defining phase and
a lookahead phase. As long as the number of cosets defined at any instant is less
than a specified number M_L, the algorithm remains in the defining phase. In this
phase, the enumeration proceeds by the HLT method. When, however, the number of
cosets defined exceeds the limit M_L, the algorithm switches to the lookahead phase.
Here, relators are applied to cosets as before, but if a relator cycle is incomplete
at some coset, no new cosets are defined to complete the cycle. The aim is to discover
a large number of deductions and coincidences without introducing any new cosets.
If the enumeration is still incomplete at the end of the lookahead phase and if sufficient
storage space is available, we return to the definition phase in which we remain
until either the enumeration completes or the number of cosets defined again passes
some preset limit. Thus, the algorithm alternates between the defining phase and
the lookahead phase.
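The heart of the lookahead phase is a scan that is allowed to record deductions and coincidences but never to define a coset. A sketch (ours, in Python; the table layout and names are assumptions) of such a scan, applying one relator to one coset by a forward and a backward pass:

    # Illustrative sketch only; names and data layout are assumptions.
    def scan_relator(T, INV, i, relator):
        # forward scan from coset i through the symbols of the relator
        f, fi = i, 0
        while fi < len(relator) and T[f][relator[fi]] != 0:
            f = T[f][relator[fi]]
            fi += 1
        if fi == len(relator):                  # cycle already complete
            return None if f == i else ('coincidence', (f, i))
        # backward scan from coset i through the inverses of the tail
        b, bi = i, len(relator)
        while bi > fi and T[b][INV[relator[bi - 1]]] != 0:
            b = T[b][INV[relator[bi - 1]]]
            bi -= 1
        if bi == fi:                            # scans meet: forces b = f
            return None if b == f else ('coincidence', (b, f))
        if bi == fi + 1:                        # gap of exactly one symbol
            T[f][relator[fi]] = b               # deduction
            T[b][INV[relator[fi]]] = f
            return ('deduction', (f, relator[fi], b))
        return None                             # longer gap: nothing learned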
Let us denote a lookahead form of the TC algorithm by the letter L. If the look-
ahead phase is not used, then r_L = r_H. When lookahead is used, r_L varies with the
value of M_L supplied. Let us denote the values of r_L and e_L, when M_L is the minimum
possible for the enumeration to complete, by r̂_L and ê_L, respectively. Experimentally,
it is found that r̂_L ≈ r_F. While ê_L is often much greater than e_F, this is not felt to
be an important consideration because of the faster processing of cosets in a look-
ahead program as compared to a Felsch program.
To summarize, the lookahead form of the TC algorithm can usually perform
coset enumerations faster than other machine algorithms used at present.
On the other hand, if storage space is at a premium, a lookahead program will
not require significantly more storage in order to complete an enumeration than a
Felsch program. Indeed, the lookahead algorithm often requires less space to complete
an enumeration than the Felsch algorithm!
There are a number of possible ways of arranging the lookahead. Guy [9], for
example, divides his available space up into a number of blocks and applies look-
ahead before allowing the coset table to extend into a new block. We shall refer to
this technique as bumping. If the optimum block size is chosen, this technique will
result in extremely rapid enumerations. On the other hand, a poor choice of block
size can result in inefficient enumerations. The other possibility, of course, is to let
the coset table exhaust the available space before applying lookahead. Also, when
the program is in the lookahead phase, it can return to the defining phase as soon
as a single coincidence has been discovered (incremental lookahead) or only after
all relators have been applied to all cosets currently defined (complete lookahead).
If all cosets have been processed in the lookahead phase of an incremental look-
ahead program, the lookahead begins again with the first coset not yet processed
in the defining phase. A single application of lookahead, of either kind, to every
coset not yet processed in the defining phase is called a lookahead pass. Generally,
both incremental and complete lookahead programs are arranged so that they
terminate when either the enumeration completes or a lookahead pass fails to discover
a single coincident coset. In the case of complete lookahead, a considerable amount
of time can sometimes be saved in situations where the enumeration does not com-
plete by specifying that execution is to terminate if less than a certain number of
coincidences are discovered during an application of lookahead.
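The control structure just described can be summarized in a few lines; the sketch below is ours, and define_up_to and lookahead_pass stand for assumed routines implementing the defining phase and a complete lookahead pass (returning the number of coincidences it found).

    # Illustrative sketch only; the two helper routines are assumed.
    def lookahead_enumerate(define_up_to, lookahead_pass, max_cosets, min_gain=1):
        # Alternate between the defining phase and the lookahead phase.
        while True:
            if define_up_to(max_cosets):        # HLT-style defining phase
                return True                     # enumeration complete
            if lookahead_pass() < min_gain:     # complete lookahead pass
                return False                    # too little progress: stop
            # lookahead freed some rows; re-enter the defining phase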
We have programmed both the complete and incremental forms of the look-
from the active row list, the two corresponding fields of A are spare. So a flag to
indicate that coset j has been found redundant is placed in A(j, 2), together with a
pointer to row i, while a pointer to the next member of the queue Q of coincidences
awaiting processing is placed in A(j, 1).
Coincidences are added to Q in the following way. Suppose we deduce i ~ j.
We examine A(j, 2) to see if row j is already marked as coincident. If j is marked
as coincident with k (say), we reset j to k and go back and examine the new A(j, 2).
We do the same with i, and finally reach i' ~ j' where neither i' nor j' is already
coincident. If i' = j', then the original coincidence i ~ j yields no new information
so nothing need be added to Q. Otherwise, choosing i'' = min(i', j'), j'' = max(i', j'),
we mark j'' as coincident with i'' and add j'' to Q. Finally, when the coincidence
is fully processed, A(j, 2) is linked to the free space list.
In this way, coincidences may be queued and processed so that no extra space
is required.
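In code, the chasing and queueing just described might look as follows (our Python sketch; for clarity the in-table fields A(j, 1) and A(j, 2) are replaced by an ordinary dictionary and list, so no claim is made about the actual storage scheme).

    # Illustrative sketch only; the storage scheme is simplified.
    merged_into = {}    # merged_into[j] = i: row j is redundant, coincident with i
    queue = []          # coincidences still awaiting processing (the queue Q)

    def representative(i):
        # chase the recorded chain of coincidences down to a live coset
        while i in merged_into:
            i = merged_into[i]
        return i

    def record_coincidence(i, j):
        i, j = representative(i), representative(j)
        if i == j:
            return                    # no new information
        lo, hi = min(i, j), max(i, j)
        merged_into[hi] = lo          # mark the higher-numbered coset redundant
        queue.append(hi)

    record_coincidence(5, 9)
    record_coincidence(9, 2)          # 9 is already dead, so this is really 5 ~ 2
    print(merged_into, queue)         # {9: 5, 5: 2} [9, 5]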
3.2. Coset Enumeration Algorithm. We now give a formal description of the
algorithm.
The algorithm first applies the nontrivial generators of H (if any) to coset 1, and
then begins applying the defining relators to the cosets from 1 on until either the
enumeration is complete or the available space is exhausted. If the latter occurs,
lookahead is performed and, provided some coincidences are discovered, control
is then returned to the defining phase. If no coincidences are found in lookahead,
the enumeration terminates with the coset table incomplete.
A coset i is said to be closed if all the relator cycles at coset i are complete. The
action of the defining phase is to systematically close the defined cosets from 1 on-
wards in the order defined by the active row list. If lookahead discovers a closed
coset i, the coset's position in the active row list is altered so that coset i becomes
the last coset closed in the defining phase. This sometimes saves quite a lot of un-
necessary work in the defining phase while the extra overhead is negligible.
The coset enumeration program consists of two central subroutines:
ENUM(ERATE) and COINC(IDENCE), together with a driving program TODCOX.
TODCOX. This routine initializes the various data structures and performs
the function of control routine.
(1) [Initialize] Input the generators for H and the defining relators for G; Set up
pointers and tables; Allocate storage for the adjustable array SPACE.
(2) [Perform enumeration] Call ENUM.
(3) [Next problem] Go to (1) for next problem if desired, otherwise exit.
ENUM(NMAX, NCOL, SPACE). This routine systematically enumerates the
cosets of H in G. The routine operates in three modes:
(a) Apply subgroup generators to coset 1, mode SG.
(b) Apply relators in defining phase, mode DP.
(c) Apply relators in lookahead phase, mode LA.
When the enumeration is complete, or alternatively when the storage available is
exhausted, the index of H in G or an overflow message is printed with some execution
statistics. Control then returns to TODCOX.
IFRONT is the last coset reached in the forward scan.
IBACK is the last coset reached in the backward scan.
T(IBACK, INVCOL(SGREL(N))) ← IFRONT; (deduction); If T(IFRONT, SGREL(N)) ≠ 0,
IFRONT ← T(IFRONT, SGREL(N)) and go to (7); Else T(IFRONT, SGREL(N)) ← IBACK;
(another deduction); IFRONT ← IBACK.
(7) [Relator complete] If IFRONT ≠ IBACK call COINC; (process nontrivial
coincidence); If more words left in set for this phase go to (2); If mode is LA go to
(11).
(8) [Start or continue defining phase] If mode is SG reset mode and parameters
to DP; If no more cosets to be processed go to (13); (enumeration complete); Get
next coset to be processed; Reset pointers to first word of set; Go to (2).
(9) [No space left in DP] Output overflow message; Reset mode to LA.
(10) [Incomplete scan in lookahead] CLOSED ← .FALSE.; (note coset is not
closed); If more words left in set go to (2).
(11) [End of set of words in LA] If CLOSED = .TRUE., link current coset
to become last closed coset to avoid reprocessing in DP; If no more cosets to be
processed go to (12); Else reset pointers to first word of set; CLOSED ← .TRUE.;
Go to (2); (continue lookahead).
(12) [End of LA] Output message to signify end of LA; If no space available
return; (enumeration terminates); Else reset current coset to last closed coset and
mode to DP; Reset pointers to first word of set; Go to (2); (return to DP).
(13) [Enumeration complete] Output index of H in G and statistics; Return.
COINC(NMAX, NCOL, SPACE, KX, KY). Given a pair of coincident cosets
by ENUM, this routine discovers all consequences and modifies the coset table
accordingly. The modifications which have to be made to the coset table are described
in Leech [11].
NCOL is the number of columns of T.
KX < KY are the original pair of coincident cosets.
KA < KB are a pair of coincident cosets.
JA, JB run through the entries of rows KA, KB respectively.
I indexes the columns of T.
(1) [Initialize] KA ← KX; KB ← KY; Link around row KB; (remove row from
active row list).
(2) [Prepare to process new coincidence] Link KB onto the free row list; Decrease
active coset count; I <—0; (set column counter).
(3) [Compare Ith entries of rows KA and KB] I ← I + 1; If I > NCOL, go to
(7); JB ← T(KB, I);
(a) If JB = 0, go to (3); (no new information).
(b) If JB = KB, JB ← KA; Go to (4).
(c) If JB ≠ 0 and JB ≠ KB, T(JB, INVCOL(I)) ← 0; Go to (4).
(T(JB, INVCOL(I)) initially contains KB, which we delete at this stage rather than
replace by KA to avoid having two occurrences of KA in the same column.)
(4) [Continue comparison] JA ← T(KA, I);
(a) If JA = 0, T(KA, I) ← JB and go to (6); (deduction).
(b) If JA = KB, T(KA, I) ← KA; JA ← KA; Go to (5); (possible new coin-
cidence).
(c) If JA ≠ 0 and JA ≠ KB, go to (5); (possible new coincidence).
(5) [Queue coincidence] Add the pair of coincident cosets JA, JB to the coincidence
queue if they are not already there; Link around the row of T corresponding to the
higher numbered of the two cosets.
(6) [Assign inverse T entry if currently undefined] If T(T(KA, I), INVCOL(I)) = 0,
T(T(KA, I), INVCOL(I)) ← KA; Go to (3).
(7) [Fetch next coincidence from queue] If no more coincidences in queue, return;
Else extract new KA, KB from queue; Go to (2).
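The column-by-column comparison performed by COINC can be condensed into the following sketch (ours, in Python); the active and free row lists are reduced to a simple alive array, and the coincidence queue described earlier to an ordinary deque, so the names and layout are assumptions rather than the program's own.

    # Illustrative sketch only; names and data layout are assumptions.
    from collections import deque

    def process_coincidence(T, INV, alive, kx, ky):
        # T: coset table with 0 for "undefined"; INV[col]: column of the
        # inverse symbol; alive[k]: row k is still in use.
        merged, queue = {}, deque([(min(kx, ky), max(kx, ky))])
        def rep(k):
            while k in merged:
                k = merged[k]
            return k
        while queue:
            ka, kb = queue.popleft()
            ka, kb = sorted((rep(ka), rep(kb)))
            if ka == kb or not alive[kb]:
                continue
            merged[kb] = ka                     # row kb is now redundant
            alive[kb] = False
            for col in range(len(INV)):
                jb = T[kb][col]
                if jb == 0:
                    continue                    # no new information
                if jb == kb:
                    jb = ka
                else:
                    T[jb][INV[col]] = 0         # delete the stale reference to kb
                ja = T[ka][col]
                if ja == 0:
                    T[ka][col] = jb             # deduction
                else:
                    if ja == kb:
                        T[ka][col] = ka
                        ja = ka
                    if rep(ja) != rep(jb):      # possible new coincidence
                        queue.append((ja, jb))
                target = T[ka][col]
                if T[target][INV[col]] == 0:
                    T[target][INV[col]] = ka    # fill inverse entry if undefined
        return merged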
* Note that, using the program of Section 3, M_H and T_H may vary slightly with the available
space whenever T_H exceeds the number of cosets which can be stored in the available space.
Table 1
Definition of some sets of defining relations
(Columns: Designation, Order of group, Ref., Remarks, Defining relations)
Table 2
Definition of families of defining relations
The Macdonald presentations G(α, β) are highly pathological for most values
of α and β. If the orders of the generators a and b are added as relations then there
is a dramatic reduction in the degree of difficulty of the enumerations, suggesting
that it is much easier for current versions of the TC algorithm to deduce relations
in G from this extended relation set. Further, it appears to be a difficult task to
deduce the orders of a and b in G(α, β) by hand, except in a few special cases.
The Macdonald groups are also the only family of groups we know which give
rise to enumerations for which r_H ≪ r_F. Thus the enumerations
G(α, β) | ([a, b], [b, a^-1], [a^-1, b^-1], [b^-1, a]),
where α = -2, -1, 3, 4 and β is odd, become extremely difficult for the Felsch
algorithm as β increases but remain quite easy for the HLT algorithm. For example,
if α = 3, β = 21, then the index is 40 and M_F = 16,063, T_F = 16,067 while M_H = 84,
T_H = 91. This is an example of a situation where the HLT method of introducing
new cosets is far superior to the Felsch method. This was verified by Alford who
produced a Felsch program which used the HLT method of introducing new cosets.
Applying this program to the above enumerations resulted in their completion
with M_F equal to the index. Unfortunately, when this program was applied to other
groups, enumerations became much harder than when the usual Felsch method of
introducing new cosets was used.
In the case of Cam(3), the addition of the orders of the generators r and s as
extra relations does not significantly decrease the difficulty of the enumeration
[Tables 3 and 4 (enumeration statistics) are not legible in this copy.]
Table 5(a)
Variation of time with maximum table size for (8, 7 | 2, 3) | (a^2, a^-1 b), with relators in the order
presented
(Times are given in CDC 6600 seconds)
M_L  2,176  2,100  1,900  1,700  1,500  1,300  1,250  1,240  1,225  1,220
T_L  2,626  2,531  2,288  2,009  1,744  1,498  1,448  1,461  1,441  1,437
t_L  0.927  0.934  0.880  0.822  0.766  0.712  0.704  0.898  0.898  1.093
l    0      1      1      1      1      1      1      2      2      3
Table 5(b)
Variation of time with maximum table size for M_11 | (a)
M_L  4,201  3,900  3,600  3,300  3,000  2,615  2,500  2,300  2,100  1,900
T_L  6,331  5,755  5,318  4,635  4,312  3,862  3,728  3,508  3,308  3,710
t_L  2.015  2.016  1.957  1.928  1.894  1.881  1.912  1.941  1.980  2.191
l    0      1      1      1      1      1      1      1      1      1
M_L  1,800  1,600  1,400  1,300  1,200  1,100  1,000  900    800    750
T_L  3,373  2,412  2,189  2,538  2,079  1,995  1,862  1,769  1,470  1,405
t_L  2.191  2.242  2.296  2.362  2.691  2.863  3.140  3.349  4.426  5.038
l    2      2      2      2      3      4      5      5      8      10
HLT Method. It has been found that the order in which the relators of a presenta-
tion are processed by the HLT algorithm can have a significant effect on the number
of redundant cosets introduced. As a general rule, it is found that the best perform-
ance is achieved when the relators are processed in order of increasing length. In-
tuitively, the reason for this is that short relators are less likely to introduce redundant
cosets than long ones. New information deduced when applying the short relators
to coset i sometimes means a reduction in the number of new cosets which must
be defined when applying the long relators to coset i.
We illustrate this effect with several related presentations for the group ((5, 4, 7; 3)),
which collapses to the identity, in Table 6. Recall that ((l, m, n; p)) = gp(a, b, c | a^2 =
b^2 = c^2 = (ab)^l = (bc)^m = (ca)^n = (abc)^p = 1). This can be written in the alternative
form
In the table, T_H denotes the total number of cosets defined when the relators
are applied to a coset in the order in which they are written down above, and T_H',
the total number of cosets when the relators are applied in order of increasing length.
The enumerations are all performed over the trivial subgroup. In case 2, we see that
the number of cosets defined has dropped by 19 per cent.
It should be noted that applying relators in order of increasing length does not
always lead to the fewest cosets being defined. However, in most situations where
we have found that it is not the optimal ordering, we have observed that the difference
in total cosets defined between the optimal arrangement of relators and the ordered
arrangement is insignificant. Thus, the best strategy is to set the relators up so that
they are processed in order of increasing length.
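In a program this strategy is no more than a preliminary sort of the relator list, as in the following sketch (ours; the string representation of relators, with capitals for inverses, is an assumption).

    # Illustrative sketch only; the relator encoding is an assumption.
    relators = ["abababab", "ABAbabab", "aa", "bbb"]
    relators.sort(key=len)        # process short relators first
    print(relators)               # ['aa', 'bbb', 'abababab', 'ABAbabab']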
Table 6
Effect of ordering relators
Table 7
Effect of permuting the letters of the relators of G(2, 2) on T_H
Table 7 contains the values of T_H for all distinct permutations of the two relators
of G(2, 2). The ith column contains the effect of rotating relator 1 left i places while
the jth row contains the effect of rotating relator 2 left j places.
The best result, T_H = 273 (row 5, column 8), is obtained when G(2, 2) is written
i.e., when the two relators agree on the last four letters. The worst result, T_H = 2,219
(row 2, column 6), is obtained for
Note that there is a factor of 8 between the best and worst results! The minimum
values of T_H are obtained when the relators are permuted so that one of the following
arrangements results:
(a) The first four letters of each relator are identical.
(b) The last four letters of each relator are identical.
(c) The word formed by the first four letters of one relator is the inverse of the
word formed by the last four letters of the other relator.
These arrangements imply that, usually, fewer cosets have to be defined in order
to complete a relator cycle at coset i. That this behaviour is not peculiar to groups
having a very small number of relators can be seen by considering the following
presentation for the Hall-Janko group J2 (McKay and Wales [16]):
J2 = gp(a, b, c | · · · )   [the relators of this presentation are not legible in this copy]
The last five relators are redundant. Let us denote the presentation of PSL2(25)
formed by removing the last i relators from the above presentation by P(i), where
i = 0, 1, 2, 3, 4, 5. We summarize the effect of removing the last five relators one by
one in Table 8. In each case the enumerations were performed over the subgroup
H = (a, b, d^-1b^-1c^2), which is the normalizer of the Sylow 5-subgroup.
It can be seen that the presence of the fourth last relator makes the enumeration
much easier. However, the addition of further redundant relators causes the enumera-
tion to become steadily more difficult. In general, a pathological presentation 𝒫
can be greatly improved by the addition of the right redundant relators. (See Section
4.1.) Although one may have no idea as to what the critical missing relators are,
it is sometimes worthwhile adding any further known relators which are not im-
mediately derivable from 𝒫.
On the other hand, the addition of trivially redundant relators, such as cyclic
permutations and inverses of cyclic permutations of relators, to a presentation,
seldom helps in the case of a pathological enumeration. As far as the HLT method
is concerned, the additional relators will change the values of MH and TH, because
of the different order of coset definition. However, this change is not necessarily
an improvement and the execution time usually increases because of the increased
processing per coset. Even if such redundant relators are used only in the lookahead
phase, little advantage is gained, for if a cyclic permutation of relator R gives a
deduction when applied to coset i, then that same deduction must be made when
relator R itself is applied to some other coset j.
Table 8
Effect of removing redundant relators on a presentation for PSL2(25)
Unfortunately, it is usually difficult to derive useful additional relations by hand.
However, it is often possible to use the coset enumeration process itself to derive
additional relations. A simple-minded method of doing this consists in doing a partial
enumeration of the cosets of the identity using the Felsch method and examining
those cosets which are found to be coincident. If i and j are coincident cosets, it is
a simple matter to read from the partial coset table coset representatives w(i) and
w(j) for i and j, to obtain the relation w(i) = w(j). The most promising of such re-
lations are added to 𝒫 and either the above process is repeated to deduce further
relations or the enumeration of cosets of the desired subgroup is attempted.
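The bookkeeping needed for this simple-minded method is modest: it suffices to remember how each coset was first defined. The following Python sketch is ours and the names are assumptions; words are lists of column numbers as in the earlier sketches.

    # Illustrative sketch only; defined_by[k] = (i, col) records that coset
    # k was first introduced as (i)s, where s is the symbol of column col.
    def representative_word(defined_by, k):
        word = []
        while k != 1:                    # walk the definitions back to coset 1
            i, col = defined_by[k]
            word.append(col)
            k = i
        return list(reversed(word))

    def relator_from_coincidence(defined_by, INV, i, j):
        wi = representative_word(defined_by, i)
        wj = representative_word(defined_by, j)
        return wi + [INV[c] for c in reversed(wj)]       # w(i) * w(j)^-1

    # Suppose the enumeration recorded 2 = (1)a, 3 = (2)a, 4 = (1)b, 5 = (4)b
    # (columns 0, 1, 2, 3 = a, a^-1, b, b^-1) and later found 3 ~ 5:
    log = {2: (1, 0), 3: (2, 0), 4: (1, 2), 5: (4, 2)}
    print(relator_from_coincidence(log, [1, 0, 3, 2], 3, 5))   # [0, 0, 3, 3]: a^2 b^-2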
As an example of this process, consider the presentation (due to Brunner [2]):
C29 = gp(a, b | bab^2(ab^-1)^2a^2 = (b^2a)^2baba^-2 = [a, b]^2 = 1)
of the cyclic group of order 29. Using Felsch's method to enumerate the cosets of
the identity, 15 redundant cosets were found during the definition of 14,000 cosets.
One of the corresponding relators found was
a^4ba^-2b^-3a^-1b^2.
Enumerating the cosets of (a) (index 1 in C29), we find that M_F drops from 1084
to 557 with the addition of the above relator.
The Felsch algorithm is preferred when trying to deduce new relations this way
because it minimizes the number of redundant cosets corresponding to unimportant
relations in the group.
A more sophisticated method of deriving new relations from coincidences is
described in Leech [13]. Unlike the above method, his method can be used to derive
relations corresponding to coincidences which occur when enumerating the cosets
of nontrivial subgroups.
4.4. Behaviour of the TC Algorithm with Respect to Different Presentations of
the Same Group. It is of considerable interest to take a number of different presenta-
tions for some group G and study the behaviour of the TC algorithm over this set
of presentations.
We begin by examining the effect of the length of relators. Intuitively, we would
expect that the presence of long relators would lead to the introduction of propor-
tionally more redundant cosets and this effect is illustrated by the four presentations
of the symmetric group S5 of order 120, given in Table 9. In each case, the cosets
of the identity have been enumerated and we tabulate the number of Felsch redundant
cosets R_F, and the number of HLT redundant cosets R_H. In addition, we tabulate
the maximum relator length m and the mean relator length m̄. Note, in particular,
the behaviour of the first three presentations which are virtually identical except
for the last relator.
In Table 10, we compare the behaviour of the algorithm with respect to eleven
different presentations of the simple group PSL2(13) of order 1092. In each case,
the cosets of the identity have been enumerated. It can be seen that while there is
considerable correlation between the degree of difficulty for the Felsch and HLT
algorithms, the correlation is by no means complete. For example, though presenta-
tion 1 is by far the worst for Felsch, it is nowhere near the worst for the HLT method.
Also, there is a fairly high correlation between m̄ and M_H. (For a discussion of some
of these presentations see Leech [12].)
As noted earlier, the HLT algorithm is particularly susceptible to the ordering
of the relators and the relative ordering of their letters. Thus, we would expect that,
if these effects could be removed, there would be an even higher correlation between
Table 9
Behaviour of algorithms with respect to different presentations of S_5
Presentation   m   m̄   R_F   R_H
Table 10
Behaviour of algorithms with respect to different presentations of PSL2(13).
Each enumeration is over the identity so that the index is 1092 in each case
Presentation   m   m̄   M_F   M_H   T_F   T_H
m̄ and M_H and between r_F and r_H. Then, we are faced with the central problem:
Why are the first seven presentations bad compared to the last four? In particular,
consider presentations 3 and 11, where the only difference is that the last relator in
3 is [a, b]^7 while in 11 it is [a, b]^6.
We conclude this section with two further observations. The effect of adding
certain redundant relations to the presentations G(α, β) and Cam(3) has been noted
in Section 4.1. Secondly, if the relator (a^-1 bab)^4 in the presentation of the Burnside
group B_{2,4} is replaced by the relator [a, b]^4 (still giving the same group) when enu-
merating the cosets of the identity, then T_F drops from 5,022 to 4,282 (M_F is 4,096
in both cases), while M_H drops from 12,386 to 7,333.
4.5. Behaviour of the Algorithm with Respect to Different Subgroups of a Fixed
Group. In this section, we suppose that we are given a fixed presentation of a group G
and we shall consider two questions:
(a) How does the difficulty of an enumeration depend upon the subgroup chosen?
(b) How does the difficulty of an enumeration depend upon the subgroup gen-
erators used?
The answer to the first question, in general, is that r_F and r_H remain fairly con-
stant with varying index. This is illustrated by the groups G^{3,7,13}, M_11 and (8, 7 | 2, 3)
in Table 11. Note, however, that the enumerations
Table 11
Behaviour of the algorithm with respect to different subgroups
Enumeration   I   M_F   M_H   T_F   T_H   r_F   r_H
are slightly more difficult than the others. The enumeration PSL2(17) | ([b, a][b, a^2])
is significantly more difficult than other enumerations in that group over cyclic sub-
groups. So, we tentatively conclude that the complexity of a word taken as a subgroup
generator has an influence on the degree of difficulty of an enumeration. The two
enumerations in J2 apparently demonstrate that in some situations the choice of
subgroup has a great effect on the difficulty of an enumeration. However, further
experiments lead us to believe that this effect is caused by the choice of subgroup
generators rather than the choice of subgroup. Note that the subgroup (a, b, bca^-1c)
is isomorphic to the unitary simple group U_3(3) while the subgroup (a, c, cb) is the
centralizer of an involution. As yet we have no knowledge of how the embedding of
subgroup H in G affects the difficulty of enumerations G | H.
Regarding question (b), we have found that the choice of the generating set for
the subgroup H can have a great effect on the enumeration G | H. This is illustrated
in Table 12 by tabulating the values of M_H for four different sets of generators of
the normal closure N of element a in Men(5) (recall that Men(n) was defined in
Section 1). Here, we see that while N is generated by a and a^b alone, the addition of
the redundant generators a^c, a^(b^-1) and a^(c^-1) decreases M_H by a factor of 28. A further example
is provided by the enumeration (8, 7 | 2, 3) | (a^2, a^-1 b) where, as noted in Section 4.2,
the addition of the redundant subgroup generator (a^2 b^4)^6 greatly decreases the dif-
ficulty of the enumeration. In the case of the Felsch algorithm, V. Felsch has observed
Table 12
Effect of redundant subgroup generators on an enumeration in Men(5)
Subgroup generators   I   M_H   T_H   r_H
that the use of the defining relators as redundant subgroup generators seldom hurts
and occasionally has the effect of drastically reducing the number of redundant
cosets for a difficult enumeration. For example, in the enumeration B_{2,4} | E, the
number of redundant cosets introduced is more than halved.
The enumeration G | H involves constructing a set of generators for a subgroup
H of the free group F of rank r (assuming that G is given as an r-generator group)
such that the coset table of H in F is isomorphic to the coset table of H in G. As
the images in F of the Schreier generators of H in G generate H, we would expect
that, if we used the Schreier generators of H, then the enumeration G | H would be
particularly easy. That this is so in practice is demonstrated by the enumeration
J2 | (a, b, bca^-1c). Using a complete set of 201 Schreier generators for the subgroup
instead of a, b and bca^-1c, M_H dropped from 12,907 to 101!
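For completeness, here is a sketch (ours, in Python) of how a set of Schreier generators can be read off a completed coset table by choosing representative words along a spanning tree; the names and the word representation are assumptions.

    # Illustrative sketch only; names and data layout are assumptions.
    from collections import deque

    def schreier_generators(T, INV, gen_cols):
        # Breadth-first search from coset 1 over the generator columns gives
        # a representative word rep[i] for every coset; the Schreier
        # generators are the words rep[i] * s * rep[(i)s]^-1 that do not
        # come from tree edges.
        rep, order = {1: []}, deque([1])
        while order:
            i = order.popleft()
            for col in gen_cols:
                j = T[i][col]
                if j not in rep:
                    rep[j] = rep[i] + [col]
                    order.append(j)
        gens = []
        for i in rep:
            for col in gen_cols:
                j = T[i][col]
                if rep[j] == rep[i] + [col]:
                    continue                    # tree edge: trivial generator
                gens.append(rep[i] + [col] + [INV[c] for c in reversed(rep[j])])
        return gens

    # Coset table of the trivial subgroup in gp(a | a^3); columns 0, 1 = a, a^-1.
    T = [None, [2, 3], [3, 1], [1, 2]]
    print(schreier_generators(T, [1, 0], [0]))   # [[0, 0, 0]], i.e. a^3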
John McKay reports that if a number of complicated generators are taken for
H, then the number of redundant cosets occurring in the enumeration G | H can
depend significantly on the order in which the generators of H are applied to coset 1.
He suggests that, before proceeding with the enumeration proper, the user should
apply different permutations of the generators of H to coset 1 and take that per-
mutation for which the number of cosets introduced so far is a minimum.
4.6. The Use of External Storage. It is clear that the next major step in the
evolution of coset enumeration programs must be the development of forms of the
algorithm which operate efficiently when the coset table is kept on some external
storage medium such as disk. The present algorithms access the table in a semi-
random fashion so that the task of developing a form of the algorithm for use with
external storage will be rather difficult. To illustrate some of the problems, we shall
describe an experiment in which HLT and Felsch programs running on an IBM
360/50 were modified so that the coset table was kept on an IBM 1316 disk pack.
The coset table is broken up into blocks, a small number of which may be resident
in the core memory at any instant. Whenever a reference is made to a row of the
coset table not already in core, the block on disk containing this row must be read
into core, while some block currently in core must be written onto disk in order to
make room for the new block. The first major choice which has to be made is the
block size, where a block is to consist of a fixed number of rows of the coset table.
Table 13 demonstrates the effect of different block sizes on the HLT and Felsch
algorithms. It is seen that while the number of reads from disk is roughly constant
with decreasing block size the execution time drops sharply, because of faster transfer
times, so that smaller block sizes are to be preferred.
Table 13
Behaviour of the disk storage versions of the HLT and Felsch
algorithms with respect to varying block sizes
With virtually random access throughout the coset table, the least recently used
algorithm was used when replacing a block in core with a block from disk. The
number of writes to disk can be reduced by associating a flag with each block resident
in core to indicate whether the block has been changed since its arrival in core. If
it has not been changed at the time of its replacement, it is, of course, unnecessary
to copy it onto the disk. This can reduce the number of disk writes by up to a factor
of 2. For further details see Watson [22].
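The replacement policy and the modified-block flag just described amount to a standard write-back cache. The following Python sketch is ours; the block and row interface is an assumption and not the layout used in the experiment.

    # Illustrative sketch only; names and interface are assumptions.
    from collections import OrderedDict

    class BlockCache:
        """Keep a few blocks of coset-table rows in core; write a block back
        to disk only if it was modified while resident (the dirty flag)."""
        def __init__(self, disk, capacity):
            self.disk = disk                        # block number -> list of rows
            self.capacity = capacity
            self.resident = OrderedDict()           # block number -> [rows, dirty]

        def _fetch(self, blk):
            if blk in self.resident:
                self.resident.move_to_end(blk)      # most recently used
                return
            if len(self.resident) >= self.capacity:
                old, (rows, dirty) = self.resident.popitem(last=False)
                if dirty:                           # least recently used block is
                    self.disk[old] = rows           # written back only if changed
            self.resident[blk] = [list(self.disk[blk]), False]

        def read(self, blk, row):
            self._fetch(blk)
            return self.resident[blk][0][row]

        def write(self, blk, row, value):
            self._fetch(blk)
            self.resident[blk][0][row] = value
            self.resident[blk][1] = True            # set the dirty flag

    disk = {0: [0] * 100, 1: [0] * 100, 2: [0] * 100}
    cache = BlockCache(disk, capacity=2)
    cache.write(0, 5, 42)
    print(cache.read(0, 5))                         # 42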
Table 14 contains the statistics from a number of enumerations using disk. It
can be seen that the slower processing of the Felsch algorithm is more than made
up for by the smaller ratio of total cosets to cosets in core, so that the Felsch execution
times are significantly better than the HLT execution times. Even so, the Felsch
execution times are still unacceptably large. At the time of writing, no similar experi-
ments have been carried out with a lookahead program but it is clear that new tech-
niques will have to be used if we are to develop economic disk-based coset enumera-
tion programs.
[Table 14 (statistics of enumerations using disk storage) is not legible in this copy.]
Bell Laboratories
Murray Hill, New Jersey
Department of Mathematics
Institute of Advanced Studies
Australian National University
Canberra, Australia
1. M. J. Beetham, "A set of generators and relations for the groups PSL(2, q), q odd,"
J. London Math. Soc., v. 3, 1971, pp. 544-557. MR 44 #2806.
2. A. Brunner, Personal communication.
3. C. M. Campbell, "Some examples using coset enumeration," in Computational Prob-
lems in Abstract Algebra (Edited by John Leech), Pergamon Press, New York, 1970, pp. 37-
41. MR 40 #7341.
4. H. S. M. Coxeter, "The abstract groups R^m = S^m = (R^j S^j)^{p_j} = 1, S^m = T^2 =
(S^j T)^{2p_j} = 1 and S^m = T^2 = (S^-1 T S^j T)^{p_j} = 1," Proc. London Math. Soc. (2), v. 41, 1936,
pp. 278-301.
5. H. S. M. Coxeter, "The abstract groups G^{m,n,p}," Trans. Amer. Math. Soc., v. 45,
1939, pp. 73-150.
6a,b. H. S. M. Coxeter & W. O. J. Moser, Generators and Relations for Discrete
Groups, Springer-Verlag, Berlin, 1957; 2nd ed., Ergebnisse der Mathematik und ihrer Grenz-
gebiete, Neue Folge, Band 14, Springer-Verlag, Berlin and New York, 1965. MR 19, 527; MR
30 #4818.
7. L. A. Dimino, "A graphical approach to coset enumeration," SIGSAM Bulletin, v.
19, 1971, pp. 8-43.
8. H. Felsch, "Programmierung der Restklassenabzählung einer Gruppe nach Unter-
gruppen," Numer. Math., v. 3, 1961, pp. 250-256. MR 24 #B488.
9. M. J. T. Guy, Coset Enumeration, Lecture delivered at the Conference on Computa-
tional Problems in Abstract Algebra, Oxford, 29 August-2 September, 1967.
10. G. Higman & J. McKay, "On Janko's simple group of order 50,232,960," Bull. Lon-
don Math. Soc, v. 1, 1969, pp. 89-94. MR 40 #224.
11. J. Leech, "Coset enumeration on digital computers," Proc. Cambridge Philos. Soc.,
v. 59, 1963, pp. 257-267. MR 26 #4513.
12. J. Leech, "Generators for certain normal subgroups of (2, 3, 7)," Proc. Cambridge
Philos. Soc., v. 61, 1965, pp. 321-332. MR 30 #4821.
13. J. Leech, "Coset enumeration," in Computational Problems in Abstract Algebra
(Edited by John Leech), Pergamon Press, New York and Oxford, 1970, pp. 21-35. MR 40
#7343.
14. J. Leech & J. Mennicke, "Note on a conjecture of Coxeter," Proc. Glasgow Math.
Assoc, v. 5, 1961, pp. 25-29. MR 27 #196.
15. I. D. Macdonald, "On a class of finitely presented groups," Canad. J. Math., v. 14,
1962, pp. 602-613. MR 25 #3992.
16. J. McKay & D. Wales, "The multipliers of the simple groups of order 604,800 and
50,232,960," J. Algebra, v. 17, 1971, pp. 262-272. MR 43 #340.
17. N. S. Mendelsohn, "An algorithmic solution for a word problem in group theory,"
Canad. J. Math., v. 16, 1964, pp. 509-516; Correction, Canad. J. Math., v. 17, 1965, p. 505.
MR 29 #1248; 31 #237.
18. J. L. Mennicke & D. Garbe, "Some remarks on the Mathieu groups," Canad. Math.
Bull., v. 7, 1964, pp. 201-212. MR 29 #150.
19. B. H. Neumann, Personal communication.
20. J. A. Todd & H. S. M. Coxeter, "A practical method for enumerating cosets of a
finite abstract group," Proc. Edinburgh Math. Soc, v. 5, 1936, pp. 26-34.
21. H. F. Trotter, "A machine program for coset enumeration," Canad. Math. Bull., v.
7, 1964, pp. 357-368; Program listing in Mathematical Algorithms, v. 1, 1966, pp. 12-18. MR
29 #5415.
22. J. M. Watson, An Extended Todd-Coxeter Algorithm, Department of Mathematics,
Institute of Advanced Studies, Australian National University, Canberra, 1971, 15 pages.