Cent. Eur. J. Math. • 12(2) • 2014 • 175-211 DOI: 10.2478/s11533-013-0335-4


Central European Journal of Mathematics

Equations in simple matrix groups: algebra, geometry, arithmetic, dynamics

Review Article

Tatiana Bandman1*, Shelly Garion2†, Boris Kunyavskiĭ1‡

1 Department of Mathematics, Bar-Ilan University, Ramat Gan, 5290002, Israel

2 Fachbereich Mathematik und Informatik, Universität Münster, Einsteinstrasse 62, Münster, 48149, Germany

Received 9 May 2012; accepted 22 March 2013

Abstract: We present a survey of results on word equations in simple groups, as well as their analogues and generalizations, which were obtained over the past decade using various methods: group-theoretic and coming from algebraic and arithmetic geometry, number theory, dynamical systems and computer algebra. Our focus is on interrelations of these machineries which led to numerous spectacular achievements, including solutions of several long-standing problems.

MSC: 14G05, 14G15, 20D06, 20D10, 20G40, 37P05, 37P25, 37P35, 37P55

Keywords: Matrix groups • Finite simple groups • Special linear group • Word map • Trace map • Arithmetic dynamics • Periodic points • Finite fields • Lang-Weil estimate

© Versita Sp. z o.o.

И снова скальд чужую песню сложит И как свою ее произнесет.

Once more a skald will other's song compose And utter it as of his own gift.

Осип Мандельштам, 1914 Osip Mandelstam, 1914 (translated by Rafael Shusterovich)

* E-mail: bandman@macs.biu.ac.il

† E-mail: shelly.garion@uni-muenster.de

‡ E-mail: kunyav@macs.biu.ac.il


1. Introduction

Matrix equations, which in the most general form can be written as

$$F(A_1,\dots,A_m,X_1,\dots,X_d) = 0,$$

where $A_1,\dots,A_m$ are some fixed matrices, $X_1,\dots,X_d$ are unknowns, $F$ is an associative noncommutative polynomial, and the solutions must belong to a certain class of matrices, constitute a vast research domain, with spectacular applications well beyond algebra, say, in areas such as differential equations and mathematical physics. Noncommutativity brings lots of counter-intuitive phenomena, which can be observed even when asking deceptively simple questions about solvability of the equation or about the number of its solutions, even within the class of innocent looking quadratic matrix equations, see, e.g., [25, 35, 39, 42-44, 116]. (Of course, the reader coming from differential equations and familiar, say, with Riccati matrix equations will not be too much surprised by the difficulty of the question.)

To simplify the problem, one can limit oneself to considering equations of the form

$$F(X_1,\dots,X_d) = A, \tag{1}$$

where $A$ is the only allowed constant matrix, and only scalar constants are permitted to appear as coefficients of the polynomial $F$. Even in this limited form, the question about solvability of (1) is far from being settled though it has been extensively studied since the 1970s when Kaplansky asked about the existence of polynomials whose value sets consist of the scalar matrices. Such polynomials (called central) were discovered by Formanek [36] and Razmyslov [106]; naturally, if $F$ is central and the matrix $A$ is not scalar then equation (1) has no solutions. The same happens in the case where the value set of $F$ consists of matrices with zero trace (say, when $F$ is a Lie polynomial) and $\operatorname{tr} A \neq 0$ (and, of course, in the trivial case where $F$ is identically zero on the algebra $M(n, K)$ of $n \times n$ matrices with coefficients from the field $K$). However, there are subtler obstacles to the solvability of (1), see [62] and references therein. Even a very special case where $F$ is a Lie polynomial is not yet settled though some cases where (1) is solvable as well as some obstacles to solvability were discussed in [10].

Further simplification is essential for the present survey: we focus our attention on the case where the solutions to the equations under consideration must belong to a certain multiplicative group of matrices. This naturally leads to the next modification: instead of considering polynomials $F$ as in (1), which can be viewed as elements of the free associative algebra $K\langle X_1,\dots,X_d\rangle$, we consider monomials $w(x_1, x_1^{-1},\dots, x_d, x_d^{-1})$, which can be regarded as elements of the free $d$-generated group $F_d = \langle x_1,\dots,x_d\rangle$. Thus our main object is the following equation (where we shorten our previous notation for $w$):

$$w(x_1,\dots,x_d) = g. \tag{2}$$

Here $g$ is a fixed element of a group $G$, and we are looking for solutions among $d$-tuples $(g_1,\dots,g_d)$ of elements of $G$. By $w(g_1,\dots,g_d)$ we understand the element of $G$ obtained after substituting $g_i$ instead of $x_i$ and performing all multiplications and inversions in $G$. We define $w(G) = \{w(g_1,\dots,g_d) : g_1,\dots,g_d \in G\}$ as the set of values of $w$ in $G$. Although equation (2) (henceforth called a word equation) makes sense in the case where $G$ is not necessarily a group of matrices, we emphasize the matrix case. The reason is two-fold. First, the main applications we are going to discuss refer to the cases where $G$ has nice natural matrix representations. Second, our main goal is to stress the strength of algebraic-geometric methods in treating problems related to word equations. More specifically, our viewpoint can be described as follows: let us regard all matrix entries of the $x_i$ as indeterminates; then, after clearing denominators arising because of determinants, we reduce equation (2) to a system of $n^2$ polynomial equations in $dn^2$ variables over the ground field $K$ (all matrices are assumed to be of size $n \times n$). Thus we manage to pass from a noncommutative problem to a commutative one, but the price paid is high: the resulting system may be huge and inaccessible to computer algebra systems. However, in certain cases such an approach may turn out to be fruitful, and several instances of successful implementation of this idea will follow.

Once equation (2) is proclaimed as the main object of our investigation, the first natural question one should ask oneself is the following: what is the most natural class of groups $G$ to start with? A strong hint is given by a theorem of Borel [17] (see also [68]), which states that if $G$ is the group of rational points of a connected semisimple algebraic group defined over an infinite field, then for any $w \neq 1$ equation (2) is solvable when its right-hand side is, roughly speaking, a "typical" element $g \in G$; see subsection 5.1 for a precise statement and some details. Thus the most substantial part of the present survey is devoted to the case where $G$ is a finite simple group. Although in this case Borel's theorem does not give any direct indication which elements should be thought of as "typical", we shall try to convince the reader that there are several ways out which make some sense. In particular, Larsen [68] showed that for every nontrivial word $w$ and $\varepsilon > 0$ there exists a number $C(w, \varepsilon)$ such that if $G$ is a finite simple group with $\#G > C(w, \varepsilon)$ then $\#w(G) > \#G^{1-\varepsilon}$, see subsection 5.2.

One should note that there is not much hope for positive results of Borel's flavour for groups G which are too far from those singled out in his theorem; see the discussion in [10, Section 5] and relevant references therein (to which one should add a recent paper by Myasnikov and Nikolaev [94]). Therefore, as mentioned above, we restrict ourselves to considering the case where G runs within a family of finite non-abelian simple groups. Moreover, since most of the problems will be of asymptotic nature, we shall usually ignore the sporadic groups. We give more attention to the case of the family SL(2, q) (sometimes extending it to all groups of Lie rank 1, thus adding the family of Suzuki groups Sz(q)). This case is, on the one hand, typical enough, and sufficiently rich to have interesting applications. On the other hand, it allows one to use some additional efficient tools (such as trace maps) and get more precise results. Details can be found below, in Sections 3, 4 and 6.

One can ask various questions on equation (2). First, one can fix a concrete group or a family of groups $G$ (say, $G = \mathrm{SL}(2, q)$), a concrete element $g$ of $G$ (say, $g = 1$) and a concrete word $w$, and ask whether equation (2) is solvable (or has a non-identity solution, if $g = 1$ is chosen). Even in such a restrictive setting, some spectacular applications can be obtained, in particular, to a long-standing problem of characterization of finite solvable groups by recursive identities, see Section 3. A variation of this approach, where one proves solvability of a countable system $w_n \neq 1$ (with $G$ fixed as above), or, particularly, of the system $w_m = w_n$ with $w_i$ obtained after the $i$th iteration from certain initial data, can be reformulated in the language of periodic points. This viewpoint was (implicitly) taken in [20], where another sequence of words characterizing finite solvable groups was constructed, and made explicit in [13]. In the latter paper this machinery was further developed, which led naturally to some new concepts in arithmetical dynamics. Another impressive application of arithmetical dynamics to a hard group-theoretic problem has been demonstrated by Borisov and Sapir [18, 19], see subsection 4.2.

Further, one can consider more general questions. First, still fixing some "interesting" word $w$, one may vary $g \in G$ and ask about the solvability of (2) for all $g$, or, at least, for "most of $g$". In the language of word maps, where we denote by

$$w\colon G^d \to G$$

the map given by $(g_1,\dots, g_d) \mapsto w(g_1,\dots, g_d)$, this means that we are asking whether the word map is surjective or, at least, has a "large" image. Ore's conjecture [102] on representing every element of a finite simple group as a commutator is a fabulous instance of such a setting. Due to immense work spread over more than 50 years, it is now known that the commutator word $w = [x, y] \in F_2$ satisfies $w(G) = G$ for any finite non-abelian simple group $G$ (see [79], the references therein and subsection 5.2.2 below). It was therefore conjectured by Shalev that a similar result also holds for iterated commutators (the so-called Engel words). The study of this case started in [9] (for $G = \mathrm{SL}(2, q)$, see subsection 6.1). The words of the form $w = x^a y^b \in F_2$ have also attracted special interest. Larsen, Shalev and Tiep [73] proved that any such word is surjective on sufficiently large finite simple groups. By further recent results of Guralnick and Malle [53] and of Liebeck, O'Brien, Shalev and Tiep [81], some words of the form $x^b y^b$ are known to be surjective on all finite simple groups. The particular case of surjectivity of these words on $\mathrm{SL}(2, q)$ was studied in [8] (see subsection 6.1).

A ramification of this sort of questions, where one asks whether any element $g \in G$ (or "most" of them) can be represented as a product of at most $k$ values of $w$ ($k$ is a fixed natural number), has been christened "a Waring-type problem" by Shalev (by analogy to a celebrated problem on representing a natural number as a sum of $k$ powers), and a number of impressive results have been obtained. The most conclusive one, due to Larsen, Shalev and Tiep, says that for every nontrivial word $w$ there exists a constant $C(w)$ such that if $G$ is a finite simple group satisfying $\#G > C(w)$ then $w(G)^2 = G$, see [71, 73, 74, 114] and subsection 5.2.1 below.

Note that a naive question whether $w(G) = G$ for any nontrivial word $w$ and all sufficiently large finite simple non-abelian groups $G$ is clearly answered in the negative: indeed, it is easy to see that if $G$ is a finite group and $m$ is an integer which is not relatively prime to the order of $G$ then for the word $w = x^m$ one has $w(G) \neq G$. Hence, if $v \in F_d$ is any word, then the word map corresponding to $w = v^m$ cannot be surjective. A natural question, suggested by Shalev, whether these words are generally the only exceptions for surjectivity of word maps in finite non-abelian simple groups, was recently answered in the negative by Jambor, Liebeck and O'Brien [60] (see Section 7).
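The obstruction for power words is easy to observe by brute force. The following Python sketch is only an illustration under our own conventions (permutations stored as tuples, helper names `compose`, `power`, `alternating_group`, `power_word_image` are hypothetical and not taken from the survey): it lists the alternating group $A_5$ of order 60 and counts the values of $w = x^m$.

```python
from itertools import permutations

def compose(p, q):
    """Product of permutations stored as tuples: apply q first, then p."""
    return tuple(p[i] for i in q)

def is_even(p):
    """Parity of a permutation via its number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

def alternating_group(n):
    """All even permutations of {0, ..., n-1}."""
    return [p for p in permutations(range(n)) if is_even(p)]

def power(p, m):
    """p**m for m >= 0, by repeated multiplication."""
    result = tuple(range(len(p)))
    for _ in range(m):
        result = compose(result, p)
    return result

def power_word_image(group, m):
    """The value set w(G) of the power word w = x^m."""
    return {power(g, m) for g in group}
```

For instance, `len(power_word_image(alternating_group(5), 2))` is strictly smaller than 60 because $A_5$ has involutions but no elements of order 4, whereas the exponent $m = 7$, coprime to 60, gives a bijection.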

Yet another type of problems, first introduced in [41], arises when one asks about the behaviour of the fibres of the word map rather than about its image. Namely, Shalev asked [113, Problem 2.10] whether (the cardinalities of) these fibres are equidistributed (or close to being equidistributed) when $g$ varies in some "large" subset of the image of $w$ and $G$ runs over some family of finite groups. It was proved in [41] that the word $w = x^2y^2 \in F_2$, the commutator word $w = [x,y] \in F_2$, as well as the words $w = [x_1,\dots,x_d] \in F_d$, $d$-fold commutators in any arrangement of brackets, are almost equidistributed on the family of finite simple non-abelian groups. The case $G = \mathrm{SL}(2, q)$ was studied in some more detail in [8, 9, 14], see Section 6.

We do not pretend that the present overview of word maps is comprehensive. As one can see from the title, shamelessly plagiarized from Manin's book [91], it was conceived for emphasizing the role of various tools lying outside group theory, particularly, of algebraic-geometric and arithmetic-geometric nature, in solving hard group-theoretic problems (note, however, that the role of the "algebra-geometry-arithmetic" triad in Manin's setting was somewhat different: his focus was on systematic use of algebraic, mainly Galois-cohomological, methods in relating arithmetic phenomena, such as counter-examples to local-global principles, to geometric ones, such as non-rationality; our viewpoint is different: a typical target is a group-theoretic question whereas algebraic and arithmetic geometry provide an efficient machinery). The reader inclined to purely algebraic approaches to word equations is referred to an excellent exposition given in the monograph by Segal [111] and the more recent survey by Nikolov [97]. The vast literature on equations and systems of equations in free (and close to free) groups is left aside; see, e.g., [24], particularly the introduction, for a comprehensive bibliographical survey of this theory.

2. Main tools

In this section we shall briefly describe some common machinery used while treating word matrix equations, leaving details, which may vary from one problem to another, for subsequent relevant sections (note, in particular, that the problems discussed in Section 5 require lots of other techniques).

2.1. Crucial assumptions

Our general setting is as follows.

(I) We view a matrix group $G$ as an algebraic variety $G \subset \mathbb{A}^N$ embedded into an affine space, where the entries of a matrix $v \in G$ play the role of its coordinates $(x_1(v),\dots,x_N(v)) = (v_1,\dots, v_N)$ (though this is not the case if $G$ is a Suzuki group, $G$ then has a "large" subset meeting this condition; see Section 3).

(II) The group product $(v_1, v_2) \mapsto v = v_1v_2$ is a polynomial morphism $G \times G \to G$.

Changing $N$ if necessary, we can guarantee that the inversion will also be a polynomial map; in most cases, however, this is not needed because usually we work with matrices of determinant 1.

Let $d \ge 2$ be an integer, and let $w = w(x_1,\dots,x_d)$ be a nontrivial element of the free group $F_d = \langle x_1,\dots,x_d\rangle$ on $d$ letters $x_i$. This means that it is a reduced word in the letters $x_i$ and $x_i^{-1}$ with nonzero exponents. Then, given a group $H$, the word $w$ defines a map $f_w\colon H^d \to H$,

$$f_w(h_1,\dots,h_d) = w(h_1,\dots,h_d).$$

When no confusion may arise, we will shorten fw to w.

Provided (I) and (II), we may interpret an endomorphism of a group as a polynomial morphism $G \to G$ and a word $w(x_1,\dots,x_d)$ as a morphism $G^d \to G$. Similarly, for a word map $w\colon G^d \to G$ and an element $g \in G$, the solutions of the equation $w(x_1,\dots,x_d) = g$ correspond to an affine subvariety $S(w,g) \subset G^d$. If $G$ is a finite group (and this is the case in most applications considered in the present survey), we can choose the ground field to be a finite field $\mathbb{F}$. If the problem under consideration requires a study of a family of finite matrix groups $G_q$, each defined over its own ground field $\mathbb{F}_q$, it is convenient to view $G_q$ as a fibre of a group $\mathbb{Z}$-scheme $G$. Then solutions to the equation $w(x_1,\dots,x_d) = g$ are described by points $S(w, g)(\mathbb{F}_q)$. This leads to the following local-global approach: instead of solving equations in an infinite family of groups $G_q$, one has to study a single $\mathbb{Z}$-scheme $S(w, g)$ and then look at its fibres at every $q$.
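To make assumptions (I) and (II) concrete, here is a minimal Python sketch, assuming $G = \mathrm{SL}(2, p)$ for a prime $p$ and a word encoded as a list of pairs (letter index, exponent). The encoding and the names `mat_mul`, `mat_inv`, `word_map`, `COMMUTATOR` are our own illustrative choices. Multiplication is polynomial in the matrix entries, and for determinant 1 the inverse is the adjugate, so no division is needed.

```python
def mat_mul(a, b, p):
    """Product of 2x2 matrices ((a11, a12), (a21, a22)) with entries modulo p."""
    return (
        ((a[0][0] * b[0][0] + a[0][1] * b[1][0]) % p,
         (a[0][0] * b[0][1] + a[0][1] * b[1][1]) % p),
        ((a[1][0] * b[0][0] + a[1][1] * b[1][0]) % p,
         (a[1][0] * b[0][1] + a[1][1] * b[1][1]) % p),
    )

def mat_inv(a, p):
    """Inverse of a determinant-1 matrix: the adjugate, a polynomial map."""
    return ((a[1][1] % p, -a[0][1] % p), (-a[1][0] % p, a[0][0] % p))

def word_map(word, args, p):
    """Evaluate f_w(h_1, ..., h_d) for word = [(letter_index, exponent), ...]."""
    result = ((1 % p, 0), (0, 1 % p))
    for index, exponent in word:
        factor = args[index] if exponent > 0 else mat_inv(args[index], p)
        for _ in range(abs(exponent)):
            result = mat_mul(result, factor, p)
    return result

# The commutator word x y x^-1 y^-1 in this (hypothetical) encoding.
COMMUTATOR = [(0, 1), (1, 1), (0, -1), (1, -1)]
```

For example, with the two elementary unipotent matrices over $\mathbb{F}_5$, `word_map(COMMUTATOR, [((1, 1), (0, 1)), ((1, 0), (1, 1))], 5)` evaluates the commutator entry by entry.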

2.2. Lang-Weil inequality

Once the initial group-theoretic problem is reduced to an arithmetic geometry problem of proving the existence of a rational point on a variety defined over a finite field, a natural tool to use is the Lang-Weil inequality [67]. It says that if $X$ is an $n$-dimensional absolutely irreducible variety over $\mathbb{F}_q$, then asymptotically (i.e., for $q$ large enough), the number of its rational points $\#X(\mathbb{F}_q)$ does not differ too much from the number $\#\mathbb{P}^n(\mathbb{F}_q)$ of points of the $n$-dimensional projective space (which is $q^n + q^{n-1} + \dots + q + 1$). Namely, the difference is $O(q^{n-1/2})$, bounded by $C_1(X)q^{n-1/2} + C_2(X)q^{n-1}$. If one can make this inequality effective, i.e., compute or at least estimate $C_1$ and $C_2$, one can guarantee the existence of an $\mathbb{F}_q$-point on $X$ for $q$ large enough. Note that such a computation is related to deep topological properties of $X$ and requires some information on its Betti numbers (in suitable cohomology), see [45] for a nice survey of various ramifications and generalizations of the Lang-Weil inequality. In the one-dimensional case the classical Weil estimate gives an expression of the remainder term via the genus of the curve.

One should not forget another difficulty related to checking absolute irreducibility of X, which may be a highly nontrivial task in concrete examples. Most of the arising problems comprise computational aspects, and usually require some advanced computer algebra to overcome them, see Section 3.
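In the one-dimensional case the estimate can be tested numerically. The sketch below is an illustration only: the curve $y^2 = x^3 + x + 1$ and the function names are our own choices, not examples from the survey. This curve has genus 1 and is smooth over $\mathbb{F}_p$ for $p \neq 2, 31$, so the Weil estimate reads $|\#X(\mathbb{F}_p) - (p + 1)| \le 2\sqrt{p}$.

```python
import math

def count_points(a, b, p):
    """Number of F_p-points on the projective closure of y^2 = x^3 + a*x + b."""
    # How many y in F_p have a given square.
    square_roots = {}
    for y in range(p):
        square_roots[y * y % p] = square_roots.get(y * y % p, 0) + 1
    affine = sum(square_roots.get((x ** 3 + a * x + b) % p, 0) for x in range(p))
    return affine + 1  # the single point at infinity

def weil_deviation(a, b, p):
    """Return (|#X(F_p) - (p + 1)|, 2*sqrt(p)) for a genus-1 curve."""
    return abs(count_points(a, b, p) - (p + 1)), 2 * math.sqrt(p)
```

Calling `weil_deviation(1, 1, p)` for a few primes shows the deviation staying below $2\sqrt{p}$; for higher genus the constant 2 is replaced by twice the genus.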

2.3. Trace map

In the case where G = SL(2), apart from the general techniques described in the previous section, we have at our disposal another powerful tool going back to classical works of Vogt, Fricke and Klein [37, 38, 124], cited here from the paper [56] (see also [47, 89, 90] for a nice exposition of these results).

Let $F_d = \langle x_1,\dots,x_d\rangle$ denote the free group on $d$ generators. For $G = \mathrm{SL}(2, k)$ ($k$ is any commutative ring with 1) and for any $u \in F_d$ denote by $\operatorname{tr} u\colon G^d \to k$ the trace character, $(g_1,\dots, g_d) \mapsto \operatorname{tr}(u(g_1,\dots, g_d))$.

Theorem 2.1.

If u is an arbitrary element of Fd, then the character of u can be expressed as a polynomial

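$$\operatorname{tr} u = P(t_1,\dots,t_d,t_{12},\dots,t_{12\dots d})$$

with integer coefficients in the $2^d - 1$ characters $t_{i_1 i_2 \dots i_\nu} = \operatorname{tr}(x_{i_1} x_{i_2} \cdots x_{i_\nu})$, $1 \le \nu \le d$, $1 \le i_1 < i_2 < \dots < i_\nu \le d$.

Let $G = \mathrm{SL}(2, q)$, and let $\pi\colon G^d \to \mathbb{A}^{2^d-1}$ be defined by

$$\pi(g_1,\dots, g_d) = (t_1,\dots, t_d, t_{12},\dots, t_{12\dots d})$$

in the notation of Theorem 2.1.

Let $Z_d = \pi(G^d) \subset \mathbb{A}^{2^d-1}$. Let $w\colon G^d \to G$ be a word map. It follows from Theorem 2.1 that for every $d$ there exists a polynomial map $\psi\colon \mathbb{A}^{2^d-1} \to \mathbb{A}^1$ such that the following diagram commutes:

$$\begin{array}{ccc}
G^d & \xrightarrow{\ w\ } & G \\
\downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \operatorname{tr}} \\
Z_d(\mathbb{F}_q) & \xrightarrow{\ \psi\ } & \mathbb{A}^1(\mathbb{F}_q)
\end{array}$$

Moreover, for small $d$ we have more precise information: one can take $Z_2 = \mathbb{A}^3$ and $Z_3 \subset \mathbb{A}^7$ an explicitly given hypersurface. This diagram allows one to reduce the study of the image and fibres of $w$ to the corresponding problems for $\psi$, which may be much simpler, see subsections 4.1.3 and 4.1.4.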
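For $d = 2$ and the commutator word the polynomial of Theorem 2.1 is the classical Fricke polynomial: writing $s = \operatorname{tr} x$, $t = \operatorname{tr} y$, $u = \operatorname{tr}(xy)$, one has $\operatorname{tr}(xyx^{-1}y^{-1}) = s^2 + t^2 + u^2 - stu - 2$. The following Python sketch (our own illustration; matrices are stored as 4-tuples and all function names are hypothetical) checks this identity by brute force over $\mathrm{SL}(2, p)$.

```python
from itertools import product

def sl2(p):
    """All 2x2 matrices (a, b, c, d) over F_p with a*d - b*c = 1."""
    return [m for m in product(range(p), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % p == 1]

def mul(x, y, p):
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

def inv(x, p):
    """Inverse of a determinant-1 matrix (its adjugate)."""
    a, b, c, d = x
    return (d % p, -b % p, -c % p, a % p)

def tr(x, p):
    return (x[0] + x[3]) % p

def commutator_trace_via_word(x, y, p):
    """tr(x y x^-1 y^-1), computed by multiplying matrices."""
    return tr(mul(mul(x, y, p), mul(inv(x, p), inv(y, p), p), p), p)

def commutator_trace_via_polynomial(x, y, p):
    """s^2 + t^2 + u^2 - s*t*u - 2 evaluated at the three traces."""
    s, u, t = tr(x, p), tr(mul(x, y, p), p), tr(y, p)
    return (s * s + t * t + u * u - s * t * u - 2) % p
```

The two functions agree on every pair of matrices, so questions about the traces of commutators in $\mathrm{SL}(2, p)$ become questions about one polynomial in three variables.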

3. Equations in groups of Lie rank one and characterization of finite solvable groups

As mentioned in the introduction, even a solution of only one word equation in a small family of finite groups may lead to spectacular consequences. Here is an instance of such a phenomenon.

We consider a problem of characterizing various classes of groups by identical laws. Say, $G$ is abelian if and only if $[x, y] = 1$ for all $x, y \in G$; a finite group $G$ is nilpotent if and only if there exists $n = n(G)$ such that

$$e_n(x, y) = [\dots[[x, y], y], \dots, y] = 1 \quad (y \text{ appearing } n \text{ times})$$

(Zorn's theorem [128]); what about finite solvable groups?

There are various characterizations of finite solvable groups (some of them are discussed in [52]), among which there are those using identities. For instance, a finite group G satisfying a "short" identity of some special form must be solvable [23] (we thank the second referee for providing this reference). We want to obtain a criterion as close as possible to Zorn's theorem.

Two-variable sequences

The purpose is to establish a characterization of the solvable groups in the class of finite groups by an inductively defined, Engel-like sequence of two-variable identities. This problem has a long history and admits many counterparts and generalizations; the interested reader is referred to [52] for a survey.

Definition 3.1.

Let $w \in F_2$ and $f \in F_3$.

(i) We say that a two-variable sequence $\{v_n(x, y)\}$ is defined by the recursive law $(w, f)$ if

$$v_1(x,y) = w(x,y), \qquad v_{n+1}(x,y) = f(x,y,v_n(x,y)).$$

(ii) We say that the sequence is $k$-valent, $k = 1, 2, 3$, if $f = f(x,y,z)$ depends on $k$ variables among which $z$ must appear.

(iii) We say that the sequence is Engel-like if $f(x, y, 1) = 1$ in $F_3$.

(iv) We say that the sequence $\{v_n\}$ characterizes finite solvable groups if the following holds: A finite group $G$ is solvable if and only if there exists $n \in \mathbb{N}$ such that for all $(x, y) \in G \times G$, $v_n(x, y) = 1$.

Remark 3.2.

If $\{v_n\}$ is an Engel-like sequence, the condition $v_n(x, y) = 1$ implies that $v_m(x, y) = 1$ for all $m > n$.

Example 3.3.

The original Engel sequence corresponds to w(x, y) = [x, y], f(x, y,z) = [z, y]. It is 2-valent and Engel-like in the sense of the previous definition but, of course, does not characterize finite solvable groups.

The first example of a sequence satisfying the conditions of Definition 3.1 was found in [11]. Let the commutator be defined as $[a,b] = aba^{-1}b^{-1}$. Set

$$w(x, y) = x^{-2}y^{-1}x, \qquad f(x, y, z) = [xzx^{-1}, yzy^{-1}].$$

The corresponding sequence is

$$u_1(x,y) = x^{-2}y^{-1}x, \qquad u_{n+1}(x,y) = [x\,u_n(x,y)\,x^{-1},\ y\,u_n(x,y)\,y^{-1}]. \tag{3}$$

Theorem 3.4 ([11, 12]).

A finite group $G$ is solvable if and only if for some $n$ we have $u_n(x, y) = 1$ for all $x, y \in G$.

This section is a brief exposition of the proof of this theorem. It involves algebraic geometry, arithmetic geometry, algebra and heavy MAGMA and SINGULAR computations.

Remark 3.5.

The role of the first word is very important: for example, if one changes the first word to a more natural one, $u_1(x, y) = w(x, y) = [x, y]$, keeping the same recursive law $f$, we do not know whether or not the obtained sequence characterizes finite solvable groups. Moreover, neither do we know this for the prototypical sequence arising from $u_1(x,y) = [x,y]$, $f(x,y,z) = [[z,x], [z,y]]$, which characterizes finite-dimensional solvable Lie algebras (if $[\,\cdot\,, \cdot\,]$ is understood as Lie bracket) [51].

More 3-valent sequences $\{u_n\}$ for which Theorem 3.4 holds were produced in [107]. We conjecture after long computer experiments that for most (if not for any) choices of $f$ satisfying obvious necessary conditions, there is an initial word $u_1$ such that Theorem 3.4 holds; see the discussion in subsection 4.1.4 and Conjecture 7.1 below.

An example of a 2-valent sequence which still characterizes finite solvable groups was given in [20],

$$s_1(x,y) = x, \qquad s_{n+1}(x,y) = [y\,s_n(x,y)\,y^{-1},\ s_n(x,y)^{-1}]. \tag{4}$$

Theorem 3.6 ([20]).

A finite group $G$ is solvable if and only if for some $n$ we have $s_n(x, y) = 1$ for all $x, y \in G$.

Whereas the proof of Theorem 3.4 is based on solvability of a single word equation in a family of simple groups (see the next section), the proof of Theorem 3.6 requires solvability of a system of countably many equations and has an evident dynamical flavour, see subsection 4.1.

Equations in SL(2) and Suzuki groups

In this section we sketch a proof of Theorem 3.4. Clearly in every solvable group the identities $u_n(x, y) = 1$ are satisfied from a certain $n \in \mathbb{N}$ onward. By a standard argument, using minimal counter-examples, the nontrivial direction of Theorem 3.4 follows from

Theorem 3.7.

Let $G$ be a finite non-abelian simple group. Then there are $x, y \in G$ such that

$$u_1(x, y) = u_2(x,y), \qquad u_1(x, y) \neq 1. \tag{5}$$

Moreover, the same argument shows that we only need to prove Theorem 3.7 for the groups $G$ in J. Thompson's list of the minimal simple non-solvable groups [119]. For simplicity we slightly extend it:

(a) $G = \mathrm{PSL}(2, q)$ where $q \ge 4$ ($q = p^m$, $p$ prime),

(b) $G = \mathrm{Sz}(2^{2m+1})$, $m \in \mathbb{N}$,

(c) $G = \mathrm{PSL}(3, 3)$,

because this does not make our proof more complicated. Here $\mathrm{Sz}(2^{2m+1})$, $m \in \mathbb{N}$, denotes the Suzuki groups (see [59, XI.3]).

For small groups from this list it is an easy computer exercise to verify Theorem 3.7. There are for example altogether 44928 suitable pairs x, y in the group PSL(3,3).
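As an illustration of such a computer exercise, here is a minimal Python sketch for the smallest group on the list, $\mathrm{PSL}(2,4) \cong \mathrm{PSL}(2,5) \cong A_5$, realized as even permutations of five points. The permutation model and all function names are our own assumptions, not the MAGMA code used in the original papers. It searches for pairs with $u_1(x,y) = u_2(x,y)$ and $u_1(x,y) \neq 1$, where $u_1 = x^{-2}y^{-1}x$ and $u_2 = [xu_1x^{-1}, yu_1y^{-1}]$ with $[a,b] = aba^{-1}b^{-1}$.

```python
from itertools import permutations

def compose(p, q):
    """Product p*q of permutations stored as tuples: apply q first, then p."""
    return tuple(p[i] for i in q)

def inverse(p):
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)

def commutator(a, b):
    """[a, b] = a b a^-1 b^-1."""
    return compose(compose(a, b), compose(inverse(a), inverse(b)))

def conjugate(g, h):
    """g h g^-1."""
    return compose(compose(g, h), inverse(g))

def u1(x, y):
    """Initial word u_1(x, y) = x^-2 y^-1 x."""
    xi = inverse(x)
    return compose(compose(xi, xi), compose(inverse(y), x))

def u_next(x, y, u):
    """Recursive law u_{n+1} = [x u_n x^-1, y u_n y^-1]."""
    return commutator(conjugate(x, u), conjugate(y, u))

def is_even(p):
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

def symmetric_group(n):
    return list(permutations(range(n)))

def alternating_group(n):
    return [p for p in symmetric_group(n) if is_even(p)]

def solutions_of_5(group):
    """Pairs (x, y) with u_1(x, y) = u_2(x, y) and u_1(x, y) != 1."""
    identity = tuple(range(len(group[0])))
    found = []
    for x in group:
        for y in group:
            first = u1(x, y)
            if first != identity and u_next(x, y, first) == first:
                found.append((x, y))
    return found
```

By Theorem 3.7 the search in `alternating_group(5)` must succeed, while in a solvable group such as `symmetric_group(4)` it must return nothing, since a solution would force $u_n(x,y) \neq 1$ for all $n$.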

The general idea of the proof is as described in Section 2. For a group $G$ in the above list, using a matrix representation over $\mathbb{F}_q$, we interpret solutions of the equation $u_1(x,y) = u_2(x,y)$ as $\mathbb{F}_q$-rational points of an algebraic variety. Estimates of Lang-Weil type for the number of rational points on a variety defined over a finite field guarantee in appropriate circumstances the existence of such points for big $q$. Of course we are faced here with the extra difficulty of having to ensure that $u_1(x, y) \neq 1$ holds. This is achieved by taking $x, y$ from appropriate Zariski-open subsets only.

The case G = PSL(2, q)

Let us look for a solution to (5) among the matrices of the form

$$x(v) = \begin{pmatrix} t & -1 \\ 1 & 0 \end{pmatrix}, \qquad y(v) = \begin{pmatrix} 1 & b \\ c & 1 + bc \end{pmatrix},$$

where $v = (t, b, c) \in \mathbb{A}^3$.

SINGULAR computations show that the equation $u_1(x(v), y(v)) = u_2(x(v), y(v))$ defines an algebraic curve $C \subset \mathbb{A}^3$. Note that a priori one should expect $\dim C = 0$ because $C$ is defined by three equations in three variables. It is this dimension jump that forced a somewhat peculiar choice of the initial word $u_1(x, y)$.

It turns out that

• $C$ does not depend on $\mathbb{F}_q$;

• $C$ is absolutely irreducible for any $p$;

• $p_a(\bar{C}) = 10$, $\deg \bar{C} = 12$ (where $\bar{C}$ denotes the projective closure of $C$, $p_a$ is the arithmetic genus and $\deg$ is the degree).

(The second and third statements were established by computer calculations; the proof of absolute irreducibility is technically the most difficult part; the interested readers are referred to the original papers.)

From Weil's estimate for the number of rational points over a finite field it follows that $\#C(\mathbb{F}_q) \ge q + 1 - 2p_a\sqrt{q} - d$, hence for $q > 593$ there are enough rational points on $C$ to prove Theorem 3.7. Solutions for (5) in the groups $G = \mathrm{PSL}(2, q)$, $q < 593$, were found by computer.
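The size of this lower bound is easy to tabulate. In the Python sketch below the arithmetic genus 10 is taken from the text, while reading $d$ as the degree 12 of the projective closure is our assumption (the text does not define $d$ explicitly); the function name is hypothetical.

```python
import math

P_A = 10      # arithmetic genus of the projective closure, as stated in the text
DEGREE = 12   # degree of the projective closure; used as d here (our assumption)

def weil_lower_bound(q, p_a=P_A, d=DEGREE):
    """Lower bound q + 1 - 2*p_a*sqrt(q) - d for the number of F_q-points on C."""
    return q + 1 - 2 * p_a * math.sqrt(q) - d
```

Under these assumptions `weil_lower_bound(593)` is roughly 95, whereas `weil_lower_bound(256)` is negative, which illustrates why small fields have to be handled by direct computer search.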

The case of Suzuki groups

In the proof of Theorem 3.7 the case of Suzuki groups $G = \mathrm{Sz}(q)$, $q = 2^{2m+1}$, is the most difficult one. The main reason is that although $\mathrm{Sz}(q)$ is contained in $\mathrm{GL}(4, q)$, it is not a fibre of a $\mathbb{Z}$-scheme.

In fact the group Sz(q) is defined with the help of a field automorphism of Fq ("square root of Frobenius", see, e.g., [58, Chapter 20] for a precise definition), and hence the standard matrix representation for Sz(q), obtained in the original paper of Suzuki [117], contains entries depending on q. We shall describe now how our problem can still be treated by methods of algebraic geometry.

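This time one looks for a solution of (5) among the matrices parameterized by points of an 8-dimensional space: for $v = (a, b, c, d, a_0, b_0, c_0, d_0) \in \mathbb{A}^8$, let

$$x(v) = \begin{pmatrix} a^2a_0 + ab + b_0 & b & a & 1 \\ aa_0 + b & a_0 & 1 & 0 \\ a & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \qquad y(v) = \begin{pmatrix} c^2c_0 + cd + d_0 & d & c & 1 \\ cc_0 + d & c_0 & 1 & 0 \\ c & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}.$$

Moreover, $x(v) \in \mathrm{Sz}(q)$ if and only if

$$a_0 = a^{2^{m+1}}, \qquad b_0 = b^{2^{m+1}}. \tag{6}$$

Similarly, $y(v) \in \mathrm{Sz}(q)$ if and only if

$$c_0 = c^{2^{m+1}}, \qquad d_0 = d^{2^{m+1}}. \tag{7}$$

Define $\Sigma = \{v : u_1(x(v), y(v)) = u_2(x(v), y(v))\} \subset \mathbb{A}^8$. From relations (6) and (7) it follows that

$$\Sigma(\mathbb{F}_q) = \Sigma \cap \{a_0 = a^{2^{m+1}},\ b_0 = b^{2^{m+1}},\ c_0 = c^{2^{m+1}},\ d_0 = d^{2^{m+1}}\}.$$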

In order to prove that the set $\Sigma(\mathbb{F}_q)$ is not empty, it is represented as the set of fixed points of an automorphism of $\mathbb{A}^8$. Namely, consider an automorphism $\alpha\colon \mathbb{A}^8 \to \mathbb{A}^8$ defined by

$$\alpha(a, b, c, d, a_0, b_0, c_0, d_0) = (a_0, b_0, c_0, d_0, a^2, b^2, c^2, d^2).$$

Then $\alpha^2$ is the Frobenius automorphism. Moreover, from (6) and (7) it follows that $x(v), y(v) \in \mathrm{Sz}(q)$ if and only if $\alpha^n(v) = v$ with $n = 2m+1$ (so that $q = 2^n$), i.e., if and only if $v$ is a fixed point of the $n$th iteration of $\alpha$. Computations show that there exists a subset $U \subset \Sigma$ such that

(i) $U$ is a smooth, absolutely irreducible, affine variety,

(ii) $\dim U = 2$,

(iii) $U$ is $\alpha$-invariant.

Denote by $b^i(U) = \dim H^i_{\text{ét}}(U, \overline{\mathbb{Q}}_\ell)$ the $\ell$-adic Betti number, $\ell \neq 2$. Then

Proposition 3.8.

$b^1(U) \le 675$ and $b^2(U) \le 2^{22}$.

The estimates contained in Proposition 3.8 are derived from results of Adolphson-Sperber [2] and Ghorpade-Lachaud [46] allowing one to bound the Euler characteristic of an affine variety in terms of the number of variables, the number of defining polynomials and their degrees. Note that since $U$ is affine, we have $b^3(U) = b^4(U) = 0$. Since $U$ is nonsingular, the ordinary and compact Betti numbers of $U$ are related by the Poincaré duality, and we have $b^i_c(U) = b^{4-i}(U)$.

In order to estimate the number $\#\operatorname{Fix}(U, n)$ of $\alpha^n$-fixed points in $U$, the Lefschetz Trace Formula was applied:

$$\#\operatorname{Fix}(U, n) = \sum_{i=0}^{4} (-1)^i \operatorname{tr}\bigl(\alpha^n \mid H^i_c(U, \overline{\mathbb{Q}}_\ell)\bigr), \tag{8}$$

where $\operatorname{Fix}(U,n)$ is the set of fixed points of $\alpha^n$ acting on $U$.

(A remark for the reader interested in algebraic-geometric details: technical difficulties arising from the fact that $U$ is not projective and $\alpha$ is not Frobenius were overcome, roughly, as follows: $\alpha$ can be extended to an endomorphism of $\mathbb{P}^8$ having no fixed points in $\bar{U} \setminus U$, where $\bar{U}$ is the projective closure of $U$. Then one can use Deligne's conjecture stating that a formula of Lefschetz type holds after composing $\alpha$ with a sufficiently high power of Frobenius, which, in our case, means a high odd power of $\alpha$. Recall that Deligne's conjecture has been proved by Fujiwara [40]; see [123] for simplifications and generalizations.)

From formula (8) and Deligne's estimates for the eigenvalues of the endomorphism induced by $\alpha$ on the étale cohomology, an inequality of Lang-Weil type follows:

$$\bigl|\#\operatorname{Fix}(U,n) - 2^n\bigr| \le b^1(U)\,2^{3n/4} + b^2(U)\,2^{n/2}.$$

An easy estimate shows that $\#\operatorname{Fix}(U,n) \neq 0$ for $n > 48$. The cases $n < 48$ were checked with the help of MAGMA.

4. Arithmetic dynamics

In this section we present two group-theoretic problems for which the language of arithmetic dynamics appears to be an adequate one. The first one, discussed in subsection 4.1, arose from attempts to understand the proof of Bray, J. Wilson and R. Wilson [20], who exhibited another sequence characterizing finite solvable groups (see Section 3), find an explanation of the phenomenon that a sequence possesses this property, and produce more such sequences. This was essentially done in [13], where the interested reader can find more elaborate constructions in arithmetic dynamics which are beyond the scope of the present survey.

Another instance is related to the work of Borisov and Sapir [18, 19], see subsection 4.2, where somewhat similar philosophy led to an answer to another long-standing group-theoretic question (this time, from the theory of infinite groups).

4.1. Verbal dynamical systems

Given a two-variable sequence $\{v_n(x,y)\}$ defined by a recursive law $(w, f)$ (see Definition 3.1), one can ask whether or not it characterizes finite solvable groups. An obvious necessary condition is that $\{v_n\}$ must descend along the derived series, and we always assume that this condition holds. We also assume that $\{v_n\}$ does not contain the identity word. Then the only condition to check is the following one:

$$G \text{ is not solvable} \implies \text{for all } n \text{ there exists } (x, y) \in G \times G \text{ such that } v_n(x, y) \neq 1.$$

This condition may be further reduced (see Section 3) to the following property.

Property 4.1.

Let $G$ be one of the groups $\mathrm{PSL}(2, q)$, $q \neq 2, 3$, $\mathrm{Sz}(2^{2m+1})$, or $\mathrm{PSL}(3, 3)$. Then for all $n$ there exists $(x, y) \in G \times G$ such that $v_n(x, y) \neq 1$.

Since it is very easy to check the single case $\mathrm{PSL}(3, 3)$, we assume throughout below that the property holds for this group.

We will say that a sequence is very good if Property 4.1 holds for all groups listed therein and is good if it holds at least for all $\mathrm{PSL}(2, q)$, $q \neq 2, 3$. In this section we want to describe good sequences.

For a further simplification, the following observation is crucial: if $\{v_n\}$ is an Engel-like sequence then Property 4.1 holds in a group $G$ as soon as there exist $n$, $m > n$ and $(x, y) \in G \times G$ such that

$$v_n(x,y) = v_m(x,y) \neq 1.$$

This simple reformulation allows one to replace the proof of solvability of a concrete equation (say, $v_1(x, y) = v_2(x, y) \neq 1$) in a family of groups, as was done in Section 3, by the proof of existence of a preperiodic point (different from identity) of a certain dynamical system generated by the recursive law $f$.

Recall that a sequence $\{v_n(x,y)\}$ given by the pair $(w(x,y), f(x,y,z))$, i.e., generated by the first word $w$ and the recursive law $f$, is formed by the rule

$$v_1(x, y) = w(x,y), \qquad v_{n+1}(x,y) = f(x,y,v_n(x,y)).$$

In such a situation, one can define, for any group $G$, a self-map $G \times G \times G \to G \times G \times G$ by adding "tautological" variables. We arrive at the notion of verbal dynamical system $(G, \tilde{\varphi}, V)$ consisting of the following data:

• a group scheme $G$ (in our case $G = \mathrm{SL}(2, \mathbb{Z})$);

• a morphism $\varphi\colon G \times G \times G \to G$ induced by the word $f(x, y, z)$;

• an endomorphism $\tilde{\varphi}\colon G \times G \times G \to G \times G \times G$ defined by $\tilde{\varphi}(x, y, z) = (x, y, \varphi(x, y, z))$;

• a forbidden set $V$ (in our case $V = G \times G \times \{\mathrm{id}\}$);

• an initial word $w\colon G \times G \to G$.

Conversely, given such data, we reconstruct our iterative sequence {vn(x, y)}.
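The reformulation above can be tried out by brute force in a small group. The following Python sketch is only an illustration under stated assumptions: it works in SL(2, p) for a prime p, it uses the Engel law $f(x,y,z) = [z,y]$ with first word $w = [x,y]$ as a convenient stand-in (this is not one of the sequences characterizing solvable groups), and all function names are hypothetical. For a pair $(x,y)$ it iterates $v_{n+1} = f(x,y,v_n)$ until a value repeats, and reports a pair for which the repeated value is not the identity, i.e. $v_n = v_m \ne 1$.

```python
from itertools import product

IDENTITY = (1, 0, 0, 1)  # 2x2 identity matrix stored as (a11, a12, a21, a22)


def mul(a, b, p):
    """Product of two 2x2 matrices over F_p."""
    return ((a[0] * b[0] + a[1] * b[2]) % p, (a[0] * b[1] + a[1] * b[3]) % p,
            (a[2] * b[0] + a[3] * b[2]) % p, (a[2] * b[1] + a[3] * b[3]) % p)


def inv(a, p):
    """Inverse of a determinant-one matrix (its adjugate)."""
    return (a[3] % p, -a[1] % p, -a[2] % p, a[0] % p)


def comm(a, b, p):
    """Commutator [a, b] = a b a^{-1} b^{-1}."""
    return mul(mul(a, b, p), mul(inv(a, p), inv(b, p), p), p)


def sl2(p):
    """All elements of SL(2, p), p prime."""
    return [m for m in product(range(p), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % p == 1]


def engel_w(x, y, p):
    """Illustrative first word w(x, y) = [x, y]."""
    return comm(x, y, p)


def engel_f(x, y, z, p):
    """Illustrative recursive law f(x, y, z) = [z, y]."""
    return comm(z, y, p)


def first_repeat(x, y, w, f, p):
    """Return (n, m, v) with n < m and v_n = v_m = v."""
    seen = {}
    v, n = w(x, y, p), 1
    while v not in seen:
        seen[v] = n
        v, n = f(x, y, v, p), n + 1
    return seen[v], n, v


def find_good_pair(p, w=engel_w, f=engel_f):
    """Search for (x, y) with v_n = v_m != id; return (x, y, n, m) or None."""
    group = sl2(p)
    for x in group:
        for y in group:
            n, m, v = first_repeat(x, y, w, f, p)
            if v != IDENTITY:
                return x, y, n, m
    return None
```

Calling `find_good_pair(5)` searches SL(2, 5); replacing `engel_w` and `engel_f` by other words gives the same test for other recursive laws.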

For each field $\mathbb{F}_q$ we consider the fibre $(SL(2, q), \tilde\varphi_q, V(\mathbb{F}_q))$. The sequence $\{v_n\}$ is good if for every $q$ there are $\tilde\varphi_q$-preperiodic points outside $V(\mathbb{F}_q)$. Consider three possible types of $f$.

4.1.1. 1-valent law

The only 1-valent law is $f(v) = v^s$. Then $v_{n+1} = v_n^s$. This sequence does not descend along the derived series and hence does not meet the necessary condition for characterizing solvable groups.

4.1.2. PSL(2), 2-valent law

An example of a 2-valent law is sequence (4). In such a case our general setting can be simplified. Namely, if $f = f(y,z)$ does not depend on $x$, we can restrict our verbal dynamical system to the form $\tilde\varphi: G \times G \to G \times G$, $(y,z) \mapsto (y, f(y,z))$, with the forbidden set $G \times \{\mathrm{id}\}$ and initial word $w(x, y) = x$.

4.1.3. Traces

In order to further simplify the dynamical system, one can use the trace map (see subsection 2.3). In the special case d = 2, Theorem 2.1 may be formulated as follows.

Theorem 4.2.

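Let $G = SL(2, \mathbb{Z})$. Define $\pi: G \times G \to \mathbb{A}^3$ by $\pi(x, y) = (\operatorname{tr} x, \operatorname{tr}(xy), \operatorname{tr} y)$. Then for a word map $\varphi: G \times G \to G$, there is a polynomial $P_\varphi$ in three variables $(s, u, t)$ such that $\operatorname{tr} \varphi(x, y) = P_\varphi(\operatorname{tr} x, \operatorname{tr}(xy), \operatorname{tr} y)$.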
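A classical instance of such a trace polynomial is the Fricke identity for the commutator word, $\operatorname{tr}[x,y] = s^2 + u^2 + t^2 - sut - 2$ with $s = \operatorname{tr} x$, $u = \operatorname{tr}(xy)$, $t = \operatorname{tr} y$. The Python sketch below checks it exhaustively in SL(2, p) for a small prime p. The choice of the commutator word and all function names are assumptions made for illustration only.

```python
from itertools import product


def mul(a, b, p):
    """Product of two 2x2 matrices over F_p, stored as (a11, a12, a21, a22)."""
    return ((a[0] * b[0] + a[1] * b[2]) % p, (a[0] * b[1] + a[1] * b[3]) % p,
            (a[2] * b[0] + a[3] * b[2]) % p, (a[2] * b[1] + a[3] * b[3]) % p)


def inv(a, p):
    """Inverse of a determinant-one matrix (its adjugate)."""
    return (a[3] % p, -a[1] % p, -a[2] % p, a[0] % p)


def tr(a, p):
    """Trace of a 2x2 matrix over F_p."""
    return (a[0] + a[3]) % p


def sl2(p):
    """All elements of SL(2, p), p prime."""
    return [m for m in product(range(p), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % p == 1]


def commutator_trace_polynomial(s, u, t, p):
    """Fricke polynomial P(s, u, t) = s^2 + u^2 + t^2 - s*u*t - 2, reduced mod p."""
    return (s * s + u * u + t * t - s * u * t - 2) % p


def check_fricke(p):
    """True if tr[x, y] = P(tr x, tr xy, tr y) for all x, y in SL(2, p)."""
    group = sl2(p)
    for x in group:
        for y in group:
            c = mul(mul(x, y, p), mul(inv(x, p), inv(y, p), p), p)
            expected = commutator_trace_polynomial(
                tr(x, p), tr(mul(x, y, p), p), tr(y, p), p)
            if tr(c, p) != expected:
                return False
    return True
```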

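Denote $s = \operatorname{tr} x$, $u = \operatorname{tr}(xy)$, $t = \operatorname{tr} y$. Let $f_1(s, u, t) = \operatorname{tr} \varphi(x, y)$, $f_2(s, u, t) = \operatorname{tr}(\varphi(x, y)y)$, and $\psi = (f_1(s, u, t), f_2(s, u, t), t)$. According to Theorem 4.2, we have the following factorization of the verbal dynamical system $(G, \tilde\varphi, V)$, i.e., the following commutative diagram:

$$
\begin{array}{ccc}
G \times G & \xrightarrow{\ \tilde\varphi\ } & G \times G \\
{\scriptstyle \pi}\downarrow & & \downarrow{\scriptstyle \pi} \\
\mathbb{A}^3_{s,u,t} & \xrightarrow{\ \psi\ } & \mathbb{A}^3_{s,u,t}
\end{array}
\tag{9}
$$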

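In diagram (9),

• $\pi$ is defined over $\mathbb{Z}$;

• $\pi$ is surjective for all $\mathbb{F}_q$ (see, e.g., [87] or [13]);

• the set $Z$ of fixed points of $\psi$, being defined by the system $f_1(s, u, t) = s$, $f_2(s, u, t) = u$, has a positive dimension.

Respectively, for every $q$ we have a commutative diagram:

$$
\begin{array}{ccc}
SL(2, q) \times SL(2, q) & \xrightarrow{\ \tilde\varphi_q\ } & SL(2, q) \times SL(2, q) \\
{\scriptstyle \pi}\downarrow & & \downarrow{\scriptstyle \pi} \\
\mathbb{A}^3_{s,u,t}(\mathbb{F}_q) & \xrightarrow{\ \psi\ } & \mathbb{A}^3_{s,u,t}(\mathbb{F}_q)
\end{array}
\tag{10}
$$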

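Assume that $Z(\mathbb{F}_q)$ contains a curve $C$, irreducible over $\mathbb{F}_q$, of genus $g$ and degree $d$ that intersects $\pi(V)$ at $k$ points. Then by Weil's inequality

$$\#(C \setminus \pi(V))(\mathbb{F}_q) \ge (q + 1) - 2g\sqrt{q} - d - k,$$

and $(C \setminus \pi(V))(\mathbb{F}_q) \ne \emptyset$ for $q$ big enough, $q > q_0(g,d,k)$. Let $a \in (C \setminus \pi(V))(\mathbb{F}_q)$ and $F_a = \pi^{-1}(a) \subset SL(2, q) \times SL(2, q) \setminus V_q$. Then $F_a$ is a $\tilde\varphi_q$-invariant finite set and thus contains a nontrivial $\tilde\varphi_q$-preperiodic point.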
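The threshold $q_0(g,d,k)$ can be made explicit: writing $s = \sqrt{q}$, the right-hand side of Weil's inequality is the quadratic $s^2 - 2gs + (1 - d - k)$, which is positive once $s$ exceeds its larger root. The Python sketch below computes this bound and threshold. The function names and the sample values $g = 1$, $d = 0$, $k = 3$ used afterwards are illustrative assumptions, not data taken from a particular sequence.

```python
import math


def weil_lower_bound(q, g, d, k):
    """Right-hand side of Weil's inequality: (q + 1) - 2 g sqrt(q) - d - k."""
    return (q + 1) - 2 * g * math.sqrt(q) - d - k


def weil_threshold(g, d, k):
    """A real number q0 such that the lower bound is positive for every q > q0.

    With s = sqrt(q) the bound is s^2 - 2 g s + (1 - d - k); it is positive
    for s larger than the root g + sqrt(g^2 + d + k - 1).
    """
    disc = g * g + d + k - 1
    if disc < 0:
        # Only possible for g = d = k = 0, where the bound q + 1 is always positive.
        return 0.0
    return (g + math.sqrt(disc)) ** 2
```

For instance, `weil_threshold(1, 0, 3)` equals $(1 + \sqrt{3})^2 \approx 7.46$, so under these sample values the bound is positive for every prime power $q \ge 8$; smaller fields would have to be examined case by case.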

We can now formulate a procedure for checking that a 2-valent sequence {vn} is good. Namely, assume that {vn} is defined by a word (law) f(y, z). The process consists of the following steps:

(i) compute the trace map $\psi$ of the endomorphism $\tilde\varphi$;

(ii) compute the set $Z$ of fixed points of $\psi$;

(iii) find an affine curve $S \subset Z$ such that

(iii1) $S$ is defined over $\mathbb{Z}$;

(iii2) $S$ is irreducible over $\mathbb{Q}$;

(iii3) $S$ is not contained in $\pi(V)$.

If the process succeeds, then one has to check finitely many remaining cases, namely,

• for finitely many primes $p_1, \dots, p_s$ one has to check whether the curve found in step (iii) remains irreducible over $\mathbb{F}_{p_i}$ (cases of bad reduction);

• for all small fields $\mathbb{F}_q$ with $q < q_0(g, d, k)$ one has to find a nontrivial $\tilde\varphi$-preperiodic point in $SL(2, q) \times SL(2, q)$.

All these steps can be performed by computer.

Example 4.3.

The process was performed for the sequence $s_n$. The curve $Z$ contains a line in $\pi(V)$ and two curves, each of genus 1 with three punctures. Thus, each of these two curves has a point over every field $\mathbb{F}_q$, $q > 9$.

Example 4.4.

Using this process, it was proved that the 2-valent sequence $r_1(x,y) = x$, $r_{n+1}(x,y) = [y^2xy^{-2}, x^{-1}]$ is good as well. Moreover, it was checked for the Suzuki groups too. Thus it characterizes finite solvable groups.

4.1.4. 3-valent sequences

An example of a 3-valent sequence characterizing finite solvable groups is sequence (3). Consider any 3-valent sequence $v_n = (w,f)$. As explained above, it gives rise to a morphism $\varphi: G^3 \to G$ and an endomorphism $\tilde\varphi: G^3 \to G^3$, $\tilde\varphi(x, y, z) = (x, y, f(x, y, z))$. The sequence is good if for every $q$ there exists $m = m(q)$ such that there is a solution in $SL(2, q) \times SL(2, q)$ to the equation

$$v_1(x,y) = v_m(x,y) \ne 1.$$

This means that there is a pair $(x, y) \in SL(2, q) \times SL(2, q)$ such that

$$\tilde\varphi^{\,m}(x, y, w(x, y)) = (x, y, w(x, y)) \notin V_q,$$

where $V_q = SL(2, q) \times SL(2, q) \times \{\mathrm{id}\}$ denotes the forbidden set.

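As in Theorem 2.1, we express the trace of $\varphi(x, y, z)$ as a polynomial in seven variables $a_1 = \operatorname{tr} x$, $a_2 = \operatorname{tr} y$, $a_3 = \operatorname{tr} z$, $a_{12} = \operatorname{tr}(xy)$, $a_{13} = \operatorname{tr}(xz)$, $a_{23} = \operatorname{tr}(yz)$, $a_{123} = \operatorname{tr}(xyz)$. These variables are dependent (see, e.g., [89] or formulas (2.3)-(2.5) in [56]):

$$a_{123}^2 - a_{123}(a_{12}a_3 + a_{13}a_2 + a_{23}a_1 - a_1a_2a_3) + (a_1^2 + a_2^2 + a_3^2 + a_{12}^2 + a_{13}^2 + a_{23}^2 - a_1a_2a_{12} - a_1a_3a_{13} - a_2a_3a_{23} + a_{12}a_{13}a_{23} - 4) = 0.$$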
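The dependence among the seven traces can be checked numerically. The Python sketch below evaluates the left-hand side of the relation on triples of integer matrices of determinant one, produced as products of elementary matrices, and should return zero each time. The sampling scheme and function names are assumptions chosen for illustration.

```python
import random


def mul(a, b):
    """Product of two 2x2 integer matrices stored as (a11, a12, a21, a22)."""
    return (a[0] * b[0] + a[1] * b[2], a[0] * b[1] + a[1] * b[3],
            a[2] * b[0] + a[3] * b[2], a[2] * b[1] + a[3] * b[3])


def tr(a):
    """Trace of a 2x2 matrix."""
    return a[0] + a[3]


def random_sl2z(rng, steps=6):
    """A determinant-one integer matrix built from elementary matrices."""
    m = (1, 0, 0, 1)
    for _ in range(steps):
        k = rng.randint(-3, 3)
        e = (1, k, 0, 1) if rng.random() < 0.5 else (1, 0, k, 1)
        m = mul(m, e)
    return m


def trace_relation(x, y, z):
    """Left-hand side of the relation among the seven traces of (x, y, z)."""
    a1, a2, a3 = tr(x), tr(y), tr(z)
    a12, a13, a23 = tr(mul(x, y)), tr(mul(x, z)), tr(mul(y, z))
    a123 = tr(mul(mul(x, y), z))
    linear = a12 * a3 + a13 * a2 + a23 * a1 - a1 * a2 * a3
    constant = (a1 ** 2 + a2 ** 2 + a3 ** 2 + a12 ** 2 + a13 ** 2 + a23 ** 2
                - a1 * a2 * a12 - a1 * a3 * a13 - a2 * a3 * a23
                + a12 * a13 * a23 - 4)
    return a123 ** 2 - a123 * linear + constant
```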

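Let $a = (a_1, a_2, a_3, a_{12}, a_{13}, a_{23}, a_{123}) \in \mathbb{A}^7$. Let $\pi(x, y, z) = a$ be the trace projection and let $Z = \pi(G^3) \subset \mathbb{A}^7$. Then we have a commutative diagram

$$
\begin{array}{ccc}
G \times G \times G & \xrightarrow{\ \tilde\varphi\ } & G \times G \times G \\
{\scriptstyle \pi}\downarrow & & \downarrow{\scriptstyle \pi} \\
Z & \xrightarrow{\ \psi\ } & Z
\end{array}
\tag{11}
$$

where $\psi(a) = (a_1, a_2, l_1(a), a_{12}, l_2(a), l_3(a), l_4(a))$,

$$l_1 = \operatorname{tr} \varphi(x,y,z), \quad l_2 = \operatorname{tr}(\varphi(x,y,z)x), \quad l_3 = \operatorname{tr}(\varphi(x, y, z)y), \quad l_4 = \operatorname{tr}(\varphi(x, y, z)xy).$$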

In [13] it is proved that $Z$ is an irreducible hypersurface over any algebraically closed field and that $\pi$ is surjective for every $\mathbb{F}_q$. The dimension of the variety $F(\psi) \subset Z$ of fixed points of $\psi$ is at least 3. The additional condition $z = w(x, y)$ defines a 3-dimensional affine subset $W(w) \subset Z$.

The sequence is good if $((F(\psi) \cap W(w)) \setminus V)(\mathbb{F}_q) \ne \emptyset$ for all $q$ big enough. Since $\dim Z = 6$, $\dim F(\psi) = 3$, $\dim W(w) = 3$, one should expect that $F(\psi) \cap W(w)$ is zero-dimensional. However, it turns out that for the sequence $u_n$ defined by (3) it is an absolutely irreducible curve! Thus we can formulate a sufficient condition for a 3-valent sequence to be good.

Theorem 4.5.

Let $w$, $f$ define a sequence $\{v_n(x, y)\}$. Let $F(\psi)$ be the variety of fixed points of the trace map $\psi$ of the corresponding endomorphism $\tilde\varphi$ (see diagram (11)), and let $W(w)$ be defined by $w$. Let $V = \{a_2 = 2, a_1 = a_{12}, a_3 = a_{23}, a_{13} = a_{123}\}$. Assume that $F(\psi) \cap W(w)$ contains a positive dimensional, absolutely irreducible $\mathbb{Q}$-subvariety $\Theta \not\subset V$. Then there is $q_0$ such that for every $q > q_0$ there exists a $\tilde\varphi$-preperiodic point in $SL(2, q)^3 \setminus V_q$.

Once again, this theorem provides a finite process, which may be performed by computer, determining whether a sequence is good.

It is a conceptual challenge to understand whether or not the property of a sequence to characterize finite solvable groups is generic in some reasonable sense. We suspect that this question can be answered in the affirmative, in the sense that for almost every law f(x, y,z), satisfying necessary conditions, there exists a first word w(x, y) such that the resulting sequence is as required. Of course, one has to make more precise what is meant by "almost every law". See Conjecture 7.1 below.

4.2. Mapping tori of endomorphisms of free groups

Below we describe in brief another spectacular application of arithmetic dynamics to group theory, following [18].

Given a group $G$ with generators $x_1, \dots, x_d$ and a relation set $R$, and its injective endomorphism $\varphi$ taking $x_i$ to a word $w_i \in F_d$, $i = 1, \dots, d$, the mapping torus $T$ of $\varphi$ is defined as the group extension of $G$ obtained by adding a generator $t$ subject to the relations $t x_i t^{-1} = w_i$. One can ask whether $T$ is residually finite (this property means that the intersection of the subgroups of finite index of $T$ is trivial). We refer to [64] and [18] for the history and context of this problem.

The following theorem answers this question in the affirmative in the case where the set $R$ is empty, i.e., $G$ is a free group.

Theorem 4.6 ([18]).

The mapping torus of any injective endomorphism of a free group is residually finite.

Here are the main steps of the proof.

• By the definition of the mapping torus of an injective endomorphism, it is enough to prove that for any nontrivial word $w \in F_d$ and any positive integer $a$ there is a homomorphism $h: T \to H$ to a finite group $H$ such that

$$h(t^a w t^{-a}) = h(\varphi^{(a)}(w)) \ne \mathrm{id}. \tag{12}$$

• Any homomorphism $h_g$ of $F_d$ to SL(2) is defined by a point $g = (g_1, \dots, g_d) \in SL(2)^d$. Then

$$h_g(x_i) = g_i, \qquad h_g(w(x_1, \dots, x_d)) = w(g_1, \dots, g_d).$$

• Define $\Phi: SL(2)^d \to SL(2)^d$ by $\Phi(g_1, \dots, g_d) = (w_1(g_1, \dots, g_d), \dots, w_d(g_1, \dots, g_d))$. Then

$$h_g(\varphi^{(a)}(w)) = w(\Phi^{(a)}(g_1, \dots, g_d)). \tag{13}$$

• Take a field $\mathbb{F}_q$ big enough and consider the endomorphism $\Phi_q: SL(2, q)^d \to SL(2, q)^d$. Then (13) is still valid.

• Thus any point $g \in SL(2, q)^d$ such that

- it is periodic for $\Phi_q$,

- $\pi_w(g) = w(g) = w(g_1, \dots, g_d) \ne 1$,

will have property (12).

• Let $V \subset SL(2)^d$ be the Zariski closure of $\Phi^{(4d)}(SL(2)^d)$. Then

- $V$ is $\Phi$-invariant,

- $\Phi|_V: V \to V$ is dominant,

- $\pi_w(V) \not\subset \{\mathrm{id}, -\mathrm{id}\}$.

• Let $Z = V \setminus \pi_w^{-1}(\{\mathrm{id}, -\mathrm{id}\})$ and fix $\mathbb{F}_q$. By a theorem of Hrushovski [57, Corollary 1.2], there is an extension $\mathbb{F}_{q^i}$ of $\mathbb{F}_q$ and a $\Phi_{q^i}$-periodic point $g \in Z(\mathbb{F}_{q^i})$.

In order to avoid Hrushovski's theorem, the authors prove its particular case for endomorphisms of affine space. That is why they consider SL(2) as a subscheme of the scheme M2 of all 2 x 2 matrices. Note, however, that using Hrushovski's theorem, they obtain a stronger result where residual finiteness is established for the mapping torus of any endomorphism of any finitely generated linear group. Moreover, a further refinement of their method in [19], involving, in particular, search of periodic points of self-maps defined over p-adic fields, allows one to get more precise information on the structure of the mapping tori.

Remark 4.7.

Note that in the framework of the dynamical approach presented above it is permitted in Theorem 4.6 to look for periodic points defined over some extension of the original ground field. This is a subtle but important difference from the method described in subsection 4.1, where such an extension is forbidden, which prevents one from using Hrushovski's theorem.

5. Word maps: image and fibres

In this section we focus on the case of finite simple groups asking the following questions:

• How big is the image of a word map?

• What are the sizes of the fibres of a word map?

These questions are interrelated: one can prove that the image is large by estimating the sizes of fibres. We start, however, with recalling a seminal result by Borel [17] where a general answer to the first question was obtained for connected semisimple algebraic groups.

5.1. Borel's theorem

Theorem 5.1 ([17]).

Let $G$ be a connected semisimple algebraic group defined over a field $K$. Let $f_w: G^d \to G$ be the map associated to a nontrivial element $w$ of the free group on $d \ge 2$ letters. Then $f_w$ is dominant, i.e., its image is Zariski dense in $G$.

Here are the main steps of the proof.

(I) Since dominance is preserved by field extension, one may assume that $K$ is algebraically closed of arbitrary transcendence degree.

(II) One can show that the assertion of the theorem does not depend on the choice of $G$ within its isogeny class, so one may assume $G$ is simply connected. One can also easily reduce to the case where $G$ is simple.

(III) There is a $d$-tuple $(g_1, \dots, g_d) \in G^d$ such that $f_w(g_1, \dots, g_d) \ne 1$ because a simple group has no identical relations, in view of a theorem of Platonov [103] stating that a linear group with an identical relation must be solvable-by-finite (this can also be deduced from the Tits alternative [122]).

(IV) First consider the case $G = SL_n$.

• The following observation, going back to the unitary trick of Weyl and used in [26], is crucial: one can find a subfield $L \subset K$ and a division $L$-algebra $D$ so that the group $G = SL_n(K)$ contains its anisotropic form $H = SL_1(D)$, the group of elements of $D$ of reduced norm 1, as a dense subset.

• One proves that

$$\text{if } \mathrm{id} \ne h \in H, \text{ then 1 is not an eigenvalue of } h. \tag{14}$$

• Let $Z = \overline{\operatorname{im} f_w}$ be the closure of $\operatorname{im} f_w$. One should prove that $Z = SL(n, K)$. Since $H$ is dense in $G$,

$$\{\mathrm{id}\} \ne f_w(H^d) \subset Z \cap H. \tag{15}$$

• Use induction on $n$. If $n = 2$ and $Z \ne G$, then $\dim Z \le 2$ and $Z$ is a union of a finite number of conjugacy classes of (non-identity) semisimple elements and the set $U$ consisting of unipotents. Since $\mathrm{id} \in Z$ and $Z$ is irreducible, it is contained in $U$, which contradicts (14). Hence, for $n = 2$ the statement is valid.

• Assume that the statement is proved for $n \le m - 1$. Fix a maximal torus $T \subset SL(m, K)$. Let $T' \subset T$ be the union of subtori consisting of elements with at least one eigenvalue 1. $T'$ is a hypersurface in $T$. By the induction hypothesis, $Z \supset T'$. On the other hand, by (15), there is a $d$-tuple $(h_1, \dots, h_d) \in H^d$ such that $f_w(h_1, \dots, h_d) \in T \setminus T'$. Since $SL(m, K)^d$ is an irreducible variety, $Z$ should be irreducible as well. Therefore, if it contains a hypersurface $T'$ and at least one point outside $T'$, it contains $T$. Thus, $Z = SL(m, K)$ since the conjugates of $T$ are dense in $SL(m, K)$ and $\operatorname{im} f_w$ is invariant under conjugation.

(V) Any simple group of rank $r$ not isogenous to $SL_n$ contains a subgroup of the same rank $r$ which is isogenous to a direct product of groups $SL_{n_i}$, and the assertion of the theorem follows from the previous step.

Remark 5.2.

See [61] for an alternative proof based on Amitsur's theorem on generic division rings [3].

5.2. The image of the word map on finite simple groups

From now on $G$ is a finite simple group, $w(x_1, \dots, x_d) \in F_d$ is a nontrivial word, and we shorten our previous notation so that $w: G^d \to G$ denotes the corresponding word map. It turns out that one can provide an analogue of Borel's theorem. Although dominance does not make any sense in this context, it was shown by Larsen [68] that the image of the word map is large.

Theorem 5.3 ([68]).

For every nontrivial word $w$ and any $\varepsilon > 0$ there exists $N = N(w, \varepsilon)$ such that if $G$ is a finite simple group of order greater than $N$, then

$$\# w(G) \ge (\# G)^{1-\varepsilon}.$$

The original proof of this theorem heavily relied on the techniques of Larsen-Pink [69] for estimating the sizes of the fibres of $w$ in the case where $G$ is of Lie type. The case $G = A_n$ always requires separate consideration, and sporadic groups can be ignored whenever one restricts attention to asymptotic questions.

Later on, this theorem was reproved by Larsen and Shalev [71] in the case of groups of Lie type using more traditional methods of arithmetic geometry, such as Lefschetz's trace formula (not including, however, Suzuki and Ree groups, successfully treated in [69]). In the same paper [71], Larsen and Shalev considered some other variations on estimating the size of $w(G)$. Namely, if $G$ is a group of Lie type (different from $A_r$ or ${}^2A_r$) of fixed Lie rank $r$, they obtained an estimate of the form

$$\# w(G) \ge c\, r^{-1}\, \# G.$$

Here $c$ is a positive absolute constant and $G$ is of order greater than $N(w)$. For the alternating groups the estimate is slightly weaker.

As there is no hope to establish surjectivity of the map w for arbitrary words (power maps provide easy counter-examples), one can try to say something more about w(G). The following terminology, reminiscent of classical number theory, was introduced by Shalev.

5.2.1. Waring type properties [66, 71, 73, 74,114]

Waring's problem deals with expressing every natural number as a sum of $f(k)$ $k$th powers for some suitable function $f$. Noncommutative analogues of this problem were investigated during the past years, answering the following questions: Can one write any element of a finite simple group as a product of $f(k)$ $k$th powers? (See [93, 109].) Can this result be extended by replacing the word $x^k$ with an arbitrary nontrivial word $w$? Can these results be improved by replacing the function $f$ with a global (small) constant? See [73] and the references therein, and the most recent improvements in [55].

A prototypical result here is due to Liebeck and Shalev [83]: if $G$ is a finite simple group and $w$ is not trivial on $G$, then there is a constant $c = c(w)$ such that every element in $G$ is the product of $c$ values of $w$ on $G$. Using a wide range of methods and ingenious techniques, Shalev showed in [114] that for every nontrivial word $w$ there exists a constant $N(w)$ such that if $G$ is a finite simple group of order greater than $N(w)$ then $w^3(G) = G$, that is, every element of $G$ is a product of three values of $w$. Two alternative proofs of this result were recently found. The first, due to Nikolov and Pyber [98], is based on a recent result of Gowers [50], and the second, for finite simple groups of bounded Lie rank, due to Macpherson and Tent [88], relies on model theory. This result was substantially improved by Larsen, Shalev and Tiep in [70, 71, 73].

Theorem 5.4 ([73]).

For any nontrivial word $w$ there exists a constant $N(w)$ such that for all finite non-abelian simple groups $G$ of order greater than $N(w)$ we have $w^2(G) = G$.

The particular case of $w = x^k$ shows that this is the best possible Waring type result for powers.

Conjecture 5.5 (Shalev, [113, Conjectures 2.8 and 2.9]).

Let $w \ne 1$ be a word which is not a proper power of another word. Then there exists a number $N(w)$ such that, if $G$ is either $A_n$ or a finite simple group of Lie type of rank $n$, where $n > N(w)$, then $w(G) = G$.

Note that the assumption on the rank of G cannot be removed, in view of counter-examples presented in [60] (see the footnote to Conjecture 7.14).

A recent result of Kassabov and Nikolov [66] shows that the assumption in Theorem 5.4 that $G$ is sufficiently large cannot be removed, even if we only require that $G = w^k(G)$ for a fixed $k$. Indeed, it is shown in [66] that for any integer $k$ there exist a word $w$ and a finite simple group $G$, such that $w$ is not an identity in $G$, but $G \ne w^k(G)$. This is done by constructing for any $n > 13$ a specific word $w \in F_2$ such that $w(A_n)$ consists of the identity and all 3-cycles. The result follows since for $n > 2k + 1$ there are elements in $A_n$ which cannot be written as a product of less than $k + 1$ 3-cycles.

Further examples of word maps on $SL(2, 2^n)$ whose image is very small (consisting of the identity and a single conjugacy class) have been constructed in [75]. In a more recent preprint [85], Lubotzky proved (assuming the classification of finite simple groups) that any given subset of a finite simple group $G$ which contains the identity and is invariant under $\operatorname{Aut}(G)$ can arise as the image of some word map. In [76], this result was extended to some almost simple and quasisimple groups.

The main tools in [71, 73, 114] involve representation theory, algebraic geometry and probabilistic methods. For any two nontrivial words $w_1$, $w_2$ the rough idea is to construct special conjugacy classes $C_1, C_2 \subset G$ satisfying

$$C_1 \subseteq w_1(G), \qquad C_2 \subseteq w_2(G), \tag{16}$$

$$C_1C_2 \supseteq G \setminus \{1\}. \tag{17}$$

The proof of (17) relies on the following classical result of Frobenius. Let $C_1 = s_1^G$, $C_2 = s_2^G$. The number of ways to write a group element $g \in G$ as $g = x_1x_2$, where $x_i \in C_i$, is given by

$$\frac{\# C_1\, \# C_2}{\# G} \sum_{\chi \in \operatorname{Irr}(G)} \frac{\chi(s_1)\chi(s_2)\chi(g^{-1})}{\chi(1)}, \tag{18}$$

where Irr(G) denotes the set of irreducible complex characters of G.
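Formula (18) can be tested on a group small enough to enumerate. The Python sketch below does this for the symmetric group $S_3$, whose three irreducible characters (trivial, sign, and number of fixed points minus one) are written out by hand. $S_3$ is not simple and is chosen purely as an illustrative assumption because its character table is tiny; all function names are hypothetical.

```python
from itertools import permutations
from fractions import Fraction

G = list(permutations(range(3)))  # the symmetric group S_3
E = tuple(range(3))               # identity permutation


def compose(a, b):
    """Composition (a o b)(i) = a[b[i]]."""
    return tuple(a[b[i]] for i in range(len(a)))


def inverse(a):
    """Inverse permutation."""
    r = [0] * len(a)
    for i, ai in enumerate(a):
        r[ai] = i
    return tuple(r)


def sign(a):
    """Sign character, computed from the number of inversions."""
    inv_count = sum(1 for i in range(len(a)) for j in range(i + 1, len(a))
                    if a[i] > a[j])
    return -1 if inv_count % 2 else 1


def standard(a):
    """Two-dimensional character of S_3: number of fixed points minus one."""
    return sum(1 for i, ai in enumerate(a) if i == ai) - 1


IRR = [lambda a: 1, sign, standard]  # all irreducible characters of S_3


def conj_class(s):
    """Conjugacy class of s in G."""
    return {compose(compose(h, s), inverse(h)) for h in G}


def frobenius_count(s1, s2, g):
    """Right-hand side of (18) for the classes of s1 and s2."""
    total = sum(Fraction(chi(s1) * chi(s2) * chi(inverse(g)), chi(E))
                for chi in IRR)
    return Fraction(len(conj_class(s1)) * len(conj_class(s2)), len(G)) * total


def brute_count(s1, s2, g):
    """Number of pairs (x1, x2) in C1 x C2 with x1 x2 = g, by enumeration."""
    return sum(1 for x1 in conj_class(s1) for x2 in conj_class(s2)
               if compose(x1, x2) == g)
```

For example, with both classes equal to the transpositions and $g$ a 3-cycle, both functions return 3.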

The case of the alternating groups was established by Larsen and Shalev in [71]. First, it was proved that for any $w \ne 1$ there exists $N(w)$ such that if $n > N(w)$ then the image $w(A_n)$ contains the conjugacy classes of permutations $s \in S_n$ with few cycles (at most 23), thus implying (16). This is highly non-elementary, involving algebraic geometry and results from analytic number theory (such as weak versions of the Goldbach Conjecture). The idea is to embed groups of the form $SL(2, p)$ and their products into $A_n$, based on the fact that $SL(2, p)$ embeds into $A_{p+1}$, and an element of order $(p - 1)/2$ in $SL(2, p)$ has two nontrivial cycles and two fixed points in this embedding, and then use the following property of word maps on $SL(2, p)$.

Theorem 5.6 ([71, Theorem 4.1]).

For every nontrivial word $w$ there exist constants $M_w$ and $m_w$ with the following property: for every prime $p > M_w$, such that $p - 1$ is divisible neither by 4 nor by any prime $3 < l < m_w$, $w(SL(2, p))$ contains an element of order $(p - 1)/2$.

Inclusion (17) can now be obtained by combining (18) with the fact that all character values $\chi(s)$ of a permutation $s \in S_n$ can be bounded in terms of the number of cycles alone.

Theorem 5.7 ([71, Theorem 7.2]).

Let $s \in S_n$ be a permutation with $k$ cycles (including 1-cycles). Then

$$|\chi(s)| \le 2^{k-1} k!$$

for all irreducible characters $\chi$ of $S_n$.

Larsen and Shalev [71] also treated the case of finite simple groups of Lie type of bounded rank. Later on, Larsen, Shalev and Tiep [73] completed the proof for finite simple groups of Lie type of arbitrary rank.

For finite simple groups of Lie type, $C_1$, $C_2$ are the classes of suitable regular semisimple elements $s_1, s_2 \in G$ lying in maximal tori $T_1, T_2 \subset G$. The tori $T_i$ are chosen so that if $\chi$ is an irreducible character of $G$ such that $\chi(s_1)\chi(s_2) \ne 0$ then $\chi$ is unipotent; moreover, there is a small (in particular, bounded) number of such unipotent characters. These results are obtained using the machinery of Deligne-Lusztig, see, e.g., [27]. This implies that the number of nonzero summands in (18) is small, and moreover, the character ratios can be bounded.

Theorem 5.8 ([73, Theorem 1.2.1]).

If $G$ is a finite quasisimple classical group over $\mathbb{F}_q$ and $g \in G$ is an element of support at least $N$, then

$$\frac{|\chi(g)|}{\chi(1)} < q^{-\sqrt{N}/481}$$

for all $1_G \ne \chi \in \operatorname{Irr}(G)$.

(Here the support of g is defined as the codimension of its largest eigenspace, see [73, Definition 4.1.1]. Recall that a quasisimple group G is a perfect group such that G/Z(G) is simple.)

The proof of (16) is based on geometric tools, and in particular, on the Lang-Weil estimate, that allows one to establish a Chebotarev Density Theorem for word maps.

Theorem 5.9 ([73, Corollary 5.3.3]).

For every fixed nontrivial word $w$ and fixed integer $N$, there exists $\delta > 0$ such that for every semisimple algebraic group $G$ of dimension less than $N$ over a finite field $\mathbb{F}_q$ and every maximal torus $T$ of $G$ defined over $\mathbb{F}_q$, we have

$$\# (T(\mathbb{F}_q) \cap w(G(\mathbb{F}_q))) \ge \delta\, \# T(\mathbb{F}_q).$$

Hence, if $q$ is sufficiently large, there exist regular semisimple elements $s_i \in w_i(G(\mathbb{F}_q))$ lying in any prescribed maximal torus $T(\mathbb{F}_q)$. This itself is not enough, since the group $G$ is of unbounded Lie rank. This obstacle is treated by embedding groups $H$ of very small rank (such as $SL_2$) over large extension fields into $G$ so that $s_i \in w_i(H)$ remains regular semisimple in $G$, and lies in the required maximal torus $T_i$ of $G$. Clearly $s_i \in w_i(G)$ so that $w_i(G)$ contains the conjugacy class $C_i = s_i^G$, as required.

Similar Waring type results were obtained by Larsen, Shalev and Tiep in [74] for quasisimple groups.

Theorem 5.10 ([74]).

For a fixed nontrivial word $w$ there exists a constant $N(w)$ such that if $G$ is a finite quasisimple group of order greater than $N(w)$, then $w^3(G) = G$.

For various families of finite quasisimple groups, including covers of alternating groups, a stronger result was proved in [74], namely that $w^2(G) = G$. This was recently finalized by Guralnick and Tiep [55], who proved that for any nontrivial word $w$ there exists $N = N(w)$ such that $w^2(G) \supseteq G \setminus Z(G)$ for all quasisimple groups $G$ of order greater than $N$. Note, however, that in contrast with the case of simple groups studied in [73], the equality $w^2(G) = G$ may not hold for all large finite quasisimple groups $G$. The nontrivial central elements of finite quasisimple groups $G$ provide the main obstructions (see subsections 5.2.3 and 6.1 for examples).

Further variations on the Waring theme, discussed in [71, 113], include considering products $w_1(G) \cdots w_k(G)$ and intersections $w_1(G) \cap \dots \cap w_k(G)$ where the $w_i$ are distinct words. The latter case can be fit into the same framework by looking at the fibre product of the word maps $w_i: G^d \to G$ over $G$ (a different approach was suggested in [98]). The reader is referred to the original papers for details.

Some counterparts of Waring type properties discussed above can be formulated for maps of matrix algebras induced by associative noncommutative polynomials, see [61] for a survey.

Remark 5.11.

In [72], Larsen and Shalev obtained a general estimate for the size $\# N_w(g)$ of the fibres of word maps: for all $w \in F_d$, $w \ne 1$, there exists $\varepsilon > 0$ such that for all finite simple groups $G$ and all $g \in G$ we have $\# N_w(g) = O((\# G)^{d-\varepsilon})$, where the implicit constant depends only on $w$. Naturally, these universal estimates are rough in comparison with the equidistribution results because they hold for all nontrivial words, including power words, which are far from being equidistributed, and also for all elements in the group, including 1, for which the fibre may be very large (as in the case of commutators).

Remark 5.12.

In a different spirit, estimates for the size of the fibres of word maps were used in [1, 99], where they yielded new criteria for distinguishing finite nilpotent and solvable groups. (We thank the first referee for pointing out the references mentioned above.)

5.2.2. Commutators [34,53, 79,80,102,120]

In this and the next subsection we consider the image of word maps for some special words $w$. First note that for any primitive word $w$ (this means that $w$ is a part of a free generating set of $F_d$), as well as for any word of the form $w = x_1^{e_1} \cdots x_d^{e_d} f$, where the $e_i$ are coprime and $f$ belongs to the derived group $F_d'$, the induced map $w: G^d \to G$ is surjective for an arbitrary group $G$ (see, e.g., [111, 3.1.1]). The commutator word is the first nontrivial instance of the surjectivity problem.

Theorem 5.13 (Ore's Conjecture).

If G is a finite non-abelian simple group, then every element of G is a commutator.

In other words, for the commutator word $w = [x, y] \in F_2$, one has $w(G) = G$ for any finite non-abelian simple group $G$. This statement was originally posed in 1951 and proved by Ore himself for the alternating groups [102]. Over the years, this conjecture was proved for various families of finite simple groups (see [79] and the references therein). R. Thompson [120] established it for the linear groups PSL(n, q), later Ellers and Gordeev [34] proved the conjecture for all finite simple groups of Lie type defined over a field with more than 8 elements, and recently an impressive full stop was put by Liebeck, O'Brien, Shalev and Tiep [79] who completed the proof for all finite simple groups.

The original proofs of Ore [102] and R.Thompson [120] were obtained by explicitly finding pairs of permutations (respectively, matrices) whose commutator corresponded to some representative in any given conjugacy class.

In order to complete the proof of Ore's Conjecture, Liebeck, O'Brien, Shalev and Tiep used in [79] the following classical criterion dating back to Frobenius, that the number of ways to write an element $g$ in a finite group $G$ as a commutator is

$$\# G \sum_{\chi \in \operatorname{Irr}(G)} \frac{\chi(g)}{\chi(1)}. \tag{19}$$

Roughly speaking, it was shown that if $g$ is an element with a small centralizer, then $\chi(g)/\chi(1)$ is small for $\chi \ne 1$, and the main contribution to the character sum (19) comes from the trivial character $\chi = 1$. Hence, this sum is positive, so elements with small centralizers are commutators. This is based on the Deligne-Lusztig theory, and also on the theory of dual pairs and Weil characters of classical groups [54, 121]. For elements whose centralizers are not small, the strategy is to reduce to groups of Lie type of lower dimension and use induction. Namely, if a certain element has a Jordan decomposition into several Jordan blocks, and if it is possible to express each block as a commutator in the smaller classical group, then clearly the original element is itself a commutator.
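The count (19) is easy to confirm by enumeration in a tiny group. The Python sketch below compares it with a brute-force count of pairs $(x, y)$ with $[x,y] = g$ in the symmetric group $S_3$, using its three irreducible characters written out by hand. The group $S_3$ and all function names are assumptions made for illustration; the groups treated in [79] are far beyond such enumeration.

```python
from itertools import permutations
from fractions import Fraction

G = list(permutations(range(3)))  # the symmetric group S_3
E = tuple(range(3))               # identity permutation


def compose(a, b):
    """Composition (a o b)(i) = a[b[i]]."""
    return tuple(a[b[i]] for i in range(len(a)))


def inverse(a):
    """Inverse permutation."""
    r = [0] * len(a)
    for i, ai in enumerate(a):
        r[ai] = i
    return tuple(r)


def sign(a):
    """Sign character, computed from the number of inversions."""
    inv_count = sum(1 for i in range(len(a)) for j in range(i + 1, len(a))
                    if a[i] > a[j])
    return -1 if inv_count % 2 else 1


def standard(a):
    """Two-dimensional character of S_3: number of fixed points minus one."""
    return sum(1 for i, ai in enumerate(a) if i == ai) - 1


IRR = [lambda a: 1, sign, standard]  # all irreducible characters of S_3


def commutator(x, y):
    """[x, y] = x y x^{-1} y^{-1}."""
    return compose(compose(x, y), compose(inverse(x), inverse(y)))


def commutator_count_formula(g):
    """The value of (19): #G times the sum of chi(g)/chi(1) over Irr(G)."""
    return len(G) * sum(Fraction(chi(g), chi(E)) for chi in IRR)


def commutator_count_brute(g):
    """Number of pairs (x, y) in G x G with [x, y] = g."""
    return sum(1 for x in G for y in G if commutator(x, y) == g)
```

In $S_3$ the two counts agree: 18 for the identity, 9 for each 3-cycle and 0 for each transposition, the latter reflecting that commutators are even permutations.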

Computer calculations (using MAGMA) played a significant role in the proof of [79]. Since the proof uses induction, it was necessary to establish various base cases. The conjecture was proved directly for many of these base cases by constructing the character table of the relevant group. For various other groups certain elements with prescribed Jordan forms as commutators were explicitly constructed. Similar methods were used in the subsequent paper [80], in which it was shown that with a few (small) exceptions, every element of a finite quasisimple group is a commutator, and moreover, any such element is a product of two commutators.

Ellers and Gordeev [34] have proved, for the finite simple groups of Lie type over fields with more than eight elements, a stronger conjecture, known as Thompson's Conjecture.

Conjecture 5.14 (Thompson's Conjecture).

Every finite simple group $G$ has a conjugacy class $C$ such that $C^2 = G$.

Observe that Thompson's Conjecture immediately implies Ore's Conjecture. Indeed, if $C^2 = G$ then $1 \in C^2$, so $C^{-1} = C$ and $G = CC^{-1}$. Hence, for any $g \in G$ there exist $x \in C$ and $h \in G$ such that $g = (hxh^{-1})x^{-1} = [h,x]$, as required.
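The implication can be traced concretely in the alternating group $A_5$. The Python sketch below lists the conjugacy classes $C$ of $A_5$ with $C^2 = A_5$ and then, following the argument above, writes each element as $[h, x] = hxh^{-1}x^{-1}$ with $x \in C$. The choice of $A_5$ and the function names are illustrative assumptions.

```python
from itertools import permutations


def compose(a, b):
    """Composition (a o b)(i) = a[b[i]]."""
    return tuple(a[b[i]] for i in range(len(a)))


def inverse(a):
    """Inverse permutation."""
    r = [0] * len(a)
    for i, ai in enumerate(a):
        r[ai] = i
    return tuple(r)


def sign(a):
    """Sign of a permutation, computed from the number of inversions."""
    inv_count = sum(1 for i in range(len(a)) for j in range(i + 1, len(a))
                    if a[i] > a[j])
    return -1 if inv_count % 2 else 1


A5 = [p for p in permutations(range(5)) if sign(p) == 1]


def conj_class(s):
    """Conjugacy class of s under conjugation by A5."""
    return frozenset(compose(compose(h, s), inverse(h)) for h in A5)


def thompson_classes():
    """All conjugacy classes C of A5 with C * C equal to the whole group."""
    whole = set(A5)
    classes = {conj_class(s) for s in A5}
    return [c for c in classes
            if {compose(a, b) for a in c for b in c} == whole]


def commutator_from_class(g, c):
    """Find (h, x) with x in c and g = h x h^{-1} x^{-1}, or None."""
    for x in c:
        for h in A5:
            if compose(compose(compose(h, x), inverse(h)), inverse(x)) == g:
                return h, x
    return None
```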

Thompson's conjecture was verified for many families of finite simple groups, including the alternating groups and the sporadic groups, see the introduction of [34], but nevertheless it is still very much open today.

The proof of Ellers and Gordeev is based on the following generalization of the Gauss decomposition of matrices.

Theorem 5.15 ([31-33]).

Let $G$ be a Chevalley group, and let $\Gamma$ be a group generated by $G$ and a cyclic group $\langle \sigma \rangle$ which normalizes $G$ in $\Gamma$ and acts as a diagonal automorphism on $G$ (perhaps trivially).

Let $\gamma = \sigma g \in \Gamma \setminus Z(\Gamma)$. If $h$ is any fixed element in the group $H$, then there is $\tau \in G$ such that $\tau\gamma\tau^{-1} = \sigma u_1 h u_2$, where $u_1 \in U^-$ and $u_2 \in U$.

Here $H$, $U$ and $U^-$ are the subgroups of $G$ defined by

$$H = \langle h_\alpha : \alpha \in \Pi \rangle, \qquad U = \langle X_\alpha : \alpha \in \Phi^+ \rangle, \qquad U^- = \langle X_\alpha : \alpha \in \Phi^- \rangle,$$

where $\Phi$ is the root system corresponding to $G$ and $\Pi$ denotes the simple roots of $\Phi$. Recall that the Chevalley group $G$ is generated by the root subgroups $X_\alpha$, $\alpha \in \Phi$.

As a consequence of Theorem 5.15, one easily gets a statement in the spirit of inclusion (17).

Corollary 5.16.

If $h_1, h_2 \in H$ are regular semisimple elements in $G$ from a maximal split torus and $C_1, C_2$ are the respective conjugacy classes, then $C_1C_2 \supseteq G \setminus Z(G)$.

Indeed, by [31, Proposition 1], for fixed $h_1, h_2$ any $u_1 \in U^-$ and $u_2 \in U$ can be represented as

$$u_1 = v_1h_1v_1^{-1}h_1^{-1} \quad \text{and} \quad u_2 = h_2^{-1}v_2h_2v_2^{-1},$$

for some $v_1 \in U^-$ and $v_2 \in U$. Thus by Theorem 5.15, for any noncentral conjugacy class $C \subset G$ one can find a representative $c \in C$ such that

$$c = u_1h_1h_2u_2 = (v_1h_1v_1^{-1}h_1^{-1})h_1h_2(h_2^{-1}v_2h_2v_2^{-1}) = (v_1h_1v_1^{-1})(v_2h_2v_2^{-1}).$$

Corollary 5.16 immediately implies Ore's conjecture for any simple group $G$ containing a regular semisimple element $h$ in a maximal split torus, and Thompson's Conjecture if this element is in addition real, i.e., if $h$ and $h^{-1}$ are conjugate. In [34] a careful analysis is done to show that such desired elements actually exist in groups of Lie type over fields with more than eight elements.

In addition, Guralnick and Malle [53] have extended the aforementioned result (17) from [73] and proved the following variant of Thompson's Conjecture.

Theorem 5.17 ([53, Theorem 1.4]).

If $G$ is a finite non-abelian simple group, then there exist conjugacy classes $C_1$, $C_2$ in $G$ with $G \setminus \{1\} \subseteq C_1C_2$. Moreover, aside from $G$ = PSL(2, 7) or PSL(2, 17), one can assume that each $C_i$ consists of elements of order prime to 6.

Similarly to [73], the proof in [53] also relies on estimating the character sum (18) using the Deligne-Lusztig theory, or for some small rank groups it is computed directly from known character tables, to show that triples $(x_1, x_2, g)$ of elements from specified conjugacy classes $C_i$ exist in a given group $G$. The conjugacy classes $C_i$ are chosen so that only few irreducible characters vanish simultaneously on these classes. These triples moreover generate $G$, since the conjugacy classes $C_i$ were chosen so that their elements are contained in few maximal subgroups of $G$.

After considering the commutator word, it is natural to go over to the Engel words, defined recursively by

$$e_1(x,y) = [x,y] = xyx^{-1}y^{-1}, \qquad e_n(x,y) = [e_{n-1}, y],$$

and the corresponding Engel word maps $e_n: G \times G \to G$. The following conjecture is naturally raised.

Conjecture 5.18 (Shalev).

Let $n \in \mathbb{N}$. Then the $n$th Engel word map is surjective for any finite simple non-abelian group $G$.

See Section 6 for the cases G = SL(2, q) and PSL(2, q).
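For a very small instance of the recursive definition of the Engel words, the following Python sketch enumerates the image of $e_n$ on the alternating group $A_5 \cong$ PSL(2, 4) by brute force. It is only an illustration under our own conventions (permutations stored as tuples, helper names such as `engel_image` are ours and do not come from the cited papers), not a method used in the literature.

```python
from itertools import permutations

def compose(p, q):
    """Composition p∘q of permutations given as tuples: apply q, then p."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_even(p):
    """A permutation is even iff it has an even number of inversions."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

def commutator(x, y):
    """[x, y] = x y x^{-1} y^{-1}."""
    return compose(compose(x, y), compose(inverse(x), inverse(y)))

def engel(n, x, y):
    """e_1 = [x, y] and e_n = [e_{n-1}, y] for n >= 2."""
    word = commutator(x, y)
    for _ in range(n - 1):
        word = commutator(word, y)
    return word

def engel_image(n, group):
    """Image of the n-th Engel word map on a list of group elements."""
    return {engel(n, x, y) for x in group for y in group}

A5 = [p for p in permutations(range(5)) if is_even(p)]

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, len(engel_image(n, A5)), "of", len(A5))
```

Such an enumeration is feasible only for tiny groups; for families such as PSL(2, q) one needs the structural arguments discussed in Section 6.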

5.2.3. Two-power words [53, 73, 74,81]

It follows from Theorem 5.4, due to Larsen, Shalev and Tiep [73], that any two-power word is surjective on sufficiently large finite simple groups (see [73, Theorem 1.1.1 and Corollary 1.1.3]). More precisely,

Theorem 5.19 ([73]).

Let $a$, $b$ be two nonzero integers. Then there exists a number $N = N(a, b)$ such that if $G$ is a finite non-abelian simple group of order at least $N$, then any element in $G$ can be written as $x^a y^b$ for some $x, y \in G$.

Furthermore, by recent results of Liebeck, O'Brien, Shalev and Tiep [81] and of Guralnick and Malle [53] (see Theorem 5.17), some words of the form $x^b y^b$ are known to be surjective on all finite simple groups.

Theorem 5.20 ([53, Corollary 1.5]).

Let $G$ be a finite non-abelian simple group and let $b$ be either a prime power or a power of 6. Then any element in $G$ can be written as $x^b y^b$ for some $x, y \in G$.

Note that in general, the word $x^b y^b$ is not necessarily surjective on all finite simple groups. Indeed, if $b$ is a multiple of the exponent of $G$ then necessarily $x^b y^b = 1$ for every $x, y \in G$.
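The two statements above can be observed on the smallest non-abelian simple group. The sketch below (our own illustration; the function names are hypothetical and the choice of $A_5$ is ours) computes the exponent of $A_5$ and the image of the word $x^b y^b$ for a few values of $b$.

```python
from functools import reduce
from itertools import permutations
from math import gcd

def compose(p, q):
    """Composition p∘q of permutations given as tuples: apply q, then p."""
    return tuple(p[i] for i in q)

def is_even(p):
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

def power(p, k):
    """k-th power of p for k >= 0, by repeated composition."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(result, p)
    return result

def order(p):
    identity = tuple(range(len(p)))
    k, q = 1, p
    while q != identity:
        q = compose(q, p)
        k += 1
    return k

def two_power_image(b, group):
    """Image of the word map (x, y) -> x^b y^b on the given group."""
    b_powers = {power(g, b) for g in group}
    return {compose(x, y) for x in b_powers for y in b_powers}

A5 = [p for p in permutations(range(5)) if is_even(p)]
IDENTITY = tuple(range(5))

# Exponent of the group: least common multiple of all element orders.
EXPONENT = reduce(lambda a, b: a * b // gcd(a, b), (order(g) for g in A5), 1)

if __name__ == "__main__":
    for b in (2, 3, EXPONENT):
        print(b, len(two_power_image(b, A5)))
```

For $b$ equal to the exponent the image collapses to the identity, while for the prime values $b = 2, 3$ the brute-force count agrees with Theorem 5.20.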

In addition, by another recent work of Larsen, Shalev and Tiep [74], if $G$ is a finite quasisimple group, then the word $w = x^2 y^2$ is surjective on $G$. On the other hand, if $b > 2$ then the word $w = x^b y^b$ is not surjective on infinitely many finite quasisimple groups.

5.3. The fibres of the word map

Recall that estimating the sizes of the fibres of the word map appeared as an integral part of estimating its image. In this section we address a subtler problem, trying to distinguish words for which the fibres of the corresponding word map are of the same size, at least approximately. First note that for certain words $w$ all fibres of the word map $w \colon G^d \to G$ are exactly of the same size for any finite group $G$. According to recent results of [104, 105], this holds only for primitive words. Primitive words are asymptotically very rare (exponentially negligible, in the terminology of [65]): if we count them among all words of fixed length, their proportion tends to 0 exponentially fast (see, e.g., [95]). Another viewpoint on the set of primitive elements of $F_d$ is that this set is closed in the profinite topology of $F_d$ [105].

We are interested in weaker equidistribution properties which hold for more general words.

5.3.1. Equidistribution and measure-preservation [41,80]

For a word $w = w(x_1, \dots, x_d) \in F_d$, a finite group $G$ and some $g \in G$, we denote

$$N_w(g) = \{(g_1, \dots, g_d) \in G^d : w(g_1, \dots, g_d) = g\}.$$

Definition 5.21 ([41]).

A word map $w \colon G^d \to G$ is almost equidistributed for a family of finite groups $\mathcal{G}$ if any group $G \in \mathcal{G}$ contains a subset $S = S_G \subseteq G$ with the following properties:

(i) $\# S = \# G\,(1 - \varepsilon(G))$,

(ii) $\# N_w(g) = (\# G)^{d-1}(1 + \varepsilon(G))$ uniformly for all $g \in S$, where $\varepsilon(G) \to 0$ whenever $\# G \to \infty$.

Theorem 5.22 ([41, Theorem 1.5]).

The commutator word w = [x, y] e F2 is almost equidistributed for the family of finite simple groups.

Note that we cannot require in this theorem that $S = G$. Indeed, it is well known (and follows from (19) above) that for $w = [x, y]$ we have $\# N_w(1) = k(G)\, \# G$, where $k(G)$ is the number of conjugacy classes in $G$. Since $k(G) \to \infty$ as $\# G \to \infty$, we see that the fibre above $g = 1$ is large and does not satisfy condition (ii).

Two proofs are given in [41] for this theorem. The first is probabilistic whereas in the second the subsets S are explicitly constructed.

Let $P = P_G$ be the commutator distribution on $G$, namely $P(g) = \# N_w(g)/(\# G)^2$, and let $U = U_G$ be the uniform distribution on $G$ (so $U(g) = 1/\# G$ for all $g \in G$). The probabilistic proof bounds the $L_1$-distance

$$\|P - U\|_1 = \sum_{g \in G} |P(g) - U(g)|$$

between the probability measures above. Using the Frobenius formula (19) and the Cauchy-Schwarz inequality, it is deduced in [41, Proposition 1.1] that

$$\|P_G - U_G\|_1 \le (\zeta_G(2) - 1)^{1/2}, \quad \text{where} \quad \zeta_G(s) = \sum_{\chi \in \mathrm{Irr}(G)} \chi(1)^{-s}$$

is the so-called Witten zeta function.

Now Theorem 5.22 follows from results of Liebeck and Shalev [84], who showed that if G is simple, then

$$\zeta_G(s) \to 1 \quad \text{as } \# G \to \infty, \text{ provided } s > 1.$$

This proof also provides an estimate of the function $\varepsilon$. Namely, $\varepsilon(A_n) = O(n^{-1/2})$, and if $G$ is of Lie type of rank $r$ over a field with $q$ elements then $\varepsilon(G) = O(q^{-r/4})$.
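To make the quantities in the probabilistic proof concrete, here is a brute-force sketch for $G = A_5$ (our own illustration, not part of [41]). It counts the commutator fibres $\# N_w(g)$, compares the fibre over the identity with $k(G)\,\# G$, and evaluates the $L_1$-distance between the commutator distribution and the uniform distribution. The character degrees 1, 3, 3, 4, 5 of $A_5$ are entered by hand as a standard fact rather than computed.

```python
from collections import Counter
from itertools import permutations
from math import sqrt

def compose(p, q):
    """Composition p∘q of permutations given as tuples: apply q, then p."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_even(p):
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

def commutator(x, y):
    return compose(compose(x, y), compose(inverse(x), inverse(y)))

def class_number(group):
    """Number of conjugacy classes, by brute-force orbit enumeration."""
    seen, count = set(), 0
    for g in group:
        if g not in seen:
            count += 1
            seen.update(compose(compose(h, g), inverse(h)) for h in group)
    return count

A5 = [p for p in permutations(range(5)) if is_even(p)]
ORDER = len(A5)
IDENTITY = tuple(range(5))

# fibre[g] = #N_w(g) for the commutator word w = [x, y]
fibre = Counter(commutator(x, y) for x in A5 for y in A5)

# L_1-distance between P(g) = #N_w(g)/(#G)^2 and the uniform distribution
l1_distance = sum(abs(fibre[g] / ORDER**2 - 1 / ORDER) for g in A5)

# Witten zeta value at s = 2 from the (hand-entered) character degrees of A_5
CHARACTER_DEGREES = [1, 3, 3, 4, 5]
zeta_2 = sum(d ** -2 for d in CHARACTER_DEGREES)
upper_bound = sqrt(zeta_2 - 1)

if __name__ == "__main__":
    print(fibre[IDENTITY], class_number(A5) * ORDER)
    print(l1_distance, upper_bound)
```

For a single small group the bound is of course far from the asymptotic regime; the point is only to show how the fibre over the identity dominates the deviation from uniformity.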

The second, constructive proof describes the subsets $S$ explicitly. If $G$ = PSL(2, q) then Theorem 5.22 follows from (19) directly using the well-known character table of $G$. If $G \ne$ PSL(2, q) varies among groups of Lie type of bounded Lie rank then $S = S_G$ is chosen as the set of regular semisimple elements of $G$. If $G$ varies among groups of Lie type with unbounded Lie rank, then $S = S_G$ contains elements whose centralizer is not very large. If $G = A_n$ then $S = S_G$ is chosen as the set of permutations in $A_n$ with at most $\sqrt{n}$ fixed points. This yields better lower bounds on the cardinality of $S$. For example, in the constructive proof for $A_n$ we obtain $\# S \ge (1 - 2/[\sqrt{n}]!)\, \# A_n$, which is much better than the lower bound $(1 - O(n^{-1/2}))\, \# A_n$ given by the probabilistic proof.

Similarly, it was shown in [80, Proposition 3] that the commutator word is also almost equidistributed on the family of finite quasisimple groups.

In [41, Section 7] it is proved that the property of being almost equidistributed behaves well under direct products and compositions, implying that the words $w = [x_1, \dots, x_d] \in F_d$, $d$-fold commutators in any arrangement of brackets, are almost equidistributed within the family of finite simple groups. Similar methods are used in [41, Theorem 7.1] to show that the word $w = x^2 y^2$ is almost equidistributed on finite simple groups, and Larsen and Shalev have recently obtained similar results in more general contexts.

By [41, Section 3], any "almost equidistributed" word map $w \colon G^d \to G$ (see Definition 5.21) is also "almost measure preserving" in the following sense.

Definition 5.23 ([41]).

A word map $w \colon G^d \to G$ is almost measure preserving for a family of finite groups $\mathcal{G}$ if every group $G \in \mathcal{G}$ satisfies the following conditions:

(i) for every subset $Y \subseteq G$ we have $\# w^{-1}(Y)/\# G^d = \# Y/\# G + o(1)$;

(ii) for every subset $X \subseteq G^d$ we have $\# w(X)/\# G \ge \# X/\# G^d - o(1)$;

(iii) in particular, if $X \subseteq G^d$ and $\# X/\# G^d = 1 - o(1)$, then almost every element $g \in G$ can be written as $g = w(g_1, \dots, g_d)$ where $(g_1, \dots, g_d) \in X$;

here $o(1)$ denotes a function depending only on $G$ which tends to zero as $\# G \to \infty$.

This allows one to deduce in [41, Corollary 1.6] that the commutator map is almost measure preserving on the family of finite simple groups. Since almost all pairs of elements of a finite simple group are generating pairs (see [28, 63, 82]), the probability that some $g \in G$ can be represented as a commutator $g = [x, y]$, where $x, y$ generate $G$, tends to 1 as $\# G \to \infty$, by [41, Theorem 1.7].

It seems to be not so easy to extend equidistribution results from commutators to more general words. See, however, the next section where this is done in the particular cases G = SL(2, q) and PSL(2, q).

6. Word maps on SL(2, q) and PSL(2, q)

6.1. Surjectivity for Engel words and some positive words

In [9] the particular case of Engel words in the groups PSL(2, q) and SL(2, q) was analyzed using the trace map method, in an attempt to prove Conjecture 5.18 for PSL(2, q). The main idea was to check the surjectivity of the trace map on $\mathbb{A}^3(\mathbb{F}_q)$ instead of the surjectivity of the word map on SL(2, q).

For any $x, y \in G$ = SL(2, q) denote $s = \operatorname{tr} x$, $t = \operatorname{tr} y$ and $u = \operatorname{tr}(xy)$, and define a morphism $\pi \colon G \times G \to \mathbb{A}^3_{s,u,t}$ by $\pi(x, y) = (s, u, t)$. Then the following diagram commutes:

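$$\begin{array}{ccc}
G \times G & \xrightarrow{\ \varphi^n\ } & G \times G \\
\downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \pi} \\
\mathbb{A}^3_{s,u,t} & \xrightarrow{\ \psi^n\ } & \mathbb{A}^3_{s,u,t}.
\end{array}$$

In this diagram, the maps $\varphi^n$ and $\psi^n$ are defined recursively as follows.

• $\varphi(x, y) = ([x, y], y) \ \Rightarrow\ \psi(s, u, t) = (s_1, t, t)$, $s_1 = s^2 + u^2 + t^2 - sut - 2$;

• $\varphi^2(x, y) = ([[x, y], y], y) \ \Rightarrow\ \psi^2(s, u, t) = \psi(s_1, t, t) = (s_2, t, t)$, $s_2 = r(s_1, t) = s_1^2 + 2t^2 - s_1 t^2 - 2$; ...

• $\varphi^n(x, y) = (e_n(x, y), y) \ \Rightarrow\ \psi^n(s, u, t) = \psi^{n-1}(s_1, t, t) = \dots = \psi(s_{n-1}, t, t) = (s_n, t, t)$, $s_n = r(s_{n-1}, t) = r(r(s_{n-2}, t), t) = \dots = r^{(n-1)}(s_1, t)$,

where $r(s, t) = s^2 + 2t^2 - st^2 - 2$.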
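The recursion $s_1 = s^2 + u^2 + t^2 - sut - 2$, $s_{n+1} = r(s_n, t)$ for the traces of Engel words can be checked numerically. The sketch below does this exhaustively over SL(2, 5); the prime 5, the 4-tuple encoding of matrices and all helper names are our own choices for illustration and are not taken from [9].

```python
from itertools import product

P = 5  # a small prime, chosen only for illustration

def mul(a, b):
    """Product of 2x2 matrices (a0 a1; a2 a3) over F_P, stored as 4-tuples."""
    return ((a[0]*b[0] + a[1]*b[2]) % P, (a[0]*b[1] + a[1]*b[3]) % P,
            (a[2]*b[0] + a[3]*b[2]) % P, (a[2]*b[1] + a[3]*b[3]) % P)

def inv(a):
    """Inverse of a determinant-one matrix: (a0 a1; a2 a3)^{-1} = (a3 -a1; -a2 a0)."""
    return (a[3], (-a[1]) % P, (-a[2]) % P, a[0])

def tr(a):
    return (a[0] + a[3]) % P

def commutator(x, y):
    """[x, y] = x y x^{-1} y^{-1}."""
    return mul(mul(x, y), mul(inv(x), inv(y)))

def r(s, t):
    """r(s, t) = s^2 + 2 t^2 - s t^2 - 2 over F_P."""
    return (s*s + 2*t*t - s*t*t - 2) % P

SL2 = [m for m in product(range(P), repeat=4)
       if (m[0]*m[3] - m[1]*m[2]) % P == 1]

def trace_recursion_holds(max_n):
    """Check tr e_n(x, y) = s_n for n = 1..max_n and all x, y in SL(2, P)."""
    for x in SL2:
        for y in SL2:
            s, t, u = tr(x), tr(y), tr(mul(x, y))
            expected = (s*s + u*u + t*t - s*u*t - 2) % P  # s_1 = tr [x, y]
            word = commutator(x, y)                        # e_1(x, y)
            for _ in range(max_n):
                if tr(word) != expected:
                    return False
                word = commutator(word, y)                 # e_{n+1} = [e_n, y]
                expected = r(expected, t)                  # s_{n+1} = r(s_n, t)
    return True

if __name__ == "__main__":
    print(len(SL2), trace_recursion_holds(4))
```

The check works because $\operatorname{tr}(e_n(x, y)\, y) = \operatorname{tr} y$ for every $n$, which is why the second and third coordinates of $\psi^n$ stay equal to $t$.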

Let $\pm 2 \ne a \in \mathbb{F}_q$. It is proven in [9] that a matrix $z \in$ SL(2, q) with $\operatorname{tr} z = a$ can be written as $z = e_{n+1}(x, y)$ for some $x, y \in$ SL(2, q) if and only if there is a solution $(s_1, t) \in \mathbb{F}_q^2$ of the equation $r^{(n)}(s_1, t) = a$, that is, if and only if the curve $C_{n,a}$ defined over $\mathbb{F}_q$ by the equation $r^{(n)}(s_1, t) = a$ has a rational point.

In [9] it was shown that the curve $C_{n,a}$ is absolutely irreducible over any finite field $\mathbb{F}_q$. In addition, its genus satisfies the inequality $g(C_{n,a}) \le 2^n(n - 1) + 1$, and it has at most $\delta = 5 \cdot 2^n$ punctures. By Weil's inequality, for $a \ne \pm 2$ and $q > 2^{2n+3}(n - 1)^2$ we have $C_{n,a}(\mathbb{F}_q) \ne \emptyset$, as required.

This implies the following results, obtained in [9].

Theorem 6.1.

The $n$th Engel word map is surjective on SL(2, q) $\setminus \{-\mathrm{id}\}$ (and hence on PSL(2, q)) provided that $q > q_0(n)$ is sufficiently large.

On the other hand,

Proposition 6.2.

There is an infinite family of finite fields $\mathbb{F}_q$ such that if $n > n_0(q)$ is large enough, then the $n$th Engel word map is not surjective on SL(2, q) $\setminus \{-\mathrm{id}\}$.

Proposition 6.3.

For every odd prime power $q$ there is $n_0 = n_0(q)$ such that $e_n(x, y) \ne -\mathrm{id}$ for every $n > n_0$ and every $x, y \in$ SL(2, q).

Indeed, $z = e_{n+1}(x, y) = -\mathrm{id}$ implies that there is a solution to the equation $r^{(n)}(s_1, 0) = -2$, and then there is $c \in \mathbb{F}_{q^2}$ such that $c^{2^n} = -1$.
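The behaviour of $-\mathrm{id}$ under Engel maps can be observed directly in the smallest odd case $q = 3$. The following sketch (an illustration of ours, with hypothetical helper names; it is not the argument of [9]) searches SL(2, 3) exhaustively for pairs $(x, y)$ with $e_n(x, y) = -\mathrm{id}$.

```python
from itertools import product

P = 3  # smallest odd prime; SL(2, 3) has only 24 elements

def mul(a, b):
    """Product of 2x2 matrices (a0 a1; a2 a3) over F_P, stored as 4-tuples."""
    return ((a[0]*b[0] + a[1]*b[2]) % P, (a[0]*b[1] + a[1]*b[3]) % P,
            (a[2]*b[0] + a[3]*b[2]) % P, (a[2]*b[1] + a[3]*b[3]) % P)

def inv(a):
    """Inverse of a determinant-one matrix."""
    return (a[3], (-a[1]) % P, (-a[2]) % P, a[0])

def commutator(x, y):
    return mul(mul(x, y), mul(inv(x), inv(y)))

def engel(n, x, y):
    """e_1 = [x, y] and e_n = [e_{n-1}, y] for n >= 2."""
    word = commutator(x, y)
    for _ in range(n - 1):
        word = commutator(word, y)
    return word

SL2 = [m for m in product(range(P), repeat=4)
       if (m[0]*m[3] - m[1]*m[2]) % P == 1]
MINUS_ID = (P - 1, 0, 0, P - 1)

def minus_id_is_engel_value(n):
    """True if e_n(x, y) = -id for some x, y in SL(2, P)."""
    return any(engel(n, x, y) == MINUS_ID for x in SL2 for y in SL2)

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, minus_id_is_engel_value(n))
```

In this tiny example $-\mathrm{id}$ is attained only for the first few values of $n$, in line with the statement that it is never a value of $e_n$ once $n$ is large enough.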

In certain cases, en is always surjective on PSL(2, q).

Proposition 6.4.

$e_n$ is surjective on SL(2, $2^e$) = PSL(2, $2^e$).

Indeed, if $q = 2^e$, take $t = 0$; then $r(s, 0) = s^2$, so $r^{(n)}(s, 0) = s^{2^n}$ is an isomorphism, implying that $e_n$ is surjective on PSL(2, q).

Proposition 6.5.

en is surjective on PSL(2, q) if V2 e Fq, e Fq.

Proposition 6.6.

If $n \le 4$ then $e_n$ is surjective for all groups PSL(2, q).

The last result is a consequence of Theorem 6.1 and MAGMA calculations.

In [8] there is a precise description of the positive integers $a$, $b$ and prime powers $q$ for which the word map $w(x, y) = x^a y^b$ is surjective on the group PSL(2, q) (and SL(2, q)). The proof is based on the investigation of the trace map of positive words. The key result is that $\operatorname{tr}(x^a y^b)$ is a linear polynomial in $u$, namely,

$$\operatorname{tr}(x^a y^b) = u \cdot f_{a,b}(s, t) + h_{a,b}(s, t), \quad \text{where } f_{a,b}(s, t),\ h_{a,b}(s, t) \in \mathbb{F}_q[s, t].$$

Thus, if neither $a$ nor $b$ is divisible by the exponent of PSL(2, q), then any element in $\mathbb{F}_q$ can be written as $\operatorname{tr}(x^a y^b)$ for some $x, y \in$ SL(2, q). This immediately implies that in this case, any semisimple element (namely, $z \in$ SL(2, q) with $\operatorname{tr} z \ne \pm 2$) can be written as $z = x^a y^b$ for some $x, y \in$ SL(2, q). However, when $z$ is unipotent (namely, $z \ne \pm \mathrm{id}$ and $\operatorname{tr} z = \pm 2$) one has to be more careful, and a detailed analysis is needed. Indeed, it may happen that neither $a$ nor $b$ is divisible by the exponent of PSL(2, q), but nevertheless the image of the word map $w = x^a y^b$ does not contain any unipotent. For example, the word $w = x^{42} y^{42}$ is not surjective on PSL(2, 7) and PSL(2, 8).
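One elementary way to see the linearity of $\operatorname{tr}(x^a y^b)$ in $u$ for positive $a$, $b$ is via the Cayley-Hamilton relation $x^2 = sx - \mathrm{id}$, which gives $x^n = p_n(s)\,x - p_{n-1}(s)\,\mathrm{id}$ with $p_0 = 0$, $p_1 = 1$, $p_{k+1} = s p_k - p_{k-1}$. This derivation and the resulting explicit expressions for $f_{a,b}$ and $h_{a,b}$ are our own illustration and are not quoted from [8]; the sketch below checks them exhaustively over SL(2, 5).

```python
from itertools import product

P = 5  # a small prime, chosen only for illustration

def mul(a, b):
    """Product of 2x2 matrices (a0 a1; a2 a3) over F_P, stored as 4-tuples."""
    return ((a[0]*b[0] + a[1]*b[2]) % P, (a[0]*b[1] + a[1]*b[3]) % P,
            (a[2]*b[0] + a[3]*b[2]) % P, (a[2]*b[1] + a[3]*b[3]) % P)

def tr(a):
    return (a[0] + a[3]) % P

def power(x, n):
    result = (1, 0, 0, 1)
    for _ in range(n):
        result = mul(result, x)
    return result

def cayley_hamilton_coeffs(n, s):
    """Return (p_{n-1}(s), p_n(s)) mod P for n >= 1, where p_0 = 0, p_1 = 1,
    p_{k+1} = s p_k - p_{k-1}; then x^n = p_n(s) x - p_{n-1}(s) id."""
    prev, cur = 0, 1
    for _ in range(n - 1):
        prev, cur = cur, (s * cur - prev) % P
    return prev, cur

def f_and_h(a, b, s, t):
    """Coefficients with tr(x^a y^b) = u * f + h (assumed form, derived above)."""
    pa_prev, pa = cayley_hamilton_coeffs(a, s)
    pb_prev, pb = cayley_hamilton_coeffs(b, t)
    f = (pa * pb) % P
    h = (-pa * pb_prev * s - pa_prev * pb * t + 2 * pa_prev * pb_prev) % P
    return f, h

SL2 = [m for m in product(range(P), repeat=4)
       if (m[0]*m[3] - m[1]*m[2]) % P == 1]

def linearity_holds(a, b):
    """Check tr(x^a y^b) = u f(s, t) + h(s, t) for all x, y in SL(2, P)."""
    for x in SL2:
        xa, s = power(x, a), tr(x)
        for y in SL2:
            t, u = tr(y), tr(mul(x, y))
            f, h = f_and_h(a, b, s, t)
            if tr(mul(xa, power(y, b))) != (u * f + h) % P:
                return False
    return True

if __name__ == "__main__":
    print([linearity_holds(a, b) for a, b in [(1, 1), (2, 3), (4, 4)]])
```

In this form the coefficient of $u$ is $p_a(s)\,p_b(t)$, so the trace can be steered through $u$ exactly when this product does not vanish identically on the relevant pairs $(s, t)$.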

In addition, it was determined when $-\mathrm{id}$ can be written as $x^a y^b$ for some $x, y \in$ SL(2, q). In particular, if $q \equiv \pm 3 \bmod 8$, then $x^4 y^4 \ne -\mathrm{id}$ for every $x, y \in$ SL(2, q) (the same result was obtained independently in [74]).

These results demonstrate, in particular, the difference between word maps in simple and quasisimple groups (see also the previous discussion in subsections 5.2.1, 5.2.2 and 5.2.3).

6.2. Criteria for equidistribution

In this section we describe some results on equidistribution of solutions of word equations of the form $w(x, y) = g$ in the family of finite groups SL(2, q) which were obtained in [14]. A criterion for equidistribution in terms of the trace polynomial of $w$ is given in Theorem 6.18 below. This allows one to get an explicit description of certain classes of words possessing the equidistribution property and show that this property is generic within these classes. This result can be viewed, on the one hand, as a refinement (in the $SL_2$-case) of equidistribution theorems of [69] and [73] on general words $w$ and general Chevalley groups $G$, and, on the other hand, as a generalization of equidistribution theorems for some particular words: [41] (commutator words on any finite simple $G$), [9] (Engel words on $SL_2$), [8] (words of the form $w = x^a y^b$ on $SL_2$). Acting in the spirit of [41], we deduce a criterion for $w \colon SL_2 \times SL_2 \to SL_2$ to be almost measure-preserving. It turns out that "good" (equidistributed, measure-preserving) words are essentially those whose trace polynomial cannot be represented as a composition of two other polynomials.

Here are precise definitions and results. We will follow the approach to equidistribution adopted in [41] (see subsection 5.3.1 above).

Definition 6.7 (cf. [41, § 3] and subsection 5.3.1).

Let $f \colon X \to Y$ be a map between finite non-empty sets, and let $\varepsilon > 0$. We say that $f$ is $\varepsilon$-equidistributed if there exists $Y' \subseteq Y$ such that

(i) $\# Y' > \# Y\,(1 - \varepsilon)$;

(ii) $|\# f^{-1}(y) - \# X/\# Y| < \varepsilon\, \# X/\# Y$ for all $y \in Y'$.
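Definition 6.7 can be tested mechanically on small examples: the largest admissible $Y'$ is the set of all $y$ satisfying (ii), so it suffices to check (i) for that set. The sketch below follows this observation, with the inequalities taken exactly as printed above; the function name and the two toy maps are our own illustrative choices.

```python
from collections import Counter
from fractions import Fraction

def is_eps_equidistributed(f, X, Y, eps):
    """Decide whether f: X -> Y is eps-equidistributed in the sense above.

    The set of all y satisfying condition (ii) is the largest possible Y',
    so condition (i) only needs to be tested for it.
    """
    X, Y = list(X), list(Y)
    fibre = Counter(f(x) for x in X)
    mean = Fraction(len(X), len(Y))            # #X / #Y
    good = [y for y in Y if abs(fibre[y] - mean) < eps * mean]
    return len(good) > len(Y) * (1 - eps)

def reduce_mod_4(x):
    """Toy example: all fibres of Z/12 -> Z/4 have the same size."""
    return x % 4

def square_mod_7(x):
    """Toy example: squaring on F_7 has fibres of sizes 0, 1 and 2."""
    return (x * x) % 7

if __name__ == "__main__":
    eps = Fraction(1, 2)
    print(is_eps_equidistributed(reduce_mod_4, range(12), range(4), eps))
    print(is_eps_equidistributed(square_mod_7, range(7), range(7), eps))
```

Exact rational arithmetic is used for the mean fibre size so that the strict inequalities are not affected by rounding.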

The setting is as follows. Let a family of maps of finite sets $P_q \colon X_q \to Y_q$ be given for every $q = p^n$. Assume that for all sufficiently large $q$ the set $Y_q$ is non-empty. For each such $q$ take $y \in Y_q$ and denote

$$P_y = \{x \in X_q : P_q(x) = y\}.$$

Definition 6.8 ([14]).

Fix a prime $p$. With the notation as above, we say that the family $P_q \colon X_q \to Y_q$, $q = p^n$, is $p$-equidistributed if there exist a positive integer $n_0$ and a function $\varepsilon_p \colon \mathbb{N} \to \mathbb{R}$ tending to 0 as $n \to \infty$ such that for all $q = p^n$ with $n > n_0$ the set $Y_q$ contains a subset $S_q$ with the following properties:

(i) $\# S_q < \varepsilon_p(q)\,(\# Y_q)$;

(ii) $|\# P_y - \# X_q/\# Y_q| < \varepsilon_p(q)\, \# X_q/\# Y_q$ for all $y \in Y_q \setminus S_q$.

Remark 6.9.

Definition 6.8 means that for $q = p^n$ large enough, the map $X_q \to Y_q$ is $\varepsilon_p(q)$-equidistributed, in the sense of Definition 6.7.

Definition 6.10 ([14]).

We say that the family $P_q \colon X_q \to Y_q$ is equidistributed if it is $p$-equidistributed for all $p$ and there exists a function $\varepsilon \colon \mathbb{N} \to \mathbb{R}$ tending to 0 as $n \to \infty$ such that for every $p$ and every $q = p^n$ large enough, we have $\varepsilon_p(q) \le \varepsilon(q)$.

Let us now consider the case where $Y_q = G_q$ = SL(2, q), $X_q = (G_q)^2$ is a direct product of its two copies, and $P_q = P_{w,q} \colon (G_q)^2 \to G_q$ is the morphism induced by some fixed word $w \in F_2$. Accordingly, we say that $w$ is equidistributed (or $p$-equidistributed) if so is the family of maps $P_{w,q} \colon$ SL(2, q) $\times$ SL(2, q) $\to$ SL(2, q) (or, in other words, if so is the morphism $P_w \colon SL_{2,\mathbb{Z}} \times SL_{2,\mathbb{Z}} \to SL_{2,\mathbb{Z}}$ of group schemes over $\mathbb{Z}$).

Recall some properties of polynomials.

Definition 6.11.

Let $F$ be a finite field. We say that $h \in F[x]$ is a permutation polynomial if the set of its values $\{h(z)\}_{z \in F}$ coincides with $F$.

Theorem 6.12 ([78, Theorem 7.14]).

Let $q = p^n$. A polynomial $h \in \mathbb{F}_q[x]$ is a permutation polynomial of all finite extensions of $\mathbb{F}_q$ if and only if $h = ax^{p^k} + b$, where $a \ne 0$ and $k$ is a non-negative integer.
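A small computation shows why being a permutation polynomial of one field is much weaker than being one for all extensions. In the sketch below (our own illustration), $\mathbb{F}_{25}$ is modelled as $\mathbb{F}_5[\alpha]/(\alpha^2 - 2)$, which is legitimate because 2 is not a square modulo 5; the encoding of elements as pairs and the function names are assumptions of this example.

```python
P = 5
NONRESIDUE = 2  # 2 is not a square mod 5, so alpha^2 = 2 defines F_25

def mul25(x, y):
    """Product of a + b*alpha and c + d*alpha in F_25, elements stored as pairs."""
    a, b = x
    c, d = y
    return ((a * c + NONRESIDUE * b * d) % P, (a * d + b * c) % P)

def power25(x, k):
    result = (1, 0)
    for _ in range(k):
        result = mul25(result, x)
    return result

F25 = [(a, b) for a in range(P) for b in range(P)]

def value_set_size_F5(k):
    """Number of distinct values of x^k on F_5."""
    return len({pow(x, k, P) for x in range(P)})

def value_set_size_F25(k):
    """Number of distinct values of x^k on F_25."""
    return len({power25(x, k) for x in F25})

if __name__ == "__main__":
    print(value_set_size_F5(3), value_set_size_F25(3))  # cube map
    print(value_set_size_F5(5), value_set_size_F25(5))  # Frobenius x -> x^5
```

The cube map permutes $\mathbb{F}_5$ but not $\mathbb{F}_{25}$, whereas $x^5$, which is of the form $ax^{p^k} + b$, permutes both.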

The following notions are essential for our criteria.

Definition 6.13.

Let $F$ be a field. We say that a polynomial $P \in F[x_1, \dots, x_n]$ is $F$-composite if there exist $Q \in F[x_1, \dots, x_n]$, $\deg Q \ge 1$, and $h \in F[z]$, $\deg h \ge 2$, such that $P = h \circ Q$. Otherwise, we say that $P$ is $F$-noncomposite.

Note that if E/F is a separable field extension, it is known [4, Theorem 1 and Proposition 1] that P is F-composite if and only if P is E-composite. In particular, working over perfect ground fields, we may always assume, if needed, that F is algebraically closed.

Definition 6.14.

Let $P \in \mathbb{Z}[x_1, \dots, x_n]$. Fix a prime $p$.

• We say that $P$ is $p$-composite if the reduced polynomial $P_p \in \mathbb{F}_p[x_1, \dots, x_n]$ is $\mathbb{F}_p$-composite. Otherwise, we say that $P$ is $p$-noncomposite.

• We say that a $p$-composite polynomial $P$ is $p$-special if, in the notation of Definition 6.13, $P_p = h \circ Q$ where $h \in \mathbb{F}_p[x]$ is a permutation polynomial of all finite extensions of $\mathbb{F}_p$.

Definition 6.15.

We say that a polynomial $P \in \mathbb{Z}[x_1, \dots, x_n]$ is almost noncomposite if for every prime $p$ it is either $p$-noncomposite or $p$-special. Otherwise we say that $P$ is very composite.

Remark 6.16.

If a polynomial $P \in \mathbb{Z}[x_1, \dots, x_n]$ is $\mathbb{Q}$-noncomposite, it is $p$-noncomposite for all but finitely many primes $p$ [16, 2.2.1]. If $P \in \mathbb{Z}[x_1, \dots, x_n]$ is $\mathbb{Q}$-composite, it is very composite.

We can now formulate the main results of [14].

Theorem 6.17.

Let $w \in F_2$. The morphism $P_w \colon SL_{2,\mathbb{Z}} \times SL_{2,\mathbb{Z}} \to SL_{2,\mathbb{Z}}$ is $p$-equidistributed if and only if the trace polynomial $f_w$ is either $p$-noncomposite or $p$-special.

Theorem 6.18.

Let $w \in F_2$. The morphism $P_w \colon SL_{2,\mathbb{Z}} \times SL_{2,\mathbb{Z}} \to SL_{2,\mathbb{Z}}$ is equidistributed if and only if the trace polynomial $f_w$ is almost noncomposite.

Corollary 6.19.

Suppose that for each $p$ and all $n$ big enough the image of $P_{w,p^n} \colon$ SL(2, $p^n$) $\times$ SL(2, $p^n$) $\to$ SL(2, $p^n$) contains all noncentral semisimple elements of SL(2, $p^n$). Then $w$ is equidistributed.

For a given word $w \in F_2$, let us now consider the family of groups $G_q$ = PSL(2, q) and the corresponding word maps

$$w_q \colon G_q \times G_q \to G_q.$$

Proposition 6.20.

If the morphism $P_w \colon SL_{2,\mathbb{Z}} \times SL_{2,\mathbb{Z}} \to SL_{2,\mathbb{Z}}$ is equidistributed (or $p$-equidistributed), then so is the family of maps

$$w_q \colon G_q \times G_q \to G_q.$$

Here are the main ingredients of the proofs of Theorems 6.17 and 6.18:

• The diagram

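$$\begin{array}{ccc}
G_q \times G_q & \xrightarrow{\ P_{w,q}\ } & G_q \\
\downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \operatorname{tr}} \\
\mathbb{A}^3_{s,u,t}(\mathbb{F}_q) & \xrightarrow{\ f_w\ } & \mathbb{A}^1(\mathbb{F}_q)
\end{array} \tag{20}$$

where $\pi(x, y) = (\operatorname{tr} x, \operatorname{tr}(xy), \operatorname{tr} y)$.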

• An explicit Lang-Weil estimate (Ghorpade-Lachaud [46]): if $H \subset \mathbb{A}^3_{\mathbb{F}_q}$ is an absolutely irreducible hypersurface of degree $d$, then

$$|\# H(\mathbb{F}_q) - q^2| \le (d - 1)(d - 2)\, q^{3/2} + 12(d + 4)^4 q$$

or, equivalently, $\# H(\mathbb{F}_q) = q^2(1 + r_1)$ with

$$|r_1| \le q^{-1/2}\bigl[(d - 1)(d - 2) + 12(d + 4)^4 q^{-1/2}\bigr].$$

For q > Cds this gives $|r_1| < 1/2$.

• A generalized Stein-Lorenzini inequality [96]: if $f_{w,p}$ is $p$-noncomposite, then the spectrum $\sigma(f_{w,p})$, i.e., the set of all points $z \in \mathbb{A}^1(\overline{\mathbb{F}}_p)$ such that the hypersurface $H_z \subset \mathbb{A}^3_{s,u,t}(\overline{\mathbb{F}}_p)$, defined by the equation $f_w(s, u, t) = z$, is reducible, contains at most $d - 1$ points, where $d = \deg f_w$. The same is true for each $\sigma_q(f_w) = \sigma(f_{w,p}) \cap \mathbb{F}_q$. Let $z \in \mathbb{A}^1(\overline{\mathbb{F}}_p) \setminus \sigma(f_{w,p})$. Then $H_z$ is an irreducible hypersurface and hence satisfies the Ghorpade-Lachaud inequality.

• Estimates for fibres of the trace map:

Lemma 6.21.

Let $D(s, u, t) = (s^2 - 4)(t^2 - 4)(s^2 + t^2 + u^2 - ust - 4)$, and let $\Delta \subset \mathbb{A}^3_{s,u,t}$ be defined by the equation $D = 0$. Let $H \subset \mathbb{A}^3_{s,u,t}(\overline{\mathbb{F}}_p)$ be a hypersurface of degree $d$ such that $H \not\subset \Delta$. Then for $\pi$ from diagram (20) we have $\# \pi^{-1}(H)(\mathbb{F}_q) = \# H(\mathbb{F}_q)\, q^3 (1 + r_2)$, where $|r_2| \le Cd/q$.

• Estimates for the size of the value set of polynomials [125]: if $R$ is not a permutation polynomial for $\mathbb{F}_q$, $q = p^n$, then it is not a permutation polynomial for any extension $\mathbb{F}_{q^m}$ of $\mathbb{F}_q$ and omits at least $(q^m - 1)/d_1$ values of $\mathbb{F}_{q^m}$, where $d_1 = \deg R$.
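As a numerical companion to the Lang-Weil type estimate in the list above, the following sketch counts $\mathbb{F}_p$-points on the cubic surface $s^2 + u^2 + t^2 - sut - 2 = z$ (a level set of the trace polynomial of the commutator) and compares the deviation from $p^2$ with the explicit bound as printed. This is only an illustration of ours: the surface is assumed, not verified here, to be absolutely irreducible for the chosen $z$, and for such small $p$ the bound is very far from sharp.

```python
def count_points(p, z):
    """Number of (s, u, t) in F_p^3 with s^2 + u^2 + t^2 - s*u*t - 2 = z."""
    return sum(1
               for s in range(p) for u in range(p) for t in range(p)
               if (s*s + u*u + t*t - s*u*t - 2 - z) % p == 0)

def ghorpade_lachaud_bound(q, d):
    """Right-hand side (d-1)(d-2) q^{3/2} + 12 (d+4)^4 q of the estimate."""
    return (d - 1) * (d - 2) * q ** 1.5 + 12 * (d + 4) ** 4 * q

def relative_error(p, z):
    """The quantity r_1 defined by #H(F_p) = p^2 (1 + r_1)."""
    return count_points(p, z) / p**2 - 1

if __name__ == "__main__":
    for p in (5, 7, 11, 13):
        n = count_points(p, 0)
        print(p, n, relative_error(p, 0), ghorpade_lachaud_bound(p, 3))
```

The printed relative errors give a feeling for how quickly the point count approaches $q^2$, which is the only feature of the estimate used in the proofs.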

6.2.1. Composite trace polynomials

The goal of this section is to describe words in two variables whose trace polynomial is composite. We refer the curious reader to [14] for proofs.

Throughout this section $T_n(x)$ stands for the $n$th Chebyshev polynomial, and $D_n(x) = 2T_n(x/2)$ for the $n$th Dickson polynomial. It is well known (see, e.g., [77, (2.2)]) that this polynomial satisfies $D_n(x + 1/x) = x^n + 1/x^n$ and is completely determined by this functional equation. We always assume that $w(x, y)$ is written in the form

$$w = x^{a_1} y^{b_1} \cdots x^{a_r} y^{b_r} \tag{21}$$

and is reduced (all integers $a_i$, $b_j$ are nonzero). We call the number $r$ the complexity of $w$.
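The functional equation for the Dickson polynomials is easy to test with exact arithmetic. The sketch below uses the standard three-term recurrence $D_0 = 2$, $D_1 = x$, $D_{k+1} = xD_k - D_{k-1}$ (a well-known fact that we assume here; it is not stated in the text above) and also checks two consequences: $\operatorname{tr}(M^n) = D_n(\operatorname{tr} M)$ for a matrix $M$ of determinant one, and $D_{mn} = D_m \circ D_n$. The sample matrix and evaluation points are arbitrary choices of ours.

```python
from fractions import Fraction

def dickson(n, x):
    """D_n(x) via D_0 = 2, D_1 = x, D_{k+1} = x D_k - D_{k-1}."""
    if n == 0:
        return 2
    prev, cur = 2, x
    for _ in range(n - 1):
        prev, cur = cur, x * cur - prev
    return cur

def mat_mul(a, b):
    """Product of 2x2 matrices given as nested tuples."""
    return ((a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]),
            (a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]))

def mat_pow(a, n):
    result = ((1, 0), (0, 1))
    for _ in range(n):
        result = mat_mul(result, a)
    return result

def trace(a):
    return a[0][0] + a[1][1]

M = ((2, 1), (1, 1))   # an integer matrix of determinant 1 and trace 3
Z = Fraction(3, 2)     # an arbitrary nonzero rational test point

if __name__ == "__main__":
    for n in range(6):
        print(n, dickson(n, Z + 1 / Z) == Z**n + 1 / Z**n,
              trace(mat_pow(M, n)) == dickson(n, trace(M)))
```

The identity $\operatorname{tr}(v^n) = D_n(\operatorname{tr} v)$ is the basic source of composite trace polynomials: a proper power $w = v^n$ has trace polynomial $D_n \circ f_v$.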

Definition 6.22.

We say that two reduced words $w = x^{a_1} y^{b_1} \cdots x^{a_r} y^{b_r}$ and $v = x^{c_1} y^{d_1} \cdots x^{c_{r'}} y^{d_{r'}}$, written in form (21), are trace-similar if $r = r'$, the array $\{|a_i|\}$ is a rearrangement of $\{|c_j|\}$, and the array $\{|b_i|\}$ is a rearrangement of $\{|d_j|\}$.

Note that if reduced words $w = x^{a_1} y^{b_1} \cdots x^{a_r} y^{b_r}$ and $v = x^{c_1} y^{d_1} \cdots x^{c_{r'}} y^{d_{r'}}$, written in form (21), have the same trace polynomial, then $w$ and $v$ are trace-similar [56].
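Trace-similarity is a purely combinatorial condition on the exponents and can be tested directly. In the sketch below a reduced word in form (21) is encoded as a list of exponent pairs $(a_i, b_i)$; this encoding and the function name are our own conventions.

```python
def is_trace_similar(w, v):
    """Test trace-similarity of two reduced words given as lists of (a_i, b_i).

    The words are trace-similar if they have the same complexity and the
    multisets of |a_i| and of |b_i| coincide.
    """
    if len(w) != len(v):
        return False
    abs_a_w = sorted(abs(a) for a, _ in w)
    abs_a_v = sorted(abs(a) for a, _ in v)
    abs_b_w = sorted(abs(b) for _, b in w)
    abs_b_v = sorted(abs(b) for _, b in v)
    return abs_a_w == abs_a_v and abs_b_w == abs_b_v

if __name__ == "__main__":
    w = [(1, 2), (-3, 1)]   # x y^2 x^{-3} y
    v = [(3, -1), (1, 2)]   # x^3 y^{-1} x y^2
    print(is_trace_similar(w, v))
```

By the remark above, failing this test is enough to conclude that two words have different trace polynomials; passing it is only a necessary condition.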

The following propositions are valid (see [14] for the proofs).

Proposition 6.23.

Let $w(x, y) = x^{a_1} y^{b_1} \cdots x^{a_r} y^{b_r}$, $A = \sum a_i$, $B = \sum b_i$. Assume that either $A \ne 0$ or $B \ne 0$. Assume that the trace polynomial $f_w(s, u, t)$ is $\mathbb{C}$-composite, $f_w(s, u, t) = h(q(s, u, t))$, where $q \in \mathbb{C}[s, u, t]$ and $h \in \mathbb{C}[z]$, $\deg h \ge 2$. Then $h = D_d(z)$ for some $d \ge 2$.

Proposition 6.24.

Let $w$ be a reduced word of complexity $r$ written in form (21). If its trace polynomial $f_w$ is $\mathbb{C}$-composite, $f_w(s, u, t) = h(q(s, u, t))$ where $q \in \mathbb{C}[s, u, t]$ and $h(x) = \mu x^n + \cdots$ is a polynomial in one variable of degree $n$, then $r = nm$ and $w$ is trace-similar to $v(x, y)^n$ where $v$ is a word of complexity $m$.

Proposition 6.25.

Let $w(x, y) = x^a y^b \cdots$ be a reduced word of complexity $n$ such that $f_w(s, u, t) = D_n(q(s, u, t))$ for some $q$. Then $w(x, y) = (x^a y^b)^n$.

Remark 6.26.

The statements of Propositions 6.23, 6.24 and 6.25 remain valid if we replace C by (the algebraic closure of) a sufficiently big prime field Fp, and "composite" by "p-composite" (p > p0 depending on w).

Here are some concrete cases where one can get more conclusive results.

Corollary 6.27.

Let $w(x, y) = x^a y^b \cdots$ be a reduced word of prime complexity $r$. If $p > r$ and $w$ is not $p$-equidistributed, then $w = v^r(x, y)$.

Corollary 6.28.

The word $w(x, y) = x^a y^b x^c y^d$ is either equidistributed or equal to $(x^a y^b)^2$.

All facts mentioned above allow one to describe a class of words within which a "generic" word induces the map which is almost equidistributed. More precisely, we have the following proposition.

Proposition 6.29.

Let $R$ be the set of words $w$ of prime complexity. Then the set $S$ of words $w \in R$, such that the corresponding word morphism $P_w \colon SL_{2,\mathbb{Z}} \times SL_{2,\mathbb{Z}} \to SL_{2,\mathbb{Z}}$ is $p$-equidistributed for all but finitely many primes $p$, is exponentially generic in $R$.

(According to the terminology of [65], this means that the proportion of words from S among all words from R of fixed length tends to 1 exponentially fast as the length tends to infinity.) This is proved by combining the results quoted above with the well-known fact (see, e.g., [5]) stating that the class of words which are proper powers of other words is exponentially negligible.

7. Concluding remarks and open problems

We conclude with a brief discussion of various ramifications, analogues and generalizations of results presented in this survey, focusing on open problems, the list of which does not pretend to be comprehensive and reflects the authors' taste. Most of them are borrowed from [7, 9, 10, 14, 52].

Engel-like words and solvability properties

In light of Theorems 3.4 and 3.6 and subsequent discussions in subsection 4.1.4, it is natural to ask whether or not the property of a given sequence to characterize finite solvable groups is generic. A possible way to express it is the following conjecture.

Conjecture 7.1.

Let $R_0$ (resp. $R$) denote the class of words $f(x, y, z) \in F_3$ satisfying the following condition: there exists $w_0(x, y) \in F_2$ (resp. $w$) such that the sequence $\{v_n^0(x, y)\}$ (resp. $\{v_n(x, y)\}$) generated by the first word $w_0$ (resp. $w$) and law $f$

(i) does not contain the identity word,

(ii) is Engel-like,

(iii) descends along the derived series

(resp. conditions (i)-(iii) and the additional one:

(iv) for any finite group G the following holds: G is solvable if and only if there is n such that vn(x, y) = 1 in G). Then the class R is generic within R0 (in the sense of [65], as in subsection 6.2).

We hope that algebraic-geometric approaches developed in subsection 4.1.4 could be useful in establishing this conjecture. One can also try the following counterpart as a testing ground: in Conjecture 7.1, replace throughout "solvable" with "nilpotent" and "derived series" with "lower central series".

It would be interesting to investigate further, in the spirit of [7, 51], the relationship between finite groups and finite-dimensional Lie algebras from the point of view of solvability properties. Namely, one can put forward the following (maybe over-optimistic) conjecture.

Conjecture 7.2.

Let {vn(x, y)} be an Engel-like sequence of words in the free Lie algebra W2 which characterizes finite-dimensional solvable Lie algebras defined over fields of arbitrary characteristics (i.e., a finite-dimensional Lie algebra g defined over an arbitrary field K is solvable if and only if there is n such that vn(x, y) = 0 in g). Then the same sequence, regarded as a sequence of words in the free group F2 (i.e., viewing Lie bracket as commutator), characterizes finite solvable groups.

Note that the assumption on arbitrary characteristics is essential: in [7] there have been exhibited sequences characterizing finite-dimensional Lie algebras defined over fields of characteristic zero but not over fields of prime characteristic (and not characterizing finite solvable groups). The first intriguing example is the sequence defined by

$$v_1(x, y) = [x, y], \qquad v_{n+1}(x, y) = [[v_n(x, y), x], [v_n(x, y), y]].$$

This example resists the algebraic-geometric approach described in Section 3 for purely computational reasons: the arising equations lead to varieties which are out of range of SINGULAR and MAGMA. Perhaps experts in computer algebra, who are able to apply more sophisticated methods, will have better luck.

It is desirable to use an Engel-like sequence for characterizing the solvable radical of a finite group in the same way as the original Engel sequence is used to characterize the nilpotent radical. A theorem of Baer [6] says that the nilpotent radical of a finite group $G$ coincides with the set of elements $y \in G$ with the following property: for every $x \in G$ there is $n = n(x, y)$ such that $e_n(x, y) = 1$.

Problem 7.3.

Exhibit an Engel-like sequence $\{v_n(x, y)\}$ such that the solvable radical of any finite group $G$ coincides with the collection of $y \in G$ with the property: for every $x \in G$ there is $n = n(x, y)$ such that $v_n(x, y) = 1$.

A recent result of Wilson [126], stating the existence of a two-variable countable set of words with the required property, gives strong evidence that such an Engel-like sequence should exist but does not provide any candidate. Note that even the toy problem of characterizing the solvable radical of a finite-dimensional Lie algebra with the help of Engel-like sequences remains open in the case where the ground field is of positive characteristic; see [7, 52]. We hope that algebraic-geometric machinery in the spirit of Section 3 may turn out to be useful for achieving this goal.

Borel's theorem and around

Let G be a connected semisimple algebraic group defined over an infinite field k.

Problem 7.4.

What can be said about the "fine structure" of w(G)? In particular, describe w such that w(G) contains the set of all semisimple elements of G.

A similar problem was discussed in [62] for associative noncommutative polynomials on associative matrix algebras, in the spirit of Kaplansky's problem. In such a setting, the resulting maps may well be non-dominant, and not only for obvious reasons mentioned in the introduction. In the paper cited above there have been described certain classes of polynomials which are not central and whose image contains elements with nonzero trace but the induced map is not dominant on 2 x 2 matrices. Here are some general questions remaining open.

Question 7.5 ([62]).

Let $P$ be an associative, noncommutative, noncentral, multilinear polynomial in $d$ variables whose image contains matrices with nonzero trace. Does $P$ induce a dominant map $M(n, K)^d \to M(n, K)$? Does there exist $P$ such that this map is not surjective?

Similar problems were discussed in [10] for Lie polynomials P on Chevalley Lie algebras g. It was shown that the induced map is dominant provided P is not identically zero on sl(2). We do not know whether or not the latter assumption can be dropped.

Question 7.6.

Does there exist a Lie polynomial in $d$ variables, not identically zero on a Chevalley Lie algebra $\mathfrak{g}$, such that the induced map $\mathfrak{g}^d \to \mathfrak{g}$ is not dominant?

It would be interesting to understand the situation with infinite-dimensional simple Lie algebras (as well as with finite-dimensional algebras of Cartan type over fields of positive characteristic). The first question, which does not seem too complicated, is the following one.

Question 7.7.

For which Lie algebras $\mathfrak{g}$ of Cartan type is the map $\mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$, $(x, y) \mapsto [x, y]$, surjective?

If for some g this question is answered in the affirmative, one can continue by looking at the Engel maps, as in [10].

Remark 7.8.

It would be interesting to consider a more general set-up when we have a polynomial map $P \colon L^d \to L^s$. (This means that we consider systems of equations rather than single equations.) In [48] some dominance results were obtained for the multiple commutator map $P \colon L \times L^d \to L^d$ given by the formula $P(X, X_1, \dots, X_d) = ([X, X_1], \dots, [X, X_d])$.

Remark 7.9.

In a similar spirit, one can consider generalized word maps $w \colon G^d \to G^s$ on simple groups. Apart from [48], see also a discussion of a particular case $w = (w_1, w_2) \colon G^2 \to G^2$ in [21, Problem 1].

Remark 7.10.

It would be interesting to find more classes of infinite simple (or close to simple) groups admitting some analogue of Borel's statement in the sense that the image of the word map is "large", at least for a generic word. Some such classes were discussed in the literature: infinite symmetric groups [86] (surjectivity of the commutator word was established by Ore [102]), groups of automorphisms of trees [92] and of random graphs [29].

We would suggest looking at Cremona groups, which share many common properties with linear algebraic groups and in which such notions as Zariski topology and dominance can be defined (see, e.g., [112]).

Problem 7.11.

Let $w \in F_d$ be a nontrivial word. Is the corresponding word map $w \colon (\mathrm{Cr}(n, k))^d \to \mathrm{Cr}(n, k)$ dominant?

Of course, the first case to be considered is $n = 2$. Recall two recent spectacular results on $\mathrm{Cr}(2, k)$ answering longstanding questions on its simplicity. It turns out that this group is at the same time simple and non-simple: it does not contain nontrivial normal subgroups closed in the topology mentioned above [15] but contains lots of abstract normal subgroups [22].

Remark 7.12.

In light of the previous remark, one can note that analogues of Borel's theorem may look different at first glance when one considers other classes of groups. Say, a certain analogue of Borel's theorem for profinite groups is provided by the following deep theorem by Nikolov and Segal [100, 101]: let $G$ be a finitely generated profinite group, let $w$ be a non-commutator word, and let $\langle w(G) \rangle$ denote the corresponding verbal subgroup (i.e., the subgroup generated by the set $\{w(g)^{\pm 1}\}$, $g \in G$); then $\langle w(G) \rangle$ is open in $G$. As pointed out to us by the first referee, this is a stronger version of the positive solution to the restricted Burnside Problem [127] (and of course the proof depends on the latter).

Remark 7.13.

To prevent the reader from an overoptimistic view on Borel's theorem, one has to note that there may be a significant gap between dominance and surjectivity. Moreover, the image of the word map may be very large in the Zariski topology (according to Borel's theorem) but very small in some natural topology. See [118] where such word maps are constructed for real compact Lie groups. Note that the behaviour of Engel word maps on such groups is much better: they are all surjective [30].

Word maps on finite simple groups

Regarding the image of the word map, we can recall here Thompson's and Shalev's Conjectures 5.14, 5.5 and 5.18 discussed in subsection 5.2.2. In particular, one can ask whether the following variant of Shalev's Conjecture 5.5, for the family of groups PSL(2, q), holds.

Conjecture 7.14 (Shalev1).

Assume that $w = w(x, y) \in F_2$ is not of the form $v^m(x, y)$ for some $v = v(x, y) \in F_2$ and $m > 1$. Then there exists a constant $q_0(w)$ such that if $q > q_0(w)$ then $w(G) = G$ for $G$ = PSL(2, q).

In particular, this conjecture holds for Engel words and for words of the form xayb (see subsection 6.1 and [8, 9]). One can therefore attempt to use the trace map method described in subsection 2.3 to find more words satisfying the above conjecture. In particular, the following questions are raised (see subsection 4.1.4 and [9, Section 8]).

Question 7.15 ([9]).

What are the words $w = w(x, y) \in F_2$ for which the corresponding trace map $\psi(s, u, t) = (f_1(s, u, t), f_2(s, u, t), t)$ has one of the following properties:

(*) the set $\{f_1(s, u, t) = a\}$ is absolutely irreducible for almost all $q$ and for every $a \in \mathbb{F}_q$?

(**) there exists a $\psi$-invariant plane $A$ and the curves $\{\psi|_A = a\}$ are absolutely irreducible for almost all $q$ and for a general $a \in \mathbb{F}_q$?

It is tempting to generalize the results on the image of some word maps of [8, 9], presented in Section 6.1, as well as the criteria for almost equidistribution of [14], presented in Section 6.2, in the following directions:

(i) extend them from words in two letters to words in d letters, d > 2;

(ii) keep d = 2 but consider arbitrary finite Chevalley groups;

(iii) combine (i) and (ii).

Whereas in case (i) one can still hope to use trace polynomials, which exist for any d, to produce criteria for almost equidistribution, cases (ii) and (iii) require some new terms for formulating such criteria and new tools for proving them.

Regardless of getting such criteria, it would be interesting to compare, in the general case, the properties of having large image and being equidistributed, in the spirit of Corollary 6.19. We dare to formulate the following conjecture.

Conjecture 7.16.

For a fixed $p$, let $G_q$ be a family of Chevalley groups of fixed Lie type over $\mathbb{F}_q$ ($q = p^n$ varies). For a fixed word $w \in F_d$, $d > 2$, let $P_q = P_{w,q}\colon (G_q)^d \to G_q$ be the corresponding map. Suppose that

(*) for all $n$ big enough the image of $P_q$ contains all regular semisimple elements of $G_q$.

Then the family $\{P_q\}$ is almost $p$-equidistributed.

It is a challenging task to describe the words $w$ satisfying condition (*) in Conjecture 7.16 (cf. the discussion in [73] after Theorem 5.3.2). Certainly, words of the form $w = v^k$, $k \geq 2$, do not satisfy this condition. We do not know any non-power word for which (*) does not hold.
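Condition (*) can be explored experimentally in the smallest case $G_q = \mathrm{SL}(2, p)$, $d = 2$. The sketch below is only an illustration under our own assumptions: it uses the criterion that an element of $\mathrm{SL}(2, p)$ is regular semisimple exactly when its trace $\tau$ satisfies $\tau^2 \neq 4$, and it encodes words as strings over `x`, `y`, `X`, `Y` with uppercase meaning inverse. It records the fibre sizes of the word map and lists the regular semisimple elements that the image misses.

```python
from collections import Counter
from itertools import product

def sl2(p):
    """All 2x2 matrices (a, b, c, d) over F_p (p prime) with ad - bc = 1."""
    return [m for m in product(range(p), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % p == 1]

def mul(x, y, p):
    """Matrix product with entries reduced mod p."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

def inv(x, p):
    """Inverse of a determinant-one matrix."""
    a, b, c, d = x
    return (d, -b % p, -c % p, a)

def fibers(word, p):
    """Counter mapping g in SL(2, p) to the number of pairs (x, y) with w(x, y) = g."""
    G = sl2(p)
    counts = Counter()
    for x in G:
        for y in G:
            letters = {"x": x, "y": y, "X": inv(x, p), "Y": inv(y, p)}
            g = (1, 0, 0, 1)
            for ch in word:
                g = mul(g, letters[ch], p)
            counts[g] += 1
    return counts

def is_regular_semisimple(g, p):
    """Distinct eigenvalues over the algebraic closure: trace^2 != 4 in F_p."""
    t = (g[0] + g[3]) % p
    return (t * t - 4) % p != 0

def missed_regular_semisimple(word, p):
    """Regular semisimple elements of SL(2, p) outside the image of the word map."""
    hit = fibers(word, p)
    return [g for g in sl2(p) if is_regular_semisimple(g, p) and g not in hit]
```

For $p = 5$ the power word `"xx"` misses 50 regular semisimple elements (those of order 4 and 6, which have no square roots in $\mathrm{SL}(2, 5)$), in line with the remark on words $v^k$, while the commutator word `"xyXY"` misses none. The values of `fibers(word, p)` can be inspected to see how far the fibre sizes are from uniform, though a single small $p$ says nothing about the asymptotic notion of almost equidistribution.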

¹ After the first version of the present survey had been submitted, a counter-example to Conjecture 7.14 was constructed in [60].

Other interesting problems, arising in the context of measure preservation and primitivity, were raised by Puder and Parzanchevski [105]. Namely, in [105] it was shown that the property of a word $w$ to be measure-preserving within the class of all finite groups can be detected on the family of all symmetric groups $S_n$. Are there other natural families (say, $\mathrm{PGL}(n, q)$) that can be used as such detectors? They also ask whether their results on measure preservation can be extended to the class of compact Lie groups (with respect to the Haar measure).

One can try yet another direction: consider equidistribution problems for matrix algebras and for polynomials more general than word polynomials (see the introduction). Even the case of 2 × 2 matrices is completely open.

Miscellaneous remarks and problems

Remark 7.17.

Recently, Shalev and his collaborators extended many results of Waring type from finite simple groups to some simple algebraic groups over p-adic integers. The case of simple Chevalley groups over rings of integers remains wide open; see [115] for some relevant questions and conjectures.

Remark 7.18.

A new dynamical viewpoint on the image of the word map was developed by Schul and Shalev [110]. They showed that the random walk on any finite simple group G, with respect to this image as a generating set, has mixing time 2.

Remark 7.19.

One could try to extend some of the results of this survey to the case of matrix groups or algebras over some sufficiently good ring. One has to be careful: say, in [108] there are examples of rings $R$ such that not every element of $\mathfrak{sl}(n, R)$ is a commutator.

Remark 7.20.

One can ask questions of Borel's type for other classes of algebras (beyond groups, Lie algebras and associative algebras). The interested reader may refer to [49] for the case of values of commutators and associators on alternative and Jordan algebras.

Remark 7.21.

It would be interesting to extend (at least part of) the methods and results described in the present survey to matrix equations with constant matrix coefficients, as discussed in the introduction.

Acknowledgements

Bandman and Kunyavskiï were supported in part by the Minerva Foundation through the Emmy Noether Research Institute for Mathematics. Kunyavskiï was supported in part by the Israel Science Foundation, grant 1207/12; part of this work was done when he participated in the trimester program Arithmetic and Geometry in the Hausdorff Research Institute for Mathematics (Bonn). Garion was supported by the SFB 878 Groups, Geometry and Actions. Support of these institutions is gratefully acknowledged.

We thank the referees for useful remarks.

References

[1] Abert M., On the probability of satisfying a word in a group, J. Group Theory, 2006, 9(5), 685-694

[2] Adolphson A., Sperber S., On the degree of the L-functions associated with an exponential sum, Compositio Math., 1988, 68(2), 125-159

[3] Amitsur S.A., The T-ideals of the free ring, J. London Math. Soc., 1955, 30(4), 470-475

[4] Arzhantsev I.V., Petravchuk A.P., Closed polynomials and saturated subalgebras of polynomial algebras, Ukrainian Math. J., 2007, 59(12), 1783-1790

[5] Arzhantseva G.N., Ol'shanskii A.Yu., Generality of the class of groups in which subgroups with a lesser number of generators are free, Math. Notes, 1996, 59(3-4), 350-355

[6] Baer R., Engelsche Elemente Noetherscher Gruppen, Math. Ann., 1957, 133(3), 256-270

[7] Bandman T., Borovoi M., Grunewald F., Kunyavskiï B., Plotkin E., Engel-like characterization of radicals in finite dimensional Lie algebras and finite groups, Manuscripta Math., 2006, 119(4), 465-481

[8] Bandman T., Garion S., Surjectivity and equidistribution of the word $x^a y^b$ on PSL(2, q) and SL(2, q), Internat. J. Algebra Comput., 2012, 22(2), #1250017

[9] Bandman T., Garion S., Grunewald F., On the surjectivity of Engel words on PSL(2, q), Groups Geom. Dyn., 2012, 6(3), 409-439

[10] Bandman T., Gordeev N., Kunyavskiï B., Plotkin E., Equations in simple Lie algebras, J. Algebra, 2012, 355, 67-79

[11] Bandman T., Greuel G.-M., Grunewald F., Kunyavskiï B., Pfister G., Plotkin E., Two-variable identities for finite solvable groups, C. R. Acad. Sci. Paris, 2003, 337(9), 581-586

[12] Bandman T., Greuel G.-M., Grunewald F., Kunyavskiï B., Pfister G., Plotkin E., Identities for finite solvable groups and equations in finite simple groups, Compos. Math., 2006, 142(3), 734-764

[13] Bandman T., Grunewald F., Kunyavskiï B., Geometry and arithmetic of verbal dynamical systems on simple groups, Groups Geom. Dyn., 2010, 4(4), 607-655

[14] Bandman T., Kunyavskiï B., Criteria for equidistribution of solutions of word equations in SL(2), J. Algebra, 2013, 382, 282-302

[15] Blanc J., Groupes de Cremona, connexité et simplicité, Ann. Sci. Éc. Norm. Supér., 2010, 43(2), 357-364

[16] Bodin A., Dèbes P., Najib S., Indecomposable polynomials and their spectrum, Acta Arith., 2009, 139(1), 79-100

[17] Borel A., On free subgroups of semisimple groups, Enseign. Math., 1983, 29(1-2), 151-164

[18] Borisov A., Sapir M., Polynomial maps over finite fields and residual finiteness of mapping tori of group endomorphisms, Invent. Math., 2005, 160(2), 341-356

[19] Borisov A., Sapir M., Polynomial maps over p-adics and residual properties of mapping tori of group endomorphisms, Int. Math. Res. Not. IMRN, 2009, 16, 3002-3015

[20] Bray J.N., Wilson J.S., Wilson R.A., A characterization of finite soluble groups by laws in two variables, Bull. London Math. Soc., 2005, 37(2), 179-186

[21] Breuillard E., Green B., Guralnick R., Tao T., Strongly dense free subgroups of semisimple algebraic groups, Israel J. Math., 2012, 192(1), 347-379

[22] Cantat S., Lamy S., Normal subgroups in the Cremona group, Acta Math., 2013, 210(1), 31-94

[23] Cargo D.P., de Launey W., Liebeck M.W., Stafford R.M., Short two-variable identities for finite groups, J. Group Theory, 2008, 11(5), 675-690

[24] Casals-Ruiz M., Kazachkov I., On Systems of Equations over Free Partially Commutative Groups, Mem. Amer. Math. Soc., 212(999), American Mathematical Society, Providence, 2011

[25] Connes A., Schwarz A., Matrix Vieta theorem revisited, Lett. Math. Phys., 1997, 39(4), 349-353

[26] Deligne P., Sullivan D., Division algebras and the Hausdorff-Banach-Tarski paradox, Enseign. Math., 1983, 29(1-2), 145-150

[27] Digne F., Michel J., Representations of Finite Groups of Lie Type, London Math. Soc. Stud. Texts, 21, Cambridge University Press, Cambridge, 1991

[28] Dixon J.D., The probability of generating the symmetric group, Math. Z., 1969, 110(3), 199-205

[29] Droste M., Truss J.K., On representing words in the automorphism group of the random graph, J. Group Theory, 2006, 9(6), 815-836

[30] Elkasapy A., Thom A., About Goto's method showing surjectivity of word maps, preprint available at http://arxiv.org/abs/1207.5596

[31] Ellers E.W., Gordeev N., Gauss decomposition with prescribed semisimple part in classical Chevalley groups, Comm. Algebra, 1994, 22(14), 5935-5950

[32] Ellers E.W., Gordeev N., Gauss decomposition with prescribed semisimple part in Chevalley groups II, Exceptional cases, Comm. Algebra, 1995, 23(8), 3085-3098

[33] Ellers E.W., Gordeev N., Gauss decomposition with prescribed semisimple part in Chevalley groups III, Finite twisted groups, Comm. Algebra, 1996, 24(14), 4447-4475

[34] Ellers E.W., Gordeev N., On the conjectures of J.Thompson and O.Ore, Trans. Amer. Math. Soc., 1998, 350(9), 3657-3671

[35] Etingof P., Gelfand I., Retakh V., Factorization of differential operators, quasideterminants, and nonabelian Toda field equations, Math. Res. Lett., 1997, 4(2-3), 413-425

[36] Formanek E., Central polynomials for matrix rings, J. Algebra, 1972, 23(1), 129-132

[37] Fricke R., Über die Theorie der automorphen Modulgruppen, Nachr. Akad. Wiss. Göttingen, 1896, 91-101

[38] Fricke R., Klein F., Vorlesungen über die Theorie der Automorphen Funktionen, 1 and 2, Teubner, Leipzig, 1897 and 1912

[39] Fuchs D., Schwarz A., Matrix Vieta theorem, In: Lie groups and Lie algebras: E.B. Dynkin's Seminar, Amer. Math. Soc. Transl. Ser. 2, 169, American Mathematical Society, Providence, 1995, 15-22

[40] Fujiwara K., Rigid geometry, Lefschetz-Verdier trace formula and Deligne's conjecture, Invent. Math., 1997, 127(3), 489-533

[41] Garion S., Shalev A., Commutator maps, measure preservation, and T-systems, Trans. Amer. Math. Soc., 2009, 361(9), 4631-4651

[42] Gelfand I., Retakh V., Noncommutative Vieta theorem and symmetric functions, In: The Gelfand Mathematical Seminars, 1993-1995, Gelfand Math. Sem., Birkhäuser, Boston, 1996, 93-100

[43] Gelfand I., Retakh V., Quasideterminants I, Selecta Math. (N.S.), 1997, 3(4), 517-546

[44] Gelfand S., On the number of solutions of a quadratic equation, In: Globus: General Mathematical Seminar, 1, Independent University of Moscow, Moscow, 2004, 124-133 (in Russian)

[45] Ghorpade S.R., Lachaud G., Number of solutions of equations over finite fields and a conjecture of Lang and Weil, In: Number Theory and Discrete Mathematics, Chandigarh, October 2-6, 2000, Trends Math., Birkhäuser, Basel, 2002, 269-291

[46] Ghorpade S.R., Lachaud G., Étale cohomology, Lefschetz theorems and number of points of singular varieties over finite fields, Mosc. Math. J., 2002, 2(3), 589-631; 2009, 9(2), 431-438

[47] Goldman W.M., An exposition of results of Fricke and Vogt, preprint available at http://arxiv.org/abs/math/0402103

[48] Gordeev N., Rehmann U., On multicommutators for simple algebraic groups, J. Algebra, 2001, 245(1), 275-296

[49] Gordon S.R., Associators in simple algebras, Pacific J. Math., 1974, 51(1), 131-141

[50] Gowers W.T., Quasirandom groups, Combin. Probab. Comput., 2008, 17(3), 363-387

[51] Grunewald F., Kunyavskiï B., Nikolova D., Plotkin E., Two-variable identities in groups and Lie algebras, J. Math. Sci. (N.Y.), 2003, 116(1), 2972-2981

[52] Grunewald F., Kunyavskiï B., Plotkin E., Characterization of solvable groups and solvable radical, Internat. J. Algebra Comput., 2013, 23(5), 1011-1062

[53] Guralnick R., Malle G., Products of conjugacy classes and fixed point spaces, J. Amer. Math. Soc., 2012, 25(1), 77-121

[54] Guralnick R.M., Tiep P.H., Cross characteristic representations of even characteristic symplectic groups, Trans. Amer. Math. Soc., 2004, 356(12), 4969-5023

[55] Guralnick R.M., Tiep P.H., The Waring problem for finite quasisimple groups. II, preprint available at http://arxiv.org/abs/1302.0333

[56] Horowitz R.D., Characters of free groups represented in the two-dimensional special linear group, Comm. Pure Appl. Math., 1972, 25(6), 635-649

[57] Hrushovski E., The elementary theory of the Frobenius automorphisms, preprint available at http://arxiv.org/abs/math.LO/0406514/

[58] Humphreys J.E., Modular Representations of Finite Groups of Lie Type, London Math. Soc. Lecture Note Ser., 326, Cambridge University Press, Cambridge, 2006

[59] Huppert B., Blackburn N., Finite Groups, III, Grundlehren Math. Wiss., 243, Springer, Berlin-Heidelberg-New York, 1982

[60] Jambor S., Liebeck M.W., O'Brien E.A., Some word maps that are non-surjective on infinitely many finite simple groups, Bull. Lond. Math. Soc., 2013, 45(5), 907-910

[61] Kanel-Belov A., Kunyavskiï B., Plotkin E., Word equations in simple groups and polynomial equations in simple algebras, Vestnik St. Petersburg Univ. Math., 2013, 46(1), 3-13

[62] Kanel-Belov A., Malev S., Rowen L., The images of non-commutative polynomials evaluated on 2 × 2 matrices, Proc. Amer. Math. Soc., 2012, 140(2), 465-478

[63] Kantor W.M., Lubotzky A., The probability of generating a finite classical group, Geom. Dedicata, 1990, 36(1), 67-87

[64] Kapovich I., Mapping tori of endomorphisms of free groups, Comm. Algebra, 2000, 28(6), 2895-2917

[65] Kapovich I., Schupp P.E., Random quotients of the modular group are rigid and essentially incompressible, J. Reine Angew. Math., 2009, 628, 91-119

[66] Kassabov M., Nikolov N., Words with few values in finite simple groups, Quart. J. Math. (in press), DOI: 10.1093/qmath/has018

[67] Lang S., Weil A., Number of points of varieties in finite fields, Amer. J. Math., 1954, 76(4), 819-827

[68] Larsen M., Word maps have large image, Israel J. Math., 2004, 139, 149-156

[69] Larsen M.J., Pink R., Finite subgroups of algebraic groups, J. Amer. Math. Soc., 2011, 24(4), 1105-1158

[70] Larsen M., Shalev A., Characters of symmetric groups: sharp bounds and applications, Invent. Math., 2008, 174(3), 645-687

[71] Larsen M., Shalev A., Word maps and Waring type problems, J. Amer. Math. Soc., 2009, 22(2), 437-466

[72] Larsen M., Shalev A., Fibers of word maps and some applications, J. Algebra, 2012, 354, 36-48

[73] Larsen M., Shalev A., Tiep P.H., The Waring problem for finite simple groups, Ann. of Math., 2011, 174(3), 1885-1950

[74] Larsen M., Shalev A., Tiep P.H., Waring problem for finite quasisimple groups, Int. Math. Res. Not. (IMRN), 2013, 10, 2323-2348

[75] Levy M., Word maps with small image in simple groups, preprint available at http://arxiv.org/abs/1206.1206

[76] Levy M., Word maps with small image in almost simple groups and quasisimple groups, preprint available at http://arxiv.org/abs/1301.7188

[77] Lidl R., Mullen G.L., Turnwald G., Dickson Polynomials, Pitman Monogr. Surveys Pure Appl. Math., 65, Longman Scientific & Technical, Harlow, 1993

[78] Lidl R., Niederreiter H., Finite Fields, Encyclopedia Math. Appl., 20, Addison-Wesley, Reading, 1983

[79] Liebeck M.W., O'Brien E.A., Shalev A., Tiep P.H., The Ore conjecture, J. Eur. Math. Soc. (JEMS), 2010, 12(4), 939-1008

[80] Liebeck M.W., O'Brien E.A., Shalev A., Tiep P.H., Commutators in finite quasisimple groups, Bull. Lond. Math. Soc., 2011, 43(6), 1079-1092

[81] Liebeck M.W., O'Brien E.A., Shalev A., Tiep P.H., Products of squares in finite simple groups, Proc. Amer. Math. Soc., 2012, 140(1), 21-33

[82] Liebeck M.W., Shalev A., The probability of generating a finite simple group, Geom. Dedicata, 1995, 56(1), 103-113

[83] Liebeck M.W., Shalev A., Diameters of finite simple groups: sharp bounds and applications, Ann. of Math., 2001, 154(2), 383-406

[84] Liebeck M.W., Shalev A., Fuchsian groups, finite simple groups, and representation varieties, Invent. Math., 2005, 159(2), 317-367

[85] Lubotzky A., Images of word maps in finite simple groups, Glasg. Math. J. (in press), DOI: 10.1017/S0017089513000396

[86] Lyndon R.C., Words and infinite permutations, In: Mots, Lang. Raison. Calc., Hermès, Paris, 1990, 143-152

[87] Macbeath A.M., Generators of the linear fractional groups, In: Number Theory, Houston, 1967, American Mathematical Society, Providence, 1969, 14-32

[88] Macpherson D., Tent K., Pseudofinite groups with NIP theory and definability in finite simple groups, In: Groups and Model Theory, Mülheim an der Ruhr, May 30-June 3, 2011, Contemp. Math., 576, American Mathematical Society, Providence, 2012, 255-267

[89] Magnus W., Rings of Fricke characters and automorphisms groups of free groups, Math. Z., 1980, 170(1), 91-102

[90] Magnus W., The uses of 2 by 2 matrices in combinatorial group theory. A survey, Resultate Math., 1981, 4(2), 171-192

[91] Manin Yu.I., Cubic Forms, North-Holland Math. Library, 4, North-Holland, Amsterdam, 1986

[92] Maroli J.A., Representation of tree permutations by words, Proc. Amer. Math. Soc., 1990, 110(4), 859-869

[93] Martinez C., Zelmanov E., Products of powers in finite simple groups, Israel J. Math., 1996, 96(2), 469-479

[94] Myasnikov A., Nikolaev A., Verbal subgroups of hyperbolic groups have infinite width, preprint available at http://arxiv.org/abs/1107.3719

[95] Myasnikov A.G., Shpilrain V., Automorphic orbits in free groups, J. Algebra, 2003, 269(1), 18-27

[96] Najib S., Une généralisation de l'inégalité de Stein-Lorenzini, J. Algebra, 2005, 292(2), 566-573

[97] Nikolov N., Algebraic properties of profinite groups, preprint available at http://arxiv.org/abs/1108.5130

[98] Nikolov N., Pyber L., Product decompositions of quasirandom groups and a Jordan type theorem, J. Eur. Math. Soc. (JEMS), 2011, 13(4), 1063-1077

[99] Nikolov N., Segal D., A characterization of finite soluble groups, Bull. Lond. Math. Soc., 2007, 39(2), 209-213

[100] Nikolov N., Segal D., Powers in finite groups, Groups Geom. Dyn., 2011, 5(2), 501-507

[101] Nikolov N., Segal D., Generators and commutators in finite groups; abstract quotients of compact groups, Invent. Math., 2012, 190(3), 513-602

[102] Ore O., Some remarks on commutators, Proc. Amer. Math. Soc., 1951, 2(2), 307-314

[103] Platonov V.P., Linear groups with identical relations, Dokl. Akad. Nauk BSSR, 1967, 11, 581-582 (in Russian)

[104] Puder D., Primitive words, free factors and measure preservation, Israel J. Math., 2014 (in press), DOI: 10.1007/s11856-013-0055-2

[105] Puder D., Parzanchevski O., Measure preserving words are primitive, preprint available at http://arxiv.org/abs/1202.3269

[106] Razmyslov Yu.P., A certain problem of Kaplansky, Math. USSR Izv., 1973, 7(3), 479-496

[107] Ribnere E., Sequences of words characterizing finite solvable groups, Monatsh. Math., 2009, 157(4), 387-401

[108] Rosset M., Rosset S., Elements of trace zero that are not commutators, Comm. Algebra, 2000, 28(6), 3059-3072

[109] Saxl J., Wilson J.S., A note on powers in simple groups, Math. Proc. Cambridge Philos. Soc., 1997, 122(1), 91-94

[110] Schul G., Shalev A., Words and mixing times in finite simple groups, Groups Geom. Dyn., 2011, 5(2), 509-527

[111] Segal D., Words: Notes on Verbal Width in Groups, London Math. Soc. Lecture Note Ser., 361, Cambridge University Press, Cambridge, 2009

[112] Serre J.-P., Le groupe de Cremona et ses sous-groupes finis, In: Séminaire Bourbaki, 2008/2009 (997-1011), Astérisque, 2010, 332(1000), 75-100

[113] Shalev A., Commutators, words, conjugacy classes and character methods, Turkish J. Math., 2007, 31(Suppl.), 131-148

[114] Shalev A., Word maps, conjugacy classes, and a noncommutative Waring-type theorem, Ann. of Math., 2009, 170(3), 1383-1416

[115] Shalev A., Applications of some zeta functions in group theory, In: Zeta Functions in Algebra and Geometry, Palma de Mallorca, May 3-7, 2010, Contemp. Math., 566, American Mathematical Society, Providence, 2012, 331-344

[116] Slusky M., Zeros of 2 × 2 matrix polynomials, Comm. Algebra, 2010, 38(11), 4212-4223

[117] Suzuki M., On a class of doubly transitive groups, Ann. of Math., 1962, 75(1), 105-145

[118] Thom A., Convergent sequences in discrete groups, Canad. Math. Bull., 2013, 56(2), 424-433

[119] Thompson J.G., Nonsolvable finite groups all of whose local subgroups are solvable, Bull. Amer. Math. Soc., 1968, 74(3), 383-437

[120] Thompson R.C., Commutators in the special and general linear groups, Trans. Amer. Math. Soc., 1961, 101(1), 16-33

[121] Tiep P.H., Zalesskii A.E., Some characterizations of the Weil representations of the symplectic and unitary groups, J. Algebra, 1997, 192(1), 130-165

[122] Tits J., Free subgroups in linear groups, J. Algebra, 1972, 20(2), 250-270

[123] Varshavsky Ya., Lefschetz-Verdier trace formula and a generalization of a theorem of Fujiwara, Geom. Funct. Anal., 2007, 17(1), 271-319

[124] Vogt H., Sur les invariants fondamentaux des équations différentielles linéaires du second ordre, Ann. Sci. École Norm. Supér., 1889, 6(Suppl.), 3-70

[125] Wan D., A p-adic lifting and its application to permutation polynomials, In: Finite Fields, Coding Theory, and Advances in Communications and Computing, Las Vegas, August 7-10, 1991, Lecture Notes in Pure and Appl. Math., 141, Marcel Dekker, New York, 1993, 209-216

[126] Wilson J.S., Characterization of the soluble radical by a sequence of words, J. Algebra, 2011, 326, 286-289

[127] Zelmanov E.I., On the restricted Burnside problem, In: Proceedings of the International Congress of Mathematicians, Kyoto, August 21-29, 1990, Mathematical Society of Japan, Tokyo, 1991, 395-402

[128] Zorn M., Nilpotency of finite groups, Bull. Amer. Math. Soc., 1936, 42(7), 485-486