PROCEEDINGS OF THE ROYAL SOCIETY A

MATHEMATICAL, PHYSICAL & ENGINEERING SCIENCES

Matrix Theory of Correlations in a Lattice. Part I

R. Eisenschitz

Proc. R. Soc. Lond. A 1944 182, 244-259 doi: 10.1098/rspa.1944.0002

This journal is © 1944 The Royal Society

Matrix theory of correlations in a lattice. Part I

By R. Eisenschitz

(Communicated by Sir Robert Robertson, F.R.S. Received 5 March 1942; revised 25 May 1943)

The statistical mechanics of some crystalline systems may be reduced to statistical correlations between objects which are the unit cells of a fictitious lattice. The correlations are deduced from postulates according to which some configurations of the cells are incompatible with some configurations of the neighbouring cells; if, on the other hand, configurations of neighbours are compatible with each other, their probabilities are to combine by multiplication. By these postulates matrices are implicitly defined such that the probability distribution for a chain of cells is found by forming the powers of a matrix. A similar approach to the statistics of a lattice involves infinite matrices. It does not seem practicable to give explicit expressions for these matrices. If appropriate conditions are complied with, the correlations in a chain are accounted for by adjusting the mean probability coefficients of the cells and for the rest regarding the cells as statistically independent. In this case the infinite matrices may be replaced by the outer power of finite matrices. As a result an equation is given by means of which the thermodynamical energy may be calculated as a function of temperature.

General introduction

The statistical mechanics of systems involving correlations becomes increasingly important in the theory of solids and liquids. The rigorous treatment of the theory presents considerable difficulties. It seems advisable to treat the matter in such a way as to separate the mathematical formalism from an application of the theory to a concrete physical system. This is the course adopted here.

Part I of the paper deals with the mathematical aspect and in Part II the theory is given for a physical system.

1. Introduction to Part I

The present paper deals with the statistical mechanics of crystals. The forces in the crystal are assumed to be of such a kind that one atom interacts with only a finite number of other atoms in the lattice. A further assumption is made on the number of configurations of the crystal; while this number is infinite in an infinite crystal, it is assumed that its ratio to the number of lattice points is finite. These assumptions are in fact never strictly realized; they are, however, frequently used as approximations. In mixed crystals, for instance, the partition function may adequately be separated into a vibrational and configurational factor, and the energy of configuration may be assumed as equal to the sum of bond energies (these are not necessarily restricted to bonds of pairs or interaction of nearest neighbours). Some fair approximation may be expected in crystals with rotating molecules by substituting a finite number for the infinity of molecular orientations. In ferromagnetism a rough approximation may be obtained by considering the spin interaction of


nearest neighbours and neglecting the resonance of states with equal magnetic moment.

The theory of the above systems involves statistical correlations between different atoms in the lattice which may extend over long distances.

In this paper a theory is given which applies to a large section of these correlation problems. It is presented without reference to any special physical system. Lassettre & Howe (1941) and Montroll (1941) have made important contributions to the theory. The main result of this paper is to show how the statistics of the crystal can be reduced to a statistics of independent systems. This has apparently not yet been anticipated generally but has been recognized by the author (1941) in a special case.

2. Mathematical groundwork

A few results of the theory of probability required in this investigation are obtained separately in this section of the paper.

Consider now objects which may occupy configurations $c_1, \dots, c_\lambda, \dots, c_\Lambda$ with the (not necessarily normalized) probability coefficients $p_1, \dots, p_\lambda, \dots, p_\Lambda$. In a collection of these objects, configurations are specified by the configurations $c_\lambda$ of the particular objects. In a configuration of the collection the 'populations of the configurations $c_\lambda$' are defined as the numbers of objects which occupy the configurations $c_\lambda$. The objects are said to be statistically independent if they occupy their configurations with a probability that is independent of the configurations which the other objects of the collection may occupy. Otherwise there are 'statistical correlations' in the collection. A collection of statistically independent objects will be called an 'assembly'; a collection of statistically dependent objects is to be spoken of as an 'aggregate'.

There are well-known theorems on probability which hold for any assembly but not generally for aggregates. In an assembly of independent aggregates relations may be expected to hold which are in some way independent of the special kind of correlation within the aggregates. Some results of this kind will be obtained in the following.

(a) First consider the statistics of independent objects. In an assembly of $n$ objects let the populations of the configurations be denoted by $nr_1, \dots, nr_\lambda, \dots, nr_\Lambda$; the $r_\lambda$ are positive fractions subject to the condition $\sum_\lambda r_\lambda = 1$. All configurations of the assembly which are not distinct by their populations $nr_\lambda$ have equal probability. The total probability of the configurations with assigned populations is equal to a term in the expansion of

$F(n, p) = (p_1 + p_2 + \dots + p_\Lambda)^n.$ (1a)

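Regarding $F(n, p)$ as a function of $p_1, p_2, \dots, p_\Lambda$, it will be spoken of as the 'function of distribution'. Denoting the terms in the expansion by

$w(nr_1, nr_2, \dots, nr_\Lambda; p_1, p_2, \dots, p_\Lambda) = \omega(nr_1, nr_2, \dots, nr_\Lambda)\, p_1^{nr_1} p_2^{nr_2} \cdots p_\Lambda^{nr_\Lambda},$

briefly by $w(nr, p) = \omega(nr) \prod_\lambda p_\lambda^{nr_\lambda}$, it is seen that $\omega(nr) = n!/\prod_\lambda (nr_\lambda)!$.

In the limit $n \to \infty$ the polynomial (1a) has a greatest term

$n! \prod_\lambda p_\lambda^{n\bar r_\lambda} \big/ \prod_\lambda (n\bar r_\lambda)!,$

where $\bar r_\lambda = p_\lambda / \sum_\lambda p_\lambda$, in comparison with which the remaining terms may be neglected. In this limit

$w(nr, p) \propto \exp\{-\tfrac12 n \sum_\lambda (r_\lambda - \bar r_\lambda)^2/\bar r_\lambda\}.$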
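The position of the greatest term can be checked numerically in a small case. The following Python sketch is an illustration only: the three probability coefficients and the value $n = 60$ are assumptions chosen for the example and do not come from the paper. It searches all terms of the expansion (1a) by brute force and compares the populations of the largest term with $n p_\lambda / \sum_\lambda p_\lambda$.

```python
import math

def log_term(counts, p):
    """Logarithm of one term of (p_1 + ... + p_L)^n: n!/prod(n_i!) * prod(p_i**n_i)."""
    n = sum(counts)
    value = math.lgamma(n + 1)
    for c, p_i in zip(counts, p):
        value += c * math.log(p_i) - math.lgamma(c + 1)
    return value

def greatest_term(n, p):
    """Populations of the largest term, by exhaustive search over three configurations."""
    best_counts, best_value = None, -math.inf
    for n1 in range(n + 1):
        for n2 in range(n + 1 - n1):
            counts = (n1, n2, n - n1 - n2)
            value = log_term(counts, p)
            if value > best_value:
                best_counts, best_value = counts, value
    return best_counts

p = (1.0, 2.0, 3.0)      # assumed, unnormalized probability coefficients
n = 60                   # assumed number of independent objects
counts = greatest_term(n, p)
predicted = tuple(round(n * p_i / sum(p)) for p_i in p)
print(counts, predicted)
```

Logarithms are used because the terms themselves quickly exceed the floating-point range; the comparison between terms is unaffected.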

(b) In an aggregate of $k$ objects the function of distribution no longer complies with equation (1a) but is to be written

$F(k, p) = \sum_{s_1} \sum_{s_2} \cdots \sum_{s_\Lambda} \sum_\tau \omega_\tau(ks_1, \dots, ks_\Lambda)\, p_1^{ks_1} p_2^{ks_2} \cdots p_\Lambda^{ks_\Lambda},$ (1b)

where $ks_\lambda$ is the population of the configuration $c_\lambda$, the suffix $\tau$ refers to all possible permutations of objects occupying different configurations, and the $\omega_\tau(ks_1, \dots, ks_\Lambda)$ are arbitrary non-negative numbers. Of course $\sum_\lambda s_\lambda = 1$. The average populations $k\bar s_\lambda$ are

$k\bar s_\lambda = p_\lambda\, \partial \log F(k, p)/\partial p_\lambda.$ (2)

The mean probability with which one object in the aggregate occupies the configuration $c_\lambda$ is different from $p_\lambda$ and equal to $p'_\lambda$, and

$\bar s_\lambda = p'_\lambda \big/ \sum_\lambda p'_\lambda.$ (3)

Now consider an assembly of $m$ independent aggregates containing $mk = n$ objects. The populations in the assembly are denoted by $nr_\lambda$; the coefficients of total probability for the configurations with assigned populations are denoted by

$w(nr_1, \dots, nr_\Lambda; p_1, \dots, p_\Lambda).$

In the limit $m \to \infty$ these probabilities are given by the same expression as applies to an assembly of independent objects, namely,

$w(nr, p) = \exp\{-\tfrac12 n \sum_\lambda (r_\lambda - \bar r_\lambda)^2/\bar r_\lambda\},$ (4)

where obviously $\bar r_\lambda = \bar s_\lambda$. (5)

In order to prove this, the function of distribution for the aggregate is written $F(k, p) = \sum_\kappa P_\kappa$, $\kappa = 1, \dots, K$, where the configurations of the aggregate are enumerated by the suffix $\kappa$. The populations are denoted by $ks_{\kappa\lambda}$, with $\sum_\lambda s_{\kappa\lambda} = 1$.

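The function of distribution of the assembly is

$F(n, p) = [F(k, p)]^m$ (6)

$F(n, p) = \sum_{l_1} \cdots \sum_{l_K} m! \prod_\kappa P_\kappa^{l_\kappa} \big/ \prod_\kappa (l_\kappa!),$ (6a)

where $\sum_\kappa l_\kappa = m$. In the limit

$F(n, p) = \sum_{l_1} \cdots \sum_{l_K} \exp\{-\tfrac12 \sum_\kappa (l_\kappa - \bar l_\kappa)^2/\bar l_\kappa\}.$ (6b)

The numbers $r_\lambda$ and $l_\kappa$ are connected by the equations

$\sum_\kappa l_\kappa s_{\kappa\lambda} = m r_\lambda.$ (7)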

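The sums in equation (6b) may be divided into partial sums in each of which all terms have equal populations $nr_1, nr_2, \dots, nr_\Lambda$; every partial sum is the function of distribution for the configurations having these populations. This function of distribution may be alternatively expressed by means of $\Lambda$ parameters $\beta_1, \beta_2, \dots, \beta_\Lambda$:

$\sum_{l_1} \cdots \sum_{l_K} \exp\{-\tfrac12 \sum_\kappa (l_\kappa - \bar l_\kappa)^2/\bar l_\kappa + \sum_\kappa \sum_\lambda l_\kappa s_{\kappa\lambda} \beta_\lambda\}.$

Rearranging this expression into

$\sum_{l_1} \cdots \sum_{l_K} \exp\{-\tfrac12 \sum_\kappa [l_\kappa - \bar l_\kappa(1 + \sum_\lambda s_{\kappa\lambda}\beta_\lambda)]^2/\bar l_\kappa + \tfrac12 \sum_\kappa \bar l_\kappa[(1 + \sum_\lambda s_{\kappa\lambda}\beta_\lambda)^2 - 1]\},$

it is seen to approach asymptotically its greatest term, in which $l_\kappa = \tilde l_\kappa$ and where

$\tilde l_\kappa = \bar l_\kappa (1 + \sum_\lambda s_{\kappa\lambda}\beta_\lambda).$ (8)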

It is accordingly allowed to substitute on the right of (6b) the greatest term for each partial sum so that

$(\sum_\kappa P_\kappa)^m \sim \sum_{\tilde l_1} \cdots \sum_{\tilde l_K} \exp\{-\tfrac12 \sum_\kappa (\tilde l_\kappa - \bar l_\kappa)^2/\bar l_\kappa\}.$

In the latter sum every term is equal to the total probability of the configurations of the assembly with assigned values of the numbers $\tilde l_\kappa$. By putting $l_\kappa = \tilde l_\kappa$ in equations (7) and noting that

[equation illegible in the source]

(which equation is derived from (8)), $\Lambda$ independent linear equations are provided for calculating the parameters $\beta_\lambda$. It follows that these parameters and accordingly the numbers $\tilde l_\kappa$ are linear functions of the numbers $r_\lambda$ and

$(\sum_\kappa P_\kappa)^m = \sum_{r_1} \sum_{r_2} \cdots \sum_{r_\Lambda} \exp\{-\tfrac12 Q(r_\lambda)\},$

where $Q(r_\lambda)$ is a polynomial of second degree in the variables $r_\lambda$. The summation is carried out over those systems of values of $r_1, r_2, \dots, r_\Lambda$ to which a non-negligible probability corresponds.

Any function of distribution of this type contains no other constants than those which are functions of the moments of the first and second order, namely, $\bar r_\lambda$, $\overline{r_\lambda^2}$, $\overline{(r_\lambda r_{\lambda'})}$. The linear moments are proportional to the mean probability coefficients of one object as given in equations (5) and (3). The moments of the second order depend similarly upon the mean probability coefficients of pairs of objects. Denoting by $p_{\lambda\lambda}$ the mean probability coefficient for two objects to occupy simultaneously the configuration $c_\lambda$, then

$p_{\lambda\lambda} = \overline{r_\lambda (n r_\lambda - 1)}/(n - 1).$

Denoting by $p_{\lambda\lambda'}$ the mean probability coefficient for one object to occupy the configuration $c_\lambda$ and another object to occupy simultaneously the configuration $c_{\lambda'}$, then

$p_{\lambda\lambda'} = 2n\,\overline{(r_\lambda r_{\lambda'})}/(n - 1).$

The probabilities $p_{\lambda\lambda}$ and $p_{\lambda\lambda'}$ are averages over all pairs of objects. There are $n(n-1)/2$ pairs. The number of those pairs the objects of which are contained in different aggregates is $m(m-1)k^2/2$, which is asymptotically equal to the former number. The overwhelming majority are accordingly pairs of statistically independent objects. The moments of the second order and the function $Q(r_\lambda)$ are therefore equal to the corresponding quantities in an assembly of independent objects which occupy the configurations $c_\lambda$ with the probability $p'_\lambda$; $Q(r_\lambda) = n \sum_\lambda (r_\lambda - \bar r_\lambda)^2/\bar r_\lambda$, as asserted in equation (4).
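The asymptotic equality of the two pair counts is easily checked by direct arithmetic. The short Python sketch below is illustrative; the aggregate size $k = 10$ and the values of $m$ are assumptions made for the example. It evaluates the fraction of all $n(n-1)/2$ pairs whose two objects lie in different aggregates.

```python
def cross_pair_fraction(m, k):
    """Fraction of all pairs of objects whose members lie in different aggregates."""
    n = m * k
    all_pairs = n * (n - 1) // 2
    cross_pairs = m * (m - 1) * k * k // 2   # choose two aggregates, one object from each
    return cross_pairs / all_pairs

k = 10   # assumed number of objects per aggregate
for m in (2, 10, 100, 10_000):
    print(m, cross_pair_fraction(m, k))
```

The fraction equals $(m-1)k/(mk-1)$ and tends to 1 as $m$ increases at fixed $k$.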

(c) The configurations $c_\lambda$ may be divided into classes to be denoted by $G_\mu$; populations of the classes $G_\mu$ may be defined by the sum of populations of all those configurations $c_\lambda$ which belong to the class $G_\mu$. The populations of the classes $G_\mu$ are denoted by $kS_\mu$; $S_\mu$ is the sum of some $s_\lambda$ and thus

$S_\mu = \sum_\lambda s_{\lambda(\mu)},$ (9)

where the suffix in brackets indicates the class $G_\mu$ to which the corresponding configuration $c_\lambda$ belongs.

The total probability for the configurations of the aggregate with assigned values of the $S_\mu$ (the function of distribution for these configurations) is denoted by $W(kS_\mu, p)$ and

$W(kS_\mu, p) = \sum_{s_{1(\mu)}} \sum_{s_{2(\mu)}} \cdots \sum_{s_{\Lambda(\mu)}} w(ks_{1(\mu)}, \dots, ks_{\Lambda(\mu)}; p).$

The sum is to be taken over all those values of $s_{1(\mu)}, \dots, s_{\Lambda(\mu)}$ which satisfy equation (9) for the assigned values of the $S_\mu$.

In an assembly of $m$ independent aggregates the populations of the classes $G_\mu$ are denoted by $nR_\mu$, $n = mk$. The function of distribution for the configurations with assigned populations of the classes $G_\mu$ is denoted by $W(nR_\mu, p)$. Obviously

$F(k, p) = \sum_{S_\mu} W(kS_\mu, p)$ and $F(n, p) = \sum_{R_\mu} W(nR_\mu, p),$

and according to equation (6)

$\sum_{R_\mu} W(nR_\mu, p) = [\sum_{S_\mu} W(kS_\mu, p)]^m.$ (10)

One term on the left of (10) is equal to a partial sum of terms on the right. By substituting for the partial sum its greatest term, $W(nR_\mu, p)$ is obtained as a product of powers of functions $W(kS_\mu, p)$ multiplied by a factor which is independent of $p_1, p_2, \dots, p_\Lambda$ but may depend upon $R_1, R_2, \dots, R_\mu, \dots$. If there is in $W(kS_\mu, p)$ one term in comparison with which the remaining terms may be neglected, the product of powers is reduced to $[W(kS_\mu, p)]^m$ with $S_\mu = R_\mu$.

(d) By assigning to every object in an aggregate a class $G_\mu$ such that the object may occupy no other configuration than those belonging to $G_\mu$, a set of configurations of the aggregate is selected. In this set all configurations have the same populations of the classes $G_1, G_2, \dots, G_\mu, \dots$ and therefore are specified by the same values of the numbers $S_1, S_2, \dots, S_\mu, \dots$. By permuting the objects to which different classes $G_\mu$ are assigned, another set of configurations is obtained in which the numbers $S_\mu$ necessarily have the same values as in the first set. There are $k!/\prod_\mu (kS_\mu)!$ different sets in which the numbers $S_1, S_2, \dots, S_\mu, \dots$ have the same values.
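The number of sets quoted above can be confirmed by enumeration in a small case. The Python sketch below is illustrative; the class populations (3, 2, 1) for an aggregate of six objects are an assumption made for the example. It compares $k!/\prod_\mu (kS_\mu)!$ with the number of distinct ways of assigning the class labels to the objects.

```python
import math
from itertools import permutations

def count_sets(class_populations):
    """Number of distinct assignments of classes to objects: k!/prod((k*S_mu)!)."""
    k = sum(class_populations)
    count = math.factorial(k)
    for population in class_populations:
        count //= math.factorial(population)
    return count

def enumerate_sets(class_populations):
    """Distinct orderings of the class labels over the objects, found by brute force."""
    labels = [mu for mu, population in enumerate(class_populations)
              for _ in range(population)]
    return set(permutations(labels))

populations = (3, 2, 1)   # assumed values of k*S_mu for k = 6 objects and three classes
print(count_sets(populations), len(enumerate_sets(populations)))
```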

The function of distribution of a set depends necessarily upon the numbers $S_1, S_2, \dots, S_\mu, \dots$ and possibly varies with the permutations. It is accordingly denoted by $v_\sigma(kS_\mu, p)$, the suffix $\sigma$ ranging from 1 to $k!/\prod_\mu (kS_\mu)!$. There may be sets for which this function is equal to 0. Obviously

$\sum_\sigma v_\sigma(kS_\mu, p) = W(kS_\mu, p).$ (11a)

In an assembly of independent aggregates similar sets of configurations are selected by assigning a class $G_\mu$ to every object. Their function of distribution is denoted by $v_\rho(nR_\mu, p)$, the suffix $\rho$ ranging from 1 to $n!/\prod_\mu (nR_\mu)!$. Obviously

$\sum_\rho v_\rho(nR_\mu, p) = W(nR_\mu, p).$ (11b)

Given the values of $R_1, R_2, \dots, R_\mu, \dots$, then those corresponding functions $v_\rho(nR_\mu, p)$, which are different from 0 and not negligibly small, approach asymptotically ($\lim m \to \infty$) equality among themselves and with their average, which is proportional to $W(nR_\mu, p)$.

In proving this I assume for the sake of argument that, given the values of $S_1, S_2, \dots, S_\mu, \dots$, the functions $v_\sigma(kS_\mu, p)$ are distinct from each other; otherwise the assertion would be trivial.

It follows from equations (10), (11a), (11b) that every function $v_\rho(nR_\mu, p)$ is a sum of products of powers of functions $v_\sigma(kS_\mu, p)$. In order to prove the assertion it is sufficient to show that the sum of power products is asymptotically reduced to one term. Assuming at first that

$W(nR_\mu, p) \approx \gamma\,[W(kS_\mu, p)]^m,$ (12)

where $\gamma$ does not depend upon the $p_\lambda$ but possibly upon the $R_\mu$ and where $R_\mu = S_\mu$, then

$W(nR_\mu, p) = \gamma\,[\sum_\sigma v_\sigma(kS_\mu, p)]^m.$ (13)

Every term in the expansion of the right is equal to a sum of equal functions $v_\rho(nR_\mu, p)$ and on the other hand a product of high powers of functions $v_\sigma(kS_\mu, p)$. The expansion is asymptotically equal to one of its terms. A similar argument applies if the right of equation (12) is a product of powers of the $W(kS_\mu, p)$.

The factor $\gamma$ depends upon the number of functions $v_\rho(nR_\mu, p)$ which deviate appreciably from the average. $\gamma$ is to be regarded as an unknown continuous function of the numbers $R_\mu$.

With regard to those functions $v_\rho(nR_\mu, p)$ which are not negligibly small, it is found finally that

$v_\rho(nR_\mu, p) = \alpha\, W(nR_\mu, p),$ (14)

where $\alpha$ is an unknown continuous function of the $R_\mu$. Equations (4) and (14) are part of the necessary foundations of the subsequent calculations.

3. Formulation of the statistical problem

Since every atom interacts with only a finite number of other atoms, it is possible to divide the lattice into equal domains of such a kind that every atom interacts only with those atoms which are contained in the same domain. These domains overlap, so that a lattice point is generally contained in more than one domain. From a geometrical point of view the lattice may be said to be generated by a group of three displacements of the unit cells or of the domains.

In a lattice in which there is interaction between nearest neighbours only, the number of atoms divided by the number of domains in the crystal is equal to the number of atoms in the unit cell; the atoms which are sited on the boundary of the unit cells belong to at least two domains between which they are shared. If atomic interaction extends further, the above ratio is larger than the number of atoms in the unit cell and the domains overlap to greater depth.

The configurations of the domain are specified by the configurations of the lattice points. The latter are specified by the kind of atom which occupies the lattice point, the internal state of the atom, its displacement from equilibrium position, etc. Since the configuration of every lattice point enters into the specification of the configuration of the domain, I shall speak of the 'contribution' of the lattice point to the configuration of the domain, although no quantity expressible in numbers is contributed. The energy of the domain is, on the other hand, equal to the sum of contributions of energy made by atoms and bonds between atoms. In defining the energy of the domain each of these contributions is divided by the number of domains between which the atoms or bonds, respectively, are shared. The energy of the lattice is consequently obtained as the sum of energies of domains.

Mutually overlapping domains may not occupy their configurations independently of each other, for each lattice point which is shared by the domains makes the same contribution whether it is regarded to belong to one domain or the other.

In any pair of domains there are lattice points corresponding to each other, such that the set of distance vectors from one point in one domain to the remaining points of the same domain is equal to the set of distance vectors from the corresponding point in the other domain to the remaining points of the other domain. The distance between corresponding points in different domains defines the distance between the domains. It is possible to select from those domains, with which one particular domain overlaps, six distinct domains in three non-coplanar directions with the three smallest distances. These six domains are said to be 'neighbours' of the original domain. A frame of rectilinear co-ordinates x, y, z is defined such that the axes are parallel to the three vectors joining neighbours. To every domain three integer co-ordinates are assigned in relation to these axes.

In an isolated domain a probability coefficient is assigned to every configuration. In neighbouring domains some configurations of one neighbour are incompatible with some configurations of the other. If two configurations combine, the probability coefficient of the combination is equal to the product of the two probability coefficients corresponding to either domain; this follows from the additivity of domain energy.

It is possible to link any pair of two domains in the crystal by a series of neighbouring domains. There are accordingly statistical correlations between all domains of the lattice.

The probability coefficients of the configurations of the lattice may accordingly be derived from the probability coefficients of the domains provided the correlations between neighbours are correctly taken into account. In the terminology of § 2 the lattice may be regarded as an aggregate of domains, the statistics of which is determined by correlations between neighbouring domains. The objects of this aggregate are specified by enumerating their configurations individually (i.e. the configurations of the domains) and by establishing which of the objects are neighbours. The correlations between neighbours have to be formulated as ' rules of composition '.

The configurations of the domains may obviously be distinguished from each other and their number is finite; therefore a symbol may be assigned to each of them, namely, $c_1, c_2, \dots, c_\lambda, \dots, c_\Lambda$. Let $E_\lambda$ be the energy of the configuration $c_\lambda$; the corresponding probability coefficient is $p_\lambda = \exp(-E_\lambda/kT) = t^{E_\lambda}$.

Excluding absolute zero of temperature it is assumed that 0 < t < 1.

Every object of which the aggregate is composed represents one particular domain. The object is accordingly labelled by means of the co-ordinates of this domain. The objects are abstractions to which no position in space is assigned. It is nevertheless possible and convenient to speak of 'neighbouring' objects, which represent neighbouring domains, and to consider accordingly the totality of objects as forming a fictitious lattice. This expression is used in the following. The objects of the aggregate are considered to be the unit cells of the fictitious lattice. In order to distinguish them from the unit cells of the crystal lattice, these unit cells are spoken of as 'cells'. In relation to the co-ordinates of the domains we shall use expressions such as 'x-neighbours', 'x-chains', 'xy-layers', etc.

In order to formulate the rules of composition we consider first a pair of neighbouring domains which are represented by cells with the co-ordinates $x, y, z$ and $x+1, y, z$. Whether or not configurations of these cells may combine depends upon the contribution of those lattice points which are shared by the domains. According to the contribution of these lattice points, the configurations are partitioned into classes such that either all configurations or no configurations of the class may combine with any arbitrary configuration of the neighbouring domain. All those configurations of the first domain to which the above lattice points make the same contribution are assembled in a class to be denoted by $B(x)_h$; $h = 1, \dots, \Gamma$. All those configurations of the second domain to which the above lattice points make the same contribution are assembled in a class $A(x)_g$; $g = 1, \dots, \Gamma$. The configurations of the first domain which are contained in one particular class combine with those configurations of the second domain contained in one particular class. The suffices $g$ and $h$ are chosen such that their values are equal for the classes of combining configurations.

Similarly, classes of combining configurations are defined for y-neighbours and z-neighbours.

The configurations $c_\lambda$ are accordingly partitioned in six different ways into classes which are denoted by $A(x)_g$, $B(x)_h$, $A(y)_i$, $B(y)_j$, $A(z)_k$, $B(z)_l$; $g, h = 1, \dots, \Gamma$; $i, j = 1, \dots$; $k, l = 1, \dots, \Theta$. Every $c_\lambda$ is contained in one and only one of the classes $A(x)_g$ and $B(x)_h$ and $A(y)_i$, etc. Considering a cell with the co-ordinates $x, y, z$, those configurations which are contained in the class $B(x)_g$ {$B(y)_i$; $B(z)_k$} combine with all those and no other configurations of the cell at $x+1, y, z$ {$x, y+1, z$; $x, y, z+1$} which are contained in the class $A(x)_g$ {$A(y)_i$; $A(z)_k$}.

If $c_\lambda$ and $c_{\lambda'}$ are combining configurations of a pair of neighbouring cells, the corresponding configuration of the pair (denoted by $c_\lambda c_{\lambda'}$) has the relative probability $p_\lambda p_{\lambda'}$.

These are the premisses from which the probability coefficients of the aggregate of domains are to be deduced.

In dealing with this problem I employ 'functions of distribution' which are sums of relative probabilities taken over all configurations of a cell, chain, layer or the entire fictitious lattice. These functions are distinct from the corresponding partition functions for two reasons: (a) They are functions of the variables $p_\lambda$ whereas the partition function is the function of one variable, namely, the temperature. Similarly, as the partition function provides the mean energy, these functions provide a more detailed information, namely, the mean probabilities with which the domains within a chain, etc., occupy their different configurations. Information of this kind is necessary for calculating the correlations between different chains and layers. The thermodynamic energy is obtained as average energy of the configurations of the domains. (b) The numerical values of the functions of distribution are generally distinct from the numerical values of the partition functions and their ratio may depend upon temperature. The thermodynamic energy cannot be obtained from the function of distribution by applying the usual formula by which that energy is related to the partition function.

4. Application of matrices

The above rule of composition for x-neighbours is readily expressed in terms of matrices. A matrix $D(x)$ of $\Gamma$ rows and columns can be defined such that the matrix element $D(x)_{gh}$ contains the (symbolic) sum of all those $c_\lambda$ which are simultaneously contained in the classes $A(x)_g$ and $B(x)_h$; if there is no $c_\lambda$ which corresponds to a pair of suffices $g$ and $h$, the matrix element $D(x)_{gh}$ is 0. According to the matrix law of multiplication, the matrix elements of $D(x)^2$ are sums of products $c_\lambda c_{\lambda'}$; all products of combining configurations and no other products appear in these matrix elements. By substituting $p_\lambda$ for $c_\lambda$ in $D(x)$, a matrix $G(x)$ is obtained, the elements of which are sums of probability coefficients. The sum of the matrix elements of $G(x)^2$ is equal to the sum of probability coefficients taken over all configurations of a pair of x-neighbours.

Similarly, matrices $D(y)$, $G(y)$ and $D(z)$, $G(z)$ can be defined, in terms of which the rules of composition for y-neighbours and z-neighbours are formulated.

The configurations of an x-chain and their probability coefficients are obviously found by calculating the $n$th power of the matrices $D(x)$ and $G(x)$ respectively. $\sum_{g,h} G(x)^n_{gh}$ is equal to the partition function (function of distribution respectively) of an x-chain.
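The rule that the sum of the elements of $G(x)^n$ collects the probability coefficients of all admissible chain configurations can be checked on a toy example. The Python sketch below is illustrative only: the five cell configurations, their class labels and energies, and the value $t = 0.5$ are assumptions invented for the example. It builds $G(x)$ from the class memberships and compares the matrix result with a direct enumeration of all chains in which each cell combines with its right-hand neighbour.

```python
from itertools import product

# Assumed toy cell: each configuration is (A-class g, B-class h, energy E).
CONFIGS = [(0, 0, 0.0), (0, 1, 1.0), (1, 0, 1.0), (1, 1, 0.0), (1, 1, 2.0)]
GAMMA = 2    # number of classes, i.e. rows and columns of G(x)
T = 0.5      # t = exp(-1/kT), assumed value with 0 < t < 1

def weight(config):
    """Probability coefficient p = t**E of one cell configuration."""
    return T ** config[2]

def build_g(configs):
    """G(x)[g][h] = sum of p over configurations lying in both A(x)_g and B(x)_h."""
    g_matrix = [[0.0] * GAMMA for _ in range(GAMMA)]
    for config in configs:
        g_matrix[config[0]][config[1]] += weight(config)
    return g_matrix

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def matrix_power_sum(g_matrix, n):
    """Sum of all elements of the n-th power of G(x)."""
    power = g_matrix
    for _ in range(n - 1):
        power = matmul(power, g_matrix)
    return sum(map(sum, power))

def brute_force_chain_sum(configs, n):
    """Sum of products of p over all chains whose neighbours combine
    (B-class index of a cell equals A-class index of the next cell)."""
    total = 0.0
    for chain in product(configs, repeat=n):
        if all(left[1] == right[0] for left, right in zip(chain, chain[1:])):
            term = 1.0
            for config in chain:
                term *= weight(config)
            total += term
    return total

G = build_g(CONFIGS)
for n in (2, 4):
    print(n, matrix_power_sum(G, n), brute_force_chain_sum(CONFIGS, n))
```

The enumeration grows exponentially with the chain length, whereas the matrix power requires only $n - 1$ multiplications of $\Gamma \times \Gamma$ matrices.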

An xy-layer is composed of neighbouring x-chains the cells of which combine according to the rule for y-neighbours. The layer may be regarded as a chain the cells of which are x-chains. The rule of composition for y-neighbours implies a rule of composition for neighbouring x-chains by which a matrix $L$ is implicitly defined. As the number of configurations increases with the length of an x-chain, it may be impracticable to enumerate the individual configurations and to find expressions for the matrix elements of $L$. Actually it is not these matrix elements but the sum $\sum_{i,j} L^m_{ij}$ which is required. The evaluation of the sum (which is the function of distribution of the xy-layer) is the central problem of this paper and will be given in §§ 5 and 6. The lattice may similarly be regarded as composed of neighbouring xy-layers; a matrix $L^+$ is implicitly defined by the rule of composition for two neighbouring xy-layers.

In order to employ the powerful methods of matrix algebra it will be assumed that the matrices comply with certain conditions. This involves a restriction of the scope of the theory.

At first it is assumed that the matrices $G(x)$, $G(y)$, $G(z)$ are symmetric matrices. If this condition holds, it may be shown that the matrices $L$ and $L^+$ are also symmetric. There exists accordingly a real orthogonal transformation by which these matrices are transformed to diagonal form.

The significance of this symmetry is conveniently demonstrated by considering those lattice points of a domain which are shared by the right neighbour and those which are shared by the left neighbour (for instance, those represented by the two x-neighbours of a cell). Every lattice point of the first kind is geometrically equivalent to one lattice point of the second kind. By permuting the contributions of geometrically equivalent lattice points any configuration of the domain is transformed into another of its configurations. These two configurations have equal probability coefficients if and only if the corresponding matrix (in this instance the matrix $G(x)$) is symmetric. The symmetry of these matrices is accordingly not derived from crystal symmetry.

It will be assumed that some matrices have the following properties: their highest proper value and the highest proper value of their matrix square are single and the latter is the square of the former. Matrices of this kind will be said to be 'G-matrices' and the proper vector corresponding to the highest proper value will be spoken of as the 'first proper vector'.

Every real symmetric matrix $M$ may be represented in terms of its proper values*

$M_{gh} = \sum_j M_j\, \xi_{jg}\, \xi_{jh},$ (15)

where $M_j$ are the proper values and $\xi_{jg}$ the components of the proper vectors.

If $M$ is a high power of a G-matrix, the sum in equation (15) is reduced to one term. In this case $M$ is equal to a number (i.e. the power of the highest proper value) multiplied by an idempotent matrix.
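The reduction of a high matrix power to a single term can be seen numerically. The Python sketch below is an illustration under stated assumptions: the symmetric $2 \times 2$ matrix and the exponent 60 are arbitrary choices, and the closed-form proper value used is specific to the $2 \times 2$ case. It compares the power, divided by the corresponding power of the highest proper value, with the matrix $\xi_g \xi_h$ formed from the first proper vector, and checks that the latter is idempotent.

```python
import math

G = [[1.0, 0.5], [0.5, 1.25]]   # assumed symmetric 2x2 example

def top_eigenpair(m):
    """Highest proper value and normalized proper vector of a symmetric 2x2 matrix."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    highest = 0.5 * (a + c) + math.sqrt(0.25 * (a - c) ** 2 + b * b)
    vector = (b, highest - a)            # solves (m - highest*I) v = 0 when b != 0
    norm = math.hypot(*vector)
    return highest, (vector[0] / norm, vector[1] / norm)

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matrix_power(m, n):
    result = m
    for _ in range(n - 1):
        result = matmul(result, m)
    return result

n = 60
highest, xi = top_eigenpair(G)
power = matrix_power(G, n)
scaled = [[power[g][h] / highest ** n for h in range(2)] for g in range(2)]
projector = [[xi[g] * xi[h] for h in range(2)] for g in range(2)]
print(scaled)
print(projector)
```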

A matrix in which the sum of matrix elements is equal for all rows is a G-matrix and the components of the first proper vector are equal to each other. (The proof is simple and may be omitted.) If it is assumed that the number of configurations $c_\lambda$ is equal in every row of $D(x)$, it follows that at high temperatures ($t = 1$) the matrix $G(x)$ is a G-matrix and the components of the first proper vector are equal to each other. There is accordingly a range $t < 1$ where $G(x)$ and the matrix $G(x)_{gh}\xi_g\xi_h$ are G-matrices.
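The statement about equal row sums can be illustrated at $t = 1$, where every element of $G(x)$ is simply the number of configurations in the corresponding element of $D(x)$. In the Python sketch below the count matrix is an assumption invented for the example (three configurations in every row), and power iteration is used merely as a convenient numerical device; it is not part of the paper's argument.

```python
# Assumed count matrix: entry [g][h] is the number of configurations lying in
# both A(x)_g and B(x)_h. Every row holds three configurations.
COUNTS = [[2, 1, 0], [1, 1, 1], [0, 1, 2]]

def apply(matrix, vector):
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]

def power_iteration(matrix, steps=200):
    """Approximate the highest proper value and its vector by repeated multiplication."""
    vector = [1.0, 2.0, 3.0]          # arbitrary positive starting vector
    scale = 1.0
    for _ in range(steps):
        image = apply(matrix, vector)
        scale = max(image)
        vector = [component / scale for component in image]
    return scale, vector

row_sum = sum(COUNTS[0])
value, vector = power_iteration(COUNTS)
print(row_sum, value, vector)
```

For this example the highest proper value coincides with the common row sum and the components of the resulting vector are equal.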

The assumption that there are equal numbers of $c_\lambda$ in every row of $D(x)$ is not necessary for the validity of the following calculations. The assumption may hold for special systems of physical importance. It will, however, be assumed that the matrices $G(x)$, $G(y)$, $G(z)$, and some other matrices to be defined in the course of calculations, are G-matrices. By this condition the scope of the theory is once more restricted. In applying the theory to any special system it is necessary to prove that the above conditions are complied with.

5. The statistics of a chain

As far as the energy is concerned, the probability distribution of an x-chain is determined by its partition function† which is equal to $\sum_{g,h} G(x)^n_{gh}$. The configurations of the x-chain have, however, to be specified not only by their energy but also with regard to those configurations of a neighbouring x-chain with which they combine.

* Cf. Temple (1934, p. 15).

† In the case of a chain a function of distribution will be defined which is numerically equal to the partition function at all temperatures.

It is, on the other hand, impracticable to enumerate the individual configurations of an x-chain.

If the x-chain were replaced by a collection of independent cells, the problem would be considerably simplified. It would be sufficient to specify every configuration of the collection by the appropriate populations of cell configurations irrespective of the position of the cells. In a collection of independent cells no distinction is made between two different configurations in which $nr_1, nr_2, \dots, nr_\Lambda$ cells occupy the configurations $c_1, c_2, \dots, c_\Lambda$ respectively. The two configurations combine with two different sets of configurations of their y-neighbours, but these sets have equal functions of distribution.

It will be seen that there is a close similarity between the statistics of an x-chain and of a collection of statistically independent cells so that it is sufficient to specify the configurations of an x-chain by their populations of cell configurations.

In the terminology of § 2, an x-chain is an aggregate of cells. In applying equation (1b) it is noticed that all configurations with the same set of populations of the $c_\lambda$ have the same probability. The function of distribution may accordingly be written without the suffix $\tau$ that labels the permutations:

$F(k, p) = \sum_{s_1} \sum_{s_2} \cdots \sum_{s_\Lambda} \omega(ks_1, ks_2, \dots, ks_\Lambda)\, p_1^{ks_1} p_2^{ks_2} \cdots p_\Lambda^{ks_\Lambda}$

$= \sum_{s_1} \sum_{s_2} \cdots \sum_{s_\Lambda} w(ks_1, ks_2, \dots, ks_\Lambda; p_1, p_2, \dots, p_\Lambda),$

where the probability coefficients $\omega$ are unknown functions of the populations.

The function of distribution is, on the other hand, equal to $\sum_{g,h} G(x)^k_{gh}$ provided that the matrix elements are regarded as functions of $p_1, p_2, \dots, p_\Lambda$. Since $G(x)$ is supposed to be a G-matrix

$\sum_{g,h} G(x)^k_{gh} = (G')^k \sum_{g,h} \xi_g \xi_h,$

where $G'$ is the highest proper value of $G(x)$ and $\xi_g$, $\xi_h$ are the components of the first proper vector. In this function of distribution it is admissible to regard $\sum_{g,h} \xi_g \xi_h$ as virtually constant.

The function of distribution for an assembly of m independent chains is

$$\Bigl[(G')^k \sum_{g,h}\xi_g\xi_h\Bigr]^m$$

and for a chain of km cells $(G')^{km}\sum_{g,h}\xi_g\xi_h$.

The difference between these functions divided by their average approaches 0 if k is large enough. For the purpose of calculating probabilities we may therefore put

$$\lim_{k\to\infty} \sum_{g,h}[G(x)^{km}]_{gh} = \lim_{k\to\infty}\Bigl[\sum_{g,h}[G(x)^k]_{gh}\Bigr]^m. \qquad (17)$$


Comparing (17) with (6), the x-chain is seen to have the statistics of an assembly of independent aggregates.*
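The limiting statement (17) can be checked numerically on a small example. The following is only an illustrative sketch: the 2×2 symmetric matrix `G` and all helper names are assumptions made for the illustration and do not come from the paper. It compares the logarithm of the function of distribution per cell for m independent chains of k cells with that of a single chain of km cells; both tend to log G', and the gap between them shrinks as k grows.

```python
import math

def matmul(a, b):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matpow(a, n):
    """n-th matrix power (n >= 1) by repeated multiplication."""
    result = a
    for _ in range(n - 1):
        result = matmul(result, a)
    return result

def total(a):
    """Sum of all matrix elements, used here as the function of distribution."""
    return sum(sum(row) for row in a)

def per_cell_gap(g, k, m):
    """Log per cell of m independent k-chains minus that of one km-chain."""
    assembly = m * math.log(total(matpow(g, k))) / (k * m)
    chain = math.log(total(matpow(g, k * m))) / (k * m)
    return assembly - chain

# Hypothetical symmetric matrix with positive elements (an assumption).
G = [[1.0, 0.5],
     [0.5, 0.8]]

for k in (10, 50, 200):
    print(k, per_cell_gap(G, k, m=3))
```

The comparison is made on logarithms per cell because the factor $\sum_{g,h}\xi_g\xi_h$, which the text treats as virtually constant, enters the two expressions with different powers.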

According to equation (4) the probability coefficient for the configurations with assigned populations is equal to

$$\omega(n r_1, n r_2, \ldots, n r_\Lambda;\; p_1, p_2, \ldots, p_\Lambda) = \exp\Bigl[-\tfrac{1}{2}\, n \sum_\lambda (r_\lambda - \bar r_\lambda)^2/\bar r_\lambda\Bigr],$$

where n is the number of cells and $n r_\lambda$ are the populations of cell configurations. The values of $\bar r_\lambda$ could be found according to equation (2) by differentiation, but it is more convenient to consider a chain with a 'central' cell to which n and n' cells are joined to the left and right respectively. The function of distribution

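$$\sum_{f,i}\bigl[G(x)^n\, G(x)\, G(x)^{n'}\bigr]_{fi} = \sum_{f,i}\sum_{g,h} \xi_f (G')^n \xi_g\, G(x)_{gh}\, \xi_h (G')^{n'} \xi_i = (G')^{n+n'}\Bigl[\sum_{f,i}\xi_f\xi_i\Bigr]\sum_{g,h}\xi_g\, G(x)_{gh}\, \xi_h$$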

is expanded in terms of the matrix elements $G(x)_{gh}$ representing the central cell. The mean probability of the configuration $c_\lambda$ is accordingly

$$p'_\lambda = p_\lambda \xi_g \xi_h, \qquad (18)$$

where the suffixes g, h correspond to the matrix element $D(x)_{gh}$ in which $c_\lambda$ is contained. According to (3)

Consider next the functions of distribution of certain classes or sets of configurations which have a similar significance for the combination of chains as the classes $A(x)_i$, etc. have for the combination of cells.

In a chain of domains a set of configurations is selected when all lattice points vary their configurations with the exception of those lattice points which are shared by a neighbouring chain. Any arbitrary configuration of the latter chain combines either with all configurations or with no configuration of the set.

Let the chains be represented by x-chains in the fictitious lattice which have the co-ordinates y and y + 1 respectively, and $c_\lambda$ be a configuration of the cell in the first

* It might be objected that G' is an irrational function of the $p_\lambda$ and accordingly $(G')^{km}$ cannot be the distribution function of an assembly of aggregates, which necessarily is a high power of a polynomial. $(G')^{km}$ actually approaches a high power of a polynomial. Proof: The diagonal sum of a matrix is equal to the sum of its proper values so that

$$(G')^k = \sum_g [G(x)^k]_{gg}\,(1 + \delta_1),$$

where $|\delta_1|^{1/k} < 1$ and $|k\delta_1|$ is smaller than any given positive number if k is large enough. For any integer m, 0 < m < k,

$$(G')^{mk} = \Bigl[\sum_g [G(x)^k]_{gg}\Bigr]^m (1 + \delta_2),$$

where $1 + |\delta_2| < (1 + |\delta_1|)^m$ and accordingly $|\delta_2| < 2k|\delta_1|$ may be held below any given positive number and m beyond any other given positive number if k is large enough.


chain. The contributions of those lattice points which are shared by the chains may be specified by the particular class $B(y)_j$ to which the $c_\lambda$ belongs. The representative of the above set of configurations is accordingly selected by assigning to every cell of the first chain one of the classes $B(y)_j$ such that the cell may occupy no other $c_\lambda$ than those belonging to $B(y)_j$. Let $nR'_j$ be the number of cells to which the class $B(y)_j$ is assigned; $\sum_j R'_j = 1$. If the numbers $R'_1 \ldots R'_r$ are given, there are many possible positions for the cells to which a certain class $B(y)_j$ is assigned so that there are $n!/\prod_j (nR'_j)!$ different sets with assigned numbers $R'_j$.
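The count $n!/\prod_j (nR'_j)!$ is the multinomial coefficient for distributing class labels over the cells of a chain. As a small illustration, with hypothetical cell counts chosen only for the example, the closed formula can be compared with a brute-force enumeration of the distinct arrangements:

```python
from itertools import permutations
from math import factorial

def number_of_sets(counts):
    """n! / prod(counts[j]!) where counts[j] plays the role of n*R'_j."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result

def number_of_sets_brute_force(counts):
    """Count distinct ways of placing the class labels along the chain."""
    labels = [j for j, c in enumerate(counts) for _ in range(c)]
    return len(set(permutations(labels)))

# Hypothetical example: 6 cells, three classes assigned to 3, 2 and 1 cells.
counts = (3, 2, 1)
print(number_of_sets(counts), number_of_sets_brute_force(counts))
```

The brute-force variant is practical only for very short chains; it is included solely to confirm the formula on a case that can be enumerated.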

The function of distribution of each set, a function of $p_1 \ldots p_\Lambda$, may depend on the numbers $R'_1 \ldots R'_r$ and also on the positions of the cells with assigned classes $B(y)_j$. These functions are denoted by $v_\rho(nR'_j, p)$ where the suffix $\rho$ accounts for the position of the cells and ranges from 1 to $n!/\prod_j (nR'_j)!$ and where

$$W(nR'_j, p) = \sum_\rho v_\rho(nR'_j, p)$$

is the function of distribution of the configurations with assigned numbers $R'_j$.

Since the sets of configurations are specified by assigning to every cell a class of configurations, and since the x-chain may be regarded as an assembly of independent aggregates, equation (14) applies to the present problem. The functions $v_\rho(nR'_j, p)$ accordingly approach independence of the suffix and proportionality to $W(nR'_j, p)$. The factor of proportionality remains undetermined and may depend on the numbers $R'_j$. Those functions $v_\rho(nR'_j, p)$ which deviate appreciably from the average are negligibly small or identically 0.

If alternatively the neighbouring x-chains have the co-ordinates y and y − 1, similar sets of configurations of the first chain are defined by assigning to every cell one particular class $A(y)_i$. The function of distribution of the set approaches asymptotically proportionality to the function $W(nR_i, p)$, where $nR_i$ is the number of cells to which the class $A(y)_i$ is assigned.

Finally, consider those sets of configurations of a chain in which a class $A(y)_i$ and a class $B(y)_j$, in other words a matrix element $D(y)_{ij}$, are assigned to every cell. In sets of this kind either all configurations or no configuration combine with any arbitrary pair of configurations of the two neighbouring x-chains. The corresponding function of distribution depends upon the numbers of cells $nR_{ij}$ to which simultaneously the classes $A(y)_i$ and $B(y)_j$ are assigned and is, similarly to the functions $W(nR_i, p)$ and $W(nR'_j, p)$, independent of the position of the cells with assigned classes. It is denoted by $U(nR_{ij}, p)$.

6. The statistics of a layer and the lattice

In § 4 it is shown that the function of distribution of an xy-layer is equal to $\sum_{i,j}[L^m]_{ij}$, where L is a symmetric matrix the elements of which contain all configurations of the x-chain.


The x-chains which are to be joined may be tentatively replaced by assemblies of independent cells which occupy the configurations $c_\lambda$ with the probabilities $p'_\lambda$. For this problem the matrix $H(y)^{[n]}$ is appropriate, where H(y) is a matrix obtained from G(y) by substituting $p'_\lambda$ for $p_\lambda$ and $H(y)^{[n]}$ is the nth outer power of H(y). An xy-layer consisting of m x-chains should according to this tentative approach have the function of distribution $\sum_{i,j}[(H(y)^{[n]})^m]_{ij}$. Since outer multiplication commutes with matrix multiplication the latter expression is equal to $\bigl[\sum_{i,j}[H(y)^m]_{ij}\bigr]^n$, which is obviously the function of distribution of an assembly of n independent y-chains.
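The commutation of outer (Kronecker) multiplication with matrix multiplication can be verified directly on a small case. The sketch below is illustrative only: the integer matrix `H` and the helper names are assumptions, not the paper's H(y). It checks that the mth power of the nth outer power equals the nth outer power of the mth power, and that the sum of its elements is the nth power of the element sum of $H^m$.

```python
def matmul(a, b):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matpow(a, n):
    """n-th matrix power (n >= 1) by repeated multiplication."""
    result = a
    for _ in range(n - 1):
        result = matmul(result, a)
    return result

def kron(a, b):
    """Outer (Kronecker) product of two matrices."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def outer_power(a, n):
    """n-th outer power (n >= 1): a (x) a (x) ... (x) a."""
    result = a
    for _ in range(n - 1):
        result = kron(result, a)
    return result

def total(a):
    """Sum of all matrix elements."""
    return sum(sum(row) for row in a)

# Hypothetical small symmetric matrix; integers keep the comparison exact.
H = [[1, 2],
     [2, 3]]
n, m = 3, 2

left = matpow(outer_power(H, n), m)    # (H^[n])^m
right = outer_power(matpow(H, m), n)   # (H^m)^[n]
print(left == right)
print(total(left) == total(matpow(H, m)) ** n)
```

Integer entries are used so that the two sides can be compared for exact equality rather than within a floating-point tolerance.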

The matrix $H(y)^{[n]}$ may be partitioned into (square or rectangular) submatrices in each of which all configurations with assigned numbers $R_{ij}$ are contained. All matrix elements within a submatrix are equal to each other and proportional to the function of distribution for the configurations of this submatrix which is obviously equal to $U(nR_{ij}, p)$.

The matrix L is defined in the same space as the matrix $H(y)^{[n]}$ and its structure is similar. In each submatrix the function of distribution is equal to $U(nR_{ij}, p)$; the matrix elements are no longer equal; the sums of subrows and subcolumns, however, are either equal to each other or negligibly small, possibly equal to 0.

The sums of those subrows and subcolumns which give an appreciable contribution to the function of distribution are proportional to $U(nR_{ij}, p)$. The factor of proportionality differs for the various submatrices, i.e. depends upon the numbers $R_{ij}$. For a given temperature it is sufficient to consider only a small range of these numbers where the factor of proportionality may be regarded as constant or rather as an unknown function of temperature.

In calculating the powers of L two results of matrix algebra are required. They are given without proof since they may readily be verified by means of the matrix law of multiplication.

(a) If a square matrix is partitioned into submatrices such that the partitioning lines are symmetric about the leading diagonal, the powers of this matrix may be calculated by regarding the submatrices as matrix elements. The product of two submatrices is defined as matrix product in the usual way.

(b) Two square or rectangular matrices for which multiplication is defined and in each of which the sum of elements is equal for each row and column have a product in which the sum of elements is equal for each row and column. If in either factor some rows and columns contain only elements equal to 0, while for the remaining rows and columns the sums of elements are equal, the product contains some rows and columns with all elements equal to 0 while for the remaining rows and columns the sums of elements are equal.
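Both results can be checked on small matrices. The following sketch uses hypothetical 3×3 integer matrices `A` and `B`, chosen so that every row and column of each has the same sum; the helper names are likewise assumptions. It multiplies the matrices block by block after a partition that is symmetric about the leading diagonal, and it confirms that the product again has equal row and column sums. Only the first statement of (b) is exercised; the case with vanishing rows and columns is not covered.

```python
def matmul(a, b):
    """Ordinary matrix product; works for rectangular blocks as well."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matadd(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def row_sums(a):
    return [sum(row) for row in a]

def col_sums(a):
    return [sum(col) for col in zip(*a)]

def split(a, cut):
    """Partition a square matrix into 2 x 2 blocks, cutting rows and columns at `cut`."""
    top, bottom = a[:cut], a[cut:]
    return [[[row[:cut] for row in top], [row[cut:] for row in top]],
            [[row[:cut] for row in bottom], [row[cut:] for row in bottom]]]

def join(blocks):
    """Reassemble a 2 x 2 array of blocks into one matrix."""
    top = [left + right for left, right in zip(blocks[0][0], blocks[0][1])]
    bottom = [left + right for left, right in zip(blocks[1][0], blocks[1][1])]
    return top + bottom

def block_product(a, b, cut):
    """Multiply by treating the submatrices as matrix elements, as in (a)."""
    pa, pb = split(a, cut), split(b, cut)
    return join([[matadd(matmul(pa[i][0], pb[0][j]),
                         matmul(pa[i][1], pb[1][j]))
                  for j in range(2)] for i in range(2)])

# Hypothetical matrices: every row and column of A sums to 6, of B to 3.
A = [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
B = [[2, 0, 1], [0, 1, 2], [1, 2, 0]]

product = matmul(A, B)
print(block_product(A, B, cut=1) == product)   # check of (a)
print(row_sums(product), col_sums(product))    # check of (b)
```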

In applying these theorems to the powers of L it is seen that instead of multiplying a submatrix of $L^m$ with a submatrix of L, the corresponding submatrices of $(H(y)^{[n]})^m$ and $H(y)^{[n]}$ may be multiplied. The resulting functions of distribution differ only by a factor which may depend upon t but not upon the $p_\lambda$.


The above tentative approach to the function of distribution of the xy-layer is therefore justified. The function of distribution of the xy-layer is equal to

$$\sum_{i,j}[L^m]_{ij} = a(t)\Bigl[\sum_{i,j}[H(y)^m]_{ij}\Bigr]^n, \qquad (19)$$

where a(t) is an unknown function of temperature.

Provided that H(y) is a G-matrix the statistics of an xy-layer may be shown to be the statistics of an assembly of independent aggregates, similarly as it is shown for an x-chain. The arguments which lead from the function of distribution of the x-chain to the function of distribution of the xy-layer may be applied to the latter and lead to the function of distribution of the lattice. It is sufficient to give the result.

The configurations $c_\lambda$ are supposed to be contained in the matrix elements $D(x)_{gh}$, $D(y)_{ij}$, $D(z)_{kl}$. The mean probability coefficient of the configuration $c_\lambda$ is denoted by $p'_\lambda$, $p''_\lambda$, $p'''_\lambda$ when referring to a cell which is sited within a chain, a layer or the lattice respectively. K(z) is a matrix obtained by substituting $p''_\lambda$ for $p_\lambda$ in the matrix G(z).

A range of temperature is considered in which G(x), H(y) and K(z) are G-matrices. The components of their first proper vectors are denoted by $\xi_g$, $\eta_i$, $\zeta_k$ respectively. It has been shown that

$$p'_\lambda = p_\lambda \xi_g \xi_h$$

and it may similarly be proved that

$$p''_\lambda = p_\lambda \xi_g\xi_h\eta_i\eta_j, \qquad p'''_\lambda = p_\lambda \xi_g\xi_h\eta_i\eta_j\zeta_k\zeta_l. \qquad (20)$$

The function of distribution for a cell which is sited within the lattice is therefore equal to

$$P = \sum_\lambda p_\lambda \xi_g\xi_h\eta_i\eta_j\zeta_k\zeta_l. \qquad (21)$$

P is not equal to the partition function since an unknown function of temperature enters into equation (19) and a similar function enters into the function of distribution of the lattice. In § 3 $E_\lambda$ is defined as the energy of the configuration $c_\lambda$ and $t^{E_\lambda} = \exp[-E_\lambda/kT]$. In evaluating equation (21), we consider the energies given as multiples of an arbitrary unit of energy which is denoted by $\epsilon$. Then $t = \exp[-\epsilon/kT]$. The thermodynamic energy per cell and also per domain of the crystal is obtained by taking an average of the cell energy

$$E = \epsilon\, t\, \frac{d \log P}{dt}. \qquad (22)$$

In carrying out the differentiation $\xi_g\xi_h\eta_i\eta_j\zeta_k\zeta_l$ is regarded as independent of t.

By means of equations (21) and (22) the thermodynamic energy can be calculated as function of temperature. If the theory is applied to a special system such as specified in the introduction, this relation between temperature and energy is capable of experimental verification.
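As a minimal sketch of how equation (22) is evaluated, suppose (an assumption made only for this illustration) that each term of P has the form $w_\lambda t^{n_\lambda}$, where $n_\lambda$ is the energy of $c_\lambda$ in units of $\epsilon$ and $w_\lambda$ stands for the factor that is held independent of t. The function names and the two-level example below are hypothetical and are not taken from the paper.

```python
import math

def boltzmann_t(eps, kT):
    """t = exp(-eps / kT), with eps the unit of energy."""
    return math.exp(-eps / kT)

def distribution(levels, weights, t):
    """P(t) = sum over configurations of w * t**n."""
    return sum(w * t ** n for n, w in zip(levels, weights))

def energy_per_cell(levels, weights, t, eps=1.0):
    """E = eps * t * d(log P)/dt, the weights being treated as constants."""
    p = distribution(levels, weights, t)
    t_dp_dt = sum(w * n * t ** n for n, w in zip(levels, weights))
    return eps * t_dp_dt / p

# Hypothetical two-level cell: energies 0 and 1 (in units of eps), equal weights.
levels = [0, 1]
weights = [1.0, 1.0]

for kT in (0.5, 1.0, 2.0):
    t = boltzmann_t(eps=1.0, kT=kT)
    print(kT, energy_per_cell(levels, weights, t))
```

For this toy case the result reduces to $\epsilon\, t/(1+t)$, which rises from 0 at low temperature towards $\epsilon/2$ at high temperature.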

References

Eisenschitz, R. 1941 Nature, Lond., 147, 778.

Lassettre, E. N. & Hove, J. P. 1941 J. Chem. Phys. 9, 747, 808.

Montroll, E. W. 1941 J. Chem. Phys. 9, 706.

Temple, G. 1934 The general principles of quantum theory. London.