Research Article Open Access

Francisco Balibrea*

On the origin and development of some notions of entropy

DOI 10.1515/taa-2015-0006

Received June 2, 2015; accepted August 13, 2015

Abstract: Discrete dynamical systems are given by a pair (X, f), where X is a compact metric space and f : X → X is a continuous map. Over the years, a long list of results has appeared that make precise what the complexity of such systems is. Among the notions introduced, one of the most popular is topological entropy. In modern applications other conditions on X and f have been considered. For example, X can be non-compact or f can be discontinuous (only at a finite number of points, with bounded or even unbounded jumps in the values of f). Such systems are interesting from the theoretical point of view in Topological Dynamics and appear frequently in applied sciences such as Electronics and Control Theory.

In this paper we deal mainly with the original ideas of entropy in Thermodynamics and their evolution up to the appearance, in the twentieth century, of the notions of Shannon and Kolmogorov-Sinai entropy and the subsequent topological entropy. In turn, such notions have had to evolve to cover recent situations in which extended versions, adapted to the new problems, are needed.

Keywords: Clausius; Boltzmann; Gibbs; Shannon; Kolmogorov-Sinai entropies; topological entropy; Tsallis entropy

1 Introduction

Introduced by Rudolf Clausius, the notion of entropy appeared for the first time in the setting of Physics but has been adopted in other fields with different meanings. Here we present some of them.

The first idea to be considered is that the energy contained in a physical system is comparable to the water contained in lakes, rivers or the sea. Only the surface water can be used to produce mechanical work, or simply work (for example, by causing the rotation of a turbine).

In Classical Physics, entropy is seen as a magnitude which at every time is proportional to the quantity of energy that, at that time, cannot be transformed into mechanical work.

Under this interpretation, entropy plays a central role in the formulation of the Second Law of Thermodynamics, which states that any transformation of an isolated physical system leads to an increase of its entropy.

In Probability Theory, the entropy of a random variable measures the uncertainty over the values which can be reached by the variable.

In Information Theory, the entropy associated with the compression of a message (for example, a computer file) quantifies the information content of the message, so that the loss of information in the compression process prior to transmission is minimal.

In Dynamical Systems, the entropy measures the exponential complexity of the system or the average flow of information per unit of time.

*Corresponding Author: Francisco Balibrea: Departamento de Matemáticas, Universidad de Murcia, 30100-Murcia, Spain, E-mail: balibrea@um.es

© 2015 Francisco Balibrea, published by De Gruyter Open.

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.

In real life, the word entropy is understood as disorder or chaos.

Concerning the evolution of the notion of entropy, it is necessary to mention the outstanding paper [26], which concentrates on the ergodic point of view and on Kolmogorov's new ideas on dynamical systems, and the book [17], which compares different types of entropy and treats many topics associated with topological entropy.

2 Prehistory of the entropy

The task of laying the last stone of the science of thermodynamics, or simply Thermodynamics, fell to Rudolf Clausius in 1850 in the paper [15], where he coined the term entropy from the Greek word τροπή (trope), which means transformation, while trying to imitate the sound of the word energy. For him, entropy meant the form of energy that eventually and inevitably turns into useless heat.

This meaning was inspired by an earlier formulation due to the French physicist and mathematician Sadi Carnot (see [14]), now known as his formulation of the Second Law of Thermodynamics: entropy represents the energy no longer capable of performing work, and in isolated systems it can only grow. Carnot stated that in an ideal engine (one which does not exchange heat with its surroundings) the entropy would be constant.

In Clausius's time, earlier experiments by Joule were known which proved beyond doubt that mechanical work could always be transformed into heat. The reasoning was as follows. Carnot assumed, in accordance with the prevailing opinion of his time, that his engine could not lose heat. Nevertheless, Joule's experiments indicated that heat could be created, and in fact Joule gave a precise equivalence between mechanical work and heat. If heat could be created, then it could also be destroyed; therefore Carnot's claim was false. As a consequence, Clausius wondered about the origin of the energy a Carnot machine needs to produce mechanical work. For him the answer was clear: part of the heat exchanged between the two sources composing the Carnot machine is what produces the mechanical work. Therefore heat could not be created or destroyed from nothing, but was always transformed into the equivalent quantity of mechanical work. In this way the total energy had to remain constant. This reasoning leads to the First Law of Thermodynamics: the heat absorbed by a physical system is equivalent to the mechanical work done by it, or increases its internal energy, or is a combination of both. In 1856, Clausius refined it with a more or less heuristic formulation using differential calculus, gaining precision but losing intuition. Moreover, around 1862 he assumed the atomic hypothesis of matter and introduced the idea of how molecules separate. Finally, in 1865 he introduced his notion of entropy and formulated the Second Law of Thermodynamics.

Clausius introduced the formula

$$\Delta S = \frac{\Delta Q}{T}$$

where $\Delta S$ denotes the increment of entropy of a system and $\Delta Q$ denotes its increment in heat at a temperature $T$. That is, the increase of entropy is proportional to the increase in heat and inversely proportional to the temperature. He proved, for example, that the sum of all increments of entropy during a complete Carnot cycle is zero, which means that the entropy gained when the system receives heat equals the entropy lost in the part of the cycle in which the system cools. Since the Carnot engine is ideal, it is a mechanism of maximum efficiency. A real engine, however, has unavoidable losses. Together, both facts mean that

$$\Delta S > 0$$

which is the Second Law of Thermodynamics. That is, when a system is transformed, its entropy increases.
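The statement about the Carnot cycle can be checked numerically. The following is a minimal sketch: the reservoir temperatures and the absorbed heat are illustrative assumptions, and the relation Q_cold / T_cold = Q_hot / T_hot is the standard condition for a reversible Carnot cycle, not a formula quoted from [15].

```python
def entropy_change(delta_q, temperature):
    """Clausius increment Delta S = Delta Q / T (temperature in kelvin)."""
    return delta_q / temperature

# Illustrative (assumed) values for an ideal Carnot cycle.
T_HOT, T_COLD = 500.0, 300.0      # reservoir temperatures in K
Q_HOT = 1000.0                    # heat absorbed at T_HOT, in J
Q_COLD = Q_HOT * T_COLD / T_HOT   # heat released at T_COLD in a reversible cycle

# The system gains entropy while absorbing heat and loses it while cooling.
gained = entropy_change(Q_HOT, T_HOT)
lost = entropy_change(-Q_COLD, T_COLD)
total = gained + lost
print(gained, lost, total)
```

The two adiabatic branches of the cycle exchange no heat, so they contribute nothing to the sum.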

The ideas of Clausius were refined by Ludwig Boltzmann. In the first place, Boltzmann was a follower of the kinetic theory of gases introduced by Daniel Bernoulli, who considered that all fluids are aggregates of molecules in continuous motion. Temperature could be interpreted as a measure of the energy of the particles. The thermal energy of a gas is identified with the kinetic energy of its individual molecules, which explains why heat and mechanical work are different forms of energy transmission. More precisely, Boltzmann identified the temperature of a gas with the mean kinetic energy of its molecules. In an equilibrium state (there is no transmission of heat between two substances, since both have the same temperature) there is no transmission of kinetic energy. When two substances are not in equilibrium, kinetic energy flows from the hotter to the colder. This means that the average kinetic energy behaves like temperature, and therefore the two can be identified.

Boltzmann used a curious hypothesis. He assumed that the motion of molecules was periodic. That is, given sufficient time, a molecule changes its level of energy until it comes back to its initial level.

With temperature explained in mechanical terms, the First Law of Thermodynamics was clarified: heat and mechanical work are interchangeable because they are two forms of energy. It remained to explain the Second Law, which was difficult because the notion of entropy was not quite clear. Boltzmann tried to do so using a mathematical formulation, very far from physical interpretations. He proved that heat (understood as supplied energy) divided by temperature gave a quantity whose behavior was exactly that of entropy. Finally, he gave macroscopic thermodynamic reasons, without considering the molecular behavior, to prove the Second Law.

2.1 The ideas of Maxwell

James Maxwell was a promoter of the kinetic theory of gases during the nineteenth century. His main idea was to describe the behavior of the molecules in a gas by means of a distribution function. He considered a large quantity of molecules of the gas and the statistics of their velocities, which is much better than considering individual molecules, an intractable approach from the mathematical point of view. Such a function had to indicate how the velocities were distributed among the molecules. With this approach it was already possible to compute most of the relevant properties of gases.

To get an adequate mechanical description of a fluid, Maxwell had to overcome two difficulties: first, to find a distribution function adequate for every temperature and, second, to prove that such a function was unique. For the first problem, he claimed that the Gaussian function adequately represented the distribution of velocities of the molecules. The second problem remained open until its solution by Boltzmann. In 1868, Boltzmann justified the use of the Gaussian function to describe a gas in most particular cases, and he even extended Maxwell's work to include gases subject to a wide range of forces.

2.2 Boltzmann's contributions

In [9], Boltzmann succeeded in proving the Second Law of Thermodynamics using principles of Mechanics. In fact, that paper was the starting point of Statistical Physics. It contained two important innovations: on the one hand, the introduction of what is now known as the Boltzmann equation, which models the behavior of a gas in different situations; on the other hand, his first proof of the Second Law as a consequence of atomic theory and probability theory, stating what is known as the H-theorem. In his formulation he used two simplifying hypotheses. The first is that the gas has a uniform spatial distribution. The second is that velocities in each direction are equiprobable.

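The Boltzmann equation describes the evolution of the probability distribution of a dilute gas as a function of several factors. In its simplest form it reads

$$\frac{\partial F(t,x,v)}{\partial t} = \left(\frac{\partial F(t,x,v)}{\partial t}\right)_{\text{force}} + \left(\frac{\partial F(t,x,v)}{\partial t}\right)_{\text{diffusion}} + \left(\frac{\partial F(t,x,v)}{\partial t}\right)_{\text{collisions}}$$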

Here $F(t, x, v)$ denotes the density function of particles in phase space; $F$ can be called more accurately the empirical measure. Following [21] and [22], in modern notation the equation can be written as

$$\partial_t F + v \cdot \nabla_x F = Q(F, F)$$

where $Q(F, F)$ is the Boltzmann collision operator, which acts only on the velocity variable $v$ and is local in $(t, x)$:

$$Q(F,F)(v) = \int_{\mathbb{R}^3} dv_* \int_{\mathbb{S}^2} d\sigma \, B(|v - v_*|, \sigma)\,[F'_* F' - F_* F]$$

where we are using the shorthand $F = F(v)$, $F_* = F(v_*)$, $F' = F(v')$, $F'_* = F(v'_*)$. In this formula, $v, v_*$ and $v', v'_*$ are the velocities of a pair of particles before and after a collision; they result from parametrizing over the sphere $\mathbb{S}^2$ the physical law of elastic collisions:

$$v + v_* = v' + v'_*, \qquad |v|^2 + |v_*|^2 = |v'|^2 + |v'_*|^2$$
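The two conservation laws above can be verified for one concrete parametrization over the sphere. The sketch below assumes the commonly used form v' = (v + v_*)/2 + (|v - v_*|/2) σ and v'_* = (v + v_*)/2 - (|v - v_*|/2) σ with σ a unit vector; this explicit form is an assumption for illustration and is not spelled out in the text. The function name `post_collision` is hypothetical.

```python
import math
import random

def post_collision(v, v_star, sigma):
    """Post-collision velocities for a unit vector sigma (assumed sigma-parametrization)."""
    center = [(a + b) / 2 for a, b in zip(v, v_star)]
    half_rel = math.dist(v, v_star) / 2
    v_new = [c + half_rel * s for c, s in zip(center, sigma)]
    v_star_new = [c - half_rel * s for c, s in zip(center, sigma)]
    return v_new, v_star_new

def squared_norm(u):
    return sum(c * c for c in u)

random.seed(0)
v = [random.gauss(0, 1) for _ in range(3)]
v_star = [random.gauss(0, 1) for _ in range(3)]
raw = [random.gauss(0, 1) for _ in range(3)]
norm = math.hypot(*raw)
sigma = [c / norm for c in raw]          # random point on the unit sphere

v_new, v_star_new = post_collision(v, v_star, sigma)

momentum_before = [a + b for a, b in zip(v, v_star)]
momentum_after = [a + b for a, b in zip(v_new, v_star_new)]
energy_before = squared_norm(v) + squared_norm(v_star)
energy_after = squared_norm(v_new) + squared_norm(v_star_new)
print(momentum_before, momentum_after)
print(energy_before, energy_after)
```

Both printed pairs agree up to rounding, whatever unit vector σ is chosen.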

A deep prediction of Boltzmann's equation is given by the H-theorem cited above, which says that the solutions of the equation satisfy

$$\frac{d}{dt}H = -\frac{d}{dt}\int dx \int dv \, F \log F \geq 0$$

This H-functional is increasing with respect to time, and all of its maximizers are exactly the Maxwellian states

$$\mu(v) = (2\pi)^{-3/2} e^{-|v|^2/2}$$

Boltzmann's H-theorem proves the Second Law of Thermodynamics, in the sense that the physical entropy of an isolated system does not decrease with time. In the setting of Statistical Physics, it has been considered Boltzmann's most important contribution.

For the Boltzmann equation above, it has recently been proved that classical solutions exist under some relevant additional conditions (see [21] and [22]); this had been an open and difficult problem in Mathematics for 140 years.

In deducing his equation, he used what is known in the literature as the Stosszahlansatz, or molecular chaos. According to it, the molecules or atoms (particles) of a gas move and collide with one another. Boltzmann assumed that before a collision the velocities of the particles are uncorrelated, that is, they move completely at random. This is no longer so after the collision, since the direction of motion of a particle depends on where the particle it collided with was. This assumption produces a temporal asymmetry in the mathematical analysis, since it is necessary to distinguish between the past (where there is no correlation) and the future. In fact, it hides the notion of thermodynamic irreversibility. Boltzmann's equation states that the change in the function F is due only to external forces, collisions among particles, and diffusion, which is the statistical tendency of particles located in a region to spread out over all the available space.

Boltzmann also proved that the Maxwell distribution given above is a solution of the equation. He then proved that if a gas reaches such a distribution it will not change, that is, the internal collisions among molecules do not change its state. Using his equation, he also proved that any gas in any state tends to reach the Maxwell distribution. But the most relevant result was that the Second Law can be derived from mechanical principles, which gives a mechanical interpretation of the Law. This result is again Boltzmann's H-theorem. For it, he considered, instead of F, an average of its logarithm. This is the value H (originally denoted by E in Boltzmann's papers). H had to remain constant or decrease during any physical process. Changing the sign of H, Boltzmann found a mechanical equivalent of entropy (it remains constant or increases in any physical process). This fact represents a microscopic interpretation of the Second Law. Additionally, Clausius's entropy is valid only for systems in equilibrium, whereas Boltzmann's approach does not depend on equilibrium and is valid in any situation. Today the physics community has entropy definitions valid for quantum and relativistic systems, owing to the versatility of Boltzmann's formulation. But [9] contained an important additional result. Before it, Boltzmann had assumed the ergodic hypothesis, that is, that any particle has to pass through all levels of energy before returning to the initial one. But this point of view was very complicated to use in applications. He changed it and assumed that there was only a finite number of energy levels, all multiples of one number. Then, to prove his theorem, he used energy instead of velocities and discretized it. As a consequence, the calculations became easier, and when energies are exchanged it is clear that, given sufficient time, all particles will reach all levels of the discretization.

The Boltzmann distribution, applied to the black body problem, reproduces exactly Planck's results. Some time later, Einstein explained the photoelectric effect, that is, the creation of an electric current by light incident on a metal, using a similar hypothesis. He assumed that light was composed of particles whose energy could not take arbitrary values because it was discretized. This point of view was very important for the development of Quantum Mechanics.

In order to overcome some criticisms of the previous ideas, in [11] Boltzmann did not consider the velocity distribution of a gas; instead he considered the probability that the gas would be in a given state, chosen among all possible states. This requires an inventory of all possible configurations of the system, and their number, in order to obtain the probabilities and the state of maximal probability. The state of maximal probability would correspond to the one observed at a macroscopic scale. First he obtained the number of molecules at each discrete level of energy for a fixed total energy. From the macroscopic point of view, the state of the system is given by these numbers, independently of which individual molecules have each level of energy.

Boltzmann called each individual state a complexion or, in modern terminology, a microstate, since such a state is not observable. Distributions of energy for which it is only necessary to know the number of molecules at each level of energy are known as macrostates, since they are macroscopically observable. He then introduced the number B, which leads to a new expression for entropy: B is the number of microstates giving the same distribution of energy. One then calculates B for all distributions and compares them. The ratio between B and the total number of microstates is the probability that the system is in the state described by B. Boltzmann called B the permutability, a name derived from permutation. After some computations it can be observed that the permutability is considerably greater for intermediate distributions, that is, those in which energy is distributed more or less homogeneously, similarly to the Boltzmann distribution. After these computations he obtained a general expression for the permutability of a distribution, first assuming that the number of molecules was very large and then that energy took continuous values. He then called the logarithm of the permutability the degree of permutability.
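The counting procedure just described can be sketched for a small system. The sizes below (7 molecules sharing 7 discrete energy units) are chosen for illustration, and the multinomial formula B = N! / (n_0! n_1! ...) for the number of microstates with given occupation numbers is the standard combinatorial count; the function name `macrostates` is hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial, prod

def macrostates(n_molecules, total_energy):
    """Yield (occupation numbers, permutability B) for every energy distribution.

    A macrostate records how many molecules sit at each energy level;
    B counts the microstates (assignments to individual molecules) behind it.
    """
    levels = range(total_energy + 1)
    for energies in combinations_with_replacement(levels, n_molecules):
        if sum(energies) == total_energy:
            occupation = Counter(energies)
            b = factorial(n_molecules) // prod(factorial(c) for c in occupation.values())
            yield dict(occupation), b

states = list(macrostates(7, 7))
total_microstates = sum(b for _, b in states)
best_occupation, best_b = max(states, key=lambda s: s[1])

print(len(states), total_microstates)
print(best_occupation, best_b, best_b / total_microstates)
```

The most probable macrostate is an intermediate one: neither all the energy on one molecule nor exactly one unit on each.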

Finally he observed that the expression for the degree of permutability was equal to the value H obtained in [9] with inverted sign. This is relevant since H was equal to the entropy with negative sign. The conclusion is that log B could be used as a measure of the entropy of the system. Additionally, this method can be used not only for gases but also for any other substance, monoatomic or polyatomic, liquid or solid. In [11], the entropy was finally introduced as 2/3 of the measure of permutability. In modern nomenclature, this factor is incorporated into what is known as Boltzmann's constant, although he never used it. Since the permutability and the number of microscopic states compatible with a distribution are proportional, nowadays the latter number is used instead of the permutability. Finally, entropy is proportional to the logarithm of the number of microscopic states compatible with the observed macroscopic state (described by quantities such as temperature, pressure, volume, etc.):

$$S = k \log W \qquad (1)$$

The logarithm is used because it simplifies the computations and reproduces the additivity of entropy, in the sense that the entropies of two systems add, whereas their permutabilities multiply. All this modern terminology and these formulas were introduced by J. Gibbs as part of his formalization of what is known as Statistical Physics. Formula (1) is engraved on Boltzmann's tomb. Einstein called this formula Boltzmann's principle. In (1), the entropy S increases when W does: the more microstates, the more disorder and the more entropy. Moreover, when there is only one possible microstate, the entropy is zero.
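A short numerical sketch of the additivity property follows. The microstate counts are arbitrary illustrative values, and the natural logarithm is assumed in (1).

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def boltzmann_entropy(w):
    """S = k log W for W equally likely microstates (natural logarithm assumed)."""
    return K_B * math.log(w)

w1, w2 = 420, 1716                      # illustrative microstate counts
s_joint = boltzmann_entropy(w1 * w2)    # independent systems: microstates multiply
s_sum = boltzmann_entropy(w1) + boltzmann_entropy(w2)
s_single = boltzmann_entropy(1)         # one possible microstate

print(s_joint, s_sum, s_single)
```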

The notion of disorder is an intuitive one that depends on the system under consideration. A gas is considered to be in an ordered state if its molecules have a distribution of energies or positions very different from the random one, that is, from the Boltzmann distribution. The most disordered states are the most probable and, as a consequence, have the greatest entropy. Another consequence is that disorder in the universe tends to increase.

3 Shannon's entropy

The definition of Boltzmann's entropy is quite general, and its idea has been used to introduce other notions of entropy in Mathematics, Computer Science and other fields. One of them was proposed by C. Shannon (see [36]) to evaluate the quantity of information contained in a message. A message is a sequence of symbols that contains some type of information and has to be transmitted. Suppose we have a sequence composed only of the two symbols {0, 1}. If the zeros and ones do not occur at random and there is some tendency towards more zeros or more ones, an observer receiving and reading the message could predict the next symbol after having received a finite number of them. The sequence 1, 1, 1, ... is highly predictable: after receiving one hundred ones, it is likely that the next symbol will be another 1. In this situation the information given by the message is very poor, since one knows its content before receiving it. In such a case the Shannon entropy is minimal. Conversely, the Shannon entropy is maximal when the sequence of symbols is a random sequence of zeros and ones. In this case, after a hundred transmitted symbols, the only way to know the next one is to receive it. In practice, most messages written in the alphabets of natural languages, for example English and Spanish, have relatively low entropy, due to the statistical preponderance of some letters in those languages. Such messages therefore carry little information per symbol, which facilitates their compression. Shannon's entropy was inspired by Boltzmann's, and understanding their relationship has been for years an interesting and important problem in Mathematics and Physics.
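These examples can be made concrete with a small sketch that estimates the entropy of a message from its symbol frequencies. This is a simplification assumed here for illustration: it only looks at single-symbol frequencies and ignores correlations between consecutive symbols. The function name and the sample strings are hypothetical.

```python
from collections import Counter
from math import log2

def empirical_entropy(message):
    """Shannon entropy, in bits per symbol, of the symbol frequencies of message."""
    counts = Counter(message)
    n = len(message)
    return -sum((c / n) * log2(c / n) for c in counts.values())

constant = "1" * 100                  # the fully predictable sequence 1, 1, 1, ...
balanced = "0110100110010110" * 4     # zeros and ones equally frequent
sentence = "to be or not to be that is the question"

h_constant = empirical_entropy(constant)
h_balanced = empirical_entropy(balanced)
h_sentence = empirical_entropy(sentence)
h_uniform = log2(len(set(sentence)))  # entropy if all its symbols were equally frequent

print(h_constant, h_balanced, h_sentence, h_uniform)
```

The sentence falls below the uniform bound because some letters dominate, which is the effect that makes compression of natural language possible.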

Also inspired by Boltzmann's entropy, in the Linux operating system (the open-source system on which Android is built) the term entropy is used for the random data that the system collects from sources such as mouse movements and keyboard activity, for use when special instructions are executed.

In probability theory, a probability vector $p$ is a sequence of finitely many non-negative numbers $\{p_1, p_2, \ldots, p_n\}$ satisfying $\sum_{i=1}^{n} p_i = 1$. We define the Shannon entropy of $p$ as

$$H(p) = -\sum_{i=1}^{n} p_i \log_2 p_i$$

where we take $0 \log_2 0 = 0$. This definition is associated with the situation in which we have a finite partition of a probability space. Consider an abstract space $\Omega$ equipped with a probability measure $\mu$, and let $\mathcal{P}$ be a finite partition of $\Omega$, that is, a family of subsets $\{A_1, A_2, \ldots, A_n\}$ which are pairwise disjoint and measurable and whose union is $\Omega$. Then the probabilities $p_i = \mu(A_i)$ form a probability vector $p(\mathcal{P})$, and we define

$$H_\mu(\mathcal{P}) = H(p(\mathcal{P}))$$

This example can be related to the notion of information. In the former setting, given a measurable set $A$, we define the information associated with it as

$$I(A) = -\log_2(\mu(A))$$

and the information function $I_{\mathcal{P}}$ associated with a partition $\mathcal{P} = \{A_1, A_2, \ldots, A_n\}$ of the space $\Omega$ as the function of $\omega \in \Omega$ given by

$$I_{\mathcal{P}}(\omega) = -\sum_{i=1}^{n} \log_2(\mu(A_i))\,\chi_{A_i}(\omega)$$

where $\chi_{A_i}$ is the characteristic function of $A_i$, so that, as a function of $\omega$, $I_{\mathcal{P}}$ is constant on each $A_i$. It is immediate that the expected value of the information function with respect to $\mu$ is exactly $H_\mu(\mathcal{P})$.
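The identity between the expected value of the information function and the entropy of the partition can be checked on a toy example. The four-point space, its measure and the partition below are hypothetical choices made only for illustration.

```python
from math import log2

# Hypothetical finite probability space: point -> probability.
omega = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
# Hypothetical partition of omega into three cells.
partition = [{"a"}, {"b"}, {"c", "d"}]

def measure(subset):
    """mu(A): total probability of the points in A."""
    return sum(omega[w] for w in subset)

def information_function(w):
    """I_P(w) = -log2(mu(A_i)) for the cell A_i that contains w."""
    cell = next(a for a in partition if w in a)
    return -log2(measure(cell))

def shannon_entropy(probs):
    """H(p) = -sum p_i log2 p_i, with the convention 0 log2 0 = 0."""
    return -sum(p * log2(p) for p in probs if p > 0)

expected_information = sum(omega[w] * information_function(w) for w in omega)
entropy = shannon_entropy([measure(a) for a in partition])

print(expected_information, entropy)
```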

In [17] two interpretations of Shannon entropy are given, in terms of the information function and of uncertainty. In the first case, the interpretation is based on the fact that, given $\omega \in \Omega$ and the above partition of $\Omega$, the information answers the question: in which $A_i$ are we? Such a question is not binary, but it can be replaced by a finite number of binary questions that determine where $\omega$ is. Depending on how the questions about the sets $A_i$ are arranged, we obtain different numbers of questions needed to locate $\omega$. If we denote by $N(\omega)$ the smallest number of questions in such a process, we have (see [17]): $I_{\mathcal{P}}(\omega) < N(\omega) < I_{\mathcal{P}}(\omega) + 1$ for almost every $\omega$. The real number $I_{\mathcal{P}}(\omega)$ can be interpreted as a precise value, and then entropy can be interpreted as the expected amount of information needed to locate a point in the given partition of $\Omega$.
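One way to arrange the binary questions is a Huffman tree, in which the depth of a cell is the number of yes/no questions needed to reach it. The sketch below uses this construction as an assumption (it is not the arrangement discussed in [17]) and illustrates only the averaged statement, the classical bound that the mean number of questions lies between the entropy and the entropy plus one. The probability vector is hypothetical.

```python
import heapq
from math import log2

def huffman_lengths(probs):
    """Number of yes/no questions needed to identify each cell (Huffman tree depths)."""
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    depth = [0] * len(probs)
    next_id = len(probs)              # unique tie-breaker for merged nodes
    while len(heap) > 1:
        p1, _, cells1 = heapq.heappop(heap)
        p2, _, cells2 = heapq.heappop(heap)
        for i in cells1 + cells2:     # every cell under the merged node gets one more question
            depth[i] += 1
        heapq.heappush(heap, (p1 + p2, next_id, cells1 + cells2))
        next_id += 1
    return depth

probs = [0.4, 0.3, 0.2, 0.1]          # hypothetical cell probabilities mu(A_i)
questions = huffman_lengths(probs)
average_questions = sum(p * n for p, n in zip(probs, questions))
entropy = -sum(p * log2(p) for p in probs)

print(questions, average_questions, entropy)
```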

Let $X$ be a random variable on the probability space $\Omega$ taking values in a finite set $\{x_1, x_2, \ldots, x_n\}$. Associated with $X$ we have the partition of $\Omega$ into the sets $A_i = \{\omega \in \Omega : X(\omega) = x_i\}$. The probabilities $p_i = \mu(A_i) = \mathrm{Prob}\{X = x_i\}$ form a probability vector (the distribution of $X$). Suppose now that an experimenter knows the distribution of $X$ before performing an experiment, that is, before choosing an $\omega \in \Omega$ and obtaining the value $X(\omega)$. His or her uncertainty about the outcome is the expected value of the information he or she is missing in order to be certain. According to the above, this value is the entropy $H_\mu(\mathcal{P})$.

3.1 Comparison between Boltzmann and Shannon entropy (see [17])

Connections between Boltzmann and Shannon entropy have been a matter of controversy for years. At first sight both notions are connected by their definitions, since both refer to probability, but their analogy is far from being obvious.

The interpretation of the analogy and the differences relies on the differences between the macroscopic states associated with Thermodynamics and the microscopic states associated with Statistical Mechanics. That is, a macroscopic state is a thermodynamical one, described by probability distributions of physical magnitudes such as pressure, temperature, volume, etc., which can be realized in several ways, while in microscopic states one distinguishes all individual particles, taking into account their positions or velocities. Given a thermodynamical state $A$, we have mentioned that the difference $S(A) - S_{max}$ (where $S_{max}$ denotes the maximum value of the entropy over all states) is proportional to $\log_2(\mathrm{Prob}(A))$, that is, to the logarithm of the probability of the macroscopic state $A$ in the space $\Omega$ of all microscopic states $\omega$. Finally we find the following formula

$$S_{max} - S(A) = k I(A) \qquad (2)$$

where $I(A)$ denotes the probabilistic information associated with $A \subset \Omega$. The conclusion suggested by the last formula is that Boltzmann's entropy is strongly connected to Shannon's information rather than to Shannon's entropy. The above formula also has a controversial interpretation: since $S(A)$ appears with a negative sign, the sense of monotonicity is reversed. The more information is associated with $A$, the smaller its Boltzmann entropy. This fact is explained in the light of the meaning of the information associated with a state.

Information about a state of a system is the information received by an outside observer. It is therefore reasonable to assume that such information escapes from the system and, as a consequence, carries a negative sign. It is the observer's knowledge of the system that gives the degree of usefulness of the energy contained in the system for producing physical work, that is, for decreasing the entropy of the system.

With the former ideas it is possible to give a complete interpretation of the above notions. The key idea is that every microstate of a system which the observer perceives as belonging to a macrostate $A$ hides the information about its identity. Let us denote by $I_h(A)$ the joint information still hidden in the system when it is in the state identified by the observer as $A$. This hidden information is maximal at the maximal state, where it equals $\frac{S_{max}}{k}$. In the state $A$ it is diminished by $I(A)$, the information already obtained by the observer. Then we have

$$I_h(A) = \frac{S_{max}}{k} - I(A)$$

and together with (2) we obtain

$$S(A) = k I_h(A)$$

which gives us a new interpretation of Boltzmann's entropy: it is proportional to the information still hidden in the system once the macroscopic state $A$ has been detected. Boltzmann's entropy is determined up to an additive constant, so we can compute the change of entropy from one state to another. It is really hard to obtain the adequate constant, since the maximal state depends on the level of precision of the microstates composing a macrostate. If the space $\Omega$ is infinite, then the maximal entropy is also infinite. According to what was discussed in previous sections, it is necessary to take a quantum approach, that is, to consider that $\Omega$ is composed of only a finite number of points. The advantage is that in this case Boltzmann's entropy has a new interpretation directly in terms of Shannon's entropy (not depending on the information function). The highest possible Shannon entropy $H_\mu(\mathcal{P})$ is achieved when $\mathcal{P} = Q$, the partition of $\Omega$ into single states $\omega$, and $\mu$ is the uniform measure on $\Omega$, that is, when each state has probability $1/|\Omega|$, where $|\Omega|$ denotes the cardinality of $\Omega$. In this case we have

$$S_{max} = k H_\mu(Q) = k \log_2 |\Omega|$$

We detect that the system is in state $A$ if the information is $I(A) = -\log_2(\mu(A)) = -\log_2\left(\frac{|A|}{|\Omega|}\right)$. Using (2), we obtain

$$S(A) = k\left(\log_2\frac{|A|}{|\Omega|} + \log_2 |\Omega|\right) = k \log_2 |A|$$

which is $k$ times the Shannon entropy of $\mu_A$, the normalized uniform measure restricted to $A$. Both entropies are compared by means of the above formula, and one obtains the unknown additive constant.
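The chain of formulas for a finite space can be followed with concrete numbers. In the sketch below the cardinalities are hypothetical and units are chosen so that k = 1; both are assumptions made for illustration.

```python
from math import log2

k = 1.0                          # units chosen so that k = 1 (assumption)
size_omega, size_a = 1024, 64    # hypothetical cardinalities |Omega| and |A|

s_max = k * log2(size_omega)           # entropy of the partition into single states
info_a = -log2(size_a / size_omega)    # I(A) under the uniform measure
s_a = s_max - k * info_a               # formula (2) solved for S(A)
hidden_info = s_max / k - info_a       # information still hidden in the state A

print(s_max, info_a, s_a, hidden_info)
```

With these values S(A) coincides with k log2 |A|, and with k times the hidden information.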

4 On Tsallis entropy

It is well known that Statistical Mechanics is essentially constructed from two ingredients, Mechanics (including electromagnetic forces) and Probability Theory. From it we may construct notions such as energy and entropy and obtain mathematical expressions for them. Concerning entropy, guided by Thermodynamics we find that the entropies of Clausius and Boltzmann are extensive, which means that the entropy $S(N)$ of a macroscopic system of $N$ elements behaves asymptotically as $N$, which is denoted by

$$S(N) \propto N$$

when $N \to \infty$. One expects this property to hold for both short-range interactions (for example, collisions between particles) and long-range interactions. For short-range interactions this is the case, using strictly thermodynamic tools. For long-range interactions the former approach is not evident. In such cases the total energy $U$ of the system behaves non-extensively (more precisely, $U(N)$ increases faster than $N$ for large $N$). Nevertheless, it can be seen that the Clausius and Boltzmann entropies remain extensive (see [39]).

The next relevant question is whether one can introduce, in probabilistic terms, a mathematical expression for an entropy for long-range interactions which takes into account the admissible microscopic configurations determined by the correlations between the N elements of the system, in such a way that S(N) is extensive.

Assume we have a system whose total number $T(N)$ of admissible configurations, all equally probable, satisfies

$$T(N) \propto \mu^N \quad (N \to \infty,\ \mu > 1) \qquad (3)$$

This assumption corresponds to probabilistic independence of the $N$ elements of the system, and $\mu$ is a proportionality factor; that is, it is assumed that $T(N+1) \approx \mu\, T(N)$. We now use the Shannon entropy of a probability vector

$$S_{Sh} = -k \sum_{i=1}^{T} p_i \log_2 p_i$$

where $\sum_{i=1}^{T} p_i = 1$. For equal probabilities, we have

$$S_{Sh} = k \log_2 T(N)$$

and using (3) we have

$$S_{Sh}(N) = k \log_2 T(N) \propto N$$

so the Shannon entropy is extensive.

Let us now assume a system whose elements are strongly correlated, in such a way that the number of equally probable admissible configurations (configurations whose probability is non-zero) satisfies

$$T(N) \propto N^{\rho} \qquad (4)$$

when $N \to \infty$ and $\rho > 0$. In this case it is not adequate to use Shannon's entropy, since $S_{Sh}(N) \propto \log_2 N$, which violates Thermodynamics. Instead we use the following generalized expression (see [39])

$$S_q = k\,\frac{1 - \sum_{i=1}^{T} p_i^q}{q - 1} = -k \sum_{i=1}^{T} p_i^q \log_q p_i = k \sum_{i=1}^{T} p_i \log_q \frac{1}{p_i}$$

with $\sum_{i=1}^{T} p_i = 1$, where we have used $\log_q z = \frac{z^{1-q} - 1}{1 - q}$ and taken $\log_1 z = \log z$. It is evident that $S_1 = S_{Sh}$ (Shannon entropy) and that for equal probabilities

$$S_q = k \log_q T$$

Using (4) we obtain

Sq(N) = k[T(N1]1~q _ 1 a NPd_q) 1 q

Therefore, choosing q = 1 _1 we obtain that S1_ t(N) a N and this is in accordance with Thermodynamics. As

a consequence, Shannon's entropy can be replaced by Tsallis notion having extensivity, changing adequately the index parameter, when there exists strong correlations in the system.
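The scaling argument above can be checked numerically. The following Python sketch is only an illustration under stated assumptions: $k = 1$, natural logarithms, the exponent value 2 in $T(N) = N^2$, and the helper names `log_q` and `tsallis_equiprobable` are choices made here, not taken from the source. It evaluates $S_q = k \log_q T(N)$ and prints $S_q(N)/N$ for the index $q$ that should restore extensivity, next to the Shannon case $q = 1$.

```python
import math

def log_q(z, q):
    """q-logarithm (z**(1-q) - 1)/(1-q); the natural logarithm is the q = 1 case."""
    if q == 1:
        return math.log(z)
    return (z ** (1 - q) - 1) / (1 - q)

def tsallis_equiprobable(T, q, k=1.0):
    """S_q = k * log_q(T) for T equally probable configurations."""
    return k * log_q(T, q)

rho = 2.0            # assumed exponent in T(N) = N**rho
q = 1 - 1 / rho      # index that should make S_q grow linearly in N
for N in (10, 100, 1000, 10000):
    T = N ** rho
    s_q = tsallis_equiprobable(T, q)
    s_1 = tsallis_equiprobable(T, 1)
    # first ratio settles near a constant, second ratio decays to zero
    print(N, s_q / N, s_1 / N)
```

With these choices the first ratio stabilises as $N$ grows, while the Shannon ratio tends to zero, which is the behaviour described in the text.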

At this point it is necessary to distinguish between the properties of non-extensivity and non-additivity. A notion of entropy $S$ is additive if, for two probabilistically independent systems $A$ and $B$, it holds that $S(A \cup B) = S(A + B) = S(A) + S(B)$. Using the expression of the Tsallis entropy, we have

$$\frac{S_q(A \cup B)}{k} = \frac{S_q(A+B)}{k} = \frac{S_q(A)}{k} + \frac{S_q(B)}{k} + (1-q)\,\frac{S_q(A)}{k}\,\frac{S_q(B)}{k}$$

which means that $S_{Sh}$ is additive but $S_q$ is non-additive for $q \neq 1$.
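The composition rule for independent systems can be verified on a small example. The sketch below is illustrative only: it assumes $k = 1$ and natural logarithms, and the two distributions `pa` and `pb` as well as the helper names are arbitrary choices made here.

```python
import math

def tsallis(p, q, k=1.0):
    """Tsallis entropy k*(1 - sum p_i**q)/(q - 1); Shannon form (natural log) at q = 1."""
    if q == 1:
        return -k * sum(x * math.log(x) for x in p if x > 0)
    return k * (1 - sum(x ** q for x in p if x > 0)) / (q - 1)

def joint_independent(pa, pb):
    """Joint probability vector of two independent systems A and B."""
    return [a * b for a in pa for b in pb]

pa = [0.5, 0.3, 0.2]   # illustrative distribution for A
pb = [0.7, 0.3]        # illustrative distribution for B
for q in (0.5, 1, 2):
    sa, sb = tsallis(pa, q), tsallis(pb, q)
    s_joint = tsallis(joint_independent(pa, pb), q)
    rule = sa + sb + (1 - q) * sa * sb
    # s_joint and rule agree; the cross term vanishes only for q = 1
    print(q, s_joint, rule)
```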

For a continuous random variable $X$ with probability density function $p(x)$ on a domain $D \subset \mathbb{R}^3$, it is possible to obtain analogous expressions for the Tsallis entropy (see [39], [40]):

$$S_q = k\,\frac{1 - \int_D [p(x)]^{q}\,dx}{q-1} = -k \int_D [p(x)]^{q} \log_q p(x)\,dx = k \int_D p(x) \log_q \frac{1}{p(x)}\,dx$$

where $\int_D p(x)\,dx = 1$.

Phenomena such as the distribution of motion of cold atoms in dissipative optical lattices, fluctuations of the magnetic field in the solar wind, etc. (see [39], [40]) are examples where there are strong internal correlations. When $p(x_i) = 0$ for some $i$, the corresponding summand $0 \log 0$ is taken to be $0$, consistently with

$$\lim_{p \to 0^+} p \log p = 0$$

5 Kolmogorov-Sinai entropy

One of the key notions in the Ergodic Theory of Dynamical Systems was introduced and developed by Andrei Nikolaevich Kolmogorov in 1953 [30] and completed and improved by his students and by Sinai, Rokhlin, Pinsker and Abramov in the period from 1958 to 1962. It is currently known as Kolmogorov-Sinai entropy (KS-entropy or simply KS). It is related to Shannon entropy but differs greatly from it because it is introduced in a different setting, that of measure-preserving transformations of Dynamical Systems.

It is the entropy of a transformation $f$ which preserves the measure of a probability space given by the triple $(X, \mathcal{B}, m)$, that is, $m(f^{-1}(B)) = m(B)$ for every $B \in \mathcal{B}$. It is introduced in three steps:

1. Entropy of a finite sub-$\sigma$-algebra of $\mathcal{B}$

2. Entropy of a transformation $f$ relative to a finite sub-$\sigma$-algebra of $\mathcal{B}$

3. Finally, the entropy of $f$

Definition 5.1. Let $\mathcal{A}$ be a finite sub-$\sigma$-algebra of $\mathcal{B}$ and $E(\mathcal{A}) = \{A_1, \dots, A_n\}$. The entropy of $\mathcal{A}$ is the number

$$H(\mathcal{A}) = -\sum_{i=1}^{n} m(A_i) \log m(A_i)$$

$H(\mathcal{A})$ measures the uncertainty removed (or the information gained) by performing an experiment whose outcomes are $\{A_1, \dots, A_n\}$.

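Definition 5.2. Let $f : X \to X$ be a measure-preserving transformation of a probability space $(X, \mathcal{B}, m)$. If $\mathcal{A}$ is a finite sub-$\sigma$-algebra of $\mathcal{B}$, then

$$h(f, \mathcal{A}) = \lim_{n \to \infty} \frac{1}{n} H\Big(\bigvee_{i=0}^{n-1} f^{-i}\mathcal{A}\Big)$$

is called the entropy of $f$ relative to $\mathcal{A}$.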

To give this a meaning, we can think of $f$ as the passage from one day to the next. Then $\bigvee_{i=0}^{n-1} f^{-i}\mathcal{A}$ represents the combined experiment obtained by performing the experiment represented by $\mathcal{A}$ on $n$ consecutive days.
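Definitions 5.1 and 5.2 can be made concrete in the simplest case. The sketch below assumes a Bernoulli shift, that is, $m$ is a product measure with symbol probabilities `p`, and takes the partition by the first symbol; under these assumptions (which are not part of the source) each atom of the joined partition is a cylinder of length $n$ whose mass is a product of symbol probabilities. The function names are hypothetical.

```python
import math
from itertools import product

def partition_entropy(masses):
    """H = -sum m(A_i) log m(A_i), with 0 log 0 taken as 0."""
    return -sum(m * math.log(m) for m in masses if m > 0)

def joined_partition_masses(p, n):
    """Masses of the atoms of the join of f^{-i}A, i = 0..n-1, for a Bernoulli shift.

    Assumption: product measure with symbol probabilities p and A the partition
    by the first symbol, so every atom is a cylinder of length n.
    """
    return [math.prod(word) for word in product(p, repeat=n)]

p = [0.5, 0.25, 0.25]   # illustrative symbol probabilities
for n in (1, 2, 4, 8):
    h_n = partition_entropy(joined_partition_masses(p, n)) / n
    print(n, h_n)        # the quotient H/n does not depend on n in this example
```

For independent experiments the quotient is constant in $n$, so the limit in Definition 5.2 is simply the entropy of a single day's experiment.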

Let $f : X \to X$ be a measure-preserving transformation of a probability space $(X, \mathcal{B}, m)$. Then

$$h(f) = \sup h(f, \mathcal{A})$$

where the supremum is taken over all finite sub-$\sigma$-algebras $\mathcal{A}$ of $\mathcal{B}$, is called the entropy of $f$.

If we think of $f$ as the passage from one day to the next, then $h(f)$ is the maximum average information per day obtained by performing the same experiment daily.

5.1 On topological entropy

Let $(X, f)$ be a topological dynamical system, that is, let $X$ be a non-empty Hausdorff space and $f : X \to X$ a continuous map. In this setting the topological entropy is a number that measures the complexity of the system, that is, how complicated the behavior of the orbits of all points in the space is. Roughly speaking, it measures the exponential growth rate of the number of orbits that can be distinguished as time goes on.

The original definition of topological entropy was introduced by Adler, Konheim and McAndrew (see [1]). Their idea was to assign to every open cover of a compact space a number measuring its size, and it was clearly inspired by a paper of Kolmogorov and Tikhomirov of 1961. Once this initial notion is clarified, the rest is an imitation of the Kolmogorov-Sinai definition.

If the space $X$ is metric but not necessarily compact, a new definition was introduced by Bowen in 1971 and independently by Dinaburg in 1970. This definition uses the notion of $\varepsilon$-separated points. When the space is compact, Bowen's definition coincides with that of Adler, Konheim and McAndrew. Here we do not go through the formulas, since they have been presented in several papers and books, for example [37], [27] and [17].

What is really interesting and relevant is the relationship between topological entropy and Kolmogorov-Sinai entropy. It is given by the Variational Principle, which was formulated around 1970 by Dinaburg, Goodman and Goodwyn.

Let $f : X \to X$ be a continuous map of a compact metric space to itself. Then

$$h(f) = \sup\{h_\nu(f) : \nu \in M(X, f)\}$$

where $M(X)$ denotes the set of all Borel probability measures on $X$ and $M(X, f)$ the subset of $M(X)$ composed of all measures preserved by $f$.

The following properties supply a method to choose elements of $M(X, f)$.

Let $X$ be a compact metric space. A measure $\nu \in M(X, f)$ is said to be a measure of maximal entropy if

$$h_\nu(f) = h(f)$$

If we denote by $M_{max}(X, f)$ the set of all measures of maximal entropy of $f$, then we have

1. If $h(f) < \infty$, then the extreme points of $M_{max}(X, f)$ are exactly its ergodic members.

2. If $h(f) < \infty$ and $M_{max}(X, f) \neq \emptyset$, then $M_{max}(X, f)$ contains some ergodic measure.

3. If $h(f) = \infty$, then $M_{max}(X, f) \neq \emptyset$.

4. If the entropy map $\nu \mapsto h_\nu(f)$ is upper semicontinuous, then $M_{max}(X, f)$ is non-empty and compact.

5. When $(X, \sigma)$ is a subshift (a symbolic system), that is, $\sigma$ is the shift map restricted to a closed, shift-invariant set $X$ of sequences of symbols from a finite alphabet, then

$$h(\sigma) = \lim_{n \to \infty} \frac{\log |B_n|}{n}$$

where $B_n$ denotes the family of all words of length $n$ of $X$.

6. In subshifts of finite type, the topological entropy coincides with the logarithm of the spectral radius of the transition matrix.

7. For piecewise monotone real maps $f$,

$$h(f) = \lim_{n \to \infty} \frac{\log c_n}{n}$$

where $c_n$ denotes the number of pieces of monotonicity of $f^n$.

8. In most cases introduced by Bowen, the topological entropy can be computed by counting the number of distinct periodic orbits of period $n$, taking logarithms, dividing by $n$ and then taking the upper limit.
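Items 5 and 6 of the list above can be compared on a small subshift of finite type. The sketch below is an illustration under assumptions made here: the transition matrix is that of the golden mean shift (the word 11 is forbidden), natural logarithms are used, and the spectral radius is approximated by a plain power iteration, which is adequate only for primitive non-negative matrices. The function names are hypothetical.

```python
import math

def count_words(A, n):
    """Number of allowed words of length n for the 0-1 transition matrix A."""
    size = len(A)
    v = [1] * size                       # words of length 1, indexed by last symbol
    for _ in range(n - 1):
        v = [sum(v[i] * A[i][j] for i in range(size)) for j in range(size)]
    return sum(v)

def spectral_radius(A, iterations=200):
    """Power iteration; assumes A is non-negative and primitive."""
    size = len(A)
    v = [1.0] * size
    lam = 0.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(size)) for i in range(size)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam

A = [[1, 1],
     [1, 0]]                             # golden mean shift: "11" is forbidden
print(math.log(spectral_radius(A)))      # logarithm of the spectral radius
for n in (5, 10, 20, 40):
    print(n, math.log(count_words(A, n)) / n)   # word-counting estimate
```

The word-counting quotient approaches the logarithm of the spectral radius slowly, with an error of order $1/n$.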

Let $F : X \to X$ be a continuous transformation of a compact metric space $(X, d)$. Let $C(X, \mathbb{R})$ denote the Banach algebra of real-valued continuous functions with the supremum norm. The topological pressure of $F$ is a map

$$P(F, \cdot) : C(X, \mathbb{R}) \to \mathbb{R}$$

having good properties relative to the structures of $C(X, \mathbb{R})$. It contains topological entropy in the sense that $P(F, 0) = h(F)$, where $0$ denotes the zero function of $C(X, \mathbb{R})$.

Here, a generalization of the Variational Principle can be proved, and sometimes it gives a natural way of choosing important members of $M(X, F)$. In this theory ideas from mathematical statistical mechanics are used, and the theory has other applications in differentiable dynamical systems. For $g \in C(X, \mathbb{R})$, $n \geq 1$ and $\varepsilon > 0$ put

$$P_n(F, g, \varepsilon) = \sup\Big\{\sum_{x \in E} e^{(S_n g)(x)} : E \text{ is an } (n, \varepsilon)\text{-separated subset of } X\Big\}$$

$$(S_n g)(x) = \sum_{i=0}^{n-1} g(F^i x)$$

Finally

$$P(F, g, \varepsilon) = \limsup_{n \to \infty} \frac{1}{n} \log P_n(F, g, \varepsilon)$$

and the pressure is $P(F, g) = \lim_{\varepsilon \to 0} P(F, g, \varepsilon)$.

In the case of a symbolic system, the entropy can be interpreted in terms of information content (see [? ])

Let $(X, \sigma)$ be a subshift on finitely many symbols with entropy $h(\sigma)$. Let $M$ be the smallest integer strictly larger than $2^{h(\sigma)}$. Then $(X, \sigma)$ is conjugate (via a sliding block code) to a subshift $(Y, \sigma)$ on $M$ symbols, and $M$ is the smallest such number (with the exception of the case when $(X, \sigma)$ is conjugate to the full shift on $M$ symbols, in which case $M = 2^{h(\sigma)}$).

In other words, $h(\sigma)$ informs on the minimal number of symbols sufficient to encode the system in real time (that is, without rescaling the time). If the original subshift uses some $Q > M$ symbols, it can be compressed (by reducing the alphabet to $M$ symbols) without loss of information. Interpretations of topological entropy in other settings are:

1. Topological entropy for flows

Given a flow $\Phi : \mathbb{R} \times X \to X$, its topological entropy is defined as the entropy of the time-one map

$$h(\Phi) = h(\Phi_1)$$

Since for flows the notion of equivalence, which admits rescaling of time, is more natural than that of conjugacy, topological entropy essentially distinguishes only among flows with zero, positive finite and infinite entropy.

2. Topological entropy can also be defined for actions of more general groups, like $\mathbb{Z}^d$, $\mathbb{R}^d$ and, more generally, amenable groups or even amenable semigroups. The Variational Principle holds in those cases.

5.2 Entropy for non-autonomous systems

Such systems are described by the pair $(X, f_\infty)$, where $f_\infty = (f_n)_{n=0}^{\infty}$, $f_n \in C(X, X)$ for all $n$, and $X$ is a metric space. In this formulation they were introduced in the nineties by Kolyada and Snoha in [29], and they are currently an active topic of research.

For the next definitions it is convenient to introduce the notation

$$f_i^n = f_{i+(n-1)} \circ f_{i+(n-2)} \circ \cdots \circ f_{i+1} \circ f_i$$

with $i \geq 0$, $n > 0$ and $f_i^0$ the identity on $X$, and the trajectory $Tr_{f_\infty}(x_0) = (f_0^n(x_0))_{n=0}^{\infty} = (x_n)_{n=0}^{\infty}$. For any pair $x, y \in X$ and $n > 0$ we define

$$\rho_n(x, y) = \max_{j=0,\dots,n-1} d(f_0^j(x), f_0^j(y))$$

A set $E \subset X$ is $(n, \varepsilon, f_\infty)$-separated if $\rho_n(x, y) > \varepsilon$ for every pair of distinct points $x, y \in E$.

The topological entropy of a non-autonomous system, $h(f_\infty)$, is given by

$$h(f_\infty) = \lim_{\varepsilon \to 0} \Big(\limsup_{n \to \infty} \frac{1}{n} \log s_n(f_\infty, \varepsilon)\Big)$$

where $s_n(f_\infty, \varepsilon)$ denotes the maximal cardinality of an $(n, \varepsilon, f_\infty)$-separated set.
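The role of separated sets can be illustrated with a rough numerical experiment. The sketch below is not a method from the source: it builds an $(n, \varepsilon)$-separated set greedily from a uniform grid on $[0, 1]$, which only gives a lower bound for the maximal cardinality, and it compares two hypothetical sequences of maps chosen here (the tent map repeated, and the identity repeated). The grid size, $\varepsilon = 0.1$ and all names are assumptions.

```python
def trajectory(maps, x, n):
    """First n points x_0, ..., x_{n-1} of the trajectory, with x_{i+1} = f_i(x_i)."""
    points = [x]
    for i in range(n - 1):
        points.append(maps(i)(points[-1]))
    return points

def greedy_separated_count(maps, n, eps, grid=2000):
    """Size of an (n, eps)-separated set picked greedily from a grid on [0, 1].

    This is only a lower bound for the maximal cardinality of such sets.
    """
    chosen = []
    for j in range(grid + 1):
        t = trajectory(maps, j / grid, n)
        if all(max(abs(a - b) for a, b in zip(t, u)) > eps for u in chosen):
            chosen.append(t)
    return len(chosen)

def tent(x):
    return 2 * x if x <= 0.5 else 2 * (1 - x)

constant_tent = lambda i: tent             # autonomous case: f_n = tent for all n
identity_seq = lambda i: (lambda x: x)     # f_n = identity for all n

for n in range(1, 7):
    print(n,
          greedy_separated_count(constant_tent, n, 0.1),
          greedy_separated_count(identity_seq, n, 0.1))
```

The count for the identity sequence does not change with $n$, while for the tent map it grows quickly, in line with zero and positive entropy respectively.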

When $f_n = f$ for all $n$, the former formula is Bowen's expression for the topological entropy of $f$. The following result was proved by Kolyada and Snoha in [29].

Let $X$ be a compact metric space and $(X, f_\infty)$ a non-autonomous system such that $f_n$ converges uniformly to a map $f : X \to X$. Then $h(f_\infty) \leq h(f)$.

The next result shows that uniform convergence is an essential assumption in the former result. For every $f \in C(I)$ there exists a non-autonomous system $(I, f_\infty)$ such that $f_n$ converges pointwise to $f$ and $h(f_\infty) = \infty$ (see [4]).

C. Kawan in [28] has introduced the notion of metric entropy for such systems, generalizing the classical metric entropy of Kolmogorov-Sinai. It is related via a variational inequality to the topological entropy introduced before. Moreover, this metric entropy shares several properties with that of Kolmogorov-Sinai. In particular, it is invariant with respect to appropriately defined isomorphisms, and it satisfies a power rule and a Rokhlin-type inequality.

1. Sequence topological entropy

2. Topological entropy in the setting of fuzzy spaces

3. Algebraic topology

6 Topological entropy for discontinuous maps

The topological entropy is difficult to compute if we use the former definition or any equivalent one, for example that defined by $(n, \varepsilon)$-spanning sets (see [12]). Independently of the definition, in all cases we must calculate a limit to compute $h(f)$ exactly.

When we deal with discontinuous maps, the former approaches are not possible, and it is necessary to introduce new notions of entropy in such a way that, when applied to the continuous case, they coincide with the former notions. In the cases considered in this paper, we always suppose that it is possible to construct a symbolic dynamical system associated to the original system with discontinuities, in which the influence of such discontinuities is taken into consideration. This is reflected in the existence of the forbidden set of the system, that is, the set of preimages of all points of discontinuity. The orbits of such points are not defined and, as a consequence, must be excluded from the symbolic description. The corresponding shift space is then a closed subset of a full shift on a finite number of symbols [31], which will be the usual situation we deal with.

We now explain this process. Let $A$ be a finite set of symbols and $X = A^{\mathbb{Z}}$ the full shift on it (the set of all bi-infinite sequences of symbols of $A$). Consider the shift $(X, \sigma_X)$ and the corresponding shift system $(X_F, \sigma|_{X_F})$, where $X_F \subset X$ is closed and the bi-infinite sequences belonging to it do not contain the blocks (words) associated to the forbidden set of the system. In most cases it is easier to describe a shift space by specifying which finite blocks of symbols ($n$-blocks) are allowed, rather than which are forbidden.

If $x \in A^{\mathbb{Z}}$ and $w$ is a block over $A$, then $w$ occurs in $x$ if there are indices $i \leq j$ such that $w = x_i x_{i+1} \dots x_j$. Let us denote by $B_n(X)$ the set of all $n$-blocks that occur in points of $X$. The number $|B_n(X)|$ of $n$-blocks appearing in points of $X$ gives us an idea of the dynamical complexity of $X$. Instead of using the numbers $|B_n(X)|$ for $n \in \mathbb{N}$, it is better to consider their growth rate. In [31] it is proved that $|B_n(X)|$ behaves approximately like $2^{kn}$, the constant $k$ being such growth rate. It is also proved that $k$ can be computed approximately by $(1/n)\log_2 |B_n(X)|$ when $n$ is large (from now on $\log_2$ will be denoted simply by $\log$). In the following we introduce a definition of entropy that is very useful in the setting of shift spaces.

Let $X$ be a shift space. The entropy of $X$ is

$$h(X) = \lim_{n \to \infty} \frac{1}{n} \log |B_n(X)|$$

When $X$ is a shift space with shift map $\sigma_X$, the topological entropy $h(\sigma_X)$ equals $h(X)$. To see it, consider the shift space $X$ with the metric $\rho(x, y) = \sum_{n \in \mathbb{Z}} 2^{-|n|}\,\delta(x_n, y_n)$ for $x, y \in X$, where $\delta(a, b) = 0$ if $a = b$ and $1$ otherwise. If we apply the formula of topological entropy given before to $\sigma_X$, it is easy to see that $r_n(\sigma_X, 2^{-k}) = |B_{n+2k}(X)|$. From this and such formula we obtain that $h(\sigma_X) = h(X)$.
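The block-counting definition of $h(X)$ can be tried directly on a shift described by forbidden blocks. The sketch below is an illustration only: the alphabet, the forbidden word 11 (the golden mean shift) and the function name are choices made here. In this example every block avoiding 11 extends to a bi-infinite point, so the count equals $|B_n(X)|$; for a general list of forbidden blocks the count is only an upper bound.

```python
import math
from itertools import product

def allowed_blocks(alphabet, forbidden, n):
    """All n-blocks over `alphabet` containing no forbidden word as a sub-block."""
    blocks = []
    for word in product(alphabet, repeat=n):
        s = "".join(word)
        if not any(f in s for f in forbidden):
            blocks.append(s)
    return blocks

alphabet = "01"
forbidden = ["11"]                      # illustrative choice: golden mean shift
for n in (2, 4, 8, 12, 16):
    count = len(allowed_blocks(alphabet, forbidden, n))
    print(n, count, math.log2(count) / n)   # (1/n) log_2 |B_n(X)|
```

Brute-force enumeration grows exponentially with $n$, so this approach is only practical for short blocks.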

Next we consider several cases of discontinuous real maps and recall the different techniques used to give sense to the computation of the entropy.

6.1 On Lorenz maps

In what follows, we denote the unit real interval $[0,1]$ by $I$. Lorenz maps arise when we consider return maps to a cross section of semiflows on two-dimensional branched manifolds. The point $c$ appears because there are points from which flow lines never return to $I$ (see the description given in [19], [20] or [13]).

Definition 6.1. A map $f : I \to I$ is a Lorenz map if there is a point $c \in (0,1)$ such that

1. $f$ is continuous and strictly increasing on $[0, c)$ and $(c, 1]$

2. $\lim_{x \to c^-} f(x) = 1$ and $\lim_{x \to c^+} f(x) = 0$

If the set of preimages of $c$ is dense in $I$, then, under such a topological expanding condition, the map is called an expansive Lorenz map. When a Lorenz map is not expansive, it must have homtervals ($H$ is a homterval if $f^n|_H$ is a homeomorphism for every $n \in \mathbb{N}$ or, equivalently, the kneading sequence is constant; see [33]). It is evident that a Lorenz map is discontinuous at $c$.

To calculate the entropy, we use a cutting invariant approach (see for example [19]). It is evident that the set composed of $c$ and all its preimages belongs to the forbidden set $F(f)$ of the map, that is, $y \in F(f)$ if $Orb_f(y)$ is not defined. This happens because of the discontinuity of the map. For each $n \in \mathbb{N}$ we put

$$A_n = \{x \in I : f^n(x) = c\}$$

Given any finite union of closed intervals $Z \subset I$, we introduce the $n$th cutting number of $f$ on $Z$ (denoted by $\gamma_n^Z$) by

$$\gamma_n^Z = \mathrm{card}\{A_n \cap \mathrm{Int}(Z)\}$$

This number measures how many times $Z$ is cut up by the discontinuity under the iterates of $f$. If, in addition, $f(Z) \subset Z$, then we introduce an entropy for this case by

$$h(f|_Z) = \log s_Z = \log\Big(\limsup_{n \to \infty} (\gamma_n^Z)^{1/n}\Big)$$

where $s_Z$ is the growth rate of the number of points of the forbidden set of $f$ in $Z$.

This growth rate is the reciprocal of the radius of convergence of the formal power series

$$\sum_{n \geq 0} \gamma_n^Z x^n$$

which is called the cutting invariant of $f$ on $Z$. By the definition of Lorenz maps, we must have $s_Z \leq 2$ for any closed set $Z$, since each point has at most two preimages in $I$.
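The cutting numbers can be computed exactly for a simple Lorenz map. The sketch below takes, as an assumption made here and not as an example from the source, the doubling map $f(x) = 2x$ on $[0, c)$ and $f(x) = 2x - 1$ on $(c, 1]$ with $c = 1/2$, and $Z = I$. The sets of preimages of $c$ are built with exact rational arithmetic; all function names are hypothetical.

```python
from fractions import Fraction
import math

C = Fraction(1, 2)                      # discontinuity point of the doubling map

def preimages(y):
    """Preimages of y under f(x) = 2x on [0, c) and f(x) = 2x - 1 on (c, 1]."""
    result = []
    if 0 <= y < 1:
        result.append(y / 2)            # left branch maps [0, c) onto [0, 1)
    if 0 < y <= 1:
        result.append((y + 1) / 2)      # right branch maps (c, 1] onto (0, 1]
    return result

def cutting_sets(n_max):
    """List [A_0, ..., A_n_max] with A_0 = {c} and A_n the preimages of A_{n-1}."""
    sets = [{C}]
    for _ in range(n_max):
        sets.append({x for y in sets[-1] for x in preimages(y)})
    return sets

def cutting_number(A_n, a=Fraction(0), b=Fraction(1)):
    """Number of points of A_n in the interior of Z = [a, b]."""
    return sum(1 for x in A_n if a < x < b)

for n, A_n in enumerate(cutting_sets(10)):
    gamma = cutting_number(A_n)
    if n > 0:
        print(n, gamma, math.log(gamma) / n)   # estimate of log s_Z
```

For this map the cutting numbers double at every step, so the estimate equals $\log 2$, the largest value allowed by the bound on $s_Z$.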

If in the definition of a Lorenz map we add $f(c) = b$ with $0 < b < 1$, then the forbidden set is empty, since the images of every $x \in I$ exist. Nevertheless, the computation of the entropy is the same as in the previous case, since the shift space to be considered is the same once the discontinuity is taken into account.

In the setting of Lorenz maps, more has been obtained. For continuous unimodal maps, the topological entropy is related to the smallest positive zero $s$ of the kneading invariant of $f$ by the formula

$$h(f) = \log(1/s)$$

(see for example [31]), and for each basic set $\Omega_i$ of positive entropy in the renormalization decomposition of the non-wandering set of $f$, there is a real zero $s_i$ of the kneading invariant such that $h(f|_{\Omega_i}) = \log(1/s_i)$. In the setting of Lorenz maps we obtain a similar result, except that it is possible to obtain two basic sets in the renormalization decomposition having the same entropy. In this case it is proved in [19] that a double zero appears in the kneading invariant.

As an easy example let us define the map

is a Lorenz map but it is not expansive since the preimages of | is not dense in I, in other words, F(f) is composed of two convergent sequences of points convergent to 0 and 2. It is easy to see that in such case, using the cutting method, h(f) = log2. Using the symbolic approach, the shift space of this problem is the full shift associate to three symbols {L, C, R}

In [13] a class of Lorenz maps called contractive Lorenz maps has been introduced, whose dynamics is different from the expansive case and more complicated. A graph of one such map can be seen in Table 1. That paper is a deep analysis of the dynamics of such maps, including the way in which the non-wandering set associated to them decomposes.

It is a pending task to compute the entropy of contractive Lorenz maps in the line of the Kneading Theory, completing the results of [19].

6.2 Other discontinuous maps

Now we introduce some families of non-Lorenz maps, even with two points of discontinuity

6.2.1 Piecewise parabolic and piecewise linear real maps

Let $f : I \to I$ be the following family of discontinuous piecewise parabolic maps

Since the members of this family are not Lorenz maps, their dynamical behavior is different, and so is the method to calculate the topological entropy.

Using the kneading theory of [33], several algorithms have appeared to calculate the topological entropy of continuous piecewise monotone maps of the interval (see [16], [6], among others). In the case of piecewise linear Markov maps, it can be seen ([5]) that the topological entropy is the logarithm of the maximal eigenvalue of an induced matrix composed only of zeros and ones, which record whether or not the images of the subintervals of the Markov partition intersect each subinterval. This idea allows us to compute the topological entropy of general piecewise monotone maps by approximation with piecewise linear Markov maps. In fact, if $f$ is piecewise monotone and $(g_n)_{n=1}^{\infty}$ is a sequence of piecewise linear Markov maps, each with the same pieces of monotonicity as $f$ and approaching it in the $C^0$-topology, then in [23] it is proved that

$$h(f) = \lim_{n \to \infty} h(g_n)$$

When $f : I \to I$ is discontinuous with a finite number of discontinuity points, the definition of entropy for such a map, introduced in [23], is the following: $f$ has topological entropy $h(f)$ if there is a sequence of piecewise linear Markov maps $(g_n)_{n=1}^{\infty}$, all with the same number of pieces of continuity as $f$ and converging to $f$ in the $C^0$-topology, such that $(\log \lambda_n)_{n=1}^{\infty}$ converges to $h(f)$, where each $\lambda_n$ is the maximal eigenvalue of the induced matrix associated to $g_n$.

At this point, it is necessary to prove that the above notion is well defined and that, from the dynamical point of view, it is invariant under topological conjugacy. Both things are done in [23]. In the case of piecewise parabolic maps, the piecewise linear Markov maps used as approximations must also be discontinuous. One possibility is the construction of the maps shown in center of. In each step, the topological entropy is given by the exponential rate of increase with $n$ of the monotone upper bounds of any of these quantities:

1. the number of turning points of $f^n$

2. the length of the graph of $f^n$ and

3. the number of periodic points of $f^n$

(see [32]).

A paradigmatic family is that of tent maps, given by

$$f(x) = \begin{cases} r x, & \text{if } 0 \leq x \leq \tfrac{1}{2} \\ r(1 - x), & \text{if } \tfrac{1}{2} < x \leq 1 \end{cases}$$

Applying the criterion of the length of the graph of $f^n$, the topological entropy is $\log r$.
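The growth of the number of monotone pieces of $f^n$ can be observed numerically. The sketch below is an illustration under assumptions made here: it uses the full tent map with slope $r = 2$, samples $f^n$ on a dyadic grid (on which the slope-2 tent map is evaluated without rounding error), and counts changes of direction. The grid resolution and the function names are hypothetical choices, and the grid must be fine enough to resolve every lap.

```python
import math

def tent(x, r=2.0):
    """Tent map with slope r on [0, 1]."""
    return r * x if x <= 0.5 else r * (1 - x)

def count_laps(f, n, grid_bits=14):
    """Count the monotone pieces of f^n by sampling on a dyadic grid.

    Assumption: the grid is fine enough to resolve every lap of f^n.
    """
    size = 2 ** grid_bits
    values = []
    for j in range(size + 1):
        x = j / size
        for _ in range(n):
            x = f(x)
        values.append(x)
    laps, direction = 0, 0
    for a, b in zip(values, values[1:]):
        step = (b > a) - (b < a)        # +1 rising, -1 falling, 0 flat
        if step != 0 and step != direction:
            laps += 1
            direction = step
    return laps

for n in range(1, 9):
    c_n = count_laps(tent, n)
    print(n, c_n, math.log(c_n) / n)    # lap count and (1/n) log c_n
```

For slope 2 the lap count doubles with every iterate, so the quotient equals $\log 2$ for each $n$.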

In [8] an algorithm with fast convergence is developed, which provides upper and lower bounds of the topological entropy. This method can be used in the approximation by Markov maps of general piecewise monotone maps, including the discontinuous cases. The method is based on the construction of topological transfer matrices $M_f$ associated to the map $f$.

Let $x_0 = 0 < x_1 < \dots < x_{p+1} = 1$ be a partition of the unit interval. The interval $[x_0, x_{p+1}]$ is divided into subintervals $I_i = [x_i, x_{i+1}]$ with $i = 0, 1, \dots, p$. The subintervals $I_i$ are not necessarily laps, and some of them can contain more than one lap. We then add the turning points to the former partition and denote by $I_i^l$ the $l$th lap contained in $I_i$. In [8] the term $m_{ij}$ is introduced as

$$m_{ij} = \sum_{l} \frac{|f(I_i^l) \cap I_j|}{|I_j|}$$

where $|I_i|$ denotes the length of the interval $I_i$. The term $m_{ij}$ is the proportion in which the image of $I_i$ covers $I_j$. In the case of the family of sawtooth maps with slopes $\pm s$, it is immediate that the matrix $M = (m_{ij})$ has $s$ as an eigenvalue, which is the antilogarithm of the topological entropy corresponding to the partition of $I$ given by the three intervals $I_1, I_2$. This conclusion comes from the application of the Perron-Frobenius theorem, which ensures that such eigenvalue is maximal (see [18]).

In most cases, the matrix $M$ can be complicated due to multiple coverings, and obtaining the maximal eigenvalue can be hard. An easier case is when the partition is Markov, or when we have a Markov piecewise monotone map and a Markov partition can be chosen composed of the turning points and their orbits. In this case the terms $m_{ij}$ of $M$ are only zeros and ones, and finding the maximal eigenvalue is less difficult (see for example [25]). When the map has points of finite discontinuity, such points have to be elements of the partition.
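The Markov case can be sketched in a few lines. The example below is hypothetical and chosen here for illustration: the map is $x + 1/2$ on $[0, 1/2]$ and $2(1 - x)$ on $[1/2, 1]$, with the partition given by the points $0, 1/2, 1$, each piece being a single lap. The entries are computed as the covered proportion of each subinterval, and the largest eigenvalue is approximated by a plain power iteration, which assumes a primitive non-negative matrix.

```python
import math

def transfer_matrix(branches, points):
    """Covered proportions m_ij for a map that is monotone on each I_i.

    `branches[i]` is the restriction of the map to I_i = [points[i], points[i+1]].
    """
    p = len(points) - 1
    M = [[0.0] * p for _ in range(p)]
    for i in range(p):
        ends = (branches[i](points[i]), branches[i](points[i + 1]))
        lo, hi = min(ends), max(ends)   # image of a monotone piece is an interval
        for j in range(p):
            overlap = min(hi, points[j + 1]) - max(lo, points[j])
            M[i][j] = max(0.0, overlap) / (points[j + 1] - points[j])
    return M

def largest_eigenvalue(M, iterations=500):
    """Power iteration; assumes M is non-negative and primitive."""
    v = [1.0] * len(M)
    lam = 0.0
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(len(v))) for row in M]
        lam = max(w)
        v = [x / lam for x in w]
    return lam

# Hypothetical piecewise linear Markov map on [0, 1].
points = [0.0, 0.5, 1.0]
branches = [lambda x: x + 0.5, lambda x: 2 * (1 - x)]
M = transfer_matrix(branches, points)
print(M, math.log(largest_eigenvalue(M)))   # 0-1 matrix and entropy estimate
```

For a non-Markov partition the same routine returns fractional entries, and the logarithm of the largest eigenvalue is then only an approximation.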

When we are dealing with non-Markovian maps, the matrix $M$ can have elements $m_{ij}$ different from zero and one, and most of them may take such values. The effect on the former computation is that $\max\{0, \log \lambda\}$, where $\lambda$ denotes the largest eigenvalue of $M$, is only an approximation of the topological entropy. The error decreases with the number of subintervals of the partition. What is more interesting is that we can introduce bounds of the entropy by choosing the partition adequately.

To construct them, we take the subcolumn $(m_{ij}, m_{i+1,j}, \dots, m_{i+k,j})^t$ of length $k$ and replace it by a $k$-simplex, choosing a set of points $(P_0, \dots, P_k) \in \mathbb{R}^{k+1}$ satisfying $P_n \geq 0$ and $\sum_{n=0}^{k} \|P_n\| = 1$. The end points of the simplex are the points $P_0 = (1, 0, \dots, 0)^t$, $P_1 = (0, 1, \dots, 0)^t$, ..., $P_k = (0, 0, \dots, 1)^t$. For $k = 0$ we choose $P_0 = (0, 0, \dots, 0)$ and $P_1 = (1, 1, \dots, 1)$.

Applying this procedure, we replace each subcolumn of $M$ following the above rules, choosing the points $P_n$ adequately, and then compute the largest eigenvalue of all the matrices so obtained. We get a set of numbers whose extrema give lower and upper bounds of the largest eigenvalue of $M$ given by the Perron-Frobenius theorem. As a consequence of the Milnor-Thurston kneading theory ([33]), such extrema are in turn lower and upper bounds of the entropy. These bounds can be improved by using sufficiently long orbits of the turning points of the partitions we have considered, and by including in the partition the discontinuity points of the map (if there are any). In all cases, we must justify that the procedures are consistent.

In references [3], [6], [8], [16], [23], [25], [31] and [34] it is possible to see computations for different particular examples, using the former methods.

6.2.2 Discontinuous maps with more than one point of discontinuity

In the more general setting we can have maps with a finite number of discontinuity points of finite jump and a finite number of turning points. The procedures of the former paragraph can be applied, but the computation can be very difficult. Therefore some simplifications are valuable, for example along the lines of what is done in [3] for continuous maps with a finite number of turning points. If symbolic methods are used, the discontinuity points must always be points of the corresponding partitions. For a treatment of odd discontinuous maps see [34], and for the graph of an example of a symmetric discontinuous piecewise polynomial map see Figure 1. It remains a pending task to introduce an adequate notion of entropy for maps with discontinuities with infinite jumps.

6.2.3 Transitive discontinuous real maps

It is well known that if a continuous interval map $f : I \to I$ is transitive (recall that this is the case when for every pair of non-empty open sets $U, V$ there exists $n \in \mathbb{N}$ such that $f^n(U) \cap V \neq \emptyset$), then $h(f) \geq \frac{1}{2}\log 2$ (see [7]). The proof of this result is based on the fact that when $X$ is a Baire separable metric space and $f$ is continuous, transitivity implies the existence of a point $x \in X$ whose orbit is dense in $X$. When $f$ is discontinuous we cannot ensure the validity of this implication. Nevertheless, we have the following result, whose proof can be seen in [35].

Let $X$ be a Baire separable metric space and $f : X \to X$ a transitive map with a unique point of discontinuity. Then there is $x \in X$ such that $Orb_f(x)$ is dense in $X$.

Moreover, in [35] it is proved that it is possible to construct counterexamples: transitive maps having two points of discontinuity for which the above property of having a point with dense orbit fails.

7 Conclusions

In this paper we have reviewed the physical origins of the notion of entropy and its different meanings, mainly through the work of Clausius and Boltzmann. With such ideas in mind, the Shannon and Kolmogorov-Sinai entropies were introduced in the mathematical setting, followed by the important notion of topological entropy and its variants. In the last sections we have stated the necessity of extending such notions to non-autonomous systems and to discrete dynamical systems described by means of discontinuous maps.

Acknowledgement: The research has been supported by the Proyecto MTM2014-51891-P from Spanish MINECO and from Proyecto Séneca de la Comunidad Autónoma de la Región de Murcia, 19294/PI/14.

References

[1] Adler R.L., Konheim A.G. and McAndrew M.H., Topological entropy, Trans.Am.Math.Soc. 114 (1965), 309-319.

[2] Alsedà L., Llibre J. and Misiurewicz M., Combinatorial dynamics and entropy in dimension one, Advances Series in Nonlinear Dynamics, World Scientific, Singapore (1993).

[3] Amigó J.M. and Giménez A. A simplified algorithm for the topological entropy of multimodal maps, Entropy, 16(2), (2014), 627-644.

[4] Balibrea F. and Oprocha P., Weak mixing and chaos in nonautonomous discrete systems, Applied Mathematical Letters, 25 (2012), 1135-1141.

[5] Block L., Guckenheimer J., Misiurewicz M. and Young L.S. Periodic orbits and topological entropy of one-dimensional maps Global Theory of Dynamical Systems, Lecture Notes in Math.,vol 819, Springer-Verlag, New York, (1980), 18-34.

[6] Block L., Keesling J., Li S. and Peterson K. An Improved Algorithm for Computing Topological Entropy, Journal of Statistical Physics, 55(5/6), (1989), 929-939.

[7] Blokh A., On sensitive mappings of the interval, Russian Math.Surveys, 37:2 (1982), 189-190.

[8] Balmforth N.J. and Spiegel E.A. Topological entropy of one-dimensional maps: approximations and bounds, Physical Review Letters, vol 72, number 1, (1994), 80-83.

[9] Boltzmann L. Weitere Studien über das Wärmegleichgewicht unter Gasmolekülen, Wiener Berichte 66, (1872), 275-370.

[10] Boltzmann L. Bemerkungen über einige Probleme der mechanischen Wärmetheorie, Wiener Berichte, 75: 62-100; in WA II, paper 39 (1877).

[11] Boltzmann L. Über die Beziehung zwischen dem zweiten Hauptsatze der mechanischen Wärmetheorie und der Wahrscheinlichkeitsrechnung respektive den Sätzen über das Wärmegleichgewicht, Wiener Berichte, 76: 373-435; in WA II, paper 42 (1877).

[12] Bowen R., Entropy for group endomorphisms and homogeneous spaces, Trans.Am.Math.Soc. 153, 401-414 (1971); Errata: 181 (1973), 509-510.

[13] Brandão P. On the structure of Lorenz maps, arXiv:1402.2862v1 [math.DS], 12/02/2014.

[14] Carnot S., Reflections on the Motive Power of Fire and on Machines Fitted to Develop that Power, Paris: Bachelier. French title: Réflexions sur la puissance motrice du feu et sur les machines propres à développer cette puissance.

[15] Clausius R., Über die bewegende Kraft der Wärme, Part I, Part II, Annalen der Physik, 79, (1850) 368-397, 500-524. English translation: On the moving Force of Heat and the Laws regarding the Nature of Heat itself which are deducible therefrom, Phil.Mag. (1851), 2, 1-21, 102-119.

[16] Collet P., Crutchfield P. and Eckmann J.P., Computing the topological entropy of maps, Communications in Mathematical Physics, 88(2), (1983), 257-262.

[17] Downarowicz T. Entropy in Dynamical Systems, New Mathematical Monographs, number 18,Cambridge University Press (2011)

[18] Gantmacher F.R. Theory of Matrices, Vol. I, Chelsea Publishing Co. (1959).

[19] Glendinning P. and Hall T., Zeros of the kneading invariant and topological entropy for Lorenz maps, Nonlinearity 9 (1996), 999-1014.

[20] Gomes P., Franco N. and Silva L., Syllable Permutations and Hyperbolic Lorenz Knots, to appear in Applied Mathematics and Information Sciences.

[21] Gressman P. and Strain M., Global existence of classical solutions and rapid time decay, Proceedings of the National Academy of Sciences of the U.S.A., Vol 107, no. 13, (2010), 5744-5749.

[22] Gressman P. and Strain M., Global classical solutions of the Boltzmann equation with angular cut-off, J.Amer.Math.Soc. 24 (2011), 771-847.

[23] Góra P. and Boyarsky A., Computing the topological entropy of general one-dimensional maps, Trans.Am.Math.Soc., 323,1, (1991), 39-49.

[24] Hasselblatt B. and Katok A., Principal structures in Handbook of Dynamical Systems, North-Holland, Amsterdam IA, (2002), 1-208.

[25] Hsu C.S. and Kim M.C., On topological entropy, Phys. Rev. A, 31, (1985), 3253-3260.

[26] Katok A., Fifty years on entropy in dynamics: 1958-2007, Journal of Modern Dynamics, Volume 1, No. 4, (2007), 545-596.

[27] Katok A. and Hasselblatt B., Introduction to the modern theory of dynamical systems, Cambridge University Press, Cambridge, (1995).

[28] Kawan C. Metric entropy of non-autonomous dynamical systems, arXiv:1304.5682v2 [math.DS] (2013).

[29] Kolyada S. and Snoha L., Topological entropy of non-autonomous dynamical systems, Random and Computational Dynamics, 4, (1996), 205-233.

[30] Kolmogorov A.N., On dynamical systems with an integral invariant on the torus, Doklady Akademii Nauk. SSSR (N.S.), 93 (1953), 763-766.

[31] Lind D. and Marcus B. An Introduction to Symbolic Dynamics and Coding, Cambridge University Press, reprinted in (1999).

[32] Misiurewicz M. and Szlenk W. Entropy of piecewise monotone mappings, Studia Math. 67 (1980), 45-63.

[33] Milnor J. and Thurston W. Dynamical Systems, Lecture Notes in Mathematics 1342, Edited by A.Dold and B.Eckmann, Springer-Verlag (1988).

[34] Oliveira H., Symbolic dynamics of odd discontinuous bimodal maps, to appear in Applied Mathematics and Information Sciences.

[35] Peris. A. Transitivity, dense orbits and discontinuous functions, Bull. Belg. Math. Soc. 6(1999), 391-394.

[36] Shannon C. A Mathematical theory of communication, Bell System Tech., (1948), 379-423, 623-656.

[37] Walters P. An Introduction to Ergodic Theory, Springer Graduate Texts in Math., 79, New-York, (1982).

[38] Smorodinsky M. Information, entropy and Bernoulli systems, Development of mathematics 1950-2000, Birkhäuser, Basel (2000).

[39] Tsallis C., Introduction to Non-extensive Statistical Mechanics - Approaching a Complex World, Springer-Verlag, New-York (2009).

[40] Tsallis C., The Nonadditive Entropy Sq and Its Applications in Physics and Elsewhere: Some Remarks, Entropy, Vol 13, (2011), 1765-1804.