Scholarly article on topic 'Theoretical studies of HIV-1 reverse transcriptase inhibition'

Theoretical studies of HIV-1 reverse transcriptase inhibition Academic research paper on "Chemical sciences"

Share paper
Academic journal
Phys. Chem. Chem. Phys.
OECD Field of science

Academic research paper on topic "Theoretical studies of HIV-1 reverse transcriptase inhibition"

View Online / Journal Homepage / Table of Contents for this issue

PCCP Dynamic Article Links Q

Cite this: Phys. Chem. Chem. Phys., 2012,14,12614-12624 PAPER

Theoretical studies of HIV-1 reverse transcriptase inhibition

Katarzyna Swiderek,a Sergio Marti6 and Vicent Moliner*6

Received 25th March 2012, Accepted 22nd June 2012 DOI: 10.1039/c2cp40953d

£ Computational methods for accurately calculating the binding affinity of a ligand for a protein

g play a pivotal role in rational drug design. We herein present a theoretical study of the binding of

r^ five different ligands to one of the proteins responsible for the human immunodeficiency virus

cS type 1 (HIV-1) cycle replication; the HIV-1 reverse transcriptase (RT). Two types of approaches

2 are used based on molecular dynamics (MD) simulations within hybrid QM/MM potentials: the

£ alchemical free energy perturbation method, FEP, and the pathway method, in which the ligand

^ is physically pulled away from the binding site, thus rendering a potential of mean force (PMF)

¡3 for the binding process. Our comparative analysis stresses their advantages and disadvantages

and, although the results are not in quantitative agreement, both methods are capable of distinguishing the most and the less potent inhibitors of HIV-1 RT activity on an RNase H site. The methods can then be used to select the proper scaffold to design new drugs. A deeper j; analysis of these inhibitors through molecular electrostatic potential (MEP) and calculation of the

8 binding contribution of the individual residues shows that, in a rational design, apart from the

g strong interactions established with the two magnesium cations present in the RNase H site, it is

g important to take into account interactions with His539 and with those residues that are

>3 anchoring the metals; Asp443, Glu478, Asp498 and Asp549. The MEPs of the active site of

^ the protein and the different ligands show a better complementarity in those inhibitors that

•S present higher binding energies, but there are still possibilities of improving the favourable

la interactions and decreasing those that are repulsive in order to design compounds with

•§ higher inhibitory activity.


Drugs used in medicinal chemistry are small molecules that bind to a protein thus disrupting the protein-protein interactions or inhibiting the reaction carried out by a protein if bonding takes place in the active site. These potential medicines would bind tightly to active sites, with a large negative standard 'free energy of binding', AG°, and exert their biological effect at low doses. A good inhibitor would also present selectivity for the target enzyme, to prevent undesirable side effects that might arise if other enzymes are inhibited.1 As a consequence, predicting reliable absolute binding free energies should provide a guide for rational drug design with an enormous save of time and costs and, in a more general context, can be used to understand the correlation between the structure and function of proteins.2 Computational techniques could be used to pose or dock small molecules into a receptor site of a protein structure with the most stable relative orientation and to predict a binding free energy.3

a Departamento de Química Física, Universität de Valencia,

46100 Burjasot, Valencia, Spain b Departement de Química Física i Analítica, Universitat Jaume I, 12071 Castellón, Spain. E-mail:; Fax: +34 964728066

The computationally discovered compound could be then experimentally tested to confirm whether it shows some activity in an assay measuring biological response and, subsequently, optimized to obtain greater potency and pharmacologically acceptable properties.4

Protein-ligand ''docking'' and ''scoring'' has been an active field of research for the last decade.3'5 Nevertheless, as pointed out by Leach et al.,6 accurate prediction of binding affinities turns out to be genuinely difficult. Even highly accurate computed binding free energies have errors that represent a large percentage of the target free energies of binding.3 In fact, Mulholland and co-workers recently stressed that despite decades of intensive research, no method exists currently that allows, from first principles, the binding free energy of a ligand to a protein to be predicted reliably, accurately, and within a reasonable timescale.7 Current computational chemistry methods devoted to the development of new drugs range from quantitative structure-activity relationship (QSAR) methods8 to those based on Molecular Dynamics (MD) simulations that allow getting a deeper insight into the foundations of the full binding process. In recent years, a number of studies have reported computations of binding free energies of small molecules to proteins using MD simulations with explicit solvent molecules.9 In order to get

reliable binding free energies with MD simulations, two types of approaches are basically used: the alchemical free energy perturbation methods, and the pathway methods. The former, termed in such a way by Gao et al. because the kind of simulations cannot be performed in the laboratory,10 are also called the double annihilation method11 and are based on thermodynamic cycles in which the interaction of the ligand with its surroundings is progressively switched off in both the active site and the bulk solution. This is done through a series of nonphysical (alchemical) over-lapping states using thermodynamic integration (TI) or free energy perturbation (FEP) procedures. FEP/MD applications to biological processes, and in particular to studies of proteins, were first introduced by Warshel and co-workers,12 who obtained absolute binding free energies by means of thermodynamic cycles.13

In the pathway methods the ligand is physically pulled away from the binding site and then the binding free energy can be computed as a potential of mean force (PMF). Van Gunsteren and co-workers reviewed and compared up to twelve different methods to compute a PMF, analyzing all combinations of the type of sampling (unbiased, umbrella-biased or constraint-biased), the way of computing free energies (from density of states or force averaging) and the type of coordinate system (internal or Cartesian) used for the PMF degree of freedom.14 The first problem that arises when computing the binding PMF is the need of transforming the configurational partition function from Cartesian to internal coordinates, which is the case if we are interested in applying restrains to the relative distance between the protein and the ligand to get the PMF associated with the binding process. The correction can be related to Jacobian factors and then the final one dimension PMF can be corrected for the missing Jacobian contribution to get the true PMF energy profile.14-16 Intrinsically, the problem present when using a PMF to determine the standard free energy of binding is that configurational areas at each point along a single reaction coordinate are not equal in size but increasing as 4pr2. A corrected PMF along one-dimensional radial reaction coordinate, r, can be obtained by adding extra terms accounting for the 4pr2 increase in area as the ligand unbinds. Other approaches have emerged to correct the limitations of one dimension PMFs such as the one proposed by Woo and Roux, in which multiple restrains to the ligand are applied.17 Recently, Henchman and co-workers have applied a similar approach to the one of Woo and Roux but using a z-component reaction coordinate to get the PMF and applying orthogonal restrains to this reaction coordinate, thus limiting the sampling required in each window,18 and Colizzi et al. described a new method for predicting protein-ligand binding affinities that works by modelling the force that is required to pull inhibitors out of their complexes with proteins.19 Mulholland and co-workers7 presented a method involving the use of a new reaction coordinate, similar to the method proposed by Helms and Wade20 that swaps a ligand bound to a protein with an equivalent volume of bulk water by using replica exchange thermodynamic integration methods.21

In practice, a critical issue with FEP or PMF/MD simulations is to achieve a sufficient sampling, which can be attacked by means of different approaches such as the Replica Exchange Molecular Dynamics (REMD) based strategies22 which has been shown to improve the statistical convergence in calculations of

absolute binding free energy of ligands to proteins.23 Parrinello and co-workers developed the Metadynamics method,24 and successfully applied in docking ligands on flexible receptors in water solution. Its added value is that it reconstructs the complete free energy surface, including all the relevant minima and the barriers between them.25 More recently, Warshel and co-workers have shown an effective model for high level calculations for configurational sampling, paradynamics,26 that is based on the use of an empirical valence bond (EVB) reference potential.

An alternative to these rigorous but expensive methods is those based on the microscopic linear response approximation (LRA) and related approaches.13'27'28 These include the linear interaction energy (LIE) method,29 where the nonelectrostatic term is estimated by scaling the average van der Waals interactions, or the less accurate semi-macroscopic methods such as the protein dipoles Langevin dipoles (PDLD/S) in its LRA form (PDLD/ S-LRA),13,28 the molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA) or the molecular mechanics/generalized born surface area (MM/GBSA).30 In a recent paper of Singh and Warshel, where a comparative analysis of these virtual screening scoring methods is carried out, it is concluded that although some of these approaches such as the PDLP/S-LRA appear to offer an appealing option, the accurate screening of ligands with similar binding affinities is hard to approach.31 Genheden and Ulf have also performed an interesting analysis of the efficiency of the LIE and MM/GBSA methods to calculate ligand-binding energies concluding that although reasonable binding affinities can be obtained with LIE methods, MM/GBSA gives a significantly better predictive index and correlation to the experimental affinities.32 All in all, it appears that, as mentioned before, further investigations to predict accurate ligand-proteins binding energies are required in order to improve the correlation between theoretical predictions simulations and observed inhibitory activities.

One of the most devastating diseases of the early 21st century is still the acquired immunodeficiency syndrome (AIDS), caused by the human immunodeficiency virus type 1 (HIV-1). Reverse transcriptase (RT), protease (PR) and integrase (IN) are the three key enzymes of HIV-1 as drug targets due to their essential role in HIV replication. Despite the shortcomings of chemical drugs such as toxicity, lack of curative and multiple effects, the search for more and better anti-HIV agents remains a challenge. RT possesses two enzymatic activities, DNA polymerase and ribonuclease H (RNase H), located in two distinct domains. Inhibitors of RT which act at the polymerase active site have received much attention in the drugs development,33 but more recently, there has been considerable interest in developing a specific inhibitor of the RNase H activity of RT by combinations of nucleoside (NRTIs) and non-nucleoside inhibitors (NNRTIs) of HIV-1 RT.34 In fact, the two type of RT inhibitors can block the RNase H or polymerase activity and thus, as suggested by Arnold and co-workers, a better understanding of the structure and function(s) of RT, conjointly with the mechanism(s) of inhibition, can be used to generate better drugs.35

Binding energies of different NNRTIs in the binding pocket of HIV-1 RT have been the subject of previous theoretical studies where interaction energies to the different residues of the allosteric binding pocket have been identified by MM-PBSA combined with molecular docking,36 by quantum methods37,38 and by structure energy optimizations with ONIOM methods.39

Scheme 1 Structures of five nucleoside inhibitors of HIV-1 RT: (1)

2.7-dihydroxy-4-1(methylethyl)-2,4,6-cycloheptatrien-1-one or b-thujapli-cinol, (2) 3-cyclopentyl-1,4-dihydroxy-1,8-naphthyridin-2(1H)-one, (3) ethyl 1,4-dihydroxy-2-oxo-1,2-dihydro-1,8-naphthyridine-3-carboxylate, (4) 3-[4-(diethylamino)phenoxy]-6-(ethoxycarbonyl)-5,8-dihydroxy-7-oxo-

7.8-dihydro-1,8-naphthyridin-1-ium, and (5) 2-dihydroxy-4-1(methylethyl)-2,4,6-cycloheptatrien-1-one or b-thujaplicin.

More recently, Warshel and co-workers have evaluated the performance of a PDLD/S-LRA/b method by computing the absolute binding free energies of a set of NNRT inhibitors,31 and Jorgensen, Anderson and co-workers have discovered potent NNRTIs with guidance from molecular modeling including FEP calculations for protein-inhibitor binding affinities.40

In this paper we are using two types of approaches based on MD simulations within hybrid Quantum Mechanic/Molecular Mechanics (QM/MM) potentials, the alchemical FEP and the pathway method based on calculations of the PMF from umbrella-biased simulations. Both methods will be applied to study the binding energy of five different NRTI inhibitors of the HIV-1 RT (see Scheme 1). These inhibitors are believed to bind the RNase H active site of the enzyme, which contains two magnesium ions that seem to be involved in the catalytic mechanism of RNA phosphodiester bond hydrolysis.41 Nevertheless, as mentioned by Budihas et al.,42 since HIV-1 RT is a bifunctional enzyme, inhibition of its activity might reflect direct binding to the RNase H domain but, alternatively, an allosteric effect through binding to the nonnucleoside binding pocket close to a DNA polymerase domain, which has been shown previously to modulate RNaseH function. The effect of the different parameters selected to run the simulations will be systematically studied. The results will stress that MD based methods, together with analysis of electrostatic potentials and ligand-protein specific interactions, reveal to be a promising computational protocol to predict the structures and features of new putative drugs.

Computational methods

Setup of system

The first step of our protocol was setting up the system for performing the MD simulations. In order to carry out this kind of study for such large molecular systems, hybrid QM/MM potentials can be used, where a small part of the system, the ligand, is described by quantum mechanics while the protein

and solvent environment was represented by classical force field.43 This hybrid methodology, by comparison with methods based on just MM, avoids most of the work needed to obtain new force field parameters for each new species. Treating the ligand quantum mechanically and the protein molecular mechanically has the additional advantage of the inclusion of quantum effects such as ligand polarization upon binding.44 Moreover, as the largest part of the system is described classically, sampling can be obtained at reasonable computational cost although, obviously, it is much more expensive than if based on just classical force fields.

The initial coordinates for the QM/MM MD calculations were taken from the 3.0 A resolution structure of wild-type HIV-1 RT in a complex with an RNA:DNA oligonucleotide (PDB code 1HYS).45 After removing the RNA:DNA helix, the missing atoms of side chain residues of the RNase H active site were added using the SCit program (http://bioserv.rpbs.jus

In order to get the optimal pose of the five ligands and the two Mg2 + cations in the active site of the protein, their coordinates were obtained by overlapping the initial structure (1HYS) with the corresponding X-ray structures of the protein complexed with the corresponding nucleoside inhibitor: (1)

2.7-dihydroxy-4-(propan-2-yl)cyclohepta-2,4,6-trien-1-one or b-thujaplicinol (PDB code 3IG1), (2) 3-cyclopentyl-1,4-dihydroxy-1,8-naphthyridin-2(1H)-one (PDB code 3LP1), (3) ethyl 1,4-dihydroxy-2-oxo-1,2-dihydro-1,8-naphthyridine-3-carboxylate (PDB code 3LP0), (4) 3-[4-(diethylamino)-phenoxy]-6-(ethoxycarbonyl)-5,8-dihydroxy-7-oxo-7,8-dihydro-

1.8-naphthyridin-1-ium (PDB code 3LP3) and (5) 2-dihydroxy-4-1(methylethyl)-2,4,6-cycloheptatrien-1-one or b-thujaplicin (PDB code 3IG1). The DaliLite46 program, available online (http://www., was used to overlap the structures within the maximum similarity score S, defined as:

¿2 core je core

D(dA, dj )o(4, dj))

where core is the set of structurally equivalent residue pairs (iA, iB) between A and B, D is the deviation of intramolecular Ca-Ca distances between (iA, jA) and (iB, jB), relative to their arithmetic mean (d), and 6 is the threshold of similarity, set empirically to 0.2. The envelope function o = exp(—cF/r2'), where r = 20 A, downweights contributions from distant pairs.

Since the standard pKa values of ionizable groups can be shifted by local protein environments,47 an accurate assignment of the protonation states of all these residues at pH = 7 was carried out. Recalculation of the pKa values of the titratable amino acids has been done using the empirical PropKa program of Jensen et al.48 According to these results, most residues were found at their standard protonation state, except Asp-110, Glu-514 residues in chain A and Asp-218, Glu-28 residues in chain B that should be protonated. Additionally in chain B, His-221 and His-235 were protonated at the 8-position and His-208 at the e-position, and His-539 from chain A was protonated at both 8- and e-position. Then, a total of 32 counter ions (CP) were placed into optimal electrostatic positions around the protein, in order to obtain electro neutrality. Afterwards, a series of optimization algorithms (steepest descent conjugated gradient and L-BFGS-B49) were applied. To avoid a denaturation of the

protein structure during this step, all the heavy atoms of the protein and the inhibitor were restrained by means of a Cartesian harmonic umbrella with a force constant of 1000 kJ mor1 A-2. Afterward, the system was fully relaxed, but the peptidic backbone was restrained with a lower constant of 100 kJ mol-1 A-2.

The optimized protein was placed in a box of pre-equilibrated waters (140 x 80 x 80 A3), using the principal axis of the protein-inhibitor complex as the geometrical center. Any water with an oxygen atom lying in a radius of 2.8 A from a heavy atom of the protein was deleted. The remaining water molecules were then relaxed using optimization algorithms. Finally, 100 ps of hybrid QM/MM Langevin-Verlet MD (NVT) at 300 K were used to equilibrate the solvent.

During the MD simulations, the atoms of the ligands were selected to be treated by QM, using a semiempirical AM1 hamiltonian.50 The rest of the system (protein plus water molecules) were described using the OPLS-AA51 and TIP3P52 force fields, respectively, as implemented in the fDYNAMO


Due to the large amount of degrees of freedom, in order to reduce the cost of the calculations, any water molecule 25 A apart from any of the atoms of the Mg2 + ions located in the active site was deleted and protein atoms and all water molecules 20 A apart from any of the atoms of the initial inhibitor were kept frozen in the remaining calculations. Cut-offs for the non-bonding interactions were applied using a switching scheme, within a range radius from 14.5 to 16 A. Finally, the system was equilibrated by means of 1 ns of QM/MM MD at temperature of 309 K.

The potential energy

The potential energy of our scheme is derived from the standard QM/MM formulation:

Eqm/mm =jo|c)

+ fe(c

4- E + E

re,MM f EMM


+ EQlec

,vdW QM/MM

where EMM is the energy of the MM subsystem, Eqm/mm the van der Waals interaction energy between the QM and MM subsystems, Evac is the gas phase energy of the polarized QM subsystem and EQMt/MM includes both the coulombic interaction of the QM nuclei (Zqm) and the electrostatic interaction of the polarized electronic wave-function (C) with the charges of the protein (qMM).

The interaction energy between the inhibitor and the environment, computed by residue, can be evaluated as the difference between the QM/MM energy and the energies of the separated, non-interacting, QM and MM subsystems with the same geometry. Considering that the MM part is described using a non-polarizable potential the interaction

energy contribution of each residue (i) of the protein is given by the following expression:



— E<




+ y^ y 4ëqm,mm

QM rQM,MM _ sQM,MM\"~ / sQM,MM



Alchemical free energy perturbation (FEP) methods

In order to evaluate the ligand-protein interaction free energy, a series of QM/MM MD simulations have been carried out in the active site of the protein and in a box of solvent water molecules, introducing two parameters; l and g parameters in the electrostatic and van der Waals QM/MM interaction terms, respectively:

Eqm/mm(1) = <c|Ho|c)



rQM , MM

Then l smoothly changes between values of 1, corresponding to a full MM charge-wave function interaction, and 0 whereas no electrostatic interaction with the force field is introduced. Once the electrostatic charges of the ligand (QM region) are annihilated, g smoothly changes between values of 1, corresponding to a full QM-MM van der Waals interaction, and 0 whereas no van der Waals interaction with the force field is introduced. The calculation of the free energy difference of two consecutive windows of the two stages, annihilation of charges and van der Waals parameters, is performed by means of FEP methods, and then the total free energy change is evaluated as the sum of all the windows covering the full transformation from the initial to the final state. In practice, this is done in two steps (eqn (6) and (7)), once the charges are annihilated then the van der Waals:



b X h,<e"

b [X mie"



We have used a total of 50 windows to evaluate the electrostatic interaction term, from l = 0 (no electrostatic interaction) to l = 1 (full interaction), which turns into a 51 of 0.02, and the same amount of windows have been used to calculate the van der Waals interaction term, from g = 0 (no electrostatic interaction) to g = 1 (full interaction), which turns into a 5g of 0.02. The procedure can be applied both forwards (i — i + 1) and backwards (i +1 — i) as long as the comparison provides information about the convergence of the process. In each window a total of 5 ps of relaxation followed by 10, 50

Scheme 2 Thermodynamic cycle to compute enzyme-ligands binding free energies from alchemy free energy perturbation methods. E is the enzyme with ligand (L) in its binding site, E' is the apo form of enzyme, (L)0 is the ligand in gas phase.

or 100 ps of production QM/MM MD, test dependent, have been performed using the NVT ensemble at the reference temperature of 309 K.

According to Scheme 2, it is possible to compute the binding free energy of a particular ligand, which could be directly compared with experimental data such as the inhibition constants, KI. Thus, the predicted value would be expressed as:


where the two terms on the right side of eqn (8) are computed by applying eqn (6) and (7), in water and in the enzyme, respectively. It is important to point out that, in computing AGW, water molecules can occupy the cavity left by the ligand when all interactions between ligand and water environment are switched off. In such a case, if the volume of full solvent water molecules was not big enough, a decrease in the density of the box could prevent closing of the thermodynamic cycle in Scheme 2. A similar argument could be applied to the calculation of AGE but, in this case, although the cavity created when removing the ligand from the active site can be occupied by water molecules of the solvent (keeping in mind that an RNase H active site is quite flat and open to solvent), the cavity can be also occupied by the protein, as long as flexibility to the protein is allowed, and then the impact of this effect is minimized.

Free energy of binding from potential of mean force (PMF)

The PMFs traced in order to obtain the free energy54,55 of the process of pulling the ligand from the enzyme binding site to solution has been calculated using the weighted histogram analysis method (WHAM) combined with the umbrella sampling approach55,56 as implemented in fDYNAMO. The procedure for the PMF calculation is straightforward and requires a series of molecular dynamics simulations in which the distinguished reaction coordinate variable, X, is constrained around particular values.55 In our case, the reaction coordinate X was chosen as the distance between the center of mass of the ligand and the position of a fixed water molecule 25 A away from the active site in a perpendicular direction. The values of the variable sampled during the simulations (which covers a distance of 12 A) are then pieced together to construct a distribution function from which the PMF is obtained as a function of the distinguished reaction coordinate (W(X)). The PMF is related to the normalized probability of finding the system at a particular value of the chosen coordinate by eqn (9):

The difference of free energy can be then expressed as:57

AG(X) = W(Xw) - [W(Xbs) + GX(Xbs)] (10)

where the superscripts indicate the value of the reaction coordinate at the initial state (in the binding site, bs) and final state (in water environment, w) and Gx(Xbs) is the free energy associated with setting the reaction coordinate to a specific value at the initial state; in the active site. Normally this last term makes a small contribution58 and the change in free energy is directly estimated from the PMF change between the initial and final states of the profile:

AG(X) = W(Xw) - W(Xbs) = A W(X) (11)

Results and discussion

Alchemical free energy perturbation methods

First step of our study was to evaluate the electrostatic and van der Waals contributions to the free energy of binding of the five ligands in the active site of HIV-1 RT and in aqueous solution, as a function of the length of the MD sampling. The results are reported in Table 1.

As can be observed in Table 1a, the values of the electrostatic free energy terms obtained in the enzyme active site strongly depend on the time of simulation, being clearly overestimated if short MD, 10 ps, is used. Changes from 10 ps to 50 ps range from 4.8 kcal mol-1 (in Lig2) to 11.5 kcal mol-1 (in Lig3), while changes from 50 ps to 100 ps are dramatically reduced from 0.4 kcal mol-1 (in Lig2, thus virtually invariant) to 4.2 kcal mol-1 (Lig4). Accordingly, changes below 2 kcal mol-1 for all ligands when enlarging the sampling up to 200 ps would be expected. These results would not be qualitatively different from the ones obtained with 100 ps sampling. Thus, after evaluating the variations between 10, 50 and 100 ps, we can consider that 100 ps MD is a good compromise between computer resources and quality of results, keeping in mind that we are interested in a comparative analysis. As expected, the calculation of the electrostatic term of the free energy of binding in water does

Table 1 Electrostatic, AGe (a) and van der Waals, AGvdW (b) terms of the protein-ligand and water-ligand interaction free energy as a function of the MD simulations. All values are reported in kcal mol-1



W(X) = C - kTln/p(rN)S(X(rN) - X)drN

10 ps 50 ps 100 ps 10 ps 50 ps

Lig-1 -38.2 -30.1 -27.6 -5.1 -5.0

Lig-2 -29.7 -24.9 -24.5 -11.6 -12.0

Lig-3 -49.2 -37.7 -35.0 -8.5 -8.2

Lig-4 -38.3 -29.8 -25.6 -14.0 -13.4

Lig-5 -27.7 -20.8 -17.1 -5.4 -4.8

Protein-ligand Water-ligand

10 ps 50 ps 100 ps 10 ps 50 ps 100 ps

Lig-1 -23.5 -19.6 -19.6 -17.3 -16.8 -16.4

Lig-2 -27.9 -28.6 -26.7 -23.7 -23.7 -23.4

Lig-3 -25.6 -23.8 -26.5 -21.7 -23.1 -23.0

Lig-4 -41.9 -39.2 -36.9 -33.9 -36.5 -36.8

Lig-5 -15.8 -15.6 -13.2 -15.4 -16.6 -16.5

not show such strong dependency and then 50 ps MD will be the length of our sampling. Regarding the van der Waals interaction term, the dependence on the length of the simulations is not so dramatic (see Table 1b). The equilibration of the water molecules or the protein during the annihilation of the ligands can be considered as fully converged, in both media, after 100 ps of MD and these are the values used for the FEP calculations. Thus, according to Scheme 2, the binding free energy of the five ligands is computed by means of eqn (6)-(8), and presented in Table 2. The energy decomposition of the two terms, van der Waals and electrostatics is also reported in Table 2. The first conclusion that can be obtained is that the contribution of the van der Waals interaction to the binding free energy is much smaller than the contribution of the electrostatic term. Obviously this is due to the fact that very close values are obtained in both environments for all ligands, which is a reasonable result taking into account that van der Waals interactions depend mostly on the molecular surface of the ligand. As observed in Table 2, the largest value corresponds to the biggest molecule, L4, while the smallest value is obtained for the smallest molecules, L1 and L5, in both aqueous solution and in the enzyme active site. Thus, the total free energy of binding will depend mostly on the difference between electrostatic interactions in solution and in the enzyme, as previously suggested by Warshel and co-workers.59 By comparing the different ligands, L5 shows the lowest electrostatic free energy of interaction in the RNase H active site, while L3 is the one presenting the largest value. The same trend is obtained when computing the final binding free energies.

The relative significance of the electrostatic (AGe) and van der Waals (AGvdW) terms to the binding energy can be also shown by the more computing demanding calculation of the electrostatic and van der Waals free energy interaction terms along the binding pathway of the ligand from the water to the active site. The result obtained for L1 is plotted in Fig. 1 and confirms the previous predictions: the change of the Lennard-Jones interactions from the aqueous environment to the active site of the enzyme is significantly smaller than the variation of the electrostatic interactions that, therefore, is the term that will determine the total binding free energy of the process. Nonetheless, the evolution of both terms is favourable for the binding process. Another important conclusion that can be derived from the plot is that, according to the distance evolution of the electrostatic interaction term presented in Fig. 1a the ligand can be considered as completely solvated when located at ca. 8 A from the active site. According to this criterion, the variation of the electrostatic and van der Waals interactions, from the enzyme active site to the bulk, would be ca. 33 kcal mol-1 and

Table 2 Electrostatic (AGe) and van der Waals (AGvdW) terms of the water-ligand and protein-ligand interaction free energy, and differences in interaction between water and the binding site of the protein (AGelec, AGvdW and AGbind). All values are reported in kcal mol-1


AGe-w AGvdW-w AGe-e AGvdW-e AGelec AGvdW AGbind

Lig-1 -5.0 —16.4 -27.6 -19.6 -22.6 -3.2 -25.8

Lig-2 -12.0 -23.4 -24.5 -26.7 -12.5 -3.3 -15.8

Lig-3 -8.2 -23.0 -35.0 -26.5 -26.8 -3.5 -30.3

Lig-4 -13.4 -36.8 -25.6 -36.9 -12.2 -0.1 -12.3

Lig-5 -4.8 -16.5 -17.1 -13.2 -12.3 3.3 -9.0

• • « •

• Water

-33 kcal/mol

• .........

" Active site

0 2 4 6 8 10 12 14 Distance (ligand-active site) (A)

0 2 4 6 8 10 12 14 Distance (ligand-active site) (A)

Fig. 1 Electrostatic (A) and van der Waals (B) ligand-environment interaction energy terms along the binding path from aqueous to the enzyme active site (distance in A). Results obtained for ligand 1.

ca. 8 kcal mol-1, respectively. This result, although cannot be considered at a quantitative level as long as longer dynamics is carried out, is in good agreement with AGelec and AGvdW obtained from eqn (6) and (7) and reported in Table 2, which gives some credit to our results.

Pathway methods

In order to study the ligand-protein binding free energies by means of the umbrella biased PMF along a coordinate defining the distance from the ligand to the active site, it is necessary to define some variables. These are, the force constant of the umbrella potential energy term applied to properly overlap the consecutive windows during the sampling, the window width or step size where the restraining or umbrella potential energy term is centred along the path, and the time of the MD simulations carried out in each window. Taking the L1 as a benchmark, the effect of the force constant and the step size on the PMF profile for a defined time of the MD is presented in Fig. 2.

As can be observed in Fig. 2, the free energy of binding, as well as the shape of the profile, presents a strong dependency on the value of the constant, as previously shown by Henchman and co-workers18 and by Fabrittis and co-workers:60 the larger the force constant is, the larger the free energy difference between the ligand in the active site and completely solvated in the bulk. This effect is reduced if the size of the step is reduced from 0.2 A to 0.1 A, as observed in Fig. 2a. Obviously, the force constant dependency is due to an overlapping conformational sampling deficiency, which is partially minimized when reducing the step size of the umbrella function. Fig. 2a shows that, albeit reducing the step size of the scan along the coordinate minimizes the problem of sampling, it is not completely cancelled out (no convergence is obtained with the force constant). Henchman and co-workers already detected this problem when computing binding PMFs without orthogonal restraints to the single reaction coordinate.18

Fig. 2 (a) PMF dependency on the force constant k (in kJ mol-1 A-2) and step size (0.1 and 0.2 A). (b) Free energy profile dependency on the umbrella sampling force constant k with a step size equal to 0.2 A.

Consequently, and considering that this is due to an unbiased sampling of the conformational space along the path from the water to the active site, an analysis of the effect of the force constant and the length of the MDs has been carried out. The results of increasing the time of the MDs performed in each of the windows along the binding path, from 10 ps to 1 ns, is depicted in Fig. 3. As can be observed, while the values obtained with short MD are clearly dependent on the selected umbrella force constant, when long dynamics are performed, the effect is clearly minimized. Fig. 3 suggests that long MDs are required to get reliable values of binding free energies with reduced dependency on the selected force constant. Fabrittis and co-workers have also demonstrated that long sampling MD simulations are required for robust binding predictions.58 Thus, in order to study the binding of the five nucleoside inhibitors of the HIV-1 RT by means of the pathway method, we will apply the umbrella sampling with a step-size of 0.1 A to cover the range from inside the active site to the bulk solvent, with a force constant of 100 kJ mol-1 A-2 and MD simulations of 1 ns. We must keep in mind that while previous binding studies based on PMF methods were carried out by means of classical force fields describing both ligand, protein and solvent, our MD simulations are based on hybrid potentials and thus, the calculations are much more CPU time demanding.

The final values of the binding free energies obtained by means of the PMFs for the five ligands within a cartesian force constant equal to 100 kJ mol-1 A-2, a step size of 0.1 A and 1 ns MD simulations for each window are listed in Table 3. The values obtained by means of the alchemical FEPs methods are also listed in the table for comparative purposes.

As observed in Table 3, both methods would present L3 as the best candidate to inhibit the HIV-1 RT while L5 gives the lowest negative binding energy. This one is, in fact, the one that has experimentally shown the lowest activity, while the other four ligands present similar activities.42,61

Fig. 3 Binding free energy (PMF) dependency on the time of the MD simulations performed in the umbrella sampling, and on the harmonic umbrella force constant. Results obtained for ligand 1.

Nevertheless, as mentioned above, a direct comparison between our binding free energies of the ligands to the RNase H site and experimental data, IC50 (value which expresses the concentration of an inhibitor required to produce 50 per cent inhibition of an enzymic reaction at a specific substrate concentration), is problematic since the experimental values can reflect other kind of inhibition processes. Also, according to Cheng and Prusoff,62 the inhibitory binding constant of a competitive inhibitor to the enzyme, or the reciprocal of the dissociation constant of the enzyme-inhibitor complex, is the value that could be straightforward compared with binding free energy values. IC50 depends on the concentration of the substrate and its dissociation constant. If concentration of the substrate is the same in all experiments and in any case much larger than the corresponding substrate Michaelis constant, the experimentally measured IC50 of all five residues would be related with free energies of binding in the range from -5.5 (L5) to -9.5 kcal mol-1 (L2). Then, the agreement between our theoretical predictions and the experimental data has to be considered at a qualitative level, since the results appear to be clearly overestimated.

There are, nevertheless, significant differences between the absolute values, and trends, obtained by PMF and FEP based methods for the rest of the inhibitors, especially for ligands L2 and L4. The origin of this discrepancies could be a deficiency sampling during the simulations. Keeping in mind that these two ligands are large and present more functional groups (hydroxyl, carbonyl and ester groups) than the rest of ligands, more specific interactions with the pocket can be established and probably the effect of lack of sampling is more dramatic in these two ligands. This effect is enhanced in the PMF calculations along the path. As observed in Fig. 3, the longer the sampling the smaller the binding free energy and the dependency on the umbrella force constant. Correction of this possible error would imply longer MD or the use of, for instance, perpendicular constrains as the ones used by Henchman and co-workers.18 Another possible source of error can be the underestimation of the cavity created in the solvent environment after ''deleting'' the ligands during the calculation of the AGvdW term with the

Table 3 Ligand-protein binding free energies obtained by means of the FEP and PMF methods (in kcal mol-1)

Lig-1 Lig-2 Lig-3 Lig-4 Lig-5

DGb from PMF -30.4 -34.6 - 57.2 -50.7 -16.4

DGb from FEP -25.8 -15.8 -30.3 -12.3 -9.0

FEP method. As mentioned, this effect would not be so dramatic in the protein environment since part of the cavity can be occupied by the protein provided that it is flexible and allowed to move during the simulations. We have computed the electrostatic binding energy term of filling the cavity by the required number of water molecules, in each case. The resulting energies are —16.5, —18.2, —22.5, —53.2 and —22.2 kcal mol—1 for ligands L1, L2, L3, L4 and L5, respectively. Obviously these numbers are overestimating the real effect since during the original calculation of the DGvdW term, waters were allowed to move inside the cavity and the change in the density was negligible. Anyway, these values present the expected trend, which is a dependency on the size of the ligand that would partially close the gap between the PMF and FEP results of binding free energies.

Molecular Electrostatic Potential (MEP)

The Molecular Electrostatic Potential (MEP) is defined as the interaction energy between the charge distribution of a molecule and a unit positive charge. The MEP is highly informative of the nuclear and electronic charge distribution of a given molecule. The MEP has been applied to a range of fields, such as the study of biological interactions, topographical analysis of the electronic structure of molecules and definition of molecular reactivity patterns.63 The potential applications of the MEP as a tool for interpretations and prediction of chemical reactivity, as well as to probe the similarity of transition states and inhibitors, has long been recognized.64 Three-dimensional MEP surfaces for the five ligands, the active site of the enzyme and its natural substrate are depicted in Fig. 4. The MEP surfaces were derived from B3LYP/6-31+G (d,p) calculations, computed under the effect of the protein environment, using Gaussian 0965 and visualised in Gaussview 3.0.66 These surfaces correspond to an isodensity value of 0.002 a.u. In the figure, the most nucleophilic regions (negative electronic potential) are in red, while the most electrophilic regions (positive electrostatic potential) appear in blue. The active site displays large positive (blue) regions matching the positions of the magnesium cations, and nitrogen backbone atoms of Asp443, Glu478, Asp549 and His539, while it display large negative (red) regions matching the positions of oxygen atoms of backbone belonging to His539, Asp443, Glu478, Asp498, and Asp549. The negative electrostatic potential for the fragment of DNA-chain can be found around the oxygen atoms belonging to the phosphate groups. First conclusion that can be derived from the figure is that there is a reasonable complementarity between the active site of the enzyme and its natural substrate. According to Pauling postulate,67 even a better match could be expected for the TS structure of the DNA polymerase HIV-1 RT catalyzed reaction. The analysis of the five ligands reveals negative red regions around the hydroxyl and carbonyl oxygen atoms while positive regions are located mainly in hydroxyl hydrogen atoms. Nevertheless, the positive charge on these proton atoms seems to be delocalized when taking part of hydrogen bonds established with the contiguous carbonyl group. This effect is dramatic in the case of L6 that, in fact, does not present any blue region on the MEP.

This analysis can be supported by means of evaluation of the interaction energy between the inhibitor and the environment,

Fig. 4 Molecular electrostatic potential (MEP) surfaces derived from B3LYP/6-31G+(d,p) calculations for the active site of the HIV-1 RT, together with the natural substrate (RNA) and the five tested ligands polarized by the charges of the protein environment. The increase in negative charges goes from blue (positive) to red (negative).

computed by eqn (4). This interaction energy can be decomposed in a sum over residues provided that the polarized wave function (C) is employed to evaluate this energy contribution. The global polarization effect can be obtained from the gas phase energy difference between the polarized, C, and unpolarized, C0, wave functions. Averaged protein-substrate interaction energies by residue are displayed in Fig. 5.

In Fig. 5, where positive values correspond to unfavourable interactions and negative values mean that the interaction is stabilizing the binding ligand, only the main contributions, which correspond to residues of the active site, have been presented.

As can be observed in Fig. 5, the strongest interaction is established between the ligands and the Mg2+ cations. These favourable interactions are, nevertheless, not completely the determinant factor of the final ligand-protein binding energies, since significant non-favourable interactions can be established with negatively charged residues of the active site such as Asp443, Glu478, Asp498 and Asp549. These residues are required to anchor the two Mg2+ cations in the active site and appear to destabilize the binding of the ligands (see Fig. 6). His539, a residue on the second shell of the active site that interacts with Asp549, is the only aminoacid that slightly stabilizes the interaction with the ligand, particularly in the case of L2. This can be explained by Fig. 4, as there is a positive charge in the ethoxy group of this ligand that could interact with the backbone oxygen atom of this residue. Thus, while L1 shows the largest negative values with Mg2+ cations, it also presents high positive values of interaction energies with aspartate residues of an active site that seem to cancel out the initially favourable effect. A similar picture is obtained for ligand 5.

Fig. 5 Contributions of individual amino acids of an active site to substrate-protein interaction energy (in kcal mol—') computed for the five ligands.

Fig. 6 Schematic representation of interactions established between ligand 2 and the active site of HIV-1 RT according to structure after 1 ns MD relaxation.

In contrast, while interactions of metals with L2, L3 and L4 are much weaker they do not show important destabilizing interactions with the aminoacids of the protein, especially with aspartate residues Asp443 and Asp498. Consequently, the total interaction energies can be more favourable even presenting lower interactions with the metals.

According to structure-activity relationship studies by Grice and co-workers,59'60 three oxygen ligands of RNase H inhibitors are critical. Their recent crystallographic analysis indicates that manicol makes multiple contacts with the highly conserved His539 residue,68 thus suggesting a bidentate inhibitor as a putative improved drug. Modelling studies of Chung et al.69 indicating that since the vinylogous urea binding site is also in the vicinity of His539, these kind of compounds would comprise allosteric and RNase H inhibitory properties that would provide improved selectivity.68 Our results show how the best inhibitors we have tested (see Fig. 6) are presenting a favourable interaction with divalent magnesium ions and with His539 (see Fig. 5), in agreement with experimental studies, and small destabilizing interactions with the rest of the aminoacids of the active site.


Two computational approaches, based on MD simulations within hybrid QM/MM potentials, have been used to compute the binding free energy of five known inhibitors of HIV-1 RT: the alchemical free energy perturbation methods, FEP, and a pathway method, in which the ligand is physically pulled away from the binding site, that renders a PMF of the process. Our comparative analysis suggests that just a qualitative agreement between both strategies can be obtained although the former is much less computing demanding. Both methods can be used to distinguish ligands presenting large inhibitory activity differences and to select the most promising scaffold for further developments. Nevertheless, as a general conclusion, long sampling MD simulations should be required to get reliable results. This is especially important for the pathway method when computing binding free energies of large compounds with a big amount of functional groups, capable of interacting with the protein pocket. Nevertheless, it is important to take into account that our simulations are based on hybrid QM/MM potentials in an attempt to get more reliable interactions since treating the ligand quantum mechanically has the additional advantage of

the inclusion of quantum effects such as ligand polarization upon binding.

Despite differences between the HIV NRTIs tested previously by different authors (tropolone derivatives identified by Budihas et al.,42 pyrimidal carboxylic acid scaffolds described by Kirschberg et al.,70 the betathujaplicinol described by Himmel et al.,71 and the naphthyridinones described by Su et al.61), all of them suggest the importance of the coordination with the two metal ions. Our analysis of the binding contribution of the individual residues shows that the design of an inhibitor based just on favourable interactions with the two magnesium cations present in the active site of HIV-1 RT can fail. Negatively charged residues, such as Asp443, Asp549, Asp498 or Glu478, that play a critic role in anchoring the metal atoms into the protein, can destabilize the ligand binding. Also, large inhibitorlike molecules could present the advantage of interacting with residues of the second shell of the active site. This conclusion would explain why some ligands present anti-HIV activity while others, like b-thujaplicin (ligand 5), are almost inactive. In general, the MEPs of the active site of the protein and the different ligands show a better complementarity in those cases for which higher binding energies have been computed. This result is harmonizing with the analysis of the interaction energies by residues that support the equilibrium of the absolute values of favourable and non-favorable interactions that must present an inhibitor to have significant activity. We are herein showing how the best inhibitors we have tested are presenting a favourable interaction with divalent magnesium ions and with His539, in agreement with experimental studies.

Our results can be used to guide the addition of substituents to an initial inhibitor scaffold to decrease non-favorable interactions with negatively charged residues that anchor the metals. According to our obtained binding energies and the MEP and electrostatic interactions analysis, this could be done from L2 or L3. This result would support the use of naphthyridinone-based ligands, instead of drugs based on a cycloheptatriene ring such as tropolone and its derivatives, as well as the importance of properly located hydroxyl groups. It is important to stress that an over estimation of the importance of two hydroxyl groups for metal chelation could render non-favourable interactions with other residues of the active site. Keeping in mind that an RNase H active site is open to solvent, as pointed out by Grice and co-workers,68 the possibility of a variety of substituents could be also proposed to reduce cellular toxicity by decreasing binding affinity for other enzymes. Extensive binding and mechanistic studies, using both experimental and computational tools, are needed to finally design an optimal drug that efficiently inhibits all possible mutated forms of the rapidly emerging HIV.


We thank the Spanish Ministry Ministerio de Ciencia e Inovacion for project CTQ2009-14541-C1 and C2, Universitat Jaume I-BANCAIXA Foundation for projects P1 1B2011-23, Generalitat Valenciana for ACOMP2011/038, ACOMP/ 2012/119 and Prometeo/2009/053 projects. The authors also acknowledge the Servei d'Informatica, Universitat Jaume I for generous allotment of computer time.

Notes and references

1 W. L. Jorgensen, Nature, 2010, 466, 42-43.

2 N. Singh and A. Warshel, Proteins: Struct., Funct., Genet., 2010, 78, 1705-1723.

3 K. M. Merz Jr., J. Chem. Theory Comput, 2010, 6, 1769-1776.

4 W. L. Jorgensen, Acc. Chem. Res., 2009, 42, 724-733.

5 (a) N. Brooijmans and I. D. Kuntz, Annu. Rev. Biophys. Biomol. Struct., 2003, 32, 335-373; (b) A. R. Leach, B. K. Shoichet and

C. E. Peishoff, J. Med. Chem., 2006, 49, 5851-5855; (c) G. Morra, A. Genoni, M. A. C. Neves, K. M. Merz and G. Colombo, Curr. Med. Chem, 2010, 17, 25-41; (d) P. Kolbo and J. J. Irwin, Curr. Top. Med. Chem., 2009, 9, 755-770; (e) O. Guvench and

A. D. MacKerell, Curr. Opin. Struct. Biol, 2009, 19, 56-61.

6 A. R. Leach, B. K. Shoichet and C. E. Peishoff, J. Med. Chem., 2006, 49, 5851-5855.

7 C. J. Woods, M. Malaisree, S. Hannongbua and A. J. Mulholland, J. Chem. Phys, 2011, 134, 054114.

8 S. F. Sousa, P. A. Fernandes and M. Ramos, Proteins: Struct., Funct., Bioinf., 2006, 65, 15-26.

9 (a) Y. Deng and B. Roux, J. Phys. Chem. B, 2009, 113, 2234-2246; (b) E. Stjernschantz and C. Oostenbrink, Biophys. J., 2010, 98, 2682-2691; (c) A. de Ruiter and C. Oostenbrink, Curr. Opin. Chem. Biol, 2011, 15, 547-552; (d) S. E. Boyce, D. L. Mobley, G. J. Rocklin, A. P. Graves, K. A. Dill and B. K. Shoichet, J. Mol. Biol, 2009, 394, 747-763; (e) W. Jiang, M. Hodoscek and B. Roux, J. Chem. Theory Comput, 2009, 5, 2583-2588.

10 J. Gao, K. Kuczera, B. Tidor and M. Karplus, Science, 1989, 244, 1069-1072.

11 (a) J. Hermans and S. Shankar, Isr. J. Chem., 1986, 27, 225-227; (b) W. L. Jorgensen, J. K. Buckner, S. Boudon and J. Tirado-Rives, J. Chem. Phys., 1988, 89, 3742-3746; (c) M. K. Gilson, J. A. Given,

B. L. Bush and J. A. McCammon, Biophys. J, 1997, 72, 1047-1069.

12 A. Warshel, F. Sussman and G. King, Biochemistry, 1986, 25, 8368-8372.

13 F. S. Lee, Z. T. Chu, M. B. Bolger and A. Warshel, Protein Eng., 1992, 5, 215-228.

14 D. Trzesniak, A. Pitschna, E. Kunz and W. F. van Gunsteren, ChemPhysChem, 2007, 8, 162-169.

15 I. V. Khavrutskii, J. Dzubiella and J. A. McCammon, J. Chem. Phys., 2008, 128, 044106.

16 (a) D. R. Herschbach, H. S. Johnston and D. Rapp, J. Chem. Phys., 1959, 31, 1652-1661; (b) S. Boresch and M. Karplus, J. Chem. Phys, 1996,105, 5145-5154; (c) M. Sprik and G. Ciccotti, J. Chem. Phys, 1998,109, 7737-7744; (d) D. A. Zichi, G. Ciccotti, J. T. Hynes and M. Ferrario, J. Phys. Chem, 1989, 93, 6261-6265.

17 H. J. Woo and B. Roux, Proc. Natl. Acad. Sci. U. S. A., 2005,102, 6825-6830.

18 S. Doudou, N. A. Burton and R. H. Henchman, J. Chem. Theory Comput., 2009, 5, 909-918.

19 F. Colizzi, R. Perozzo, L. Scapozza, M. Recanatini and A. Cavalli, J. Am. Chem. Soc., 2010, 132, 7361-7371.

20 V. Helms and R. C. Wade, J. Am. Chem. Soc., 1998, 120, 2710-2713.

21 (a) C. J. Woods, M. A. King and J. W. Essex, J. Phys. Chem. B, 2003, 107, 13703-13710; (b) C. J. Woods, M. A. King and J. W. Essex, J. Phys. Chem. B, 2003, 107, 13711-13718.

22 R. H. Swendsen and J. Wang, Phys. Rev. Lett, 1986, 57, 2607-2609.

23 W. Jiang, M. Hodoscek and B. Roux, J. Chem. Theory Comput., 2009, 5, 2583-2588.

24 A. Laio and M. Parrinello, Proc. Natl. Acad. Sci. U. S. A., 2002, 99, 12562-12566.

25 F. L. Gervasio, A. Laio and M. Parrinello, J. Am. Chem. Soc., 2005, 127, 2600-2607.

26 N. V. Plotnikov, S. C. L. Kamerlin and A. Warshel, J. Phys. Chem. B, 2011, 115, 7950-7962.

27 F. S. Lee, Z. T. Chu and A. Warshel, J. Comput. Chem., 1993, 14, 161-185.

28 Y. Y. Sham, Z. T. Chu, H. Tao and A. Warshel, Proteins: Struct., Funct., Genet., 2000, 39, 393-407.

29 (a) T. Hansson, J. Marelius and J. Aqvist, J. Comput.-Aided Mol. Des, 1998, 12, 27-35; (b) J. Aqvist and T. Hansson, Protein Eng., 1995, 8, 1137-1144.

30 (a) J. Srinivasan, T. E. Cheatham, P. Cieplak, P. A. Kollman and

D. A. Case, J. Am. Chem. Soc, 1998, 120, 9401-9409;

(b) P. A. Kollman, I. Massova, C. Reyes, B. Kuhn, S. Huo, L. Chong, M. Lee, T. Lee, Y. Duan, W. Wang, O. Donini, P. Cieplak, J. Srinivasan and D. A. Case Cheatham III., Acc. Chem. Res., 2000, 33, 889-897.

31 N. Singh and A. Warshel, Proteins: Struct., Funct., Genet., 2010, 78, 1705-1723.

32 S. Genheden and R. Ulf, J. Chem. Theory Comput, 2011, 7, 3768-3778.

33 J. M. Molina, Drugs, 2008, 68, 567-578.

34 K. Das, S. E. Martinez, J. D. Bauman and E. Arnold, Nat. Struct. Mol. Biol., 2012, 19, 253-259.

35 S. G. Sarafianos, B. Marchand, K. Das, D. M. Himmel, M. A. Parniak, S. H. Hughes and E. Arnold, J. Mol. Biol., 2009, 385, 693-713.

36 J. Wang, P. Morin, W. Wang and P. A. Kollman, J. Am. Chem. Soc., 2001, 123, 5221-5230.

37 X. He, Y. Mei, Y. Xiang, Da W. Zhang and J. Z. H. Zhang, Proteins: Struct., Funct, Bioinf, 2005, 61, 423-432.

38 J. Lameira, C. N. Alves, V. Moliner and E. Silla, Eur. J. Med. Chem, 2006, 41, 616-623.

39 (a) S. Saen-oon, M. Kuno and S. Hannongbua, Proteins: Struct., Funct, Bioinf, 2005, 61, 859-869; (b) P. Nunrium, M. Kuno, S. Saen-oon and S. Hannongbua, Chem. Phys. Lett., 2005, 405, 198-202.

40 W. L. Jorgensen, M. Bollini, V. V. Thakur, R. A. Domaoal, K. A. Spasov and K. S. Anderson, J. Am. Chem. Soc., 2011, 133, 15686-15696.

41 M. Nowotny and W. Yang, EMBO J., 2006, 25, 1924-1933.

42 S. R. Budihas, I. Gorshkova1, S. Gaidamakov, A. Wamiru, M. K. Bona, M. A. Parniak, R. J. Crouch, J. B. McMahon, J. A. Beutler and S.F.J. Le Grice, Nucleic Acids Res., 2005, 33, 1249-1256.

43 A. Warshel and M. Levitt, J. Mol. Biol., 1976, 103, 227-249.

44 C. N. Alves, S. Marti, R. Castillo, J. Andres, V. Moliner, I. Tuüon and E. Silla, Chem.-Eur. J, 2007, 13, 7715-7724.

45 S. G. Sarafianos, K. Das, C. Tantillo, A. D. Clark Jr., J. Ding, J. M. Whitcomb, P. L. Boyer, S. H. Hughes and E. Arnold, EMBO J, 2001, 20, 1449-1461.

46 L. Holm and J. Park, Bioinformatics, 2000, 16, 566-567.

47 J. Antosiewicz, J. A. McCammon and M. K. Gilson, J. Mol. Biol., 1994, 238, 415-436.

48 L. Hui, A. D. Robertson and J. H. Jensen, Proteins, 2005, 61, 704-721.

49 R. H. Byrd, P. Lu, J. Nocedal and C. Zhu, J. Sci. Comput., 1995, 16, 1190-1208.

50 M. J. S. Dewar, E. G. Zoebisch and E. F. Healy, J. Am. Chem. Soc, 1985, 107, 3902-3909.

51 W. L. Jorgensen, D. S. Maxwell and J. Tirado-Rives, J. Am. Chem. Soc, 1996, 118, 11225-11236.

52 W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey and M. L. Klein, J. Chem. Phys, 1983, 79, 926-935.

53 (a) M. J. Field, A Practical Introduction to the Simulation of Molecular Systems, Cambridge University Press, Cambridge, UK, 1999; (b) M. J. Field, M. Albe, C. Bret, F. Proust-de Martin and A. Thomas, J. Comput. Chem., 2000, 21, 1088-1100.

54 (a) J. G. Kirkwood, J. Chem. Phys, 1935, 3, 300-313; (b) D. A. McQuarrie, Statistical Mechanics, Harper & Row, New York, 1976, p. 266.

55 B. Roux, Comput. Phys. Commun., 1995, 91, 275-282.

56 G. M. Torrie and J. P. Valleau, J. Comput. Phys., 1977, 23, 187-199.

57 G. K. Schenter, B. C. Garrett and D. G. Truhlar, J. Chem. Phys., 2003, 119, 5828-5833.

58 T. D. Heightman and A. T. Vasella, Angew. Chem., Int. Ed., 1999, 38, 750-770.

59 (a) J. D. Madura, Y. Nakajima, R. M. Hamilton, A. Wierzbicki and A. Warshel, Struct. Chem., 1996, 7, 131-138; (b) J. Wang, P. Morin, W. Wang and P. A. Kollman, J. Am. Chem. Soc., 2001, 123, 5221-5230; (c) B. O. Brandsdal, F. Osterberg, M. Almlof, I. Feierberg, V. B. Luzhkov and J. Aqvist, Adv. Protein Chem., 2003, 66, 123-158.

60 I. Buch, S. K. Sadiq and G. De Fabritiis, J. Chem. Theory Comput., 2011, 7, 1765-1772.

61 H.-P. Su, Y. Yan, G. S. Prasad, R. F. Smith, C. L. Daniels, P. D. Abeywickrema, J. C. Reid, H. M. Loughran, M. Kornienko,

S. Sharma, J. A. Grobler, B. Xu, V. Sardana, T. J. Allison, P. D. Williams, P. L. Darke, D. J. Hazuda and S. Munshi, J. Virol, 2010, 84, 7625-7633.

62 Y. Cheng and W. H. Prusoff, Biochem. Pharmacol., 1973, 22, 3099-3108.

63 (a) A. Gavezotti, J. Am. Chem. Soc., 1983, 105, 5220-5224; (b) J. S. Murray, T. Brinck, P. Lane, K. Paulsen and P. Politzer, Theochem, 1994, 307, 55-64; (c) G. D. van Duyne, R. F. Standaert, P. A. Karplus, S. L. Schreiber and J. Clardy, Science, 1991, 252, 839-842; (d) E. Cubero, F. J. Luque and M. Orozco, Proc. Natl. Acad. Sci. U. S. A., 1998, 95, 5976-5980; (e) J. Lameira, C. N. Alves, I. Tuüon, S. Marti and V. Moliner, J. Phys. Chem. B, 2011, 115, 6764-6775.

64 C. K. Bagdassarian, V. L. Schramm and S. D. Schwartz, J. Am. Chem. Soc, 1996, 118, 8825-8836.

65 M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, Q M. A. Robb, J. R. Cheeseman, G. Scalmani, V. Barone, S3 B. Mennucci, G. A. Petersson, H. Nakatsuji, M. Caricato, X. Li, o H. P. Hratchian, A. F. Izmaylov, J. Bloino, G. Zheng, ^ J. L. Sonnenberg, M. Hada, M. Ehara, K. Toyota, R. Fukuda,

J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, Js H. Nakai, T. Vreven, J. A. Montgomery Jr., J. E. Peralta,

o F. Ogliaro, M. Bearpark, J. J. Heyd, E. Brothers, K. N. Kudin,

o V. N. Staroverov, R. Kobayashi, J. Normand, K. Raghavachari,

A. Rendell, J. C. Burant, S. S. Iyengar, J. Tomasi, M. Cossi, •§ N. Rega, J. M. Millam, M. Klene, J. E. Knox, J. B. Cross,

>-h S_

0 ^ •a c u ^

1 * I s a g

V. Bakken, C. Adamo, J. Jaramillo, R. Gomperts, R. E. Stratmann, O. Yazyev, A. J. Austin, R. Cammi, C. Pomelli, J. W. Ochterski, R. L. Martin, K. Morokuma, V. G. Zakrzewski, G. A. Voth, P. Salvador, J. J. Dannenberg, S. Dapprich, A. D. Daniels, O. Farkas, J. B. Foresman, J. V. Ortiz, J. Cioslowski and D. J. Fox, Gaussian 09, Revision A.1, Gaussian, Inc., Wallingford CT, 2009.

66 R. Dennington, T. Keith, J. Millam, K. Eppinnett, W. L. Hovell and R. Gilliland, GaussView, V3.0, Semichem Inc., Shawnee Mission, KS, 2003.

67 L. Pauling, Chem. Eng. News, 1946, 24, 1375-1377.

68 S. Chung, D. M. Himmel, J. Jiang, K. Wojtak, J. D. Bauman, J. W. Rausch, J. A. Wilson, J. A. Beutler, C. J. Thomas, E. Arnold and S. F. J. Le Grice, J. Med. Chem., 2011, 54, 4462-4473.

69 S. Chung, M. Wendeler, J. W. Rausch, G. Beilhartz, M. Gotte, B. R. O'Keefe, A. Bermingham, J. A. Beutler, S. X. Liu, X. W. Zhuang and S. F. Le Grice, Antimicrob. Agents Chemother., 2010, 54, 3913-3921.

70 T. A. Kirschberg, M. Balakrishnan, N. H. Squires, T. Barnes, K. M. Brendza, X. Chen, E. J. Eisenberg, W. Jin, N. Kutty, S. Leavitt, A. Liclican, Q. Liu, X. Liu, J. Mak, J. K. Perry, M. Wang, W. J. Watkins and E. B. Lansdon, J. Med. Chem, 2009, 52, 5781-5784.

71 D. M. Himmel, K. A. Maegley, T. A. Pauly, J. D. Bauman, K. Das, C. Dharia, A. D. Clark, K. Ryan Jr., M. J. Hickey, R. A. Love, S. H. Hughes, S. Bergqvist and E. Arnold, Structure, 2009, 17, 1625-1635.