1. Introduction
In a recent article published in Studies in History and Philosophy of Modern Physics, Jeffrey Ramsey has begun the difficult task of trying to distinguish between the virtues of ab initio and parametrized (semi-empirical) calculations in the physical sciences (Ramsey 2000). While I agree with many of the ideas that Ramsey expresses on this topic, I also want to express some friendly disagreement with some aspects of his analysis.
After doing so, I will consider some recent work on chemical calculations which I believe gives a new twist to the question of ab initio and parametrized theories or, to use Ramsey's apt phrase, “principles and parameters.”
2. Ramsey's Analysis
The traditional view of calculations in the physical sciences is that work carried out in a purely theoretical fashion, from the beginning, or ab initio, is to be preferred to calculations that explicitly incorporate some experimental parameters. The latter method is called semi-empirical, because it consists, broadly speaking, of half theory and half experimental data which has been imported into the calculation.
As Ramsey mentions, there is a sense in which no calculation can ever be said to be fully ab initio. Even the most fundamental theories of physics, for example, do not generally compute the masses of elementary particles from first principles, as things stand at present. The mass of the electron, or that of the proton, for example, is fixed according to experimental data. But these features do not prevent physicists from regarding such calculations as being of an essentially ab initio nature. As Ramsey also mentions, it is more a matter of the degree to which a particular method is regarded as ab initio, rather than there being a clear-cut distinction between ab initio and parametrized methods.[1]
But the remainder of Ramsey's article appears to be an attempt to question the widely held view that ab initio calculations are somehow better regarded than semi-empirical ones. Ramsey writes,
In short, the debate between the ‘principle’ and ‘parameter’ approaches has been cast as a ‘right’ vs a ‘wrong’ way to produce theory. (Ramsey 2000, 549)
And when there are many different projects of explanation in a science, multiple styles of theory production and articulation are needed. All this implies that the debate in chemistry should not be presented as a debate between the ‘right’ vs the ‘wrong’ approaches to theory production. (550–551)
The author then appeals to two case studies, one from physics and one from chemistry. Of course, I am sure that Ramsey appreciates the dangers of falling prey to the naturalistic fallacy in this context, and yet there are sections in his article in which he appears to blur this question. For example, although the nature of science may well require a plurality of approaches, this need not necessarily bear on the question of whether ab initio calculations are preferable to semi-empirical work, all other things being equal. Ramsey first turns to physics and examines the development of some theories of turbulence in the 1940s and 50s.
Many twentieth century theoretical physicists have been strongly attracted to a style of theory production that emphasizes principles rather than parameters. Thus, one might expect to hear physicists condemning semi-empirical approaches as strongly as the chemists do. However, the history of turbulence theory shows that physicists have been quite accepting of alternative styles when the situation has demanded it. (Ramsey 2000, 551)
His conclusion from this case is that although the early theory of turbulence, due to Heisenberg and von Weizsäcker, was overtly semi-empirical, this feature did not provoke too much criticism from fellow physicists. Ramsey argues, from this single and rather selective example, that physicists have a high tolerance for semi-empirical work.
In summary physicists have aimed for ‘principle’ theories, but they have accepted other forms of theories when conditions demand them. (Ramsey 2000, 554)
He then turns to some semi-empirical theories of chemical kinetics. By appeal to statements from various researchers of the period, Ramsey claims that chemists do not share the same degree of tolerance towards semi-empirical work. Ramsey reports that some chemists go as far as to “pour scorn” over semi-empirical approaches. Although I do not propose to go through Ramsey's examples in detail I would just like to consider the suggestion that chemists, unlike physicists, “pour scorn” upon parametric theories.
If one looks through the quotations that Ramsey has gathered to support this claim, there is little evidence for it. For example, Ramsey cites some critics of the semi-empirical approach in chemistry as commenting in the following way,
To establish the validity of the assumption on theoretical grounds would require a detailed consideration of the effectiveness of collisions in producing energetic molecules. Up to the present such a calculation has proved intractable […]. Calculations which ignore this phenomenon [of the production of activated species] seem to us too crude to be useful, and none have as yet been made which take adequate account of it (Guggenheim and Weiss, quoted in Ramsey 2000, 558).
This can hardly be said to be a case of pouring scorn. The strongest objection by purist chemists that Ramsey brings to bear is the following,
If the fraction of the coulombic binding energy ‘is to be adjusted for each reaction so as to give the right answer, the method is useless’. (Guggenheim, quoted in Ramsey 2000, 558)
These are admittedly stronger reservations from chemists, but more work would be needed to determine whether the authors of the semi-empirical calculations were indeed guilty of this charge. In any case, I think that a single quotation to the effect that a particular calculation may be “useless” does not provide sufficient evidence for saying that there exists a general trend whereby chemists pour scorn on semi-empirical calculations, in contrast to physicists who, Ramsey alleges, do not.
On the other hand, Ramsey also acknowledges that many historians and philosophers recognize that chemistry is less dominated by theories than physics, and that,
The majority of theoretical chemists have noted the need to construct theories that are appropriate to chemical problems, and chemists have been more willing than physicists to live with multiple representations. (Ramsey 2000, 554–555)
He then expresses his puzzlement and poses a rather telling question,
Given this, one would expect theoretical chemists to employ constructive theories more freely and thus for the semi-empirical techniques to be received rather more warmly than they were. (555)
The correct response to Ramsey's puzzlement is surely not too difficult to see. I claim that it is not generally the case that physicists are more forgiving of semi-empirical methods, as Ramsey's one example suggests to him. In general the opposite seems to be true. Quite in keeping with what historians and philosophers have pointed out, chemistry does indeed employ semi-empirical theories more freely, and chemists are in fact more permissive than physicists in their acceptance of this tendency. How can Ramsey seriously suggest otherwise, using just one example from physics and one from chemistry to challenge this well-established fact? I think that the case studies he has selected may be misleading, as may be the general conclusions that he attempts to draw from them.
I do not believe that the chemical purists do indeed “pour scorn” over their more parametrically inclined colleagues. What they do instead is to express a stylistic preference. Some theorists like to deal in analytical solutions, or approximations, which assume no experimental evidence. Others are interested in obtaining results or correlations at any cost and do not object to “tainting” the calculations by introducing experimental parameters. I use the word tainting deliberately since I do think that parametrized calculations are indeed the poor relation of ab initio work. The reason is quite simple. In the case of ab initio calculations there can be little, or no, suspicion that the theorist has fudged things to make his or her calculation appear better than it really is. The same cannot unfortunately be said for semi-empirical calculations. If one incorporates empirical data it becomes easier to produce a calculation that matches with the experimental data being calculated. Not surprisingly, such an achievement is held in lower esteem than a piece of research that is produced in a purely theoretical manner.
3. Calculations of Equilibrium Structures in Quantum Chemistry via ab Initio and Semi-Empirical Approaches
In the second part of my paper I too will undertake a brief case study of another area of science in which ab initio calculations coexist alongside semi-empirical calculations. In fact, as I hope to show, the situation here is not merely one of coexistence. Rather than one side showing scorn for the other, it is more in the nature of a symbiotic relationship in which elements of the two approaches have become almost inextricably linked. This feature, as I will argue, further complicates any attempt to demarcate between ab initio and parametric approaches by considering scientific practice, although I still maintain that practitioners regard ab initio work as preferable in chemistry. The area I will examine involves contemporary calculations of equilibrium structures rather than the reaction dynamics on which Ramsey has focused. Calculations of equilibrium properties seek to determine such things as the geometry of a molecule, including bond angles and bond lengths. In addition, such calculations may seek to predict dipole moments and polarizabilities, properties that are essential in many areas of chemistry, biology, and physics.[2]
In carrying out this case study I also take the opportunity to discuss density functional theory (DFT) alongside the more established ab initio chemical calculations based on wavefunctions. Although a number of papers dealing with ab initio calculations in quantum chemistry have now appeared in the philosophy of science literature, I do not think anyone has yet discussed density functional approaches. This is a rather serious omission given that as much ab initio work, if not more, is now conducted using DFT.
4. Two Kinds of ab Initio Calculations in Chemistry
Ab initio calculations in chemistry have traditionally been carried out via the wavefunction of the system in question. The development of this approach can be traced all the way back to Schrödinger, who first introduced the wavefunction into quantum mechanics, and then through Heitler and London, who used a wavefunction approach to calculate the energy of the hydrogen molecule. Another early contribution was Douglas Hartree's self-consistent method for calculating the energy of an atom. First a wavefunction is chosen and expressed as a product of atomic orbital functions. A variation of these orbitals is then performed, one at a time, with the aim of minimizing the energy of the system. The method involves solving the Schrödinger equation for the system,
\[ \hat{H}\Psi = E\Psi \]
where the wavefunction Ψ itself is being generated and optimized in the course of the calculation of the allowed energy values. Another approach that has gained prominence more recently, but that also has a long history, is the density functional method. This method does not use orbitals and wavefunctions, at least not in principle.
5. Density Functional Approach
In 1927 the physicist Llewellyn Thomas proposed treating the electrons in an atom by analogy to a statistical gas of particles (Thomas 1927). Electron shells and orbitals play no role in this model. This approach was independently rediscovered by the Italian physicist Enrico Fermi, and is now called the Thomas-Fermi method (Fermi 1927). For many years it was regarded as a mathematical curiosity without much hope of application, as the results obtained were inferior to those obtained by the method based on wavefunctions and electron orbitals. But on the positive side, the Thomas-Fermi method treats the electrons around the nucleus as a perfectly homogeneous electron gas, and the mathematical solution for the Thomas-Fermi model is ‘universal’, which means that it can be solved once and for all. This offers a potential advantage over traditional, or wavefunction-based, ab initio methods, within which a separate Schrödinger equation must be solved for each atom in the periodic table, for example. The traditional approach is thus akin to “empirical mathematics,” a feature that seems to deprive it of a true ab initio status. The Thomas-Fermi method, by contrast, is able to obtain a general solution for the energy of all the atoms in the periodic table at one stroke, at least in principle.
Gradually the Thomas-Fermi method and density functional theories, as its modern descendants are known, have become as powerful as methods based on orbitals and wavefunctions and in many cases can outstrip the wavefunction approaches in terms of computational accuracy and computational efficiency.
6. Density Functional Theory in Practice
What has been described so far concerning density functional theory applies in theory rather than in practice. The fact that the Thomas-Fermi method is capable of yielding a universal solution for all atoms in the periodic table is a potentially attractive feature but is generally not realized in practice. Because of various technical difficulties, the attempts to implement the ideas originally due to Thomas and Fermi have not quite materialized.[3] This has meant a return to wavefunctions and orbitals even within DFT calculations (Kohn and Sham 1965). We thus return to the need to solve a number of equations separately for each individual atom, as one does in the Hartree-Fock method and other ab initio methods using atomic orbitals. In addition, the more tractable approaches in density functional theory also involve a return to the use of atomic orbitals in carrying out quantum mechanical calculations, since there is no known means of directly obtaining the functional that captures the exact electron density.
A further, much-touted promise from DFT theorists was that whereas a wavefunction approach involves a great deal of redundant information, since it operates in 3N-dimensional space, electron density studies would be more economical since they operate in real 3D space. Needless to say, this promise too has not materialized in view of the partial return to using wavefunctions.
To make matters worse, the use of a uniform gas model for electron density does not enable one to carry out accurate calculations. Instead, a density gradient must be introduced into the uniform electron gas distribution. This has typically been implemented in a semi-empirical manner by working backwards from the known results on a particular atom, usually the helium atom (Gill 1998).
It has thus become possible to obtain sets of functions which often give successful approximate calculations in many atoms and molecules other than helium. But there is as yet no known way of calculating, in an ab initio manner, the required density gradient that must be introduced into the calculations. By carrying out this combination of semi-empirical procedures and retreating from the pure Thomas-Fermi notion of a uniform electron gas, it has actually been possible to obtain computationally better results in many cases of interest than with conventional ab initio methods.
And yet in spite of this kind of parametrization, DFT is frequently presented as being a “first principles” approach at the same level of purity as ab initio calculations based on wavefunctions.[4]
As time has progressed, the best features of both methods (DFT and ab initio orbital methods) have been blended together (see Table 1), with the result that many computations are now performed by a careful mixture of wavefunction and density approaches even within the same specific problem (Hehre et al. 1986). This feature brings with it advantages as well as disadvantages. There is really no such thing as a pure density functional method for performing calculations, and the philosophical appeal of a universal solution for all the atoms has not yet borne fruit.[5]
Table 1
Method | Value (Hartrees) |
---|---|
Exact value | −0.392 |
MP2 (wavefunction method) | −0.152 |
Pairs (wavefunction method) | −0.210 |
W38 (DFT) | −0.360 |
VWN (DFT) | −0.746 |
LYP (DFT) | −0.383 |
GG91 (DFT) | −0.389 |
Note: All values in Hartrees. Data based on Gill (1998).
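To make the comparison in Table 1 explicit, the deviation of each tabulated value from the exact value can be worked out from the figures in the table. The symbol Δ is introduced here purely for illustration and is not part of the original table; it is taken to mean the method's value minus the exact value, in Hartrees.

```latex
\begin{align*}
\Delta &= E_{\text{method}} - E_{\text{exact}}, \qquad E_{\text{exact}} = -0.392 \\
\Delta_{\text{MP2}}   &= -0.152 - (-0.392) = +0.240 \\
\Delta_{\text{Pairs}} &= -0.210 - (-0.392) = +0.182 \\
\Delta_{\text{W38}}   &= -0.360 - (-0.392) = +0.032 \\
\Delta_{\text{VWN}}   &= -0.746 - (-0.392) = -0.354 \\
\Delta_{\text{LYP}}   &= -0.383 - (-0.392) = +0.009 \\
\Delta_{\text{GG91}}  &= -0.389 - (-0.392) = +0.003
\end{align*}
```

On these figures the two smallest deviations belong to density functionals (LYP and GG91), while another density functional (VWN) overshoots by nearly the magnitude of the exact value itself.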
7. A Digression on Models of the Solar System and the Atom
A statement often made in quantum mechanics is that contrary to earlier beliefs, the atom does not behave as a miniature solar system since electrons do not move in orbits around the nucleus. It is stressed that whereas planets have definite trajectories or deterministic orbits, things are altogether different in the case of the atom where electrons lack trajectories and have an indeterministic motion.
This may all be true in conceptual terms, but I think that this disanalogy may be somewhat overdone. In purely mathematical terms the analogy remains extremely valuable. Although it may be true that in conceptual terms the electrons do not orbit the nucleus in the same manner that planets orbit the sun, the mathematical equations for the two systems and the manner in which they are solved show striking similarities.
For example, consider a calculation of the energy of the solar system. The expression is given below:
First there is the contribution from the kinetic energy of the planets, with the characteristic terms quadratic in each planet's velocity. The second term represents the potential energy due to the interaction of each of the planets with the sun. These are of course gravitational interactions, given by the product of the masses divided by the distance between each planet and the sun. The final term represents the interaction between the planets themselves and is the source of serious difficulties since it involves the coupling together of all the planets.
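The three contributions just described can be written out explicitly. The following is a sketch in notation assumed here rather than taken from the text: the sun is treated as fixed, $m_i$ and $v_i$ are the mass and speed of planet $i$, $M$ is the mass of the sun, $r_i$ is the planet-sun distance, $r_{ij}$ is the distance between planets $i$ and $j$, and $G$ is the gravitational constant.

```latex
\[
E \;=\; \sum_{i} \tfrac{1}{2}\, m_i v_i^{2}
\;-\; \sum_{i} \frac{G M m_i}{r_i}
\;-\; \sum_{i<j} \frac{G\, m_i m_j}{r_{ij}}
\]
```

Only the last sum couples different planets to one another; the first two are sums of single-planet terms.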
There are various approaches one may take. The crudest approximation is to omit the third term altogether. This has the effect of de-coupling the planets and allows one to treat the problem as a set of independent problems, one for each of the planets interacting with the sun. Alternatively, and more accurately, the third term can be replaced by one that is similar to the second. We can reasonably suggest that each planet moves in a field due to the sun and also an average field due to all the other planets combined together. This can be represented by the next equation shown below.
The troublesome coupling terms explicitly involving distances between planets have now disappeared and have been replaced by terms involving the average field ν. Once again the problem has been reduced to one involving independent equations, one for each of the planets. An interesting aspect of this problem is that the average field must depend on the orbits of the planets and vice versa, the orbits must depend on the average field. This requirement leads to a self-consistent field method.
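The self-consistency requirement can be sketched as follows, again in assumed notation: $\varepsilon_i$ is the energy of planet $i$ treated on its own, $\nu_i$ is its potential energy in the averaged field of the other planets, and the superscript $(k)$ counts iterations. Neither these symbols nor the iteration scheme are spelled out in the text, so this is an illustration of the idea rather than a definitive formulation.

```latex
\begin{align*}
\varepsilon_i &= \tfrac{1}{2}\, m_i v_i^{2} - \frac{G M m_i}{r_i} + \nu_i(r_i)
  && \text{(one independent problem per planet)} \\
\nu_i^{(k+1)} &= \text{average over the orbits of all planets } j \neq i
  \text{ found using } \nu^{(k)}
  && \text{(field rebuilt from the orbits)}
\end{align*}
```

Starting from a guessed field, one alternates between the two lines until the field no longer changes appreciably from one iteration to the next.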
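Turning to the analogy with the atomic and molecular case, the energy of the system is obtained by solving the Schrödinger equation
\[ \hat{H}\Psi = E\Psi \]
already mentioned earlier, in which the Hamiltonian, written here in atomic units for an atom of nuclear charge Z with n electrons, is given by
\[ \hat{H} = -\frac{1}{2}\sum_{i=1}^{n}\nabla_i^{2} \;-\; \sum_{i=1}^{n}\frac{Z}{r_i} \;+\; \sum_{i<j}\frac{1}{r_{ij}} \]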
Just as in the case of the solar system, the first term represents the kinetic energy of the electrons moving around the nucleus. The second term is due to the nucleus-electron potential energy from each of the electrons in the system. The third term represents the mutual coupling of electrons. Of course, unlike the astronomical case, the interactions in this and the previously mentioned terms are Coulombic rather than gravitational. As in the case of the solar system, we could make a drastic approximation and simply ignore the third term, but then the energy calculated would be rather inaccurate. On the other hand, we can do what we did in the astronomical case and replace the third term with one that allows us to calculate the movement of each electron in an average field due to the nucleus and all the other electrons combined together. The equation now takes on the following form,
Again the bonus is that we can now treat the problem as one of n independent or pseudo-independent equations, one for each of the electrons. This too is a self-consistent field problem in which the field determines the electronic motion while the electronic motion in turn determines the field.
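For the atomic case, the average-field form can be sketched in the same way. The notation is assumed here (atomic units, nuclear charge $Z$, $n$ electrons with orbitals $\varphi_i$ and orbital energies $\varepsilon_i$), and the explicit expression for $\nu_i$ is the one used in Hartree's method, offered as one concrete example rather than as the form the text necessarily has in mind.

```latex
\begin{align*}
\hat{H} &\approx \sum_{i=1}^{n} \hat{h}_i, \qquad
\hat{h}_i = -\tfrac{1}{2}\nabla_i^{2} - \frac{Z}{r_i} + \nu_i(\mathbf{r}_i) \\
\hat{h}_i\,\varphi_i &= \varepsilon_i\,\varphi_i \\
\nu_i(\mathbf{r}) &= \sum_{j \neq i} \int
  \frac{|\varphi_j(\mathbf{r}')|^{2}}{|\mathbf{r}-\mathbf{r}'|}\, d\mathbf{r}'
\end{align*}
```

The circularity is visible in the last two lines: each $\nu_i$ is built from the orbitals $\varphi_j$, which are themselves solutions of equations containing their own average fields.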
The first attack on calculating an approximate self-consistent average field in the atomic case was carried out by Douglas Hartree in 1928 using wavefunctions and orbitals, and this formed the basis of most ab initio calculations in chemistry until recently. The alternative DFT approach involving electron density, rather than electron orbitals, was initiated by Thomas and Fermi as already mentioned. Then in 1964 Hohenberg and Kohn, working within the DFT approach, proved a very powerful theorem whereby there exists a unique potential which, if used self-consistently, actually generates the exact Schrödinger energy of the system (Hohenberg and Kohn 1964). The bad news is that this is an existence proof but not a constructive proof. It does not tell us how to actually construct this wonderful potential. It appears that all self-consistent field methods have been attempts to arrive at this potential. Some, like Hartree's method, involve wavefunctions and orbitals, while others, like the Thomas-Fermi method that later evolved into DFT methods, do not.
Regardless of which approach is used, wavefunctions or density, the total electronic energy in the case of the atom can be expressed in the following useful form,
\[ E = E_T + E_V + E_J + E_X + E_C \tag{6} \]
The first three terms correspond to the energies yielded by the first three terms in the Hamiltonian of equation (5). The fourth and fifth terms in (6) are needed to correct for the deficiencies of the energy calculated using the third term in the Hamiltonian. This is because the self-consistent method used to calculate the average electron-electron repulsion suffers from the following drawbacks. The way in which the averaging of electron-electron repulsions is carried out in Hartree's method does not take account of the fact that electrons are completely indistinguishable. In the more general Hartree-Fock method an exchange of all electrons is carried out. In physical terms this implies that electrons with the same spin are made to keep away from each other in the manner required by the Pauli Principle.
Finally, the fifth term calculates the small but all-important correlation energy of the atom. This represents the correlation or, more physically speaking, the mutual avoidance of electrons with different spins. This effect is not addressed by the Pauli Exclusion Principle and hence not addressed by the fourth, so-called exchange, term. Some idea of the relative magnitudes and signs of the energies arising from these five contributions can be seen by considering a typical atomic case such as neon.
$E_T$ | $E_V$ | $E_J$ | $E_X$ | $E_C$ | Units |
---|---|---|---|---|---|
+129 | −312 | +66 | −12 | −0.4 | atomic units |
Although the correlation energy term may appear very small in comparison with the other four contributions, it is essential that accurate calculations faithfully reproduce as much as possible of this term if they are to be of any chemical value for most applications. Indeed, most of the effort in modern computational chemistry is directed at obtaining more accurate calculations of the correlation energy term. Interestingly, the second and third terms are calculated in the same way in any quantum mechanical procedure. This leaves the first, fourth and fifth terms. The particular manner in which they are calculated determines the type of self-consistent field approach that is used or, as is more often the case, the combination of such approaches.
If one considers the three crucial terms in my equation (6), namely $E_T$, $E_X$ and $E_C$, it emerges that different combinations of wavefunction and density approaches are used to evaluate some of these (see Table 2). Indeed, the entanglement between current wavefunction and density approaches is so pronounced that it becomes increasingly difficult to tell the difference between them. This is especially so for many practicing chemists and physicists for whom these methods have become just computational black boxes.
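The relative size of the correlation term can be checked directly from the rounded neon values quoted above. This is only illustrative arithmetic on those rounded figures, so the total should be read as approximate.

```latex
\begin{align*}
E &= E_T + E_V + E_J + E_X + E_C \\
  &\approx 129 - 312 + 66 - 12 - 0.4 \;=\; -129.4 \ \text{atomic units} \\
\frac{|E_C|}{|E|} &\approx \frac{0.4}{129.4} \approx 0.003
\end{align*}
```

The correlation energy is thus roughly 0.3 percent of the total, yet 0.4 atomic units is larger than a typical chemical bond energy, which is why so small a fraction still matters for chemical purposes.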
Table 2
Method of calculating $E_T$ | Method of calculating $E_X$ and $E_C$ |
---|---|
Orbital functional | Orbital functional* |
Orbital functional | Orbital & density functionals |
Orbital functional | Density functional |
Density functional | Density functional+ |
* This approach corresponds to traditional ab initio methods using only wavefunctions and orbitals.
+ This approach (pure density functional method) is difficult to implement because $E_T$ is not generally amenable to density functionals.
8. Conclusions
What bearing does the state of affairs I have described have on the initial issue of ab initio versus semi-empirical approaches? All I can say is that if one considers the actual situation in modern computational chemistry, there appear to be aspects of ab initio and semi-empirical work interwoven at every level. I think it becomes almost impossible to judge the virtues of the two approaches, and one must fall back again on normative considerations. There seems little doubt that density functional theory has been forced to renege on its initial promises over various issues, and this is regarded by some experts as a failure. For example, as I have mentioned, the idea of a uniform electron gas does not quite work. It needs to be supplemented with a gradient function which, loosely speaking, introduces a “lumpiness” in certain parts of the electron gas, and this is carried out semi-empirically.
For example, Peter Gill has written a witty article in which he laments the ‘passing away’ of DFT, which he describes as an initially promising computational approach (Gill 2001). Of course the method has not really passed away. What Gill is getting at is that the purity or ab initio nature of the method has been seriously compromised by recent developments.
But computational scientists make demanding masters and, before long, she [DFT] was being driven to change. “You would be much more attractive,” they whispered, “if you would submit to a little parameterization. It won't hurt very much.” Not content with her elegant simplicity, they insisted that she provide the same results as her sophisticated and expensive brother [ab initio wavefunction method], but at a fraction of his price. Motivated by an insatiable hunger for perfection at no cost, they cared not a whit for her welfare.
So the re-invention began. Layers of parameters—the rouge of computational science—were plastered onto her frail frame until, as the final decade of the century dawned, she could barely recognize herself. “The panacea of all of quantum chemistry's ills!”, some declared. “Mutton dressed as lamb,” others muttered and, sickened, she silently agreed.
Finally, after suffering from a succession of excruciating fits, she turned to an eminent Canadian surgeon [A. D. Becke]. He examined her, drew a deep breath and sighed. “There is little that I can do for you,” he admitted. “You have advanced Hyperparametric Disorder and there is no known cure. You should resign yourself to a future of infinite regression.”
“Oh no!”, she cried. “The only thing worse than rigor mortis is the death of rigour!” (Gill 2001, 662).
My conclusion is that chemists and physicists do whatever needs to be done in order to obtain greater accuracy in their calculations. But this will not stop the more reflective among them, like Gill, from lamenting the loss which results when one has to incorporate experimental data in order to make calculations work better. Yes, a semi-empirical calculation is often closer to nature, but only because we have built in a chunk of nature. The ultimate theoretical goal is to try to model nature as closely as possible without getting one's theoretical hands too dirty. In this respect ab initio work will always be better than semi-empirical approaches.