Saturday, July 8, 2017

How Do Magnets Work? Spin, Paramagnetism, Ferromagnetism and the Ising Model


Photo Credit: Newton Henry Black, Harvey N. Davis (1913) Practical Physics, The MacMillan Co., USA, p. 242, fig. 200.

Magnets are everywhere - they play a crucial role in correcting the trajectories of charged particles in particle accelerators, they act as very effective emergency braking systems, they encode your credit card information whenever you swipe to make a purchase, and you may even have a few sitting on your refrigerator. Unfortunately, however, even many undergraduate physics majors never get to fully appreciate the origin of this strange phenomenon that we call "permanent magnetism". Indeed, not every metal can become a permanent magnet. As it turns out, magnetic materials tend to require that the individual atoms that make up the metal be sufficiently close to each other to experience an "interaction potential" whose origin is fundamentally quantum mechanical. I will attempt to tailor this discussion toward the intermediate undergraduate physics major or even the advanced high school student. Why so? Well, the full beauty of permanent magnetism is best described in its dirty quantum mechanical nature. We could talk all day about the classical origin of magnetism, treating every atom as a miniature dipole (which they are), but this watered-down description does not give the reader an understanding of where permanent magnetism really comes from; rather, it gives the reader a classical approximation to what permanent magnetism is. We will begin with a short warm-up with spin and build our way (in surprisingly short time) to the classical, macroscopic nature of magnetism through the mean field Ising model. Let's do it!

I. A QUICK REFRESHER ON SPIN


So what is spin? Angular momentum! But it is a special kind of angular momentum that allows us to write down the full rotation operator (a group element of SU(2)) as it acts upon spinor states (representations of SU(2)) by treating angular momentum as the generator of finite rotations: $$T(\theta)=\exp\bigg(\frac{-i \mathbf{\theta} \cdot (\mathbf{\hat{L}}+\mathbf{\hat{S}})}{\hbar}\bigg)$$ Okay, so what about the stationary states of $\mathbf{\hat{S}}$? Well, it is angular momentum, so the eigenvalue equation for the total spin operator should at least mirror that of any angular momentum: $$\hat{S}^2| \psi \rangle= \hbar^2 s(s+1)|\psi \rangle$$ Moreover, the eigenvalue equation of the z-projection operator should then also be: $$\hat{S}_z|\psi \rangle = \hbar m|\psi\rangle$$ Now, this is where we start jumping out of the realm of normal orbital angular momentum, $\mathbf{\hat{L}}$, and into the strange nature of spin. It is this very nature that causes spin to have no classical analogue - it is purely a quantum mechanical object with macroscopic classical effects. For fermions, the values of s are only half-integral, while for bosons, the values of s are only integral. Moreover, the stationary states of the spin operator are NOT functions - they are spinors (for spin-1/2, two-dimensional representations of the SU(2) group). For electrons, the value of s is $$s=\frac{1}{2}$$ so the z-projection eigenvalue is either $$\lambda_z=\frac{\hbar}{2}$$ which is called 'spin-up' or $$ \lambda_z=-\frac{\hbar}{2}$$ which is called 'spin-down'. Therefore, the stationary states for an electron are 2-dimensional spinors, which represent either spin-up, spin-down, or a linear combination of the two (a superposition of spin-up and spin-down). It turns out that much of the information presented here (including our quick blurb about the rotation operator) is just about all you need as an elementary basis for starting your journey to understanding quantum computing! As a final side note on our digression on the fundamentals of spin, recall that we can write the spin operators of an electron (or any spin-1/2 fermion) in terms of the Pauli matrices, $\sigma_i$, where i=1,2,3 corresponds to the x, y and z coordinates, respectively. That is: $$\hat{S}_x=\frac{\hbar}{2}\sigma_1=\frac{\hbar}{2}\begin{bmatrix}0 & 1\\1 & 0\end{bmatrix}$$ $$\hat{S}_y=\frac{\hbar}{2}\sigma_2=\frac{\hbar}{2}\begin{bmatrix}0 & -i\\i & 0\end{bmatrix}$$ $$\hat{S}_z=\frac{\hbar}{2}\sigma_3=\frac{\hbar}{2}\begin{bmatrix}1 & 0\\0 & -1\end{bmatrix}$$ The eigenvalues of the Pauli spin matrices are $+1$ for spin-up and $-1$ for spin-down. Let us then ask: what happens to a spin-1/2 particle that is subject to an external magnetic field? We will answer this question with a semi-classical analysis of the situation (i.e. we are not quantizing the magnetic field) - the results are a hallmark of 20th century physics.
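Since we will lean on these matrices repeatedly, here is a minimal numerical sketch (in Python with numpy, working in units where $\hbar=1$ - my choice, not anything imposed by the physics) that verifies the eigenvalues and the angular momentum algebra quoted above:

```python
# A quick numerical check of the spin operators, assuming only numpy.
# We verify the eigenvalues of S_z and the commutator [S_x, S_y] = i*hbar*S_z.
import numpy as np

hbar = 1.0  # work in units where hbar = 1

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),     # sigma_1 (x)
    np.array([[0, -1j], [1j, 0]], dtype=complex),  # sigma_2 (y)
    np.array([[1, 0], [0, -1]], dtype=complex),    # sigma_3 (z)
]
S = [hbar / 2 * s for s in sigma]

# Eigenvalues of S_z: +hbar/2 (spin-up) and -hbar/2 (spin-down)
print(np.linalg.eigvalsh(S[2]))  # [-0.5  0.5]

# Commutation relation [S_x, S_y] = i*hbar*S_z
comm = S[0] @ S[1] - S[1] @ S[0]
print(np.allclose(comm, 1j * hbar * S[2]))  # True

# S^2 = S_x^2 + S_y^2 + S_z^2 = hbar^2 s(s+1) I with s = 1/2
S2 = sum(s @ s for s in S)
print(np.allclose(S2, hbar**2 * 0.5 * 1.5 * np.eye(2)))  # True
```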

The Microscopic Effects of a Uniform Magnetic Field

Let us think of the simplest situation possible: an electron in a uniform magnetic field. What happens? If we think of spin as, perhaps, some sort of current (it's not) that creates a dipole moment that depends on the angular momentum of what is spinning, then the dipole moment is: $$\mathbf{\mu}=C\mathbf{L}$$ where C is some constant that keeps the dimensions consistent and L is the angular momentum of the spinning current. This model is not without its merit for the case of a single electron, as the intrinsic spin does actually generate a magnetic moment even though no current is actually orbiting the electron. We can write down the dipole moment produced by the spin of the electron as: $$\mathbf{\mu}=\frac{g e}{2mc}\mathbf{\hat{S}}$$ where e is the elementary charge, m is the electron mass, and c is the speed of light. If you have studied very much classical electrodynamics, you may be wondering where the constant 'g' came from. This constant is referred to as the 'Landé g-factor' or, simply, the 'g-factor' and was first discovered experimentally, then derived from the framework of relativistic quantum mechanics (for more information, see [1]). If a stationary dipole is subject to a uniform magnetic field, its total (non-relativistic) energy is: $$E=-\mathbf{\mu} \cdot \mathbf{B}$$ In an elementary E&M course, one may hear the heuristic argument that dipoles prefer to align with the magnetic field because it minimizes the energy. We will see why this is true within the framework of statistical mechanics in the next section. Via direct analogy, we can write the Hamiltonian of an electron subject to a uniform magnetic field as: $$\hat{H}=-\frac{g e}{2mc} \mathbf{B} \cdot \mathbf{\hat{S}}$$ Suppose that the magnetic field points in the z-direction. The Hamiltonian is then: $$\hat{H}=-\frac{g e}{2mc}B\hat{S}_z$$ Because we are only dealing with the spin operator, the eigenvalues of this Hamiltonian are: $$E=-\frac{g \hbar e}{4mc} B \text{ or } E=\frac{g \hbar e}{4mc}B$$ corresponding to spin-up or spin-down, respectively. The eigenvectors therefore diagonalize $\hat{S}_z$. Okay, that's some great information, but what is actually happening? Let us evolve the state via the time-evolution operator: $$\hat{U}(t-t_0)\lvert\psi_0\rangle=a\exp\bigg(\frac{ige}{4mc} B(t-t_0)\bigg)\lvert+\rangle+b\exp\bigg(-\frac{ige}{4mc} B(t-t_0)\bigg)\lvert-\rangle$$ where $\lvert+\rangle$ denotes a spin-up state, $\lvert-\rangle$ denotes a spin-down state, and a and b are normalization constants. If b=0 (the electron is initially in a spin-up state), then: $$\hat{U}(t-t_0)\lvert\psi_0\rangle=a\exp\bigg(\frac{ige}{4mc} B(t-t_0)\bigg)\lvert+\rangle$$ Let us think of the action of the time evolution operator in a different fashion. By identification with the rotation operator, we can see that the effect of the time evolution operator on the spin-up state is to rotate it about the z-axis by an angle of: $$\theta = \frac{ge}{2 mc} B(t-t_0)$$ Therefore, the state rotates at a constant rate of: $$\omega=\frac{ge}{2 mc} B$$ This is called the 'Larmor frequency'. We can see that the effect on a spin-down state is to rotate in the opposite direction. A linear combination of spin-up and spin-down then yields a superposition of rotations in both directions at the Larmor frequency: the probabilities of measuring spin-up or spin-down along the z-axis stay fixed, but the relative phase between the two components rotates in time, so the combined state is no longer stationary.
Hence, if the initial state is indeed a linear combination of spin-up and spin-down, the direction along which the spin points will change in time, but a measurement of $\hat{S}_z$ will still only ever return spin-up or spin-down - we will NEVER observe some other eigenvalue. Let us now move on to shortly analyzing the effect of spin in physical space, not just Hilbert space. Suppose the original orientation of the spin vector is in some intermediate direction between spin-up and spin-down; then the expectation values of the components are[2]: $$\langle \hat{S}_x \rangle=\frac{\hbar}{2}\sin(\beta)\cos\bigg(\frac{ge}{2 mc} Bt\bigg)$$ $$\langle \hat{S}_y \rangle=-\frac{\hbar}{2}\sin(\beta)\sin\bigg(\frac{ge}{2mc} Bt\bigg)$$ $$\langle \hat{S}_z \rangle=\frac{\hbar}{2}\cos(\beta)$$ where $\beta$ is the angle of the average spin vector from the z-axis. Therefore, if the spin vector points in some intermediate direction between spin-up and spin-down and the uniform magnetic field is in the z-direction, the expectation value of the spin vector will rotate about the z-axis at the Larmor frequency. What is the gist of this? Electrons in uniform magnetic fields precess about the magnetic field lines at a particular frequency called the Larmor frequency! The situation gets much more interesting when we introduce a non-uniform magnetic field, for the electron will not only precess at the Larmor frequency, but it will also gain momentum in either the up or down direction depending upon the original orientation of its spin as it passes through the magnetic field. We will not go into detail about this, but I do highly encourage the reader to read through page 399 of [1] or page 182 of [2] for more information about the particular behavior of silver atoms (which have a total spin of 1/2) as they pass through a non-uniform magnetic field in a famous experiment called the Stern-Gerlach experiment.
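To make the precession concrete, here is a small simulation sketch of the above: it evolves a tilted spin-1/2 state under $\hat{H}=-\omega\hat{S}_z$ and prints the expectation values of the spin components. The numerical values of $\omega$ and the tilt angle $\beta$ are assumptions chosen purely for illustration, with all of the physical constants lumped into $\omega$:

```python
# A sketch of Larmor precession, lumping g, e, B, m, and c into a single
# precession frequency omega -- my own simplification, with hbar = 1.
import numpy as np

hbar = 1.0
omega = 2.0       # Larmor frequency (assumed value)
beta = np.pi / 3  # initial tilt of the spin from the z-axis (assumed)

Sx = hbar / 2 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = hbar / 2 * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = hbar / 2 * np.array([[1, 0], [0, -1]], dtype=complex)

# Spin-1/2 in a uniform field along z: H = -omega * S_z
H = -omega * Sz

# Initial state tilted by beta from the z-axis
psi0 = np.array([np.cos(beta / 2), np.sin(beta / 2)], dtype=complex)

evals, evecs = np.linalg.eigh(H)
for t in np.linspace(0, np.pi, 5):
    # U(t) = exp(-i H t / hbar), built from the eigendecomposition of H
    U = evecs @ np.diag(np.exp(-1j * evals * t / hbar)) @ evecs.conj().T
    psi = U @ psi0
    sx = np.real(psi.conj() @ Sx @ psi)
    sy = np.real(psi.conj() @ Sy @ psi)
    sz = np.real(psi.conj() @ Sz @ psi)
    print(f"t={t:5.2f}  <Sx>={sx:+.3f}  <Sy>={sy:+.3f}  <Sz>={sz:+.3f}")
# <Sz> stays fixed at (hbar/2)cos(beta) while <Sx> and <Sy> rotate at omega.
```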

The Microscopic Effects of Other Electrons

We will now discuss a different type of potential that is only non-negligible when there exists a high degree of overlap between the wave functions of two electrons. This potential is called the 'exchange interaction' (it will play a crucial role in our analysis of ferromagnetism). We can write the Hamiltonian down for this interaction as: $$\hat{H}=-k_{1,2}\mathbf{\hat{S}}_1\cdot\mathbf{\hat{S}}_2$$ where $k_{1,2}$ is some constant that depends upon the degree of overlap between the wave functions (we shall elaborate in section III). The stationary state solutions to this Hamiltonian are direct product states of the two electrons, such that the three following states, called 'triplet states', are degenerate with the energy eigenvalue $E=-\frac{k\hbar^2}{4}$ corresponding to a total spin of $s=1$: $$|\psi_1\rangle=|+\rangle \otimes |+\rangle$$ $$|\psi_2\rangle=\frac{1}{\sqrt{2}}\big(|+\rangle\otimes|-\rangle+|-\rangle\otimes|+\rangle\big)$$ $$|\psi_3\rangle=|-\rangle\otimes|-\rangle$$ while the energy eigenvector for the energy eigenvalue $E=\frac{3k\hbar^2}{4}$ corresponding to s=0 is just the 'singlet state': $$|\psi_4\rangle=\frac{1}{\sqrt{2}}\big(|-\rangle\otimes|+\rangle-|+\rangle\otimes|-\rangle\big)$$ By the Pauli exclusion principle, two coupled electrons are not allowed to be in the same state (i.e. they cannot simultaneously share the same spin state and spatial wave function), so electrons that share a spatial wave function must occupy the anti-symmetric singlet state. However, in details that we shall explain in due time, combined systems of spin-1/2 can occupy triplet states depending upon the sign of $k_{1,2}$, which will play a crucial role in our analysis of ferromagnetism. Now that we have discussed the two important potentials at play, let us get into the nitty-gritty details of the Ising model by discussing first a few simple situations that will allow us to appreciate the full scope of ferromagnetism.
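As a quick check of the triplet and singlet energies quoted above, the two-spin Hamiltonian can be built explicitly as a 4x4 matrix using Kronecker products and diagonalized numerically. This is a sketch assuming $\hbar=1$ and an arbitrary coupling $k=1$:

```python
# A sketch verifying the triplet/singlet energies of H = -k S1.S2 for two
# spin-1/2 particles, with hbar = 1 and an assumed coupling k = 1.
import numpy as np

hbar, k = 1.0, 1.0
Sx = hbar / 2 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = hbar / 2 * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = hbar / 2 * np.array([[1, 0], [0, -1]], dtype=complex)
I = np.eye(2)

# Two-particle operators act on the 4-dimensional product space via
# Kronecker products: S1 = S (x) I and S2 = I (x) S.
H = -k * sum(np.kron(S, I) @ np.kron(I, S) for S in (Sx, Sy, Sz))

evals = np.linalg.eigvalsh(H)
print(np.round(evals, 4))
# [-0.25 -0.25 -0.25  0.75]: the triplet at E = -k*hbar^2/4 (threefold
# degenerate) and the singlet at E = +3*k*hbar^2/4.
```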

II. THE EFFECT OF AN INCIDENT MAGNETIC FIELD (PARAMAGNETISM)


Consider a system composed of weakly interacting spin-1/2 atoms (i.e. the exchange interactions of each atom can be neglected) at some temperature T. Then we can consider the probability of a single atom occupying a particular energy state to be described by the Boltzmann distribution: $$p(E_r,T)=\frac{\exp(-\beta E_r)}{\sum_r \exp(-\beta E_r)}$$ which we can obtain from a Taylor expansion of the natural log of the number of accessible states, $\Omega$, of the entire system of atoms while using the fundamental relation $\frac{\partial \ln\Omega}{\partial E}=\frac{1}{kT}=\beta$, where k is the Boltzmann constant and T is the temperature. $E_r$ denotes the energy of the atom in a state $r$, which is either spin-up or spin-down. We could also just get it from considering the partition function (which is the denominator of this expression). The Hamiltonian of this system is: $$H=-\mu B$$ Generally, this is an inner product, but we are assuming that $\mathbf{B}$ is uniform in magnitude and direction, so we take $\mu$ to be the component of the moment along $\mathbf{B}$. Since the atom either occupies a spin-up state or a spin-down state, the two possible energies are either positive or negative, corresponding to the magnetic moment being anti-aligned with the external field or aligned with the external field, respectively. Hence, the probability of being spin-up is: $$p_{+}(E_r,T)=\frac{\exp(\beta \mu B)}{\exp(-\beta\mu B)+\exp(\beta\mu B)}$$ while spin-down is: $$p_{-}(E_r,T)=\frac{\exp(-\beta \mu B)}{\exp(-\beta\mu B)+\exp(\beta\mu B)}$$ In the regime of relatively low temperatures: $$\bigg(\frac{\exp(\beta \mu B)}{\exp(-\beta\mu B)+\exp(\beta\mu B)}\bigg )>\bigg( \frac{\exp(-\beta \mu B)}{\exp(-\beta\mu B)+\exp(\beta\mu B)}\bigg)$$ and thus: $$p_{+} > p_{-}$$ so the probability of the magnetic moment being aligned with the magnetic field is greater than being anti-aligned. However, if the temperature is high, then $\beta \to 0$, so $$p_{+} \approx p_{-}$$ This behavior of the probabilities in the high temperature regime can be attributed to the random thermal motion of the atoms, for the higher the temperature, the higher the average kinetic energy of each atom and hence the higher the resistance of the atoms to being aligned along the direction dictated by the incident uniform magnetic field. Therefore, the preference of the magnetic moment to align with the magnetic field is fundamentally an effect of statistical mechanics! We can, however, do even better than simply calculating the probabilities of a particular atom occupying some particular state - we can go as far as calculating the magnetization of the material from the mean magnetic moment. From the definition of the mean: $$\bar{\mu} = \sum_r p_r \mu_r=\mu \frac{\exp(\beta\mu B)-\exp(-\beta\mu B)}{\exp(\beta\mu B)+\exp(-\beta\mu B)}=\mu \tanh\bigg(\frac{\mu B}{kT}\bigg)$$ From the definition of the magnetization (dipole moment per unit volume), we can let N denote the number of atoms per unit volume. Hence: $$M=N\mu \tanh\bigg(\frac{\mu B}{kT}\bigg)$$ So in the regime of high temperature (where $\tanh(x)\approx x$): $$M\approx \frac{N\mu^2}{kT}B=\chi B$$ where $\chi$ is the magnetic susceptibility. However, by the shape of hyperbolic tangent, in the regime of low temperatures and/or strong magnetic fields, $M$ saturates at $M=N \mu$. Very beautiful!
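The crossover between the Curie-law regime and saturation is easy to see numerically. The following sketch (with $N$, $\mu$, and $k$ all set to 1 - assumed units, chosen only for illustration) compares the exact magnetization to its high-temperature approximation:

```python
# A short numerical sketch of M = N*mu*tanh(mu*B/(k*T)) and its Curie-law
# limit, with mu, N, and k set to 1 (my choice of units).
import numpy as np

N, mu, k = 1.0, 1.0, 1.0

def magnetization(B, T):
    return N * mu * np.tanh(mu * B / (k * T))

B = 0.1
for T in (0.05, 1.0, 10.0, 100.0):
    exact = magnetization(B, T)
    curie = N * mu**2 * B / (k * T)  # high-temperature approximation
    print(f"T={T:7.2f}  M={exact:.5f}  Curie law={curie:.5f}")
# At high T the two agree (chi = N*mu^2/(k*T)); at low T or strong B the
# exact result saturates at M = N*mu.
```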


III. THE EFFECT OF EXCHANGE INTERACTIONS


Now that we have played around with the case of only an incident magnetic field with no coupling, let us now consider a 1-D line of spin-1/2 atoms with no external magnetic field, but with a non-negligible spin-spin coupling. Let this line not fold into itself (i.e. let us not impose periodic boundary conditions on the atoms). The discussion we gave in section I on the effect of the exchange interaction was quite elementary. We should now go into a bit more detail on exchange interactions to give the reader a good idea of what they are. As mentioned earlier, the constant, k, in $$\hat{H}=-k_{1,2}\hat{\mathbf{S}}_1\cdot\hat{\mathbf{S}}_2$$ determines whether the material favors triplet states or singlet states. We can write k in terms of a new constant, J, such that $k=J$ and J is[3]: $$J=2\frac{J_{ex}-CB^2}{1-B^4}$$ where $$J_{ex}=\int \psi_a^*(\mathbf{r_1}) \psi_b^*(\mathbf{r_2}) \bigg(\frac{1}{R_{ab}}+\frac{1}{r_{12}}-\frac{1}{r_{a2}}-\frac{1}{r_{b1}}\bigg)\psi_a(\mathbf{r_2})\psi_b(\mathbf{r_1})d^3\mathbf{r_1}d^3\mathbf{r_2}$$ which is called the 'exchange integral' (note the swapped electron labels on the right-hand wave functions, which is what makes this an exchange term). $\psi_a$ is the atomic state of atom a and $\psi_b$ is the atomic state of atom b. The quantity $R_{ab}$ is the distance between the protons a and b of each atom, $r_{12}$ is the distance between electrons 1 and 2, $r_{a2}$ is the distance between electron 2 and proton a, and $r_{b1}$ is the distance between electron 1 and proton b. The quantity, C, which is called the 'Coulomb integral', is: $$C=\int [\psi_a(\mathbf{r_1})]^2\bigg(\frac{1}{R_{ab}}+\frac{1}{r_{12}}-\frac{1}{r_{a2}}-\frac{1}{r_{b1}}\bigg)[\psi_b(\mathbf{r_2})]^2d^3\mathbf{r_1}d^3\mathbf{r_2}$$ and finally, the quantity B is called the 'overlap integral', which measures the degree of overlap between the two atomic wave functions: $$B=\int \psi_a^*(\mathbf{r})\psi_b(\mathbf{r})d^3\mathbf{r}$$ For much more complicated atoms, calculating these integrals is a daunting task, and one must resort to numerical methods, given that the vast majority of atomic wave functions (recall that the only atom for which we have an analytical solution, thanks to the Dirac equation or the Schrödinger equation with perturbative corrections, is the hydrogen atom) have no straight-forward (as in, non-approximate) form. However, we can, and quite extensively have, obtained the value of J experimentally. Materials that are anti-ferromagnetic have a value of $J<0$, thus favoring anti-alignment of their spins (i.e., they favor the singlet state of the two-electron system) because it is the lowest energy state. Materials that are ferromagnetic have a value of $J>0$, and hence their lowest energy states are the degenerate triplet states of the two-electron system. Within the Ising model, we make the assumption that, instead of the pure form that we explored for the two-electron system, we can write the approximate Hamiltonian for this system as: $$H=-\sum_{i=1}^{n-1} J_{i}\,\sigma_i\sigma_{i+1}$$ where $\sigma_i$ and $\sigma_{i+1}$ are the eigenvalues of the Pauli spin matrices explored in section I (+1 for spin-up and -1 for spin-down). As a simplification, we let the value of J be $J=J_i$ for all $i\in\{1,2,...,n-1\}$. What does this mean? Simply that the interaction potential cares only about the nearest neighbors (i.e. that each atom does not 'sense' the presence of electrons located more than one atom away) and that each atom is equally spaced (i.e.
that the integrals that J depends upon are the same for every i). This is a feasible approximation, for the value of J drops off extremely fast with distance. The value of $i$ ends at $n-1$ because at the end of the line of atoms, there is no $i+1$ atom. Let us, just for convenience, let $\sigma_i\sigma_{i+1}=\mu_i$, since the product is either +1 or -1. We can then calculate all of the necessary thermodynamic quantities straight from the partition function: $$Z=2\sum_{\mu_i=+1,\mu_i=-1}\exp\bigg(\beta J\sum_{i=1}^{n-1}\mu_i \bigg)$$ where, again, $\beta=(kT)^{-1}$, k is the Boltzmann constant, and T is the temperature. The sum goes over every value of each bond variable $\mu_i$. The factor of 2 is introduced because the map from the n spins to the n-1 bond variables is two-to-one: flipping every spin leaves all of the products $\sigma_i\sigma_{i+1}$ unchanged, so each configuration of the $\mu_i$ corresponds to exactly two spin configurations, and the partition function must go over every possible state. Moreover, because a sum in the argument of the exponential simply multiplies each exponential over the sum: $$2\sum_{\mu_i=+1,\mu_i=-1}\exp\bigg(\beta J\sum_{i=1}^{n-1}\mu_i \bigg)=2\prod_{i=1}^{n-1}\sum_{\mu_i=+1,\mu_i=-1}\exp(\beta J \mu_i)$$ $$=2\bigg(\sum_{\mu_1=+1,\mu_1=-1}\exp(\beta J \mu_1)\sum_{\mu_2=+1,\mu_2=-1}\exp(\beta J \mu_2)\text{ }\dotsi\sum_{\mu_{n-1}=+1,\mu_{n-1}=-1}\exp(\beta J \mu_{n-1})\bigg)$$ $$=2\bigg([\exp(\beta J)+\exp(-\beta J)]_1\text{ }\dotsi[\exp(\beta J)+\exp(-\beta J)]_{n-1}\bigg)$$ $$=2\bigg(\exp(\beta J)+\exp(-\beta J)\bigg)^{n-1}$$ Finally, we can simply use the identity $\exp(\theta)+\exp(-\theta)=2\cosh(\theta)$ to obtain a full expression for the partition function: $$Z=2\bigg(2\cosh(\beta J)\bigg)^{n-1}$$ How nice! Don't get too comfortable with the beautiful simplicity, however, as simply increasing the dimensionality of the problem to $D>2$ steps out of the bounds of an analytical solution for the partition function (the two-dimensional case was famously solved by Onsager; no exact solution is known in three dimensions). Moreover, adding an incident magnetic field changes the dynamics of the problem in a manner that we shall explore later. Let us now, just as we did in the last section, calculate the particular spin orientation that this Hamiltonian prefers. This amounts to calculating the average value of $\mu_i$, as the average will tell us whether neighboring spins prefer to align or anti-align. The average can be expressed in terms of the partition function in the following manner: $$\langle \mu_j \rangle = 2Z^{-1}\sum_{\mu_i=+1,\mu_i=-1}\mu_j\exp\bigg(\beta J\sum_{i=1}^{n-1}\mu_i \bigg)$$ Following the same factorization that gave us the partition function, every bond variable except the $j$-th contributes its usual factor of $\exp(\beta J)+\exp(-\beta J)=2\cosh(\beta J)$, while the weighting by $\mu_j$ turns the $j$-th factor into $\exp(\beta J)-\exp(-\beta J)=2\sinh(\beta J)$. Therefore: $$\langle \mu_j \rangle = \frac{2\big(2\cosh(\beta J)\big)^{n-2}\,2\sinh(\beta J)}{2\big(2\cosh(\beta J)\big)^{n-1}}=\tanh(\beta J)$$ where we have used the identity $\exp(\theta)-\exp(-\theta)=2\sinh(\theta)$. (Had we instead asked for the correlation between the spins at the two ends of the chain, every bond would contribute a factor of $\tanh(\beta J)$, giving $\langle\sigma_1\sigma_n\rangle=\big(\tanh(\beta J)\big)^{n-1}$.)
Hence: $$\langle \mu \rangle=\tanh(\beta J)$$ What does this tell us? For a ferromagnetic coupling $J>0$ and any finite temperature, $\tanh(\beta J)$ is positive, so the preferred orientation of each individual spin in this line of atoms is the same as that of its neighboring atoms. The lower the temperature, the easier it becomes to align all of the spins in a single direction (given that $\beta = (kT)^{-1}$, lower temperatures increase the magnitude of $\tanh$'s argument and push $\langle\mu\rangle$ toward 1). We are now ready to develop a theory for the existence of ferromagnetism, where exchange interactions play a crucial role in the permanent magnetization of a material, while an incident magnetic field plays a crucial role in generating magnetization in the first place.
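Both closed-form results above are easy to verify by brute force for a short chain, since the full state space of $n$ spins only has $2^n$ configurations. The following sketch (with assumed values $n=6$, $J=1$, $\beta=0.7$) checks the partition function, the bond average, and the end-to-end correlation:

```python
# A brute-force check of the 1-D results above for a short open chain:
# Z = 2*(2*cosh(beta*J))**(n-1) and <sigma_i sigma_{i+1}> = tanh(beta*J).
# Assumed parameters: n = 6 spins, J = 1, beta = 0.7.
import itertools
import numpy as np

n, J, beta = 6, 1.0, 0.7

Z = 0.0
bond_avg = 0.0
for spins in itertools.product([1, -1], repeat=n):
    energy = -J * sum(spins[i] * spins[i + 1] for i in range(n - 1))
    w = np.exp(-beta * energy)
    Z += w
    bond_avg += spins[0] * spins[1] * w  # first bond; any bond works
bond_avg /= Z

print(np.isclose(Z, 2 * (2 * np.cosh(beta * J)) ** (n - 1)))  # True
print(np.isclose(bond_avg, np.tanh(beta * J)))                # True

# The end-to-end correlation picks up one factor of tanh per bond:
corr = sum(
    s[0] * s[-1] * np.exp(beta * J * sum(s[i] * s[i + 1] for i in range(n - 1)))
    for s in itertools.product([1, -1], repeat=n)
) / Z
print(np.isclose(corr, np.tanh(beta * J) ** (n - 1)))         # True
```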


IV. FERROMAGNETISM (The Ising Model)


Let us reiterate what we learned in the last two sections before we really jump into the theory of ferromagnetism. I would like the reader to be capable of appreciating the feat that quantum mechanics and modern statistical mechanics accomplish in explaining a phenomenon that, in the scope of modern history, has only been well-understood in very recent times (there is actually still a lot of research done in developing a better understanding of ferromagnetism, but the essentials of ferromagnetism, which are what we are exploring, are well-understood). Perhaps the particular case of ferromagnetism that you are most familiar with is that of the convenience store neodymium magnets. If one were capable of probing a ferromagnetic material (we are) and observing the orientation of all of the spins of the individual atoms that make up the material, they would likely see something like the following:



The arrows indicate the direction of the spins. In each of the domains, called ferromagnetic domains, most of the spins are aligned in the same direction. The existence of these domains perplexed scientists for quite some time, as no then-current theory of electromagnetism was capable of explaining why these domains existed in some materials and not others. We now understand that the existence of these domains is a byproduct of what we just explored in the last section: each of the spin-1/2 atoms tends to prefer to align its spin vector in the same direction as its neighbors. We could bias all of these domains to align their spins in the same direction by applying an external field, as we explored in the second section of this article. The particular curve that the magnetization follows for different magnitudes of applied magnetic field was first discovered experimentally and is called a 'hysteresis curve', as given by the image below.


The magnitude of the magnetization depends upon the history of the applied magnetic field: if one starts at any point on this curve and adds a particular amount of magnetic field, H, the path that the magnetization will follow depends upon that particular starting point. One of the easiest ways one could carry out such a task is to simply wrap a ferromagnetic metal bar with copper wire, then add a constant current, which creates an approximately constant magnetic field down the axis of the metal bar whose direction depends upon the direction of the current. This magnetic field will bias the spins to align along a particular direction and produce a net magnetization and hence a net magnetic field (for a net magnetization produces a magnetic field). Because the material is ferromagnetic, the preferred orientation of any individual spin depends upon the spin of its neighbors, so the material will remain magnetized. What one has done is create a permanent magnet simply by wrapping wire around a rod and running current through it! Permanent magnetism is hence a result of quantum mechanics - to think that it was hiding under our noses the whole time! Wonderful, now let us get into the Ising model in its full detail by considering a lattice of ferromagnetic atoms and adding together the two interactions that we discussed in the last two sections for each atom in the following manner: $$H=-\sum_{\langle ij \rangle}J_{ij}\sigma_i\sigma_j-\mu B\sum_i\sigma_i$$ In this Hamiltonian, $i$ denotes each atom in the lattice and $j$ denotes the set of atoms nearest to atom $i$; hence the notation in the first sum, $\langle ij \rangle$, means to sum over all nearest-neighbor pairs. The potential on the right is the same as the one we considered in section II, but now we are summing over each of the atoms in the lattice. Let us now make our first approximation before we get into the big approximation (the mean field approximation) - let's assume that every atom is surrounded by six nearest neighbors, two on each Cartesian axis. Moreover, assume that each of these atoms is equally spaced from atom $i$ and that $J>0$ for each pair. This looks like the following picture.





Now, unfortunately, this 3-dimensional model of ferromagnetism does not have an analytical solution for the partition function. That is, unless we make one more approximation. This 'mean field' approximation, however, still allows us to probe the existence of second-order phase transitions in the model and many other properties that are important to our understanding of ferromagnetic materials. We have actually already made one key assumption, and that is to assume that each of the atoms is equally spaced with a constant J value that is greater than zero. The second key assumption is that the average spin of any particular atom is equal to some constant order parameter, m, and that the deviation from this average is some small quantity, $\delta \sigma_i$. We can then do a cute little trick by adding zero to each value of spin: $$\sigma_i=m+(\sigma_i-m)=m+\delta \sigma_i$$ We can then write the Hamiltonian as: $$H=-\sum_{\langle ij \rangle}J_{ij}(m+\delta \sigma_i)(m+\delta \sigma_j)-\mu B\sum_i\sigma_i$$ $$=-\sum_{\langle ij \rangle}J_{ij}(m^2+m(\delta\sigma_i+\delta\sigma_j)+\delta\sigma_i\delta\sigma_j)-\mu B\sum_i\sigma_i$$ Dropping the term that is second order in the small fluctuations, $\delta\sigma_i\delta\sigma_j$: $$H=-\sum_{\langle ij \rangle}J_{ij}(m^2+m(\delta\sigma_i+\delta\sigma_j))-\mu B\sum_i\sigma_i$$ $$=-\sum_{\langle ij \rangle}J_{ij}(-m^2+m(\sigma_i+\sigma_j))-\mu B\sum_i\sigma_i$$ where in the last step, we re-inserted $\delta\sigma_i=\sigma_i-m$ to re-obtain the sigmas. For the next part, a lot of resources tend to leave out the details, seemingly writing the next step as some trivial operation on the Hamiltonian. It is true that it is trivial, but without some illumination, it can be somewhat frustrating to figure out what is going on. Consider the first sum: $$-\sum_{\langle ij \rangle}J_{ij}(-m^2)$$ We have already made the assumptions necessary to treat J as the same quantity for every neighbor of lattice site i. The sum over nearest-neighbor pairs then simply counts the number of bonds: N atoms times z neighbors, with a factor of 1/2 because each bond is shared between two lattice sites (the 1/2 prevents us from double counting the j's). $$-\sum_{\langle ij \rangle}J_{ij}(-m^2)=\frac{1}{2}NzJm^2$$ For the second sum, since each site $i$ appears in exactly z bonds, summing $\sigma_i+\sigma_j$ over all of the bonds counts each spin z times: $$\sum_{\langle ij \rangle}(\sigma_i+\sigma_j)=z\sum_i \sigma_i$$ Wonderful, so now our Hamiltonian is: $$H=\frac{1}{2}JzNm^2-(Jzm+\mu B)\sum_i \sigma_i$$ In calculating the partition function from this, what we have done may not seem like much of an improvement, but we have actually already calculated a partition function of exactly this form in section III: $$Z=\sum_{\sigma_i=+1,\sigma_i=-1}\exp\bigg(-\frac{1}{2}\beta JzNm^2+\beta(Jzm+\mu B)\sum_i \sigma_i\bigg)$$ $$=\exp\bigg(-\frac{1}{2}\beta JzNm^2\bigg)\sum_{\sigma_i=+1,\sigma_i=-1}\exp\bigg(\beta(Jzm+\mu B)\sum_i \sigma_i\bigg)$$ where the remaining sum factorizes over the N independent spins, just as before. Therefore: $$Z=\exp\bigg(-\frac{1}{2}\beta JzNm^2\bigg)\bigg[2\cosh\bigg(\beta(Jzm+\mu B)\bigg)\bigg]^N$$ Wonderful! Now comes the important part of our derivation. We have made a lot of assumptions.
Let's see if we can obtain a self-consistency equation for m that will justify all of the assumptions that we have made in getting this far. We do this by minimizing the Helmholtz free energy, which we can calculate using the partition function. $$F=-\beta^{-1}\ln(Z)=-\beta^{-1}\ln\bigg(\exp\bigg(-\frac{1}{2}\beta JzNm^2\bigg)\bigg[2\cosh\bigg(\beta(Jzm+\mu B)\bigg)\bigg]^N\bigg)$$ Hence: $$F=\frac{1}{2}JzNm^2-\beta^{-1}N\ln\bigg(2\cosh\bigg(\beta(Jzm+\mu B)\bigg)\bigg)$$ Via elementary calculus, we can minimize this free energy to obtain the following self-consistency equation for the value of m (the average value of $\sigma$): $$\frac{\partial}{\partial m}F=0 \implies m=\tanh\bigg(\beta(Jzm+\mu B)\bigg)$$ There is a lot of detail in this equation, as it encodes much of the information that we 'expect' about how permanent magnets behave. First of all, phase transitions. Are there any phase transitions evident in this? This may seem like somewhat of a cute analysis, but it will answer the question in a simple way. Let us define a new variable: $$x=\beta(Jzm+\mu B)$$ so that $m=\frac{kTx-\mu B}{Jz}$, and our equation becomes: $$\frac{kT}{Jz}x-\frac{\mu B}{Jz}=\tanh(x)$$ If we look at the simple form of $\tanh$ using an online graphing calculator, we note the following form:

desmos.com
The solutions to our consistency equation lie at the intersections of a straight line of slope $\frac{kT}{Jz}$ (whose intercept is set by B) with the $\tanh$ function. Hence, for slopes that are large (that is, for high temperatures) and B=0, there will exist exactly one solution, at $\tanh(x)=0$, i.e. $m=0$. Hence, the average spin is zero. This is precisely what we should expect, as the random thermal motions caused by large temperatures should randomly orient the spins. Only with a stronger magnetic field can we begin to re-order the spins. Our phase transition occurs when we decrease the slope. As a matter of fact, it occurs right at: $$\frac{kT}{Jz}=1$$ This is because the slope of $\tanh$ approaches 1 as its argument approaches zero, so once $kT<Jz$ the line and the curve intersect away from the origin; this defines the critical temperature $T_c=\frac{Jz}{k}$. What do we expect below the critical temperature, then? Three solutions!

desmos.com

There hence exist solutions at m=a, m=-a, and m=0, where a is a number set by the slope; however, the m=0 solution is unstable, so the order parameter will tend to stick around either m=a or m=-a, i.e. either highly biased toward being spin-up or highly biased toward being spin-down. We can see the beauty of these results in the following 2-D Monte Carlo Ising model simulations from the University of Texas. First, we have simply a panel of atoms at a temperature of $T=40T_c$, where $T_c$ is the critical temperature in the zero magnetic field regime (even though we assumed that there existed a magnetic field in our entire derivation, we can set B=0 either at the beginning or end of the derivation and retrieve the same result).

http://farside.ph.utexas.edu/teaching/329/lectures/node110.html

The black denotes spin-up and the white denotes spin-down. Notice that the spins are fairly evenly distributed at this higher temperature, and hence we expect the order parameter to be inching toward $m=0$; however, if we take a look at the same simulation at $T=3.6T_c$:

http://farside.ph.utexas.edu/teaching/329/lectures/node110.html

The spins become heavily biased toward spin-up! Very beautiful! We expect the same behavior if we were to bias the simulation by increasing the magnetic field while keeping the temperature constant, for our self-consistency equation tells a story of a fine balance between temperature and the introduction of an incident magnetic field.
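If you would like to play with the self-consistency equation yourself, it can be solved by simple fixed-point iteration. The following sketch uses assumed values $J=1$, $z=6$, $k=1$, and $B=0$, for which the analysis above predicts $T_c=Jz/k=6$; the iteration settles onto a nonzero $m$ below $T_c$ and collapses to $m=0$ above it:

```python
# A sketch that solves m = tanh(beta*(J*z*m + mu*B)) by fixed-point
# iteration, with assumed values J = 1, z = 6, k = 1, mu = 1, B = 0.
# Below T_c = J*z/k = 6 the iteration settles onto a nonzero m (the
# m = +a or m = -a branch is chosen by the initial guess); above T_c it
# collapses to the single m = 0 solution.
import numpy as np

J, z, k, mu, B = 1.0, 6.0, 1.0, 1.0, 0.0

def solve_m(T, m0=0.5, iters=2000):
    beta = 1.0 / (k * T)
    m = m0
    for _ in range(iters):
        m = np.tanh(beta * (J * z * m + mu * B))
    return m

for T in (1.0, 3.0, 5.9, 6.1, 12.0):
    print(f"T = {T:5.1f}  m = {solve_m(T):+.4f}")
# Output (approximate): m near +1 at low T, shrinking toward 0 as T -> T_c = 6,
# and identically 0 above T_c.
```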

Wonderful! We have now thoroughly scratched the surface of ferromagnetic materials using the theoretical tools of statistical mechanics! I cannot lie to my reader, however, and assert that we have probed the full beauty of ferromagnetism. As a matter of fact, we have only considered a semi-classical approximation to the properties of ferromagnets. A much more rigorous treatment of ferromagnetism must come from the operator formalism of quantum mechanics and the broader subject of quantum field theory. This is, however, still subject to much research, and hence I leave the reader with a suggestion to go further and explore more complicated models of ferromagnetism now that you have a basic grasp of one of the less complicated models out there (I must note, however, that the Ising model is still subject to a high degree of research, as, though it is simple in the mean field theory, it still has no exact analytical solution in three or more dimensions).

REFERENCES
[1] Shankar, R. (2014). Principles of Quantum Mechanics. New York, NY: Springer.
[2] Griffiths, D. J. (2017). Introduction to Quantum Mechanics. Cambridge: Cambridge University Press.
[3] Van Vleck, J. H. (1932). The Theory of Electric and Magnetic Susceptibilities. Oxford: Clarendon Press, p. 318.
[4] Kopietz, P., Bartosch, L., & Schütz, F. (2013). Introduction to the Functional Renormalization Group. Berlin: Springer.
[5] Kittel, C., & Kroemer, H. (1998). Thermal Physics. New York: W. H. Freeman and Company.
[6] Griffiths, D. J. (1999). Introduction to Electrodynamics. Upper Saddle River: Prentice Hall.
[7] Schwartz, M. (2005). Principles of Electrodynamics. New York: Dover Publications.
[8] Reif, F. (2010). Fundamentals of Statistical and Thermal Physics. Calcutta: Levant Books.

Sunday, June 11, 2017

An Explanation of Paramagnetism from the Framework of Statistical Mechanics

Here, we discuss a beautifully straight-forward way of calculating the effects of an incident magnetic field on a substance composed of atoms with a net spin of $s=\frac{1}{2}$ from the framework of statistical mechanics. As a side note on notation, most derivations will denote the magnetic field by $\mathbf{H}$; however, I am not fond of this misuse of notation, for the "true" magnetic field is $\mathbf{B}$, $\mathbf{H}$ being merely a measure of the magnetic field caused by free currents, called the auxiliary field (which does not even have the same units as $\mathbf{B}$). Now, consider a system composed of weakly interacting spin-1/2 atoms (i.e. the spin-spin interactions of each atom can be neglected) at some temperature T. Then we can consider the probability of a single atom occupying a particular energy state to be described by the Boltzmann distribution: $$p(E_r,T)=\frac{\exp(-\beta E_r)}{\sum_r \exp(-\beta E_r)}$$ which we can obtain from a Taylor expansion of the natural log of the number of accessible states, $\Omega$, of the entire system of atoms while using the fundamental relation $\frac{\partial \ln\Omega}{\partial E}=\frac{1}{kT}=\beta$, where k is the Boltzmann constant and T is the temperature. $E_r$ denotes the energy of the atom in a state $r$, which is either spin-up or spin-down. Recall from classical electrodynamics that the energy (or Hamiltonian in our notation) of a dipole interacting with an external field is: $$H=-\mu B$$ where $\mu$ is the magnetic moment. Generally, this is an inner product, but we are assuming that $\mathbf{B}$ is uniform in magnitude and direction, so we take $\mu$ to be the component of the moment along $\mathbf{B}$. Since the atom either occupies a spin-up state or a spin-down state, the two possible energies are either positive or negative, corresponding to the magnetic moment being anti-aligned with the external field or aligned with the external field, respectively. Hence, the probability of being spin-up is: $$p_{+}(E_r,T)=\frac{\exp(\beta \mu B)}{\exp(-\beta\mu B)+\exp(\beta\mu B)}$$ while spin-down is: $$p_{-}(E_r,T)=\frac{\exp(-\beta \mu B)}{\exp(-\beta\mu B)+\exp(\beta\mu B)}$$ In the regime of relatively low temperatures: $$\bigg(\frac{\exp(\beta \mu B)}{\exp(-\beta\mu B)+\exp(\beta\mu B)}\bigg )>\bigg( \frac{\exp(-\beta \mu B)}{\exp(-\beta\mu B)+\exp(\beta\mu B)}\bigg)$$ and thus: $$p_{+} > p_{-}$$ so the probability of the magnetic moment being aligned with the magnetic field is greater than being anti-aligned. However, if the temperature is high, then $\beta \to 0$, so $$p_{+} \approx p_{-}$$ This behavior of the probabilities in the high temperature regime can be attributed to the random thermal motion of the atoms, for the higher the temperature, the higher the average kinetic energy of each atom and hence the higher the resistance of the atoms to being aligned along the direction dictated by the incident uniform magnetic field. Therefore, the preference of the magnetic moment to align with the magnetic field is fundamentally an effect of statistical mechanics! We can, however, do even better than simply calculating the probabilities of a particular atom occupying some particular state - we can go as far as calculating the magnetization of the material from the mean magnetic moment.
From the definition of the mean: $$\bar{\mu} = \sum_r p_r \mu_r=\mu \frac{\exp(\beta\mu B)-\exp(-\beta\mu B)}{\exp(\beta\mu B)+\exp(-\beta\mu B)}=\mu \tanh\bigg(\frac{\mu B}{kT}\bigg)$$ From the definition of the magnetization (dipole moment per unit volume), we can let N denote the number of atoms per unit volume. Hence: $$M=N\mu \tanh\bigg(\frac{\mu B}{kT}\bigg)$$ So in the regime of high temperature (where $\tanh(x)\approx x$): $$M\approx \frac{N\mu^2}{kT}B=\chi B$$ where $\chi$ is the magnetic susceptibility. However, by the shape of hyperbolic tangent, in the regime of low temperatures and/or strong magnetic fields, $M$ saturates at $M=N \mu$. Very beautiful!


References:
Reif, Frederick. Fundamentals of Statistical and Thermal Physics. Calcutta: Levant Books, 2010. Print.

Friday, April 28, 2017

Proof: The Set of all Positive Integers Is the Same Size as the Set of all Integers

In the following discussion, I will provide a quick, painless, and cute proof that the set of all positive integers has the same number of elements, which is called cardinality, as the set of all integers. My goal is to make this as readable as possible for a more general audience. Before we delve into our discussion, we must build up some preliminary knowledge. First, let us define a function:

Function: A relation between any two sets such that for every element in the domain, there exists a unique element in the codomain.

The domain is the 'set of all inputs', whereas the codomain is the 'set of all possible outputs' (the set of outputs actually achieved is called the range). What our definition means, colloquially, is that a function takes any element in a set, say the set of all real numbers, and maps it onto only one element in another set - the set of all real numbers, donkeys, neoconservatives, whatever we define our function to be a mapping between. We then ask ourselves: how can we compare the sizes of any two sets? As it turns out, any two sets have the same size if there exists a function between them which is both one-to-one (injective) and onto (surjective) - such a function is called a bijection. Let us then define one-to-one and onto:

One-to-one: For any two elements in the domain, if a function maps these elements to the same element in the codomain, then the two elements in the domain are equal to each other.

Onto: For every element in the codomain, b, there exists an element, a, in the domain such that f(a)=b.

The reason why we seek a bijective function is that bijective functions have inverses which are also functions, i.e. every element in the codomain of some bijective function maps onto some element in the domain by the inverse of the bijective function. This is only possible if the codomain is the same size as the domain and vice-versa. Therefore, if we want to prove that two sets have the same size, we must find a bijective function that maps elements from one set to the other. If a function has an inverse that is also a function, we are guaranteed that the function is bijective. We can use this to prove all sorts of seemingly ridiculous statements, such as our original goal - so let's do it:

Theorem: The set of all positive integers has the same cardinality as the set of all integers.

Proof. Let us define a function whose domain is the set of all positive integers, such that $$F(n)=n/2$$ for all positive even integers and $$F(n)=\frac{1-n}{2}$$ for all positive odd integers. If we plug positive even integers into our first function, we generate the set of all positive integers and if we plug in positive odd integers into the second function, we generate the set of all integers less than or equal to zero. The inverse of this function for the set of all positive even integers is simply $F^{-1}(N)=2N$, whereas the inverse of this function for the set of all positive odd integers is $F^{-1}(N)=1-2N$. Since we can define an inverse, this function is bijective. Since there exists a bijective function that maps elements between the set of all positive integers and the set of all integers, these two sets have the same cardinality.
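For the computationally inclined, the bijection in the proof is easy to play with directly. Here is a small sketch that lists the first few values of $F$ and checks the round trip through the inverse:

```python
# A quick sketch of the bijection in the proof, checking that F and its
# inverse round-trip correctly on a small range of positive integers.
def F(n):
    # Maps positive integers onto all integers: evens to positives,
    # odds to integers less than or equal to zero.
    return n // 2 if n % 2 == 0 else (1 - n) // 2

def F_inv(N):
    # Inverse map: positive outputs came from evens, the rest from odds.
    return 2 * N if N > 0 else 1 - 2 * N

print([F(n) for n in range(1, 11)])
# [0, 1, -1, 2, -2, 3, -3, 4, -4, 5]: every integer appears exactly once.
print(all(F_inv(F(n)) == n for n in range(1, 1000)))  # True
```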

Tuesday, March 7, 2017

The Bending of Electric Field Lines about Unlike Ponderous Media

Perhaps one of the coolest properties of electric fields, and amongst the most important, is that they can bend when moving between media with different dielectric constants. In the following discussion, we are going to use a rather simple method (that is, the method of images) to show that the electric field lines of a point charge in a dielectric of permittivity $\epsilon_1$ will bend according to the relative magnitude of the permittivity of another medium, $\epsilon_2$, that we will allow them to pass through. The scenario that we are dealing with is depicted in the figure below.
There are five conditions that we need to satisfy to completely describe the potential and electric field of our setup in the figure above. First, we are dealing with an electrostatic situation, so $$\nabla \times \mathbf{E} = 0 $$ Furthermore, in region I we have a charge, so we should expect a point charge distribution existing within the dielectric material $$\nabla \cdot \mathbf{E} = \frac{1}{\epsilon_1} \rho$$ while in region II, we should observe no charge, i.e. $$\nabla \cdot \mathbf{E} = 0$$ Our final conditions are what set this solution apart from a normal old method of images problem, as instead of imposing that the potential be zero at the plane between the dielectric media, we impose continuity conditions on the displacement field in the z-direction and the electric field in the $\rho$-direction. $$\mathbf{D}^{\text{perp.}}_{\epsilon_1} |_{z=0} = \mathbf{D}^{\text{perp.}}_{\epsilon_2} |_{z=0}$$ $$E^{\text{para.}}_{\epsilon_1}\hat{\rho} |_{z=0} = E^{\text{para.}}_{\epsilon_2} \hat{\rho} |_{z=0}$$ The first of these last two conditions is simply a statement asserting that there is no free surface charge at the interface, so there is no discontinuity in the displacement field at the boundary between the two dielectric media. The very last condition is a statement asserting that, equivalently, there is no discontinuity in the tangential electric field along the plane splitting the two regions. These are both standard boundary conditions typical of many electrostatic problems. Starting off with the normal old method of images trick, we suppose that there exists a dummy charge, $q_2$, on the other side of the plane that makes the potential in region I $$\phi_{\epsilon_1}=\frac{1}{4\pi\epsilon_1}\bigg( \frac{q}{r_1} + \frac{q_2}{r_2} \bigg)$$ where $r_1$ and $r_2$ are separation distances, not to be confused with an attempt to center either charge at the origin and use spherical coordinates. As a matter of fact, they are $$r_1=\sqrt{\rho^2+(z-a)^2}$$ $$r_2=\sqrt{\rho^2+(z+a)^2}$$ And that is all we really need to do for region I. Now we would like to obtain a solution for the potential in region II. Our use of the method of images will already satisfy our third condition, as we can simply suppose that there exists an image charge in region I, such that $$\phi_{\epsilon_2}=\frac{1}{4\pi \epsilon_2}\frac{q_3}{r_1}$$ i.e., the image charge is located at the same exact position as our real charge; however, it carries a charge value of $q_3$. What is left to us now is to show that our choice of potentials is not arbitrary and that we can, indeed, satisfy the boundary conditions, exploiting the uniqueness theorem for Laplace's equation to find the one and only solution to the setup depicted in the figure at the beginning of our derivation.
By our fourth condition, we require that $$\epsilon_1 \mathbf{E}^{\text{perp.}}_{\epsilon_1} |_{z=0} = \epsilon_2 \mathbf{E}^{\text{perp.}}_{\epsilon_2} |_{z=0}$$ meaning that the gradients of our potentials in the z-direction must satisfy: $$\epsilon_1 \frac{\partial}{\partial z}\phi_{\epsilon_1} |_{z=0} = \epsilon_2 \frac{\partial}{\partial z} \phi_{\epsilon_2} |_{z=0}$$ Furthermore, utilizing our fifth condition $$ \frac{\partial}{\partial \rho}\phi_{\epsilon_1} |_{z=0} = \frac{\partial}{\partial \rho} \phi_{\epsilon_2} |_{z=0}$$ Putting these two together yields the following relationships between our charges $$\frac{1}{\epsilon_1}(q+q_2)=\frac{q_3}{\epsilon_2}$$ $$q-q_2=q_3$$ Rearranging our two relations between the charges, we obtain the two image charges in terms of our original charge $$q_2=\frac{\epsilon_1-\epsilon_2}{\epsilon_1+\epsilon_2}q$$ $$q_3=\frac{2\epsilon_2}{\epsilon_1+\epsilon_2}q$$ so our potentials are then $$\phi_{\epsilon_1}=\frac{1}{4 \pi \epsilon_1}\bigg(\frac{q}{r_1}+\frac{(\epsilon_1-\epsilon_2)q}{(\epsilon_1+\epsilon_2)r_2}\bigg)$$ for region I, and $$\phi_{\epsilon_2}=\frac{1}{4\pi \epsilon_2}\bigg( \frac{2\epsilon_2 q}{(\epsilon_1+\epsilon_2)r_1}\bigg)$$ for region II. Notice something very fascinating that you may have already anticipated earlier - the first potential (that is, for region I) looks a hell of a lot like a dipole. Depending on the relative magnitude of $\epsilon_1$ and $\epsilon_2$, it does indeed act like one: for $\epsilon_2>\epsilon_1$, the image charge $q_2$ is negative, so we obtain a solution that looks like two charges of unlike sign and magnitude separated by a finite distance (a physical dipole), while for $\epsilon_2<\epsilon_1$, our solution looks like the potential of two charges with the same sign but different magnitudes of charge. This is reflected in the following drawings of the electric fields produced by our image charge solution, such that the first drawing depicts $\epsilon_2>\epsilon_1$ and the second drawing depicts $\epsilon_2<\epsilon_1$:
Now, the whole goal of our discussion was to show that the electric field lines of a point charge bend, and this is precisely what our solution displays. Because the potential in region II acts like that of a single point charge, whereas the potential in region I acts like that of two charges of like or unlike sign, our continuity conditions guarantee that the field lines bleed into the second region and bend at the boundary while remaining continuous.
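For readers who want to play with the solution, here is a small sketch that computes the image charges and potentials derived above and numerically checks that the potential is continuous across the $z=0$ plane. All parameter values are assumptions chosen for illustration, and the prefactors simply follow the text:

```python
# A sketch of the image-charge solution above: given the permittivities and
# the real charge q at height a above the interface, return the image
# charges and evaluate the potential in each region. All parameter values
# below are assumed, and the 1/(4*pi*eps) prefactors follow the text.
import numpy as np

def image_charges(q, eps1, eps2):
    q2 = (eps1 - eps2) / (eps1 + eps2) * q  # image seen from region I
    q3 = 2 * eps2 / (eps1 + eps2) * q       # effective charge for region II
    return q2, q3

def potential(rho, z, q, a, eps1, eps2):
    q2, q3 = image_charges(q, eps1, eps2)
    r1 = np.hypot(rho, z - a)  # distance from the real charge
    r2 = np.hypot(rho, z + a)  # distance from the mirror point
    if z >= 0:  # region I, where the real charge lives
        return (q / r1 + q2 / r2) / (4 * np.pi * eps1)
    else:       # region II: a single effective charge q3 at the real site
        return q3 / (4 * np.pi * eps2 * r1)

# The potential is continuous across the z = 0 plane, as the boundary
# conditions demand:
q, a, eps1, eps2 = 1.0, 1.0, 2.0, 5.0
print(np.isclose(potential(0.7, 1e-9, q, a, eps1, eps2),
                 potential(0.7, -1e-9, q, a, eps1, eps2)))  # True
```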


References:
[1] Jackson, John David. Classical Electrodynamics. New York: John Wiley & Sons, 1962. Print.
[2] Griffiths, David J. Introduction to Electrodynamics. Noida, India: Pearson India Education Services, 2015. Print.