Origin of Mott insulating behavior and superconductivity in twisted bilayer graphene
Abstract
A remarkable recent experiment has observed Mott insulator and proximate superconductor phases in twisted bilayer graphene when electrons partly fill a nearly flat miniband that arises at a ‘magic’ twist angle. However, the nature of the Mott insulator, origin of superconductivity and an effective low energy model remain to be determined. We propose a Mott insulator with intervalley coherence that spontaneously breaks U(1) valley symmetry, and describe a mechanism that selects this order over the competing magnetically ordered states favored by the Hunds coupling. We also identify symmetry related features of the nearly flat band that are key to understanding the strong correlation physics and constrain any tight binding description. First, although the charge density is concentrated on the triangular lattice sites of the Moiré pattern, the Wannier states of the tightbinding model must be centered on different sites which form a honeycomb lattice. Next, spatially localizing electrons derived from the nearly flat band necessarily breaks valley and other symmetries within any meanfield treatment, which is suggestive of a valleyordered Mott state, and also dictates that additional symmetry breaking is present to remove symmetryenforced band contacts. Tight binding models describing the nearly flat miniband are derived, which highlight the importance of further neighbor hopping and interactions. We discuss consequences of this picture for superconducting states obtained on doping the valley ordered Mott insulator. We show how important features of the experimental phenomenology may be explained and suggest a number of further experiments for the future. We also describe a model for correlated states in trilayer graphene heterostructures and contrast it with the bilayer case.
Contents
 I Introduction
 II Electronic Structure of Twisted Bilayer Graphene: General Considerations
 III Low Energy Theory: Two Band Projection
 IV Intervalley coherent order: phenomenological motivation
 V Simple theory of the IVC ordered state
 VI Intervalley coherent Mott insulators: generalities and a concrete example
 VII Other possible Mott insulating states
 VIII Tight Binding Model
 IX Model for correlated states in trilayer graphene heterostructure
 X Proposed future experiments
 XI Conclusion
 XII Acknowledgements
 A Lattice and symmetries
 B Valley Symmetry and Wannier Obstruction
 C Wannier functions
 D Tightbinding model
 E HartreeFock theory for selection of IVC ordering
 F Spinorbital model for Mott insulators in twisted trilayer graphene
I Introduction
Superconductivity occurs proximate to a Mott insulator in a few materials. The most famous are the cuprate high materials Lee et al. (2006); others include layered organic materials Powell and McKenzie (2011), certain fullerene superconductors Iwasa and Takenobu (2003), and some iron based superconductors Song et al. (2016). In these systems, there is a complex and often poorly understood relationship between the Mott insulator and the superconductor which has spurred tremendous research activity in condensed matter physics in the last 30 years. Very recently, in some remarkable developments, both Mott insulating behavior and proximate superconductivity have been observed in a very different platform: two layers of graphene that are rotated by a small angle relative to each other Cao et al. (2018a, b).
Twisted bilayer graphene (TBG) structures have been studied intensely in the last few years Lopes dos Santos et al. (2007); Trambly de Laissardière et al. (2010); Bistritzer and MacDonald (2011); Wong et al. (2015); Kim et al. (2017); Huang et al. (2018); Rickhaus et al. (2018). The charge density is concentrated on a Moiré pattern which forms (at least approximately) a triangular lattice Trambly de Laissardière et al. (2010); Wong et al. (2015); Kim et al. (2017). The electronic states near each valley of each graphene monolayer hybridize with corresponding states from the other monolayer. When the twisting angle is close to certain discrete values known as the magic angles, theoretical calculations show that there are two nearly flat bands (per valley) that form in the middle of the full spectrum that are separated from other bands.Bistritzer and MacDonald (2011) When the carrier density is such that the chemical potential lies in the middle of these nearly flat bands interaction effects are expected to be enhanced. At a filling of or (denoted and respectively with full band filling denoted ) of these nearly flat bands, Ref. Cao et al., 2018a reported insulating behavior at very low temperatures. At such fillings band insulation is forbidden leading naturally to the expectation that these are correlationdriven (Mott) insulators. Doping the Mott insulator at band filling  with either electrons or holes  reveals superconductivity at low Cao et al. (2018b).
A number of other striking observations have been made about both the Mott insulator and the superconductor from transport studies in a magnetic field. The Mott insulation is suppressed through the Zeeman coupling of the magnetic field at a low scale  roughly the same scale as the activation gap inferred from zero field resistivity. Quantum oscillations are seen in the hole doped state with a frequency set (in the hole doped side) by the density deviation from the Mott insulator. The degeneracy of the corresponding Landau levels is half of what might be expected from the spin and valley degrees of freedom that characterize electrons in graphene. The superconductivity occurs at temperatures that are high given the low density of charge carriers. Just like in other doped Mott insulators there is a dome of superconductivity with reaching an “optimal” value at a finite doping. The superconductivity is readily suppressed in accessible magnetic fields  both perpendicular and parallel to the plane.
The observation of these classic phenomena in graphene gives new hope for theoretical progress in addressing old questions on Mott physics and its relationship to superconductivity. They also raise a number of questions. What is the nature of the insulators seen at these fractional fillings? How are they related to the observed superconductivity? On the theoretical side what is an appropriate model that captures the essential physics of this system?
In this paper we make a start on addressing these questions. The two nearly flat bands for each valley found in the band structure have Dirac crossings. We argue that these Dirac crossings are protected by symmetries of the TBG system. We show that this precludes finding a real space representation of the nearly flat bands in terms of Wannier orbitals localized at the triangular Moiré sites, in contrast to natural expectations. Thus a suitable real space lattice model is necessarily different from a correlated triangular lattice model with two orbitals (corresponding to the two valleys) per site. We show that a representation that is faithful to the Dirac crossings is possible instead on a honeycomb lattice with two orbitals per site but that even this has some subtleties. First the symmetry associated with separate conservation of electrons of each valley (which we dub ) is not implemented locally in such a honeycomb description. Second, since the charge density is concentrated at the Moiré triangular sites (which appear as the centers of the honeycomb plaquettes), the dominant interaction is not an onsite Coulomb repulsion on the honeycomb sites. Rather it is a ‘cluster charging energy’ that favors having a fixed number of electrons in each honeycomb plaquette. This makes this model potentially rather different from more standard Hubbard models with onsite interactions.
Armed with this understanding of the microscopics we can begin to address the experimental phenomenology. We propose that this system spontaneously breaks the valley symmetry  we call the resulting order as “InterValley Coherent” (IVC). We discuss microscopic mechanisms that stabilize IVC symmetry breaking. We point out that even when the IVC is fully polarized it cannot, by itself, lead to a fully insulating state but rather leads to a Dirac semimetal. The development of a true insulator needs a further symmetry breaking (or some more exotic mechanism) to gap out the Dirac points. We show that once the valley symmetry is spontaneously broken, the physics at lower energy scales can be straightforwardly formulated in terms of a real space honeycomb lattice tightbinding model with a dominant cluster charging interaction, and other weaker interactions. We outline a number of different possible routes in which a true insulator can obtain in such a IVC ordered system. A concrete example is a state that further breaks rotational symmetry. We show how doping this specific IVC insulator can explain the phenomenology of the experiments. We present a possible pairing mechanism due to an attractive interaction mediated by Goldstone fluctuations of the IVC phase. We describe and contrast features of other distinct routes by which the IVC state can become a true insulator at . We propose a number of future experiments that can distinguish between the different routes through which a IVC can become a true insulator.
Very recently a heterostructure of ABC trilayer graphene and boron nitride (TLG/hBN) which also forms a triangular Moiré superlattice even at zero twist angle was studied Chen et al. (2018). This system also features nearly flat bands that are separated from the rest of the spectrum. Correlated Mott insulating states were seen at fractional fillings of the nearly flat band. Unlike the TBG, here the nearly flat band has no Dirac crossing. This makes the details of the two systems potentially rather different. In particular the nearly flat band of the TLG/hBN can be modeled in real space as a triangular lattice model with two orbitals per site, supplemented with interactions. However the hopping matrix elements are, in general, complex but with some symmetry restrictions. We describe some properties of this model. We suggest that this system offers a good possibility to realize novel kinds of quantum spinorbital liquid states.
Ii Electronic Structure of Twisted Bilayer Graphene: General Considerations
ii.1 Setup
First, to establish notation, let us consider a graphene monolayer, with lattice vectors and (see Appendix A for details). The honeycomb lattice sites are located at
Now consider the Moiré pattern generated in the twisted bilayer problem. For concreteness, imagine we begin with a pair of perfectly aligned graphene sheets, and consider twisting the top layer by an angle relative to the bottom one. Now we have two pairs of reciprocal lattice vectors, the original ones and . Like references Lopes dos Santos et al. (2007); Bistritzer and MacDonald (2011) we approximate the Moiré superlattice by the relative wavevectors, leading to a periodic structure with reciprocal lattice vectors . For small theta we can approximate this by: , thus the Moiré pattern also has triangular lattice symmetry and is rotated ninety degrees and much larger than the atomic lattice. Note, in this dominant harmonic approximation, questions of commensuration/ incommensuration are avoided since no comparison is made between the other sets of harmonics of the Moiré superlattice that are not symmetryrelated to the dominant one.
Let us now briefly review the low energy electronic structure of monolayer graphene to set the notation. Parameterizing for , the Bloch Hamiltonian for the nearest neighbor hopping model is:
(1) 
Note that, as a general property of our present choice of Fourier transform, the Bloch Hamiltonian is not manifestly periodic in the Brillouin zone. Rather, for any reciprocal lattice vector , we have , where . One can now pass to a continuum limit near each Dirac points and . We then have the linearized Hamiltonian
(2) 
where in our simple nearestneighbor model. Since is not periodic in the BZ, expanding about the other equivalent Dirac points will lead to a slightly modified form of the Hamiltonian (due to conjugation by some ). In second quantized notation we can write the continuum Hamiltonian:
(3) 
where the momentum integration is understood to be implemented near the Dirac point momentum by introducing a cutoff and
Next, we couple the degrees of freedom in the two layers of graphene and arrive at a continuum theory for the twisted bilayer graphene system Lopes dos Santos et al. (2007); Bistritzer and MacDonald (2011). First, we note that the rotated Bloch Hamiltonian of a monolayer can then be identified as . Linearizing about the rotated K point , obtains , with defined by replacing in , where
(4) 
Focusing on a single valley, say , The continuum theory Lopes dos Santos et al. (2007); Bistritzer and MacDonald (2011) of the twisted bilayer graphene system is described by the Hamiltonian , where
(5) 
where and respectively denote the top and bottom layers, and we set and . Here, we have introduced , which characterizes the momentum transfer between the electronic degrees of freedom of the two layers Bistritzer and MacDonald (2011). The symmetries of the system, which we will discuss in the following subsection, constrain Mele (2011) to take the form , where are real parameters ^{1}^{1}1Note that, due to a different choice of basis, our parameterization is slightly different from that of Refs. Bistritzer and MacDonald, 2011; Mele, 2011. We have therefore defined with an additional minus sign to account for the difference in basis choice.. Similarly, one can generate the omitted symmetryrelated terms by applying symmetries on .
ii.2 Symmetries of the Continuum Theory
Let us discuss how the symmetries of the graphene monolayer are modified in the twisted bilayer problem within the dominant harmonic approximation. We will see that in addition to the Moiré translation symmetry, we have C rotation, time reversal and a mirror symmetry. Furthermore, a U(1) valley symmetry that allows us to assign valley charge to the electrons will be an emergent symmetry in the low energy limit. The generator of C rotation and time reversal will flip the valley charge, while reflection leaves it invariant.
Microscopically, the stacking pattern of the two layers can be specified as follows Bistritzer and MacDonald (2011); Mele (2011); Fang and Kaxiras (2016): first, we align the two layers perfectly in a siteonsite manner, corresponding to the “AA stacking” pattern, and then we shift the top layer by a vector parallel to the plane; second, we rotate the top and bottom layers respectively by angles and clockwise. For generic values of and one expects that almost all of the spatial symmetries are broken.
However, within the dominant harmonic approximation it was found that, on top of possessing Moiré lattice translation symmetries, the effective theory is also insensitive to Bistritzer and MacDonald (2011). This implies that, given , the effective theory will at least possess all the exact symmetries for any choice of . A particularly convenient choice is when we take , and that we rotate the two layers about the center of a hexagon. In this case, we can infer all the pointgroup symmetries of the system by focusing on the center of the hexagons (Fig. 1a). Aside from the rotational symmetries generated by the sixfold rotation , we see that there is an additional mirror plane , which, in fact, combines a mirror perpendicular to the 2D plane together with an inplane mirror which flips the top and bottom layers. Strictly speaking, this leads to a twofold rotation in 3D space, but when restricting our attention to a 2D system it acts as a mirror.
To summarize, the effective theory will at least have the following spatial symmetries: lattice translations, a sixfold rotation and a mirror. This allows one to uniquely identify its wallpaper group (i.e., 2D space group) as (numbered 17 in Ref. Hahn, 2006). Having identified the symmetries of the system, one can derive the model following a phenomenological approach by systematically incorporating all symmetryallowed terms with some cutoff Mele (2011). We have tabulated the explicit symmetry transformation of the electron operators in Appendix A.
In the effective theory the degrees of freedom arising form the microscopic K and K’ points are also essentially decoupled Lopes dos Santos et al. (2007); Bistritzer and MacDonald (2011). This is because, for a small twist angle satisfying , we have , and therefore the coupling between the K and K’ points is a very highorder process. Hence, on top of the usual electron charge conservation, the effective theory has an additional conservation corresponding to the independent conservation of charge in the two valleys K and K’. Henceforth, we will refer to this as “valley conservation.” The valley charge operator is given by:
(6) 
Note that, as timereversal interchanges the K and K’, it is not a symmetry of a single valley. Similarly, also interchanges the two valleys, thus:
(7) 
We then see that their combined symmetry, , is a symmetry in the singlevalley problem. In fact, one can check that the symmetries of singlevalley problem is described by the magnetic space group 183.188 (BNS notation; Ref. Litvin, 2013). We have tabulated the generating symmetries in Table 1.
Symmetry  Remarks  

Broken by valley polarization  
–  
Broken by perpendicular electric field  
Pins Dirac points to and  
Protects the local stability of the Dirac points  
– 
Iii Low Energy Theory: Two Band Projection
Formally, the continuum effective theory Lopes dos Santos et al. (2007); Bistritzer and MacDonald (2011) we described corresponds to an infiniteband problem for each valley. However, near charge neutrality it was found that, for some range of angles, the Moiré potential can induce additional band gaps at certain commensurate filling of the Moiré unit cell. The “nearly flat bands” identified near the magic angle correspond to two bands per valley, separated from all other bands by band gaps, that form Dirac points at the and points in the Moiré BZ. These bands correspond to the relevant degrees of freedom for the correlated states observed in Refs. Cao et al., 2018a, b, and in the following we will focus our attention to the properties of these bands. In this section, we will always focus on a single valley, say that corresponding to the K point in the microscopic description.
iii.1 Symmetryenforced Band Contacts
A salient feature of the effective theory is the presence of Dirac points at charge neutrality, whose velocity is strongly renormalized and approaches zero near the magic angle Lopes dos Santos et al. (2007); Bistritzer and MacDonald (2011). The stability of the Dirac points can be understood from symmetries: for a single valley, is symmetric under the (magnetic) pointgroup generated by . In particular, , and therefore we can label each band at by its eigenvalue, which takes value in . In particular, a band with eigenvalue is necessarily degenerate with another with eigenvalue , as on these bands and enforces a Kramerslike degeneracy. The observed Dirac points at charge neutrality correspond precisely to this twodimensional representation (Fig. 1b).
While we have alluded to the presence of symmetry in explaining the stability of the Dirac points, these band contacts are actually locally stable so long as the symmetry is kept. This can be reasoned by noting that quantizes the Berry phase along any closed loop to , and a Dirac point corresponds precisely to the case of a nontrivial Berry phase Kim et al. (2015).
Let us now consider the effect of breaking the various symmetries (spontaneously or explicitly) in the system. First, as is crucial in protecting the local stability of the Dirac points, once it is broken the Dirac points can be immediately gapped out (Fig. 1c). However, as long as symmetry is preserved, a small breaking of any other spatial symmetries will not lead to a gapped band structure at charge neutrality. For instance, the mirror maps K to K’, and its presence only ensures that the two inequivalent Dirac points are at the same energy. Therefore, even when a perpendicular electric field is externally applied such that is broken, as in the set up of Refs. Cao et al., 2018a, b, it can only induce an energy difference between the two Dirac points Lopes dos Santos et al. (2007) (Fig. 1d). This should be contrasted with the case of Bernalstacked bilayer graphene, whose quadratic band touching at charge neutrality can be gapped by an external electric field ^{2}^{2}2This can be understood by noting that, in the case of Bernal stacking, the system has an effective symmetry which interchanges the top and bottom layer. A perpendicular electric field will therefore break this symmetry, and hence the band degeneracy at charge neutrality is lifted.. Alternatively, if symmetry is broken the Dirac points are unpinned from and (Fig. 1e). As such, for a sufficiently strong breaking, the Dirac points could meet their oppositely charged partners and annihilate Goerbig and Montambaux (2017).
Now consider the case when valley conservation is spontaneously broken by an IVC, i.e., the valley charge is no longer conserved. In this case, we should first consider the full fourband problem consisting of both valleys. At, say, K, the combined symmetry of ensures that the Dirac points from the two valleys are degenerate. While such degeneracy is lifted in the presence of an IVC, as long as the remaining symmetries are all intact we can only split the degeneracy according to (Fig. 1f). This remaining twofold degeneracy rules out an interpretation of the experimentally observed Mott insulator as a Slater insulator with a spatialsymmetryrespecting (ferro) IVC incorporated at the HartreeFock level. Instead, one must either introduce additional symmetry breaking, say that of or lattice translations, or consider an IVC which also breaks some additional spatial symmetries. We will elaborate on these points in Sec. VI. We also note that, an essentially identical argument holds for the case of spontaneously ferromagnetic order leading to fully spinpolarized bands of ordering. In this way it connects to the half filled Mott insulator we will be interested in.
iii.2 Triangular versus Honeycomb Lattice
A conventional route for understanding the correlated states observed in Refs. Cao et al., 2018a, b is to first build a realspace tightbinding model for the relevant bands, and then incorporate shortrange interactions to arrive at, say, a FermiHubbard model. Typically, the orbital degrees of freedom involved in the tightbinding model can be identified from either chemistry insight, or more systematically by studying the projected density of states for the relevant bands, both of which are inapplicable to the current Moiré potential problem. Further understanding on the structure of the wavefunctions is required. Indeed, as is noted in Refs. Trambly de Laissardière et al., 2010; Cao et al., 2016; Fang and Kaxiras, 2016, the local density states for the flat bands are welllocalized to the AA regions of the Moiré pattern, which form a triangular lattice. This theoretical prediction has also been confirmed experimentally Luican et al. (2011); Li et al. (2009); Wong et al. (2015). Based on this observation, it is natural to consider a realspace model starting from effective orbitals centered at the AA sites, which corresponds to a tightbinding model defined on the triangular lattice Cao et al. (2018a, b). In addition, by treating the two valleys separately, one envisions a model with two orbitals localized to each of the triangular sites (i.e., AA regions of the Moiré pattern).
From symmetry representations, however, we can immediately rule out such a model. This can be readily inferred from the computed band structure Lopes dos Santos et al. (2007); Bistritzer and MacDonald (2011); Mele (2011); Fang and Kaxiras (2016) (Fig. 4j): While the two bands are nondegenerate at , as we have explained they form symmetryprotected Dirac points at K and K’. Using such pattern of degeneracies one can infer the possible symmetry representations at these highsymmetry points, and form a realspace analysis Po et al. (2017); Bradlyn et al. (2017); Watanabe et al. (2017) one finds that a triangularlattice model will always lead to the same symmetry representation at all three of the highsymmetry points, i.e., they are either all nondegenerate, or are all Dirac points. This is inconsistent with the observed pattern of degeneracies, which rule out all triangularlattice models.
In fact, the degeneracy pattern described is familiar—it corresponds exactly to the monolayer graphene problem. Symmetrywise, this implies that any tightbinding model must correspond to orbitals forming a honeycomb lattice. One can further check that this is the only possible solution Watanabe et al. (2017). To reconcile with the predicted and observed local density of states Trambly de Laissardière et al. (2010); Fang and Kaxiras (2016); Luican et al. (2011); Li et al. (2009); Wong et al. (2015), however, these orbitals much have nontrivial shapes: although each orbital is centered at a honeycomb site, which corresponds to the AB/BA region of the Moiré pattern, the weight of the orbitals are mainly localized to the AA sites. Therefore, we expect the shape of the orbitals to resemble a (threelobed) fidget spinner (Figs. 3(a,b)).
iii.3 Obstructions to Symmetric Wannier States
Our symmetry analysis suggests that one should model the system by orbitals centered at the AB/BA regions of the Moiré potential, which forms a honeycomb lattice. A minimal tightbinding model of a single valley would then be
(9) 
where is an electron creation operator at centered at a honeycomb site (for a single valley), and connects two th nearest neighbor sites, and given that this describes a single valley which breaks time reversal symmetry, the hoppings are in general complex unless constrained by a spacegroup symmetry.
A pedestrian approach would involve optimizing the parameters to reproduce the energy eigenvalues obtained from the continuum description. Would this be a good starting point for building of a realspace effective model upon which we can incorporate interaction terms? Contrary to usual expectations, such an approach has a serious flaw in capturing certain essential properties. Specifically, while the energy eigenvalues may be well approximated, the topology of the resulting Bloch wavefunctions will necessarily be incorrect. This has important dynamical consequences, relating to the stability of band contacts under different symmetry assumptions, which in turn dictate whether an insulator will result at particular fillings. In particular, we found two symmetry obstructions to deriving a singlevalley tightbinding model. First, we found that the two bands have opposite eigenvalues of , whereas, from a realspace analysis Po et al. (2017); Bradlyn et al. (2017); Watanabe et al. (2017), one can show that the two bands in a tightbinding model must have the same eigenvalue.
There is a second, more serious, obstruction: aside from a quantized Berry phase of for any closed loop encircling a single Dirac point, one can further define a valued winding number Goerbig and Montambaux (2017). In contrast to the conventional case of graphene, the two inequivalent Dirac points in the singlevalley model are known to have the same winding number de Gail et al. (2011); Goerbig and Montambaux (2017); Cao et al. (2018a). As the net winding number of the Dirac points arising in any twoband tightbinding model would necessarily be zero, we can then conclude that there is an obstruction for a realspace description, i.e., there is an obstruction for constructing localized Wannier functions that reproduces just the two bands of interest and preserves valley quantum numbers. A more detailed description of this obstruction, by relating it to the anomalous surface state of a three dimensional topological phase, is contained in the Appendix B. Essentially, this argument invokes three key ingredients: (i) a two band model and (ii) symmetry and (iii) net winding of the Dirac points in the Brillouin zone.
We will return to the question of tight binding models in Section VIII, but for the discussion below, we will work directly in momentum space in the manifold of states spanned by the nearly flat bands.
Iv Intervalley coherent order: phenomenological motivation
We first describe some important clues from experimentsCao et al. (2018a, b) on the nature of both the Mott state and the superconductor. We begin with the observation that  at optimal doping  an inplane magnetic field suppresses the superconductivity when the Zeeman energy scale is of order the zero field . This shows that the superconductor has spinsinglet pairing. Upon hole doping the insulator, quantum oscillations are seen with a frequency set by the density of doped holes in perpendicular fields exceeding . This tells us that the “normal” metallic state and the superconductor that emerges from it should be regarded as doped Mott insulators: the charge carriers that are available to form the normal state Fermi surface or the superconducting condensate are the doped holes. Thus the holedoped superconductor retains information about the Mott insulator. In contrast electron doping this Mott insulator leads very quickly to quantum oscillations with a high frequency that is set by the deviation of charge density from the charge neutrality point (). This may indicate a first order transition between a metal and Mott insulator on the electron doped side. It will be important to search for signs of hysteresis in transport experiments as the gate voltage is tuned. As the superconductor is better developed and characterized on the hole doped side we will restrict attention to hole doping from now on.
A further important clue from the quantum oscillation data is that the degeneracy of the Landau levels (per flux quantum) is 2 instead of 4 as might have been expected from the spin and valley degeneracy. The doped holes have thus lost either their spin or valley (or some combination) quantum numbers. Losing spin makes it hard to reconcile with spin singlet pairing that can be suppressed with a Zeeman field. Thus we propose instead that the valley quantum number is lost. The simplest option^{3}^{3}3A more exotic option to explain the reduced Landau level degeneracy should also be kept in mind. Instead of losing the valley quantum number by symmetry breaking we lose it through fractionalization. For instance the electron could split into a fermion that carries its charge and spin but not the valley quantum number and a charge, spinless boson that carries its valley quantum number. If the boson is gapped while the fermion forms a fermi surface in the doped state we will get the reduced Landau level degeneracy. Of course such fractionalization will come hand in hand with an emergent gauge field. then is that the valley quantum number is frozen due to symmetry breaking, i.e . Here, we may define using the electron operators for the nearly flat band states:
(10) 
where correspond to the valley index, is the spin index, labels the two bands for each valley, and denotes the standard Pauli matrices.
Now a nonzero expectation value for breaks time reversal symmetry. This will lead to a sharp finite temperature phase transition in , and would likely have been detected in the experiments. Given the absence of any evidence of a sharp finite temperature transition we propose that the ordering is in the pseudospin plane. These phenomenological considerations therefore lead us to a IVC ordered state.
We note that for IVC ordering to be useful to explain the quantum oscillations it has to occur at a scale that is large compared to the scales set by the magnetic field. Specifically the band splitting due to IVC ordering must be bigger than the Landau level spacing at the biggest fields used (of order ). This means that the IVC order is much more robust than the superconductivity and occurs at a higher temperature scale. We further need the IVC order to be present already in the Mott insulator so that upon doping it can impact the quantum oscillations.
Thus our view is that the first thing that happens as the sample is cooled from high temperature is IVC ordering. This order then sets the stage for other phenomena to occur at lower temperature (the Mott insulation, or the superconductivity).
V Simple theory of the IVC ordered state
We now describe a mechanism that stabilizes IVC ordering, and describe the properties of the resultant state. Interestingly to treat this stage of the problem it is sufficient to work within a momentum space formulation. This enables us to sidestep the difficulties elaborated in Sec. III.3 with a real space tightbinding formulation.
Consider the nearly flat bands in the limit of strong Coulomb repulsion. Note that the dominant part of the interaction in fully invariant. We expect that the Coulomb interaction prefers an ferromagnetic state  similar to the ferromagnetism favored by Coulomb interaction in the zeroth Landau level in monolayer grapheneNomura and MacDonald (2006); Alicea and Fisher (2006); Young et al. (2012), or in the extensive literature on flat band ferromagnetismTasaki (1998). Indeed the difficulties with Wannier localization of the nearly flat bands also suggest that when Coulomb interactions dominate an ferromagnetic ground state will be favored. The band dispersion however is not symmetric and hence there will be a selection of a particular direction of polarization in the space. To address this we consider the energies of different orientations of the ferromagnet within a simple HartreeFock theory. Specifically we compare a spin polarized state, a pseudospin polarized state, and the IVC state with polarization. Specifically assume a Hamiltonian
(11) 
with
(12) 
Similarly to before, is the valley index, is the spin index, and labels the two bands for each valley. The dispersion is independent of the spin, and due to time reversal . We assume a simple form of interaction:
(13) 
where is the electron density with flavor and spin . Repeated indices are summed over here. This interaction actually has an symmetry but this is strongly broken down to by the difference in dispersion between the two bands, and eventually down to by the antisymmetry of the dispersion under . Each factor corresponds simply to independent charge and spin conservation symmetries of the two valleys. Details of the HartreeFock calculation are in Appendix E. We find that the IVC state has lower energy than both spin and polarized states. The physical reason is that for both the for both the spin and polarized states the order parameter is conserved and hence there is a linear shift of the band when the order parameter is nonzero. In contrast due to the dispersion anisotropy, the IVC order parameter does not commute with the Hamiltonian. IVC order thus does not simply shift the band but modifies it more significantly. Near full polarization in the resulting HartreeFock Hamiltonian this leads to an extra energy gain at second order in the IVC state compared to the spin polarized or polarized states.
Note that in the presence of symmetry the spin singlet IVC state is degenerate with states that have spin triplet IVC ordering with an order parameter . The selection between the singlet and triplet IVC order has to occur due to other terms in the Hamiltonian that have been ignored so far. We will not attempt to pin down the details of this selection in this paper and will simply assume, as suggested by the phenomenology, that the spin singlet IVC is preferred, and discuss its consequences.
Next we turn to a description of the properties of the IVC state. We assume that the order parameter is large and first study its effects on the band structure. In the absence of valley ordering at the two Dirac points there is a fourfold band degeneracy. As explained in Section III.1, The valley ordering splits this fourfold degeneracy into two sets of twofold degenerate Dirac points. When the order parameter is large the four nearly flat bands split into two sets of two bands (Fig. 1f). At quarter filling we fill the bottom most band. This however results not a Mott insulator but in a Dirac semimetal. Thus the IVC state by itself does not lead to a Mott insulator and a further mechanism is needed. We discuss this in the next section. We note that the semimetals obtained from planar valley order versus order are rather different, the latter being similar to spin ordered states. Furthermore, while additionally breaking symmetry alone can eventually gap the Dirac points of the IVC semimetal, the same is not true of spin or ordered semimetals, which need further symmetry breaking due to their Dirac points carrying the same chirality.
Going beyond the mean field, the universal properties of the IVC ordered state are determined by its symmetry breaking. It will have a Goldstone mode with linear dispersion at the longest scales. Further it will have a finite temperature BKT transition which will have weak signatures in standard experimental probes^{4}^{4}4In principle as discussed earlier the symmetry is only approximate and will be weakly broken by 6body interaction terms. This will induce a crossover very close to the transition. .
Vi Intervalley coherent Mott insulators: generalities and a concrete example
We saw that IVC ordering by itself only gives us a Dirac semimetal and not a Mott insulator. We now consider the physics below the IVC ordering scale. First we note that that once symmetry is broken, there is no difficulty with writing down a real space tightbinding model for the two lowest bands. This model lives on the honeycomb lattice and must be supplemented with interactions. The dominant interaction will be the cluster charging energy which penalizes charge fluctuations on each honeycomb plaquette. Thus a suitable model Hamiltonian at scales much smaller than the intervalley coherence scale takes the form
(14)  
(15)  
(16) 
where are sites of the honeycomb lattice, is the center of the honeycomb plaquettes (i.e the triangular Moiré sites) and we have introduced the cluster charge defined through
(17) 
We have also specialized to when his honeycomb lattice is halffilled. For the usual Hubbard model with a strong onsite repulsion the Mott insulating state has the usual 2sublattice Neel order. However when the the cluster charging energy is dominant this is not obviously the case. We will therefore allow ourselves to consider a few different possibilities for the Mott insulator. Naturally, in all these options the charge gap of the insulator will be much lower than the scale of IVC ordering. In the experiments, the charge gap is estimated to be about . The IVC ordering should then occur at a much higher scale, consistent with what we already concluded based on the phenomenology. In this section, to be concrete, we focus on a particular Mott insulator where the rotation symmetry is spontaneously broken while preserving other symmetries (Fig. 2).
vi.1 broken insulator
Breaking the symmetry allows gapping out the Dirac points and leads to an insulator. As the breaking order parameter increases the two Dirac points will move towards each other (Fig. 1e) and eventually annihilate to produce a fully gapped insulator. This annihilation (and correspondingly the gap minimum just into the insulator) will occur either at the or point depending on details of the dispersion. Note that within this picture the breaking also occurs at a scale bigger than the charge gap of the Mott insulator. Clearly the excitations above the charge gap are ordinary electrons, and their gap can be readily closed by a Zeeman field.
Upon doping this insulator charge will enter as ordinary holes and form a small Fermi pocket. This pocket will be centered at either or depending on the location of the minimum insulating gap. In wither choice, due to the absence of symmetry, there will just be a single such Fermi pocket which will accommodate the full density of doped holes. Due to the intervalley ordering these holes will be valley polarized in the direction. Naturally this explains the quantum oscillation experiments  the frequency will be set by the density of doped holes and the Landau level degeneracy (per flux quantum) will only be twofold (from the spin).
A natural pairing mechanism emerges from the coupling of the holes to Goldstone fluctuations of the intervalley order, as we now elaborate. In the presence of an intervalley condensate an appropriate effective action will be
(18)  
(19)  
(20) 
Here is a continuum electron field that represents the electrons in the low energy nearly flat bands, is the phase of the intervalley condensate and is its amplitude. Note that is a matrix for each point^{5}^{5}5Strictly speaking we should allow for all 4 bands per spin and work with a four coponent and a corresponding Hamiltonian. However for the present discussion we will eventually only be interested in the modes in the vicinity of the Fermi surface after the flavor ordering. It is this sufficient to focus on the two lower bands that are split off by the flavor ordering. We therefore focus on just these two right from the start.. We will allow for slow Goldstone fluctuations of the phase and obtain a convenient form of the electronelectron interaction induced by these fluctuations. To that end we first define new fermion variables through
(21) 
This then removes the dependence from but now takes the form
(22)  
(23) 
Here is the contribution to the current from the fermions. It is conveniently written down in momentum space as
(24) 
Now we assume that is near maximum polarization and diagonalize the Hamiltonian obtained from . As discussed earlier there are two sets of bands per spin (corresponding to ) that are well separated from each other. The low energy electrons are those that have valley polarization . We wish to obtain the coupling of these electrons to the fluctuations. For the bands with we write
(25) 
It follows that and similarly . The only nonvanishing coupling therefore is to the contribution from . We get
(26) 
Now we assume we have integrated out the fermions everywhere except in the close vicinity of the Fermi surface. This gives a long wavelength, low frequency effective action for the fluctuations of the form
(27) 
Here is the phase stiffness of the field, and is the velocity of the linear dispersing fluctuations. We now integrate out to get an effective interaction between the electrons:
(28) 
This is an attractive interaction. Anticipating that the important regime for pairing is for an approximate treatment we set in the prefactor to get a simplified effective interaction
(29) 
We emphasize again that  within our approximate treatment  the only contribution to comes from the antisymmetric part of the “normal” state dispersion. This attractive interaction can now be treated within BCS mean field, and will lead to a superconducting state.
Note that in real space since the large repulsion will be on the hexagon center and not on the honeycomb site there is no particular reason to disfavor onsite swave pairing. Though we will not give a detailed description of the pairing symmetry the route to superconductivity sketched above naturally leads to a spin singlet superconductor. . Further, it forms out of a ‘normal’ metal of ordinary holes through a BCSlike pairing mechanism. We expect then that Zeeman fields of order will efficiently suppress the superconductivity except possibly at very low doping (where eventually phase fluctuations will kill ). At low doping, and when one is near a high symmetry point of the Brillouin zone (which is appropriate given there is no additional degeneracy seen in quantum oscillations), the antisymmetric part of the dispersion is expected to be small from symmetry, for example vanishing as the cube of the crystal momentum near the point. This would lead to a small valley current and hence a weakening of the coupling to valley Goldstone modes, as the doping is reduced. However, if C rotation symmetry is broken, a term linear in momentum can arise in the antisymmetric dispersion, leading to an anisotropic and non vanishing valley current at small doping.
Vii Other possible Mott insulating states
The broken insulator is a concrete example of how an intervalley condensate of the twisted bilayer system can eventually become a Mott insulator. However, given the current experimental information, it is not clear that this is uniquely dictated. Therefore we sketch a few different Mott insulating states and present some of their phenomenological consequences.

Translation broken insulator: Broken Moiré translations  for instance Kekule ordering on the effective honeycomb lattice  can also gap out the Dirac points. The properties of this state and its evolution into the doped superconductor will be similar to the broken insulator discussed above.

Antiferromagnetic insulator: This is the familiar Mott insulator of the usual honeycomb Hubbard model. Upon doping it is expected to evolve into a spin singlet superconductor as seen in numerical studies of the honeycomb modelGu et al. (2013). The pairing symmetry appears to be . It will be interesting to look for signatures of broken time reversal symmetry if this scenario is realized. Further this state is known to have quantized spin and thermal Hall effects, and associated gapless edge statesSenthil et al. (1999); Volovik (1997).

Featureless Mott insulators: Given that the honeycomb lattice features two sites in the unit cell, it evades the LiebShultzMattis theorem and allows for a featureless ground state (i.e. a gapped insulator with neither topological order nor symmetry breaking) at half filling Kimchi et al. (2013); Kim et al. (2016); Ware et al. (2015); Metlitski and Thorngren (2017); Jian et al. (2018); Lee et al. (2018). Pictorially, this is viewed as a spin singlet Cooper pair of electrons being localized on orbitals composed of equal weight superpositions of the hexagons of the honeycomb lattice. While model wave functions of this phase have been constructed, the interactions that can drive a system into this phase remain to be understood.

Quantum spin liquids: The simplest possibility is a fully gapped quantum spin liquid. In this case there are neutral spin excitations (spinons) in the insulator. Upon doping a natural possibility is that the charge goes in as bosonic holons (spinless charge quasiparticles) whose condensation leads to superconductivity. This is the classic RVB mechanism Lee et al. (2006) for superconductivity in a doped Mott insulator. However in this scenario, at low doping the superconducting will not have anything to do with the spin gap (measured by the Zeeman scale needed to suppress pairing).
We do not attempt to decide between these different options in this paper. However, we will outline experiments that can distinguish between them.
Viii Tight Binding Model
viii.1 For twisted BG
As we have argued, there is an obstruction for writing down any tightbinding model for the singlevalley problem. A natural way out, therefore, is to instead consider the fourband problem consisting of both valleys. Our earlier argument requiring the orbitals to be centered on the sites of the honeycomb lattice still applies, but now with two orbitals associated with every site. In addition, to resolve the mirroreigenvalue obstruction we described, these two orbitals should transform oppositely under the mirror symmetry. These orbitals, however, cannot have definite valley charge, for otherwise the problem is reduced back to the earlier case with each valley considered separately. Instead, it is natural to demand each of the two orbitals to be a timereversal singlet, which would lead to a standard representation for the symmetry group of together with timereversal. The representation of valley is necessarily involved, due to the aforementioned anomaly, and we will address that later.
Forgetting about valley conservation for the timebeing, the construction of Wannier functions becomes a rather standard problem and wellestablished protocols apply. In particular, we construct welllocalized Wannier functions using the projection method Marzari et al. (2012), starting from a set of welllocalized trial wavefunctions as the “seed” of the Wannier functions (Appendix C). The success of this method relies crucially on having a nonsingular projection everywhere in the BZ, which can be monitored by ensuring that the overlap between the seed and the actual Bloch wavefunctions neither vanish nor diverge anywhere in the BZ Marzari et al. (2012). Using this approach, we construct Wannier functions for a particular choice of parameters, detailed in Appendix C, for the four nearly flat bands near charge neutrality (spin ignored). Our trial wavefunctions attain a minimum and maximum overlap of and respectively, indicating a satisfactory construction. Indeed, the Wannier functions we obtained are quite welllocalized (Figs. 3(a,b)), with approximately of their weight contained within one lattice constant from the Wannier center. In addition, the Wannier functions we constructed are exponentially localized (Figs. 3(c,d)), as anticipated from the nonsingular trial wavefunction projection.
Having constructed the Wannier functions, one can readily extract an effective tightbinding model by first projecting the full Bloch Hamiltonian into the Wannier basis, and then perform Fourier transform. Due to the exponential tail, however, the resulting tightbinding model would have infiniterange hopping despite with an exponentially suppressed amplitude. To capture the salient behavior of the model, it is typically sufficient to only keep the bonds with strength larger than some cutoff . In other words, serves as a control parameter, and one recovers the exact band structure in the limit of , albeit at the cost of admitting infiniterange hoppings.
The obtained band structures for different value of is plotted in Figs. 4(a–d). We found that a fairly longrange model (see Appendix D for parameters), keeping terms that connect sites up to lattice constants apart, is needed to capture all the salient features of the energetics. It should be noted that the range of the approximate models generally depends on the localization of the Wannier function, and in this work we have not optimized the Wannier functions. It is therefore possible that, by further optimization, one may capture the energetics more faithfully using only shorterrange terms.
Although spatial and timereversal symmetries are respected in the tightbinding model, valley conservation is explicitly broken. This is because our Wannier functions cannot be chosen to represent valley conservation naturally, similar to the case of topological insulators Soluyanov and Vanderbilt (2011), and therefore any truncation of the transformed Hamiltonian will generically introduce explicit valleyconservation symmetry breaking. Furthermore, one can ask how the operator is represented. In particular, we would want to construct the projection operators for the single particle problem which project into the valleys. However, we found that there are obstructions, which mirror exactly the obstructions we faced when attempting to construct Wannier functions for the singlevalley twoband model of the nearly flat bands, i.e., a mismatch in the mirror eigenvalues, as well as a nonzero net charge of the Dirac points. Such inheritance of the obstructions is presumably a manifestation of the underlying anomaly of the singlevalley description.
Towards recovering valley conservation: It is desirable to restore valley conservation even approximately, for our truncated model, and we will describe a method below.Recall we denote the Hermitian valley charge operator in the continuum effective theory by (Eq. (6)). Projecting into the four band subspace of the nearly flat bands, and rotating into the Wannier function basis, we arrive at another Hermitian operator defined on the honeycomb sites, which we can simply interpret as yet another Hamiltonianlike object in our problem. Similar to the earlier discussion for the Hamiltonian, the effective operator, , will have infiniterange hopping with exponentially suppressed amplitude, and it is natural to approximate it by truncation, keeping again only terms with a strength larger than some . Such truncation, however, introduces deviation in the eigenvalues of from the physical values of (Fig. 4e). To fix this, therefore, we can further perform spectral flattening of the corresponding bands to in momentum space. This procedure is well defined so long as a band gap is sustained between the second and third bands of , which is generically true as long as is chosen to be reasonably small. We will denote this flattened version by . Physically, this corresponds to an approximation of the actual representation of the valley charge operator in our tightbinding model, and again our approximation can be made exact in the limit of .
We can now restore valley conservation in our effective Hamiltonian. To this end, for define the projection operator
(30) 
which projects into the sector with eigenvalue (in the manybody Hilbert space) and satisfies . We can now define
(31) 
for which valley conservation is restored. In essence, through this procedure we have introduced a pair of Hermitian operators , corresponding respectively to the Hamiltonian and the valley charge, that converge to the exact operators in the limit .
The valley projection procedure we described applies equally well to an interacting Hamiltonian obtained by projecting the microscopic interaction terms to our Wannier basis, which again would not be automatically valleyconserving due to the truncation errors. For the freepart of the Hamiltonian, however, the projection procedure can be greatly simplified. This is because the Bloch states of , which equal to those of by definition, are known, and using which we can decompose into the valleyconserving and valleybreaking parts. The projection then proceeds simply by retaining only the valleyconserving part. More concretely, write the Bloch “Hamiltonian” of the valley charge operator as , where are matrices. Note that the columns of are simply the valleycharge eigenstates. We can then perform the projection by
(32) 
giving an easy way to perform the described projection.
As is shown in Figs. 4(f–i), the projected effective tightbinding model reexhibits all the symmetry features of the bands from the continuum theory (Fig. 4j), for any choice of truncation parameter . In particular, Figs. 4(a,f) represent the simplest model which demonstrates the utility of our approach, with the valley projection alone converting an otherwise hoppingfree Hamiltonian into one exhibiting the chargeneutrality Dirac points. We further remark that, although generically does not respect , the effective Hamiltonian corresponding to Fig. 4b comes very close to being symmetric in terms of its the energetics along the highsymmetry line. We provide an explicit tabulation of the bonds in this and that of in Appendix D.
viii.2 Nearly flat bands in trilayer grapheneBoron Nitride Moiré superlattices
Recently, Mott insulating phases (but not superconductivity, at the time of writing) were observed Chen et al. (2018) in a heterostructure of ABC trilayer graphene encapsulated in boron nitride (TLG/hBN), where a Moiré superlattice is present even at zero twisting between the graphene layers. Four mini bands are observed close to neutrality, whose bandwidth and separation can be tuned by a vertical electric field. Halffilling one of the nearly flat bands results in a Mott insulator.
We remark that the symmetry setup for this trilayer heterostructure bears more resemblance to the Bernalstacked bilayer graphene than the TBG system we discussed above. In particular, the absence of Dirac points among the mini bands suggest that no symmetry is present, and the system is potentially described by a wallpaper group for which the two sublattices of the honeycomb lattice are nolonger symmetry related (say the wallpaper group , No.14 Hahn (2006)). If that is indeed the case, one expects the valleyresolved band structure to admit a tightbinding model defined on the triangular lattice, although it remains to be checked whether or not the charge density profile exhibits any nontrivial features, akin to that found for the TBG system (Fig. 3c). It will be of great interest to derive a concrete realspace effective model for the TLG/hBN, but we will leave this as a future work.
Ix Model for correlated states in trilayer graphene heterostructure
In this section we briefly consider the case of triangular Moiré superlattices in trilayer graphene heterostructure. Correlated insulating states were observed very recently in this system Chen et al. (2018). Just like in the twisted bilayer here too there are nearly flat bands that are separated from the rest of the spectrum. However unlike the TBG, here there are no Dirac crossings in this nearly flat band, and the low energy degrees of freedom are in the trivial representation of . It is thus reasonable that the nearly flat band can be modeled in real space by a triangular lattice model with two orbitals (corresponding to the two valleys) per site supplemented with interactions. However some care is still necessary. Time reversal and both act by flipping the valley index. Thus the band dispersion within a single valley is not symmetric under :
(33) 
though
(34) 
is satisfied. A real space tightbinding description on the triangular lattice therefore takes the form
(35) 
with real and positive. The phases are in general nonzero. The phase even on nearest neighbor bonds cannot be removed as in general the symmetries permit a nonzero flux for any single valley through an elementary triangle (and the opposite flux for the other valley).
When the Coulomb interaction dominates the ferromagnet with a further selection of IVC order is once again a possibility. From a real space point of view the projection of the Coulomb interaction to the Wannier basis used to formulate the tightbinding model will lead to an appropriate interaction Hamiltonian. If the Wannier orbitals are not tightly confined to each triangular site then there will be significant intersite ferromagnetic Hund’s exchange which will promote ferromagnetism with a further selection of IVC ordering.
It is interesting to consider the limit where the Wannier functions are sufficiently tightly localized that such a ferromagnetic intersite exchange is weak and can be ignored. In that limit to obtain a minimal model for this system we restrict the hopping to just be nearest neighbor and include an onsite repulsion. Further we assume the presence of a good rotation symmetry. Given the microscopic realization the will exchange the two valleys. Then the flux for a single valley will alternate in sign between two neighboring triangles related by a rotation. The minimal model then takes the form
(36) 
with with sign for upfacing triangles and sign for downfacing ones. Here the sites are assumed to be arranged counterclockwise on each triangle. As in previous sections this Hamiltonian has a symmetry corresponding to independent rotations of each valley in addition to the discrete symmetries described above. The model thus needs to be supplemented with further weaker interactions that break the continuous symmetry down to though we will not specify them here. Note that if the flux then the model actually has an even higher symmetry.
The minimal model above allows for discussion of the Mott insulator in the strongly correlated regime of large at integer . In the experiments Mott insulators at fillings have been reported. In the large limit, the effective model takes the form of a “spinorbital” Hamiltonian on a triangular lattice that has 4 states per site: 2 spins and 2 valleys. A systematic expansion is readily performed to yield this spinorbital Hamiltonian. At the “superexchange” is not sensitive to the flux , and we end up with an quantum antiferromagnet on the triangular lattice. For the spins are in the fundamental representation while for they are in the dimensional representation.
Antiferromagnetic models of spins have been studied on a variety of lattices with different motivations (for some representative recent papers see Refs. Corboz et al., 2012; Kaul, 2015; Yamada et al., 2017). It seems likely that they go into “paramagnetic” states that preserve symmetry. However a new feature in the present problem is the presence of the flux in the underlying Hubbard model which breaks to . This modifies the spinorbital model at . In the experiments the ratio of Coulomb interactions to the bandwidth of the nearly flat bands may be controlled by a perpendicular electric field, and it may be thus be possible to tune the strength of these third order terms relative to the second order ones. In Appendix F we derive the spinorbital Hamiltonian to third order showing how the flux leads to new terms not present in an invariant model. We however leave for the future a detailed study of these interesting spinorbital models.
At any rate we emphasize that this trilayer system is thus qualitatively different from the twisted bilayer graphene where we argued that a real space triangular lattice description is not possible due to Dirac crossings within the nearly flat bands.
X Proposed future experiments
As discussed in previous sections, the ideas presented in this paper suggest a number of experiments which will be extremely useful in revealing the physics. Here we reiterate and elaborate on some of these suggestions.
A crucial clue from the existing experiments is that an inplane field suppresses the superconductivity  at optimal doping  when the Zeeman energy is of order the zero field . This indicates spin singlet pairing and that at optimal doping is associated with the loss of pairing. It will be extremely useful to study this systematically as a function of doping. For the doped broken insulator, the superconductivity may be driven by pairing of a small Fermi surface of electrons. Then (except perhaps at very small doping) and the critical Zeeman scale will continue to track each other as the doping is decreased. In contrast, if the pairing (in the form of singlet valence bond formation) already happens in the Mott insulator  as in the usual RVB theory, or with the featureless Mott insulator, then with decreasing doping and the critical Zeeman field should part ways significantly.
A second crucial clue from the experiments is the degeneracy pattern of the Landau fan emanating from the Mott insulator. We proposed that this was due to the freezing of the valley degree of freedom. This can be distinguished from the alternate possibility that there is spin freezing by studying the quantum oscillations in a tilted field. Zeeman splitting, if it exists, should show up in a characteristic way as a function of tilt angle.
Our proposal is the intervalley phase coherence at a scale higher than both the superconducting and the Mott insulating scale . The valley symmetry is as usual related to translational symmetry of the microscopic graphene lattice. In the twisted bilayer there is an approximate translation symmetry that holds at some short scale associated with translation by one unit cel of the microscopic graphene lattices. Under this approximate translation operation electrons at the valleys get different phases. This is a rotation. Therefore intervalley ordering will strongly break this approximate short translation symmetry. Within each Moiré site the density of states will be uniform at the lattice scale when there is no intervalley ordering but will oscillate once this order sets in. This may be detectable through Scanning Tunneling Microscopy (though if the bilayer graphene is fully encapsulated by Boron Nitride it may be challenging to see the graphene layer).
Assuming there is intervalley ordering, if the undoped Mott insulator develops antiferromagnetic order, it appears likely that the doped superconductor will be a spin singlet superconductor. This spontaneously breaks time reversal symmetry. In contrast for a doped broken state, either wave or spin singlet superconductivity seem possible. It will also be useful to directly search for broken or Moiré translational symmetry in the experiments. Finally the very different behavior in quantum oscillations between electron and hole doping away from the Mott insulator suggests that there may be a first order transition into the Mott state as it is approached from the charge neutrality point. This will lead to hysteretic response as the gate voltage is tuned towards charge neutrality from the Mott insulator.
Xi Conclusion
In this paper we addressed some of the theoretical challenges posed by the remarkable observations of Mott insulating states and proximate superconductivity in twisted bilayer graphene.
We proposed that both the Mott insulator and the superconductor develop out of a state with spontaneous intervalley coherence that breaks independent conservation of electrons at the two valleys. We described a mechanism for the selection of this order over other spin/valley polarized states owing to the peculiarities of the symmetry realization in the band structure. We showed that intervalley ordering by itself does not lead to a Mott insulator, and described possible routes through which a Mott insulator can develop at low temperature. A specific concrete example is a broken insulator. We showed how doping such an insulator leads to an understanding of the quantum oscillation data, and presented a possible pairing mechanism for the development of superconductivity. We described potentially useful experiments to distinguish the various possible routes to a Mott insulator from an intervalley coherent state.
Our work was rooted in a microscopic understanding of the twisted graphene bilayer. We showed that the momentum space structure of the nearly flat bands places strong constraints on real space descriptions. In particular, contrary to natural expectations, we showed that a real space lattice model is necessarily different from a correlated triangular lattice model with two orbitals (corresponding to the two valleys) per site. This is due to a symmetry enforced obstruction to constructing Wannier functions centered at the triangular sites that can capture the Dirac crossings of the nearly flat bands. We showed that a honeycomb lattice representation may be possible but requires a nonlocal implementation of valley symmetry. In our description of the intervalley ordered state and its subsequent low temperature evolution into the Mott/superconducting states, we sidestepped these difficulties by first treating the problem directly in momentum space and defining a real space model only at scales below the intervalley ordering (when the obstruction to a honeycomb representation is gone). We also contrasted the bilayer system with trilayer graphene where Mott insulators have recently been observed. In the trilayer system, it is reasonable to construct a real space triangular lattice twoorbital model but the symmetries allow for complex hopping (with some restrictions).
After completion of this work Ref. Xu and Balents, 2018 appeared, with significant differences from the present paper.
Xii Acknowledgements
We thank Yuan Cao, Valla Fatemi, and Pablo JarilloHerrero for extensive discussions of their data, and sharing their insights. We also thank Shiang Fang, Gene Mele, Feng Wang and Patrick Lee for interesting discussions or correspondence. TS is supported by a US Department of Energy grant DESC0008739, and in part by a Simons Investigator award from the Simons Foundation. AV was supported by a Simons Investigator award and by NSFDMR 1411343.
References
 Lee et al. (2006) P. A. Lee, N. Nagaosa, and X.G. Wen, Rev. Mod. Phys. 78, 17 (2006).
 Powell and McKenzie (2011) B. J. Powell and R. H. McKenzie, Reports on Progress in Physics 74, 056501 (2011).
 Iwasa and Takenobu (2003) Y. Iwasa and T. Takenobu, Journal of Physics: Condensed Matter 15, R495 (2003).
 Song et al. (2016) Y. Song, Z. Yamani, C. Cao, Y. Li, C. Zhang, J. S. Chen, Q. Huang, H. Wu, J. Tao, Y. Zhu, W. Tian, S. Chi, H. Cao, Y.B. Huang, M. Dantz, T. Schmitt, R. Yu, A. H. Nevidomskyy, E. Morosan, Q. Si, and P. Dai, Nature Communications 7, 13879 EP (2016).
 Cao et al. (2018a) Y. Cao, V. Fatemi, A. Demir, S. Fang, S. L. Tomarken, J. Y. Luo, J. D. SanchezYamagishi, K. Watanabe, T. Taniguchi, E. Kaxiras, R. C. Ashoori, and P. JarilloHerrero, Nature , EP (2018a).
 Cao et al. (2018b) Y. Cao, V. Fatemi, S. Fang, K. Watanabe, T. Taniguchi, E. Kaxiras, and P. JarilloHerrero, Nature , EP (2018b).
 Lopes dos Santos et al. (2007) J. M. B. Lopes dos Santos, N. M. R. Peres, and A. H. Castro Neto, Phys. Rev. Lett. 99, 256802 (2007).
 Trambly de Laissardière et al. (2010) G. Trambly de Laissardière, D. Mayou, and L. Magaud, Nano Letters, Nano Letters 10, 804 (2010).
 Bistritzer and MacDonald (2011) R. Bistritzer and A. H. MacDonald, Proceedings of the National Academy of Sciences 108, 12233 (2011).
 Wong et al. (2015) D. Wong, Y. Wang, J. Jung, S. Pezzini, A. M. DaSilva, H.Z. Tsai, H. S. Jung, R. Khajeh, Y. Kim, J. Lee, S. Kahn, S. Tollabimazraehno, H. Rasool, K. Watanabe, T. Taniguchi, A. Zettl, S. Adam, A. H. MacDonald, and M. F. Crommie, Phys. Rev. B 92, 155409 (2015).
 Kim et al. (2017) K. Kim, A. DaSilva, S. Huang, B. Fallahazad, S. Larentis, T. Taniguchi, K. Watanabe, B. J. LeRoy, A. H. MacDonald, and E. Tutuc, Proceedings of the National Academy of Sciences 114, 3364 (2017).
 Huang et al. (2018) S. Huang, K. Kim, D. K. Efimkin, T. Lovorn, T. Taniguchi, K. Watanabe, A. H. MacDonald, E. Tutuc, and B. J. LeRoy, ArXiv eprints (2018), arXiv:1802.02999 .
 Rickhaus et al. (2018) P. Rickhaus, J. Wallbank, S. Slizovskiy, R. Pisoni, H. Overweg, Y. Lee, M. Eich, M.H. Liu, K. Watanabe, T. Taniguchi, V. Fal’ko, T. Ihn, and K. Ensslin, ArXiv eprints (2018), arXiv:1802.07317 .
 Chen et al. (2018) G. Chen, L. Jiang, S. Wu, B. Lv, H. Li, K. Watanabe, T. Taniguchi, Z. Shi, Y. Zhang, and F. Wang, ArXiv eprints (2018), arXiv:1803.01985 .
 Mele (2011) E. J. Mele, Phys. Rev. B 84, 235439 (2011).
 Fang and Kaxiras (2016) S. Fang and E. Kaxiras, Phys. Rev. B 93, 235153 (2016).
 Hahn (2006) T. Hahn, ed., International Tables for Crystallography, 5th ed., Vol. A: Spacegroup symmetry (Springer, 2006).
 Litvin (2013) D. B. Litvin, Magnetic Group Tables: 1, 2 and 3Dimensional Magnetic Subperiodic Groups and Magnetic Space Groups (International Union of Crystallography, 2013).
 Kim et al. (2015) Y. Kim, B. J. Wieder, C. L. Kane, and A. M. Rappe, Phys. Rev. Lett. 115, 036806 (2015).
 Goerbig and Montambaux (2017) M. Goerbig and G. Montambaux, “Dirac fermions in condensed matter and beyond,” in Dirac Matter, edited by B. Duplantier, V. Rivasseau, and J.N. Fuchs (Springer International Publishing, Cham, 2017) pp. 25–53.
 Cao et al. (2016) Y. Cao, J. Y. Luo, V. Fatemi, S. Fang, J. D. SanchezYamagishi, K. Watanabe, T. Taniguchi, E. Kaxiras, and P. JarilloHerrero, Phys. Rev. Lett. 117, 116804 (2016).
 Luican et al. (2011) A. Luican, G. Li, A. Reina, J. Kong, R. R. Nair, K. S. Novoselov, A. K. Geim, and E. Y. Andrei, Phys. Rev. Lett. 106, 126802 (2011).
 Li et al. (2009) G. Li, A. Luican, J. M. B. Lopes dos Santos, A. H. Castro Neto, A. Reina, J. Kong, and E. Y. Andrei, Nature Physics 6, 109 EP (2009).
 Po et al. (2017) H. C. Po, A. Vishwanath, and H. Watanabe, Nature Communications 8, 50 (2017).
 Bradlyn et al. (2017) B. Bradlyn, L. Elcoro, J. Cano, M. G. Vergniory, Z. Wang, C. Felser, M. I. Aroyo, and B. A. Bernevig, Nature 547, 298 EP (2017), article.
 Watanabe et al. (2017) H. Watanabe, H. C. Po, and A. Vishwanath, ArXiv eprints (2017), arXiv:1707.01903 .
 de Gail et al. (2011) R. de Gail, M. O. Goerbig, F. Guinea, G. Montambaux, and A. H. Castro Neto, Phys. Rev. B 84, 045436 (2011).
 Nomura and MacDonald (2006) K. Nomura and A. H. MacDonald, Phys. Rev. Lett. 96, 256602 (2006).
 Alicea and Fisher (2006) J. Alicea and M. P. A. Fisher, Phys. Rev. B 74, 075422 (2006).
 Young et al. (2012) A. F. Young, C. R. Dean, L. Wang, H. Ren, P. CaddenZimansky, K. Watanabe, T. Taniguchi, J. Hone, K. L. Shepard, and P. Kim, Nature Physics 8, 550 EP (2012).
 Tasaki (1998) H. Tasaki, Progress of Theoretical Physics 99, 489 (1998).
 Gu et al. (2013) Z.C. Gu, H.C. Jiang, D. N. Sheng, H. Yao, L. Balents, and X.G. Wen, Phys. Rev. B 88, 155112 (2013).
 Senthil et al. (1999) T. Senthil, J. B. Marston, and M. P. A. Fisher, Phys. Rev. B 60, 4245 (1999).
 Volovik (1997) G. E. Volovik, Soviet Journal of Experimental and Theoretical Physics Letters 66, 522 (1997), condmat/9709084 .
 Kimchi et al. (2013) I. Kimchi, S. A. Parameswaran, A. M. Turner, F. Wang, and A. Vishwanath, Proc. Natl. Acad. Sci. 110, 16378 (2013).
 Kim et al. (2016) P. Kim, H. Lee, S. Jiang, B. Ware, C.M. Jian, M. Zaletel, J. H. Han, and Y. Ran, Phys. Rev. B 94, 064432 (2016).
 Ware et al. (2015) B. Ware, I. Kimchi, S. A. Parameswaran, and B. Bauer, Phys. Rev. B 92, 195105 (2015).
 Metlitski and Thorngren (2017) M. A. Metlitski and R. Thorngren, (2017), arXiv:1707.07686 .
 Jian et al. (2018) C.M. Jian, Z. Bi, and C. Xu, Phys. Rev. B 97, 054412 (2018).
 Lee et al. (2018) J. Y. Lee, A. Turner, and A. Vishwanath, ArXiv eprints (2018), arXiv:1802.02155 .
 Marzari et al. (2012) N. Marzari, A. A. Mostofi, J. R. Yates, I. Souza, and D. Vanderbilt, Rev. Mod. Phys. 84, 1419 (2012).
 Soluyanov and Vanderbilt (2011) A. A. Soluyanov and D. Vanderbilt, Phys. Rev. B 83, 035108 (2011).
 Corboz et al. (2012) P. Corboz, M. Lajkó, A. M. Läuchli, K. Penc, and F. Mila, Phys. Rev. X 2, 041013 (2012).
 Kaul (2015) R. K. Kaul, Phys. Rev. Lett. 115, 157202 (2015).
 Yamada et al. (2017) M. G. Yamada, M. Oshikawa, and G. Jackeli, ArXiv eprints (2017), arXiv:1709.05252 [condmat.strel] .
 Xu and Balents (2018) C. Xu and L. Balents, ArXiv eprints (2018), arXiv:1803.08057 .
Appendix A Lattice and symmetries
In this appendix, we document some details on the conventions and the symmetry transformations.
Consider a monolayer of graphene. We let the primitive lattice vectors and reciprocal lattice vectors be
(37) 
where is the lattice constant (some authors use to denote the CC bond length, which is a factor of smaller than the lattice constant we are using here). In this choice, we can choose the basis of the honeycomb lattice sites to be
(38) 
In momentum space, the K, K’ points are given by , or, for the equivalent ones lying on the axis, . Note that , as is well known. Furthermore, we take the Dirac speed to be ms. Besides, we choose the Moiré lattice vectors to be
(39) 
In the main text, we have listed all the symmetries of the continuum theory (Table 1). Here, we tabulate explicitly the symmetry transformations of the electron operators, which follow from that of the Dirac points in the monolayer problem.
(40) 
where . Note that is the only symmetry which flips the two layers, i.e., and vice versa.
The symmetries listed in Eq. (40) generate all the spatial symmetries of the continuum theory of the TBG Lopes dos Santos et al. (2007); Bistritzer and MacDonald (2011); Mele (2011) (in wallpaper group 17). In particular, we see that and preserves the valley index (K vs. K) whereas and do not. However, their (pairwise) nontrivial products will leave valley invariant, and it is helpful to also document their symmetry action explicitly (which are fixed by the above):
(41) 
Here, we make two remarks regarding the subtleties in the symmetry representation documented here: first, the momentum appearing above are defined as the deviation from the original Dirac points in the monolayer problem. Generally, they correspond to different momenta in the Moiré BZ. For instance, the Dirac point labeled by , i.e., that of the point in the top layer, is mapped to , whereas is mapped to . Similarly, and are respectively mapped to and . If desired, one can also rewrite Eqs. (40) and (41) using a common set of momentum coordinates defined with respect to the origin of the Moiré BZ.
Second, the representation of the translation symmetry, , has a subtlety in its definition. This is because the microscopic translation effectively becomes an internal symmetry for the slowly varying fields appearing in the continuum theory. As such, for a single layer one can only deduce its representation up to an undetermined phase, and hence the appearance of in Eq. (40). However, the relative momentum across the different slowly varying fields, say the operators corresponding to the valley of the top and bottom layers, is a physical quantity. Consequentially, there is really only one common ambiguity across all the degrees of freedom appearing in the continuum theory.
Appendix B Valley Symmetry and Wannier Obstruction
We argue here that the valley symmetry resolved band structure does not admit a Wannier representation. Note, since we will ignore spin, this is a two band model which will be crucial for what follows. If one were to include other bands the arguments below would fail, although precisely what selection of bands would lead to localized Wannier functions (LWFs) remains to be determined. In some ways, it is not very surprising that a valley resolved band structure does not admit a Wannier description, a simple example is a single valley of monolayer graphene, which is just an isolated Dirac node. But in those cases the band structure does not terminate on raising the energy, and hence does not form an band. In contrast, in our present problem for TBG there is an isolated band, and so one may expect to capture the physics with LWFs. Nonetheless, we will argue there is an obstruction, as can be seen as follows.
We begin with three ingredients: (i) a two band model; (ii) symmetry; and a third ingredient which will be specified shortly. The two ingredients above enforce the following form on the momentum space Hamiltonian:
(42) 
where there is no condition on the function . Similar to the main text, we implement by , where denotes complex conjugation. Check that leaves the Hamiltonian above invariant. Now, if we are interested in the band wavefunctions, they are independent of the first term in the hamiltonian, and we could pass to the following one by imposing a constraint:
(43) 
The obvious constraint is to demand:
(44) 
This is nothing but the chiral condition that specifies class AIII. Now, we introduce the third ingredient: (iii) the two Dirac points at the middle of this band structure have the same chirality. This allows us to write down the following effective Hamiltonian close to neutrality:
(45) 
Where we now have a four component structure to include the two Dirac nodes. Note, there is no mass term that will gap out these nodes and also preserve the chiral condition 44, hence this corresponds to the surface of a three dimensional topological phase in class AIII, with index , corresponding to the the two Dirac nodes. Since this is the surface state of a nontrivial 3D topological phase, it does not admit a Wannier representation. However, when combined with the opposite valley band structure, together the pair of band structures does admit LWFs, but at the price of losing valley conservation symmetry.
Finally let us address a conundrum that the careful reader may be puzzled by. The valley resolved bands are stated to be the anomalous surface states of a 3D topological phase, nevertheless they appear as isolated bands which seems to contradict the usual expectation that such anomalous bands cannot be separated in energy. The way this is resolved in the present case is through the two band condition, which further allows us to map the problem to one with particle hole symmetry (class AIII). The later problem can have anomalous surface states that are disconnected from the bulk bands because they are forced to stick at zero chemical potential, and hence cannot be deformed into the bulk. This mapping to class AIII only holds for the two band model, hence if we add bands or fold the Brillouin zone from translation symmetry breaking, the presented arguments no longer hold.
Appendix C Wannier functions
We will construct Wannier functions using the projection method Marzari et al. (2012). The method proceeds by first specifying a collection of welllocalized, symmetric trial wavefunctions in real space, which serves as the seed for constructing a smooth gauge required in obtaining welllocalized Wannier functions for the problem of interest.
Let us begin by considering the symmetry properties of a realspace wavefunction in our present problem. For simplicity, we let be a collective index for the valleys and layers, i.e., . Define the realspace electron operators
(46) 
where we will not keep track of the overall normalization of the operator. Here, is defined as the momentum measured with respect to the origin of the Moiré BZ. Note that inherits symmetry transformation from that of .
To construct a collection of welllocalized, symmetric trial wavefunctions, one can follow the standard discussion concerning the symmetry representation associated with such realspace basis, say as reviewed in the supplemental materials of Ref. Po et al., 2017. We will briefly sketch the main ideas below. Let be a twocomponent (column) vector localized to (the twocomponents here originate from the sublattice degree of freedom in the microscopic problem). Define
(47) 
and its associated momentumspace operator