Subduction to the Lower Mantle – a Comparison between Geodynamic and Tomographic Models

It is generally believed that subduction of litho-spheric slabs is a major contribution to thermal heterogene-ity in Earth's entire mantle and provides a main driving force for mantle flow. Mantle structure can, on the one hand, be inferred from plate tectonic models of subduction history and geodynamic models of mantle flow. On the other hand, seismic tomography models provide important information on mantle heterogeneity. Yet, the two kinds of models are only similar on the largest (1000 s of km) scales and are quite different in their detailed structure. Here, we provide a quantitative assessment how good a fit can be currently achieved with a simple viscous flow geodynamic model. The discrepancy between geodynamic and tomography models can indicate where further model refinement could possibly yield an improved fit. Our geodynamical model is based on 300 Myr of subduction history inferred from a global plate reconstruction. Density anomalies are inserted into the upper mantle beneath subduction zones, and flow and advection of these anomalies is calculated with a spherical harmonic code for a radial viscosity structure constrained by mineral physics and surface observations. Model viscosities in the upper mantle beneath the lithosphere are ∼10 20 Pas, and viscosity increases to ∼10 23 Pas in the lower mantle above D ". Comparison with tomography models is assessed in terms of correlation , both overall and as a function of depth and spherical harmonic degree. We find that, compared to previous geody-namic and tomography models, correlation is improved, presumably because of advances in both plate reconstructions and mantle flow computations. However, high correlation is still limited to lowest spherical harmonic degrees. An important ingredient to achieve high correlation – in particular at spherical harmonic degree two – is a basal chemical layer. Subduction shapes this layer into two rather stable hot but chemically dense " piles " , corresponding to the Pacific and African Large Low Shear Velocity Provinces. Visual comparison along cross sections indicates that sinking speeds in the geodynamic model are somewhat too fast, and should be 2 ± 0.8 cm yr −1 to achieve a better fit.


Introduction
At convergent plate margins, slabs of subducted lithosphere start their journey toward the Earth's interior, and seismic tomography is arguably the best tool to track their sinking. Based on such seismic models, the opinion of most scientists is currently that most slabs eventually sink to the base of the mantle (Grand et al., 1997;van der Voo et al., 1999), where they accumulate. However, slab sinking trajectories are complicated through their interaction with phase transitions, particularly the spinel-perovskite transition at 660 km depth, where some slabs may lay flat for a while before they sink further (e.g. Fukao et al., 2001). While long-debated, the interactions of slabs with the transition zone, and the degree of mass transport, are still unclear.
Published by Copernicus Publications on behalf of the European Geosciences Union.

B. Steinberger et al.: Subduction to the lower mantle -comparison between models
By comparing slab locations predicted from geodynamic models based on subduction history, both plate reconstructions (e.g. Bunge and Grand, 2000;Hafkenscheid et al., 2006;Zhang et al., 2010;Shepard et al., 2012) and geodynamic model parameters, such as slab sinking rates and mantle viscosity (e.g. Ricard et al., 1993;Lithgow-Bertelloni and Richards, 1998;Liu et al., 2008;van der Meer et al., 2010;Čížková et al., 2012), can be constrained. Towards that goal, we present here a simple geodynamic model of mantle density based on subduction history, and compare it to seismic tomography. We both visually compare along cross sections and compute formal correlations (cf. Ray and Anderson, 1994;Becker and Boschi, 2002). Our work is essentially an update of Steinberger (2000), which we believe is appropriate now, as both models of seismic tomography and subduction history have changed since then. We also compare our results with those of a simple slab sinker approach (Ricard et al., 1993;Lithgow-Bertelloni and Richards, 1998), as well as an updated slab sinker approach based on our own subduction history model and the sinking rate 1.2 cm yr −1 inferred from van der Meer et al. (2010), in order to assess whether the geodynamic model in fact leads to an improvement.
Our geodynamic model also includes a thermochemical layer in the lowermost mantle, which is shaped by subduction into piles (e.g. McNamara and Zhong, 2005;Garnero and McNamara, 2008;Zhang et al., 2010). A similar global comparison has recently also been performed by Shephard et al. (2012). However, they use a different mantle convection code and mantle viscosity structure; their mantle model does not include thermochemical piles; they use a somewhat different subduction input model and they also compare their results to a different tomography model.

Geodynamic models
The geodynamic models are very similar to Steinberger and Torsvik (2012) -mostly, unless noted otherwise, the thermochemical model, as shown in their Fig. 5 -where the emphasis was on the creation of plumes, whereas here we focus on subduction. In the following, we give a brief model description based on that paper.
To compute the changing mantle density distribution and flow through time, we solve equations that represent -with some simplification -conservation of mass, momentum and energy, as well as the advection of compositional heterogeneities. We use an index formulation where subscript i stands for spatial component i, subscript , i for the derivative in the direction of i, subscript , rr for the second derivative in radial direction, and subscript , t for time derivative. Conservation of mass is then written as (ρu i ) ,i = 0 (1)  Steinberger and Calderwood, 2006) and thermal expansivity ((B); lower mantle profile from Steinberger and Calderwood, 2006, extended to the upper mantle). The shape of the viscosity profile in each layer (upper mantle, transition zone, lower mantle) has been inferred from mineral physics, and absolute viscosity values in each layer are adjusted to optimize fit to geoid and global heat flux (for a flow model based on seismic tomography) and satisfy the "Haskell" constraint from postglacial rebound (Mitrovica, 1996).
where u is velocity and ρ is the depth-dependent reference density, i.e. we consider compressibility. The conservation of momentum equation is formulated as −δp, i +(η(u i,j + u j,i − 2/3u k,k δ ij )) ,i − δρgδ ir − ρδg i = 0 (2) whereby δp is pressure anomaly, η is viscosity (only radial variations; Fig. 1a), δ ij is the Kronecker delta tensor, δρ is density anomaly, g is the reference gravity and δg is gravity anomaly, i.e. we consider terms for pressure gradient, viscous stresses and buoyancy. δρ is the sum of a compositional part δρ c and thermal part δρ th = δT αρ that depends on temperature anomaly δT and thermal expansivity α, which decreases with depth ( Fig. 1b). Consequently, density anomalies corresponding to subducted slabs become considerably smaller as they sink through the lower mantle. Free slip boundary conditions are used at top and bottom, but with a highviscosity lithosphere on top. Mass and momentum equations are solved with a spherical harmonic approach initially developed by O'Connell (1979, 1981) with lateral expansion to degree and order 127, and on 78 radial layers spaced at a distance from 20 km (near the base of the mantle) to 50 km (near its top). We assume a constant heat capacity c p = 1250 J kg −1 K −1 , and only consider radial conduction of heat, which is most important in the lower mantle thermal boundary layer above the core. With these simplifications, we write the conservation of energy equation as δT ,t + u i δT , i = κδT ,rr + H /(ρc p ) + S. ( We use a thermal diffusivity κ = 0.95 × 10 −6 m 2 s −1 = 50 km 2 /100 Myr. Hence, during time t = 100 Myr (the approximate sinking time of slabs in our model), heat diffuses over a distance of about 2(κt) 1/2 = 100 km, which is less than 2 degrees of arc just above the CMB. This is rather small compared to the width of anomalies as seen in our figures, and we therefore think that the approximation of only including radial diffusion is justified. Internal heating rate is H/ρ = 1.42 × 10 −12 W kg −1 , corresponding to 6 TW in the whole mantle. The "source" term S, which is an "ad hoc" modification of the energy equation, represents the explicit adding of thermal anomalies corresponding to subducted slabs, beneath time-dependent subduction zone locations (Fig. 2), including the last 300 Myr of subduction history in the TPW corrected, global hybrid reference frame Torsvik et al., 2008) corrected in longitude according to van der Meer et al. (2010) in most cases (see Table 1).
Depths and times where these anomalies are added differ among our model runs (see Table 1); they are either inserted at the time of subduction and distributed according to a cosine bell shape centered at 300 km with 100 km half-width at half maximum, giving a more diffused slab, or 12 Myr after subduction at depths 600-650 km, or 14 Myr after subduction at 650-700 km, yielding more narrow slabs. The insertion over a wider depth range in some models follows our earlier approach Torsvik, 2010, 2012) and was used here with the aim of modeling a smooth present-day structure, for which a better correlation with tomography can be expected. Insertion over a narrower depth range is meant to give a better visual agreement in the cross sections, since slabs observed by tomography are also often rather narrow.
In this case, slabs are also inserted at a greater depth, since our viscous flow model is probably not appropriate for upper mantle slabs, which are still mechanically attached to the plates (Becker and O'Connell, 2001;Conrad and Lithgow-Bertelloni, 2002). They are inserted with a time delay corresponding to a presumed slab sinking speed of ∼5 cm yr −1 in the upper mantle, similar to typical speeds of plate motion. Hence we effectively consider slabs below depth 600 or 650 km as "detached", since shallower slabs are not included in the density and flow model. However, we consider upper mantle slabs for the correlation computations and the cross section plots: slabs subducted at 14 Ma are assigned to depth 650 and 700 km, slabs from 12 Ma to 550 and 600 km, and so on, upward, until slabs from 2 Ma are assigned to the 50 and 100 km layers.
In some of our models (see Table 1), the Clapeyron slope of phase boundaries is considered through mass anomalies at the depth of the phase boundary. Specifically, we consider an additional mass per area at depth 660 km that is equal and opposite to the density anomaly just below 660 km (here at depth 700 km) multiplied with a 77 km thickness. This thickness corresponds to a product of Clapeyron slope and density difference of 300 (MPa K −1 )(kg m −3 ) as suggested by Akaogi and Ito (1999), and a thermal expansivity of α = 2.079 × 10 −5 K −1 (Steinberger and Calderwood, 2006). In the same way, we use a thickness of 105 km (and equal, not opposite, density anomaly) for modeling the effect of the "410"; however, this plays almost no role, as density anomalies are inserted into the model below that depth.
These slight modifications were made because a slab that has a realistic width of 100 km at the depth of the "660" interacts with the phase transition to slow down sinking into the lower mantle. We expected that, combined with the delay of 14 Myr of inserting slabs and the stronger decrease of thermal expansivity with depth, an increased sinking time, and hence improved correspondence between predicted slab locations and tomography, could result. However, comparison of a model with phase boundaries and the corresponding model without shows the effect of these modifications is rather minor.
We include diffusion of heat through the CMB, but apply an "isolating" upper surface, in order to not include twice the same effect, which is already considered by the explicit addition of subducted slabs into the model. For that reason, we choose a thermal diffusivity value that is meant to be realistic for the lowermost mantle. The CMB is assumed to be isothermal, with thermal density contrast of -1.2 % across the lower thermal boundary layer, corresponding to a temperature contrast of 1126 K with our expansivity profile (Fig. 1b). Heat flow across the CMB is hence an output of our model, and turns out to be rather variable, between less than 10 mW m −2 in some regions beneath thermochemical piles, and several hundred mW m −2 in some regions beneath subduction zones.  Steinberger and Torsvik (2012), but displayed in a different and hopefully intuitive way: Color represents time of subduction before present, and darkness represents the amount of subducted material per time and per subduction zone length, expressed in terms of convergence rate (CR) times a thickness factor (TF), which accounts for the increase of lithosphere thickness with age. We use TF= √ (age/80 Myr) for age <80 Myr and TF=1 for age >80 Myr. Convergence rates are largely unconstrained before 140 Ma and therefore not considered -see Steinberger and Torsvik (2010) for details. Younger slabs are plotted on top of older ones, corresponding to a "view from above" for slabs sinking vertically at constant speed (cf. Sigloch, 2011).
These heat flow variations and their effect on the dynamics of the core are discussed by Biggin et al. (2012).
Most of our models also include a chemical layer at the base of the mantle, which is initially 70 km thick, with a compositional density contrast 2.3 % at the time of model initiation at 300 Ma. This initial condition is chosen because it appears the least prejudiced, but we note that there are indications that this chemical layer was already formed into piles similar to present-day 300 Myr ago or perhaps much longer . In these cases, advection of compositional heterogeneities is formulated as Equations (3) and (4) are solved on the same radial layers as (1) and (2) and laterally on a grid of 256 equally-spaced longitudes and 128 "Gaussian" latitudes (Press et al., 1986). This grid enables transformation back and forth between spatial and frequency (spherical harmonic) domain, which is necessary at each time step. The equations are time integrated with the fourth-order Runge-Kutta method (Press et al., 1986), using upwind differencing for the advection terms in order to avoid numerical instabilities. As our model output, we compute the present-day temperature or thermal density field, which is compared with a seis-mic tomography model as discussed in the next section. As a model enhancement, we have introduced slab tracers. They are also added vertically below subduction zones at depths 650 and 700 km and advected with the flow using a 4th order Runge-Kutta scheme (Press et al., 1986). They carry as information their weight, location and time of insertion. In this way, we can more directly compute sinking times and speeds at specific locations, as well as averages and variability of sinking times and speeds and lateral motion, and compare these modeling results with observation-based estimates (e.g. van der Meer et al., 2010) to complement our own comparison with tomography.

Tomography model
A great number of tomography models have been published over recent years. Here we mainly compare to the SMEAN composite S-wave tomography model of Becker and Boschi (2002) (Fig. 3) which was obtained by RMS-weighted averaging of S20RTS (Ritsema and van Heijst, 2000), NGRAND (a 2001 update of Grand et al., 1997), and SB4L18 (Masters et al., 2000). SMEAN has been found to outperform other models in geodynamic tests (Steinberger and (Becker and Boschi, 2002) composite tomography model that is meant to show the depth and intensity of slab-related anomalies. For each location, we determine local maxima of the S-wavespeed anomaly vs. depth profile, as such maxima may correspond to the centers of subducted slabs. We plot the depth (represented by color) and S-wavespeed anomaly (represented by darkness) of such maxima, if they occur in the lower mantle (depth >670 km) and exceed 0.5 %. If at a given location more than one maximum satisfying these conditions is found, only the upper one or uppermost one is plotted, corresponding to a "view from above". 2006) and yields good variance reductions when put to the test with actual seismic waveforms (Qin et al., 2009).
We also constructed an update, which we call SMEAN3, where we replaced S20RTS by S40RTS  and NGRAND by TX2008 (the purely seismic model of Simmons et al., 2009). Differences between SMEAN3 and SMEAN are rather minor; correlations are close to unity up to spherical harmonic degree l ∼14, and average correlation up to degree l = 20 is 0.95. We also find that different mean models -such as a simple average of these three models, or the average of Buiter et al. (2012), or similar averages where we only include those models that have global coverage, or only S-wave models with global coverage -all look very similar. Correlation and visual agreement among different S-wave models -and even more so among the various mean models constructed -is generally much higher than between tomographic and geodynamic models (which will be discussed below; cf. Becker and Boschi, 2002). Because Pwave models often have better resolved slabs, we also visually compare our model with a recent P -wave model, MITP-08 (Li et al., 2008).

Ways of comparison between geodynamic and tomographic models
With the exception of two additional scaling tests (discussed below), we assume here that S-wave velocity anomalies are linearly related to the density anomalies we infer from our slab models, implying that all chemical heterogeneity is captured by the chemical piles in the lower mantle. This assumption ignores the potential complexities which may arise from mineral physics and thermodynamic considerations of the shear wave to density anomaly scaling in a heterogeneous mantle (e.g. Ricard et al., 2005;Stixrude and Lithgow-Bertelloni, 2010). We compare geodynamic and seismic models both visually along cross sections and in map view and formally in terms of correlation coefficients. Formal correlation is computed and displayed in the same way as in Becker and Boschi (2002). That is, we plot or give global correlation -as a function of depth and spherical harmonic degree, -as a function of depth for all spherical harmonics up to l = 8 or l = 20, All panels are for model st12den-2; red lines indicate the average, whereas histograms plotted for certain depths or times illustrate variability. Only those slabs subducted since 120 Ma are plotted, since older slabs tend to move laterally in the lowermost mantle, and may get heated up and rise again, and would hence make the picture less clear. Both average and histograms consider variable "mass" of slab tracers, which may differ because of variable convergence rate, age of subducted plate and spacing of tracers along subduction zones (which is kept nearly constant).
-and for the entire mantle up to l = 8 or l = 20.
Correlation is a convenient, but perhaps not always the most meaningful way of comparison between models (e.g. Ray and Anderson, 1994). For example, if the predicted location of a subducted slab is only slightly higher or lower than observed through tomography, or -in other words -at the same depth is slightly offset to the side, correlation may be low, although the pattern may be quite similar. This issue has been addressed in Becker and Boschi (2002) by vertically stretching subduction models. Besides this formal comparison, we therefore here also give a visual comparison in map view and along cross sections, which can also give an indication of how predicted and observed slabs are offset, and what kind of model modification could improve the fit.

Sinking speeds and lateral displacement of slabs
Results regarding modeled slab sinking speed and lateral displacement are summarized in Fig. 4. The left panel A shows an average sinking speed of ∼2 cm yr −1 in the lower mantle. However, it also indicates variability; for example, at 60 Ma, most slabs are in a depth range 1700-2300 km, indicating sinking speeds varying between ∼2.2 and 3.5 cm yr −1 in the upper part of the lower mantle. Somewhat lower sinking speeds (average ≈1.6 cm yr −1 ) are computed if we restart the computation at 300 Ma with the computed present-day density model. This might occur because the lower mantle be-neath subduction zones already contains cold material from the previous model run, so the density difference between newly subducted slabs and surrounding mantle is less, leading to slower sinking.
The center and right panels B and C show that most slabs sink close to vertical. For example, at 1850 km depth, the peak of the histogram is at only 1 degree of arc, and the larger amount of slabs get advected less than 2.4 degrees. Only at the very base of the mantle, older slab particles will obviously move horizontal above the CMB, and distances get larger. Figure 5 shows the distribution of slabs predicted by our forward model and displayed in analogous way to the tomography model in Fig. 3. On the positive side, we note that there is an overall agreement between regions where slabs occur in both models. In particular, the agreement is quite good in the lowermost mantle (violet colors). Of course, this does not come as a surprise, as slabs in our model do not move very much laterally until they reach the lowermost mantle (see Sect. 3.1), and the agreement of subduction zone locations through geologic history (Fig. 2) and tomography of the lowermost mantle (Fig. 3) has been noted early on (e.g. Richards and Engebretson, 1992). On the downside, the maps look quite different on a smaller scale. The tomography model looks more "blobby", whereas the geodynamic model generally shows linear anomalies often connected to surface subduction zones, getting more diffused and less strong further down in the mantle before spreading out above the coremantle boundary. It is also not straightforward to "match" individual slabs in both models. Part of the problem is that we have to choose a certain cutoff in both figures, and the figures change with cutoff, which is why we also compare the models along cross sections in Sect. 3.2.2. But it is also clear that trying to match slabs inferred from subduction history and tomography requires a very dedicated effort (e.g. van der Meer et al., 2010). The dependence of fit on wavelength will be more formally discussed in Sect. 3.3; expectedly, the agreement looks much better if you step back and do not scrutinize the details.

Cross sections
While map views as presented in Sect. 3.2.1 enable an overall comparison, the problem of having to choose a certain cutoff, and the fact that slabs higher up in the mantle hide slabs beneath, make cross sections more suitable for a detailed comparison trying to match slabs in both models. Such a comparison of cross sections through the mantle beneath subduc-tion zones of the past 300 Myr is given in Fig. 6. While the cross sections beneath the most of the Americas, Australia and Antarctica mostly cross one subduction zone for a given time, those beneath Indonesia, Eurasia and Alaska typically cross two -one in the circum-Pacific Ring and one corresponding to Tethys, Mediterranean and Indian Ocean subduction at the southern margin of Eurasia. Cross sections are shown for model st12den-2. Cross sections for s12den-1 have more diffused slabs -as the slab input is distributed over a larger depth interval -but otherwise look similar.
Tomographic and geodynamic cross sections show overall agreement on the large scale, with -in most casesthe central parts of the cross sections being dominated by cold or seismically fast material that can be attributed to subduction, while on one (240°-285°), or both (90°-225°) of the sides, often hot or seismically slow material appears due to upwellings that may occur along the margins of the Large Low Shear Velocity Provinces (LLSVPs; Garnero and McNamara, 2008). A different pattern occurs under Asia, with subduction and corresponding slab anomalies (Circum-Pacific and Tethys) on either side, and a slow anomaly (seen in the tomography cross sections especially at 345°, in the model cross section at 345°and 330°) in between. A slow  2), and their color to slab age, i.e. its time of subduction. To make cross sections in the other three columns more comparable, the mean value at each depth layer is set to zero. Numbers in each row indicate the azimuth of the cross section, as indicated in Fig. 2. The orientation is such that mostly West is left and East is right; specifically the letter "A" indicates the "African" end of the cross section, the letter "P" the Pacific one. Part (A): Cross sections beneath Central and South America (north to south from top to bottom panel).
anomaly in the lowermost mantle beneath Russia and Kazakhstan appears in most recent tomography models and seems to be a robust feature. The tomography cross sections at 345°and qualitative agreement with modeling results (see also Steinberger and Torsvik, 2012) indicate that it may well be overlain by a mantle plume.
A more detailed comparison of slow regions, except for those corresponding to the LLSVPs of the lowermost mantle (see Sect. 3.1.3), is not attempted here for the following reasons: 1. Our code does not consider lateral viscosity variations, and therefore our resulting upwellings are probably unrealistically wide.

2.
A statistical comparison of seismically slow regions and predicted plume conduits from geodynamic modeling has already been shown to display good agreement for some deep-rooted plumes being connected to hotspots (Boschi et al., 2007(Boschi et al., , 2008.
3. Although we generally find a pattern of plumes along LLSVP margins , their locations do not exactly correspond to observed hotspots.
If we attempt to match individual features in the seismically fast regions, we find they are generally less deep in the tomography model. We begin the comparison under South America (135°-90°) where it is perhaps most straightforward, and then move counterclockwise around the Pacific before discussing subduction at the southern margin of Eurasia. At 135°, we find a gap (corresponding to smaller amounts of subduction) at a radius ∼0.7-0.75 (normalized to Earth's radius) in the geodynamic model, whereas a similar gap occurs in the S-wave tomography model at about 0.73-0.85. At 120°, this gap occurs at larger depth in both cases: ∼0.6-0.7 in the geodynamic vs. ∼0.63-0.8 in the S-wave and 0.75-0.85 in the P -wave tomographic models. Further north, we observe more or less continuous fast or cold material from the top to bottom of the mantle in all three cases. However, maxima (corresponding to largest amounts of subduction) occur again at larger depths ∼0.8 in the geodynamic model vs. ∼0.85 in both tomography models, at 105°-75°. We also find that modeled maxima typically occur ∼10°further west than seismically observed, which may, at least in the southern part, be caused by flat slab subduction (Isacks, 1988) not accounted for in our model, where we assume vertical sinking in the upper mantle. Beneath North America (15°-60°), disagreement becomes more prominent. Still, we find maxima generally deeper (∼0.65-0.7) in the geodynamic model than in the S-wave tomography model (∼0.72-0.8) and further to the west. Again, flat slab subduction (Bird, 1988) could, at least partly, be responsible for this lateral offset. Accordingly, Bunge and Grand (2000) invoked low-angle subduction from a comparison of geodynamic modeling results and tomography. It has been shown to be difficult to achieve agreement between geodynamic and tomographic models in this region without ad-hoc assumption such as a stress guide (Liu et al., 2008). However, beneath North America also the shape of the anomaly disagrees. Whereas S-wave tomography shows a slab dipping from west to east (especially in the sections at 60°and 45°in Fig. 6), at an approximately constant dip angle, hence indicating an approximately constant slab sinking speed of ∼1-1.5 cm yr −1 (Grand et al., 1997), the modeled slab has a more complicated shape, reaching a maximum depth for slabs of about 80 Ma, whereas older slabs are less deep in the mantle. The P -wave cross section at 60°also indicates a more complicated shape, but quite different from our model. Qualitatively, we can understand the slab shape in our model due to high rates of subduction beneath North America in the late Mesozoic (darker colors in Fig. 2). These not only cause relatively fast slab sinking speeds, but also an upward return flow to the side of it, further enhanced by active upwelling (indicated by red colors beneath the slab in the cross sections) thus hindering the older slabs from sinking further or even pushing them up again. However, the fact that this shape is not observed may indicate that rates of Farallon subduction beneath North America before ∼80 Ma were higher than in our model. This could be the case, as the absolute Pacific plate motions is not well known before the age of the Hawaii and Louisville hotspot chains, and hence Circum-Pacific rates of subduction are not well constrained before ∼80 Ma. Larger amounts of Farallon subduction could also account for the discrepancy between seismically fast material at the base of the mantle at 60°and 45°(S-waves) or only 60°(P -waves) beneath North America, and the absence of corresponding material in the geodynamic model. Beneath Alaska and the Aleutians (left part of cross section at 0°and right part at 345°) a larger modeled anomaly centered at radius ∼0.8 might correspond to an anomaly at radius ∼0.82 that appears in both tomography models at 0°and the S-wave model at 345°.
A reasonably good match between s-wave and subduction model is found beneath Kamchatka and especially Japan (right parts at 330°and 315°). The main difference is that the slab beneath Japan appears to stagnate in the transition zone (Fukao et al., 2001) and is therefore observed at a depth shallower than modeled. The fact that the observed slab is further to the west could be due to the slab moving (possibly being pushed by the subducting plate) further westward while it is stagnating.
Further south, disagreement becomes more prominent again. At 300°(Izu-Bonin arc) the model slab is still continuous, whereas both tomography models actually show slow anomalies in the mid-mantle, separating fast anomalies below and above. Such a separation between slabs in the upper part of the lower mantle, and in its lowermost part occurs in the geodynamic model only further south, at 285°(Marianas arc). This pattern of fast or cold anomalies mainly down to the upper part of the lower mantle and in the lowermost mantle in both the geodynamic and tomographic models continues south until 240°and 225°(Tonga-Kermadec). Anomalies in the lowermost mantle are mainly seen in S-wave tomography, though. The difference is particularly striking in the 240°cross section. Tonga-Kermadec is also the only region where the slab, as seen in the tomography models, occurs at larger depth than in the geodynamic model. This is likely partly because due to fast spreading in the Lau back-arc basin and resulting trench retreat, actual convergence rates at the Tonga trench are up to 24 cm yr −1 (Bevis et al., 1995). This is much higher than in our simplified plate reconstruction, where we use Pacific and Australian plate motions to compute convergence, and larger amounts of subducted material may lead to faster sinking. A comparatively smaller effect is that in our geodynamic model, sinking is counteracted by upward flow in the lower mantle, whereas the S-wave tomographic model only shows further to the east slow wavespeeds indicative of upward flow.
A further discrepancy is that further to the west, beneath Australia and the Australian Antarctic Discordance south of Australia, there are considerable amounts of seismically slow material throughout the lower mantle in the S-wave tomography model, whereas in the geodynamic model most of the cold material is in the lowermost mantle. Again, this could be because sinking speeds are too fast in the model, but it has also been suggested that an ancient slab is being drawn up beneath the discordance (Gurnis and Müller, 2003).
In the regions further south, across and near Antarctica (cross sections at 210°and 180°) agreement is good again to the extent that in both S-wave and and the geodynamic models the most prominent fast or cold anomalies only occur in the lowermost mantle, because subduction has terminated in that region at about 80 Ma. Again, the P -wave model is different to the extent that slow anomalies at the base of the mantle are much less prominent. At the 180°(south polar) cross section it can also be seen that, once subduction has stopped, the last slabs sink at considerably slower rates than average, as shown in Fig. 4a: from the computed presentday depth ∼1300 km and the insertion depth 650-700 km at 66 Ma, an average sinking speed of ∼1 cm yr −1 is inferred. Given that the S-wave tomographic anomaly reaches up to about the same depth, one can infer a similar sinking speed also from tomography.
Slabs subducted at the southern margin of Eurasia can be seen at the right side of the cross sections from 30°to 0°and at the left side from 345°to 270°. We can again mostly match fast anomalies in the seismic with cold anomalies in the geodynamic cross sections. Again, anomalies in the geodynamic cross sections occur often (such as in the 315°cross section) at greater depth than in the seismic ones. There is no strong lateral offset, except at 270°. In that equatorial cross section beneath Indonesia, the fit is very poor. The geodynamic model predicts cold anomalies throughout the lower mantle, whereas the seismic models show a strong anomaly laterally displaced mainly in the upper part of the lower mantle. This misfit is probably in part because the plate kinematic history in that region is very complicated and not adequately matched by our simplified model.
Finally, cross sections beneath Asia, especially at 330°, show in their central parts the remains of the Mongol-Okhotsk subduction zone (van der Voo et al., 1999). Like in the other case where subduction has subsequently stopped -beneath Antarctica -the geodynamic model predicts here comparatively slow sinking of the final slabs subducted in that region, such that the geodynamic model predicts cold anomalies at a similar depth to the observed fast seismic anomalies.

Map views of the lowermost mantle
We now compare model predictions with and without a thermo-chemical layer at the base of the mantle. If -as in the model shown in Figs. 5 and 6 -such a layer is included, it is shaped into thermo-chemical piles, essentially being pushed away by subducted slabs and piled up beneath the regions where no subduction has occurred since 300 Ma (Mc-Namara and Zhong, 2005). In the map view shown in Fig. 7a, these piles appear as large regions of negative thermal density anomaly, corresponding to high temperature; they are chemically denser and hence remain near the CMB, where they : Same for a model without thermochemical layer. No phase boundary is considered, and slabs are inserted at depths 600 and 650 km 12 Myr after subduction. However, the latter two differences only change results in a minor way; the main difference is due to presence or absence of a thermochemical layer. Bottom panel (C): Composite tomography SMEAN (Becker and Boschi, 2002). Again, mean values are set to zero in all cases. become hot. These regions correspond in location and shape approximately to the LLSVPs seen in tomography (Fig. 7c), but are somewhat larger in size. The size could however be adjusted by varying the chemical density contrast. On the other hand, in the purely thermal model, a larger number of smaller anomalies similar in size, and one of them similar in location, to the seismic anomaly seen beneath Russia and Kazakhstan are predicted (Fig. 7b).

Correlations between geodynamic models and tomography models
Our more qualitative visual comparison in the previous section has already shown that agreement is best on the largest scales, whereas finer details, if matching pairs can be found at all, are often shifted laterally or radially. The correlation plots in Fig. 8 provide a more quantitative description of this finding. For st12den-1 (Fig. 8a) and st12den-2 (Fig. 8b), the center parts show that correlation is good throughout the lower mantle until l ∼4, corresponding to a half-wavelength, or size of anomalies, of ∼5000 km. However, since lowermost mantle structure is dominated by large-scale structure, this still corresponds to a total correlation ∼0.7 for both r 8 and r 20 , i.e. up to degree l = 8 or 20. In the middle part of the lower mantle, large-scale structure becomes less dominant, and therefore overall correlation is lower, particularly for r 20 . From 2000 km up to 670 km, correlation becomes better at gradually higher degree. At 670 km, it is reasonably good until degree 11 (half-wavelength 1800 km or 16 degrees of arc) -corresponding to a lateral offset of anomalies in the geodynamic vs. seismic model of typically less than that -in accord with the visual assessment in the previous section. Correspondingly, overall correlation in the upper part of the lower mantle is also higher than in its midpart, particularly for r 20 . The right panels show that through much of the lower mantle the rms density anomaly of the geodynamic model is about one quarter of the RMS wavespeed anomaly of SMEAN, similar to the value expected if both are due to temperature anomalies (e.g. Steinberger and Calderwood, 2006). In the lowermost mantle, where the assumption of thermal density anomalies holds less well, the difference in rms anomaly becomes somewhat less. Overall, SMEAN is somewhat better correlated with st12den-1 than with st12den-2. This is probably partly because slabs in st12den-1 are inserted at shallower depth. Partly it may also reflect that slabs in st12den-1 are inserted over a larger depth interval and therefore more diffused, and tomography models also tend to be "smeared out". These correlations imply the assumption of a constant scaling between seismic and density anomalies, whereas in reality, non-linear seismic sensitivity to temperature probably causes warm low-density anomalies to be scaled to larger anomalies than cold high-density anomalies. To address this issue, we did a simple test where we correlated st12den-1 with SMEAN, SMEAN-, and SMEAN+ where the latter two were derived by scaling negative and positive anomalies with a factor of 2.5, respectively, while leaving the other anomalies at unity scaling. The correlations are overall similar for all three models, but SMEAN-has somewhat worse performance because the slow anomalies are not predicted that well by mainly slab-driven models. SMEAN+, on the other hand, is almost identical to SMEAN correlations with st12den-1, consistent with the suggestion that we are matching fast  Becker and Boschi (2007) and based on the modeling procedure described in Steinberger and Antretter (2006) have been added to the slab model st12den-7.
anomalies best, and increasing the amplitude of those does not make much of a difference. The overall appearance of the correlation plots remains similar, and total correlation becomes only slightly lower if we replace SMEAN by any of the other mean models, or by S40RTS (Ritsema etal., 2011), or if we replace our geodynamic model by the model of Steinberger and Torsvik (2012) (shown in their Fig. 5) with a less strong decrease in thermal expansivity with depth, compared to st12den-1.
Whereas in models st12den-1 and st12den-2 subduction zones are in a TPW corrected global hybrid reference frame Torsvik et al., 2008), additionally corrected in longitude (van der Meer et al., 2010), we find somewhat lower correlations r 8 = 0.29 compared to 0.35 and r 20 = 0.18 compared to 0.21 without the longitude correction (model st12den-3). This is not surprising given that the correction of van der Meer et al. (2010) by construction is meant to optimize the fit between slab locations based on subduction zones and tomography.
Correlation is also substantially reduced throughout the mantle if the model does not include thermo-chemical piles (st12den-6; cf. Fig. 7b). We attribute that difference occurring throughout the mantle and not only at the depths where the thermo-chemical piles occur to the fact that upwelling in the model with piles are generated at locations that are less discrepant from regions of low seismic wave speed.
In contrast, if we include neither thermo-chemical piles nor diffusion of heat across the bottom thermal boundary layer in the model (st12den-7; Fig. 8c), we obtain an even higher correlation, except in the lowermost mantle; including diffusion of heat across the CMB improves correlation in the lowermost mantle because model slabs push hot material towards locations corresponding to LLSVPs. However, higher up in the mantle, correlation gets worse, as upwellings in the model form at locations that generally do not match well with actual upwellings.
We find that correlations have improved compared to the earlier slab model of Steinberger (2000), which gives r 8 = 0.3 and r 20 = 0.21 with SMEAN (Becker and Boschi, 2002). In this case, like for our new slabs-only model st12den-7 (Fig. 8c), correlation remains at a similar level throughout the mantle, and is slightly higher in the upper part of the lower mantle than at its base. Correlations of SMEAN with the simple slab sinker model (vertical sinking at a prescribed speed) of Lithgow-Bertelloni and Richards (1998) are also similar to correlations with Steinberger (2000) (r 8 = 0.33 and r 20 = 0.18). For comparison we have also devised a slab sinker model based on our own subduction model, both in the TPW corrected global hybrid reference frame Torsvik et al., 2008) and in a reference frame additionally corrected in longitude (van der Meer et al., 2010) shown in Fig. 2, and vertical sinking of 1.2 cm yr −1 (van der Meer et al., 2010). We find that despite the update in plate reconstruction model, correlations of the slab sinker model with tomography remain low, on a similar level to the model of Lithgow-Bertelloni and Richards (1998). The fact that the "slabs-only" geodynamic model st12den-7 gives much higher correlations than the slab sinker approach emphasizes the importance of mantle flow modeling that predicts (although rather small) lateral advection of slabs and variable slab sinking speeds (Fig. 4a).
Finally, because it appears that including upwellings that dynamically form in our model always deteriorates correlation, we consider a model where, instead, we include plumes with surface positions based on hotspots, and tilted plume conduits with moving source at the CMB, as in Boschi et al. (2007Boschi et al. ( , 2008, based on the modeling approach of Steinberger and Antretter (2006). We find that in this case (Fig. 8d) correlations are further improved compared to the slab-only model. We note that here the amplitude scaling of plumes and hence the amplitude of the combined model is somewhat arbitrary, but resulting correlations depend on this scaling only slightly. Note, though, that the flow field used to advect plumes in this approach is not based on subduction, but inferred from tomography.

Discussion
Slab sinking speeds in our model (Sect. 3.1) are significantly higher, by about a factor 2, than the estimate 1.2 ± 0.3 cm yr −1 of van der Meer et al. (2010) based on comparing reconstructed subduction zones with tomography. Shifting the viscosity profile to higher viscosities does lead to somewhat reduced sinking speeds. We have run a test case where viscosity has been increased by 50 % everywherewhich is probably already above the upper limit of what is compatible with the "Haskell" postglacial rebound constraint -and found that this reduces sinking speeds, inferred from a depth-vs-time plot as in Fig. 4a, by only ∼10 %.
Our comparison to tomography in Sect. 3.2 also indicates that model sinking speeds are too high. Building upon this comparison we can give our own estimate of what would be appropriate slab sinking speeds to best explain tomography; we identify characteristic features that can be visually matched in the geodynamic model and tomography model cross sections. Based on the slab tracers, we determine the age of slabs corresponding to this feature. We then obtain our own observation-based sinking speed estimate by dividing the depth of the feature in the tomography model through this age.
We distinguish between the following three cases: (a) beginning of subduction, or substantial increase in the amount of subduction (e.g. beneath South America), (b) end of subduction (especially Mongol-Okhotsk subduction beneath Asia; Phoenix subduction beneath Antarctica), and (c) specific features in the middle of subduction (the bend in the slab beneath Japan). Results are plotted in Fig. 9. It appears that most data points plot around a straight line through the origin with a slope of about 2 cm yr −1 , but with considerable spread, with most data points falling between lines with slope 1.2 cm yr −1 and 2.8 cm yr −1 . Exceptions are the two data points corresponding to the Mongol-Okhotsk slab, which would correspond to a much lower sinking speed. However, here our geodynamic model predicts an inverted age progression, with the oldest slabs on top, as subduction at two sides of it -at the southern and eastern margins of Asia -has pushed this slab up again. On the other hand, the "225" data point (Tonga-Kermandec) corresponds to sinking speed higher than 2.8 cm yr −1 , which could well be caused by the fast convergence rate and corresponding large amount of subducted slab per time and subduction zone length in this region.

B. Steinberger et al.: Subduction to the lower mantle -comparison between models 429
The discrepancy of our observation-based sinking-speed estimate with the 1.2 cm yr −1 determined by van der Meer et al. (2010) is therefore somewhat marginal. We also note that our approach is somewhat biased toward high sinking speeds -at least when interpreting the lower end of a slab -as often slabs are bent in our geodynamic model such that not the oldest slab is at the lower end.
Our modeled sinking speeds are higher than found by Cížková et al. (2012) for the case where they use a similar viscosity structure. We think that this difference occurs mainly becauseČížková et al. (2012) model relatively short episodes of subduction, whereas our model typically has subduction in the same region for a long time, leading to larger amounts of subducted slabs, and hence faster sinking.
The models of Shephard et al. (2012) are more similar to ours in that respect, as they are also based on actual subduction history. These authors find that sinking speeds in the lower mantle do not exceed 1.5-2 cm yr −1 , but this difference is probably due to their viscosity being 10 23 Pas throughout the lower mantle, whereas in our model such high viscosities are only reached in the lower part of the lower mantle. We find that our correlations between models and tomography are mostly higher than those of Shephard et al. (2012), in their case with 200 Myr of subduction in the "subduction reference frame", which is their case most similar to our model. This is in part due to our comparison to a different tomography model (for example, we find somewhat lower correlations if we compare our model with S40RTS rather than SMEAN), and, more importantly, because we have included a thermo-chemical layer that is formed into "piles" at the base of the mantle. Figure 7 suggests, at least to us, that the thermo-chemical model -with two large hot regions approximately corresponding to the two LLSVPs -fits tomography better than the purely thermal one. In addition to the two LLSVPs, the tomography model shows one smaller low shear velocity province beneath north of the Caspian Sea. This feature occurs in many recent tomography models, and therefore appears to be robust. One might consider that if this feature -which is similar in size and location to one of the small hot anomalies in the thermal model -is resolved, tomography would generally resolve a pattern such as in the thermal model in Fig. 7b. However, more appropriately, the geodynamic models should also be "looked at" through a tomography filter (Mégnin et al., 1997;Schuberth et al., 2009;Bull et al. 2009).
Our model provides an improvement compared to earlier models -both based on a simple slab sinker approach (Lithgow-Bertelloni and Richards, 1998) and on mantle flow models (Steinberger, 2000). The remaining, and still quite substantial, misfit can help to identify how the model needs to be improved in order to come closer to the ultimate goal of a subduction-based model of mantle evolution that accurately explains present-day tomography.
Firstly, we have tried to match certain features in the tomography model with corresponding features in the geodynamic model, and find that they occur in the geodynamic model generally too deep in the mantle. This might be compensated by assuming an even higher viscosity in the lower mantle. However, an even higher viscosity globally would presumably be difficult to reconcile with constraints for fitting geoid, global heat flux and postglacial rebound (e.g. Steinberger and Calderwood, 2006). This points towards a possibly important effect of lateral viscosity variations, which are not included in our model: if slabs have been subducted for a long time in a certain region, they cooled the lower mantle, leading to increased lower mantle viscosities in that region and thus slower slab sinking speed, while the global average viscosity could remain compatible with geoid constraints (Yoshida and Nakakuki, 2009;Ghosh et al., 2010). Neglecting lateral viscosity variations may also lead to overestimated sinking velocities, as weaker slabs can thicken more and hence sink faster. Thus, stronger slabs may sink slower even without the need for a higher lower-mantle viscosity.
Another difference is that in our model, the slabs often appear bent (e.g. in the cross sections beneath North America), such that sometimes older slabs are less deep than younger ones. In contrast, tomography in that region has been interpreted such that slabs sink at approximately constant speed, such that a subduction zone migrating at a constant speed could give a slab with constant dip (Grand et al., 1997). Again, this discrepancy could possibly be due to lateral viscosity variations causing slabs to be stiffer than their surroundings; a stiffer slab would be less readily bent.
Although in general our model slabs are too deep, the opposite case also occurs, namely beneath the Tonga-Kermadec subduction zone. This can be attributed to the fact that the convergence rate in our model is too low, and hence illustrates the importance of considering detailed, regional plate reconstructions. Also, a cross section where the fit is particularly poor is beneath Indonesia, which is also known to be a region of particularly complicated plate tectonic history. Clearly, it would be beyond the scope of any single paper to address this problem globally, so improvements here should be made region by region, possibly still within a global model, but with regionally-refined plate reconstructions and focusing on a regional comparison.
Besides including more detailed plate reconstructions, it will also be a key issue to start with plate reconstructions further back in time. Our tentative model, where we re-ran starting from the modeled present-day structure for another 300 Myr, yielded about 25 % slower slab sinking speeds. We think this occurs because of the accumulation of cold slab material in the lower mantle beneath subduction zones, leading to a reduced density contrast of newly subducted slabs and the surrounding lower mantle, and thus a reduced sinking speed, even without considering lateral viscosity variations. Including plate reconstructions further back in time could hence yield a similar effect. Furthermore, it could help to make thermo-chemical piles stable for a longer time. Observational evidence indicates that they have been in a stable position since 200 Myr (Torsvik et al., 2006) and possibly much longer , whereas they form more recently in our model because subduction is initiated at 300 Ma. But it could be a challenge to keep them stable, particularly if earlier plate reconstructions feature subduction zones between different continents assembling to form Pangea, and these subduction zones directly overly the African LLSVP (cf. Zhang et al., 2010).
P -wave models typically contain better resolved slabs (e.g. Bijwaard et al., 1998;Li et al., 2008), yet for example r 8 correlations of MITP-08 (Li et al., 2008) with st12den-2 is ∼0.19, which is worse than for SMEAN on a global scale. The SMEAN model, which is an average over three S-wave tomography models, and which we mainly compare to here, can be seen as a sort of "common denominator" that maintains robust features on global scales. On the other hand, some models may robustly resolve features that are not included in the mean model by design, particularly on regional scales. Again, as for the use of more detailed plate reconstructions, a careful analysis and comparison should be done region by region.
The overall effect of phase transitions in thermo-chemical convection on the location and shape of subducted slabs remains poorly understood. Our simplified treatment does not yield large model differences with and without phase boundary, yet the complexities of slab ponding at the spinelperovskite phase transition around 660 km depth may well be an important reason to blame for the poor fit, and further studies are needed to better understand these complexities.
Although not the focus of this paper, we note that including upwellings from a basal thermal boundary layer in the dynamic model always worsens the fit. Here we have shown that if positions of upwellings are based on surface hotspots, a much improved fit results. It will be a further challenge to improve models such that upwelling plumes self-consistently form at the right locations.
Although van der Meer et al. (2010) assume vertical sinking at constant speed, we confirm their approach to the extent that slab input in the reference frame that uses their longitude correction gives a better fit than without longitude correction. Future work should attempt to re-calibrate the longitude correction with a mantle flow based approach similar to this paper.

Conclusions
We have devised a geodynamic model of the mantle based on 300 Myr of subduction. In the model, most slabs sink to the lowermost mantle in about 120 Myr, while they typically only move a few degrees laterally. However, such lateral advection and the lateral variations of slab sinking speeds are relevant, and they lead to an improvement in model fit to tomography compared to models with slabs sinking vertically at constant speed. If a chemical layer is included in the model, it yields two thermo-chemical piles in the lowermost mantle, similar in shape and location to the Large Low Shear Velocity Provinces that are seen in tomography. This model correlates very well with the SMEAN composite tomography model until about spherical harmonic degree 3-4. Comparison along cross sections shows substantial differences between geodynamic and tomographic model, but allows to match certain "slab" features in either model with each other. Corresponding features in the geodynamic model appear normally at greater depth than in the tomography model, indicating that modeled sinking speeds are too fast. Through such matching of features, we can obtain an observationbased slab sinking speed estimate of about 2 cm yr −1 , varying mostly between 1.2 cm yr −1 and 2.8 cm yr −1 .