Folding and necking across the scales : a review of theoretical and experimental results and their applications

The shortening and extension of mechanically layered ductile rock generates folds and pinch-and-swell structures (also referred to as necks or continuous boudins), which result from mechanical instabilities termed folding and necking, respectively. Folding and necking are the preferred deformation modes in layered rock because the corresponding mechanical work involved is less than that associated with a homogeneous deformation. The effective viscosity of a layered rock decreases during folding and necking, even when all material parameters remain constant. This mechanical softening due to viscosity decrease is solely the result of fold and pinch-and-swell structure development and is hence termed structural softening (or geometric weakening). Folding and necking occur over the whole range of geological scales, from microscopic up to the size of lithospheric plates. Lithospheric folding and necking are evidence for significant deformation of continental plates, which contradicts the rigid-plate paradigm of plate tectonics. We review here some theoretical and experimental results on folding and necking, including the lithospheric scale, together with a short historical overview of research on folding and necking. We focus on theoretical studies and analytical solutions that provide the best insight into the fundamental parameters controlling folding and necking, although they invariably involve simplifications. To first order, the two essential parameters to quantify folding and necking are the dominant wavelength and the corresponding maximal amplification rate. This review also includes a short overview of experimental studies, a discussion of recent developments involving mainly numerical models, a presentation of some practical applications of theoretical results, and a summary of similarities and differences between folding and necking.


Introduction
One of the principal objectives of theoretical research in any department of knowledge is to find the point of view from which the subject appears in its greatest simplicity.(J.Willard Gibbs, 1880, acceptance letter of Rumford Prize) About 200 years ago Sir James Hall made his famous analogue experiments on layer-parallel shortening of linen and woollen cloth layers with vertical confining pressure provided by "a door (which happened to be off the hinges)" loaded with weights (Hall, 1815;Fig. 1).During shortening the layers deflected laterally (orthogonal to the shortening direction), which is a process termed buckling (mainly used for elastic material) or folding (mainly used for viscous material; Fig. 2a, b).James Hall was probably the first to show that natural folds in rock are the result of a horizontal compression.However, some of the first scientific observations on folds were already made more than 100 years earlier, by Marsili and Scheuchzer in 1705, in the European Alps around Lake Uri in Switzerland (see Hantke, 1961, Ellenberger, 1995, Vaccari, 2004;Fig. 3a).If a competent layer (i.e. one having a higher mechanical strength or greater resistance to deformation) is extended, rather than shortened, it will not usually deflect but it will either break (brittle or fracture boudinage) or it locally thins in a ductile manner (necking; Fig. 2c, d).The term boudinage goes back to Lohest (1909) but necking and pinch-and-swell structure were probably first described by Ramsay in 1866 (Ramsay, 1866;Cloos, 1947;Lloyd et al., 1982;Fig. 3b).We use necking here as a general term to refer to local thinning due to the inherent mechanical instability of a competent layer under extension and pinch-and-swell to refer to the periodic structure that develops from repetition of necking zones along the layer.The finite strain deformation geometry of a competent layer is thus fundamentally different for layer-parallel shortening and extension (Fig. 4), although the initial stages of both folding and necking instabilities can be mathematically explained with the same theory (e.g.Smith, 1975Smith, , 1977)).The mechanical processes controlling the different behaviour of competent layers under compression and extension are the main focus of this review.
Folding and necking are processes that result from instabilities in elastic, plastic and viscous material caused by layer-parallel compression and extension, respectively, of mechanically competent layers.In this review, the overall deformation behaviour is assumed to be ductile and continuous so that fracturing does not play any controlling role.However, after some amount of ductile necking a layer often fractures around the necked region, which is a process termed ductile fracture (e.g.Dieter, 1986), and necking can act as a ductile precursor to brittle boudinage.In nature, brittle structures can also act as precursors that localise ductile shearing (Segall and Simpson, 1986;Mancktelow and Pennacchioni, 2005;Pennacchioni and Mancktelow, 2007) and hence may trigger the formation of a pinch-and-swell structure (Gardner et al., 2015).
The structures resulting from folding and necking can have a wide variety of different geometries, especially in multilayers (Ramberg, 1955;Ramsay and Huber, 1987;Price and Cosgrove, 1990;Goscombe et al., 2004;Fig. 5).As noted by Ramsay and Huber (1987), "folds are perhaps the most common tectonic structure developed in deformed rocks" and a thorough understanding of folding is therefore essential to understand the deformation of typically layered or foliated rocks.Folds can also be generated by passive flow or bending but here we focus only on folds resulting from mechanical instability due to layer shortening.Pinch-and-swell structure seems to be less frequent in nature than folds and this review will provide potential reasons to explain this observation.Folding and necking (and also brittle boudinage) can occur simultaneously, with necks and/or boudins commonly forming in the limbs of folds (Ramberg, 1959;Fig. 5c).In nature, folding and necking are always three-dimensional (3-D) processes (Fig. 6), but most theoretical results are based on 2-D models.
Folding and necking are important tectonic processes because they occur over the whole range of geological scales, from microscopic dimensions up to the size of lithospheric plates.Lithospheric folding and necking in tectonic plates contradict the paradigm of plate tectonics sensu stricto, which states that tectonic plates are rigid and deformation only occurs at plate boundaries.Lithospheric folding can occur on a length scale of several thousands of kilometres, such as in Central Asia (Burov et al., 1993;Fig. 7a).Lithospheric necking is also important, for example, in the formation of rift basins (Zuber and Parmentier, 1986) and of magma-poor passive continental margins, because many of these margins are characterised by so-called necking domains in which the crustal thickness decreases from normal crustal thickness (30-35 km) to ca. 5-10 km (Sutra et al., 2013).Lithospheric necking can also generate a crustal-scale pinch-andswell structure (Fletcher and Hallet, 1983;Gueguen et al., 1997;Fig. 7b) and is a first-order process during slab detachment (Lister et al., 2008;Schmalholz, 2011;Duretz et al., 2012;Bercovici et al., 2015).
The literature on the mechanics of folding and necking is vast because these processes (i) occur from millimetre to kilometre scale, (ii) were modelled using a variety of constitutive equations such as elastic, plastic, viscous, viscoelastic or viscoelastoplastic, (iii) can be driven by imposed boundary displacements, velocities or stresses, or by gravity, (iv) were studied for different bulk deformation geometries such as pure shear, simple shear or wrenching, (v) were studied for single-or multilayer configurations, (vi) were studied for isotropic and anisotropic materials, (vii) were studied in 2-D and 3-D and (viii) were studied using analytical solutions, laboratory (analogue) experiments, or numerical simulations.We present here only a small selection of studies and results.Further information on the mechanics of folding and necking in rock can be found in textbooks (Price and Cosgrove, 1990;Johnson and Fletcher, 1994;Pollard and Fletcher, 2005) and in other review articles (Hudleston and Treagus, 2010;Cloetingh and Burov, 2011).
We focus on studies that investigated particular mechanical aspects of folding and necking and on studies that applied the results to geological observations and problems.Particular geological questions concerning folding and necking are, for example, (i) which parameters control the observed geometry of folds and pinch-and-swell structures, (ii) how much shortening or extension is required to generate the observed finite (high) amplitude folds and pinch-and-swell structures, (iii) does folding and necking change the overall (effective) strength of a rock unit, and (iv) how much force or stress is required to generate observed folds and necks, particularly on the lithospheric scale? 2 Folding

Theoretical results
Theoretical studies and analytical solutions invariably involve simplifications but can provide the best insight into the fundamental parameters controlling a mechanical process.The aim of such studies is thus to determine the (hopefully) small number of (non-dimensionalised) controlling parameters and to investigate their specific influence.

Single-layer folding
Some of the first mathematical studies on bending of beams were performed by Galilei (1638), who studied the strength of beams under beam-orthogonal loading (Fig. 8a).A beam is a 2-D layer that is much longer than thick and a number of simplifications can therefore be made for the geometrical description of the bending.A first beam theory was developed in the 18th century with major contributions from Euler and Bernoulli and is hence often termed the Euler-Bernoulli beam theory (e.g.Timoshenko, 1953;Szabo, 1987).It is assumed that the central (neutral) line in the beam is neither extended nor shortened and that the inner side of the beam is shortened while the outer side is extended; i.e. there are both layer-parallel extensional and compressional strains in the beam due to bending.A major result of the Euler-Bernoulli beam theory is that it can relate the layer-parallel strain, ε xx , due to bending of the beam to the amplitude, A, of the deflection of the beam (Fig. 9) In this equation, y is the orthogonal (vertical) coordinate measured from the middle line of the beam and x is the coordinate along the beam (Fig. 9).The second spatial derivative represents the curvature of the beam.Historically, bending and buckling of beams have first been studied for elastic materials, which is why we here also first present some fundamental results for elastic materials that have then later been applied to viscous materials.For elastic material, the total layer-parallel stress due to bending is σ xx = Kε xx , where K is a material property.For example, in a modern plane stress formulation, it corresponds to E/ 1 − υ 2 , with E being Young's modulus (with units of Pa) and υ being Pois-son's ratio (dimensionless).Frequently used symbols with a consistent meaning throughout the text are listed in Table 1.
Using the above expressions for ε xx (Eq. 1) and σ xx the bending moment associated with the flexure of the beam is where H is the thickness of the beam and D is usually termed the flexural rigidity.If the beam is also compressed, then a horizontal compressional load, F , is present within the layer (with units of Nm −1 in 2-D).Euler (1744) solved this first buckling problem by equating the flexural moment due to bending of the beam, M, to the moment caused by compression of the deflected beam, FA (Fig. 9), to give The famous result of Euler (1744) for the smallest load for which the beam buckles, the so-called Euler load (see also, for example, Bazant and Cedolin, 1991), is then the beam are fixed in vertical and horizontal position but can rotate freely.For loads smaller than F E the beam does not buckle, assuming that it is initially perfectly straight.Equation (3) describes the balance of moments acting on a compressed beam that can deflect freely because the beam is not embedded in another material and gravitational stresses arising due to the deflection are also not considered.If there is an additional stress, q, which resists the vertical deflection of the beam, either due to an embedding medium or gravity, then Eq. (3) will be expanded to (e.g.Smoluchowski, 1909;Biot, 1961) Horizontal stress due to bending Horizontal stress due to compression (5)

+ q
Vertical stress due to embedding medium and/or gravity = 0 with the convention here, opposite to Biot (1961), that q is positive when acting against the positive direction of the deflection A. Equation ( 5) represents an approximate force balance equation that is 2-D, and the derivation of this approximate equation from the full set of 2-D force balance equations is given in Appendix A. In Eq. ( 5), D is assumed constant and all terms (summands) have units of Pa.In a geological context, the above equation was probably first solved by Smoluchowski (1909), who considered crustal folding and used q = ρgA, where ρ is the crustal density and g the gravitational acceleration (see Sect. 2.1.3on lithospheric folding).The results of Smoluchowski were re-derived by Goldstein (1926), who also applied them to folding of the Earth's crust.In engineering applications, the resistance of an elastic medium below the layer (often termed the foundation) commonly has the form q = bA, where b is termed an elastic foundation modulus.The resistance of an elastic foundation is hence directly proportional to the deflection, A. The terms describing the resistance of an elastic foundation and the resistance due to gravity are mathematically identical if we identify b with ρg.Equation ( 5) is a 1-D equation that can be used to study 2-D folding (buckling) because of the assumption that the horizontal strain can be expressed by the second spatial derivative of the vertical deflection, A (Eq. 1).Many studies on folding, involving elastic, viscous and viscoelastic material (see below), are based on this simplified equation and we refer to the approach of using such a simplified "beam equation" or "thin-plate equation" as the thin-plate approach.The thinplate approach can also be applied to 3-D folding of plates.The thin-plate equations for 2-D and 3-D have been continuously developed and derived in the 18th, 19th and 20th centuries (e.g.Timoshenko, 1953;Szabo, 1987).The derivation of a general thin-plate theory applicable for 3-D buckling of plates was finalised by Kirchhoff (1850).Sander (1911) observed that, for multilayers within a quartz phyllite, the size of individual folds is related to their layer thickness and that folds become systematically smaller as their layer thickness becomes thinner (e.g.Fig. 10).He termed this observation the "rule of fold size" (in original German: Regel der Stauchfaltengrösse).Several authors applied variations of Eq. ( 5) to folded rocks to derive formulas for the size or wavelength, L, of folds (e.g.Gunn, 1937;Kienow, 1942).These studies applied a foundation term, q, which is proportional to A but independent of the wavelength, L, of the deflection.Biot (1937) showed that if the embedding medium is more correctly considered as a 2-D elastic half-space, then the stress resistance of the embedding medium will be where b is here a material parameter.The importance of this result is that the resistance of the embedding medium depends not only on the amplitude of the deflection but also on the wavelength of the deflection (Fig. 11).
Based on Biot's correspondence principle (1954,1956,1961), the thin-plate equation for elastic material can be easily applied to viscous and viscoelastic materials (Biot, 1957).For incompressible viscous material, the horizontal total stress to calculate the moment, M, is σ xx = 4ηD xx , where η is the viscosity and the strain rate due to bending, D xx , can be calculated from the time derivative of Eq. ( 1) The factor 4 for the total horizontal stress (σ xx = −P + 2ηD xx with P being the pressure) arises because it is assumed that the layer can expand freely vertically, so that σ yy = −P + 2ηD yy = 0, which yields P = 2ηD yy .Using the incompressibility condition D yy = −D xx yields P = −2ηD xx and σ xx = 4ηD xx (Biot, 1961).From σ xx = 4ηD xx , it also follows that the horizontal load is F = 4η Dxx H , where Dxx is the strain rate due to bulk shortening (i.e. the shortening between the two ends of the layer, indicated with an overhead bar).In passing, it should be noted that there is always a "tectonic" or "dynamic" overpressure in the shortening layer that is directly proportional to the viscosity η and bulk shortening rate Dxx in the layer (P = −2η Dxx ; e.g.Mancktelow, 1993Mancktelow, , 2008)).For extension of the layer, the sign is simply reversed and there is an underpressure of equal magnitude, an observation that is relevant when considering why boudins often show more brittle behaviour.Assuming that the viscous layer is embedded in a viscous half-space, the net resistance of the embedding medium acting on the layer q = −4η M 2π L dA dt (Biot, 1961; see Appendix B) with η M being the viscosity of the embedding medium.The thin-plate equation for folding of a viscous layer embedded in a viscous medium is then (Biot, 1961) In direct analogy to Eq. ( 5), the first term corresponds to the stress due to bending of the layer, the second to the stress due to compression, and the third to stress due to the resistance of the surrounding medium.To solve the above partial differential equation (PDE) one assumes a solution of the form (Biot, 1961) for which the derivatives with respect to time, t, and coordinate, x, can be taken.The parameter α is the non-dimensional dynamic amplification rate of any initial deflection A 0 (the kinematic component of amplification is ignored in the thinplate approach; see below) and the minus sign in the exponent accounts for the convention that mathematically a shortening strain rate is negative (Fig. 11).Substituting Eq. ( 9) into Eq.( 8) and taking the derivatives transforms Eq. ( 8) into an algebraic equation of the form  which can be simplified by introducing s = 2π H /L (the non-dimensional wavenumber) and R = η/η M (the viscosity ratio) to The amplitude, A, the bulk strain rate, Dxx , and one term s could be dropped from Eq. ( 10) because they are multiplied with every term in Eq. ( 10).When the algebraic Eq. ( 11) is solved for α it provides an expression for α as function of L (within s) α = 1 As seen from plots of this relation in Fig. 12, all wavelengths are amplified but there is a wavelength, the so-called dominant wavelength, L d , for which α is at a maximum.The value of L d can be found by taking the derivative of α in Eq. ( 12) with respect to L, setting this derivative to zero and solving for L, which yields from which it is immediately obvious that the dominant wavelength is directly proportional to the layer thickness and increases with increasing viscosity ratio, R, of layer to matrix (Fig. 12).This famous dominant wavelength expression was first derived by Biot (1957) using the thin-plate approach described above.The maximum amplification rate corresponding to L d is found by substituting Eq. ( 13) into Eq.( 12) and is This value also increases with the viscosity ratio.Eq. ( 14) already indicates that the thin-plate approach is inaccurate for small values of R because α d ≈ 1.21 for R = 1 but for R = 1 (no viscosity difference between layer and matrix) the amplification rate should be zero (since passive layer thickening is not considered in the thin-plate approach).
For comparison with a viscous layer, the dominant wavelength for an elastic layer embedded in a linear viscous medium is L d = 2π H (G/ σe ) 1/2 and the corresponding maximal amplification rate is α d = σe ( σe /G) 1/2 / (6η M ), where G is the shear modulus of the layer and σe is the elastic compressive stress in the layer (Biot, 1961;Schmalholz and Podladchikov, 2001a).For an elastic layer, the dominant wavelength is independent of the viscosity of the embedding medium.If the layer is viscoelastic and its rheology is described by a Maxwell model (i.e.elastic and viscous element connected in series), then the layer will behave as effectively viscous for λ < 1 and effectively elastic for λ > 1, whereby λ is the ratio of the dominant wavelength for a viscous layer to the dominant wavelength for an elastic layer; i.e. λ = (R/6) 1/3 ( σe /G) 1/2 (Schmalholz and Podladchikov, 1999).If the viscoelasticity of the layer is described by a viscous and elastic element connected in parallel (Kelvin model), then the layer will behave as effectively viscous for λ > 1 and effectively elastic for λ < 1 (Schmalholz and Podladchikov, 2001b).
As discussed in some detail by Biot (1961), the selectivity of amplification, and therefore the tendency to develop a clear sinusoidal form with a wavelength approximating that of the dominant wavelength, depends on the relative bandwidth of the amplification rate curve, which he defined as the wavelength difference at half the maximum amplification , which is non-dimensionalised by dividing it by the product of matrix reference viscosity times bulk deformation rate (η M Dxx ).For the initial geometry, the value of σ I I in the layer is close to the absolute value of the basic state deviatoric stress τxx = 2η ref Dxx .With progressive folding and necking, the average stress of the layermatrix system decreases.For necking, high stresses are localised in the neck.The corresponding structural softening of the folding and necking simulations is shown in Fig. 18.
rate L divided by L d (Fig. 6 in Biot, 1961).He found that and that the selectivity therefore depends only on the maximum amplification rate (and thus the viscosity ratio) but that this dependence is relatively weak, as can be qualitatively seen in Fig. 12.At lower viscosity ratios (e.g. 15 in Fig. 12a), fold trains will be more irregular and the initial perturbation geometry will have an increasing influence (Abbassi and Mancktelow, 1990Mancktelow, , 1992;;Mancktelow and Abbassi 1992;Mancktelow, 2001).It should be noted that the analysis presented above is specifically for infinitesimal amplitudes, so that the dominant wavelength and amplification rate derived represent initial values when the fold amplitude is very small.A different approach to derive L d was presented by Ramberg (1962).He considered the full 2-D force balance equations of a viscous fluid and solved the equations with a stream function approach (see Appendix B).The stream function approach provides the complete 2-D velocity and stress fields for a viscous layer embedded in a viscous medium.Ramberg considered a flat layer with a sinusoidal, geometrical perturbation of the layer interface (Fig. 11a).From the corresponding stress field, Ramberg calculated the horizontal force required to fold the layer and showed that the force is minimal for a certain perturbation wavelength.The wavelength that minimises the force corresponds to the dominant wavelength.If we consider an infinitesimal horizontal displacement due to the applied force, then the dominant wavelength will also correspond to the sinusoidal geometry for which the required work (force × displacement) to deform the layer is minimised.The stream function approach provides an expression for L d that is identical to the one of Eq. ( 13) (Ramberg, 1962).Ramberg (1962), in his Eq. ( 32), has a factor of 4 in the expression for L d , because he used the half-layer thickness for H .
The stream function approach was later used by Fletcher (1974) to perform a hydrodynamic stability analysis for single-layer folding of a power-law viscous layer.Detailed mathematical descriptions of the stream function approach in combination with a hydrodynamic stability analysis are given in Fletcher (1974Fletcher ( , 1977)), Smith (1975Smith ( , 1977)), Johnson and Fletcher (1994) and Pollard and Fletcher (2005).We term this approach here stability analysis, but this approach has also been termed perturbation method or thickplate analysis.We do not use the term thick-plate analysis to avoid confusion with the thick-or shear-deformable plate theory, which is an elaboration of the thin-plate equation that considers also shearing within the layer (e.g.Wang et al., 2000).As will be shown later, the same stability analysis can also be applied to study necking (Smith, 1975(Smith, , 1977)).The stability analysis assumes that geometrical perturbations are superposed on a flat layer that is shortening and thickening by pure shear (the basic state, Fig. 11).In a stability analysis, the initial deflection, A 0 , of the thin-plate approach corresponds to the amplitude of a sinusoidal geometrical perturbation superposed on a flat layer boundary.The stability analysis provides an expression for the time derivative of A of the general form where the negative sign accounts for the convention that shortening strain rates are negative.In the stability analysis, the kinematic (or passive) amplification velocity due to the basic state pure shear thickening is taken into account and is given by− Dxx A. This corresponds to a passive growth rate of 1, which is added to the dynamic growth rate α within the square brackets of Eq. ( 16).Kinematic amplification is neglected in the thin-plate approach because the layer thickness is assumed to be constant during folding.The exact infinitesimal amplitude solution from stability analysis for the dynamic amplification rate α is given by (Fletcher, 1974(Fletcher, , 1977;;Smith, 1977), where s is again the non-dimensional wavenumber s = 2π H /L as defined above.
A so-called closed form solution for the dominant wavelength cannot be derived because the mathematical expression in Eq. ( 17) is too complicated.The above expression for α can be simplified by performing a Taylor expansion for L/H 1 and keeping only dominating terms in L/H .The resulting expression for α is identical to the one derived by the thin-plate approach (Eq.12) and of course the dominant wavelength is therefore also identical.In Fig. 12, the exact solution of Eq. ( 17) is plotted in comparison to the simplified thin-plate result.It can be seen that the thin-plate approximation always overestimates the amplification rate and the dominant wavelength is shorter.The thin-plate solution becomes a better approximation with increasing viscosity ratio.Fletcher (1974) also derived a solution for α for powerlaw viscous fluids, which is given by his Eq. ( 8) as q(k) in that publication.A flow law for a power-law viscous fluid typically has the general form xx where τ xx is the horizontal deviatoric stress, D xx is the horizontal strain rate, B is a material property and n is the power-law exponent.The general solution for the amplification rate of folding, and also for necking, of power-law viscous layers embedded in power-law viscous medium is (Smith, 1977;Pollard and Fletcher, 2005 For folding the signum function of the bulk shortening rate, sign Dxx = θ = −1 and for necking θ = 1; n and n m are the power-law exponents of the layer and embedding medium, respectively.Plots of the amplification rate against L/H in Fig. 13 show that, in comparison to the linear viscous case, the amplification rate is greater, the dominant wavelength is shorter and the selectivity is greater (i.e.narrower curve).In the exact power-law infinitesimal amplitude solution, there is oscillation in the amplification rate curve as L/H approaches zero, with the amplification rate becoming negative and then increasing again with decreasing values of L/H .This oscillation was discussed by Johnson and Fletcher (1994) with regard to their Fig.8.2.Numerical models determining the amplification rate of folding in power-law viscous materials, as can be carried out with the Folder package of Adamuszek et al. (2016), reproduce this oscillating amplification rate curve (Fig. 13c, inset) and it is not an artefact.However, considering the very low amplification rates (in part negative) and normalised wavelengths, it has little practical influence on fold development in power-law materials.Furthermore, numerical results indicate that the oscillation disappears for increasing amplitudes (Fig. 13d, inset).The dominant wavelength for folding of a power-law viscous layer and the corresponding maximal amplification rate depend on both the viscosity ratio and the power-law stress exponent (Fig. 14).The lower limit for the dominant wavelength to thickness ratio is ∼ 4 (Fig. 14).
For power-law viscous flow, the approximate formula (for L/H 1) for the dominant wavelength is The approximate maximum value of α corresponding to L d is (Fletcher, 1974) A power-law viscous behaviour of the layer and/or the embedding medium (n and/or n m >1) hence increases the amplification rate and consequently the folding instability.For a power-law viscous flow law such as xx (B is a material property), the effective viscosity (ratio of stress to twice strain rate) can be expressed as a function of the square root of the second invariant of the strain-rate tensor, D I I , (Johnson and Fletcher, 1994) A viscosity formulation with an invariant is required so that the flow law is independent of the chosen coordinate system (so-called material objectivity).For 2-D incompressible flow the strain-rate invariant reduces to xy (e.g.Johnson and Fletcher, 1994).Using Eq. ( 21), the power-law viscous flow law is then τ xx = 2ηD xx .The coefficient B/2 is not a viscosity because it has units Pas 1/n and not Pas.It is useful to reformulate the above Eq.n is a viscosity with units Pas and is the reference viscosity corresponding to a homogeneous pure shear with a constant strain rate corresponding to Dxx , since for this homogeneous pure shear rate D I I = Dxx (Fig. 15).The viscosities used in the formulas for the dominant wavelength and corresponding amplification rates for power-law viscous material are typically the reference viscosities corresponding to the basic state of deformation, which is typically a homogeneous pure shear.Due to folding, strain rates deviate locally from the bulk shortening rate and cause local variation in the effective viscosity.Strain rates higher than the basic state strain rates cause a decrease in the effective viscosity and strain rates smaller than the basic state rate an increase in effective viscosity.
Stability analysis is performed with linear equations and hence the non-linear power-law flow law must be linearised.This is done by assuming that every quantity, such as strain rate (e.g.D xx ) or deviatoric stress (e.g.τ xx ) can be expressed as the sum of a quantity representing the basic, pure shear state of deformation (e.g.Dxx and τxx ) and a quantity representing the perturbation (deviation) flow from the basic state of pure shear (e.g.D xx and τ xx ), for example D xx ≈ Dxx + D xx (Fig. 15).The non-linear equations are then linearised by performing a Taylor expansion around the basic  (Burov et al., 1993).(a) Crustal geometry across the Rockall and Porcupine basins modified after Mohn et al. (2014) and Welford et al. (2012).The thinned continental crust has been interpreted to be the result of necking.1744), who studied the so-called elastic curves (elastica) and the critical load for buckling of loaded columns.state, keeping only those terms that are linear in the perturbation quantities.The resulting linearised flow laws for the perturbed, horizontal and shear stresses are then (e.g.Fletcher, 1974;Smith, 1977;Fletcher and Hallett, 1983;Johnson and Fletcher, 1994;Pollard and Fletcher, 2005)  where η R is the reference viscosity for the basic state of pure shear deformation given in Eq. ( 22).The flow laws for the perturbing flow, namely the deviation from pure shear, are thus anisotropic, because the viscosity for the normal stresses (but not the shear stresses) is divided by n, which is a result of the linearisation as a Taylor expansion to first-order terms.This can also be seen qualitatively in Fig. 15, reproduced in a modified form after Fig. 3 of Smith (1977).The implicit anisotropy in power-law viscous materials means that the effective viscosity for perturbing normal strains is smaller than for perturbing shear strains.The anisotropy occurs because the second strain-rate invariant, D  (Smith, 1977).The anisotropy in the perturbing flow is responsible for the difference between linear viscous and power-law viscous folding (and also necking; Smith, 1977).A detailed derivation of the linearisation of the power-law flow law and the separation into basic state and perturbing flow can be found in Pollard and Fletcher (2005;their Sect. 11.2.3).The impact of anisotropy on folding has been studied numerically by, for example, Mühlhaus et al. (2002) and Kocher et al. (2006Kocher et al. ( , 2008)).A note on nomenclature: for linear viscous fluids (n = 1) the stress continuously increases with increasing strain rate and the viscosity is constant.For power-law viscous fluids with n > 1, the stress also continuously increases with in-  Approximate results are based on the thin-plate approach (Eq.12) and exact results (for infinitesimal amplitudes) are based on the stability analysis (Eq.17).
As an example, in (a) the dominant wavelength to thickness ratio, L d /H , and the corresponding maximal dimensionless amplification rate, α d , are also indicated for the thin-plate solution.
creasing strain rate but the effective viscosity continuously decreases with increasing strain rate (Eq.21, Fig. 15).In the geological literature, such power-law viscous fluids have been termed both strain-rate softening (due to the viscosity decrease; Smith, 1977) and strain-rate hardening (due to the stress increase; Schmalholz and Maeder, 2012).We use here the terminology strain-rate hardening, which follows the nomenclature in the material science literature (e.g.Ghosh, 1977).The term softening is usually applied to flow laws for which the stress decreases with increasing strain (i.e.strain softening) or strain rate (i.e.strain-rate softening).Strain-rate softening in power-law viscous materials is then expressed by a negative power-law exponent (n < 0; e.g.Montesi and Zuber, 2002).Stability analysis is of great practical importance in nearly all branches of mechanics because it shows whether mathematically and mechanically correct solutions are actually possible (or stable) in nature.For example, pure shear short-ening and thickening of a perfectly straight (rectangular) and competent viscous layer embedded in a less viscous medium, which takes place without folding, is a mathematically and mechanically correct solution.However, this solution is not stable because any small geometrical perturbations, which are always present in natural materials, amplify with faster velocities than the corresponding pure shear velocities and cause folding.Homogeneous pure shear deformation of a competent viscous layer therefore does not occur in nature, although the mathematical solution is correct.Stability analysis is thus essential to determine whether correct mechanical solutions are applicable to natural processes.For pure shear shortening of a layer of thickness H with geometrical perturbations of amplitude A, the kinematic (passive) velocity due to shortening/thickening is V k = − (H /2 + A) Dxx (assuming the vertical coordinate is zero in the middle of the layer) and the dynamic velocity due to active folding is V a = −α Dxx A. Both velocities are equal when A/H = 1/ [2 (α − 1)] (Schmalholz and Podladchikov, 2001a), which corresponds to the transition between passive kinematic shortening/thickening and active ("explosive") folding (cf.Ramsay, 1967, Figs. 7-37).For active folding, V a > V k and hence A/H > 1/ [2 (α − 1)].For example, for α = 20, active folding occurs for A/H 0.026; i.e. when the amplitude is larger than ∼ 2.6 % of the layer thickness.If, for α = 20, the initial amplitude A 0 is < 0.026 H then an initial deformation phase dominated by layer thickening occurs, but if A 0 is > 0.026 H then the deformation will be immediately dominated by explosive folding.
The derivation of the theoretical dominant wavelength and corresponding amplification rates for linear and power-law viscous materials are only strictly valid for infinitesimal amplitudes but have been verified by both laboratory deformation experiments and numerical simulations at small fold amplitudes (e.g.Biot et al., 1961;Mancktelow and Abbassi, 1992;Schmalholz, 2006).The further development of fold geometries and the "selection" and "locking" of a fold wavelength have been discussed and analysed in a number of studies (Sherwin and Chapple, 1968;Fletcher, 1974;Fletcher and Sherwin, 1978;Hudleston and Treagus, 2010).However, a detailed discussion of their results is beyond the scope of this review.
The theories outlined above are valid for single layers with the imposed layer-parallel displacement, velocity or load directly applied at the ends of the layer.However, folded veins or dikes are not infinitely long and therefore a layerparallel load is usually not directly applied at their ends.Schmid et al. (2004) considered folding of power-law viscous layers with a finite length, essentially corresponding to isolated inclusions removed from the lateral boundaries and embedded in a more extensive linear viscous medium.They considered the layers as ellipses with large aspect ratios (i.e.length to thickness ratio).If the viscous medium is shortened in a direction parallel to the long axis of the ellipse, the stresses in the surrounding viscous medium will Figure 14.Contours of the ratio of dominant wavelength to layer thickness (thinner lines) and dimensionless amplification rate (growth rate; thicker lines) for folding (a) and necking (b) as a function of the reference viscosity ratio (R) and the power-law stress exponent of the layer (n).The embedding medium is linear viscous (n m = 1).For folding the amplification rates are significant even for small values of the power-law exponent (< 5), whereas for necking the amplification rates are insignificant (< 10) for power-law exponents < 10.  cause deformation and folding of the isolated, elongate ellipse.Finite-length single-layer folding is controlled by the dimensionless ratio D a = η e / (η M a), where a is the aspect ratio, η e = η 1 + 2η/η M a n−1 for n ≥ 1 (with η e = η for n = 1), and η is the effective viscosity of the layer calculated with the bulk shortening rate of the medium (i.e.D I I = Dxx in Eq. 21).D a controls whether the theories assuming an infinite layer (for D a 1) can be applied or whether a modified theory has to be used.The modified theory is based on Muskhelishvili's complex potential method and shows that the dominant wavelength for the general case of a finite length layer is and the corresponding maximum amplification rate is For a linear viscous layer, L d is identical to the one for an infinite layer (Eq.13).The analysis further shows that for finite-length layers (D a 1), the amplification rates are substantially reduced relative to the bulk shortening rate and that the growth of large wavelength to thickness ratio folds is suppressed.
As can be seen from Eq. ( 9) or from a time integration of Eq. ( 16), both the standard thin-plate approach and the stability analysis give an exponential growth of the fold amplitude with time, whereby the amplification rate is constant.This exponential growth is the result obtained if the mathematical analysis is carried out to first-order in slope.The solution in Eq. ( 12) provides the amplification rate for all possible wavelengths (or Fourier components) and with this solution the evolution of fold shapes can be calculated analytically for certain initial layer geometries, such as an initial isolated bell-shaped geometry (Biot et al., 1961).The analytical treatment is possible because the bell-shaped geometry can be represented as an infinite cosine series by a known Fourier integral expression where x and y are the horizontal and vertical coordinates (with the origin on the central symmetry axis; see Fig. 16), k = 2π/L is the wavenumber, b is here the amplitude of the bell-shape on its central symmetry axis (i.e.y = b for x = 0) and a controls here the width of the bell-shape.If the wavelength components are amplified independently, the amplified shape of the perturbation can be calculated at any time by simple linear superposition using (Biot et al., 1961) where α is the amplification rate given by Eq. ( 12) and t is time.The integral in Eq. ( 27) can be calculated numerically and Eq. ( 27) allows for the fold shape to be calculated for any time t (or bulk strain, − Dxx t; Fig. 16).In Fig. 16, the analytically predicted fold evolution is compared with the evolution calculated by a numerical simulation based on the finiteelement method.The comparison shows not only that the analytical prediction is valid up to moderate fold amplitudes (or fold limb dips), but also that the analytically predicted amplification becomes too large once the amplitudes exceeded a certain value.In the case of the bell-shaped perturbation of Fig. 16, the infinitesimal amplitude, linear superposition approach is a good approximation up to ca. 20 % bulk shortening, with a maximum limb dip of ca. 25 • (Fig. 16c).However, the limb dip and the amount of bulk shortening up to which the exponential analytical solution is valid depends on both the initial amplitude and the viscosity ratio (see discussion of Eq. 30 below).As noted above, there is always a passive or kinematic component of layer-parallel shortening and thickening that In (e) the yellow line corresponds to the case where arc-length shortening rather than bulk shortening is used in the solution of Biot et al. (1961;Eq. 27), as discussed in the text.At lower values of shortening (a, b, c) this result does not differ significantly from that for the uncorrected Biot or LAF.
dominates at very small amplitudes before "explosive" folding manifests itself due to the exponential amplification rate.Indeed, as also presented above, in the limit of a perfectly planar layer, folds never develop regardless of the viscosity ratio and the layer will simply shorten and thicken.The result of this effect, which is not specifically considered in the thin-plate or stability analysis solutions above, is that the non-dimensional wavenumber, s, of a sinusoidal fold will increase during bulk shortening (Sherwin and Chapple, 1968) at a rate given by ds/dt = −2 Dxx s (e.g.Fletcher, 1974).Combining this with differential Eq. ( 16) for the instantaneous growth in fold amplitude and numerically integrating through time allows for the calculation of a wavelength of maximum amplification for a particular imposed bulk shortening (Sherwin and Chapple, 1968;Fletcher 1974;Fletcher and Sherwin, 1978;Johnson and Fletcher, 1994;Adamuszek et al., 2013b).This wavelength was termed the "preferred wavelength" by Johnson and Fletcher (1994).
It is of course expected that the initial exponential amplification with a constant rate of Eq. ( 16) must eventually break down, because fold amplitudes cannot grow exponentially at the same rate forever during shortening; the initial dynamic rate (α) must effectively decrease, but the passive component (the "1" in Eq. 16) will remain.Indeed, both numerical calculations (Chapple, 1968;Zuber and Parmentier, 1996) and analogue experiments (Hudleston, 1973;Mancktelow and Abbassi, 1992) showed that exponential amplification with a constant value of α only occurs when fold amplitudes and limb dips are small and that fold amplification slows down with increasing fold amplitude and limb dip.Analytical solutions for the finite amplitude evolution of single layer folds have been derived by, for example, Mühlhaus et al. (1994) and Schmalholz and Podladchikov (2000).The analytical solutions are derived by considering geometrical non-linearities due to finite amplitudes in the thin-plate Eq. ( 8).For example, Schmalholz and Podladchikov (2000) argued that the bulk shortening rate is no longer a good measure for the shortening rate of the layer (and hence the load F ) when the amplitude is finite because, due to layer deflection, the layer is shortening less than a horizontal line parallel to the direction of bulk shortening.Shortening of the layer at finite amplitudes is more accurately described by the shortening of the fold arc length, S. Therefore, the value of Dxx , which controls the layer-parallel load, is not given by the change of the horizontal wavelength, namely Dxx = 1 The above relation can be derived with a Taylor expansion for the geometrical formula for the arc length (Schmalholz and Podladchikov, 2000).Due to the resulting non-linearity (i.e. the A 2 term) in the thin-plate equation, an explicit solution for A as a function of the shortening (or time t) cannot be derived but an implicit solution is possible (Mühlhaus et al., 1994;Schmalholz and Podladchikov, 2000).The dimensionless finite amplitude solution for amplification of the dominant wavelength component, as derived by Schmalholz and Podladchikov (2000), is then This finite amplitude solution provides an algebraic relationship between shortening (quantified by L 0 /L; subscript 0 indicates initial values) and the fold amplification (quantified by A/L; Fig. 17).The first term on the right-hand side including A/L represents the classical exponential solution for For n = 1 no necking instability is active and thinning is due to homogeneous pure shear thinning only.The time for complete thinning (H /H 0 = 0) is given by t/t c = 1/n (see text).
a constant dominant amplification rate, α d , reformulated as a power-law equation (e.g.Johnson and Fletcher, 1994, their Eq. 5.2.15c;Schmalholz, 2006).The second term represents the finite amplitude correction due to the change of the arc length, S, with respect to the wavelength, L. When amplitudes are small, then S ≈ L or S/L ≈ 1 (Eq.28), and the finite amplitude correction term is close to one.When amplitudes, and hence S/L, increase, then the finite amplitude correction term also increases.Specific values of L 0 /L can be calculated for specific values of A/L (from which corresponding values of S/L can be calculated from Eq. 28) and hence a curve A/L vs. L 0 /L can be plotted.A more accurate finite amplitude solution can be constructed if in Eq. ( 28) π 2 is replaced by π 2 / 1 + 3(A/L) 2 , which is a term calibrated with numerical simulations.The breakdown of the exponential amplification and the deviation from exponential amplification occurs approximately at a ratio of amplitude to wavelength of (Schmalholz and Podladchikov, 2000;Schmalholz, 2006) Assuming a sinusoidal fold shape, the ratio A/L corresponds to a maximal limb dip of atan (2π A/L) 180/π.Hence, using Eq. ( 14) for linear viscous folding, the breakdown of the exponential solution occurs at ∼ 24 • limb dip for a viscosity ratio of 25 and at ∼ 15 • limb dip for a viscosity ratio of 100.
For the non-sinusoidal, bell-shaped initial perturbation of Fig. 16, with a viscosity ratio of 75, comparison with the numerical model indicates that the approximate solution of Eq. ( 27) is very good up to a maximum limb dip of ca. 25 • (Fig. 16c).The arc length in this example is determined by the finite shape and therefore by a non-linear function of the amplitude and wavelength of the full infinite cosine Fourier series.It is clear that a simple linear superposition approach, Solid Earth, 7, 1417-1465, 2016 www.solid-earth.net/7/1417/2016/as presented by the theoretical curve in Fig. 16, must therefore break down at finite amplitude.Adamuszek et al. (2013) presented a comprehensive finite amplitude solution for both single sinusoidal folds and for multiple waveforms represented by a Fourier series, such as the case for the bell-shaped perturbation (their Appendix B).They combined the results of stability analysis with the two previously proposed corrections to the original linearised theories outlined above.Their large amplitude folding (LAF) model includes both the effect of homogeneous shortening and thickening on the non-dimensional wavenumber and the need to consider the shortening rate of the arc length, rather than the shortening rate imposed at the boundaries.The LAF model is described by a coupled system of ordinary differential equations involving time derivatives, so that the time evolution has to be calculated numerically.The LAF solution allows for quite exact prediction of fold geometries up to moderate limb dips of ∼ 20 • or more.However, Adamuszek et al. (2013) also showed that 2-D numerical simulations are required to accurately predict the shape of folds with larger limb dips or finite amplitudes, or more complicated geometries due to initial irregularities.
The improvement in fit resulting from these additional corrections can be seen for the bell-shaped perturbation example in Fig. 16.As noted previously, the simple solution of Eq. ( 27) breaks down at limb dips of ∼ 25 • .The next step to improving the result is made by considering arc-length shortening rather than the bulk shortening.In the most elementary manner, this can be achieved at any time by first calculating the shape using Eq. ( 27) with − Dxx t, determining the deformed arc length S along the layer, calculating − ln (S/S 0 ), where S 0 is the initial arc length, using this rather than − Dxx t as input to Eq. ( 27), and iterating until the arc length no longer changes significantly.In practice, convergence is fast and the improvement in fit of the predicted shape compared to the numerical result is significant, providing good results up to limb dips of ∼ 40 • (yellow line on Fig. 16e), at least for the parameters used in this particular example.The next step in complexity is the full LAF approach (magenta line in Fig. 16).At some stage the transition must still be made to a fully numerical approach, which has recently been made easy with the publication of the Folder package (Adamuszek et al., 2016).However, the real advantage of the original solution of Biot et al. (1961;Eq. 27) is its simplicity in that it is described by a single equation without any derivatives and is, hence, considerably more transparent than the coupled system of ordinary differential equations of the more exact LAF model.Despite its limitations, overall fold shapes (e.g.localised or regular, symmetric or asymmetric) can be modelled with the simple solution of Eq. ( 27) up to limb dips of ca. 25 • , so that it remains very useful for estimating first-order fold-shape evolution and for providing fundamental insights into fold amplification.
The finite amplitude solutions outlined above are based effectively on an elaboration of the linear thin-plate approach.
It is also possible to further elaborate the linear first-order stability analysis in order to obtain results that are more accurate for finite, but still small amplitudes, by considering sinusoidal terms up to third-order (Fletcher, 1979;Johnson and Fletcher, 1994).This higher-order stability analysis provides results which are valid up to moderate limb dips of ∼ 30 • and, in particular, provides more accurate fold geometries than the first-order analysis and the thin-plate approach, especially for small to moderate viscosity ratios ( 50).
One particular result of the finite amplitude solution is that the layer-parallel load, F , required to drive folding decreases with increasing amplitude and hence shortening.This decrease in F with increasing shortening is only due to geometrical effects and is usually termed structural softening (or geometric softening/weakening; e.g.Schmalholz et al., 2005;Schmalholz and Schmid, 2012).Structural softening can be quantified by determining the evolution of the effective viscosity of the entire layer-medium system.The effective viscosity of a rock unit consisting of competent layers embedded in a weaker medium during shortening with a constant bulk rate of deformation can be calculated as the ratio of the area-averaged stress (e.g. the second invariant of the deviatoric stress tensor) to the bulk rate of deformation.If shortening was accommodated by homogeneous pure shear at a constant rate, the layer would deform by homogeneous shortening and thickening, and the effective viscosity of the layered rock unit would remain constant.However, if folding takes place, the effective viscosity will decrease during bulk shortening, because the area-averaged stress is smaller during folding than during pure shear thickening (Figs. 4 and 18).
Implicit in developing the analytical solutions above, as well as in many analogue and numerical models, is that there are basically three stages of fold development.As is obvious from Eq. ( 16), the amplification velocity dA/dt is directly proportional to both A and 1 + α.If the initial amplitude of irregularities in the layer is small and/or the dynamic amplification rate is low (low viscosity ratio η/η M ), there will be an initial stage of dominantly layer-parallel shortening and thickening without obvious fold development.Eventually, as the amplitude increases, this will be followed by an "explosive" period of growth to finite amplitude (Ramsay, 1967), during which the bulk shortening is predominantly accommodated by the folding.As discussed already above, this transition occurs around and Podladchikov, 2001a).Explosive fold growth cannot go on indefinitely, and the dynamic amplification rate will decay toward zero, especially at limb dips > 45 • , in which case the limbs are now infinitesimally stretched rather than shortened.Based on the finite amplitude solution of Eq. ( 29), Schmalholz (2006) therefore separated the amplification of single layer folds into three stages: (1) a nucleation stage, during which the amplification velocity is continuously increasing (acceleration); (2) an amplification stage, in which the amplification velocity is decreasing but the dynamic amplification rate is still greater than the kinematic amplification rate (i.e.α > 1); and (3) a third, kinematic stage, where the dynamic amplification rate is less than the kinematic one (i.e.α < 1) and fold amplification is predominantly passive, namely controlled by the basic state of pure shear deformation.
An important, and still controversial question, is the magnitude of the effective viscosity ratio implicit in observed natural single-layer folds (e.g.Hobbs et al., 2008;Schmid et al., 2010).Using Eq. ( 16) for the amplification of the dominant wavelength with the maximal amplification rate, integrating it in time with A 0 being the initial amplitude (at t = 0) and rearranging the terms yields The product Dxx t quantifies the bulk shortening.Folding could be considered significant if amplification A/A 0 > 10 can be achieved within 20 % shortening ( Dxx t = −0.2),which requires that α d > 10.Although the development of distinct finite-amplitude folds still depends on the initial magnitude of irregularities in the layer (A 0 ), it is a reasonable conclusion that the dominant amplification rates, which can be calculated for various scenarios described above, should be at least larger than 10 for significant folding to take place.
If the material properties are known, one can then calculate and predict whether significant folding would take place or not.On the other hand, if one can observe folds, it can be deduced that α d was at least larger than 10 and make estimates for the material parameters, such as estimating the minimum viscosity ratio for linear viscous folding.For example, folding of a linear viscous layer in a viscous medium requires a viscosity ratio larger than ∼ 25 to achieve α d > 10 (see Eq. 20).However, if both layer and embedding medium are power-law viscous, then for n, n m = 3 an effective viscosity ratio of 8 will be sufficient (Eq.20; see also Schmid et al., 2010).
As a concise summary, the simpler solutions for the dominant wavelength and corresponding maximal amplification rate derived above are listed in Table 2.
Many natural folds seem to have a cylindrical structure; i.e. their lateral extend along the fold axis is considerably larger than their wavelength.For such cylindrical fold shapes the above discussed 2-D solutions are applicable.However, other natural folds are clearly not cylindrical (such as dome and basin fold shapes) or have been formed by two consecutive events of deformation.For such fold shapes, 3-D models are required.Three-dimensional viscous folding has also been studied analytically with both the thinplate (Ghosh, 1970) and the stability analysis for linear viscous (Fletcher, 1991) and power-law viscous fluids (Fletcher, 1995;Mühlhaus et al., 1998).The finite amplitude solution in Eq. ( 29) has been elaborated for 3-D folding to take into account the ratio of the two orthogonal, layer-parallel shortening (or extension) rates and was verified with 3-D numerical simulations (Kaus and Schmalholz, 2006).

Multilayer folding
Multilayer folds (Figs. 5,10) are more frequent in nature than single-layer folds.The mechanics of elastic multilayer buckling was already discussed in Smoluchowski (1909) and we provide here only a short review.Viscous multilayer folding is also discussed in detail in Johnson and Fletcher (1994) and a review of multilayer folding can be found in Hudleston and Treagus (2010).
The thin-plate approach shows that if a multilayer is composed of m layers with identical thickness then the dominant wavelength will increase by the factor of m 1/3 (Biot, 1961).Ramberg (1962) applied the 2-D stream function approach to viscous multilayer folding and introduced the concept of contact strain.This concept states that strain due to folding is negligible outside a zone about one initial wave-Solid Earth, 7, 1417-1465, 2016 www.solid-earth.net/7/1417/2016/ Large-scale folding (Power-law viscous layer resting on inviscid medium) Detachment folding (Power-law viscous layer resting on linear viscous layer with finite thickness H M ) Finite length power-law viscous folding (η e = η 1 + 2η/η M a n−1 , D a = η e /η M a and finite length solution valid for D a 1 Multilayer folding Few linear viscous layers (Multilayer embedded in viscous medium for m 2π(R/6) 1/3 ; m is number of layers) Many linear viscous layers (Multilayer embedded in viscous medium for m 2π(R/6) 1/3 ; m is number of layers) Internal linear viscous folding (Confined multilayer with weak layers of significant thickness.
Internal linear viscous folding (Confined multilayer with weak layers of insignificant thickness. Single-layer necking Power-law viscous necking (embedded in infinite medium; linear viscous solution for n, n M = 1 ) Large-scale necking (Power-law layer resting on viscous medium with exponentially decaying viscosity for typical crustal rheologies and densities; Fletcher and Hallet, 1983) length wide on either side of the folded sheet (Fig. 19).This means that layers spaced at a distance greater than the wavelength of embryonic folds do not interact and can hence fold independently (so-called "disharmonic" folds, e.g.Ramsay and Huber, 1987).An important first result of these multilayer studies is that a multilayer with many competent layers folds with a faster amplification rate than a single competent layer in the same medium; i.e. folding instability increases with the number of competent layers (Biot, 1961;Ramberg, 1963;Fig. 19).Another important result is that the dominant or preferred (Ramberg, 1962) wavelength increases but the selectivity decreases (i.e.increasing bandwidth) as the rela- .The two layers in the middle of the model have a spacing 7 times the layer thickness.These two layers do not form harmonic folds but they do influence each other, as can be seen from the passively folded marker lines between the two layers (b, c).The top layer develops single-layer folds that are effectively independent of the layer below, as shown by the fact that the thin passive lines between the top and second to the top layer remain almost straight.The results also show that amplification rates for multilayers are greater than for single layers, because fold amplitudes in the lowest four-layer stack are larger than amplitudes in the top single layer for the same bulk shortening (b, c).Fold wavelengths in the bottom four-layer stack are also larger than the ones in the top single layer.Spacing between the bottom layer and the model bottom and between the top layer and the model top is 16 times the layer thickness.Spacing between the fourth and fifth layer from the bottom and the top and second to the top layer is 20 times the layer thickness.The dominant wavelength to thickness ratio for single layers is ∼ 15 for R = 75.The spacing between the top two layers (i.e.20) is thus slightly larger than the dominant wavelength (i.e.15), so that the observed independent or "disharmonic" folding of these two layers agrees with the concept of contact strain (see text).
tive thickness ratio of the lower viscosity to the higher viscosity layers decreases away from infinity (the single-layer case considered previously above).Below a critical value there is no real wavelength selection and the preferred wavelength is as long as the body happens to be (e.g.see Fig. 5 of Ramberg, 1962;Fig. 4 of Ramberg, 1964;Figs. 7-40 of Ramsay, 1967).
In a series of articles, Biot developed a theory of internal buckling and multilayer folding (Biot, 1964a(Biot, , b, 1965a, b), b).In his theory of internal buckling, he considered multilayers as a confined anisotropic or stratified medium under compressive stress (Biot, 1964a(Biot, , 1965a)).This simplifies the mathematical analysis, because the individual deformation of each competent and incompetent layer is neglected.For a confined multilayer, the top and bottom boundaries remain straight so that the top and bottom layers in the multilayer cannot fold.Application of this internal buckling theory to confined viscous multilayers with competent and incompetent layers of equal thickness, H , and a viscosity ratio larger than 50, yields an approximate formula for the dominant wavelength of folds that develop in the confined multilayers of total thickness, H tot (Biot, 1964a): The striking result is that the dominant wavelength is independent of (or, in the non-approximated formula, extremely insensitive to) the viscosity ratio between competent and weak layers.Biot presented several modifications of the dominant wavelength in confined multilayers, depending on various underlying assumptions, for example (Biot, 1965a) where m is again the number of layers and a = H M / (H M + H ) with H M being the thickness of the incompetent layer.If a = 0, Eq. ( 33) reduces to Eq. ( 32), which means that the dominant wavelength of Eq. ( 32) represents multilayers, where the incompetent layers have essentially zero thickness but are incompetent enough that there is negligible shear stress between the competent layers (i.e. the contact between the competent layers is perfectly lubricated).
The results for the multilayer dominant wavelength presented above all show an extremely weak sensitivity to the viscos-ity ratio, with the immediate implication that multilayer fold geometries are not suitable for estimating viscosity ratios.Biot (1965b) further showed that in a homogeneous anisotropic (orthotropic) medium two essential types of internal instability can occur, corresponding to either folding or localised kink-band formation.Following Biot's theory of internal instability, Cobbold et al. (1971) performed analytical analyses and laboratory experiments for compression of confined multilayers representing a homogeneous, anisotropic material and could validate the application of the analytical results for an anisotropic medium to true confined multilayers consisting of competent and incompetent layers.They further concluded that (i) the instability due to multilayer compression is mainly governed by the degree of www.solid-earth.net/7/1417/2016/Solid Earth, 7, 1417-1465, 2016 anisotropy rather than rheological properties such as elasticity or viscosity, (ii) the form of the internal structure depends on the angle between layer orientation (or anisotropy orientation) and compression direction, (iii) sinusoidal folds (developing for low degrees of anisotropy) and conjugate kink bands (developing for high degree of anisotropy) are endmembers of a range of fold structures that can develop in anisotropic material, (iv) chevron folds (i.e.folds with long straight limbs and short angular hinges; e.g.Paterson and White, 1966) can form by either convergence of conjugate kink bands or by progressive straightening of limbs during amplification of initially sinusoidal folds and (v) the scale of an internal structure in an anisotropic (but statistically homogeneous) rock depends on the scale of the elements that cause the anisotropy.Johnson and Fletcher (1994) presented solutions based on stability analysis for selective amplification during lowamplitude folding for many examples of multilayers with different configurations of competent and weak layers and embedding medium.They showed that the strength of the folding instability generally increases with (i) the number of competent layers involved, (ii) increasing viscosity ratios between the competent and weak layers and (iii) larger thickness of the weak medium embedding the multilayer (see also Ramberg, 1962, his Fig. 12).They further show that the amplification of multilayer folds with all layers having free slip interfaces is significantly stronger than for multilayers with all layers having no slip (bonded) interfaces (Johnson and Fletcher, 1994).The difference in interface condition (free slip vs. bonded) is much more significant for folding of a multilayer than for folding of a single layer (Johnson and Fletcher, 1994).
A third-order stability analysis of viscous multilayer folding for small but finite amplitudes was performed by Johnson and Pfaff (1989) to study fold shapes in multilayers.They distinguished three end-member forms in multilayers: parallel, constrained and similar folds.Parallel (concentric) folds develop in multilayers confined by a soft medium, whereas constrained (internal) forms develop in multilayers confined by a very competent medium.Similar (chevron) forms develop if wavelengths are short relative to the thickness of the multilayer.Schmid and Podladchikov (2006) applied the stability analysis to show that folding of embedded multilayers can occur in essentially three modes: as an effective single layer, as a true multilayer or as real single-layer folding (Schmid and Podlachikov, 2006).They consider a multilayer embedded in a weak medium, so that the top and bottom layer in the multilayer can fold, which differs from the configuration of internal folding in confined multilayers considered by Biot (1965a) 13).These conditions for the transition between multilayer folding modes are approximations because the transitions are continuous and not sharp (Schmid and Podladchikov, 2006).For the effective single layer folding mode, the dominant wavelength of the multilayer, L dm , is a factor m (= number of layers) larger than L d (Fig. 20a).For the true multilayer mode, L dm is a factor m 1/3 larger than L d .For the real single layer folding mode L dm = L d , which agrees with the concept of contact strain of Ramberg (1962), because for H m > L d the individual folds are not influenced anymore by the deformation of layers above or below and effectively deform as if they would be single layer folds (Fig. 20a).For the effective and real single layer folding modes, the maximal amplification rates of the multilayer are identical to the corresponding ones for single layer folding (α dm = α d ; Fig. 20b).For the true multilayer mode, α dm is a factor m 2/3 larger than α d but values of α dm do not increase continuously with increasing number of layers because the value is limited by ∼ 0.83R 1/3 (Fig. 20b).The difference between a true multilayer and real singlelayer folding can be observed in the numerical simulation shown in Fig. 19.The wavelengths of individual layers developed in the bottom four-layer stack are approximately 1.4 times larger than the wavelengths developed in the top single layer, which agrees with the analytical estimate of 4 1/3 ≈ 1.6 (Fig. 19).Also, amplification rates of the four-layer stack are larger than that of the top single layer (Fig. 19).
A particular problem of multilayer folding is the mechanism of formation of asymmetric parasitic folds (S and Z folds) or "polyharmonic" folds (e.g.Ramberg, 1963Ramberg, , 1964;;Ramsay and Huber, 1987).As outlined above, studies have consistently shown that multilayers fold more strongly than individual single layers.However, in the field, individual small folds are often observed on the limbs of larger folds, indicating that the smaller, shorter wavelength folds developed in the thin layers must grow faster than the larger folds that involve more layers, so that the small folds can be amplified sufficiently to be observable on the limbs of larger folds (Treagus and Fletcher, 2009).Frehner and Schmalholz (2006) performed 2-D numerical simulations and showed that thin layers first develop symmetric folds during multilayer folding and that these folds are later transformed into an asymmetric shape due to relative shearing between fold limbs (Ramberg 1963;Frehner and Schmalholz, 2006).They argued that thinner layers have larger initial ratios of amplitude to thickness than thicker layers and that hence the explosive folding of thinner layers starts before explosive folding of the thicker layers.An alternative explanation was proposed by Treagus and Fletcher (2009), who studied analytically the amplification rates of individual layers within a multilayer for various configurations.They found that in viscous multilayers with a central thinner layer, the folds in the thin layer will initiate with greater amplification rates than larger folds of the whole multilayer if the multilayer is narrowly confined, and/or if the thin layer is the most competent layer (Treagus and Fletcher, 2009).
Some approximate solutions for the dominant wavelength and maximal amplification rates for multilayers are summarised in Table 2.

Lithospheric folding
Lithospheric folding is of general geodynamic importance because it demonstrates that large regions of the interior of tectonic plates are deformable.Internal deformation of tectonic plates contradicts a basic principle of plate tectonics sensu stricto, which states that tectonic plates are essentially rigid and deformation should only occur at plate boundaries.Continental lithospheric folding and necking can be considered as a specific and important component of continental tectonics.
Probably the first large-scale folding analysis was performed by Smoluchowski (1909), who considered an elastic beam and applied Eq. ( 5).Smoluchowski (1909) equated the load term q to ρgA, where ρ is the crustal density.For elastic beams under gravity, a critical load, similar to the Euler load, is required to cause buckling (Smoluchowski, 1909).However, if the beam is viscous it will fold no matter how small the applied compressive force (Biot, 1961).For a small compressive load, the fold amplification rate will be negligibly small and Biot suggested that a horizontal stress > 9 ρgH is required for folding to take place (Biot, 1961), where ρ is the density difference between the material below and above the layer and H is layer thickness.Applying this formula to folding of the oceanic lithosphere using ρ = 3300-1000 kg m −3 and H = 20 km gives an unrealistically high stress 4 GPa.This stress, if vertically integrated over the thickness of 20 km, corresponds to an unrealistically high force per unit length of ∼ 8 × 10 13 Nm −1 .All studies on elastic lithospheric folding have shown that the stress required for folding is unrealistically high and hence folding of the lithosphere was considered impossible (see, for example, the introductions in Ramberg and Stephansson, 1964;Lambeck, 1983a).However, McAdoo and Sandwell (1985) showed that for an elasto-plastic layer, with lithospheric strength based on experimental rock mechanics, the average stress required to fold the oceanic lithosphere is reduced to 600 MPa, which could be naturally realistic.They therefore concluded that the observed basement topography and geoid height in the northern Indian Ocean results from folding of the oceanic lithosphere, caused by the India-Asia collision.
During shortening of the crust and lithosphere, rock layers can deform by either distributed ductile creep (e.g.thickening or folding) or localised brittle faulting and plastic shearing (thrusting), corresponding to bulk viscous or brittleplastic flow, respectively.Several studies have therefore investigated the parameters that control whether shortening of rock layers is more likely to take place by folding or faulting (e.g.Johnson, 1980;Erickson, 1996;Simpson, 2009;Yamato et al., 2011).The question of "folding vs. faulting" is of particular interest for fold-and-thrust belts (e.g.Jura, Zagros or Appalachians).For example, Johnson (1980) applied the stability analysis described above to elasto-plastic, strainhardening layers and showed that during shortening of such layers folding is more likely than faulting for (i) multilayers and (ii) frictionless layer contacts.In many analytical studies involving plastic deformation, the plastic layer is characterised either by a representative, constant yield stress or by a large power-law stress exponent, which mimics a constant von Mises stress (e.g.Martinod and Davy, 1992;Martinod and Molnar, 1995).However, the pressure sensitive yield stress of rocks is usually represented by a Mohr-Coulomb failure criterion, which is difficult to treat with analytical methods.Hence, numerical simulations have been applied to quantify the parameters that cause folding or faulting in foldand-thrust belts (e.g.Simpson, 2009;Yamato et al., 2011).For example, Yamato et al. (2011) numerically calculated amplification rate vs. fold wavelength curves for sedimentary sequences.Based on these curves, they determined the folding-faulting boundary for the sedimentary sequences and applied the results to the Zagros fold belt.In this review, we focus on ductile folding studies and do not discuss in further detail the folding vs. faulting issue.Biot (1961) also derived the dominant wavelength for a viscous layer floating on an inviscid medium in the field of gravity, which can be written as This dominant wavelength solution was re-derived by Ramberg and Stephansson (1964) with a slightly different approach, and in addition was verified by them with laboratory experiments.In Eq. ( 34), Ar F is the Argand number for folding and represents the ratio of gravitational stress acting against folding to compressive stress driving folding (Schmalholz et al., 2002).The Argand number was originally introduced in a slightly different form for thin viscous sheet models (England and McKenzie, 1982).As mentioned above, the main difference between folding of an elastic layer and a viscous layer is that for an elastic layer a certain load must be exceeded to initiate folding; i.e. there is a critical load (e.g.Euler load) below which no instability appears, whereas for a viscous layer folding takes place for any compressive load.The formulation for the dominant wavelength for viscous and elastic folding under gravity is identical if the term 2η Dxx in the viscous case is replaced by half the compressive load applied to the elastic layer (Biot, 1961).However, if the load is small then the maximal amplification rate α d is less than 1 order of magnitude larger than the bulk shortening rate Dxx (i.e.α d < 10) and hence folding is insignificant.Values of α d should be at least >10 for folding to be significant and observable for typical tectonic shortening www.solid-earth.net/7/1417/2016/Solid Earth, 7, 1417-1465, 2016 of several tens of percent (see Sect. 2.1.1 above).Biot (1961) argued that 4η Dxx / ρgH >∼ 9 is required for viscous folding under gravity to be significant.The maximum value of the amplification rate for folding of a power-law viscous layer resting on a viscous medium under gravity is (Schmalholz et al., 2002) The dominant wavelength for viscous and power-law viscous layers resting on an inviscid medium under gravity is the same (Schmalholz et al., 2002).Values of n for dislocation creep are usually in the range 3-5.The result for powerlaw viscous folding can also be applied to the oceanic lithosphere, for which folding is dominated by the deformation of the upper mantle lithosphere.Assuming n = 5, a total horizontal stress 4η Dxx = 350 MPa, a density difference between mantle lithosphere and water ρ = 2300 kg m −3 and H = 25 km yields α d ≈ 9, which is not sufficient for significant folding.However, if we assume that the mechanically competent (upper) region of the mantle lithosphere is deforming by low-temperature plasticity (Peierls creep), then the effective (or apparent) value of n is in the range 10-25 (Dayem et al., 2009;Schmalholz and Fletcher, 2011) and we get α d ≈ 19-47, a value sufficient for significant folding.Furthermore, if we assume that the competent level of the oceanic lithosphere is not overlain by water, but by unconsolidated sediments, then the density difference reduces to ρ = ∼ 1000 kg m −3 (Martinod and Molnar, 1995) and α d ≈ 43-107.These amplification rates are sufficient for significant folding of the oceanic lithosphere.The assumed total stress of 350 MPa integrated over an assumed 25 km thickness yields a force per unit length of 8.75 × 10 12 Nm −1 , which is a reasonable value similar to the estimated horizontal driving force per unit length of ∼ 7 × 10 12 Nm −1 associated with the crustal thickness variation from the Indian lowlands to the Tibetan plateau (see e.g.Martinod and Molnar, 1995).Also, the dominant wavelength for the applied values with ρ = 1000 kg m −3 is ∼ 137 km, which is within the range of wavelengths observed in the northern Indian Ocean of 130-250 km (McAdoo and Sandwell, 1985).It follows that the simple thin-plate-based solution for folding of a power-law viscous layer under gravity supports the proposal that folding of the oceanic lithosphere is feasible for reasonable compressive stresses.The observed fold wavelength can also be used to estimate the Argand number, Ar F (Eq. 34), which requires an assumed value for the effective thickness of the lithospheric level that is actively folding.Assuming 25 km for this thickness, wavelengths between 130 and 250 km in the Indian Ocean correspond to Ar F between 1.5 and 0.4, respectively.Assuming an effective thickness of 15 km gives Ar F = 0.14-0.53.For central Asia, Burov et al. (1993) estimated the wavelength due to folding of the mantle lithosphere to be between 300 and 360 km (see also Fig. 7).Note that for folding of the continental mantle lithosphere, which is mechanically decoupled from folding of the crust (Burov et al., 1993), the density difference relevant to folding is the difference between mantle and crustal density.Assuming effective thickness between 20 and 50 km for the folding mantle lithosphere with a wavelength between 300 and 360 km gives Ar F = 0.12-1.1.These simple estimates suggest that values of Ar F are of the order of 0.1 to 1 for folding of oceanic and continental lithosphere.If, for both oceanic and continental lithosphere, it is the upper, cold level of the mantle lithosphere that controls folding and if this level deforms plastically or by lowtemperature plasticity with effective values of n = 10-25, then values of Ar F = 0.1-1 correspond to significant folding based on Eq. ( 35).Cloetingh and Burov (2011) compiled wavelengths of folded lithosphere, including crustal, mantle and whole lithosphere folding, for 17 regions worldwide, including Central Asia (Burov et al., 1993), central Australia (Lambeck, 1983b), the north-east European platform (Bourgeois et al., 2007), the Tibet/Himalayan syntaxes belt (Shin et al., 2015) and the Indian oceanic lithosphere (McAdoo and Sandwell, 1985;Krishna et al., 2001), and showed that all fold wavelengths are between 40 and 700 km.Not surprisingly, wavelengths for crustal folding are the smallest and wavelengths for whole lithosphere folding are the largest.They also showed that the wavelengths increase with the thermo-tectonic age of the lithosphere, because older lithosphere is mechanically stronger and exhibits larger effective thickness of the folded levels.Cloetingh and Burov (2011) also discussed lithospheric folding as a mechanism of sedimentary basin formation.Lithospheric folding has also been studied with the stability analysis (Zuber, 1987;Martinod and Davy, 1992;Burov et al., 1993;Martinod and Molnar, 1995), which yields more accurate (but also more complicated) solutions for the amplification rate without changing the first-order results and conclusions of studies based on the thin-plate approach.The solution in Eq. ( 35) is strictly valid only for infinitesimal amplitudes and α d decreases with increasing fold amplitude according to the finite amplitude solution (Schmalholz and Podladchikov, 2000).Nevertheless, Eq. ( 35) indicates that plastic behaviour (n 10) can significantly increase the growth rate of a folding instability.Indeed, numerical simulations of lithospheric shortening considering representative viscoplastic yield strength profiles for the continental and oceanic lithosphere indicate that lithospheric folding most likely takes place during lithospheric compression (Zuber and Parmentier, 1996;Burg and Podladchikov, 1999;Cloetingh et al., 1999;Gerbault, 2000).The intensity of folding depends mainly on the applied bulk shortening rate and the temperature at the Mohorovičić discontinuity (Moho), which controls the integrated strength of the lithosphere (Schmalholz et al., 2009).Martinod and Molnar (1995) performed a stability analysis in which they considered a power-law viscous rheol-ogy and Mohr-Coulomb plastic yield strength of the Indian oceanic lithosphere.They argued that the oceanic lithosphere is overlain by unconsolidated sediments with average density of 2300 kg m −3 , which gives ρ = 1000 kg m −3 .Their result indicates that a force per unit length of 4.8 × 10 12 (±1.3× 10 12 ) Nm −1 is sufficient to fold the oceanic lithosphere, which is a value significantly smaller than the ones obtained from simpler models based on folding of beams.This value is important because the lateral variation of the gravitational potential energy, caused by the lateral crustal thickness variation between the Indian foreland and the Tibetan plateau, generates a force per unit length of ∼ 7 × 10 12 Nm −1 (Molnar and Lyon-Caen, 1988), which means that the growth of the Tibetan plateau alone could in principle have caused the folding of the Indian oceanic lithosphere (Molnar et al., 1993).The analysis of Molnar et al. (1993) caused some controversy because other authors using thin viscous sheet models argued that the values for the force per unit length related to the Tibetan plateau from Molnar et al. (1993) were overestimated by a factor of 2 and, hence, that the growth of the Tibetan plateau alone could not be responsible for folding of the Indian Ocean lithosphere (Ghosh et al., 2006(Ghosh et al., , 2009)).However, the analysis of Molnar et al. (1993) is based on total stress and differential stress (difference between maximal and minimal principal stress), whereas the results of thin viscous sheet models are based on deviatoric stress, which is half the differential stress in the thin viscous sheet model (Schmalholz et al., 2014).Furthermore, the force (or force per unit length in 2-D) driving folding is controlled by the total stress (4η Dxx ) and not by the deviatoric stress (2η Dxx ; see Eq. ( 8)), so that the application of Molnar et al. (1993) of the force per unit length (calculated from differential stress) due to lateral variations in gravitational potential energy to folding of the oceanic lithosphere is correct.
A frequently applied model for viscous deformation of the continental lithosphere is the thin viscous sheet model (England and McKenzie, 1982).This model assumes that lithospheric folding is negligible and that the lithosphere deforms by homogeneous, kinematic thickening.Thin-sheet models consequently assume that vertical velocities due to folding are less than vertical velocities due to kinematic thickening and hence they assume amplification rates for folding α d < 1. Thin-sheet models are useful to explain the first-order response of the continental lithosphere due to shortening on the scale of thousands of kilometres but they are not suitable for estimating the deformation on the 100 km scale, because on this scale folding is likely to be important and may control the lateral variation of vertical velocities (Lechmann et al., 2011).For example, thin sheet models are useful to predict the average topography of Central Asia (Fig. 7a) but not to predict the characteristic fold-like topography on the hundreds of kilometre scale and the related vertical velocities.

Experimental results
In most branches of science, proposed analytical solutions can be tested by direct observation or experiment.However, in the case of folding and necking of rock layers as considered here, this is impossible due to the long times and large forces, pressures, temperatures, and (often) length scales involved.By necessity, the development of such structures to large amplitude can only be studied by analogue and, more recently, numerical modelling.Initially these analogue models were only qualitative but progressively became more quantitative with the application of correct scaling laws (Hubbert, 1937) and the use of materials more rheologically similar to rocks but deformable at lower stresses, temperatures and pressures.Only a limited personal selection of the many published studies can be presented here but, in keeping with the overall theme of the review, we particularly highlight those studies that either specifically constrained analytical solutions or attempted to extend them to higher amplitudes typical of natural geometries.
The pioneer in analogue modelling of folding was Hall (1815), who "conceived that two opposite extremities of each bed being made to approach, the intervening substance, could only dispose of itself in a succession of folds, which might assume considerable regularity, and would consist of a set of parallel curves, alternatively convex and concave towards the centre of the earth".To test this premise he carried out his now famous experiments using layers of cloth to demonstrate that the folds he observed in nature could develop by shortening of horizontally layered rocks by application of a horizontal force (Fig. 1).The experiments were entirely qualitative but established the basic principle.Since then, a large number of analogue experimental studies have investigated the influence of material contrast (e.g.viscosity ratios), constitutive equations (elastic, linear and power-law viscous, plastic and different combinations), material anisotropy and initial perturbation geometry on the initiation and development of single-and multilayer folds and boudins.Only a limited selection is presented here as examples.
In a companion paper to Biot (1961), Biot et al. (1961) presented a series of analogue models aiming to provide experimental verification of the analytical results for folding of stratified viscoelastic media (Biot, 1957(Biot, , 1961)).These experiments considered layer-parallel shortening of both elastic and viscous layers embedded in a viscous matrix.Biot's thinplate theory is for a layer of infinite length and predicts an amplification rate as a function of normalised wavelength (or wavenumber) given by Eq. ( 12) with a dominant wavelength, corresponding to the maximum amplification rate, given by Eq. ( 13).In an analogue model, a layer of infinite length is unattainable and an initial infinitesimal amplitude perturbation spectrum of perfect random white noise (all wavelength components present and with equal amplitude) is also unrealistic.A novel alternative approach proposed by Biot et www.solid-earth.net/7/1417/2016/Solid Earth, 7, 1417-1465, 2016 al. (1961) was to consider the amplification of an initial isolated bell-shaped perturbation.This can be represented as an infinite cosine series by a known Fourier integral expression given in Eq. ( 26) and the evolving fold geometry with time (strain) can be calculated with Eq. ( 27).Biot et al. (1961) used this approach in a numerical evaluation of the time history of folding and sideways propagation away from an isolated perturbation but not in their analogue models.In these models, they used thin plates of aluminium or cellulose acetate butyrate (elastic layers) or roofing tar (viscous layers) in a corn syrup viscous matrix, without any prescribed initial perturbation, to establish a good correspondence with theory -at least for the very high viscosity ratios (2.22 × 10 3 and 4.28 × 10 4 ) and corresponding long dominant wavelength to thickness ratios considered (45 and 121, respectively).As predicted by Biot's theory, for such large contrasts in properties, amplification rates were high and wavelength selection strong so that a relatively clear sinusoidal wave train was rapidly developed.However, such high wavelengths are not typical of natural examples, where common wavelength to thickness ratios are between 2 and 16 with a mean value at ∼ 6.5 (e.g.Hudleston and Treagus, 2010).Ramberg and Stephansson (1964) performed laboratory experiments on folding of a viscous plate (made from molten mixtures of colophony and diethyl phthalate) floating on an aqueous solution of potassium iodide to verify the dominant wavelength for folding under gravity given in Eq. ( 34).They showed that the value of L d /H developed in the experiments is linearly proportional to the ratio √ σ/ ρgH , where σ is the compressive load applied in the experiments and corresponds to the value of 4η Dxx in the theoretical analysis.The experiments hence verified the theoretical result for the dominant wavelength, which states that L d /H is directly proportional to 4η Dxx / ρgH .Ghosh (1966) studied single-layer folding under simple shear, using combinations of modelling clay, putty and wax.He noted that the fold axis developed parallel to the major axis of the strain ellipse on the surface of the layer (i.e.perpendicular to the principal component of shortening within the layer), which, for generally oblique layering, is not necessarily parallel to a principal axis of the applied bulk strain.He also noted that the single layer folds are, at least initially, generally symmetric despite the simple shear boundary conditions.This is consistent with the later, general observation of Lister and Williams (1983) that single layer buckle folds are good examples of coaxial spinning deformation (their Fig. 4) and agrees with results of numerical models (Viola and Mancktelow, 2005;Llorens et al., 2013a).Ghosh (1968) also did analogue experiments on multilayer folding to develop rough qualitative constraints on the transition from conjugate to chevron to concentric folds.Currie et al. (1962) had previously also qualitatively investigated single-and multilayer folding in elastic materials using photoelastic gelatin.With this experimental technique they could not only investigate the influence of layer thickness and ratios in elastic properties on fold wavelength but also analyse the stress trajectories in the layer and matrix during folding.Their experiments provided an excellent visual representation of the zone of contact strain around a folding layer and the consequent development of disharmonic or harmonic folding depending on the spacing between layers (their Plate 2).Hudleston (1973) performed experiments to study the development of single-layer folds with shortening parallel to the layer.The material used for both layer and matrix were mixtures of ethyl cellulose in benzyl alcohol, which, at the low concentrations used in his experiments, is effectively linear viscous.The viscosity ratios considered were between 10 and 100 and thus much lower than those used by Biot et al. (1961).One of the aims of the experiments was to establish that folding to finite amplitude with such low ratios, and correspondingly short wavelength to thickness ratios, was possible, in contrast to what was implied in the original papers of Biot (1961) and Biot et al. (1961).In these experiments, Hudleston (1973) also specifically investigated layerparallel shortening and thickening and the transition to rapid (explosive) fold amplification, as well as making harmonic analyses of the experimental fold shapes.Cobbold (1975a) carried out analogue experiments to study the sideways propagation of folds away from an initial isolated perturbation in a single layer undergoing layerparallel shortening, using a pure-shear deformation rig (Cobbold, 1975b;Cobbold and Knowles, 1976).Materials used were well-calibrated paraffin waxes of different melting points, with power-law stress exponents of ca.2.6 and an effective viscosity ratio between layer and matrix of ca.10.Conceptually this was an experimental investigation of the process considered theoretically and numerically by Biot et al. (1961) with an initial isolated bell-shaped perturbation, but for power-law viscous materials and a much lower (and more realistic) viscosity ratio.However, Cobbold (1975a) used a cylindrical form for the initial perturbation, rather than a bell-shape with the known Fourier integral representation of Eq. ( 26), and did not consider the propagation in terms of amplification of Fourier spectral components.Instead, he introduced the important concept of the perturbation flow lines (Passchier et al., 2005) to qualitatively investigate the sideways spread of the folding instability.Gairola (1978) made single-layer fold experiments with plasticene layers embedded in putty to investigate the effects of progressive deformation on fold shape and particularly on the internal strain within the layer and on the varying position of the neutral surface.He found that the appearance of the neutral surface depends on the "ductility contrast" between the layer and matrix, and the amount of strain.A neutral surface may not appear at all if the contrast between layer and matrix is very small, due to the strong component of layerparallel shortening, which agrees with recent results of numerical simulations (Frehner, 2011).These experimental re-sults can also explain why thin-plate results become more inaccurate for smaller viscosity ratios, because the thin-plate results are based on the assumption of a neutral line in the centre of the folding layer.Neurath and Smith (1982) performed folding experiments with wax models, measured the effective viscosities and power-law exponents for the wax models, and compared the experimentally determined amplifications rates with the corresponding theoretical rates.They showed that theoretical and measured amplification rates more or less agreed with the theoretical rate as derived by Smith (1975Smith ( , 1977Smith ( , 1979) ) and the equivalent results of Fletcher (1974Fletcher ( , 1977)).
Abbassi and Mancktelow (1990) investigated the influence of initial perturbation shape on fold shape, establishing that markedly asymmetric folds, even with overturned limbs, could develop by amplification of a small initially asymmetric irregularity, despite the fact that the imposed boundary condition was layer-parallel shortening in a pure shear deformation rig (Mancktelow, 1988a).Abbassi and Mancktelow (1992) and Mancktelow and Abbassi (1992) employed the isolated bell-shaped perturbation technique originally developed by Biot et al. (1961) directly in analogue experiments, both to investigate the effects of perturbation geometry on fold shape and lateral propagation (Cobbold, 1975a) and to experimentally determine fold amplification rates.Instead of calculating a numerical forward model for a specific amplification rate curve as done by Biot et al. (1961), they reversed the approach and used the changing shape of an initial bell-shaped perturbation with known initial values of a and b (see Eq. 26 for the meaning of a and b) to determine, via Fourier analysis, the amplification rates for folding in well-calibrated power-law viscous materials (paraffin waxes of different melting temperatures; Cobbold, 1975a;Mancktelow, 1988b).The amplification rate curves determined in this way were directly comparable to those predicted theoretically (Fletcher, 1974(Fletcher, , 1977;;Smith, 1975Smith, , 1977) but for short wavelengths and particularly for narrow initial perturbations, the observed amplification rates were generally higher than theoretical values.This could reflect the strain softening behaviour of the layer, as also suggested by Neurath and Smith (1982) for their boudinage experiments (but significantly not for their folding experiments).The experiments indicate that bonding of the matrix-layer interface may have a much greater effect on the amplification rate curve than is theoretically predicted, at least for the low to moderate effective viscosity ratios investigated (8 and 30).For better bonding between layer and matrix, the amplification rate decreases and consequently the amount of initial layer-parallel shortening increases.Abbassi and Mancktelow (1992) observed that the influence of the initial perturbation is greater for broader irregularities, when the average wavelength component is longer than the dominant wavelength, than for narrow isolated irregularities.Marques and Podladchikov (2009) placed a thin layer of either plasticine or polyethylene between viscous poly-dimethylsiloxane (PDMS; Dow Corning SGM36) below and Fontainebleau quartz sand above.The PDMS represents the ductile part of the lithosphere, the quartz sand the brittle parts and the thin layer of either plasticine or polyethylene the thin elastic core, which is easily flexed but unstretchable/unbreakable. Their results show that a very thin, elastic layer between an overlying brittle and underlying viscous medium produces folding as the dominant deformation mechanism during shortening, and not brittle faulting or viscous homogeneous thickening.
Recently, Marques and Mandal (2016) have made experiments to investigate the buckling and post-buckling behaviour of an elastic single layer (cellophane, plasticine, or polyethylene film) in a linear viscous medium (PDMS silicone putty).The experiments were performed in two stages: a first stage of buckling by layer-parallel shortening at different rates and a second stage of buckling relaxation with fixed lateral boundaries.They found major contradictions between their experimental results and both the analytical results of Biot (1961) for the buckling phase and with the analytical solutions and conclusions of Sridhar et al. (2002) for the buckling relaxation stage.Their results have still to be explained by theoretical models.
Analogue experiments on single-and multilayer folding have generally investigated a geometry where the principal bulk shortening direction is within the layer and the principal extension axis is perpendicular to the layer.Experiments with oblique layers are technically difficult because the layer ends tend to slide along the boundaries.Oblique loading of the ends also introduces unavoidable additional perturbation components, because the planar boundary is no longer parallel to the axial plane of the developing folds.Grujic and Mancktelow (1995) carried out pure and simple shear analogue experiments, where the intermediate axis was perpendicular to the layer (i.e. both the principal shortening and extension directions were within the layer).Models were generally constructed of power-law (n = 2-3) viscous paraffin waxes of different melting temperatures, but in some cases a matrix of linear viscous PDMS silicone putty was used to allow observation.Folds developed parallel to the stretching direction but significant amplification was only possible for a (very) high effective viscosity ratio (i.e.only for ca.600 and not for ca.30).In rotational simple shear experiments, Grujic and Mancktelow (1995) observed that, in high viscosity ratio experiments, the amplifying folds develop initially approximately parallel to the infinitesimal stretching direction.With increasing shear and amplification, the fold axes remain fixed to the same material points and therefore rotate as a passive line, rotating toward but not strictly tracking the finite extensional axis.As a result, there is a component of antithetic shear along the axial plane of these folds.The observation that fold hinges remain fixed to material points may reflect material damage and strain softening along the fold hinges, which correspond by definition to lines of maximum layer curvature.The paraffin waxes employed are strain softening (Mancktelow, 1988b), so that increased strain in the hinge will tend to subsequently localise further strain.Davy and Cobbold (1991) modelled the lithosphere as two, three or four layers: brittle crust (quartz sand), ductile crust (silicone), brittle mantle (quartz sand) and ductile asthenosphere (sugar solution).Variation in the rheology with depth (e.g.temperature dependence of viscosity) was not considered in the simplified model but the potential effects of erosion were.They investigated the interplay between buckling and lithospheric thickening, showing that thickening style is mainly dependent on mantle behaviour, as well as demonstrating the effect of low degrees of coupling, when the brittle crust can detach and buckle independently of mantle layers.Martinod and Davy (1994) modelled the development of periodic instabilities in continental and oceanic lithosphere under compression.The lithosphere was modelled as a stack of alternating brittle (quartz sand) and ductile (silicone putty) layers.As with Davy and Cobbold (1991), there was no vertical variation within the layers themselves.For small strain, the deformation modes mainly depend on the spatial distribution of the brittle layers and the amplitude of buckling is an exponential function of horizontal strain, as would be expected for folding (Eq.9).

Short history of necking
The terms "boudin" and "boudinage" were first introduced by Lohest et al. (1908) and Lohest (1909) as a descriptive term for sausage-like structures (hence "boudin", which is a French word for blood sausage) that they observed in the High-Ardenne Slate Belt, which were developed in psammitic layers embedded within a more pelitic matrix.However, recent studies now interpret these classic "boudins" of Lohest and co-authors to be in fact "mullions" (Urai et al., 2001;Kenis et al., 2002Kenis et al., , 2004)), developed due to layer shortening."Pinch and swell" was already used by Matson (1905) as a purely descriptive term for the geometry of peridotite dykes from near Ithaca, New York, but without a sketch and with the implication that this was an original intrusive rather than tectonic structure.A short but relatively comprehensive summary of early literature on boudinage is given by Cloos (1947).By this time, there were already descriptions in the literature of more ductile pinch-and-swell structures (e.g.Ramsay 1866;Harker 1889;Walls 1937), but Cloos (1947) concentrated more on examples involving fracture and interpreted the initial fractures as tension joints normal to the direction of extension.However, he notes that "the barrel shape of the classical boudins is somewhat puzzling but seems to be a function of incipient flowage in the competent layer".Fracture development producing rectangular or barrel-shaped boudins is promoted by the dynamic (or tectonic) underpressure inherently developed in an extending competent layer, in contrast to the overpressure developed if the layer is shortened (Mancktelow, 1993(Mancktelow, , 2008)).This under-or overpressure is associated with corresponding refraction in the principal stress axes in the more competent layer (Mancktelow, 1993), so that extensional fractures are nearly perpendicular to layering, as typically reported for brittle boudins (e.g.Cloos, 1947).As discussed by Rast (1956), the difference in behaviour between barrel-shaped and lozenge-shaped boudinage directly reflects the mechanical response of the layer: if the layer is effectively elasto-plastic (i.e."brittle") it develops extensional fractures (joints) and rectangular or barrel-shaped boudins; if viscous flow dominates (at least initially), mechanical instability will lead to necking and the development of pinch-and-swell or lozenge shapes.

Theoretical results
We focus here on studies investigating ductile necking instability.Many studies on boudinage consider brittle boudinage or study deformation with an initial configuration where the competent layer is already broken or already includes weak layers separating the layer.Such studies are useful to investigate the kinematic evolution of boudins but yield no insight into the necking instability.An extensive review of boudinage and necking is also provided in Price and Cosgrove (1990).

Single layer necking
Galilei (1638) performed one of the first experiments to test the tensile strength of columns (Fig. 8) and the first mathematical study of necking was probably by Considère (1885) (see also Dieter, 1986, in his Sect. 8-3).Assuming a homogeneous layer with thickness H , the extensional load (here force per unit length) is F = σ H with σ being the total stress.We assume in addition that the material is strain hardening; i.e. the stress and hence the load-carrying capacity increases with increasing strain.During extension at an imposed constant rate, the stress also increases due to the decrease in layer thickness.Necking or localised deformation begins when the increase in stress due to decrease in layer thickness becomes greater than the increase in load-carrying ability of the layer due to strain hardening.At the onset of extension, the load required to extend the strain-hardening layer is increasing.The maximum load is achieved when the rate of change of the load during extension is zero; i.e. dF = dσ H +σ dH = 0.The ratio dH /H corresponds to the vertical shortening and is identical to the negative of the horizontal (layer-parallel) extension, −dL/L = −dε with L being the layer length, assuming mass conservation and an incompressible material.The variation of the load can be reformulated to The above equation is known as the Considère criterion and states that the load is at a maximum when the rate of strain hardening, dσ/dε, is equal to the stress, σ .When dσ/dε > σ then dF > 0 and the extension is stable, whereas when dσ/dε < σ then dF < 0 and unstable necking takes place.Introducing the dimensionless strain-hardening coefficient β = dσ/dεσ , the Considère criterion predicts onset of necking instability for β < 1.The Considère criterion can be used to predict the amount of extension at which necking takes place.For example, if a material follows a strainhardening stress-strain relation of the form σ = Kε r , with K and r being here material parameters (K > 0; 0 < r < 1), then dσ/dε = Krε r /ε.Substituting the expressions for σ and dσ/dε in the Considére criterion, σ = dσ/dε, yields ε = r.This means that necking begins when the extensional strain ε is equal to the strain-hardening exponent r.
The analysis above, which is based on the early work of Considère, assumes that the flow stress is only dependent on strain.A similar analysis can be done for a material that is both strain and strain-rate sensitive.The strain-rate sensitivity is described by a standard, strain-rate hardening powerlaw viscous flow law; i.e. σ = η C ε1/n (n > 1), where here ε is the strain rate.The result of the stability analysis shows that the onset of necking instability takes place when (Hart, 1967; see also Dieter, 1986, his Sect. 8-10) If only strain-rate sensitivity is considered (β = 0), then the stability criterion reduces to 1/n < 1 and indicates that necking in power-law viscous material only takes place if n > 1, which means that in a linear viscous material (n = 1) a necking instability does not occur.This result for power-law viscous material has been confirmed by a slightly different analysis of Emerman and Turcotte (1984) and by the analysis of Smith (1975Smith ( , 1977) ) presented in more detail below.The stability criterion of Hart (1967) in Eq. ( 37) is controversial and has been much discussed in the engineering literature because it is not universally valid for any kind of initial perturbation or imperfection (e.g.Ghosh, 1977;Hutchinson and Obrecht, 1977).Indeed, there is a very extensive engineering literature on necking in strain and strain-rate sensitive materials due to its importance, for example, for metal forming, but a review of the non-geological literature is beyond the scope of this review.The interested reader is referred to Hill (1952), Ghosh (1977), Hutchinson and Neale (1977), Hutchinson and Obrecht (1977), Tvergaard et al. (1981) and Dieter (1986).Smith (1975Smith ( , 1977) ) applied the stability analysis to both folding and necking of linear and power-law viscous layers embedded in a linear and power-law viscous medium.He showed that the dominant wavelength solution for folding and necking is identical (for the same material parameters), but that the corresponding amplification rates for folding and necking are different (Fig. 14).The maximal amplification rate for necking, namely the maximum from Eq. ( 18) for θ = 1, which corresponds to the dominant wavelength, can be approximated (Smith, 1977) by The result shows that for linear (Newtonian) viscous fluids α d = 0 and there is no active component of necking and therefore that necking does not occur in linear viscous fluids, in agreement with Eq. ( 37) for β = 0. Hence, pinch-andswell structure is an excellent palaeo-rheology indicator, because rocks developing a pinch-and-swell instability behaved as non-linear (e.g.power-law) viscous fluids during pinchand-swell formation.Necking can also occur for non-linear flow other than power-law such as, for example, an exponential flow law (Schmalholz and Fletcher, 2011).Neurath and Smith (1982) also performed necking experiments with wax models in addition to the folding experiments.For necking the measured amplification rates where significantly higher (a factor of 2-3) than the theoretical ones (Neurath and Smith, 1982).They suggested that the discrepancy could be due to some kind of strain softening by which the power-law exponent would increase with increasing strain.They show analytically that strain softening can be described with an effective power-law stress exponent where ε * is a measure for the strain during softening.Note that in Neurath and Smith (1982) values of n eff remained positive during strain softening so that the material remained strain-rate hardening.
A simple 1-D analytical solution for the evolution of thinning during necking of an incompressible power-law layer can be found by assuming that the layer is free (no embedding medium) and that plane sections in the layer remain plane during necking (Emerman and Turcotte, 1984;Schmalholz et al., 2008).The extension rate parallel to the layer can then be expressed by a change of the layer thickness, that is , where τ xx is the horizontal deviatoric stress and B is a material constant.Assuming in addition a constant horizontal extensional force, F (in units Nm −1 ), the deviatoric stress is τ xx = F /2H (note that the factor 2 appears again because force is related to total stress and for a free layer the deviatoric stress is half the total stress).Equating the two above expressions for D xx and using τ xx = F /2H yields a non-linear ordinary differential equation (ODE) for Integrating both sides of the equation with respect to time and using the initial condition H (t = 0) = H 0 yields the solution (Schmalholz, et al., 2008;Schmalholz, 2011) where the characteristic time t c = 1/ Bτ  Equation (41) shows that when t/t C reaches the value of 1/n, then H is zero and the layer has been separated by necking (Fig. 17b).Hence, the time necessary to separate a layer by necking is t N = 1/ nBτ n xx0 .For example, if a necking experiment for τ xx = 250 MPa was performed at a temperature T = 800 • C with Madoc dolomite following the flow law of Davis et al. (2008), given by xx with ε = 10 29 s −1 , µ = 45.6 GPa, Q = 420 kJ mol −1 , R = 8.31 kJ mol −1 K −1 and n = 7, then B = 8.1422 × 10 −67 Pa −n s −1 and it would take nearly a year to separate the dolomite by necking because t N = ∼ 328 days.
Comparison with numerical simulations shows that the above simple analytical solution provides reasonably accurate results for the evolution of thinning (H /H 0 ) with progressive extension up to H /H 0 = 0.2 and for effective viscosity ratios larger than ∼ 100 (Schmalholz et al., 2008).Numerical simulations of necking show that initially straight vertical (layer-orthogonal) passive lines across the layer remain straight and vertical during necking (i.e.plane sections remain plane), which means that within the layer there is essentially no layer-parallel shear around the necking region (but there is shear in the medium around the layer; Schmalholz et al., 2008).Furthermore, the amplification rates of initial geometrical perturbations decrease with increasing extension, similar to finite amplitude folding.For necking, the decrease in amplification rates is most likely due to the increasing shear resistance of the embedding medium around the necking zone, because for necking of a free layer the amplification rates actually increase with progressive necking (see Fig. 9 in Schmalholz et al., 2008).The increase of amplification rates with progressive necking of a free layer is also predicted by the analytical solution of Eq. ( 41), because the thinning-vs.-timecurves for n > 1 have a concave-downward shape (Fig. 17b).Finite amplitude necking of free and embedded layers are also associated with structural softening, similar to finite amplitude folding (Fig. 18).Some approximate solutions for the dominant wavelength and maximal amplification rate for necking are listed in Table 2.

Multilayer necking
Theoretical studies on small-scale multilayer necking are rare in the geological literature.Most analytical multilayer necking studies have been applied to large-scale necking and lithospheric extension (see below).Most theoretical studies have considered brittle boudinage in order to calculate the stress field in multilayers under extension or to calculate the stress fields for layers with pre-existing vertical fractures, in which case the initial fracturing process itself has not been investigated (e.g.Strömgård, 1973;Mandal et al., 2000).Cobbold et al. (1971) showed that if the theory of internal instability for folding, as developed by Biot (1957Biot ( , 1964a)), is used for a compression direction orthogonal to the anisotropy orientation, then structures can form that are similar to pinch-and-swell structure (they also used the term internal boudins).Artemjev and Artyushkov (1971) were probably the first to suggest that rift systems are caused by crustal thinning due to a necking instability during lithospheric extension.It was subsequently shown that lithospheric necking for slow spreading rates (1-3 cm yr −1 ) is feasible for creep flow laws considered typical for the lithosphere (Tapponnier and Francheteau, 1978).Later, the stability analysis (described above for folding) has been applied to study necking instability during lithospheric extension (Fletcher and Hallet, 1983;Ricard and Froidevaux, 1986;Zuber and Parmentier, 1986), including lithospheric models with two competent layers (upper crust and upper mantle) separated by a weak lower crust (Zuber et al., 1986).Compared to small-scale necking models, models of lithospheric necking are usually more complex because they consider (i) gravity, (ii) one or more competent layers with a very high power-law stress exponent mimicking effectively plastic deformation, (iii) a viscosity that decays exponentially with depth in the weak layers to mimic the temperature dependence and (iv) some kind of stress limit to mimic the brittle yield strength of rock.

Lithospheric necking
The impact of gravity on necking can be also quantified by an Argand number, namely the dimensionless ratio of gravitational stress to extensional stress (Fletcher and Hallet, 1983) where ρ is the density difference between the material below and above the competent layer, g the gravitational acceleration, H the thickness of the competent layer and τ y the representative extensional yield stress in the competent layer.The Argand number Ar N is similar to the one that has been introduced by England and McKenzie (1982) to scale the gravitational stress to the horizontal stress during lithospheric thickening and is similar to the Argand number applicable to lithospheric folding (Schmalholz et al., 2002).Fletcher and Hallet (1983) showed that for a wide range of creep flow laws and an extension rate of the order of 10 −15 s −1 , the necking instability is strong (α d 40) and that the dominant wavelength L d = 30-90 km.The ratio of L d to the depth of the brittle-ductile transition (representing the thickness of the competent layer) is ∼ 4 and typical values of Ar N are 2-6.Values of Ar N are larger than values of Ar F for folding (0.1-1.5; see Sect.2.1.3.) because the yield stress for extension and normal faulting is approximately a factor of 4 smaller than the corresponding yield stress for compression and thrusting (e.g.Sibson, 1974).Several subsequent studies have applied the stability analysis to study necking in an extending lithosphere and showed, for example, the strong impact of the rheological assumptions and stratification on necking (e.g.Bassi and Bonnin, 1988;Martinod and Davy, 1992).Fletcher and Hallet (2004) and Pollard and Fletcher (2005;their Sect. 11.2.4) presented large-scale analytical necking solutions that also consider the effect of erosion, which is described by a diffusion-type law.Fletcher and Hallet (2004) showed that erosion can significantly increase the necking instability.
Recent studies on magma-poor rifted margins have identified so-called necking zones in which the crustal thickness is strongly reduced from a normal thickness of 30-35 km to about 5-10 km (Peron-Pinvidic and Manatschal, 2009).These necking domains separate the proximal domain from the hyperextended domains in which the continental crust is strongly thinned (e.g.Sutra et al., 2013).Recent studies on magma-poor margins indicate that the continental lithosphere can be significantly extended and necked over several hundreds of kilometres without a lithospheric breakup, which would result in the formation of new oceanic crust at a mid ocean ridge (Sutra et al., 2013).Assuming that pre-rift (initial) geometrical perturbations of crustal thickness have an amplitude (A 0 ) of the order of 100 m requires an amplification, A/A 0 , of 10 5 to thin an initially 30 km thick crust to 10 km.Using typical amplification rates (scaled by the bulk extension rate) for the continental lithosphere in the range 40-100 (Fletcher and Hallet, 1983), the bulk extension required to achieve A/A 0 = 10 5 can be calculated by the formula ln (A/A 0 ) / (α d − 1) (compare with Eq. 31), which gives an extension of about 30 % and 12 % for amplification rates of 40 and 100, respectively.Applying these extension values to the 25-123 km range of dominant wavelengths derived by Fletcher and Hallet (1983) for typical continental rocks provides a corresponding range of "extended" wavelengths between 28-166 km.The necking zone corresponds to half the extended wavelength and hence ranges between 14 and 83 km, which agrees with the observed widths of necking zones of 20-100 km for passive margins worldwide (Chenin, 2016).The agreement between observed and predicted width of necking zones suggests that the observed necking zones at passive margins could indeed be the result of mainly viscous necking.
Lithospheric extension, rifting and associated sedimentary basin formation in a number of regions worldwide have been attributed to mainly lithospheric necking, such as the region around the Porcupine and Rockall basins in the southern North Atlantic (Mohn et al., 2014;Fig. 7b), the Baikal rift (Artemjev and Artyushkov, 1971) or the western Mediterranean back-arc basin (Gueguen et al., 1997).Furthermore, most kinematic or semi-kinematic (including flexure) models of lithospheric thinning and associated sedimentary basin formation implicitly assume a continuous necking of the lithosphere (McKenzie, 1978;Kooi et al., 1992).Such thinning models are of practical importance for the assessment of hydrocarbon reservoir potential in extensional sedimentary basins (see applications).
Necking has also been suggested to be the controlling process for slab detachment (Lister et al., 2008;Duretz et al., 2012).Detachment of an oceanic slab usually occurs when the corresponding ocean is closed and continental collision has started.The cold and dense oceanic slab is then hanging more or less vertically in the mantle and is attached to the overlying, less dense continent.The downward extension is controlled by the negative buoyancy of the cold slab in the warmer mantle.The analytical necking solution of Eq. ( 41) has been applied to show the feasibility of slab detachment (by using the buoyancy as the driving force F ) for the Hindukush region, as an example (Schmalholz, 2011).The simple analytical solution can accurately describe the thinning of the lithospheric slab during slab detachment, which was also numerically simulated with 2-D thermo-mechanical models considering viscoelastoplastic rheologies, heat transfer by conduction and advection, and thermo-mechanical coupling by shear heating (Duretz et al., 2012).The first-order agreement between the simple 1-D analytical solution and the 2-D thermo-mechanical numerical solution indicates that the 1-D necking solution captures the first-order processes of slab detachment.The simple ODE in Eq. ( 40) was elaborated into a system of ODE's to study the impact of coupling between grain-sensitive rheology and grain-size evolution with damage on slab detachment (Bercovici et al., 2015).The more elaborated system of ODE's had to be solved numerically.Bercovici et al. (2015) showed that weakening due to grain size reduction and damage in polycrystalline rock can significantly accelerate necking and hence slab detachment, so that detachment can occur in about 1 My.

Experimental results
There are fewer experimental studies on the development of viscous pinch-and-swell necking for several reasons.First, as shown theoretically by Smith (1975Smith ( , 1977)), Emerman and Turcotte (1984) and Eq.(37) above, the dynamic growth rate of necking in linear viscous materials is zero.Whereas for folds the kinematic or passive growth rate due to the homogeneous component of background strain is +1, reversing the sign of this bulk strain relative to the layer also reverses the sign of the passive growth rate: in contrast to folds there is a passive deamplification of initial perturbations due to stretching of the layer.It follows that viscous necking should not develop in linear viscous materials as used in many analogue experiments.Smith (1977) did predict dynamic growth of necks in strain-rate hardening power-law viscous materials, with the amplification rate increasing for higher values of the power-law stress exponent, especially in the layer.However, there are technical difficulties with developing necks in competent power-law viscous layers in analogue experiments.As discussed above in relation to Eq. ( 22), the effective viscosity in a power-law viscous material is a function of strain rate and, when the strain rate is zero, the effective viscosity should be infinite.In theoretical studies such as that of www.solid-earth.net/7/1417/2016/Solid Earth, 7, 1417-1465, 2016 S. M. Schmalholz and N. S. Mancktelow: Folding and necking across the scales Smith (1977), the layer is infinitely long and the strain rate is taken as a given parameter.In an experiment, the layer is of finite length, is poorly bonded to the model boundary, and initially has a zero strain rate.In pure shear folding experiments, shortening is directed along the length of the layer and the layer ends have no alternative other than to move inwards with the model walls.This constraint is not present when the walls extend away from the layer ends in a pure shear necking experiment.In such a model configuration, the layer represents a finite-length inclusion in a weaker matrix, equivalent to the case considered by Schmid et al. (2004) for folding of layers of finite length.Extending the discussion of Mancktelow and Pennacchioni (2010) on isolated powerlaw inclusions, for strictly power-law viscous materials, a finite-length layer with initially zero strain-rate and infinite effective viscosity should behave rigidly and detach from the model walls, with the matrix simply flowing around the layer ends.However, because no material has a perfect power-law rheology and the effective viscosity is generally asymptotically limited to a non-infinite value with decreasing strain rate, the situation is not as bad in practice as in theory.Also, following Schmid et al. (2004), increasing the length to thickness of the modelled layer(s) is advantageous, but there are realistic limits on the length of model rigs and long thin layers are more difficult to prepare accurately and to observe in sufficiently fine detail.Ramberg (1955) performed compression experiments orthogonal to layering of layered cakes of putty, plasticene and cheese, with either 1-D or 2-D compensating extension.The resulting structures are similar to natural boudinage and pinch-and-swell structure, but such models, like the models of Hall (1815) on folding, were more illustrative than quantitative.Griggs and Handin (1960) studied the mechanisms of deep earthquakes and performed extension experiments with natural rock, but not necessarily scaling length, time and temperature.Amongst others, they performed experiments with Hasmark and Luning dolomite and Eureka quartzite layers embedded in Yule marble for confining pressures of 200 and 500 MPa (2 and 5 kbar) and temperatures of 800 • C.They showed that, depending on the confining pressure and temperature, three macroscopic deformation processes take place: extension fracturing, faulting and uniform flow (necking).Extension fracturing takes place in the brittle regime at lower confining pressures and temperatures, faulting (i.e.shear failure) takes place at the transition between brittle and ductile deformation and uniform flow occurs in the ductile regime at higher confining pressures and temperatures.For confining pressures of 200 and 500 MPa and 800 • C, the dolomite layers were necking while the quartzite layer was fracturing.
In addition to their experiments on folding in powerlaw viscous materials, Neurath and Smith (1982) also performed necking experiments.The measured amplification rates where significantly higher (a factor of 2-3) than the theoretical ones and they suggested that the discrepancy could be due to some kind of strain softening, by which the power-law exponent would increase with increasing strain.They showed analytically that strain softening can be described with an effective power-law stress exponent given in Eq. (39).Ghosh (1988) conducted experiments with plaster of Paris resting on a substrate of pitch with equal stretching of the layer in all directions to investigate 2-D chocolatetablet structure.The study was designed to consider the geometry during progressive development and from the materials chosen could only develop brittle boudins rather than the pinch-and-swell structures considered here.Kidan and Cosgrove (1996) used the same rig employed in earlier folding experiments (Cobbold 1975a(Cobbold , 1975b;;Cobbold and Knowles, 1976) to investigate multilayer boudinage, using layers of paraffin wax and plasticine.Their experiments generally developed rectangular boudins due to (sequential) fracturing but internal pinch-and-swell structure in some cases developed on a larger scale, reflecting the overall anisotropy.
More recent experimental studies on boudinage in 2-D or 3-D have used specially designed rigs (Zulauf et al., 2003) and power-law materials with high stress exponents, such as plasticine with n = ∼ 7 (McClay, 1976;Zulauf and Zulauf, 2004;Zulauf et al., 2011).As shown by Schöpfer and Zulauf (2002), plasticine never flows at steady state but is strongly strain hardening, with the stress exponent also increasing (in some mixtures markedly) with increasing strain.Both of these effects promote heterogeneous deformation and localisation (Hobbs et al., 1986), with the development of ductile shears, as noted by McClay (1976).Strain hardening is also a pre-requisite for the onset of necking according to the Considère criterion, as discussed in detail above.The experiments of Schöpfer and Zulauf (2002) with plasticine layers in a plasticine/oil mixture matrix developed pinchand-swell structures even for remarkably low effective viscosity contrasts of ∼ 1.5, with more distinct boudins at ratios of ca.2.0-2.5.Their results were consistent with the theoretical dominant wavelength predicted by Smith (1977) for such low effective viscosity ratios.However, the experiments of Schöpfer and Zulauf (2002) also suggest that at these low viscosity ratios the dominant wavelength is approximately constant.Only the boudin geometry is sensitive to the viscosity ratio, with pinch-and-swell geometries developing at the lower values.Marques et al. (2012) used layers of viscoelastoplastic clay or elastic soft paper in linear viscous PDMS silicone putty to investigate the influence of layer thickness and bulk strain rate on the average boudin width for brittle boudins.Although their natural measurements from south-west Portugal show a clear linear relationship between layer thickness and boudin width, as would be expected from elastic theory, the average boudin width in their experiments shows an exponential dependence on layer width and a power-law dependence on bulk strain rate.
4 Newer developments 4.1 Non-linear terms in the folding equation and localised folding Equations ( 5) and ( 8) for elastic and viscous folding, respectively, are linear and the corresponding solutions are periodic; i.e. they can be expressed with trigonometric functions such as cosine or sine.However, most natural fold systems are not strictly periodic but irregular and sometimes localised.Localised folding is characterised by the existence of large amplitudes only over a small region of a shortened layer (after Wadee, 1999).The reason for irregular and localised fold geometries has been controversially discussed in the last decades (e.g.Zhang et al., 1996;Mancktelow, 1999;Schmid et al., 2010;Hobbs and Ord, 2012).There are essentially two fundamental reasons for irregular and localised fold geometries: (1) geometrical and material heterogeneities, and (2) material softening.
Considering the first reason, if linear equations for folding are considered, then irregular and localised fold geometries can result from (i) an irregular and localised initial geometry of the layer or (ii) from non-homogeneous material properties.Concerning (i), in the thin-plate approach one usually assumes that the layer has initially a constant thickness but that the layer is initially not perfectly straight; for example, it can have the shape of a bell-shaped function (Eq.26; Fig. 16).Stability analysis can consider initial irregularities either as deviation from a straight layer having constant thickness or as initial variations in the layer thickness.The thin-plate approach and the stability analysis can also include the impact of non-linear rheologies, such as a power-law flow that is strain-rate hardening (stress increases with increasing strain rate), but these flow laws are in practice linearised to provide accurate solutions provided fold amplitudes are small (i.e.limb dips smaller than ∼ 20 • ; Chapple, 1968;Schmalholz, 2006).Analytical results, numerical simulations and analogue experiments have shown that initial irregular layer geometries can generate a wide variety of irregular, non-periodic and localised fold geometries (e.g.Cobbold, 1975a;Abbassi andMancktelow, 1990, 1992;Mancktelow and Abbassi 1992;Mancktelow, 1999Mancktelow, , 2001;;Schmalholz, 2006;Schmid et al., 2010).Concerning (ii), heterogeneities, such as stronger or weaker inclusions, in a perfectly straight layer or in the matrix close to the layer will cause local perturbations of the deformation and hence cause local deformations in the layer, which in turn cause a deviation from the straight geometry.These geometric variations can then also cause irregular or localised fold shapes.One of the first localised folding solutions was presented by Smoluchowski (1909) for folding of an elastic layer under gravity.The solution is described by a sinusoidal waveform, whose amplitude at one end of the layer is exponentially decaying along the (one-sided) infinite layer.The small initial amplitude at one end of the layer is interpreted as result of local de-viations from isostatic equilibrium (Smoluchowski, 1909).If non-linear equations for folding are considered then a much wider spectrum of solutions is possible.Non-linearities arise essentially due to two reasons: geometrical non-linearities or material non-linearities.Geometrical non-linearities have been considered to describe the finite amplitude evolution because the linear solutions based on exponential amplitude growth with a constant rate break down when fold limb dips exceed ∼ 20 • .Material non-linearities, such as due to powerlaw flow laws, are often linearised and, as mentioned above, the resulting fold geometries can be explained by the corresponding linearised equations.Therefore, the fundamental finite amplitude fold geometries (i.e.regular or localised) due to geometrical and material non-linearities for hardening behaviour can be well estimated with linearised equations because these linearised equations are valid up to limb dips of ∼ 20 • .Hence, even if geometrical non-linearities and material non-linearities associated with hardening behaviour are considered, geometrical and material heterogeneities are still required to generate localised fold shapes.
Considering now the second reason for localised fold shapes, other types of non-linearities have also been studied with the thin-plate approach, whereby the resistance of the material in which the layer is embedded (often termed the matrix or foundation) is assumed to be non-linear.The linear term for the matrix resistance in Eq. ( 5) is usually q = kA but in the non-linear analysis it is usually expanded to q = kA + c 1 A 2 + c 2 A 3 + . . .(Tvergaard and Needleman, 1980;Wadee, 1999;Hunt et al., 2000).Typically, expressions like q = kA − cA 2 or q = kA − cA 3 have been considered where c is a constant.These non-linearities describe a material softening of the matrix resistance because the matrix resistance becomes smaller as the amplitude becomes larger.Also non-linear and viscoelastic behaviour of the matrix has been investigated using q = 4G 2π where G is the shear modulus of the embedding medium and c is a positive constant (Hunt et al., 1996).Some of these non-linear folding equations are mathematically identical to the non-linear equations that have been studied in the framework of non-linear dynamics.For example, the non-linear ODE for the buckling of a free elastic beam (i.e.Eq. 3 in which the deflection is quantified by θ , namely the angle between the horizontal x direction and the beam and not by amplitude A) is This non-linear (due to the sinus function) folding equation is mathematically similar to the non-linear ODE describing a pendulum (Hunt et al., 1989) where θ is the deviation from the vertical direction of gravity, m is the mass, L the length of the pendulum and g the www.solid-earth.net/7/1417/2016/Solid Earth, 7, 1417-1465, 2016 gravitational acceleration.Equations.( 43) and ( 44) become exactly equivalent if we identify D as mL, F as mg and x as t (Hunt et al., 1989).This equivalence between the deformation of a beam and the motion of a pendulum can be traced back to Kirchhoff (1859) and is known as Kirchhoff's kinetic analogue (see also Love, 1927).The pendulum equation is a simple example of a non-linear dynamical system, which is typically described by a system of non-linear ODEs and the derivatives are typical time derivatives.There exist many mathematical tools to describe and quantify the behaviour of dynamical systems, such as phase plane, phase paths, limit cycles or homoclinic orbits (e.g.Jordan and Smith, 1999).These tools are useful to describe the behaviour of a dynamical system without actually explicitly solving the non-linear ODE.Also, the so-called chaos theory is based on the analysis of dynamical systems (e.g.Guckenheimer and Holmes, 1983).Because of the mathematical equivalence between equations describing dynamical systems and folding of beams, the folding equations including non-linear terms for the matrix resistance have been analysed with, for example the tools of phase plane as mentioned above (e.g.Champneys, 1998;Hunt et al., 1989).Furthermore, some solutions for these non-linear folding equations can also be expressed with non-periodic functions such as hyperbolic secant (sec h (x) = 1/ cosh (x) = 2/ e x + e −x ), which is also a solution for solitary waves (or so-called solitons; e.g.Drazin and Johnson, 1989).
Geometrical and material heterogeneities are intuitive reasons for observed irregular fold geometries because natural rock layers are never perfectly straight or homogeneous before folding.Geometrical non-linearities are intrinsic for folding because they arise naturally due to the deviations of the folded layer from the initially straight layer.Linearised equations can predict the fold shapes up to amplitudes for which the final irregularities can already be seen, such as for an initial bell-shaped perturbation (Fig. 16).Non-linearities due to material softening, such as a non-linear matrix resistance, are more difficult to justify, and especially quantify, in a straightforward manner.Non-linear matrix resistance is usually justified by some kind of material strain softening (e.g.Hobbs and Ord, 2012).However, this softening process is usually defined a priori and it is not clear what micromechanical processes actually causes such particular nonlinearities related to softening.Typical candidates responsible for softening are, for example, grain size reduction, mineral reactions or fluid-rock interaction.The impact of strainrate softening on folding has been investigated also with numerical simulations (e.g.Hobbs et al., 2011)

Numerical simulations and coupled models
This review focuses on analytical solutions, with some reference to the analogue models that were often used to qualitatively or (semi-)quantitatively test these analytical solutions.However, since the late 1960s more and more numerical stud-ies of folding and necking have been performed.One of the first numerical simulations of folding in a geological context was carried out by Dieterich (1969) and Dieterich and Carter (1969).Stephansson and Berner (1971) already applied the finite element method to various tectonic processes such as folding, deformation of isolated boudins, isostatic adjustment and spreading at the mid-Atlantic ridge.
Numerical simulations are essential to study folding and necking scenarios for which analytical solutions cannot be derived or for which only approximate analytical solutions exist.Such scenarios are for example (i) the finite amplitude evolution of folding and necking in 2-D and 3-D for which only approximate analytical solutions are available (Chapple, 1968;Kaus and Schmalholz, 2006;Schmalholz, 2006;Schmalholz et al., 2008;Schmid et al., 2008;Grasemann and Schmalholz, 2012;Fernandez and Kaus, 2014;Frehner, 2014;von Tscharner et al., 2016), or (ii) the numerical solution of non-linear folding equations (see Sect. 4.1.;e.g. Hunt et al., 1997;Wadee, 1999).A typical application for numerical simulations is, for example, the study of fold propagation (or serial folding) where folding in a competent layer starts from a localised geometrical perturbation and new folds develop sequentially away from the initial perturbation along the layer (such as shown in Fig. 16).Such fold propagation has been studied in 2-D in single-(e.g.Cobbold, 1977;Mancktelow, 1999;Zhang et al., 2000) and multilayers (Schmalholz and Schmid, 2012) and in 3-D in single layers (Frehner, 2014).
Many numerical simulations of folding consider a layerparallel compression of the layers and the bulk deformation of the model is close to pure shear.Folding of layers under bulk simple shear has been studied numerically for single layers (Viola and Mancktelow, 2005;Llorens et al., 2013a) and multilayers (Schmalholz and Schmid, 2012;Llorens et al., 2013b).A main result of the simple shear studies is that folding under bulk simple shear does not generate asymmetric fold shapes but more or less symmetric fold shapes similar to the ones generated under bulk pure shear (cf.Lister and Williams, 1983).Also, when layers rotate in a simpleshear zone they can be first shortened until the fold train is more or less orthogonal to the simple-shear zone.Further shear and rotation, however, extends the fold train, which can unfold the layers again (Llorens, et al., 2013b).Laboratory experiments of such single layer folding and unfolding under bulk simple shear have been already performed by Ramberg (1959).
For active folding, a continuous competent layer is actually not required.Adamuszek et al. (2013a) showed that it is sufficient for folding to have competent inclusions (that can be of various sizes) aligned and clustered in a way to form a "layer" of inclusion clusters.If this "layer" of inclusions is embedded in a weaker viscous medium then the layerparallel shortening also generates folding of the layer consisting of individual inclusions.Adamuszek et al. (2013a) ap-plied their results to a folded sequence of alternating nodular limestone and shale.
Numerical simulations of the extension of power-law viscoplastic (von Mises; Schmalholz and Maeder, 2012) and power-law viscous (Duretz and Schmalholz, 2015) multilayers embedded in weaker power-law viscous medium showed the formation of individual shear zones that crosscut the entire multilayer.The shear zones form after some period of distributed multilayer necking and only occur (i) when the weak inter-layers are power-law viscous and (ii) when the spacing between the competent layers is less than or approximately equal to the thickness of the competent layers.The shear zones crosscutting the entire multilayer form due to the alignment of individual necks in different layers, which is a finite amplitude effect when individual necking zones can form a connected network of weak zones.
The numerical studies mentioned above investigated fundamental mechanical folding and necking processes, but numerical solutions are also useful to study the coupling of folding and necking with other processes such as (i) the generation of heat during folding due to dissipative rock deformation (shear heating) and the related thermal softening caused by thermo-mechanical feed-back with temperaturedependent rock viscosity (Hobbs et al., 2007(Hobbs et al., , 2008;;Burg and Schmalholz, 2008), (ii) the conversion of macroscale mechanical work into microscale mechanical work during the reduction and growth of mineral grain size and related softening due to grain size reduction (Peters et al., 2015), (iii) the impact of metamorphic reactions on rock deformation (Hobbs et al., 2010), (iv) coupling of crustal folding or necking with erosion in 2-D (Burg and Podladchikov, 2000;Burov and Poliakov, 2001) and 3-D (Collignon et al., 2015) or (v) the coupling of folding with salt diapirism (Fernandez and Kaus, 2014).A detailed outline of a coupled thermodynamic approach to study rock deformation and the resulting structures is given in the recent textbook by Hobbs and Ord (2014).The impact of shear heating and grain size reduction on lithospheric folding and necking can be significant because both processes cause a mechanical softening of the rock (e.g.Regenauer-Lieb and Yuen, 1998;Regenauer-Lieb et al., 2006).For example, during shortening of the continental lithosphere, shear heating and thermal softening can cause a transition from distributed folding to localised ductile thrusting (Burg and Schmalholz, 2008;Schmalholz et al., 2009;Jaquet et al., 2016).
Numerical simulations are based on a basic set of partial differential equations resulting from the concepts of continuum mechanics.These equations are useful to describe continuous deformation and strain localisation by shear bands (with no loss of velocity continuity).Elaborated numerical algorithms based on continuum mechanics, the so-called extended finite element method or XFEM (Belytschko et al., 2001), are additionally able to model discontinuous fracture, for example due to 3-D folding (Jäger et al., 2008).In geological studies it is more common to apply so-called dis-crete element methods to study brittle deformation and fracturing.In simple words, these discrete models assume that a material consists of particles that are connected by elastic springs.The force balance is controlled by Newton's law (force equals mass times acceleration), which is an ODE (no spatial derivatives) and not a PDE.A fracture appears when the stress in a spring connecting two particles exceeds the yield strength and the spring connection between the two corresponding particles is then removed.Discrete element modelling has been applied, for example, to study fracturing during detachment folding (Hardy and Finch, 2005) or to study the evolution of brittle boudinage in 2-D and 3-D (Abe and Urai, 2012;Abe et al., 2013).
Recently, Adamuszek et al. (2016) developed the MATLAB © based software termed Folder, which can be used to numerically model folding and necking in power-law viscous single-and multilayers.Folder is freely available under http://geofolder.sourceforge.net.Folder also includes all relevant analytical solutions for the amplification rates for folding and necking.Figure 12 and 13 have been generated with the results of Folder.Folder is easy-to-use software with a user-friendly graphical interface.Indeed, 200 years after the analogue experiments of James Hall, any student or researcher in geology can now easily perform similar experiments on his/her personal computer.
5 Fundamental similarities and differences between folding and necking

Similarities
The stability analysis (Fletcher, 1974;Smith, 1975Smith, , 1977) ) can be used to study the initial, small amplitude stages of both folding and necking.Folding and necking result from the same type of mechanical instability.This instability causes initial geometrical perturbations on the layer interface to amplify with velocities that are faster than the velocities corresponding to the applied bulk deformation (e.g.pure shear).The dominant wavelength for folding and necking is identical for the same material parameters (Fig. 14).Amplification rates for folding and necking increase with increasing viscosity ratio and with increasing power-law stress exponent in both the layer and the embedding medium (Fig. 14).Folding and necking are processes that can take place in single and multilayers and can also act on all scales.For large-scale folding and necking, gravity decreases the intensity of the folding and necking instabilities.The impact of gravity on folding and necking is usually quantified by some kind of Argand number, which is the ratio of the gravitational stress to the layer-parallel stress driving compression or extension, respectively.
Folding and necking are both associated with structural softening (Fig. 18).The effective viscosity of a rock unit consisting of competent layers embedded in a weaker medium

Differences
During folding the layer thickness remains more or less constant and the shortening is compensated by a lateral deflection (Fig. 4).During necking the local variation in layer thickness is significant and the extension is compensated by localised thinning of the layer while the central axis of the layer remains more or less straight (Fig. 4).
In folding, a particular wavelength can be selected and "locked in" if the fold arc length does not vary significantly during fold amplification, which is the case for large viscosity ratios ( 50).During necking a wavelength cannot be "locked in" because the necking zone is continuously extending during bulk extension.
Shortening of viscous single-and multilayers generates folds.Different multilayer configurations and flow laws can generate a wide variety of fold shapes but the preferred deformation mode will always be folding.Extension of power-law viscous single layers can generate necking, for which there is essentially no simple shear deformation within the layer around the necking zone (i.e.plane sections remain plane).In contrast, during extension of power-law viscous, embedded multilayers individual shear zones can form, which crosscut the entire multilayer and hence generate a significant simple shear deformation within the multilayer.Therefore, the general deformation mode for shortening competent layers is independent of the single-or multilayer configuration, whereas for extension of competent layers the deformation mode can be dependent on this configuration.
The maximal amplification rates for folding and necking for the same material parameters are significantly different (Fig. 14).Amplification rates for folding are larger than the ones for necking for the same material parameters.While folding occurs for linear and power-law viscous rheologies, necking only occurs for power-law viscous rheologies.Since the amplification rates for necking are smaller than the ones for folding, significant necking (α d > 10) in rock requires higher viscosity ratios and/or higher power-law stress exponents than significant folding.Hence, the range of material parameters for significant necking is smaller than the parameter range for significant folding (see text below Eq. 31).
The yield stress for brittle fracture is typically described by a Mohr-Coulomb failure criterion, which is based on parameters determined by Byerlee (1978; Byerlee's law).These yield stresses are usually 4 times larger during compression than during extension (Sibson, 1974), due to the implicit development of dynamic over-and underpressure, respectively (Mancktelow, 1993(Mancktelow, , 2008)).Hence, layers under layer-parallel compression can deform viscously up to much larger stresses than layers under layer-parallel extension before fracture occurs.The available stress range for folding without fracturing is therefore much larger than the stress range for necking without fracturing.
The range of material parameters and of flow stresses for significant necking is significantly smaller than the corresponding range for folding, which may be the main reason why pinch-and-swell structure is less frequent in nature than folding, and also why brittle boudinage is more frequent than pinch-and-swell structures.

Some applications
The main direct applications of analytical and numerical solutions for folding are the estimation of (i) the bulk shortening that was necessary to generate an observed fold and (ii) the viscosity ratio during the formation of the observed fold (Sherwin and Chapple, 1968;Talbot, 1999;Hudleston and Treagus, 2010).Natural rock viscosities are commonly estimated using laboratory-derived flow laws but the extrapolation from laboratory (10 −4 -10 −6 s −1 ) to tectonic (10 −12 -10 −15 s −1 ) strain rates makes such estimates uncertain.Also, flow laws are mainly determined for single minerals or specific rock types, whereas natural, polymineralic rocks are usually more heterogeneous.Therefore, independent viscosity estimates based on, for example, analysis of isostatic rebound (e.g.Haskell, 1937), mullions (Kenis et al., 2004) or fold structures are useful to test viscosity estimates based on laboratory-derived flow laws (e.g.Karato, 2008).Observed single-layer fold geometries in folded veins of calcite, quartz and pegmatite on the millimetre to centimetre scale suggest that average effective viscosity ratios are be-tween 17 and 70 and power-law exponents of the layer are between 2 and 5 (Hudleston and Treagus, 2010).Schmalholz and Podladchikov (2001a) presented a diagram that enables estimation of the bulk shortening and viscosity ratio from measured values of A/L and H /L for single layer folds.Adamuszek et al. (2011) developed a MATLAB © based software, the fold geometry toolbox, which determines automatically the values of A/L and H /L from fold shapes digitised from fold photos.Yakovlev (2012 and references therein) also presented a method to estimate bulk shortening from fold shapes and further developed a method to reconstruct the tectonic evolution of folded regions, which he mainly applied to the Caucasus.A problem with the estimation of viscosity ratio from folds is that the same amplification rates and hence similar fold amplifications can be obtained for different combinations of viscosity ratio and power-law exponents (Fig. 14a).Lan and Hudleston (1995) presented a method to estimate the power-law exponent from observed fold shapes.Estimates of the viscosity ratios from fold shapes in combination with microstructural analysis have also been applied to estimate the stress levels during folding (Trepmann and Stöckhert, 2009).Trepmann and Stöckhert (2009) estimated that stresses in folded quartz veins in fine-grained high pressure-low temperature metamorphic greywackes of the Franciscan Subduction Complex at Pacheco Pass, California, to have been between 100 and 400 MPa.
Fold geometries can also be used to estimate the dominant folding mechanism.Schmalholz et al. (2002) distinguished three types of folding mechanism depending on the controlling material parameters: (i) matrix-controlled folding (controlled by viscosity ratio between layer and matrix), (ii) detachment folding (controlled by the thickness of the weak layer below a strong layer) and (iii) gravity folding (controlled by the ratio of gravity to viscous stress, namely the Argand number, Eq. 34).They presented a diagram that allows estimation of the dominant folding mechanism from the fold geometry alone.
Numerical simulations of necking have shown that during necking initially straight and vertical lines remain vertical and straight (Schmalholz et al., 2008).This feature justifies the application of thermo-kinematic models to lithospheric necking and the associated formation of sedimentary basins (e.g.McKenzie, 1978;Kooi et al., 1992).These thermo-kinematic models subdivide the lithosphere laterally into a series of vertical columns, whose independent thinning is quantified by thinning factors.Such models have been applied to reconstruct the thermo-tectonic history of extensional sedimentary basins, which is useful to evaluate the potential of hydrocarbon reservoirs.Two-dimensional thermokinematic models of lithospheric extension are significantly faster to compute than 2-D thermo-mechanical models and can hence be used efficiently in combination with automated inversion or optimisation methods (Poplavskii et al., 2001;White and Bellingham, 2002;Ruepke et al., 2008).
The mathematical solutions for folding and necking have also been used to assess the deformation style of the outer shell of the moons of Jupiter.Dombard and McKinnon (2001) investigated the grooved terrain of Ganymede and argued that the regular structural periodicity found in this grooved terrain could be the result of an extensional necking instability.Dombard and McKinnon (2006) also argued that topographic undulation, with a ca. 25 km wavelength, observed on Jupiter's icy moon Europe could be due to contractional folding.

Summary and conclusions
Significant progress has been made in understanding and quantifying the mechanical processes of folding and necking since the pioneering folding experiments of Hall and the pinch-and-swell observations of Ramsay (1866).The geometry and mechanical evolution of many fold trains can be explained by the dominant wavelength theory of Biot (1957) and Ramberg (1962) and its elaboration to power-law viscous rheology by Fletcher (1974) and Smith (1977).Folding and necking in viscous layers are the result of a hydrodynamic instability.Folding and necking are the preferred deformation modes because they minimise the mechanical work required to shorten or extend mechanically layered rock on all scales.The most important quantities to analyse folding and necking are the dominant wavelength and the corresponding maximal amplification rate.The two quantities allow the estimation of fundamental parameters relevant for folding and necking, such as the effective viscosity ratio or the Argand number, and also allow an evaluation of whether folding or necking instabilities are sufficient to generate observable fold or pinch-and-swell structures.
Folding and necking instabilities should in principle always be active when ductile, layered rocks are shortened or extended on all scales.However, observable folds and necks (pinch-and-swell structure) are usually only generated when the dimensionless amplification rate α > 10; i.e. when the dimensional amplification rate is more than an order of magnitude larger than the absolute value of the bulk deformation rate, Dxx .
Folds are more frequent in nature than pinch-and-swell structure because folding can occur in layered rock that deform according to viscous and power-law viscous flow laws while necking only occurs in rock with power-law viscous (or other non-linear) behaviour.For the same material parameters the amplification rates for folding are also larger than the ones for necking.Furthermore, stresses during folding (compression) can be significantly larger than stresses during necking (extension) before the rock fails by fracture.Hence, brittle boudinage is more frequently observed than continuous necking.
www.solid-earth.net/7/1417/2016/Solid Earth, 7, 1417-1465, 2016 Despite the vast literature on folding and necking there are still many open questions and challenges.For example, 3-D finite amplitude folding and necking in power-law viscous multilayers have not been investigated analytically and numerically in detail.Particular future challenges are to quantify the coupling of folding and necking with other processes acting during rock deformation, such as fracturing, shear heating, grain-size evolution, fluid flow and metamorphic reactions.The concept of continuum mechanics can provide the system of equations that describes these coupled processes and numerical algorithms will be able to solve these equations.However, these equations and related numerical simulations will include many parameters and one of the biggest challenges may be to determine the controlling parameters (e.g. via dimensional analysis) and to make the coupled thermodynamic processes comprehensible.In that sense, one of the main objectives for future research on folding and necking is encapsulated in the famous statement of J. Willard Gibbs quoted at the beginning of this review.

Code availability
This is a review article and hence most results and figures in this article have been taken or modified from already published articles, which are cited in this review.The topographic data used in Fig. 7a were obtained with the online tool Geocontext-Profiler (http://www.geocontext.org/publ/2010/04/profiler/en/).The results of Figs. 12 and 13 have been calculated with the free software Folder (https://sourceforge.net/projects/geofolder/). The numerical results presented in Figs. 4,16,18 and 19 were calculated with a finite element algorithm, which was developed by S. M. Schmalholz and has been described in, for example, Schmalholz et al. (2008) and Schmalholz and Schmid (2012).The numerical results for Large Amplitude Folding (LAF) in Figure 16 (magenta line) were kindly provided by Daniel W. Schmid, Department of Geosciences, University of Oslo.
Appendix A: Derivation of the thin-plate equation from the 2-D force balance equations Mathematical folding studies either use the thin-plate equation or the stability analysis, which is based on a stream function solution for the full 2-D force balance equations.We show here how the thin-plate equation can be derived from the full 2-D force balance equations, based on the derivation of a general extended thin-sheet equation by Medvedev and Podladchikov (1999a, b).The thin-plate equation for folding is essentially derived by vertical integration and approximation of the 2-D force balance equations.
The force balance equations in 2-D without gravity are where x, y, σ xx , σ yy and σ xy are the horizontal (layerparallel) coordinate, the vertical coordinate, the horizontal total stress, the vertical total stress and the shear stress, respectively.We apply the equilibrium equations to a layer whose width in the x direction is larger than its thickness in the y direction.The bottom boundary (Sb(x)) and the top boundary (St(x)) of the layer are described by continuous functions along the x direction.The stresses along the layer boundaries can be related to tractions on the top boundary T t = T tx , T ty and on the bottom boundary T b = T bx , T by .The tractions are related to the stress tensor for the top, σ | St , and bottom, σ | Sb , layer boundaries and to the outer unit normal vectors on both boundaries, n t and n b , by the Cauchy formula: The components of the outward unit normal vectors can be approximated for small slopes (i.e.dropping square root terms) on the layer boundaries by The components of the tractions at the top and bottom layer boundaries can then be expressed as

(A5)
Vertical integration of Eq. (A2) while changing the order of integration and differentiation by using the rules of differentiation of integrals with variable integration boundaries (Bronstein et al., 1997) yields The integral in the first term in Eq. (A7) can be written in different form using the rules of integration by parts Equation (A16) includes effects of both horizontal and vertical tractions on the layer boundaries and the only assumptions made so far are that slopes on the layer boundaries are small so that square root terms in Eq. (A4) are negligible.
The thin-plate approach of Biot (1961) assumes that only vertical tractions act on the layer boundaries, that horizontal tractions are negligible and that H = constant.Under these assumptions and using the terminology T ty + T by = q, Eq. (A16) reduces to As in the thin-plate approach, we assume also that the origin of the vertical coordinate is in the centre of the layer and use St(x) = H /2 and Sb(x) = −H /2.Equation (A17) then becomes + q = 0 (A18) Equation (A18) has already the basic form of the thin-plate equation with the three terms representing the bending moment due to flexure (left term), the moment due to compression (middle term) and the resistance of the embedding medium (right term).To arrive at the thin-plate equation for viscous folding we have to make assumptions about the stress distribution and the constitutive equations.We assume that stresses are viscous and that the horizontal total stress σ xx is composed of a constant layer-parallel stress due to a bulk shortening rate, σxx = 4η Dxx , and of a fibre (bending) stress σ xx , which is only related to the bending (flexure) of the layer but not to the compression.The fibre stress depends on the flexural strain rate which can be approximated using Eq. ( 7) so that σ xx = −4ηy∂ 3 A/∂t∂x 2 .The total horizontal stress can then be written as (Schmalholz et al., 2002) The separation of the total stress into a stress due to a bulk shortening rate and a stress due to flexure is similar to the separation of the stress into a basic state stress and a perturbed stress which is done in the stability analysis.Substituting Eq. (A.19) into Eq.(A.18) and evaluating the integrals yields The component of σxx vanishes by performing the integral in the bending moment (because the stress is multiplied by y in the left term in Eq.A18) while the component of σ xx vanishes by performing the integral in the middle term of Eq. (A18) (because there the stress is not multiplied by y).Hence, the bending moment is only controlled by flexural stresses while the moment due to compression is only controlled by the stress due to bulk shortening rate.Equation (A20) corresponds to the thin-plate Eq. ( 48) in Biot (1961) and has been here derived from the general 2-D force balance Eqs.(A1) and (A2).Using now q = −4ηkdA/dt (see Eq. B14 and text below) yields − ηH 3 3 Equation ( A21) is identical to the Eq.(4.8) used in Biot (1961; he used the symbol P instead of σxx ) to derive formula for the dominant wavelength.Alternative thin-plate equations for elastic material or for gravity as the resisting mechanism against folding can be derived by using constitutive equations for elastic material and/or expressing q by gravitational stresses in Eq. (A18).
Appendix B: Stream function approach and matrix resistance of viscous embedding medium An essential step in the derivation of the dominant wavelength solution for folding was the derivation of a correct term for the resistance of the viscous embedding medium, which depends not only on the amplitude, A, but also on the wavelength, L (see Eq. 8).The derivation below follows essentially the one in Turcotte and Schubert (1982).The constitutive equations for linear viscous fluids are (B3) The top of the above equations is the equation for the conservation of mass if density is constant.Taking the derivative with respect to y of the top equation in (B2), taking the derivative with respect to x of the bottom equation in (B2) and then subtracting both equations eliminates the pressure terms and yields The above equation contains two unknowns, the velocities v x and v y , and can hence not be solved.To solve this equation, the two velocities must be expressed by only one unknown to obtain one equation for one unknown.The two unknown velocities can be represented by the derivatives of a so-called stream function The result D yy (y = 0) = 0 indicates that there is no vertical deviatoric stress acting on the layer interface and hence that σ yy = −P .Incompressibility requires D yy = −D xx and hence there is also no horizontal deviatoric stress acting on the layer boundary.The resistance of the viscous medium corresponds hence to the resistance of a viscous fluid at rest.The v y (y = 0) must be equal to the time derivative of the deflection of the layer, v y (y = 0) = dA/dt.Hence, C 1 = (dA/dt) /k cos (kx) and the pressure at y = 0 is P (y = 0) = 2ηk dA dt = −σ yy . (B14) The value of P (y = 0) = −σ yy is identical at the top and bottom layer boundaries if we assume that the material above and below the layer are identical and that the deflection A is identical, which is guaranteed if a constant thickness of the layer is assumed.The vertical resistance of the matrix against folding therefore results from the pressure at the layer boundary (Eq.B14).The total resistance of the embedding medium above and below the layer is then q = 2P (y = 0) = 4ηkdA/dt with k = 2π/L, using the convention, opposite to Biot (1961), that q is positive when acting against the positive direction of the deflection A.

Figure 2 .
Figure 2. Natural single-layer folds (a, b) and pinch-and-swell structure (c, d).(a) Folded quartz vein in schists around Vale Figueiras, Portugal.(b) Folded quartz vein in sillimanite schists from Cap de Creus, Spain.(c) Extended calcite vein in finer grained calcite marble from the Doldenhorn nappe, Switzerland.(d) Extended quartz vein in grey calcite marble, Ugab region, northern Namibia.Photographs (a) to (c) by S. Schmalholz and (d) by N. Mancktelow.

Figure 3 .
Figure 3. (a) Fold observations in the European Alps around the Lake Uri in Switzerland.The panel shows a part of a larger sketch of Johann Scheuchzer, which was published by Antonio Vallisneri (1715; seeLuzzini, 2011;Vaccari, 2004).(b) Sketch of necking (pinch-and-swell structure) from Ramsay (1866).

Figure 4 .
Figure 4. Numerical simulations of folding and necking resulting from layer-parallel shortening and extension, respectively, of the same competent layer with the same initial lateral thickness variation (initial perturbation).The model was initially 20 times wider than the initial layer thickness and the medium above and below the layer was 5 times thicker than the layer.The initial perturbation was introduced as a reduction in the layer thickness by 10 %, with the width of this perturbation equal to the layer thickness.The layer interface at the two edges of the perturbation has been smoothed to avoid numerical inaccuracies.The geometries are the result of finite element simulations (performed for this study) with a reference viscosity ratio of 100, a power-law exponent of the layer of 10 and of the embedding medium of 3. The colours indicate the square root of the second invariant of the stress tensor σ I I = τ 2 xx + τ 2 xy

Figure 6 .
Figure 6.Outcrops showing the three-dimensional (3-D) geometry of folds.(a) Folded turbiditic sequence from Almograve, Portugal.(b) Folded turbiditic sequences with a large fold hinge plunging towards the viewer from Makran region, Iran.Photographs by S. Schmalholz.
(21)  and to scale D I I by the absolute value of the bulk shortening rate, Dxx ,

Figure 7 .
Figure 7. Lithospheric folding and necking.(a) Topography across Central Asia with data from Geocontext-Profiler.The thin solid line is the original data and the thick solid line shows the running average topography within a 150 km wide window.The fold-like topography has been interpreted as the result of lithospheric folding(Burov et al., 1993).(a) Crustal geometry across the Rockall and Porcupine basins modified afterMohn et al. (2014) andWelford et al. (2012).The thinned continental crust has been interpreted to be the result of necking.

Figure 8 .
Figure 8.(a) and (b) shows sketches of Galilei (1638), who studied the strength of beams under beam-orthogonal loading (a) and the tensile strength of columns (b).(c) shows a sketch of Euler (1744), who studied the so-called elastic curves (elastica) and the critical load for buckling of loaded columns.

Figure 9 .
Figure 9. Sketch illustrating the Euler-Bernoulli beam theory and the related thin-plate approximations.

Figure 10 .
Figure 10.Multilayer folds showing that the size of individual folds (quantified by distance between neighbouring hinges of the same type) is related to their layer thickness and that folds become systematically smaller as their layer thickness becomes thinner.Carbonates with silicate-rich layers belonging to the Jurassic El Quemado formation.The sample was found by Stéphane Leresche around the Mount Fitz Roy, southern Patagonia, and the photo was made by Yoann Jaquet.

Figure 11 .
Figure 11.Configuration for analytical folding (a) and necking (b) models with some basic equations.A is amplitude, H is layer thickness, L is wavelength, V 1,2 are horizontal boundary velocities, t is time, α is amplification rate and Dxx is the applied bulk rate of deformation.
2 I I = D 2 xx + D 2 xy , which controls the effective viscosity (Eq.21) is only sensitive to perturbations in normal strain rate D xx but not to shear strain rates D xy .Using the assumptions D xx ≈ Dxx + D xx and D xy ≈ Dxy + D xy in D 2 I I and performing a multiple Taylor expansion for small D xx and D xy yields D2 I I = D2 xx and D 2 I I = 2 Dxx D xx because Dxy = 0 for the applied pure shear basic state of deformation

Figure 12 .
Figure12.Dimensionless amplification rate, α, vs. wavelength to thickness ratio, L/H , for linear viscous layer and embedding medium (power-law exponent of layer and matrix n, n M = 1) and a viscosity ratio R = 15 (a) and R = 75 (b).Approximate results are based on the thin-plate approach (Eq.12) and exact results (for infinitesimal amplitudes) are based on the stability analysis (Eq.17).As an example, in (a) the dominant wavelength to thickness ratio, L d /H , and the corresponding maximal dimensionless amplification rate, α d , are also indicated for the thin-plate solution.

Figure 13 .
Figure 13.Dimensionless amplification rate, α, vs. wavelength to thickness ratio, L/H .All results have been calculated with the software Folder (Adamuszek et al., 2016) including the numerical results of finite element simulations indicated with red diamond symbols.(a) Layer and matrix are linear viscous (power-law exponent of layer and matrix n, n M = 1), the viscosity ratio R = 75 and the initial ratio of amplitude to layer thickness A 0 /H = 0.01.For such small initial amplitudes, the large amplitude folding (LAF) solution of Adamuszek et al. (2013b) is identical to the stability analysis.(b) like (a) but with A 0 /H = 0.02.For linear viscous folding the LAF solution can be used to calculate the amplification rate for large amplitudes.(c) Layer is power-law viscous and matrix is linear viscous (n = 6 and n M = 1), the viscosity ratio R = 75 and the initial ratio of amplitude to layer thickness A 0 /H = 0.01.(d) like (c) but with A 0 /H = 0.02.In (c) and (d) the insets in the top right show enlargements of the region with small values of L/H .

Figure 15 .
Figure 15.Sketch illustrating the linearisation of power-law flow law (black solid line indicates the exact flow law, τ xx = B(D xx ) 1/n ) by using basic state variables ( τxx , Dxx ) and perturbed variables ( τ xx , D xx ).The reference viscosity at the point Dxx , τxx for the basic state deformation, η R , is related to the secant (represented by the blue line) at that point, while the viscosity for the perturbed deformation is related to the tangent (represented by the red line) at the point Dxx , τxx .The slope of the tangent is a factor n smaller than the slope of the secant at the same point.

Figure 16 .
Figure 16.Evolution of fold geometry for an initial geometrical bell-shaped perturbation calculated with the analytical solution of Biot et al. (1961; Eq. 27; red line), the LAF solution of Adamuszek et al. (2013; magenta line) and a numerical finite element simulation (black layer).The parameters for the bell-shaped perturbation are a = 10 and b = 0.2 (initial layer thickness is 1), and the linear viscosity ratio is 75.Numbers in panels in % represent bulk shortening.(e) shows an enlargement of (d) (enlarged region is indicated by dashed rectangle).In (e) the yellow line corresponds to the case where arc-length shortening rather than bulk shortening is used in the solution ofBiot et al. (1961; Eq. 27), as discussed in the text.At lower values of shortening (a, b, c) this result does not differ significantly from that for the uncorrected Biot or LAF.
change of the arc length, namely Dxx = 1 S dS dt .The fold arc length can be related to the fold amplitude by

Figure 17 .
Figure17.Approximate finite amplitude solutions for folding (a; Eq. 29) and necking (b; Eq. 41).For folding the layer and embedding medium are linear viscous, while for necking the layer is power-law viscous and not embedded in a viscous medium (free layer or embedded in inviscid medium).(a) Ratio of amplitude to wavelength (A/L) vs. horizontal bulk shortening (ε in %) for a viscosity ratio (R) of 50 and 100.In Eq. (29) the ratio L 0 /L is used as a measure for shortening, which is related to the bulk shortening in % by ε = (1 − L/L 0 ) 100.The applied initial value is A 0 /L 0 = 0.001 for both viscosity ratios.Both the exponential and finite amplitude solution are plotted.The thin dashed-dotted lines indicate the breakdown of the exponential solution as quantified by Eq. (30).(b) Ratio of layer thickness to initial layer thickness (H /H 0 ) vs. the dimensionless time (t/t c ; see text below Eq. 41) for different values of the power-law exponent (n) of the layer.For n = 1 no necking instability is active and thinning is due to homogeneous pure shear thinning only.The time for complete thinning (H /H 0 = 0) is given by t/t c = 1/n (see text).

Figure 18 .
Figure 18.Structural softening during folding (black line) and necking (red line) of the simulations shown in Fig. 4. The effective viscosity is calculated by the ratio of the area-averaged square root of the second invariant of the deviatoric stress tensor, σ I I , to the absolute value of the bulk rate of deformation, Dxx , which was constant during the simulations.The effective viscosities are divided by the initial value of the effective viscosity and are plotted vs. the bulk shortening for folding and the bulk extension for necking.Structural softening starts earlier for folding than for necking but the structural softening for necking is more intense.

Figure 19 .
Figure19.Result of a numerical simulation (performed for this study) of linear viscous multilayer folding for a viscosity ratio of R = 75 between competent layers (black) and weak embedding medium (grey).The initial model width is 100 times the identical initial thickness of all the competent layers.All layer interfaces have a small initial random geometrical perturbation with amplitudes ∼ 1/20th of the layer thickness.Thin dark-grey lines are passive lines indicating the finite deformation.The lowest four layers are closely spaced (spacing 1.5 times the layer thickness) and form harmonic folds because the four layers cannot fold independently (b, c; numbers in % indicate bulk shortening).The two layers in the middle of the model have a spacing 7 times the layer thickness.These two layers do not form harmonic folds but they do influence each other, as can be seen from the passively folded marker lines between the two layers (b, c).The top layer develops single-layer folds that are effectively independent of the layer below, as shown by the fact that the thin passive lines between the top and second to the top layer remain almost straight.The results also show that amplification rates for multilayers are greater than for single layers, because fold amplitudes in the lowest four-layer stack are larger than amplitudes in the top single layer for the same bulk shortening (b, c).Fold wavelengths in the bottom four-layer stack are also larger than the ones in the top single layer.Spacing between the bottom layer and the model bottom and between the top layer and the model top is 16 times the layer thickness.Spacing between the fourth and fifth layer from the bottom and the top and second to the top layer is 20 times the layer thickness.The dominant wavelength to thickness ratio for single layers is ∼ 15 for R = 75.The spacing between the top two layers (i.e.20) is thus slightly larger than the dominant wavelength (i.e.15), so that the observed independent or "disharmonic" folding of these two layers agrees with the concept of contact strain (see text).

Figure 20 .
Figure 20.Impact of spacing of competent layers (H m ) in an embedded multilayer with constant spacing and layer thickness (H ) on the dominant wavelength of the multilayer (L dm ) in (a) and maximal amplification rate (α dm ) in (b), using the approximate solutions of Schmid and Podladchikov (2006; see also text).The results are non-dimensionalised by the dominant wavelength for corresponding single layer folding (L d of Eq. 13) and the corresponding maximal amplification rate (α d of Eq. 14).Results are shown for a viscosity ratio between competent and weak layers R = 100 and 10 competent layers (m = 10).The transition zone between the three folding modes (effective single layer, true multilayer or real single layer; see text) is controlled by H /L d and L d /H (indicated by vertical grey bars).L dm is always larger than L d when H m <∼ L d and the factor by which L dm is larger than L d depends only on m.In the real single layer mode L dm = L d because H m >∼ L d and individual layers in the multilayer can fold independent of each other (concept of contact strain; see text).Values of α dm =∼ α d for close spacing (effective single layer; H m /H <∼ H /L d ) and wide spacing (real single layer; H m >∼ L d ).For intermediate spacing (true multilayer) α dm =∼ m 2/3 α d .There is an upper limit for many layers (m L d /H ) given by α dm =∼ 0.83R 1/3 α d .The impact of multilayers on dominant wavelength and amplification rate can be seen in the numerical simulation displayed in Fig.19.

+
(y − A) σ xy St(x) − (y − A) σ xy Sb(x) .(A9) Integration by parts of two functions u and v can be generally expressed as σ xy represents u and y − A represents v.The y − A is the distance from the middle line of the layer, A, in the y direction and find a relation between the vertically integrated vertical gradient of the shear stress and the vertically integrated horizontal gradient of the normal stress, we multiply Eq. (A1) by y − A, integrate it vertically and apply the product rule of www.solid-earth.net/7/1417/2016/Solid Earth, 7, 1417-1465, 2016 S. M. Schmalholz and N. S. Mancktelow: Folding and necking across the scales − A)σ xx ) dy − ∂St (x) ∂x (y − A) σ xx | St(x) 2 = St (x) − A = − [Sb (x) − A].Substituting Eq. (A14) into Eq.(tx − T bx ) + T ty + T by = 0. (A15) Expanding the second term in Eq. (A15) and using Eq.(A− T bx ) + T ty + T by = 0. (A16) where P = − σ xx + σ yy /2 (pressure or negative mean stress) and v x and v y are the velocities in the x and y direction, respectively.Substituting Eq. (B1) into the 2-D equilibrium Eqs.(A1) and (A2), and assuming a constant viscosity yields has only one unknown, ϕ, and represents the force balance in a viscous medium with constant viscosity, and the stream function is a function of both x and y.Usually one assumes a periodic behaviour in the direction along the layer, i.e. the x direction, and writes the stream function as ϕ (x, y) = ψ (y) sin (kx) .(B7)Equation (B6) then becomesk 4 ψ sin (kx) − 2k 2 ∂ψ ∂y 2 sin (kx) + ∂ 4 ψ ∂y 4 sin (kx) = 0 ∂ 4 ψ ∂y 4 − 2k 2 ∂ψ ∂y 2 + k 4 ψ = 0. (B8) www.solid-earth.net/7/1417/2016/Solid Earth, 7, 1417-1465, 2016 S. M. Schmalholz and N. S. Mancktelow: Folding and necking across the scales The general solution of Eq. (B8) is ψ = C 1 e −ky + C 2 e −ky y + C 3 e ky + C 4 e ky y. (B9)Seeking a solution for a half-space where the velocities and stresses, and hence ψ, vanish at large distance y → ∞ one has to set coefficients C 3 and C 4 to zero because the corresponding exponents do not vanish for y → ∞.A solution for the stream function is then ϕ (x, y) = C 1 + C 2 y e −ky sin (kx) .(B10)Calculatingv x via Eq.(B10) from Eq. (B5) and assuming that v x = 0 at y = 0 (which represents the interface between layer boundary and embedding medium) provides C 2 = kC 1 .The velocities are then using Eqs.(B10) and (B5)v x = C 1 k 2 e −ky y sin (kx) v y = C 1 (1 + ky) ke −ky cos (kx) .(B11)Substituting v x from Eq. (B11) in the horizontal force balance (top of Eq.B2) and integrating with respect to x to solve for the pressure yieldsP = 2ηC 1 k 2 e −ky cos (kx) .(B12)Evaluation of v y and D yy = dv y /dy with Eq. (B11) at the interface between matrix and layer (y = 0) yields v y (y = 0) = C 1 k cos (kx) D yy (y = 0) = −C 1 k 3 e −ky cos (kx) y = 0. (B13)

Table 1 .
Frequently used symbols and their consistent meaning throughout the text.

Table 2 .
Approximate solutions for dominant wavelength and maximal amplification rates for folding and necking.
(see above).The multilayer is made of competent layers with equal thickness, H , and weak layers of equal thickness, H m (generally with H m = H ), alternating with the competent layers.Effective single layer folding occurs for H m /H < H /L d , true multilayer folding for H /L d < H m /H < L d /H and real single-layer folding for H m > L d , where L d is the dominant wavelength for a corresponding single layer as given in Eq. ( n xx0 with τ xx0 = F /2H 0 .The thinning of the layer, quantified by H /H 0 , with www.solid-earth.net/7/1417/2016/Solid Earth, 7, 1417-1465, 2016 progressive dimensionless time, t/t C , depends only on n. during shortening and extension with a constant bulk rate of deformation can be calculated by the ratio of the areaaveraged stress to the bulk rate of deformation.If the shortening and extension would be homogeneous pure shear at a constant rate and the layer would deform by homogeneous thickening and thinning, then the effective viscosity of the layered rock unit would remain constant.However, if folding or necking takes place the effective viscosity decreases during bulk shortening and extension, respectively, because the area-averaged stresses are smaller during folding and necking than during pure shear thickening and thinning.Related to the stress decrease is a decrease in viscous dissipation (i.e.stress times strain rate) and mechanical work rate (i.e.product of boundary stress and velocity integrated over the boundary of the rock unit).The structural softening related to folding and necking hence reduces the mechanical work required to deform the layered rock unit.Therefore, folding and necking are the preferred deformation modes of mechanically layered rock units because folding and necking minimises the work required for the deformation.During structural softening the material properties remain constant and for both linear and power-law viscous material the flow laws are always strain-rate hardening; i.e. the stress increases with increasing strain rate.Hence, structural softening is fundamentally different to material strain softening, where some material property (e.g.cohesion, friction angle or effective viscosity) decreases with progressive strain.