Interactive comment on “ Instaseis : instant global seismograms based on a broadband waveform database ”

Abstract. We present a new method and implementation (Instaseis) to store global Green's functions in a database which allows for near-instantaneous (on the order of milliseconds) extraction of arbitrary seismograms. Using the axisymmetric spectral element method (AxiSEM), the generation of these databases, based on reciprocity of the Green's functions, is very efficient and is approximately half as expensive as a single AxiSEM forward run. Thus, this enables the computation of full databases at half the cost of the computation of seismograms for a single source in the previous scheme and allows to compute databases at the highest frequencies globally observed. By storing the basis coefficients of the numerical scheme (Lagrange polynomials), the Green's functions are 4th order accurate in space and the spatial discretization respects discontinuities in the velocity model exactly. High-order temporal interpolation using Lanczos resampling allows to retrieve seismograms at any sampling rate. AxiSEM is easily adaptable to arbitrary spherically symmetric models of Earth as well as other planets. In this paper, we present the basic rationale and details of the method as well as benchmarks and illustrate a variety of applications. The code is open source and available with extensive documentation at www.instaseis.net .


Introduction
Despite the exponential growth of computational power and substantial progress of 3-D numerical methods for seismic wave propagation in the last 15 years (Igel et al., 2000;Komatitsch and Tromp, 2002a;Tromp, 2007;Tromp et al., 2010), the simulation of the highest frequencies observed in seismic waves on the global scale remains a highperformance computing challenge and is not yet done routinely.This is why many seismologists still rely on approximate methods to compute and analyze high-frequency body waves such as ray-theoretical travel times (e.g. the TauPtoolkit described in Crotwell et al., 1999), WKBJ synthetics (Chapman, 1978), the reflectivity method (Fuchs and Müller, 1971) or the frequency-wave number integration method (Kikuchi and Kanamori, 1982).More recently, several methods that include the full physics in solving the seismic wave equation while reaching the highest observable frequencies by assuming spherically symmetric models have become available, see Fig. 1 for an example.These methods include the direct solution method (DSM, Geller and Ohminato, 1994;Kawai et al., 2006), the frequency domain integration method (GEMINI, Friederich and Dalkolmo, 1995) and a generalization of it including self gravitation (Yspec, Al-Attar and Woodhouse, 2008).
As detailed by Nissen-Meyer et al. (2007b), the main drawback of these methods when applied to computing wave fields rather than single seismograms, is their scaling proportional to the number of points in space where the wave field is sampled.This motivated the development of a direct timedomain approach, where the displacement as a function of space and time is a natural field variable and only needs to be written to disk (Nissen-Meyer et al., 2007a, b, 2008).The implementation of this axisymmetric spectral element method AxiSEM was recently extended to include anisotropy and at- The displacement is color-coded analogous to the IRIS global stack (Astiz et al., 1996), i.e. red: transversal component, green: radial component, blue: vertical component.An automatic gain control (AGC) with a window of 100 s length is used to balance large amplitude variations between the various phases.Note that creating this plot does not require to define the source depth at the time of database calculation.A high-resolution version of this plot and one for 5 h long seismograms is added as an electronic supplement.
As computing full global waveforms especially at higher frequencies requires substantial computational resources, several initiatives serve to deliver waveforms by means of databases without having to run a full numerical solver.The ShakeMovie project (http://global.shakemovie.princeton.edu)provides synthetics for earthquakes from the CMT (Global Centroid-Moment-Tensor) catalogue (www.globalcmt.org)recorded at permanent Global Seismic Network (GSN) and International Federation of Digital Seismograph Networks (FDSN) stations in 1-D and 3-D velocity models (Tromp et al., 2010).The Pyrocko toolbox (http: //emolch.github.io/pyrocko)provides a Python interface to generate and access Green's function databases, which for the global case are based on GEMINI, several databases are offered for download.
In this paper we present a method that uses AxiSEM to generate global Green's function databases and provides a Python interface for convenient extraction of seismograms.The advantage over ShakeMovie synthetics are the possible higher frequencies and arbitrary source and receiver combinations independent of catalogues and real stations.Compared to Pyrocko with GEMINI synthetics, AxiSEM is more efficient in generating the databases, allowing to routinely compute them for a large number of different background models or specialized applications (e.g.limited depth/distance ranges).Also, by using the Lagrangian polynomials in the SEM (spectral element method) mesh as basis functions, it achieves higher spatial accuracy.
This paper is structured as follows.In Sect. 2 we present the technical aspects and argue for the choices made in the spatial and temporal discretization.Section 3 gives a short overview of the Python interface.In Sect. 4 we show the performance with respect to accuracy, speed and disk space requirements for the databases.Finally, we depict a variety of applications in Sect. 5.

Computing Green's functions with AxiSEM
AxiSEM was designed from the beginning with the application of computing global wave fields rather then single seismograms in mind (Nissen-Meyer et al., 2007b).This becomes apparent in the following main advantages in this application: it uses a 2-D discretization (Fig. 2), with an analytical decomposition of the 3-D wave field into several 2-D wave fields.For moment tensor sources, four 2-D wave fields are needed, for force sources, two.As it is a time domain method, the displacement field in space-time is a natural field variable of the numerical scheme and simply needs to be written to disk without any extra computational cost when larger regions of Earth are included in the database.AxiSEM uses a spectral element scheme for spatial discretization which lends itself well to parallelization on High Performance Computing (HPC) systems.As it is based on the weak formulation of the wave equation, it naturally includes the free surface boundary condition and allows for highly accurate modeling of surface waves.Nissen-Meyer et al. (2014) argued against using collective parallel I/O since the availability of the NetCDF libraries (Rew and Davis, 1990) was not granted on all supercomputers.For that reason, we implemented a round robin I/O scheme, which remains advantageous when running AxiSEM on less than about 100 cores in parallel and to avoid installation problems on systems where NetCDF is not available as a pre-compiled package.On supercomputers however, the situation has since improved and NetCDF compiled with parallel support seems now to be widespread.For this reason, we implemented a collective parallel I/O scheme that performs well, even when running on more than 1000 cores, see Table 1.In this scheme, all processes that have to write data to disk communicate via the message passing interface (MPI) and then write collectively at the same time to the parallel file system.This way we achieved throughputs of up to 4 GB s −1 on SuperMUC.

Forward and backward databases
Instaseis has the capability of dealing with forward wave fields, i.e. the waves are propagated from a moment-tensor point source at fixed depth (i.e.receivers exist throughout the medium), as well as backward or reciprocal wave fields, where the wave fields are propagated from a single-force point source at fixed depth and recorded throughout the medium (i.e.sources exist throughout the medium).
Potential applications of forward databases are the generation of 3-D wave-propagation movies (Holtzman et al., 2013), the computation of incoming teleseismic waves in 1-D/3-D hybrid methods (e.g.Monteiller et al., 2012;Masson et al., 2013) or the forward field in the computation of sensitivity kernels (Nissen-Meyer et al., 2007a) for seismic tomography.To generate a forward database, a total of four runs with AxiSEM are needed (Nissen-Meyer et al., 2007b).
In contrast, reciprocal databases utilize the reciprocity of the Green's functions, and are useful in all cases where the receivers are at fixed depth, thus for instance mimicking earthquake catalogues recorded at stations along the surface.The source can be located anywhere in the region where the Green's functions are recorded in the simulation, thus allowing for unlimited choices in the source-receiver geometry.To generate a reciprocal database, a total of two runs with AxiSEM are needed, one for the vertical component and one for both horizontal components of the seismogram (Nissen-Meyer et al., 2007b).It is also possible to compute a database for the vertical component seismograms only, which is then a factor of 3 faster and uses only about 40 % of the disk space.

The spatial scheme
For the spatial discretization we choose to keep the same basis as used in AxiSEM.The displacement u within each element is expanded in terms of Lagrangian polynomials l i (see Fig. 3) of order N defined on the integration points of the spectral element scheme (see Fig. 4): ξ and η are the reference coordinates of the element and N typically has a value of 4. This approach has several advantages.
-The wave field is represented by polynomials, typically of degree 4; interpolation is hence of 4th order accuracy.
-The basis is local and only few coefficients are needed to represent the wave field inside an element (typically 25), in contrast to e.g.global basis functions such as spherical harmonics.
-Discontinuities in the model that cause discontinuities in the strain Green's functions are respected by the mesh.
-The strain tensor (representing the moment tensor in the reciprocal case) can be computed on the fly from the stored displacements at high accuracy.This reduces the storage by a factor of 2 as the displacement has 3 degrees of freedom, compared to 6 for the strain.
-Since the displacement is continuous also at model discontinuities and element boundaries, it needs to be stored only once at all Gauss-Lobatto-Legendre (GLL) points that belong to multiple elements, reducing the storage by another factor of 16/25 = 0.64 (see Fig. 4).-Storing the displacement allows to use force sources as well without any extra computation or storage requirements.
Figure 5 visualizes the spatial representation for a long period mesh (50 s) for the Rayleigh wave train and the G rr,r component of the strain Green's tensor: the strain is smooth also across the doubling layer of the mesh where the background model (ak135f, Montagner and Kennett, 1996) is smooth as well.Still, the discontinuities of the model and hence the strain are explicitly represented by this discretization and the resolution of the mesh is adapted to the local wavelength, as for instance in the crust.Figure 6 shows an example for 2 s shortest period and compares the SEM discretization to regular depth sampling.In the regular sampling case with nearest neighbor interpolation, the phase and envelope errors can be quite large, especially close to the model discontinuities (up to 80 % envelope misfit and 4 % phase misfit as defined by Kristekova et al., 2009) and for very shallow sources (up to 40 % envelope misfit and 14 % phase misfit).

Finite element mapping
One performance-critical step in the spatial scheme is to find the reference coordinates (ξ, η) inside the spectral element that includes a point given in global coordinates (s, z).While  and Kennett, 1996) are indicated by solid lines and lead to discontinuities in G rr,r , which are exactly represented in the SEM basis.
the opposite mapping is trivial because this is how the elements of the SEM are defined (Nissen-Meyer et al., 2007a, Appendix A1), it cannot be generally inverted easily.Hua (1990) presents an analytical inverse solution for quadrilateral elements, which is quite involved and not easy to generalize for the semicircular elements used in AxiSEM.
We follow a two-step approach to finding the reference coordinates.First, we find the six closest element midpoints to limit the search to a small number of candidate elements in which the point could be.The number six is specific to the AxiSEM mesh, where each corner point can belong to a maximum of six elements in the doubling layers, see Fig. 7.This step can be seen as approximating the AxiSEM mesh with Voronoi cells.For most points, the closest midpoint will already indicate the correct element, in the worst case the second step has to be performed for all six candidates.Voronoi approximation (colored) of the AxiSEM mesh (black lines) using the midpoints of the elements (red circles) only, zoomed onto a doubling layer for a 50 s mesh.For most elements, the Voronoi cell coincides almost exactly with the AxiSEM element, note that most of the AxiSEM elements have edges of concentric circles while the edges of the Voronoi cells are all straight lines.In the worst case, six AxiSEM elements have to be tested whether a point is inside or not.
In a second step, the reference coordinates (ξ, η) of the given point (s p , z p ) are computed for the six candidate elements sorted by the distance of the midpoints.If both ξ and η are in the interval [−1, 1], the element is found.The coordinates (ξ, η) are computed using an iterative gradient scheme adopted from SPECFEM3D (Komatitsch and Tromp, 2002b).Starting from the midpoint of the candidate element, updated values are found by linear approximation of the inverse mapping: with the Jacobian matrix defined as and the mapping s(ξ, η) and z(ξ, η) depending on the element type as defined in Nissen-Meyer et al. (2007b).In the AxiSEM mesh, this iteration converges to numerical accuracy within less than 10 iterations and is not performance critical for Instaseis as it is only used on the few candidate elements.Also, this two-step approach requires only the midpoints of all elements in the mesh to be read from file on initialization and can be implemented efficiently using the kdtree provided by the SciPy package (http://www.scipy.org/).

The temporal scheme
The design of the temporal scheme is guided by a number of constraints on the spectrum of the source time function: the  spectrum should decay steep enough above the highest frequency resolved by the mesh, such that the least number of samples according to the Nyquist criterion can be used without introducing aliasing.On the other hand, it should not decay too steeply, such that it is still possible to deconvolve and convolve with another source time function.Additionally, the spectrum should be as flat as possible within the usable frequency range as well as "earthquake-like" without the ne- The vertical lines denote the resolution of the mesh and the Nyquist frequency of the downsampling using four samples per mesh period.
cessity of deconvolution when extracting a seismogram from the database.An actual delta function as would be required for true Green's functions cannot be represented in a discrete approximation as it is not bandlimited.We found a Gaussian source time function with σ = τ/3.5 to fulfill these requirements, where τ is the shortest period resolved by the mesh.Figure 11 shows the amplitude spectra of this source time function as well as a corresponding velocity seismogram at a distance of 40 • .The two spectra have a very similar general shape and decay to 10 −3 of the maximum at half the shortest period.This motivates that sampling with four samples per period will not introduce aliasing artifacts.
It is desirable to retrieve seismograms from the database with arbitrary time steps, which requires interpolation or resampling.Popular time domain schemes such as interpolation by low-order polynomials or splines do not work well close to the Nyquist frequency.On the other hand, frequency domain resampling by zero-padding the discrete Fourier transform of the signal can only resample to rational multiples of the original sampling interval.Finally, the kernel from the theoretically exact reconstruction according to the Nyquist-Shannon sampling theorem (i.e. the sinc function) has infinite support which renders it impractical as well (see  A Source and a Receiver object are created and then passed to the get_seismograms() method of an InstaseisDB object.This will extract the Green's functions from the databases and perform all necessary subsequent steps resulting in directly usable three-component seismograms in form of an ObsPy Stream object.Please refer to the Instaseis documentation for details.Burger and Burge, 2009, Sect. 10.3 for an extended introduction to interpolation).
Therefore, we adopt the Lanczos resampling scheme, which is popular in image processing, and an approximation to the sinc-resampling with finite support.The Lanczos kernel is defined as the sinc function multiplied by the Lanczos window function (Burger and Burge, 2009): where a is a parameter to control the number of samples to be used in the interpolation and the sinc function is defined as Interpolation is then performed by convolving the discrete signal s i with this kernel and evaluating it at the new time samples t j (Burger and Burge, 2009): where • denotes the floor function and t the sampling interval of the original signal.Figure 8 shows the Lanczos kernel for different values of a, Fig. 9 shows a practical example of resampling a seismogram.In Fig. 10 we test a number of values for a for the first 1800s of the same seismogram and we find a = 12 to be a reasonable compromise between cost (using 25 samples in the interpolation) and accuracy (RMS error of 0.03 %).

Python API
Instaseis is implemented as a library for the Python programming language with some performance critical parts written in Fortran.Furthermore it directly integrates with the ObsPy package (Megies et al., 2011;Beyreuther et al., 2010) and utilizes the Python bindings to NetCDF 4 (Rew and Davis, 1990).This enables it to take advantage of the strong scientific Python ecosystem built on top of the SciPy Stack (http: //www.scipy.org/).Reasons for choosing Python include its growing popularity in the sciences and it being easy to learn and use while still sufficiently powerful for complex scenarios.Python is open-source and particularly well suited for big data applications and the integration with web services and databases which suits the potential uses for Instaseis.
Figure 12 shows how to use the Python API in the most simple case.Instaseis provides an objectoriented interface: in addition to the shown Source and Receiver classes it furthermore provides ForceSource and FiniteSource objects.These can also be created by providing data in most commonly used file formats like Sta-tionXML, QuakeML, and Standard Rupture Format.Please refer to the Instaseis documentation for further details (www.instaseis.net).Combining and integrating these features enables the construction of modern and clean workflows to solve new problems.A big advantage of this approach is that no temporary files need to be created and the synthetic seismograms can be extracted from the databases on demand when and where they are needed.
The Python API furthermore implements a client/server approach for remote Instaseis database access over HTTP.This enables organizations to host high-frequency databases and serve them to users over the internet.This eliminates the need and upfront cost to calculate, store, and distribute Instaseis databases for most users while still offering enough performance for many use cases.The Python interface is datasource independent: from a usage perspective it does not matter if the databases are available locally or via the internet.
Instaseis is developed with a test-driven approach utilizing continuous integration, i.e. every change in the code is automatically tested for a number of different python version once committed to the repository.It is well documented, has a high test coverage, and we intend to maintain it for the next couple of years providing a solid foundation for future applications built on top of it.It is licensed under the Lesser GNU General Public License v3.0, the source code and issue tracker are hosted on GitHub.

Accuracy
As we already provided some rigorous validation comparing AxiSEM synthetics to a reference solution (Yspec Al-Attar and Woodhouse, 2008)  , where each point was weighted with the frequency f to ensure better fitting at the higher frequencies.The exponent is slightly smaller than the expected 3 because the zip compression is more efficient for longer time traces.At long periods, element sizes are governed by the layer thickness rather than the wavelength, resulting in the discrepancy from the power law at long periods.
curacy. Figure 13 shows a record section and some details for Instaseis, AxiSEM and Yspec seismograms computed in the anisotropic, visco-elastic PREM model for an event at 126 km depth beneath Tonga bandpass filtered to 50-2 s period.
While this figure is similar and the AxiSEM and Yspec reference data actually the same as presented in van Driel and Nissen-Meyer (2014b, Fig. 11), it is important to note that they were generated in very different ways: here we computed a whole Green's function database for all epicentral distances and down to 700 km source depth and changing source or receivers would cost a few milliseconds only.In our previous approach, this would have required a full new AxiSEM simulation on the order of 10 K CPU hours computational cost.Also, in contrast to van Driel and Nissen-Meyer (2014b), we used default mesh parameters for 2 s period and time step close to the stability limit of the 4th order symplectic time scheme (Nissen-Meyer et al., 2008).Still, the phase misfit (Kristekova et al., 2009) is well below 1 % in all zoom windows and the maximum of the envelope misfit is 2 % for the PPP phase on station ALE.
The fact that these traces are virtually indistinguishable for such a demanding setup of wave propagation over 800 wavelengths (waves at 2 s period traveling for 1600 s) verifies that the entire workflow of computing and querying the database are correctly implemented.In particular, numerical reciprocity (i.e. the different force and moment sources), onthe-fly calculation of the strain tensor as well as temporal  and spatial sampling have no significant adverse effect on accuracy, i.e. any remaining errors vanish within numerical accuracy of the forward solver AxiSEM.

Database size
One major constraint for computing a database beside the CPU cost is the permanent storage requirement.Here, we summarize the most important parameters and the related scaling of the required disk space.The amount of data scales with the third power of the highest frequency resolved by the mesh, but zip compression is slightly more efficient for longer traces, resulting in an empirical exponent of 2.7, see Fig. 14.Scaling with the length of the seismograms is slightly stronger than linear, again because the compression is more efficient on the zeros before the first P arrival.Scaling with depth and epicentral distance range is linear, where the prefactor for depth scaling is halved at each doubling layer of the mesh.The reciprocal databases for vertical (40 %) and hori-zontal (60 %) components are computed and therefore usable independently.
Several examples are shown in Fig. 14: for Earth, a complete reciprocal database including all three components, all epicentral distances and sources down to 700 km and 1 h of seismogram length accurate down to 2 s period, is about 1 TB in size.Calculating such a database once and storing it on a central server will give any user arbitrary and immediate access to short-period synthetic seismograms without any further cost.More specialized databases are possible: for example to study inner core phases for shallow events in an epicentral distance from 140 to 160 • , 200 GB storage suffices to store a database with a frequency of 2 Hz.

Performance
To evaluate the overall performance of Instaseis, two distinct parts have to be analyzed: first, the databases have to be generated with AxiSEM.Though very efficient, the database generation at short periods is a high-performance computing task.However, AxiSEM scales well on up to 10 000 cores such that global wave fields can be computed at the highest frequencies within hours on a supercomputer.Detailed performance and scaling tests of AxiSEM can be found in Nissen-Meyer et al. ( 2014), here we just show the total CPU time required to compute full databases (i.e.horizontal and vertical component) for 1 h long seismograms for two different time schemes (2nd order Newmark and 4th order symplectic Nissen-Meyer et al., 2008) and two planets (Earth and Mars) at a variety of resolutions, see Fig. 15.The general scaling of AxiSEM is proportional to T −3 , where T is the shortest period resolved by the mesh.The slight discrepancy from this power law at longer periods is due to the thin crustal layers causing a smaller global time step in the simulation.Simulations for Mars are approximately a factor of 5 faster than for Earth, due to the smaller radius.
The performance of the second part, the seismogram extraction, on the other hand is rarely limited by raw computing power.It scales linearly with increasing frequency of the databases' Green's functions and can easily be accomplished on a standard laptop.The limiting factor in most cases is the latency of the storage system, e.g. the time until it starts reading from the database.To alleviate this issue we implement a buffering strategy on the functions reading data from the files: the Green's functions from a whole element of the numerical grid are read once and cached in memory.If data from the same element is needed again at a later stage it will already be in memory, thus avoiding repeated disc access.Once the cache memory limit is reached, the data with the earliest last access time is deallocated, effectively resulting in a priority queue sorted by last access time.This optimization is very effective for most common use cases as they oftentimes require seismograms in a small range of epicentral distances and depths.
Instaseis comes with a number of integrated benchmarks to judge its performance for a certain database on a given system.The benchmarks emulate the computational requirements and data access patterns of some typical use cases like finite source simulations and source parameter inversions.Finite sources within the benchmarks are simulated by calculating waveforms for moment tensor sources on an imaginary fault plane along the equator ranging from the surface to a depth of 25 km.One source is calculated for each kilometer in depth until the bottom of the fault is reached.This is repeated each kilometer along the fault's surface trajectory until the benchmark terminates.A source parameter inversion is simulated by calculating seismograms from moment tensor sources randomly scattered within 50 km distance to a fixed point.Results for four runs are shown in Fig. 16.As is the case with all benchmarks they have to be interpreted carefully, nonetheless they demonstrate the behavior and performance characteristics of Instaseis on real machines.

Applications
In this section we depict several possible use cases of Instaseis.This list is not exhaustive and deliberately unconnected to provide a broad overview.

Graphical user interface
To prominently highlight the features and nearly instantaneous seismogram extraction for arbitrary source and receiver combinations of Instaseis, we developed a crossplatform graphical user interface (GUI), shown in Fig. 17.It ships with the standard Instaseis package and is written in PyQt, a Python wrapper for the Qt toolkit.
Most evidently, this may be used for visual inspection and verification of any given AxiSEM Green's function database.Instaseis' performance permits an immediate visual feedback to changing parameters.This also delivers quantitative insight for an intuitive understanding of the features and parameter sensitivities of seismograms.Examples of this are the polarity flips of first arrivals when crossing a moment tensor's nodal planes, the triplication of phases for shallow sources, the Hilbert transformed shape of reflected phases and the relative amplitude of surface waves (especially overtones) depending on the earthquake depth.Furthermore, the GUI allows the calculation of seismograms from finite sources and the exploration of waveform differences in comparison to best-fitting point sources.

IRIS web-interface
To enable usage of Instaseis seismograms to a broader community, we aim to remove all hurdles of computing and storing large databases locally.To this end, and in collaboration with IRIS, we plan to establish a web interface to the Instaseis databases.In contrast to the ShakeMovie ap-    .Computational cost to compute many synthetic seismograms for finite-frequency tomography with a shortest period of 2 s using different methods.For Yspec we assume that for every source there are 1000 receivers with 3 components each.The shaded regions for Instaseis indicate the dependence of the performance on the actual source receiver distribution, compare Fig. 16.Including the cost to generate the database with AxiSEM, Instaseis breaks even with Yspec for 14 000 waveforms, which is equivalent to about 5 sources in this configuration.
proach (Tromp et al., 2010), this interface will be able to handle arbitrary sources and receivers independent from catalogue data or other parameter limitations.The interface and databases will be described and benchmarked in detail in a separate publication, the status of this project can be viewed on http://ds.iris.edu/ds/products/ondemandsynthetics.

Finite-frequency tomography
In finite-frequency tomography (e.g.Nolet, 2008) information is extracted from recorded seismograms by matched filters in multiple frequency bands (Sigloch and Nolet, 2006;Colombi et al., 2014).A matched filter correlates a predicted signal with the measured signal to detect the predicted signal in the presence of noise.In the case of seismic tomography, a synthetic seismogram is necessary, which is usually created by convolving a Green's function with an estimated source-time function.For body waves, short periods down to 1 s are commonly used (e.g.Stähler et al., 2012;Hosseini and Sigloch, 2015).Typical data sets contain thousands of earthquakes (e.g.Auer et al., 2014), each recorded at hundreds of stations, resulting in up to a million waveforms.For each of these waveforms, a separate Green's function has to be calculated, which requires solving the seismic forward problem at the desired frequencies.For wave propagation methods that solve the forward problem separately for each event, computing these reference synthetics presents a formidable computational challenge, which is why previous studies resorted to approximate solutions like WKBJ (Chapman, 1978) or the reflectivity method (Fuchs and Müller, 1971) (Sigloch and Nolet, 2006).The waveforms are aligned by computing relative time-shifts between data and synthetic seismograms using cross-correlation (similar to actual finitefrequency tomography).
2008) is about an order of magnitude faster than AxiSEM in computing seismograms for a single source.However, at least in the current implementation, the cost scales linearly with the number of events.As Instaseis takes advantage of reciprocity of the Green's function, we can now build the whole database for all possible sources with only two runs of AxiSEM: one for the vertical and one for the horizontal components.Figure 18 compares the computational cost of computing the reference synthetics down to 2 s period assuming that each event was recorded at 1000 three-component stations.Ignoring the cost of computing the database, Instaseis is comparable in performance to WKBJ, but actually returns full seismograms including all phases, see Fig. 19.In contrast to WKBJ, where each crustal reverberation has to be defined separately, it automatically calculates the full crustal response.Also, it appropriately models diffracted phases such as Pdiff and triplicated phases from upper mantle discontinuities.If we include the database generation, Instaseis breaks even in computational cost with Yspec already at about 14 000 waveforms, i.e. five events with 1000 three-component stations each.At about 5 • 10 8 waveforms, the cost of extracting the seismograms from the database becomes dominant over the database generation.Assuming 2000 seismograms per event, this is equivalent to 10 000 earthquakes, i.e. in the order of available earthquake catalogues.However, generating seismograms with different source locations or moment-tensor radiation patterns, which is often necessary in tomography, does not require a new database generation.

Probabilistic source inversion
Uncertainties in source parameters have been shown to have a strong influence on waveform tomography (Valentine and Woodhouse, 2010).Probabilistic point source inversion estimates the uncertainties of source parameters and their correlation.From these, the effect on seismic tomography can be estimated (Stähler and Sigloch, 2014).It requires the repeated calculation of synthetic waveforms for varying moment tensors, depths and source time functions to calculate the likelihood and posterior probability density of models in a Bayesian sense.Changing source time function and moment tensor is extremely efficient from an Instaseis perspective, and the limitation to a fixed epicenter means that the I/O buffering can be done very efficiently, which is reflected in the Source Inversion test case in the benchmark (Fig. 13).
From a previous study (Stähler and Sigloch, 2014), we assume that for an inversion for depth, the moment tensor and the source time function, a 20-dimensional model space has to be sampled, which requires to perform roughly 60 000 forward simulations.Using 100 seismic stations and threecomponent seismograms, this means that roughly 1.8 • 10 7 waveforms have to be calculated for one source inversion, costing on the order of 50-100 CPU hours (Fig. 18).

Finite sources
Finite sources can be represented in Instaseis by a cloud of point sources without limitations on the fault geometry or source time functions.Each point source needs to be attached with a moment tensor, a slip rate function and a time shift relative to the origin time.These can for instance be retrieved from standard rupture format (*.srf) or subfault format (*.param) files as provided by the USGS for most events with M > 6.5.As a show case, we computed the seismograms for the source inversion validation (SIV) exercise #3 (http://equake-rc.info).The source is a M 7.8 strike-slip earthquake on the San Andreas Fault represented by ≈ 10 5 point sources, where each source has a different mechanism and slip rate function.The 52 stations are in 30 to 90 • epicentral distance (see Fig. 20), where the P wave arrival is supposed to be well separated (compare Fig. 1).Excluding the cost of generating the database, it cost a total of 12 CPU hours to compute the 52 1-hour-long three-component seismograms accurate down to 5 s.
Figure 21 compares the Instaseis seismograms to Pphases computed with the frequency-wave number integration method (fk) by Kikuchi and Kanamori (1982), where only direct and surface reflected phases where taken into account.While the first arriving waves agree to certain extent with Instaseis providing systematically larger amplitudes, there are significant differences for later time windows.These are due to additional phase arrivals within the time window (especially triplicated PP, compare Fig. 1) and crustal reverberations not modeled by the fk method.For events with long rupture durations as in this example (200 s) this suggests that more accurate waveforms should be beneficial for finite source inversions.

Insight/Mars
The upcoming NASA-lead Mars Insight mission (Banerdt, 2013), to be launched in March 2016 and scheduled to land September 2016, will deploy a single station with both a broad-band and a short-period seismometer on Mars.This will be the first extra-terrestrial seismic mission since the Apollo lunar landings (1969)(1970)(1971)(1972) and Mars Viking missions (1975) with the goal of elucidating the interior structure of a planet other than Earth.The instrument will record local, regional, and more distant marsquakes (including meteorite impacts) and send data back to Earth for analysis.
Our knowledge of the seismic structure of Mars is limited because of lack of resolution of currently available areophysical data (e.g., Khan and Connolly, 2008) and the limited sensitivity of the Viking seismometers due to their installation on board of the lander.For this reason, we will generate databases of "reference" seismic waveforms for a comprehensive collection (order of magnitude 1000) of 1-D Martian models to be used by modelers and analysts in preparation for the Insight mission.The models are constructed from cur-  21.Seismograms for the SIV benchmark, Z-component aligned on the P arrival band-pass filtered between 5 and 100 s period.The labels denote the station code and epicentral distance.In the frequency-wave number integration (fk, Kikuchi and Kanamori, 1982), only direct P and the depth phases were included, while Instaseis provides full seismograms, including PP, PcP and other phases.Especially for the stations in less than 40 • distance, the effect is profound, since PP arrives as a triplicated complex wave train only 70-100 s after P. Due to the long source duration, the PP arrival overlaps with the direct P wave train for several stations.rent areophysical data (mean mass, mean moment of inertia, tidal Love number, and tidal dissipation) and thermodynamic modeling methods and summarize our current understanding of the internal constitution of the planet.AxiSEM and hence Instaseis can readily be used to propagate waves on Mars, see Fig. 22, allowing us to build these databases very efficiently.

Synthetic ambient seismic noise
As mentioned in Sect.2.3, seismograms generated by force sources can be extracted from the same reciprocal databases.This is particularly interesting for studying ambient seismic noise.By cross-correlating noise recorded at two stations, using long enough time series and under certain assumptions (uncorrelated, isotropically distributed white noise sources), it is possible to retrieve the Green's function of the medium between the two stations (e.g.Sanchez-Sesma, 2006;Gouédard et al., 2008).However, not all of these assumptions are met in nature, e.g. the noise sources are not evenly distributed (Tsai, 2009;Froment et al., 2010;Basini et al., 2013).Also, the noise sources themselves are not yet well understood, especially with respect to the generation of Love waves in the microseismic band (Nishida et al., 2008).
Instaseis provides a basis to quickly generate noise synthetics to study such effects, which we illustrate in Fig. 23.We computed noise cross correlations, accurate down to a period of 5 s, for a total of 20 days of noise data generated with 100 000 noise sources.The calculation only took 1 CPU hour.In the first case, the noise sources consist of vertical forces with a random source-time function, all have the same amplitude and are distributed evenly on the globe.The resulting cross correlation is in good agreement with the Green's function, which is obtained by introducing an impulse source at each of the stations in Zurich and Munich.
In the second case, sources are located in the oceans only, their amplitude proportional to the significant wave height (Gualtieri et al., 2013).For the two stations located in Zurich and Munich, the close sources are thus solely located in the west, which leads to strong asymmetry in the retrieved correlations (Stehly et al., 2006).Instaseis thus enables users to study noise on the global scale across the microseismic band, by generating realistic waveforms at negligible cost.

Conclusions & Outlook
In this paper we presented a readily available methodology and code to extract seismograms for spherical earth models from a Green's function database.High efficiency in the generation of databases and very fast extraction (on the order of milliseconds per seismogram) of highly accurate seismograms (indistinguishable from conventional forward solvers) can then replace previously employed approximations such as WKBJ, reflectivity or frequency-wave number integration methods that were used for computational reasons in many applications of global seismology.Instaseis is open source and available with extensive documentation at www.instaseis.net.Future developments include Cartesian local domains with layered models, which are not yet supported by AxiSEM.As a large fraction of earthquakes are located below oceans and receivers on continents, it may be beneficial for body waves studies to take advantage of the axisymmetric capability of AxiSEM and place the receiver on a circular "island" of continental crust within a global oceanic crustal model.

Figure 1 .
Figure1.Global stack of 1 h of seismograms accurate to a shortest period of 2 s for an earthquake in 27 km depth computed with Instaseis.The displacement is color-coded analogous to the IRIS global stack(Astiz et al., 1996), i.e. red: transversal component, green: radial component, blue: vertical component.An automatic gain control (AGC) with a window of 100 s length is used to balance large amplitude variations between the various phases.Note that creating this plot does not require to define the source depth at the time of database calculation.A high-resolution version of this plot and one for 5 h long seismograms is added as an electronic supplement.

Figure 2 .
Figure 2. The 3-D wave field is decomposed analytically into monopole, dipole and quadrupole radiation patterns (left) and the remaining 2-D problem is solved on a D-shaped domain (right) using the spectral element method.While the forward databases require a total of four 2-D computations, it is only two for the backward databases using reciprocity of the Green's function: one for the vertical and one for the horizontal components (modified from Nissen-Meyer et al., 2014).

Figure 3 .
Figure3.Lagrangian basis polynomials l n (ξ ) of fourth order in one dimension.At the collocation points, all but one are zero, such that the value of the interpolated function at this point coincides with the coefficient in this basis expansion.

Figure 4 .
Figure 4. Lagrange interpolation points inside an element (gray) and its neighbors.Coordinates ξ and η are the reference coordinates of the gray element.Points on the edges (black squares) are shared between neighbors and function values at these points need to be stored only once if the function is continuous (e.g.displacement).The number of global degrees of freedom per element of such functions is thus approximately 16 compared to 25 for discontinuous functions (e.g.strain).

Figure 5 .
Figure 5. Snapshot of one component of the Green's tensor (G rr,r ) as represented in the SEM basis for a shortest period of 50s.Discontinuities such as caused by the crustal layers are exactly represented and the wave field is smooth across doubling layers of the mesh.

Figure 6 .
Figure 6.One component of the strain Green's tensor (G rr,r ) for a distance of 30 • as a function of time and depth with a shortest period of 2 s.(a) SEM basis vs.(b) regular sampling with 1 km distance and (c) phase and envelope misfits (EM and PM in the legend, see Kristekova et al., 2009) caused by the regular sampling computed in the period range 1-20 s.Dashed lines in the left panel sketch the spectral elements.The crustal discontinuities of ak135f(Montagner and Kennett, 1996) are indicated by solid lines and lead to discontinuities in G rr,r , which are exactly represented in the SEM basis.

Figure 7 .
Figure7.Voronoi approximation (colored) of the AxiSEM mesh (black lines) using the midpoints of the elements (red circles) only, zoomed onto a doubling layer for a 50 s mesh.For most elements, the Voronoi cell coincides almost exactly with the AxiSEM element, note that most of the AxiSEM elements have edges of concentric circles while the edges of the Voronoi cells are all straight lines.In the worst case, six AxiSEM elements have to be tested whether a point is inside or not.

Figure 8 .Figure 9 .Figure 10 .
Figure8.Lanczos kernels used for resampling.For large values of the parameter a, it converges towards the sinc function, which is the kernel that allows exact reconstruction for bandlimited signals as stated in the Nyquist sampling theorem(Nyquist, 1928).

Figure 11 .
Figure 11.Normalized amplitude spectra of the Gaussian source time function (slip rate) used at 2 s mesh period and a vertical component synthetic seismogram recorded at 40 • epicentral distance.The vertical lines denote the resolution of the mesh and the Nyquist frequency of the downsampling using four samples per mesh period.

Figure 12 .
Figure12.The Instaseis Python API demonstrated in a short interactive Python session.A Source and a Receiver object are created and then passed to the get_seismograms() method of an InstaseisDB object.This will extract the Green's functions from the databases and perform all necessary subsequent steps resulting in directly usable three-component seismograms in form of an ObsPy Stream object.Please refer to the Instaseis documentation for details.

Figure 13 .Figure 14 .
Figure13.Comparison of vertical displacement seismograms (bandpass filtered from 50 to 2 s period) for a moment magnitude M w = 5.0 event in 126 km depth under the Tonga Islands, computed with Instaseis, AxiSEM and Yspec in the anisotropic PREM model without ocean but including attenuation.The traces are recorded at the GSN stations indicated in the map.The zoom windows are depicted with gray rectangles in the record section and the time scale is relative to the ray-theoretical arrival.EM and PM(Kristekova et al., 2009) denote the envelope and phase misfit between Instaseis and Yspec traces in the corresponding time window.

Figure 15 .
Figure 15.Computational cost in CPU hours (measured on Monte Rosa: a Cray XE6 for Earth and Piz Daint, a Cray XC30 for Mars) to generate full Instaseis databases with 1 h long seismograms for two time schemes: 2nd order Newmark and 4th order symplectic.

Figure 16 .
Figure 16.Results of benchmarks for four typical use cases run on different hardware with a variety of shortest periods.The graphs show the inverse time for the calculation of the ith three-component seismogram.Each run calculated 1000 three-component seismograms, is repeated 10 times with the same random seed, the top and bottom values are discarded, and the mean of the remaining eight values is plotted.The CPU and I/O bound scenarios illustrate the speed with a fully efficient and a deactivated cache, respectively.The two bottom scenarios emulate real use cases, see the main text for details.Amongst other things they show the consequence of a too small cache in the source inversion scenario for the 2 s run and the efficiency of the buffer in the finite source scenario for the same database.

Figure 17 .
Figure 17.Screenshot of the Instaseis graphical user interface (GUI).Aside from quickly exploring the characteristics of a given Green's function database it is a great tool for understanding and teaching many features of seismograms.The speed of Instaseis enables an immediate visual response to changing source and receiver parameters.The left-hand side shows three-component seismograms where theoretical arrival times of various seismic phases are overlaid as vertical lines.The bar at the top is used to change filter and resampling settings and the section on the right side is used to modify source and receiver parameters.
Figure18.Computational cost to compute many synthetic seismograms for finite-frequency tomography with a shortest period of 2 s using different methods.For Yspec we assume that for every source there are 1000 receivers with 3 components each.The shaded regions for Instaseis indicate the dependence of the performance on the actual source receiver distribution, compare Fig.16.Including the cost to generate the database with AxiSEM, Instaseis breaks even with Yspec for 14 000 waveforms, which is equivalent to about 5 sources in this configuration.

Figure 20 .
Figure 20.Stations used in the source inversion validation (SIV) exercise.Circles mark 30, 60 and 90 • epicentral distance.The finite source is a M 7.8 strike-slip earthquake in southern California represented by ≈ 10 5 point sources, the beach ball represents the centroid moment tensor, i.e. the orientation and predominant direction of slip of the overall fault.

Figure 22 .
Figure 22.Seismic waves traveling in Mars after a meteorite impact at its north pole computed with AxiSEM.P-waves are shown in blue and S-waves and surface waves in red.

Figure 23 .
Figure 23.Synthetic ambient seismic noise cross correlations computed with Instaseis.Left: 100 000 vertical force sources located in the oceans and amplitude proportional to the significant wave height from the NOAA WAVEWATCH III model on 3 January 2015 (Tolman, 2009).Red crosses indicate the receivers located in Munich and Zurich.Right: cross correlations of 20 days of noise for (top) evenly distributed noise sources and (bottom) the sources in the map, the traces are normalized to their maximum amplitude.

Table 1 .
I/O performance for a typical setup of AxiSEM on Super-MUC.The simulation parameters were as follows: 2 s shortest period, 3600 s simulation length, model: ak135f, vertical component, maximum source depth 700 km.The resulting uncompressed wave field file has a size of 675 GB.The I/O throughput is not affected much by the number of CPUs involved.The throughput between different runs varies, which is probably caused by the changing I/O load on the system.