Beam-hardening correction by a surface fitting and phase classification by a least square support vector machine approach for tomography images of geological samples



Introduction
Advances in the technological (image resolution) and computational (image size) aspects of X-ray computed microtomography (µXCT) now enable the acquisition of three-dimensional (3-D) images down to a sub-micron spatial resolution, which is sufficient to capture the microstructure of geological rock cores (Cnudde and Boone, 2013). Recent research on digital rock physics has successfully combined microscopic imaging with advanced numerical simulations of physical properties for which laboratory measurements are not possible. However, benchmarking tests of commonly used image processing methods revealed unacceptably large variations in the results, and further development and optimization is therefore clearly warranted (e.g., Andrä et al., 2013). Furthermore, Leu et al. (2014) determined the importance of image analysis of µXCT-generated data to provide accurate parameterization of porosity-permeability relationships. It turned out that this relationship is highly sensitive to accurate processing of the pore microstructure images used. This accuracy is of great importance in pedohydrology studies of water flux in the critical zone (e.g., Khan et al., 2012; Kumahor et al., 2015), reactive transport modelling (e.g., Schwarz and Enzmann, 2013; Molins et al., 2014), distributions of multi-component fluids (e.g., Berg et al., 2013, 2015), contaminant retardation (e.g., Huber et al., 2012), energy-related activities such as nuclear waste disposal (e.g., Hemes et al., 2015), and the geological sequestration of CO2 (e.g., Sell et al., 2012; Herring et al., 2015), to name just a few examples.
The research aim of image classification is to obtain representations of structures that can be used automatically to categorize samples into a finite set of labels (i.e., phases in geological materials). With recent advances in computer performance and automated algorithms, classical machine learning provides a methodology for (non-)linear function estimation and classification problems (Vapnik, 1995). In general, the supervised machine learning approach involves the construction of a convincing model of the distribution of class labels in terms of predictor features for the whole image on the basis of a reduced example data set (i.e., a training stage) or past experience (Alpaydin, 2004; Kotsiantis, 2007). The resulting classifier is then used to optimize a performance criterion and assign class labels to the testing data. Good generalization is an important property of a classifier or classification algorithm, because it determines performance on as yet unknown data. The Support Vector Machine (SVM) algorithm, one capable example of such classifiers, was first developed by Vapnik (1995) as an extension of the Generalized Portrait algorithm. The SVM algorithm is firmly grounded in statistical learning theory, which improves the generalization ability of learning machines for unknown data.
The success of any classification technique depends on the quality of the µXCT images used as input and, therefore, a less noisy and artefact-free image is always desirable. However, in reality, this is not the case when using bench-top laboratory µXCT technology. In polychromatic X-ray tomography, the average energy of the X-ray beam increases as the beam propagates through a geomaterial, because low-energy X-ray photons are preferentially attenuated. This results in a beam spectrum successively depleted at lower energies as it passes through thicker parts of the geomaterial.
Consequently, the spectrum of the X-ray beam hardens and becomes less attenuated further into the material. This effect is called beam hardening (BH) and implies that the grey levels of the projection data are non-linear with respect to the shape of a cross-sectional linear profile. Consequently, the reconstructed µXCT image features visual distortions such as pronounced edges ("cupping effect") and other artefacts (Schlüter et al., 2014). The cupping artefact shows an apparently higher attenuation coefficient at the outer regions than in the inner part of a round rock core, even if both regions are composed of the same mineral. The presence of BH artefacts therefore causes problems in 3-D image processing and hampers correct image analysis and phase quantification due to a biased multiphase image segmentation process. However, such artefacts apply to polychromatic X-ray sources only and, more recently, monochromatic synchrotron-based µXCT has been introduced as a powerful tool for effective BH-free visualization of the microstructural features of geomaterials at a voxel resolution down to the sub-micron level (Fusseis et al., 2014; Leu et al., 2014). However, access to this advanced technique is limited by experimental sites and available beamtime. Therefore, most µXCT studies must rely on bench-top devices, with the artefacts mentioned above.
A variety of both hardware and software measures have been developed to eliminate BH artefacts. These include physical pre-filtering to pre-harden the X-ray photon spectrum, the dual-energy approach, and a variety of computational pre- and post-processing image corrections (Schlüter et al., 2014). A major prerequisite for success with the latter software approaches is that the correction method should not rely on any prior knowledge of the material properties, i.e., it should not depend on known attenuation coefficients. Commonly, there is no prior quantitative information available on the number and distribution of phases present in a geological sample. The most common technique for BH correction in pre-processing is linearization, but this is suitable mainly for monophasic materials. Although the most commonly used algorithm for the reconstruction of µXCT data is based on filtered back-projection (Feldkamp et al., 1984), an iterative forward projection can be used with modern imaging software such as Octopus, which allows the incorporation of BH modelling algorithms (Brabant et al., 2012). In this alternative reconstruction approach, the attenuation coefficients are not simply added but multiplied by factors that simulate BH depending on the attenuation accumulated over the distance the beam has penetrated through the sample. This pre-processing correction thereby minimizes the underestimation of the attenuation coefficient as the beam progresses through the sample. However, a major drawback of this promising method is that there is currently no way to determine the two necessary iterative parameters α and β automatically (Brabant et al., 2012), so the manually adjusted output remains highly subjective.
In principle, the idea of a surface fitting approach for BH correction in µXCT image post-processing has been introduced earlier (Krumm et al., 2008; Iassonov et al., 2010; Jovanović et al., 2013), however without a detailed outline of how it can be realized in practice. Therefore, a quite simple algorithm is suggested here that fits a 2-D quadratic polynomial function for accurate removal of BH artefacts from classically filtered back-projection reconstructed slices. Our novel BH-correction algorithm is followed by a pixel-based phase classification introducing the machine learning approach.
2 Material and methods

Micro-tomography
The custom-built µXCT scanner used in our laboratory (ProCon CT-Alpha, Germany) is equipped with a microfocus X-ray tube (Feinfocus, Germany) and contains a diamond-coated anode target with a focal spot size of a few µm. X-ray data acquisition is performed with a 2048 × 2048 pixel ("2k") flat-panel CCD detector of size 105 mm × 105 mm (Hamamatsu, Japan). The geological test object was a cylindrical evaporite rock core 30 mm in diameter composed of an anhydrite and clay mineral matrix with halite-sealed veins. The X-ray source voltage was set at 130 kV, and the beam was slightly pre-hardened with 0.15 mm silver foil. A rotation step of 0.45° with one-second exposure time corresponds to 800 projections for full 360° data acquisition at a spatial resolution of 42 µm. Precise centro-symmetrical alignment of the cylinder along the vertical axis is an important prerequisite for success with the BH correction procedure.
The workflow of our post-reconstruction image processing approach, including BH correction and the novel segmentation approach, is illustrated in Fig. 2. The reconstruction of the 3-D data set was performed on the fly by the Feldkamp filtered back-projection algorithm (Feldkamp et al., 1984). This classical 3-D cone-beam reconstruction algorithm follows three main steps: (i) pre-weighting the projection rays according to their position within the beam cone; (ii) filtering the projections along horizontal detector lines using a discrete filtering kernel; and (iii) performing a weighted back-projection of the filtered projections along the cone. Raw projections were corrected for dark current and flat-field variations, followed by non-local means filtering, ring removal, and filtered back-projection reconstruction using the imaging software package Octopus (https://insidematters.eu/octopus; Vlassenbroeck et al., 2007). The µXCT images are rarely perfect representations of the attenuation coefficients, because they are also biased by scatter and noise. Therefore, image denoising was additionally performed as an initial correction operation. A suitable smoothing filter should reduce the noise level with minimal alteration of edge features in the image. We applied a 3-D median filter with a 3 × 3 × 3 voxel window mask, which replaces a single pixel value with the median of its neighbourhood. The median filter smooths noisy regions while preserving boundary structures (Gallagher et al., 1981), and is routinely applied to µXCT images (Culligan et al., 2004; Kaestner et al., 2008; Khan et al., 2012; Sell et al., 2013; Landry et al., 2014; Herring et al., 2015). After reconstruction of the raw data, a 3-D digital image of dimensions x, y, z = 1417 × 1417 × 900 voxels was generated (Fig. 3), but a reduced image of only 450 voxels in the z-direction was ultimately used as a reference for the image correction and classification processing.
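The 3-D median filtering step can be sketched in a few lines. The helper below is a hypothetical NumPy implementation (a 3 × 3 × 3 window with replicated border voxels is assumed), not the routine actually used in our processing chain:

```python
import numpy as np

def median3d(vol):
    """Replace each voxel with the median of its 3x3x3 neighbourhood."""
    pad = np.pad(vol, 1, mode='edge')   # replicate border voxels
    out = np.empty_like(vol)
    nx, ny, nz = vol.shape
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                out[i, j, k] = np.median(pad[i:i + 3, j:j + 3, k:k + 3])
    return out
```

For full "2k" volumes a vectorized filter (e.g., scipy.ndimage.median_filter) would be used in practice; the explicit loop above is for illustration only.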

Mathematical basis of the surface fitting algorithm
Our post-reconstruction method corrects the BH artefact by fitting a 2-D polynomial, i.e., a quadratic surface, to the reconstructed µXCT image data (2-D slice). The surface fitting (i.e., second-order polynomial) approach has a mathematical expression of the form

f(x, y) = a1 + a2 x + a3 y + a4 x² + a5 xy + a6 y²  (1)

for some choice of unknown coefficients a1, a2, ..., a6. The solution for all a coefficients determines the best fit of the polynomial of Eq. (1) to a given set of data points (reconstructed grey-scale values). The final BH-corrected image is the residual of the data points, i.e., the difference between the surface elevation values and the original image values. Consider fk at (xk, yk), k = 1, 2, ..., N, as arbitrary data points on the 2-D slice (µXCT image), and collect the polynomial terms of each point into the rows of a design matrix A, with Ak = [1, xk, yk, xk², xk yk, yk²]. The normal equations for fitting the polynomial (Eq. 1) can then be expressed in matrix-vector form:

(AᵀA) a = Aᵀf.  (2)

Equation (2) can be solved to yield the solution vector a by

a = (AᵀA)⁻¹ Aᵀf.  (3)

The solution of Eq. (3) for the vector a determines the best fit of the polynomial of Eq. (1) to a given set of data points. The Matlab code of this surface-fitting approach for BH correction is listed in the Appendix.
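Equations (1)-(3) can be sketched in Python/NumPy as follows (the paper's own Matlab version is listed in the Appendix; function and variable names here are illustrative only):

```python
import numpy as np

def bh_correct_slice(img):
    """Fit the quadratic surface of Eq. (1) to a 2-D slice by least
    squares and return the residual (BH-corrected) image."""
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    x = xx.ravel().astype(float)
    y = yy.ravel().astype(float)
    f = img.ravel().astype(float)
    # Design matrix A: one column per term of Eq. (1)
    A = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])
    # Least-squares solution of Eq. (3); lstsq avoids forming (A^T A)^-1
    a, *_ = np.linalg.lstsq(A, f, rcond=None)
    surface = A @ a
    return (f - surface).reshape(h, w)
```

Applied to a slice with a cupping artefact, the fitted surface absorbs the smooth parabolic trend while the residual retains the phase contrast.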

Mathematical basis of the LS-SVM classification algorithm
Once BH correction of a µXCT image by surface fitting has been accomplished, a pixel-based multi-class classification can be performed efficiently by supervised machine learning of the least squares support vector machine (LS-SVM) type. The LS-SVM method is derived from non-linear support vector machines (Suykens et al., 1999).
The NL-SVM method maps the input vector into a high-dimensional feature space by a non-linear mapping associated with a kernel function (the so-called "kernel trick"; refer to Eqs. 14 and 15). The aim is to construct the optimal separating hyperplane, also known as the maximum-margin hyperplane, in the higher-dimensional feature space. The idea of the maximum-margin hyperplane comes from statistical learning theory and provides a probabilistic test error bound that is minimized when the margin is maximized (see the graphical representation of the NL-SVM, Fig. 1). The parameters of the maximum-margin hyperplane are derived by solving a quadratic programming (QP) optimization problem. Suykens and co-workers (2002) proposed modifying Vapnik's SVM formulation by adding a least squares term to the cost function, which transforms the problem from solving a QP problem to the practically more convenient solution of a set of linear equations. This modification significantly reduces the complexity and thus the computational cost, which may otherwise become excessive.
The basic mathematical formulation of the non-linear SVM classifier and its least squares version is presented here only in brief. For further details, please refer to the classical literature (Vapnik, 1995, 1999; Chapelle et al., 1999; Suykens et al., 1999). For classification problems, let {yi, xi}, i = 1, ..., N, be a given training set of N data points (here, each point is an image pixel value), where xi ∈ ℝⁿ is the i-th input in n-dimensional vector space and yi is the associated output class label such that yi ∈ {−1, +1}. Let ϕ(xi): ℝⁿ → ℝ^(nh) represent a mapping (linear or non-linear) to a high-dimensional feature space. The classifier takes the form

y(x) = sign[wᵀϕ(x) + b]  (4)

constructed such that

wᵀϕ(xi) + b ≥ +1 if yi = +1, and wᵀϕ(xi) + b ≤ −1 if yi = −1,  (5)

which is equivalent to

yi [wᵀϕ(xi) + b] ≥ 1, i = 1, ..., N,  (6)

where w is an adjustable weight vector parameter and b ∈ ℝ is a bias term. A slack variable ξi ≥ 0 is introduced in the case of violation of Eq. (6).
In real data classification problems, a perfect linear separation is impossible due to overlapping classes. Therefore, a limited number of misclassifications should be tolerated around the margin. In the LS-SVM for function estimation, the following optimization problem is formulated:

min J_p(w, e) = (1/2) wᵀw + (γ/2) Σ_{i=1..N} ei²  (7)

subject to the equality constraints

yi [wᵀϕ(xi) + b] = 1 − ei, i = 1, ..., N,  (8)

where e = [e1, e2, ..., eN]ᵀ represents the estimation error allowing some misclassification tolerance in the case of overlapping distributions, and γ is a positive regularization constant in the cost function defining the trade-off between a large margin and the misclassification error. In the primal problem expressed in terms of the feature map, the parameter w may range over an "infinite-dimensional" parameter set.
Therefore, the dual problem for the LS-SVM provides a solution in terms of the kernel function by means of Lagrange multipliers αi = γei, which can be positive or negative due to the equality constraints. This means that no sparseness property remains in the LS-SVM formulation, and every training data value is treated as a support vector. The Lagrangian is

L(w, b, e; α) = J_p(w, e) − Σ_{i=1..N} αi { yi [wᵀϕ(xi) + b] − 1 + ei }  (9)

with the conditions for optimality

∂L/∂w = 0 → w = Σ αi yi ϕ(xi),
∂L/∂b = 0 → Σ αi yi = 0,
∂L/∂ei = 0 → αi = γei,
∂L/∂αi = 0 → yi [wᵀϕ(xi) + b] − 1 + ei = 0.  (10)

After elimination of w and e, these can be written as a linear system:

[ 0    yᵀ      ] [ b ]   [ 0   ]
[ y    Ω + I/γ ] [ α ] = [ 1_N ]  (11)

where y = [y1, ..., yN]ᵀ, 1_N = [1, ..., 1]ᵀ, α = [α1, ..., αN]ᵀ, and the entries of the matrix Ω are

Ω_kl = yk yl ϕ(xk)ᵀϕ(xl) = yk yl K(xk, xl), k, l = 1, ..., N,  (12)

built from a kernel function

K(xk, xl) = ϕ(xk)ᵀϕ(xl)  (13)

that must satisfy Mercer's condition. This relation is often termed the kernel trick, since no explicit construction of the mapping ϕ(xi) is needed. It enables the LS-SVM to work in a high-dimensional feature space without actually performing calculations in that space. Hence, the non-linear LS-SVM classifier in dual space ultimately takes the form

y(x) = sign[ Σ_{i=1..N} αi yi K(x, xi) + b ].  (14)

In our model approach, only the Gaussian Radial Basis Function (RBF) kernel is implemented in the LS-SVM classifier, due to its high accuracy in function estimation and data set classification (Van Gestel et al., 2002; Selvaraj et al., 2007; Caicedo and Van Huffel, 2010):

K(x, xi) = exp(−‖x − xi‖² / σ²)  (15)

where σ² is the bandwidth of the Gaussian RBF kernel. For the LS-SVM approach to be realized in practice, a public-domain toolbox is used (www.esat.kuleuven.be/sista/lssvmlab/) that contains Matlab/C implementations of a number of algorithms.
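For illustration, a minimal binary LS-SVM with an RBF kernel can be written in a few lines of NumPy. This is a sketch for small training sets, not the LS-SVMlab toolbox routine, and all names are illustrative:

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """Gaussian RBF kernel matrix, Eq. (15)."""
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / sigma2)

def lssvm_train(X, y, gamma=4.6, sigma2=1.7 ** 2):
    """Solve the LS-SVM linear system, Eq. (11)."""
    N = len(y)
    Omega = np.outer(y, y) * rbf_kernel(X, X, sigma2)   # Eq. (12)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(N) / gamma
    rhs = np.concatenate([[0.0], np.ones(N)])
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]   # alpha, b

def lssvm_predict(Xtr, y, alpha, b, Xte, sigma2=1.7 ** 2):
    """Dual-space classifier, Eq. (14)."""
    K = rbf_kernel(Xte, Xtr, sigma2)
    return np.sign(K @ (alpha * y) + b)
```

Note that every training point receives a multiplier αi, reflecting the lack of sparseness discussed above; the default hyperparameter values here simply echo those tuned later in this study.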


Correction for beam hardening effects
In the presence of a BH artefact, the reconstructed grey-scale values vary across the rock core from higher values at the periphery to lower values in the central region for the same mineral phase (e.g., clay minerals, Fig. 4b). Therefore, the attenuation cross-section function across a sample becomes a parabolic curve rather than a straight line (Jovanović et al., 2013). Visual inspection of the image of our evaporite rock sample showed that the grey-scale values of the anhydrite mineral in the central region may overlap with the grey-scale values of clay minerals at the periphery, which would significantly hamper the correct differentiation between the two phases. In order to assign unequivocally a unique grey-scale level to each phase, we applied the quadratic 2-D polynomial function (Eq. 1) to our image (Fig. 4a). This polynomial approximation constructs the surface that best fits the cloud of data points, with the coefficients determined by Eq. (3). The residual data values were extracted as the difference between the values of the original data and those of the fitted surface. The plots of the residual data values indicate the difference in grey-scale levels of the different phases (Fig. 4c), where the peaks represent the higher grey-scale level of the anhydrite mineral, clearly differentiated from the base-level data values representing clay minerals.
The image was then reconstructed from the residual data values, which ultimately led to the efficient removal of the BH artefact in comparison with the original image (compare Fig. 4b and d).

LS-SVM multi-classification for phase analysis
Upon successful removal of the BH artefacts, the LS-SVM algorithm was tested on the multi-classification task. For comparison, the performance of the LS-SVM algorithm was also evaluated on the same image with uncorrected BH artefacts. In our LS-SVM approach, the direct voxelized input of an original µXCT image was mapped into a feature vector for the training and testing data points. The rock core µXCT image was classified into the three major phases: halite, anhydrite, and clay minerals. To perform a pixel-based classification, certain regions at different locations were manually selected, as marked by the letters "A" to "F" in Fig. 5a. The pixel values for each phase were selected carefully to avoid boundaries overlapping with other phases and to limit the misclassification rate. The total number of data points thus trained for all phases was 1755, which is only 0.1 % of the pixels of a 2-D slice. The remaining 1 570 149 pixels were treated as an unknown data set (test data).
It is important to include the full possible range of grey-scale levels in the training data set, in order to provide maximum information with true class labels; otherwise the classifier considers the output to be undecided. The generalization performance of the LS-SVM algorithm requires tuning of a set of hyperparameters (e.g., the regularization constant γ and the RBF kernel parameter σ). These tuning parameters were obtained by combining coupled simulated annealing (CSA) and a standard simplex method. First, CSA was used to determine appropriate starting points, which were then passed to the simplex optimization routine to refine the result. Finally, optimal values of γ = 4.6 and σ = 1.7 were determined on the training data set by applying a leave-one-out routine with a 10-fold cross-validation score function and a one-versus-one encoding scheme.
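The validation loop behind such tuning can be sketched as a brute-force grid search with leave-one-out scoring. This is only a conceptual stand-in for the CSA-plus-simplex procedure actually used; train_fn and predict_fn are placeholders for any classifier pair (such as an LS-SVM implementation):

```python
import numpy as np

def loo_error(train_fn, predict_fn, X, y, params):
    """Leave-one-out misclassification rate for one hyperparameter set."""
    errors = 0
    for i in range(len(y)):
        mask = np.arange(len(y)) != i       # hold out sample i
        model = train_fn(X[mask], y[mask], **params)
        errors += predict_fn(model, X[i:i + 1])[0] != y[i]
    return errors / len(y)

def grid_search(train_fn, predict_fn, X, y, grid):
    """Return the hyperparameter set with the lowest LOO error."""
    return min(grid, key=lambda p: loo_error(train_fn, predict_fn, X, y, p))
```

A stochastic optimizer such as CSA explores the same error surface far more efficiently than an exhaustive grid when the parameter space is continuous.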
The remaining data set of 1 570 149 pixels was tested on the basis of the predictor feature vector of the training class labels thus obtained. The output of the data values classified in this way was again reconstructed to give an image in which each distinguished attenuation level was labelled by a single integer value (1, 2, and 3 for the three phases halite vein, anhydrite, and clay minerals, respectively), as illustrated in Fig. 5b and c. From visual inspection, the LS-SVM performs quite well on the BH-corrected image, in which the label class of each phase distribution matches the mineral distribution in the original image well, but it fails to perform in this way on the image with BH artefacts.


Discussion
In principle, any classification result can be biased, and this bias can be evaluated by a performance measure based on the Receiver Operating Characteristic (ROC) method. The ROC is a statistical measure of the performance of a binary classification test. It provides tools to select optimal models in the analysis of decision-making (Fawcett, 2006). An ROC curve is constructed by plotting the sensitivity (true positive rate) against 1 − specificity (false positive rate) while varying the decision threshold over its entire range. In our LS-SVM model scheme, only a binary-classification ROC function is integrated. Therefore, the multiphase classification problem was first decomposed into binary classification tasks, i.e., a binary classification between, for example, anhydrite and halite, and between anhydrite and clay minerals, to measure the ROC relationship for the LS-SVM both with and without BH-artefact-corrected images. Note that the ROC was computed only on the training set data to minimize computational costs.
In addition to the ROC parameters of sensitivity and specificity, another important performance measure calculated was the area under the ROC curve (AUC; Hanley et al., 1982; Selvaraj et al., 2007; Luts et al., 2010). A typical plot of the ROC curve is shown in Fig. 6. The calculated AUC and accuracy were 0.998 and 99.82 % for the BH-corrected image, but as low as 0.963 and 88.71 % in the presence of a BH artefact. Therefore, the performance measures based on the pixel-based grey-value training data set demonstrate that the probabilistic bias rate was higher in the BH-affected images, and this consequently caused misclassification of the test data (Fig. 5c). This finding provides evidence that BH correction is an important intermediate step in obtaining good classifier performance using our LS-SVM approach. Moreover, for an optimal classification result, it is always desirable to include the full grey-scale range ("pixel value") of each individual phase in the training data in order to avoid misclassification, i.e., an undecided data classification as undesired output.
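The ROC construction and AUC integration described above amount to a threshold sweep over the classifier scores. A minimal sketch (names illustrative; ties between scores are not handled specially) could look like this:

```python
import numpy as np

def roc_auc(scores, labels):
    """Sweep the decision threshold over all scores and return the
    area under the ROC curve via the trapezoidal rule."""
    order = np.argsort(-np.asarray(scores))   # descending score order
    labels = np.asarray(labels)[order]
    pos = labels == 1                         # positive class (+1)
    tpr = np.cumsum(pos) / pos.sum()          # sensitivity
    fpr = np.cumsum(~pos) / (~pos).sum()      # 1 - specificity
    tpr = np.concatenate([[0.0], tpr])        # start curve at the origin
    fpr = np.concatenate([[0.0], fpr])
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))
```

An AUC of 1.0 corresponds to perfect separation of the two classes, 0.5 to random guessing.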


Conclusions
In this study, µXCT images of cylindrically shaped samples (rock cores) generated with a polychromatic X-ray source were evaluated for the efficient removal of the beam-hardening artefact and optimized multiphase classification. Due to the nature of the BH artefact present in µXCT images, the reconstructed grey-scale data values for the same mineral phase follow a non-linear (parabolic) curve from the periphery to the centre of the rock cores. A 2-D polynomial surface function was fitted to a slice image in order to extract residual data values as the difference between the original data values and the fitted surface points. This novel approach is quite flexible for any geomaterial: the method could also be applied to non-cylindrical samples, and it is computationally fast. A drawback is that in cases of multi-component geological material of extremely low density (e.g., organic material) or high density (e.g., ore), the fitting of the surface function to the cloud of data points may over- or underestimate the range of grey-scale values of each individual phase, which will subsequently affect the correct phase classification. A 3-D (volume) fitting is necessary to overcome this problem of data extremes.
The advanced machine learning technique of the least squares support vector machine (kernel-based learning) method is proposed as an efficient routine to segment µXCT images on the basis of a direct pixel-based classification task. Without any reduction in dimensionality or any requirement for prior knowledge, the radial basis function kernel yields good classification results for BH-corrected images with a high accuracy rate (less misclassification), but fails to classify phases in the presence of BH artefacts. Our method is sensitive to the selection of data points (pixels) at different locations, and to the number of data values of each individual mineral selected for training. Therefore, the presence of artefacts and inadequate data value selection for a specific mineral may affect correct image classification, and may become computationally costly as a result of the high dimensionality of the feature vector. In a companion paper, a comparison is presented of our LS-SVM method with other supervised and


Figure 1. Graphical representation of the NL-SVM approach, with (a) a complex binary pattern classification problem in input space, and (b) non-linear mapping into a high-dimensional feature space where a linearly separable data classification takes place.