On soil textural classifications and soil texture-based estimations

The soil texture representation with the standard textural fraction triplet 'sand-silt-clay' is commonly used to estimate soil properties. The objective of this work was to test the hypothesis that other fraction sizes in the triplets may provide better representation of soil texture for estimating some soil parameters. We estimated the cumulative particle size distribution and bulk density from entropy-based representation of the textural triplet with experimental data for 6240 soil samples. Results supported the hypothesis. For example, simulated distributions were not significantly different from the original ones in 25 and 85 % of cases when the 'sand-silt-clay' and 'very coarse + coarse + medium sand - fine + very fine sand - silt + clay' triplets were used, respectively. When the same standard and modified triplets were used to estimate the average bulk density, the coefficients of determination were 0.001 and 0.967, respectively. Overall, the textural triplet selection appears to be application- and data-specific.


Introduction
The particle size distribution is one of the essential controls of soil structure and functioning. Soil processes, properties and specific features are usually related to these distributions, commonly named soil texture. To express these relationships, the continuous particle size distributions are commonly replaced by their discrete representation with several textural fractions. The fractions are defined as particles within a range of sizes, e.g., medium sand, fine silt, etc. Then the percentages of textural fractions are used as attributes to classify soils and as predictors to estimate soil properties or parameters.
Different countries have employed different numbers of textural fractions and different ranges of sizes for each of the fractions. Nemes et al. (1999) reviewed definitions of textural fractions in 14 European countries and reported the number of ranges varying from three in Italy and France, to eight in the Netherlands and Germany, and nine in Belgium. The authors also observed a large variability in size ranges. For example, while the minimum size of the second smallest fraction was 2 µm in most cases, the maximum size in such a fraction varied from 6 µm in Greece to 60 µm in England and Wales. In 1967, the Committee of the Soil Science Society of America noted that the current system of particle size boundaries arose due to geographic accident (Whiteside et al., 1967). The committee noted that there are "no narrowly definable natural particle size boundaries that would be equally significant in all soil materials". The boundary between clay and silt was originally set at 10 µm, then changed to 5 µm, and was finally established at 2 µm (Whiteside et al., 1967).
There were indications that setting the boundaries between textural fractions might depend on the purpose of further textural data use and on the specifics of the dataset under consideration. Twarakavi et al. (2010) demonstrated that soils are not classified well from a hydraulic standpoint if the USDA textural fractions of sand, silt, and clay are used. They also noted that this conclusion is conditioned on the database used for the hydraulic classification evaluation. Reasons for the selection of size boundaries varied. Whiteside et al. (1967) noted that for several reasons a scale based on 1 mm with subdivisions at 0.315, 0.1 mm, etc., would seem to be the ideal scale for agricultural purposes, but the advantages were not deemed to be sufficient to outweigh the radical departure from the existing textural classification. Also, physics-based reasoning influenced the selection of size boundaries between textural fractions. For example, the upper size limit for clay, now 2 µm, was originally set at 10 µm and then moved to 5 µm. Around 1936 a switch of the clay limit from 0.005 to 0.002 mm was proposed based on the realization that at 0.002 mm a significant break in the mineralogical properties of soil separates occurs (Truog et al., 1936a, b) and that soil surveyors in the field tend to consider the 0.002 to 0.005 mm fraction as silt rather than clay (Shaw and Alexander, 1937).
One application of the data on textural fraction content is the reconstruction of the particle size distribution from data on a small number of fractions. Martín and Taguas (1998) proposed to use the hypothesis of self-similarity and iterated function formalism to generate the particle size distribution from a small number of textural fractions. In applications of this technique, they used sand, silt, and clay fraction contents with size boundaries defined by the USDA textural classification. Another application of data on textural fractions is to compute the information entropy as the metric of the particle size heterogeneity and derive the relationship between the bulk density (BD) and information entropy (IE) (Martín et al., 2017a). Seven textural fractions have been used, for which a strong linear correlation between the respective average values was shown. This fact, together with the computational results obtained in Martín et al. (2017b), seemed to reinforce the entropy self-similarity approach, which is used in the PSD reconstruction. Self-similarity, commonly expressed by scaling laws, actually means that the content of information obtained on the coarse scale keeps its average value on smaller scales (Pastor-Satorras and Wagensberg, 1998), which agrees with the driving idea of the PSD representation used.
The objective of this work was to test the hypotheses that (a) the reconstruction of the particle size distribution can be more accurate if the textural fraction size boundaries are changed from the USDA sand-silt-clay sizes to other size ranges, and (b) a satisfactory relationship between the information entropy and packing density can be achieved with three textural fractions whose boundaries are set at sizes other than those in the USDA sand-silt-clay triplet.
Materials and methods

The dataset
The USKSAT database is comprised of journal publications and technical reports containing coupled data on saturated hydraulic conductivity (Ksat), soil texture, bulk density, and organic matter content obtained across the United States. Detailed information can be found in Pachepsky and Park (2015). We selected the dataset from Florida (Carlisle et al., 1978, 1981). This dataset is the largest dataset in USKSAT obtained in the same laboratory with the same methods. The dataset was filtered to exclude samples for which data on seven textural fractions or on bulk density were not available. Samples with inconsistent textural data (the sum of mass texture fractions not agreeing with the total mass) were rejected. The selection criteria used in Martín et al. (2017a) were followed. Under these selection criteria, a total of 6240 soil samples were included in the study. According to USDA textural classes, sands, loamy sands, sandy loams, loams, silt loams, silts, sandy clay loams, clay loams, silty clay loams, sandy clays, and clays were represented by 3956, 570, 698, 27, 27, 4, 667, 26, 3, 118, and 144 samples, respectively.

Reconstruction of the particle size distributions from data on textural fraction content
The reconstruction of the particle size distribution (PSD) is based on the assumption that entropy as the measure of heterogeneity of these distributions is preserved across the support scales (Martín and Taguas, 1998). Assuming that the texture interval is divided into $k$ textural size ranges with the respective textural fraction contents $p_1, p_2, \ldots, p_k$, $1 \le i \le k$, and $\sum_{i=1}^{k} p_i = 1$, the Shannon information entropy (IE) (Shannon, 1948) is defined by

$$\mathrm{IE} = -\sum_{i=1}^{k} p_i \log_2 p_i, \qquad (1)$$

where $p_i \log_2 p_i = 0$ if $p_i = 0$. The IE is a widely accepted measure of the heterogeneity of distributions (Khinchin, 1957). The IE values for three textural size classes range from 0 when only one fraction is present to $-\log_2 \frac{1}{3} = 1.585$ when all three fractions are represented equally.

Martín and Taguas (1998) proposed a self-similarity model that allows the PSD to be generated from commonly available textural data that consist of mass percentages of a small number of textural fractions. The driving idea of such a proposal was that the heterogeneity that textural data show at the coarse scale, quantified in terms of information entropy, is also reproduced in a similar way inside any rescaled textural fractions on smaller scales (i.e., the heterogeneity of sieved fractions would resemble that observed on the coarse scale). From this single hypothesis, a mathematically precise representation of the PSD is then obtained by the iterated function formalism. A brief illustration of this technique for the case of three textural fractions is as follows. Let us denote with $I_1$, $I_2$, and $I_3$ the subintervals of sizes corresponding to the three size classes and $p_1$, $p_2$, and $p_3$ the relative proportions of mass for the intervals $I_1$, $I_2$, and $I_3$, respectively. These proportions are treated as probabilities, $p_1 + p_2 + p_3 = 1$. Let $\varphi_1$, $\varphi_2$, and $\varphi_3$ be the linear functions (similarities) that transform the whole size interval $I$ into the subintervals $I_1$, $I_2$, and $I_3$, respectively. The set $\{\varphi_1, \varphi_2, \varphi_3, p_1, p_2, p_3\}$ is called an iterated function system (IFS) (Barnsley and Demko, 1985). The hypothesis of entropy self-similarity of the PSD states that the IE, now computed on the successive rescaled subintervals $\varphi_i(I)$, $\varphi_i(\varphi_j(I))$, $\varphi_i(\varphi_j(\varphi_k(I)))$, and so on ($i, j, k = 1, 2, 3$), is scale invariant. The set of textural data, together with the entropy self-similarity assumption, unequivocally determines the PSD (Martín and Taguas, 1998). Based on the theorem of Elton (1987), the mass of soil with particle sizes within an interval $J$ may be computed using the IFS as follows: (a) take any starting value $x_0$ in $I$ and (b) choose, at random, an integer number $i$ of the index set $\{1, 2, 3\}$ with probability $p_i$ and denote with $x_1$ the value $\varphi_i(x_0)$. Repeat the random experiment in (b), suppose the new outcome is $j$, and set $x_2 = \varphi_j(x_1)$. If $x_0, x_1, \ldots, x_n$ is the sequence obtained in this way and $m_n$ is the number of $x_i$ values that fall in $J$, the ratio $m_n/n$ approaches the mass of the interval $J$ as the number of iterations $n$ goes to infinity. In practice, the estimation of mass in the interval $J$ is achieved quickly.
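The random iteration described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `make_similarities` and `ifs_mass` are hypothetical, the similarities are assumed to act on particle size on a linear scale, and the boundaries 0, 0.002, 0.05, and 2 mm in the usage note are simply the USDA clay, silt, and sand limits chosen for the example.

```python
import random


def make_similarities(boundaries):
    """Build the linear maps phi_i that send the whole size interval
    I = [boundaries[0], boundaries[-1]] onto each subinterval I_i."""
    lo, hi = boundaries[0], boundaries[-1]
    maps = []
    for a, b in zip(boundaries[:-1], boundaries[1:]):
        scale = (b - a) / (hi - lo)
        # Default arguments freeze a and scale for this particular map.
        maps.append(lambda x, a=a, scale=scale: a + scale * (x - lo))
    return maps


def ifs_mass(boundaries, probs, interval, n_iter=100_000, seed=0):
    """Estimate the mass fraction with sizes inside `interval` = (j_lo, j_hi)
    as the share of iterates x_1..x_n that fall in it (the ratio m_n / n)."""
    rng = random.Random(seed)
    maps = make_similarities(boundaries)
    # Step (a): any starting value in I; the midpoint is an arbitrary choice.
    x = 0.5 * (boundaries[0] + boundaries[-1])
    j_lo, j_hi = interval
    # Step (b): draw the index sequence, index i with probability probs[i].
    indices = rng.choices(range(len(maps)), weights=probs, k=n_iter)
    hits = 0
    for i in indices:
        x = maps[i](x)
        if j_lo <= x < j_hi:
            hits += 1
    return hits / n_iter
```

For instance, `ifs_mass((0.0, 0.002, 0.05, 2.0), (0.2, 0.3, 0.5), (0.05, 2.0))` should return a value close to 0.5, the content of the coarsest fraction supplied as input, while narrower intervals inside a fraction receive the mass implied by the self-similarity assumption.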
The reconstruction of distributions was performed using three size fractions: coarse, intermediate, and fine. The dataset contained experimental data on seven fractions: very coarse sand, coarse sand, medium sand, fine sand, very fine sand, silt, and clay. We used all possible triplets formed from seven textural fractions that were available. The symbols for the triplets show how the fractions were grouped. For example, in the triplet 3-2-2, "coarse" included very coarse sand, coarse sand, and medium sand; "intermediate" included fine sand and very fine sand; and "fine" included silt and clay. Triplet 5-1-1 was the standard one in which "coarse" included all five sand fractions, "intermediate" included silt, and "fine" included clay. A total of 15 triplets were available, a list of which can be found in Table 3.
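One way to enumerate the triplets is sketched below. It assumes, consistently with the count of 15, that each triplet groups adjacent fractions in size order, so a triplet is fixed by choosing two cut points among the six boundaries between the seven fractions. The names `all_triplets` and `group_fractions` are illustrative only.

```python
from itertools import combinations


def all_triplets(n_fractions=7):
    """Return every (n_coarse, n_intermediate, n_fine) grouping of
    n_fractions adjacent fractions into three non-empty groups."""
    triplets = []
    for cut1, cut2 in combinations(range(1, n_fractions), 2):
        triplets.append((cut1, cut2 - cut1, n_fractions - cut2))
    return triplets


def group_fractions(contents, triplet):
    """Sum fraction contents (ordered from coarsest to finest) into the
    coarse, intermediate, and fine groups defined by `triplet`."""
    n_coarse, n_inter, _ = triplet
    coarse = sum(contents[:n_coarse])
    intermediate = sum(contents[n_coarse:n_coarse + n_inter])
    fine = sum(contents[n_coarse + n_inter:])
    return coarse, intermediate, fine
```

With seven fractions, `all_triplets()` yields 15 groupings, among them the standard (5, 1, 1) and the modified (3, 2, 2).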
For all textural triplets we generated the PSD and compared experimental particle size distributions (built from seven known fractions) with simulated ones. The Kolmogorov-Smirnov test was applied to find the probability that the samples are drawn from the same distribution.
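The comparison can be illustrated with a short sketch of the two-sample Kolmogorov-Smirnov decision rule. The text does not state how the test was set up, so several things here are assumptions: the cumulative curves are evaluated at the same size limits, the effective sample sizes `n` and `m` are hypothetical inputs, and the large-sample critical coefficient 1.358 for the 0.05 significance level is used in place of exact p-values.

```python
import math


def cumulative(fractions):
    """Turn fraction contents into a cumulative distribution."""
    total, out = 0.0, []
    for f in fractions:
        total += f
        out.append(total)
    return out


def ks_statistic(cdf_a, cdf_b):
    """Largest absolute gap between two cumulative curves that are
    evaluated at the same particle size limits."""
    return max(abs(a - b) for a, b in zip(cdf_a, cdf_b))


def not_different(cdf_measured, cdf_simulated, n, m, c_alpha=1.358):
    """True if the two-sample KS statistic stays below the asymptotic
    critical value c_alpha * sqrt((n + m) / (n * m)).
    n and m are assumed sample sizes, not values taken from the paper."""
    d = ks_statistic(cdf_measured, cdf_simulated)
    return d <= c_alpha * math.sqrt((n + m) / (n * m))
```

The percentage of samples for which `not_different` returns True is the kind of summary reported per triplet in Table 2.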

Information entropy-bulk density relation
Following Eq. (1), the information entropy of soil texture is computed for all triplets in order to analyze how differences in the information entropy explain differences in the typical soil bulk density value of related soils. The range of information entropy values was divided into 10 subintervals of equal length. The mean bulk density value of soil samples binned into IE ranges was computed for each of the subintervals. The least squares linear regression of the average information entropy vs. average bulk density value was computed.
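The procedure can be sketched as follows. This is an illustration rather than the authors' code: the function names are hypothetical, the bins are assumed to span the observed range of IE values (the text does not say whether the observed or the theoretical range was used), and empty bins are simply skipped.

```python
import math


def information_entropy(p):
    """Shannon entropy in bits, Eq. (1), with 0 * log2(0) taken as 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def binned_means(ie, bd, n_bins=10):
    """Split the observed IE range into n_bins equal subintervals and
    return (mean IE, mean BD) for every non-empty bin."""
    lo, hi = min(ie), max(ie)
    if hi == lo:
        raise ValueError("all entropy values are equal; cannot form bins")
    width = (hi - lo) / n_bins
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for e, d in zip(ie, bd):
        k = min(int((e - lo) / width), n_bins - 1)  # top edge goes to last bin
        sums[k][0] += e
        sums[k][1] += d
        sums[k][2] += 1
    return [(s[0] / s[2], s[1] / s[2]) for s in sums if s[2] > 0]


def linear_fit(points):
    """Ordinary least squares fit y = intercept + slope * x and its R^2."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx, sxy * sxy / (sxx * syy)
```

Applying `information_entropy` to the grouped contents of each sample, then `binned_means` and `linear_fit`, gives one coefficient of determination per triplet of the kind listed in Table 3.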

Results
Examples of ternary graphs showing the locations of samples in the "coarse-intermediate-fine" textural fraction content coordinates are shown in Fig. 1a and 1b. The standard triangle in Fig. 1a shows the majority of points in the bottom left corner. This reflects the fact that soils in the database are mostly coarse textured in terms of the USDA textural classification. The 3-2-2 textural triangle in Fig. 1b shows that the soil samples represent both samples low in fine particles and samples low in coarse particles, whereas soils with low intermediate fraction contents are not represented well in the database. Table 1 shows total numbers of samples by the ranges of the IE values for the standard and 3-2-2 triplets in Fig. 1. The standard triplet assigns small values of information entropy to the majority of samples and thus interprets the majority of samples as having low textural heterogeneity. In contrast, applying the 3-2-2 triplet leads to the conclusion that the majority of samples have a moderate level of textural heterogeneity.
The reconstruction of particle size distributions with the iterated function algorithm showed large differences among the applications of different triplets. Data on the statistical difference between generated and measured distributions are shown in Table 2 for all samples and for textural classes in which the number of available samples exceeded 100. Triplets in which the group of fines includes very fine sand, silt, and clay, i.e., 1-3-3, 2-2-3, and 3-1-3, provide the best results. Results for fine-textured soils do not depend on the triplet because the proportions of coarse particles are small and do not affect results. The differences among triplets become more pronounced as the textures become coarser. The worst results are obtained for triplets 1-5-1, 2-4-1, and 5-1-1, which have clay as a separate fine fraction. Using the standard triplet 5-1-1 leads to the worst results of all. The simulated and the experimental cumulative particle size distributions are not statistically different at the 95 % probability level for 25 % of soil samples when the standard triplet of fraction contents is used as input in the reconstruction of the PSD. In contrast, when the triplets 1-3-3, 2-2-3, or 3-1-3 are used, the percentage of soils whose simulated particle size distribution is not statistically different from the original exceeds 97 % of the total for the same probability level.
Results of linear regressions of mean information entropy values versus mean bin bulk density values are shown in Table 3. Different triplets differ in their efficiency in estimating BD from IE by textural classes. Overall, the best relationships were found for sands. Efficiency of estimation was worse in textural classes in which there was no single dominant fraction. The sandy clay and sandy clay loam classes provide examples of the above. Noticeably, triplets with clay, silt, and very fine sand combined in the fine fraction do not result in good R² for non-sandy soils (Table 3). This is opposite to the PSD reconstruction in which fines consisting of clay, silt, and very fine sand provide the best results (Table 2).
The best results by considering both sandy and non-sandy samples are obtained with triplets 2-4-1 and 3-3-1, i.e., triplets in which fines are represented only by clays and there is a certain balance between the coarse and the intermediate fractions. Where this balance is not present (1-5-1 and 5-1-1), the separation of clay in the fine fraction does not help. The standard triangle seems to work only for non-sandy soils. Also, this triplet's IE relates well to the BD of sandy clays, sandy loams, and sandy clay loams, but it gives unsatisfactory results for sands.

Table 1. Total numbers of samples by ranges of the information entropy for two textural fraction triplets. The 3-2-2 triplet includes very coarse, coarse, and medium sand (fraction 1), fine and very fine sand (fraction 2), and clay and silt (fraction 3); the standard triplet 5-1-1 includes sand (fraction 1), silt (fraction 2), and clay (fraction 3).

Discussion
The triplets having a fine fraction consisting of very fine sand, silt, and clay appeared to be superior in serving as the input for PSD reconstruction. One possible explanation is that mass size scaling is not scale invariant across all particle sizes. Rather it has ranges of particle sizes within which the power-law scaling dependencies are applied and the boundaries between these ranges are reflected by the modified textural triplet rather than by the original 5-1-1 sand-silt-clay triplet. Breaks in particle size distribution scaling were first highlighted by Tyler and Wheatcraft (1992), who noted that the strict fractal or self-similar behavior in soil PSDs is restricted to a narrow spectrum of soils found in nature. For the soils tested, the power-law scaling was observed in only limited portions of their PSDs. Data on soils B to F from their work are shown in Fig. 2. The diameter of the break in scaling varied between 100 and 400 µm and on average in this group of soils occurred at diameters of 220 µm, which is close to the boundary of 250 µm between fine and medium sand. Later the break in scaling was demonstrated by other authors, e.g., Kravchenko and Zhang (1998), who noted that "The critical particle size [radii -M.A.] at which the fractal dimension values are changing, is about 100 to 200 µm for most of the soils. The result is consistent with that reported in the literature (Wu et al., 1993)".
Another reason for the better simulations of particle size distributions could be the better representation of soil texture, i.e., the distribution of samples by the ranges of IE in which the majority of soils are found (Table 1). When the IE is computed with the standard triplet, a large number of soils have a low IE value (have unbalanced contents in respect to those texture fractions). This may be an obstacle for reconstruction of the PSD under entropy self-similarity. In particular, because of the meaning of self-similarity itself, if the input contents are very unbalanced, it causes a multiplicative effect of more unbalanced distribution in the "sub-fractions" on lower scales and probably a more unreliable simulation. In contrast, in the case of the modified triplet a large number of soils have medium to high IE, which means that they have more balanced contents in respect to the respective new fractions; a greater power of discriminating texture and texture-based properties, and better PSD simulations, are expected. This could be an interesting avenue to explore.

The large difference between the IE-bulk density relationships developed for different textural classes indicates that the IE computed for different triplets has the potential to reflect the effect of soil texture on particle packing in soils. The theoretical analyses of Assouline and Rouault (1997) and Martín et al. (2017b) show that the pore space arrangement can be related to the type of distribution of particle sizes. The IE parameter is related to packing but cannot reflect aggregation that is characteristic of soils in which fine particles are present in substantial amounts. We note that when IE was computed using the seven texture fraction contents with the same database, the coefficient of determination of the regression "average IE vs. average bulk density" was equal to 0.99 (Martín et al., 2017b). Thus, results show that the modified triplet provides almost the same information in respect to the bulk density values as that provided by the seven texture fractions altogether.
The best triplets were different for the reconstruction of the particle size distributions and for establishing relationships between information entropy and bulk density after binning samples. Different triplets may be the most informative to characterize the results of fragmentation and sedimentation that manifest themselves in particle size distributions and the results of packing that manifest themselves in IE-BD relationships. Finally, some processes affecting the particle size distributions and IE-BD relationships may not be elucidated by textural data only; aggregation and weathering are examples.
The utility of textural fractions different from the traditional sand-silt-clay triplet appears to have an application in the development of pedotransfer functions. The boundaries of new fraction sizes can be parameters of pedotransfer functions along with the regression coefficients. Nemes and Rawls (2006) experimented with the boundary between silt and sand in the range from 20 to 63 µm and developed pedotransfer functions for water retention at −33 and −1500 kPa matric potential values. They could not point out the boundary size between silt and sand that would clearly provide better results in estimating the selected soil hydraulic properties. Our work indicates that the boundary may be moved to the range of much larger particle diameters.
The usability of triplets other than standard ones indicates the opportunity for a more efficient use of existing results of textural analysis. Although these results traditionally consist of seven fractions, including five fractions of sand, in the majority of applications all sand fractions have been lumped together. For example, the overwhelming majority of pedotransfer functions in soil hydrology use the elements of the standard triplet sand-silt-clay (Pachepsky and Rawls, 2004). The use of different coarse-intermediate-fine triplets in pedotransfer studies allows for the use of available detailed data on fractions of sand and revisiting existing databases. Overall, the application of nonstandard textural triplets in the development of pedotransfer functions presents an interesting avenue to explore.
When analyzing the utility of the traditional sand-silt-clay triplet for classifying soils by their hydraulic properties, Twarakavi et al. (2010) concluded that "from a philosophical perspective, the research further stresses the need to revisit and reevaluate the results from the past in order to successfully move ahead into the future of soil physics". Using a set of fixed boundaries between texture fractions has been a productive approach in the past. Consideration of textural fraction boundaries as flexible parameters that can be task and dataset specific can provide additional insights into the role of texture in soil functioning and ecological services.

Conclusions
Having three textural size ranges, i.e., coarse, intermediate, and fine particle sizes, undoubtedly appears to be convenient for data presentation and textural class definition. Currently the coarse, intermediate, and fine fractions are identified as sand, silt, and clay, respectively. However, it is not conclusive that current sand, silt, and clay size ranges can provide the best representation of soil texture when these three size ranges are used for estimating soil properties. We hypothesized that the cumulative particle size distribution and soil bulk density can be more accurately estimated from the triplet coarse-intermediate-fine if the boundaries of the coarse, intermediate, and fine size ranges are different from those in the sand-silt-clay triplet. The entropy-based representation of particle size distributions was used to convert the triplet particle size representations into particle size distributions and to define ranges of soil textural heterogeneity. Experimental data on seven size fraction contents and bulk density for 6240 predominantly coarse-textured soil samples were extracted from the USKSAT database.

It appears that redefining the triplet coarse-intermediate-fine may lead to a very substantial improvement in soil property estimates from soil textural data. Overall, the drastic improvement in predictions of both cumulative particle size distribution and mean bulk density for heterogeneity ranges occurred when the standard sand-silt-clay triplet was replaced with the modified textural triplet that was defined as very coarse, coarse, medium sand (coarse fraction), fine and very fine sand (intermediate fraction), and clay and silt (fine fraction). The modified triplet apparently provided more information about the particle size heterogeneity and particle packing. Different modified triplets provided the best inputs for different soil textural classes.
Results of this work indicate that detailed information about soil particle size distributions has the potential to enhance estimation of soil properties with soil texture as a predictor. Analyses of both existing and developing soil databases and pedotransfer methodologies may benefit from exploring modifications of textural triangles. The compression of information on textural heterogeneity in textural triangles into a single entropy-based parameter may provide additional advantages.
Data availability. The data can be obtained from Yakov Pachepsky by email request to yakov.pachepsky@ars.usda.gov.

Figure 1. Texture of soil samples in the database shown in the standard USDA (a) and modified 3-2-2 textural triangles (b).

Figure 2. Scaling in cumulative particle mass of four soils studied by Tyler and Wheatcraft (1992).

Table 2. Percentage of samples for which simulated and measured particle size distributions are not different at the 0.05 significance level.

Table 3. Coefficients of determination of the regression of the average information entropy versus average bulk density for 10 entropy bins.