ERDC TN-DOER-C15
July 2000
Parameters of interest in a CDF are likely to include percent sand, percent clay, and
contaminant concentrations. The spatial distribution of each of these parameters can be
examined individually. By examining the relationships between them, however, much can
likely be determined about the material distribution within the CDF using physical
parameters and more limited, targeted chemical analysis. Bivariate data analysis
methods provide the means to do this.
Bivariate data. Bivariate data analysis methods permit the comparison of two parameter
distributions to determine whether a functional relationship exists between them (Isaaks and
Srivastava 1989). In determining the distribution of recoverable materials in a CDF, the
relationship of percent sand and percent clay to contaminant levels is likely to be of interest.
Summary statistics and tests for normality should be calculated for each distribution
individually. A relative location map can be employed, as for the univariate data, giving the values
of each parameter as a function of spatial distribution. A scatter plot of the two parameters,
one plotted on the ordinate and the other on the abscissa, may illustrate any functional
dependence that exists. The linearity of the relationship of the variables can be evaluated
using the correlation coefficient ρ, defined in Appendix I. The correlation coefficient varies
between -1 and +1; +1 indicates a straight line with a positive slope (positive correlation), -1
indicates a straight line with a negative slope (negative correlation), and values near zero
indicate little or no correlation between the variables (Isaaks and Srivastava 1989). For
example, because contaminant level tends to decrease with increasing particle size, one
would expect particle size and contaminant concentration to be negatively correlated, and
percent clay and contaminant concentration to be positively correlated. If the correlation coefficient is
unduly influenced by a few extreme values, the rank correlation coefficient may be a more
useful statistic. This is further described in Isaaks and Srivastava (1989).
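The two statistics can be compared with a short sketch. It assumes the standard product-moment form of the correlation coefficient (Appendix I is not reproduced here) and computes the rank correlation by applying the same formula to the ranks of the data. The sample values and the names `pearson`, `ranks`, and `rank_correlation` are hypothetical and serve only as illustration.

```python
import math


def pearson(x, y):
    """Linear correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def rank_correlation(x, y):
    """Correlation coefficient computed on ranks instead of raw values."""
    return pearson(ranks(x), ranks(y))


# Hypothetical samples: percent sand and a contaminant concentration.
# The last concentration is an extreme value.
percent_sand = [85.0, 70.0, 55.0, 40.0, 25.0, 10.0]
contaminant = [1.2, 2.0, 3.1, 4.5, 5.2, 60.0]

print("correlation coefficient:", round(pearson(percent_sand, contaminant), 3))
print("rank correlation:", round(rank_correlation(percent_sand, contaminant), 3))
```

In this made-up data set the concentration rises every time percent sand falls, so the rank correlation indicates a perfectly monotonic negative relationship, while the single extreme value pulls the ordinary correlation coefficient away from -1.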
Censored data. In environmental sampling, a high percentage of samples may have no
measurable contaminants (nondetects). Concentrations of these analytes, known as censored
values, are normally reported as less than the method detection level (<MDL). The actual
concentration of the contaminant lies somewhere in the range from zero to the MDL. There
are several approaches to handling censored values. One approach is to ignore these values,
which results in an overestimate of the mean and underestimate of the standard deviation
(McBean and Rovers 1998). This alternative is acceptable only when the number of
nondetects is very small. Alternatively, the censored values can be assumed to be equal to
the detection limit, but this also introduces bias into the summary statistics. This alternative
is preferred when the values are not highly variable and are near the MDL. A third alternative
is to assume the censored values to be equal to MDL/2; this is the preferred alternative when
the contaminant is present in highly variable concentrations. There are a number of statistical
methods, parametric and nonparametric, for dealing with censored data; these are further
described in McBean and Rovers (1998).
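The three simple approaches described above can be compared side by side, as in the following minimal sketch. The detection level, the sample results, and the function name `summarize` are hypothetical; nondetects are represented here as `None`. It illustrates only the substitution approaches, not the parametric and nonparametric methods in McBean and Rovers (1998).

```python
from statistics import mean, stdev

MDL = 0.5  # hypothetical method detection level, same units as the results

# Hypothetical laboratory results; None marks a nondetect (<MDL).
results = [None, None, 0.8, 1.4, None, 2.6, 0.6, 3.9]


def summarize(results, mdl, strategy):
    """Return (mean, standard deviation) under one censoring strategy."""
    if strategy == "ignore":
        # Drop the nondetects entirely.
        values = [v for v in results if v is not None]
    elif strategy == "mdl":
        # Assume each nondetect equals the detection level.
        values = [mdl if v is None else v for v in results]
    elif strategy == "half_mdl":
        # Assume each nondetect equals half the detection level.
        values = [mdl / 2 if v is None else v for v in results]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return mean(values), stdev(values)


for strategy in ("ignore", "mdl", "half_mdl"):
    m, s = summarize(results, MDL, strategy)
    print(f"{strategy:9s} mean={m:.3f} sd={s:.3f}")
```

With these made-up values, ignoring the nondetects gives the highest mean, substituting the MDL gives a lower one, and substituting MDL/2 gives the lowest, which shows how strongly the choice of approach can influence the summary statistics when nondetects are common.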
Spatial analysis. Several variations of data groupings are possible based on the relative
location map previously described. It may be visually instructive to identify the lowest and
highest values on the map, or to replace individual data points with symbols based on
assignment to certain ranges. An indicator map uses only two symbols, designating those
data points falling above and below a specified threshold (Isaaks and Srivastava 1989). The