POSTS
kRiging
kRiging (Kriging in R)
As a programmer, it is often frustrating to solve equations by hand, knowing that a clever implementation in software could solve an infinite variety of similar problems. This cannot be said for the paper-and-pen approach. As a <<Mastère spécialisé>> student in the Cycle de Formation Spécialisée en Géostatistique at Mines Paristech, I have spent the better part of these past few weeks building and solving kriging systems. It certainly builds intution and confidence with the manipulations, but I’d still like to code up a quick implementation.
At the end of the day, I find that the linear algebra or matrix form of an equation is much easier to read than the equivalent with nested sums (in multivariate geostats, the nesting can build to quadruple sums and beyond!)
Cokriging system for 2 variables (from Rivoirard, Course on Multivariate Geostatistics). Only downhill from here.
\begin{array} { l } { \sum _ { \beta \in S _ { 1 } } \lambda _ { 1 \beta } C \left( Z _ { 1 \alpha } , Z _ { 1 \beta } \right) + \sum _ { \beta \in S _ { 2 }} \lambda _ { 2 \beta } C \left( Z _ { 1 \alpha } , Z _ { 2 \beta } \right) = C \left( Z _ { 0 } , Z _ { 1 \alpha } \right) } \newline { \sum _ { \beta \in S _ { 1 } } \lambda _ { 1 \beta } C \left( Z _ { 1 \beta } , Z _ { 2 \alpha } \right) + \sum _ { \beta \in S _ { 2 } } \lambda _ { 2 \beta } C \left( Z _ { 2 \alpha } , Z _ { 2 \beta } \right) = C \left( Z _ { 0 } , Z _ { 2 \alpha } \right) } \end{array}
The terms in the sum expand for every point $\beta$ in $S_1$ (every point where we have a value for $Z_1$) and $\beta$ in $S_2$ (every point where we have a value for $Z_2$). Then the expression expands for every $\alpha$ (so we have this pair of large expanded sums for every permutation of points in $S_1$ with points in $S_2$). It is clear why kriging scales very poorly with input size (this paper and this paper both peg the training time at $O(n^3)$ and the space complexity at $O(n^2)$ for ordinary univariate kriging). We also see that a matrix form would be a little easier to work with.
I found a pretty amazing paper, Kriging in Perspective. It does a very good job introducing the intuition of kriging but also providing a mathematical justification of the connection to least squares and some of the properties that make kriging so special and useful.
The best feature by far of this paper is an entire worked example in R that solves the kriging matrix for a toy problem. Most of the kriging examples and libraries are extremely optimized and the system is no longer presented in a readily interpretable form. I learn best when I can control the problem and introduce complexity little by little, and I still have access to tangible, primitive data structures that I understand well.
That is a major advantage of using code examples. Here, for instance, I can create a system that shows the screen effect for an exponential variogram model with. Here, the weights of the distant points shrink to zero.
### Creating a vertical line of points
coords <- expand.grid(x=c(0), y=c(-2,-1,0,1,2))
### Keep track of which row is the origin (where we want to predict at)
predict.pt <- which(coords$x==0 & coords$y==0)
### Use the built-in function to create distance matrices
distances <- dist(coords) # Euclidean matrix by default
### Returns a structure w/ just lower-triangular half
distances <- as.matrix(distances)
### Create the matrix of all covariances
covars <- 10*exp(-distances/8)
### Break it into Cov(Y,Z) and Var(Z)
Cov.YZ <- covars[predict.pt, - predict.pt]
Var.Z <- covars[-predict.pt, -predict.pt]
### Find the coefficients
beta <- solve(Var.Z) %*% Cov.YZ
plot(coords, xlab="longitude", ylab="latitude", type="n")
points(coords[predict.pt,], col="red")
points(coords[-predict.pt,], cex=10*sqrt(beta))
This code produces the following diagram:
Another good resource, with a worked example that shows a kriging system solved by hand: Kriging Example
Yet another good resource: Kriging Example
I will follow up with further examples and resources for geostatistics as I discover them!