Multivariate regression dataset from https://link.springer.com/article/10.1007%2Fs10994-016-5546-z

The Jura dataset (Goovaerts 1997) consists of measurements of the concentrations of seven heavy metals (cadmium, cobalt, chromium, copper, nickel, lead, and zinc) recorded at 359 locations in the topsoil of a region of the Swiss Jura. The type of land use (Forest, Pasture, Meadow, Tillage) and the rock type (Argovian, Kimmeridgian, Sequanian, Portlandian, Quaternary) were also recorded at each location. In a typical scenario (Goovaerts 1997; Alvarez and Lawrence 2011), the goal is to predict the concentrations of metals that are more expensive to measure (primary variables) from measurements of metals that are cheaper to sample (secondary variables). In this study, cadmium, copper, and lead are treated as target variables, while the remaining metals, together with land use type, rock type, and the coordinates of each location, are used as predictive features.

Feature | Type | Unique values | Missing
Cd (target) | numeric | 276 | 0
Co (target) | numeric | 219 | 0
Cu (target) | numeric | 302 | 0
Xloc | numeric | 341 | 0
Yloc | numeric | 347 | 0
Landuse_1 | numeric | 2 | 0
Landuse_2 | numeric | 2 | 0
Landuse_3 | numeric | 2 | 0
Landuse_4 | numeric | 2 | 0
Rock_1 | numeric | 2 | 0
Rock_2 | numeric | 2 | 0
Rock_3 | numeric | 2 | 0
Rock_4 | numeric | 2 | 0
Rock_5 | numeric | 2 | 0
Cr | numeric | 265 | 0
Ni | numeric | 277 | 0
Pb | numeric | 254 | 0
Zn | numeric | 242 | 0
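
The target/feature split described above can be sketched in a few lines of Python. This is a minimal illustration, not the loading code of any particular library: `COLUMNS` simply copies the column names from the listing, and `split_columns` and `split_row` are hypothetical helpers introduced here. The target set is left as a parameter because the listing marks Cd, Co, and Cu as targets, while the description names cadmium, copper, and lead (Cd, Cu, Pb).

```python
# Column names copied from the feature listing, in listing order.
COLUMNS = [
    "Cd", "Co", "Cu", "Xloc", "Yloc",
    "Landuse_1", "Landuse_2", "Landuse_3", "Landuse_4",
    "Rock_1", "Rock_2", "Rock_3", "Rock_4", "Rock_5",
    "Cr", "Ni", "Pb", "Zn",
]


def split_columns(columns, targets):
    """Return (feature_names, target_names), both in original column order."""
    target_set = set(targets)
    missing = target_set - set(columns)
    if missing:
        raise ValueError(f"unknown target columns: {sorted(missing)}")
    feature_names = [c for c in columns if c not in target_set]
    target_names = [c for c in columns if c in target_set]
    return feature_names, target_names


def split_row(row, columns, targets):
    """Split one record (a dict keyed by column name) into (x, y) value lists."""
    feature_names, target_names = split_columns(columns, targets)
    x = [row[c] for c in feature_names]
    y = [row[c] for c in target_names]
    return x, y


# Targets as marked in the listing; pass ["Cd", "Cu", "Pb"] instead
# to follow the wording of the description.
features, targets = split_columns(COLUMNS, ["Cd", "Co", "Cu"])
```

Either choice of three targets leaves 15 predictive columns: the two coordinates, the four land-use indicators, the five rock-type indicators, and the four remaining metals.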


