Data
Filter results by:
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Energy Building dataset (Tsanas and Xifara 2012) concerns the prediction of the heating load…
0 runs0 likes0 downloads0 reach1 impact
768 instances - 10 features - classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Concrete Slump dataset (Yeh 2007) concerns the prediction of three properties of concrete…
0 runs1 likes0 downloads1 reach1 impact
103 instances - 10 features - classes - 0 missing values
* Source: JP Marques de Sá, INEB-Instituto de Engenharia Biomédica, Porto, Portugal; e-mail: jpmdesa '@' gmail.com J Jossinet, inserm, Lyon, France * Data Set Information: Impedance measurements…
280 runs0 likes5 downloads5 reach5 impact
106 instances - 10 features - 6 classes - 0 missing values
A 4-class version of breast-tissue dataset.
299 runs0 likes3 downloads3 reach5 impact
106 instances - 10 features - 4 classes - 0 missing values
* Title: South Africa Heart Disease Dataset * Description A retrospective sample of males in a heart-disease high-risk region of the Western Cape, South Africa. There are roughly two controls per case…
155 runs0 likes11 downloads11 reach6 impact
462 instances - 10 features - 2 classes - 0 missing values
Source: The dataset was created by Angeliki Xifara (angxifara @ gmail.com, Civil/Structural Engineer) and was processed by Athanasios Tsanas (tsanasthanasis @ gmail.com, Oxford Centre for Industrial…
103 runs1 likes4 downloads5 reach5 impact
768 instances - 10 features - 37 classes - 0 missing values
Source: David Gil, dgil '@' dtic.ua.es, Lucentia Research Group, Department of Computer Technology, University of Alicante Jose Luis Girela, girela '@' ua.es, Department of Biotechnology, University…
451 runs0 likes7 downloads7 reach5 impact
100 instances - 10 features - 2 classes - 0 missing values
No data.
90 runs0 likes4 downloads4 reach1 impact
137781 instances - 10 features - 7 classes - 0 missing values
No data.
75 runs0 likes2 downloads2 reach1 impact
137781 instances - 10 features - 7 classes - 0 missing values
Citation Request: This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
66 runs0 likes2 downloads2 reach6 impact
277 instances - 10 features - 2 classes - 0 missing values
web services evaluations in this table
0 runs0 likes0 downloads0 reach0 impact
1974675 instances - 10 features - classes - 1974675 missing values
UserID
0 runs0 likes0 downloads0 reach0 impact
1974675 instances - 10 features - classes - 1974675 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Survival treated as the class attribute As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
12 runs0 likes2 downloads2 reach1 impact
130 instances - 10 features - 0 classes - 97 missing values
This is a dataset obtained from the StatLib repository. Here is the included description: The data provided are daily stock prices from January 1988 through October 1991, for ten aerospace companies.…
5 runs1 likes5 downloads6 reach1 impact
950 instances - 10 features - 0 classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Tumor-size treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
0 runs0 likes3 downloads3 reach1 impact
286 instances - 10 features - 0 classes - 9 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Identification code deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based…
4 runs0 likes0 downloads0 reach1 impact
189 instances - 10 features - 0 classes - 0 missing values
No data.
311 runs0 likes5 downloads5 reach2 impact
1000000 instances - 10 features - 2 classes - 0 missing values
Wine data gathered by https://www.kaggle.com/zynicideThe data was scraped from WineEnthusiast during the week of June 15th, 2017. The code for the scraper can be found at…
0 runs0 likes0 downloads0 reach0 impact
150930 instances - 10 features - classes - 174477 missing values
1. Title: Contraceptive Method Choice 2. Sources: (a) Origin: This dataset is a subset of the 1987 National Indonesia Contraceptive Prevalence Survey (b) Creator: Tjen-Sien Lim (limt@stat.wisc.edu)…
23044 runs0 likes19 downloads19 reach1 impact
1473 instances - 10 features - 3 classes - 0 missing values
Citation Request: This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
2007 runs1 likes30 downloads31 reach1 impact
286 instances - 10 features - 2 classes - 9 missing values
Current dataset was adapted to ARFF format from the UCI version. Sample code ID's were removed. ! Note that there is also a related Breast Cancer Wisconsin (Diagnosis) Data Set with a different set of…
25210 runs1 likes18 downloads19 reach1 impact
699 instances - 10 features - 2 classes - 16 missing values
This database encodes the complete set of possible board configurations at the end of tic-tac-toe games, where "x" is assumed to have played first. The target concept is "win for x" (i.e., true when…
385248 runs1 likes54 downloads55 reach1 impact
958 instances - 10 features - 2 classes - 0 missing values
No data.
1038 runs0 likes8 downloads8 reach1 impact
55296 instances - 10 features - 3 classes - 0 missing values
No data.
1457 runs0 likes12 downloads12 reach1 impact
39366 instances - 10 features - 2 classes - 0 missing values
1. Title: Glass Identification Database 2. Sources: (a) Creator: B. German -- Central Research Establishment Home Office Forensic Science Service Aldermaston, Reading, Berkshire RG7 4PN (b) Donor:…
1776 runs0 likes50 downloads50 reach1 impact
214 instances - 10 features - 6 classes - 0 missing values
No data.
863 runs0 likes11 downloads11 reach1 impact
39366 instances - 10 features - 2 classes - 0 missing values
No data.
960 runs0 likes8 downloads8 reach1 impact
55296 instances - 10 features - 3 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
50 runs0 likes5 downloads5 reach6 impact
95 instances - 10 features - 5 classes - 9 missing values
PRO FOOTBALL SCORES (raw data appears after the description below) How well do the oddsmakers of Las Vegas predict the outcome of professional football games? Is there really a home field advantage -…
15927 runs0 likes19 downloads19 reach17 impact
672 instances - 10 features - 2 classes - 1200 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
102 runs0 likes4 downloads4 reach6 impact
52 instances - 10 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
773 runs0 likes14 downloads14 reach7 impact
950 instances - 10 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
726 runs0 likes5 downloads5 reach6 impact
52 instances - 10 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
106 runs0 likes3 downloads3 reach6 impact
74 instances - 10 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
720 runs0 likes6 downloads6 reach7 impact
159 instances - 10 features - 2 classes - 6 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
708 runs0 likes4 downloads4 reach8 impact
286 instances - 10 features - 2 classes - 9 missing values
Datasets for `Pattern Recognition and Neural Networks' by B.D. Ripley ===================================================================== Cambridge University Press (1996) ISBN 0-521-46086-7 The…
640 runs0 likes6 downloads6 reach6 impact
214 instances - 10 features - 6 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
744 runs0 likes5 downloads5 reach6 impact
130 instances - 10 features - 2 classes - 97 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
118 runs0 likes3 downloads3 reach7 impact
228 instances - 10 features - 2 classes - 20 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
767 runs0 likes8 downloads8 reach7 impact
189 instances - 10 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
792 runs0 likes10 downloads10 reach7 impact
214 instances - 10 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
736 runs0 likes6 downloads6 reach7 impact
1473 instances - 10 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
119 runs0 likes4 downloads4 reach6 impact
95 instances - 10 features - 2 classes - 9 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
772 runs0 likes7 downloads7 reach6 impact
214 instances - 10 features - 2 classes - 0 missing values
### Description The data consists of real historical data collected from 2010 & 2011. Employees are manually allowed or denied access to resources over time. The data is used to create an algorithm…
35317 runs0 likes14 downloads14 reach18 impact
32769 instances - 10 features - 2 classes - 0 missing values
shuttle-pmlb
6 runs0 likes2 downloads2 reach13 impact
58000 instances - 10 features - 7 classes - 0 missing values
No data.
307 runs0 likes2 downloads2 reach2 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
315 runs0 likes2 downloads2 reach2 impact
295245 instances - 11 features - 5 classes - 0 missing values
No data.
298 runs0 likes3 downloads3 reach2 impact
1000000 instances - 11 features - 5 classes - 0 missing values
Normalized version of the pokerhand data set. Automated file upload of pokerhand-normalized.arff
314 runs0 likes11 downloads11 reach2 impact
829201 instances - 11 features - 10 classes - 0 missing values
This is an artificial data set described in Breiman et al. (1984,p.238) (with variance 1 instead of 2). Generate the values of the 10 attributes independently using the following probabilities: P(X_1…
2 runs1 likes4 downloads5 reach2 impact
40768 instances - 11 features - 0 classes - 0 missing values
This is the poker dataset, retrieved 2013-11-14 from the libSVM site. Additional to the preprocessing done there (see LibSVM site for details), this dataset was created as follows: -join test and…
23 runs0 likes17 downloads17 reach7 impact
1025010 instances - 11 features - 2 classes - 0 missing values
This is an artificial data set with dependencies between the attribute values. The cases are generated using the following method: X1 : uniformly distributed over [-5,5] X2 : uniformly distributed…
3 runs1 likes5 downloads6 reach5 impact
40768 instances - 11 features - 0 classes - 0 missing values
This is an artificial data set used in Friedman (1991) and also described in Breiman (1996,p.139). The cases are generated using the following method: Generate the values of 10 attributes, X1, ...,…
0 runs2 likes7 downloads9 reach5 impact
40768 instances - 11 features - 0 classes - 0 missing values
Contains 110 data sets from the book 'The Statistical Sleuth' by Fred Ramsey and Dan Schafer; Duxbury Press, 1997. (schafer@stat.orst.edu) [14/Oct/97] (172k) Note: description taken from this web…
0 runs0 likes0 downloads0 reach5 impact
42 instances - 11 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
177147 instances - 11 features - 0 classes - 0 missing values
1. Title: Social Workers Decisions (Ordinal SWD) 2. Source Informaion: Donor: Arie Ben David MIS, Dept. of Technology Management Holon Academic Inst. of Technology 52 Golomb St. Holon 58102 Israel…
0 runs0 likes0 downloads0 reach5 impact
1000 instances - 11 features - 0 classes - 0 missing values
No data.
0 runs0 likes1 downloads1 reach1 impact
177147 instances - 11 features - 0 classes - 0 missing values
* Abstract: Purpose is to predict poker hands * Source - Creators: Robert Cattral (cattral '@' gmail.com) Franz Oppacher (oppacher '@' scs.carleton.ca) Carleton University, Department of Computer…
1 runs0 likes4 downloads4 reach6 impact
1025009 instances - 11 features - 10 classes - 0 missing values
* Abstract: 9-class version of poker-hand dataset, it was removed the minority class.
1 runs0 likes2 downloads2 reach6 impact
1025000 instances - 11 features - 9 classes - 0 missing values
libSVM","AAD group #Dataset from the LIBSVM data repository.
1 runs0 likes2 downloads2 reach5 impact
1025010 instances - 11 features - 0 classes - 0 missing values
This dataset contains 3 more features compared to version 1 of the same dataset. Data from which conclusions were drawn in the article "Sleep in Mammals: Ecological and Constitutional Correlates" by…
0 runs0 likes0 downloads0 reach5 impact
62 instances - 11 features - 0 classes - 38 missing values
Determinants of Wages from the 1985 Current Population Survey Summary: The Current Population Survey (CPS) is used to supplement census information between census years. These data consist of a random…
2 runs0 likes3 downloads3 reach5 impact
534 instances - 11 features - 0 classes - 0 missing values
This file contains the data in "The MU284 Population" from Appendix B of the book "Model Assisted Survey Sampling" by Sarndal, Swensson and Wretman, published by Springer-Verlag, New York, 1992. The…
0 runs0 likes0 downloads0 reach5 impact
284 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 11 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs0 likes0 downloads0 reach5 impact
60 instances - 11 features - 0 classes - 14 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
1 runs0 likes2 downloads2 reach5 impact
1000 instances - 11 features - 0 classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs0 likes0 downloads0 reach5 impact
27 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes2 downloads2 reach5 impact
1000 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 11 features - 0 classes - 0 missing values
Data Sets for 'Regression Models for Time Series Analysis' by B. Kedem and K. Fokianos, Wiley 2002. Submitted by Kostas Fokianos (fokianos@ucy.ac.cy) [8/Nov/02] (176k) Note: - attribute names were…
0 runs0 likes1 downloads1 reach5 impact
508 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 11 features - 0 classes - 0 missing values
The ILPD liver dataset from the OpenCC18 with the gender binary encoded so all features are numeric
0 runs0 likes0 downloads0 reach0 impact
583 instances - 11 features - 2 classes - 0 missing values
The ILPD dataset from the OpenCC18 with all categorical variables label encoded
0 runs0 likes0 downloads0 reach0 impact
583 instances - 11 features - 0 classes - 0 missing values
flare-pmlb
32 runs0 likes0 downloads0 reach11 impact
1066 instances - 11 features - 2 classes - 0 missing values
parity5_plus_5-pmlb
31 runs0 likes0 downloads0 reach11 impact
1124 instances - 11 features - 2 classes - 0 missing values
The origin is not clear, but presumably this is an artificial problem representing M-of-N rules. The target is 1 if a certain M 'bits' are '1'? (Joaquin Vanschoren)
31 runs0 likes0 downloads0 reach11 impact
1324 instances - 11 features - 2 classes - 0 missing values
.. _diabetes_dataset: Diabetes dataset ---------------- Ten baseline variables, age, sex, body mass index, average blood pressure, and six blood serum measurements were obtained for each of n = 442…
0 runs0 likes0 downloads0 reach2 impact
442 instances - 11 features - 0 classes - 0 missing values
.. _diabetes_dataset: Diabetes dataset ---------------- Ten baseline variables, age, sex, body mass index, average blood pressure, and six blood serum measurements were obtained for each of n = 442…
0 runs0 likes0 downloads0 reach2 impact
442 instances - 11 features - 0 classes - 0 missing values
.. _diabetes_dataset: Diabetes dataset ---------------- Ten baseline variables, age, sex, body mass index, average blood pressure, and six blood serum measurements were obtained for each of n = 442…
0 runs0 likes0 downloads0 reach2 impact
442 instances - 11 features - 0 classes - 0 missing values
.. _diabetes_dataset: Diabetes dataset ---------------- Ten baseline variables, age, sex, body mass index, average blood pressure, and six blood serum measurements were obtained for each of n = 442…
0 runs0 likes0 downloads0 reach2 impact
442 instances - 11 features - 0 classes - 0 missing values
.. _diabetes_dataset: Diabetes dataset ---------------- Ten baseline variables, age, sex, body mass index, average blood pressure, and six blood serum measurements were obtained for each of n = 442…
0 runs0 likes0 downloads0 reach2 impact
442 instances - 11 features - 0 classes - 0 missing values
.. _diabetes_dataset: Diabetes dataset ---------------- Ten baseline variables, age, sex, body mass index, average blood pressure, and six blood serum measurements were obtained for each of n = 442…
0 runs0 likes0 downloads0 reach2 impact
442 instances - 11 features - 0 classes - 0 missing values
No data.
305 runs0 likes2 downloads2 reach2 impact
1000000 instances - 11 features - 5 classes - 0 missing values
Synthetic dataset. Almost identical to [dataset 152](https://www.openml.org/d/153/edit)
319 runs0 likes4 downloads4 reach2 impact
1000000 instances - 11 features - 2 classes - 0 missing values
No data.
309 runs0 likes3 downloads3 reach2 impact
1000000 instances - 11 features - 5 classes - 0 missing values