OpenML
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: C1 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
54 runs - 0 likes - 3 downloads - 3 reach - 4 impact
28626 instances - 4 features - 5 classes - 0 missing values
This dataset contains a set of face images taken between April 1992 and April 1994 at AT&T Laboratories Cambridge. As described on the original website: There are ten different images of each of 40…
53 runs - 0 likes - 0 downloads - 0 reach - 2 impact
400 instances - 4097 features - 40 classes - 0 missing values
The Sheffield (previously UMIST) Face Database consists of 564 images of 20 individuals (mixed race/gender/appearance). Each individual is shown in a range of poses from profile to frontal views -…
53 runs - 0 likes - 0 downloads - 0 reach - 2 impact
575 instances - 10305 features - 20 classes - 0 missing values
No data.
52 runs - 0 likes - 2 downloads - 2 reach - 0 impact
simple engine data
52 runs - 0 likes - 4 downloads - 4 reach - 3 impact
383 instances - 6 features - 3 classes - 0 missing values
The Street View House Numbers (SVHN) Dataset SVHN is a real-world image dataset for developing machine learning and object recognition algorithms with minimal requirement on data preprocessing and…
52 runs - 0 likes - 0 downloads - 0 reach - 2 impact
99289 instances - 3073 features - 10 classes - 0 missing values
No data.
52 runs - 0 likes - 3 downloads - 3 reach - 0 impact
1000000 instances - 48 features - 10 classes - 0 missing values
No data.
51 runs - 1 likes - 4 downloads - 5 reach - 0 impact
1000000 instances - 48 features - 10 classes - 0 missing values
The Committee on Statistical Graphics of the American Statistical Association (ASA) invites you to participate in its Second (1983) Exposition of Statistical Graphics Technology. The purposes of the…
51 runs - 0 likes - 3 downloads - 3 reach - 5 impact
406 instances - 9 features - 3 classes - 14 missing values
No data.
51 runs - 0 likes - 2 downloads - 2 reach - 0 impact
1000000 instances - 15 features - 2 classes - 0 missing values
No data.
50 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
50 runs - 0 likes - 3 downloads - 3 reach - 0 impact
1000000 instances - 61 features - 2 classes - 0 missing values
No data.
50 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 18 features - 22 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
50 runs - 0 likes - 5 downloads - 5 reach - 5 impact
95 instances - 10 features - 5 classes - 9 missing values
No data.
48 runs - 1 likes - 4 downloads - 5 reach - 0 impact
1000000 instances - 77 features - 10 classes - 0 missing values
No data.
47 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 45 features - 2 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
46 runs - 0 likes - 5 downloads - 5 reach - 6 impact
159 instances - 61360 features - 2 classes - 0 missing values
No data.
45 runs - 0 likes - 2 downloads - 2 reach - 0 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
44 runs - 0 likes - 2 downloads - 2 reach - 0 impact
1000000 instances - 15 features - 2 classes - 0 missing values
No data.
44 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 13 features - 11 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
44 runs - 0 likes - 2 downloads - 2 reach - 5 impact
92 instances - 6 features - 0 classes - 26 missing values
No data.
43 runs - 0 likes - 2 downloads - 2 reach - 0 impact
1000000 instances - 45 features - 2 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
41 runs - 0 likes - 2 downloads - 2 reach - 6 impact
1340 instances - 18 features - 3 classes - 20 missing values
Datasets for "Pattern Recognition and Neural Networks" by B.D. Ripley, Cambridge University Press (1996), ISBN 0-521-46086-7. The…
41 runs - 0 likes - 2 downloads - 2 reach - 5 impact
27 instances - 4 features - 4 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-subjects * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
40 runs - 0 likes - 2 downloads - 2 reach - 4 impact
64 instances - 243 features - 2 classes - 0 missing values
CIFAR-10 dataset but with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image…
40 runs - 0 likes - 0 downloads - 0 reach - 2 impact
13000 instances - 27649 features - 10 classes - 0 missing values
No data.
37 runs - 0 likes - 2 downloads - 2 reach - 0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
Donor: David W. Aha (aha@ics.uci.edu) This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one…
36 runs - 0 likes - 5 downloads - 5 reach - 0 impact
303 instances - 14 features - 0 classes - 6 missing values
Balanced version of click prediction data
36 runs - 0 likes - 11 downloads - 11 reach - 4 impact
1997410 instances - 12 features - 2 classes - 0 missing values
[PMLB](https://github.com/EpistasisLab/penn-ml-benchmarks/tree/master/datasets/classification/tokyo1) This is Performance co-pilot (PCP) data for the Tokyo server at Silicon Graphics International…
35 runs - 0 likes - 1 downloads - 1 reach - 9 impact
959 instances - 45 features - 2 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
35 runs - 0 likes - 2 downloads - 2 reach - 5 impact
23 instances - 6 features - 3 classes - 0 missing values
No data.
34 runs - 0 likes - 2 downloads - 2 reach - 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
33 runs - 0 likes - 4 downloads - 4 reach - 0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
33 runs - 0 likes - 4 downloads - 4 reach - 0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
32 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
32 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
MyExampleIris
32 runs - 0 likes - 0 downloads - 0 reach - 8 impact
150 instances - 5 features - 3 classes - 0 missing values
analcatdata_fraud-pmlb
32 runs - 0 likes - 0 downloads - 0 reach - 8 impact
42 instances - 12 features - 2 classes - 0 missing values
flare-pmlb
32 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1066 instances - 11 features - 2 classes - 0 missing values
cleve-pmlb
32 runs - 0 likes - 0 downloads - 0 reach - 8 impact
303 instances - 14 features - 2 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
32 runs - 0 likes - 2 downloads - 2 reach - 5 impact
57 instances - 12 features - 5 classes - 1 missing values
County data from the 2000 Presidential Election in Florida. Compiled by Brett Presnell Department of Statistics, University of Florida These data are derived from three sources, described below. As…
32 runs - 0 likes - 4 downloads - 4 reach - 5 impact
67 instances - 17 features - 5 classes - 0 missing values
1. Title: meta-data 2. Sources: (a) Creator: LIACC - University of Porto R.Campo Alegre 823 4150 PORTO (b) Donor: P.B.Brazdil or J.Gama Tel.: +351 600 1672 LIACC, University of Porto Fax.: +351 600…
32 runs - 0 likes - 2 downloads - 2 reach - 6 impact
528 instances - 22 features - 0 classes - 504 missing values
The AAUP dataset for the ASA Statistical Graphics Section's 1995 Data Analysis Exposition contains information on faculty salaries for 1161 American colleges and universities. The data may be obtained…
32 runs - 0 likes - 3 downloads - 3 reach - 5 impact
1161 instances - 17 features - 4 classes - 256 missing values
General Description of Thyroid Disease Databases and Related Files This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
32 runs - 0 likes - 6 downloads - 6 reach - 3 impact
2800 instances - 27 features - 5 classes - 0 missing values
Pittsburgh bridges This version is derived from version 1 by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally not a classification dataset,…
31 runs - 0 likes - 1 downloads - 1 reach - 6 impact
105 instances - 13 features - 6 classes - 61 missing values
Pittsburgh bridges This version is derived from version 2 (the discretized version) by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally not…
31 runs - 0 likes - 2 downloads - 2 reach - 6 impact
105 instances - 13 features - 6 classes - 61 missing values
No data.
31 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
31 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
31 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
The data was collected retrospectively at Wroclaw Thoracic Surgery Centre for patients who underwent major lung resections for primary lung cancer in the years 2007 - 2011. The Centre is associated…
31 runs - 0 likes - 2 downloads - 2 reach - 3 impact
470 instances - 17 features - 2 classes - 0 missing values
Abstract: The data set is composed of 60 chorales (5665 events) by J.S. Bach (1685-1750). Each event of each chorale is labelled using 1 among 101 chord labels and described through 14 features.…
31 runs - 0 likes - 2 downloads - 2 reach - 3 impact
5665 instances - 17 features - 102 classes - 0 missing values
GAMETES_Epistasis_2-Way_20atts_0.1H_EDM-1_1-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Epistasis_2-Way_20atts_0.4H_EDM-1_1-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Epistasis_3-Way_20atts_0.2H_EDM-1_1-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_75_EDM-2_001-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1600 instances - 21 features - 2 classes - 0 missing values
wine-quality-red-pmlb
31 runs - 1 likes - 0 downloads - 1 reach - 8 impact
1599 instances - 12 features - 6 classes - 0 missing values
Dataset used by Buntine and Niblett (1992). Composed of 10 features, one of which is irrelevant. The target is a disjunctive normal form formula over the nine other attributes, with additional…
31 runs - 0 likes - 0 downloads - 0 reach - 9 impact
973 instances - 10 features - 2 classes - 0 missing values
cars1-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 8 impact
392 instances - 8 features - 3 classes - 0 missing values
calendarDOW-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 8 impact
399 instances - 33 features - 5 classes - 0 missing values
car-evaluation-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1728 instances - 22 features - 4 classes - 0 missing values
Derived from the Musk dataset: https://www.openml.org/d/1116
31 runs - 0 likes - 0 downloads - 0 reach - 9 impact
476 instances - 169 features - 2 classes - 0 missing values
Derived from the Musk dataset: https://www.openml.org/d/1116
31 runs - 0 likes - 0 downloads - 0 reach - 9 impact
6598 instances - 169 features - 2 classes - 0 missing values
corral-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 9 impact
160 instances - 7 features - 2 classes - 0 missing values
PMLB version of the Titanic dataset, which only uses 3 features. See version 1 for the complete version: https://www.openml.org/d/40945
31 runs - 0 likes - 0 downloads - 0 reach - 9 impact
2201 instances - 4 features - 2 classes - 0 missing values
parity5_plus_5-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1124 instances - 11 features - 2 classes - 0 missing values
allbp-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 8 impact
3772 instances - 30 features - 3 classes - 0 missing values
allrep-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 8 impact
3772 instances - 30 features - 4 classes - 0 missing values
analcatdata_happiness-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 8 impact
60 instances - 4 features - 3 classes - 0 missing values
ecoli-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 8 impact
327 instances - 8 features - 5 classes - 0 missing values
led24-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 8 impact
3200 instances - 25 features - 10 classes - 0 missing values
led7-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 8 impact
3200 instances - 8 features - 10 classes - 0 missing values
The origin is not clear, but presumably this is an artificial problem representing M-of-N rules. The target is 1 if a certain M 'bits' are '1'? (Joaquin Vanschoren)
31 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1324 instances - 11 features - 2 classes - 0 missing values
mux6-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 8 impact
128 instances - 7 features - 2 classes - 0 missing values
new-thyroid-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 8 impact
215 instances - 6 features - 3 classes - 0 missing values
Relevant Information: -- The database contains 3 potential classes, one for the number of times a certain type of solar flare occurred in a 24 hour period. -- Each instance represents captured features…
31 runs - 0 likes - 0 downloads - 0 reach - 8 impact
315 instances - 13 features - 5 classes - 0 missing values
Relevant Information: -- The database contains 3 potential classes, one for the number of times a certain type of solar flare occurred in a 24 hour period. -- Each instance represents captured features…
31 runs - 0 likes - 0 downloads - 0 reach - 8 impact
1066 instances - 13 features - 6 classes - 0 missing values
threeOf9-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 9 impact
512 instances - 10 features - 2 classes - 0 missing values
cleveland-nominal-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 8 impact
303 instances - 8 features - 5 classes - 0 missing values
dis-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 9 impact
3772 instances - 30 features - 2 classes - 0 missing values
Andrew V Uzilov, Joshua M Keegan, and David H Mathews. Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change. BMC Bioinformatics, 7(173), 2006. This…
31 runs - 0 likes - 10 downloads - 10 reach - 6 impact
488565 instances - 9 features - 2 classes - 0 missing values
This directory contains Thyroid datasets. "ann-train.data" contains 3772 learning examples and "ann-test.data" contains 3428 testing examples. I have obtained this data from…
31 runs - 0 likes - 2 downloads - 2 reach - 4 impact
3772 instances - 22 features - 3 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
31 runs - 1 likes - 7 downloads - 8 reach - 3 impact
2800 instances - 27 features - 5 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
31 runs - 1 likes - 7 downloads - 8 reach - 3 impact
2800 instances - 27 features - 5 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
30 runs - 0 likes - 2 downloads - 2 reach - 0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
parity5-pmlb
30 runs - 0 likes - 0 downloads - 0 reach - 8 impact
32 instances - 6 features - 2 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
29 runs - 0 likes - 6 downloads - 6 reach - 0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
29 runs - 0 likes - 4 downloads - 4 reach - 0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 0 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
29 runs - 0 likes - 2 downloads - 2 reach - 0 impact
1000000 instances - 37 features - 2 classes - 0 missing values