OpenML
This database has been artificially generated. It describes the structure of the capital letters A, C, D, E, F, G, H, L, P, R, indicated by a number 1-10, in that order (A=1,C=2,...). Each letter's…
24309 runs - 0 likes - 10 downloads - 10 reach - 57 impact
10218 instances - 8 features - 10 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. The maps were scanned in 8 bit grey value at density of 400dpi,…
11340 runs - 1 likes - 2 downloads - 3 reach - 21 impact
2000 instances - 241 features - 10 classes - 0 missing values
SVHN is a real-world image dataset for developing machine learning and object recognition algorithms with minimal requirement on data preprocessing and formatting. It can be seen as similar in flavor…
52 runs - 0 likes - 1 downloads - 1 reach - 15 impact
99289 instances - 3073 features - 10 classes - 0 missing values
The dataset and this description are made available on http://www-stat.stanford.edu/~tibs/ElemStatLearn/data.html. Normalized handwritten digits, automatically scanned from envelopes by the U.S. Postal…
57 runs - 0 likes - 1 downloads - 1 reach - 11 impact
9298 instances - 257 features - 10 classes - 0 missing values
led24-pmlb
31 runs - 0 likes - 2 downloads - 2 reach - 21 impact
3200 instances - 25 features - 10 classes - 0 missing values
led7-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 21 impact
3200 instances - 8 features - 10 classes - 0 missing values
The midwest survey dataset contains individual responses from surveys about regional identification conducted for FiveThirtyEight by SurveyMonkey.
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
2778 instances - 28 features - 10 classes - 1744 missing values
The midwest survey dataset contains individual responses from surveys about regional identification conducted for FiveThirtyEight by SurveyMonkey.
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
2778 instances - 28 features - 10 classes - 1744 missing values
No data.
51 runs - 1 likes - 4 downloads - 5 reach - 11 impact
1000000 instances - 48 features - 10 classes - 0 missing values
No data.
67 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 17 features - 10 classes - 0 missing values
No data.
194 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
290 runs - 0 likes - 5 downloads - 5 reach - 11 impact
1000000 instances - 77 features - 10 classes - 0 missing values
No data.
52 runs - 0 likes - 2 downloads - 2 reach - 10 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
50 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 65 features - 10 classes - 0 missing values
Much of machine learning research focuses on producing models which perform well on benchmark tasks, in turn improving our understanding of the challenges associated with those tasks. From the…
0 runs - 0 likes - 1 downloads - 1 reach - 10 impact
70000 instances - 785 features - 10 classes - 0 missing values
No data.
293 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 17 features - 10 classes - 0 missing values
Dataset created to study concept drift in stream mining. It is constructed by combining the Covertype, Poker-Hand, and Electricity datasets. More details can be found in: Albert Bifet, Geoff Holmes,…
332 runs - 0 likes - 27 downloads - 27 reach - 12 impact
1455525 instances - 73 features - 10 classes - 0 missing values
No data.
52 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 48 features - 10 classes - 0 missing values
Automated file upload of BNG(optdigits)
100 runs - 1 likes - 1 downloads - 2 reach - 11 impact
1000000 instances - 65 features - 10 classes - 0 missing values
This simple domain contains 7 Boolean attributes and 10 classes, the set of decimal digits. Recall that LED displays contain 7 light-emitting diodes -- hence the reason for 7 attributes. The class…
13006 runs - 0 likes - 9 downloads - 9 reach - 18 impact
500 instances - 8 features - 10 classes - 0 missing values
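The seven Boolean attributes described in this entry correspond to the segments of a seven-segment display. A minimal sketch of the noise-free mapping between digits and segments follows. The segment names a to g, their order, and the helper names `encode_digit` and `decode_segments` are assumptions for illustration; the dataset's own attribute order and any noise it applies may differ.

```python
# Conventional seven-segment naming:
#   a = top, b = top-right, c = bottom-right, d = bottom,
#   e = bottom-left, f = top-left, g = middle.
# ASSUMPTION: this a..g ordering is illustrative, not taken from the dataset.
SEGMENTS = {
    0: "abcdef",
    1: "bc",
    2: "abdeg",
    3: "abcdg",
    4: "bcfg",
    5: "acdfg",
    6: "acdefg",
    7: "abc",
    8: "abcdefg",
    9: "abcdfg",
}


def encode_digit(digit):
    """Return the 7 Boolean attributes (a..g) for a decimal digit."""
    lit = SEGMENTS[digit]
    return [seg in lit for seg in "abcdefg"]


def decode_segments(bits):
    """Return the digit whose noise-free pattern equals bits, or None."""
    bits = list(bits)
    for digit in SEGMENTS:
        if encode_digit(digit) == bits:
            return digit
    return None
```

With clean inputs every digit decodes uniquely; a pattern that matches no digit yields `None`, which is why a learned classifier is of interest once attributes are corrupted.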
2126 fetal cardiotocograms (CTGs) were automatically processed and the respective diagnostic features measured. The CTGs were also classified by three expert obstetricians and a consensus…
24283 runs - 5 likes - 29 downloads - 34 reach - 56 impact
2126 instances - 36 features - 10 classes - 0 missing values
CIFAR-10 dataset but with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image…
40 runs - 0 likes - 0 downloads - 0 reach - 14 impact
13000 instances - 27649 features - 10 classes - 0 missing values
Survey to know if people self-identify as Midwesterners.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2778 instances - 28 features - 10 classes - 1737 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
11 runs - 0 likes - 0 downloads - 0 reach - 19 impact
10000 instances - 7201 features - 10 classes - 0 missing values
Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a…
452 runs - 0 likes - 12 downloads - 12 reach - 25 impact
70000 instances - 785 features - 10 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
13 runs - 0 likes - 1 downloads - 1 reach - 18 impact
58310 instances - 181 features - 10 classes - 0 missing values
Tattile Via Gaetano Donizetti, 1-3-5, 25030 Mairano (Brescia), Italy. Dataset Description: Semeion Handwritten Digit Data Set, where 1593 handwritten digits from around 80 persons were scanned and…
32940 runs - 0 likes - 23 downloads - 23 reach - 58 impact
1593 instances - 257 features - 10 classes - 0 missing values
Survey to know if people self-identify as Midwesterners.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2778 instances - 28 features - 10 classes - 1737 missing values
No data.
48 runs - 1 likes - 4 downloads - 5 reach - 12 impact
1000000 instances - 77 features - 10 classes - 0 missing values
Normalized version of the pokerhand data set. Automated file upload of pokerhand-normalized.arff. Data Set Information: Each record is an example of a hand consisting of five playing cards drawn…
314 runs - 0 likes - 12 downloads - 12 reach - 12 impact
829201 instances - 11 features - 10 classes - 0 missing values
No data.
304 runs - 0 likes - 7 downloads - 7 reach - 11 impact
1000000 instances - 25 features - 10 classes - 0 missing values
No data.
216 runs - 0 likes - 12 downloads - 12 reach - 63 impact
11162 instances - 11466 features - 10 classes - 0 missing values
50% stratified subsample of the original SVHN data
0 runs - 0 likes - 1 downloads - 1 reach - 11 impact
49644 instances - 3073 features - 10 classes - 0 missing values
No data.
377 runs - 0 likes - 11 downloads - 11 reach - 62 impact
913 instances - 3101 features - 10 classes - 0 missing values
This is a 20,000 instance sample of the original CIFAR-10 dataset. Sampled randomly and stratified, with 2000 examples per class. Training and test set are merged. Find the corresponding task for the…
380 runs - 0 likes - 5 downloads - 5 reach - 22 impact
20000 instances - 3073 features - 10 classes - 0 missing values
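The stratified sampling mentioned in this entry (a fixed number of examples per class, drawn at random) can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to build the dataset: the function name `stratified_sample`, the seed, and the use of Python's `random` module are all hypothetical choices.

```python
import random
from collections import defaultdict


def stratified_sample(labels, per_class, seed=0):
    """Pick per_class row indices at random from every class.

    labels: sequence of class labels, one per row.
    Returns the chosen row indices in ascending order.
    Raises ValueError if a class has fewer than per_class rows.
    """
    rng = random.Random(seed)  # hypothetical fixed seed for repeatability
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for label in sorted(by_class):
        chosen.extend(rng.sample(by_class[label], per_class))
    return sorted(chosen)
```

Calling `stratified_sample(labels, 2000)` on a 10-class label list would return 20,000 indices, matching the 2000-per-class figure quoted above.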
* Dataset Title: MicroMass - Mixed (mixed spectra version) * Abstract: A dataset to explore machine learning approaches for the identification of microorganisms from mass-spectrometry data. * Source:…
64 runs - 1 likes - 6 downloads - 7 reach - 13 impact
360 instances - 1301 features - 10 classes - 0 missing values
No data.
373 runs - 0 likes - 10 downloads - 10 reach - 62 impact
918 instances - 3013 features - 10 classes - 0 missing values
10% stratified subsample of the original SVHN data
0 runs - 0 likes - 1 downloads - 1 reach - 11 impact
9927 instances - 3073 features - 10 classes - 0 missing values
0. airplane 1. automobile 2. bird 3. cat 4. deer 5. dog 6. frog 7. horse 8. ship 9. truck CIFAR-10 contains 6000 images per class. The original train-test split randomly divided these into 5000 train…
159 runs - 0 likes - 6 downloads - 6 reach - 21 impact
60000 instances - 3073 features - 10 classes - 0 missing values
1. Title of Database: Optical Recognition of Handwritten Digits 2. Source: E. Alpaydin, C. Kaynak Department of Computer Engineering Bogazici University, 80815 Istanbul Turkey alpaydin@boun.edu.tr…
35799 runs - 3 likes - 22 downloads - 25 reach - 12 impact
5620 instances - 65 features - 10 classes - 0 missing values
We create a digit database by collecting 250 samples from 44 writers. The samples written by 30 writers are used for training, cross-validation and writer dependent testing, and the digits written by…
37205 runs - 0 likes - 21 downloads - 21 reach - 12 impact
10992 instances - 17 features - 10 classes - 0 missing values
No data.
2198 runs - 1 likes - 17 downloads - 18 reach - 9 impact
1484 instances - 9 features - 10 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. The maps were scanned in 8 bit grey value at density of 400dpi,…
26229 runs - 0 likes - 18 downloads - 18 reach - 13 impact
2000 instances - 241 features - 10 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
34558 runs - 0 likes - 23 downloads - 23 reach - 12 impact
2000 instances - 48 features - 10 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
38639 runs - 0 likes - 20 downloads - 20 reach - 12 impact
2000 instances - 65 features - 10 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
35343 runs - 0 likes - 18 downloads - 18 reach - 12 impact
2000 instances - 7 features - 10 classes - 0 missing values
The MNIST database of handwritten digits with 784 features. It can be split into a training set of the first 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available…
13233 runs - 7 likes - 73 downloads - 80 reach - 36 impact
70000 instances - 785 features - 10 classes - 0 missing values
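The positional split described in this entry (first 60,000 rows for training, the rest for testing) can be sketched as below. The function name `split_by_position` is hypothetical, and the sketch assumes the rows are already loaded in their original order; shuffling before the split would not reproduce the split the description refers to.

```python
def split_by_position(rows, n_train=60_000):
    """Split rows into (train, test) by position.

    The first n_train rows become the training set and all remaining
    rows become the test set. ASSUMPTION: rows keep the original order.
    """
    if n_train < 0 or n_train > len(rows):
        raise ValueError("n_train must be between 0 and len(rows)")
    return rows[:n_train], rows[n_train:]


# Stand-in for the 70,000 loaded examples; real rows would hold pixel values.
rows = list(range(70_000))
train, test = split_by_position(rows)
```

With 70,000 rows this yields 60,000 training and 10,000 test examples, consistent with the counts in the listing.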
No data.
414 runs - 0 likes - 9 downloads - 9 reach - 62 impact
690 instances - 8262 features - 10 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
37368 runs - 0 likes - 18 downloads - 18 reach - 15 impact
2000 instances - 217 features - 10 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
38026 runs - 0 likes - 12 downloads - 12 reach - 12 impact
2000 instances - 77 features - 10 classes - 0 missing values
No data.
203 runs - 0 likes - 5 downloads - 5 reach - 21 impact
878 instances - 7455 features - 10 classes - 0 missing values
No data.
416 runs - 1 likes - 14 downloads - 15 reach - 63 impact
1050 instances - 3239 features - 10 classes - 0 missing values
No data.
428 runs - 0 likes - 13 downloads - 13 reach - 63 impact
1003 instances - 3183 features - 10 classes - 0 missing values
Modified by TunedIT (converted to ARFF format). GINA is a digit recognition database. The task of GINA is handwritten digit recognition. Data type: non-sparse. Number of features: 784. Number of examples…
396 runs - 0 likes - 17 downloads - 17 reach - 15 impact
3468 instances - 785 features - 10 classes - 0 missing values
No data.
44 runs - 0 likes - 1 downloads - 1 reach - 9 impact
1000000 instances - 13 features - 11 classes - 0 missing values
Author: Marius Lindauer. Date: 27.02.2014. This data set was generated for a publication about claspfolio 2.0, i.e., an algorithm selector for ASP. The algorithm portfolio of clasp (2.1.4)…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
1294 instances - 143 features - 11 classes - 18258 missing values
1. Summary: This database was generated by the Laboratory of Image Processing and Pattern Recognition (INPG-LTIRF) in the development of the Esprit project ELENA No. 6891 and the Esprit working…
20229 runs - 0 likes - 13 downloads - 13 reach - 18 impact
5500 instances - 41 features - 11 classes - 0 missing values
Dataset Title: Localization Data for Person Activity Data Set Abstract: Data contains recordings of five people performing different activities. Each person wore four sensors (tags) while performing…
6 runs - 0 likes - 6 downloads - 6 reach - 15 impact
164860 instances - 8 features - 11 classes - 0 missing values
This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
519 runs - 0 likes - 7 downloads - 7 reach - 14 impact
203 instances - 17 features - 11 classes - 0 missing values
Speaker independent recognition of the eleven steady state vowels of British English using a specified training set of lpc derived log area ratios. Collected by David Deterding (data and…
26450 runs - 1 likes - 18 downloads - 19 reach - 43 impact
990 instances - 13 features - 11 classes - 0 missing values
No data.
405 runs - 0 likes - 7 downloads - 7 reach - 12 impact
45164 instances - 75 features - 11 classes - 0 missing values
No data.
283 runs - 0 likes - 6 downloads - 6 reach - 23 impact
96 instances - 4027 features - 11 classes - 19667 missing values
No data.
222 runs - 0 likes - 11 downloads - 11 reach - 15 impact
1504 instances - 2887 features - 13 classes - 0 missing values
The aim is to determine the type of arrhythmia from the ECG recordings. This database contains 279 attributes, 206 of which are linear valued and the rest are nominal. Concerning the study of H. Altay…
4430 runs - 0 likes - 50 downloads - 50 reach - 15 impact
452 instances - 280 features - 13 classes - 408 missing values
source: http://www.cs.ubc.ca/labs/beta/Projects/SATzilla/ authors: L. Xu, F. Hutter, H. Hoos, K. Leyton-Brown translator in coseal format: M. Lindauer with the help of Alexandre Frechette the data do…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
296 instances - 116 features - 14 classes - 1810 missing values
Multiclass cancer diagnosis using 16063 tumor gene expression signatures. PNAS, VOL 98, no 26, pp. 15149-15154, December 18, 2001. S. Ramaswamy, P. Tamayo, R. Rifkin, S. Mukherjee, C.-H. Yeang, M.…
116 runs - 0 likes - 9 downloads - 9 reach - 23 impact
190 instances - 16064 features - 14 classes - 0 missing values
No data.
426 runs - 0 likes - 16 downloads - 16 reach - 84 impact
2463 instances - 2001 features - 17 classes - 0 missing values
Abstract: A chess endgame data set representing the positions on the board of the white king, the white rook, and the black king. The task is to determine the optimum number of turns required for white…
25 runs - 0 likes - 6 downloads - 6 reach - 14 impact
28056 instances - 7 features - 18 classes - 0 missing values
Classify a chess game based on the position of the white king, the white rook and the black king.
1777 runs - 0 likes - 16 downloads - 16 reach - 9 impact
28056 instances - 7 features - 18 classes - 0 missing values
No data.
314 runs - 1 likes - 8 downloads - 9 reach - 11 impact
1000000 instances - 36 features - 19 classes - 0 missing values
This is the large soybean database from the UCI repository, with its training and test database combined into a single file. There are 19 classes, only the first 15 of which have been used in prior…
40719 runs - 1 likes - 53 downloads - 54 reach - 13 impact
683 instances - 36 features - 19 classes - 2337 missing values
__Major changes w.r.t. version 2: ignored variable 3 in this upload as this seems to be a perfect predictor.__ Tamilnadu Electricity Board Hourly Readings dataset. Real-time readings were collected…
0 runs - 0 likes - 2 downloads - 2 reach - 19 impact
45781 instances - 4 features - 20 classes - 0 missing values
The Sheffield (previously UMIST) Face Database consists of 564 images of 20 individuals (mixed race/gender/appearance). Each individual is shown in a range of poses from profile to frontal views -…
53 runs - 0 likes - 1 downloads - 1 reach - 15 impact
575 instances - 10305 features - 20 classes - 0 missing values
No data.
163 runs - 0 likes - 13 downloads - 13 reach - 22 impact
1560 instances - 8461 features - 20 classes - 0 missing values
Description: MicroMass (pure spectra version) is a dataset to explore machine learning approaches for the identification of microorganisms from mass-spectrometry data. Source: Pierre Mahé,…
39629 runs - 1 likes - 17 downloads - 18 reach - 98 impact
571 instances - 1301 features - 20 classes - 0 missing values
Primary Tumor Domain - Donors: - I. Kononenko, University E.Kardelj, Faculty for electrical engineering - B. Cestnik, Jozef Stefan Institute - Past Usage: (several) 1. Cestnik,G., Kononenko,I, &…
1261 runs - 0 likes - 16 downloads - 16 reach - 12 impact
339 instances - 18 features - 21 classes - 225 missing values
No data.
50 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 18 features - 22 classes - 0 missing values
The dataset collects data from an Android smartphone positioned in the chest pocket. Accelerometer Data are collected from 22 participants walking in the wild over a predefined path. The dataset is…
80 runs - 0 likes - 8 downloads - 8 reach - 15 impact
149332 instances - 5 features - 22 classes - 0 missing values
This is a 10% stratified subsample of the data from the 1999 ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php). Modified by TunedIT (converted to ARFF format)…
25 runs - 1 likes - 35 downloads - 36 reach - 15 impact
494020 instances - 42 features - 23 classes - 0 missing values
Datasets from ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php) Data set for KDD Cup 1999 Modified by TunedIT (converted to ARFF format)…
4 runs - 1 likes - 21 downloads - 22 reach - 15 impact
4898431 instances - 42 features - 23 classes - 0 missing values
INTRUSION DETECTOR LEARNING Software to detect network intrusions protects a computer network from unauthorized users, including perhaps insiders. The intrusion detector learning task is to build a…
0 runs - 1 likes - 0 downloads - 1 reach - 1 impact
4898431 instances - 42 features - 23 classes - 0 missing values
No data.
33 runs - 0 likes - 4 downloads - 4 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
30 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
31 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
37 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
This database is a standardized version of the original audiology database (see audiology.* in this directory). The non-standard set of attributes have been converted to a standard set of attributes…
7303 runs - 0 likes - 12 downloads - 12 reach - 12 impact
226 instances - 70 features - 24 classes - 317 missing values
No data.
159 runs - 0 likes - 11 downloads - 11 reach - 22 impact
1657 instances - 3759 features - 25 classes - 0 missing values
No data.
60 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
311 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
28 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
32 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
34 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
31 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 17 features - 26 classes - 0 missing values