Data
Dataset from the MLRR repository: http://axon.cs.byu.edu:5000/
180 runs - 0 likes - 5 downloads - 5 reach - 14 impact
294 instances - 12 features - 2 classes - 0 missing values
A dataset relating characteristics of telephony account features and usage and whether or not the customer churned. Originally used in [Discovering Knowledge in Data: An Introduction to Data…
4861 runs - 1 likes - 3 downloads - 4 reach - 14 impact
5000 instances - 21 features - 2 classes - 0 missing values
This database contains all legal 8-ply positions in the game of connect-4 in which neither player has won yet, and in which the next move is not forced. Attributes represent board positions on a 6x6…
8499 runs - 0 likes - 4 downloads - 4 reach - 14 impact
67557 instances - 43 features - 3 classes - 0 missing values
No data.
496 runs - 0 likes - 6 downloads - 6 reach - 13 impact
45 instances - 4027 features - 2 classes - 5948 missing values
No data.
296 runs - 0 likes - 5 downloads - 5 reach - 13 impact
96 instances - 4027 features - 9 classes - 19667 missing values
Multiclass cancer diagnosis using 16063 tumor gene expression signatures. PNAS, VOL 98, no 26, pp. 15149-15154, December 18, 2001. S. Ramaswamy, P. Tamayo, R. Rifkin, S. Mukherjee, C.-H. Yeang, M.…
116 runs - 0 likes - 7 downloads - 7 reach - 13 impact
190 instances - 16064 features - 14 classes - 0 missing values
No data.
283 runs - 0 likes - 5 downloads - 5 reach - 13 impact
96 instances - 4027 features - 11 classes - 19667 missing values
Dataset from the MLRR repository: http://axon.cs.byu.edu:5000/
731 runs - 0 likes - 5 downloads - 5 reach - 13 impact
151 instances - 7 features - 3 classes - 0 missing values
Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a…
111 runs - 0 likes - 7 downloads - 7 reach - 13 impact
70000 instances - 785 features - 10 classes - 0 missing values
Originally from the StatLog project. The raw data is still available on [UCI](https://archive.ics.uci.edu/ml/datasets/Molecular+Biology+(Splice-junction+Gene+Sequences)). The data consists of 3,186…
4286 runs - 0 likes - 2 downloads - 2 reach - 13 impact
3186 instances - 181 features - 3 classes - 0 missing values
This is the original version of the famous covertype dataset in ARFF format. Predicting forest cover type from cartographic variables only (no remotely sensed data). The actual forest cover type for a…
2 runs - 1 likes - 14 downloads - 15 reach - 12 impact
581012 instances - 55 features - 7 classes - 0 missing values
Re-upload of the dataset as it is present in the Penn ML Benchmark (https://github.com/EpistasisLab/penn-ml-benchmarks/tree/master/datasets/classification/fars). It's a dataset on traffic accidents,…
1 runs - 0 likes - 0 downloads - 0 reach - 12 impact
100968 instances - 30 features - 8 classes - 0 missing values
The instances were drawn randomly from a database of 7 outdoor images. The images were hand-segmented to create a classification for every pixel. Each instance is a 3x3 region. __Major changes w.r.t.…
4251 runs - 0 likes - 2 downloads - 2 reach - 12 impact
2310 instances - 20 features - 7 classes - 0 missing values
shuttle-pmlb
0 runs - 0 likes - 2 downloads - 2 reach - 12 impact
58000 instances - 10 features - 7 classes - 0 missing values
No data.
67 runs - 0 likes - 11 downloads - 11 reach - 11 impact
9558 instances - 26833 features - 44 classes - 0 missing values
No data.
159 runs - 0 likes - 11 downloads - 11 reach - 11 impact
1657 instances - 3759 features - 25 classes - 0 missing values
No data.
163 runs - 0 likes - 13 downloads - 13 reach - 11 impact
1560 instances - 8461 features - 20 classes - 0 missing values
The dataset contains transactions made by credit cards in September 2013 by European cardholders. This dataset presents transactions that occurred in two days, where we have 492 frauds out of 284,807…
355 runs - 0 likes - 54 downloads - 54 reach - 11 impact
284807 instances - 31 features - 2 classes - 0 missing values
wine-quality-red-pmlb
31 runs - 1 likes - 0 downloads - 1 reach - 11 impact
1599 instances - 12 features - 6 classes - 0 missing values
Dataset used by Buntine and Niblett (1992). Composed of 10 features, one of which is irrelevant. The target is a disjunctive normal form formula over the nine other attributes, with additional…
31 runs - 0 likes - 0 downloads - 0 reach - 11 impact
973 instances - 10 features - 2 classes - 0 missing values
flare-pmlb
32 runs - 0 likes - 0 downloads - 0 reach - 11 impact
1066 instances - 11 features - 2 classes - 0 missing values
PMLB version of the Titanic dataset, which only uses 3 features. See version 1 for the complete version: https://www.openml.org/d/40945
31 runs - 0 likes - 0 downloads - 0 reach - 11 impact
2201 instances - 4 features - 2 classes - 0 missing values
[PMLB](https://github.com/EpistasisLab/penn-ml-benchmarks/tree/master/datasets/classification/tokyo1) This is Performance Co-Pilot (PCP) data for the Tokyo server at Silicon Graphics International…
35 runs - 0 likes - 1 downloads - 1 reach - 11 impact
959 instances - 45 features - 2 classes - 0 missing values
parity5_plus_5-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 11 impact
1124 instances - 11 features - 2 classes - 0 missing values
led24-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 11 impact
3200 instances - 25 features - 10 classes - 0 missing values
led7-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 11 impact
3200 instances - 8 features - 10 classes - 0 missing values
The origin is not clear, but presumably this is an artificial problem representing M-of-N rules. The target is 1 if a certain M 'bits' are '1'? (Joaquin Vanschoren)
31 runs - 0 likes - 0 downloads - 0 reach - 11 impact
1324 instances - 11 features - 2 classes - 0 missing values
threeOf9-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 11 impact
512 instances - 10 features - 2 classes - 0 missing values
dis-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 11 impact
3772 instances - 30 features - 2 classes - 0 missing values
This database was derived from a simple hierarchical decision model originally developed for the demonstration of DEX (M. Bohanec, V. Rajkovic: Expert system for decision making. Sistemica 1(1), pp.…
4250 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1728 instances - 7 features - 4 classes - 0 missing values
### Description __Changes to version 1:__ all categorical features transformed as such. This dataset represents a set of possible advertisements on Internet pages. ### Sources (a) Creator and donor:…
4 runs - 0 likes - 2 downloads - 2 reach - 11 impact
3279 instances - 1559 features - 2 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. The maps were scanned in 8-bit grey value at a density of 400 dpi,…
4181 runs - 0 likes - 1 downloads - 1 reach - 11 impact
2000 instances - 241 features - 10 classes - 0 missing values
GAMETES_Epistasis_2-Way_1000atts_0.4H_EDM-1_EDM-1_1-pmlb
0 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1600 instances - 1001 features - 2 classes - 0 missing values
GAMETES_Epistasis_2-Way_20atts_0.1H_EDM-1_1-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 11 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Epistasis_2-Way_20atts_0.4H_EDM-1_1-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 11 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Epistasis_3-Way_20atts_0.2H_EDM-1_1-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 11 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_50_EDM-2_001-pmlb
0 runs - 0 likes - 0 downloads - 0 reach - 11 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_75_EDM-2_001-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 11 impact
1600 instances - 21 features - 2 classes - 0 missing values
Derived from the Musk dataset: https://www.openml.org/d/1116
31 runs - 0 likes - 1 downloads - 1 reach - 11 impact
476 instances - 169 features - 2 classes - 0 missing values
Derived from the Musk dataset: https://www.openml.org/d/1116
31 runs - 0 likes - 0 downloads - 0 reach - 11 impact
6598 instances - 169 features - 2 classes - 0 missing values
corral-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 11 impact
160 instances - 7 features - 2 classes - 0 missing values
This is a 20,000 instance sample of the original CIFAR-10 dataset. Sampled randomly and stratified, with 2000 examples per class. Training and test set are merged. Find the corresponding task for the…
380 runs - 0 likes - 3 downloads - 3 reach - 11 impact
20000 instances - 3073 features - 10 classes - 0 missing values
No data.
220 runs - 0 likes - 7 downloads - 7 reach - 10 impact
336 instances - 7903 features - 6 classes - 0 missing values
No data.
108 runs - 0 likes - 4 downloads - 4 reach - 10 impact
927 instances - 10129 features - 7 classes - 0 missing values
No data.
219 runs - 0 likes - 5 downloads - 5 reach - 10 impact
414 instances - 6430 features - 9 classes - 0 missing values
No data.
215 runs - 0 likes - 7 downloads - 7 reach - 10 impact
204 instances - 5833 features - 6 classes - 0 missing values
No data.
211 runs - 0 likes - 4 downloads - 4 reach - 10 impact
313 instances - 5805 features - 8 classes - 0 missing values
No data.
203 runs - 0 likes - 5 downloads - 5 reach - 10 impact
878 instances - 7455 features - 10 classes - 0 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs - 0 likes - 1 downloads - 1 reach - 10 impact
131 instances - 3 features - 3 classes - 0 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs - 0 likes - 3 downloads - 3 reach - 10 impact
65 instances - 3 features - 2 classes - 0 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs - 0 likes - 0 downloads - 0 reach - 10 impact
70 instances - 3 features - 3 classes - 0 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs - 0 likes - 0 downloads - 0 reach - 10 impact
61 instances - 3 features - 3 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
1133 runs - 0 likes - 15 downloads - 15 reach - 10 impact
150 instances - 5 features - 2 classes - 0 missing values
MyExampleIris
32 runs - 0 likes - 0 downloads - 0 reach - 10 impact
150 instances - 5 features - 3 classes - 0 missing values
#### 1. Summary This dataset contains attributes of dresses and their recommendations according to their sales. Sales are monitored on alternate days. The attributes analyzed are:…
16895 runs - 0 likes - 5 downloads - 5 reach - 10 impact
500 instances - 13 features - 2 classes - 835 missing values
DEXTER is a text classification problem in a bag-of-words representation. This is a two-class classification problem with sparse continuous input variables. This dataset is one of five datasets of the…
0 runs - 0 likes - 5 downloads - 5 reach - 10 impact
600 instances - 20001 features - 2 classes - 0 missing values
DOROTHEA is a drug discovery dataset. Chemical compounds represented by structural molecular features must be classified as active (binding to thrombin) or inactive. This is one of 5 datasets of the…
0 runs - 0 likes - 6 downloads - 6 reach - 10 impact
1150 instances - 100001 features - 2 classes - 0 missing values
####1. Summary This database was generated by the Laboratory of Image Processing and Pattern Recognition (INPG-LTIRF) in the development of the Esprit project ELENA No. 6891 and the Esprit working…
15298 runs - 0 likes - 10 downloads - 10 reach - 10 impact
5500 instances - 41 features - 11 classes - 0 missing values
cars1-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 10 impact
392 instances - 8 features - 3 classes - 0 missing values
allbp-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 10 impact
3772 instances - 30 features - 3 classes - 0 missing values
allrep-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 10 impact
3772 instances - 30 features - 4 classes - 0 missing values
analcatdata_happiness-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 10 impact
60 instances - 4 features - 3 classes - 0 missing values
cleve-pmlb
32 runs - 0 likes - 1 downloads - 1 reach - 10 impact
303 instances - 14 features - 2 classes - 0 missing values
ecoli-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 10 impact
327 instances - 8 features - 5 classes - 0 missing values
mux6-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 10 impact
128 instances - 7 features - 2 classes - 0 missing values
new-thyroid-pmlb
31 runs - 0 likes - 2 downloads - 2 reach - 10 impact
215 instances - 6 features - 3 classes - 0 missing values
postoperative-patient-data-pmlb
26 runs - 0 likes - 1 downloads - 1 reach - 10 impact
88 instances - 9 features - 2 classes - 0 missing values
Relevant Information: -- The database contains 3 potential classes, one for the number of times a certain type of solar flare occurred in a 24-hour period. -- Each instance represents captured features…
31 runs - 0 likes - 1 downloads - 1 reach - 10 impact
315 instances - 13 features - 5 classes - 0 missing values
Relevant Information: -- The database contains 3 potential classes, one for the number of times a certain type of solar flare occurred in a 24-hour period. -- Each instance represents captured features…
31 runs - 0 likes - 0 downloads - 0 reach - 10 impact
1066 instances - 13 features - 6 classes - 0 missing values
cleveland-nominal-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 10 impact
303 instances - 8 features - 5 classes - 0 missing values
parity5-pmlb
30 runs - 0 likes - 0 downloads - 0 reach - 10 impact
32 instances - 6 features - 2 classes - 0 missing values
__Changes w.r.t. version 1: renamed variables such that they match description.__ ### Dataset: Wilt Data Set ### Abstract: High-resolution Remote Sensing data set (Quickbird). Small number of training…
4185 runs - 0 likes - 1 downloads - 1 reach - 10 impact
4839 instances - 6 features - 2 classes - 0 missing values
Expression levels of 77 proteins measured in the cerebral cortex of 8 classes of control and Down syndrome mice exposed to context fear conditioning, a task used to assess associative learning. The…
4177 runs - 0 likes - 0 downloads - 0 reach - 10 impact
1080 instances - 82 features - 8 classes - 1396 missing values
analcatdata_fraud-pmlb
32 runs - 0 likes - 0 downloads - 0 reach - 10 impact
42 instances - 12 features - 2 classes - 0 missing values
calendarDOW-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 10 impact
399 instances - 33 features - 5 classes - 0 missing values
car-evaluation-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 10 impact
1728 instances - 22 features - 4 classes - 0 missing values
This dataset contains the lyrics of 358 songs by the rock bands 'The Rolling Stones' and 'Deep Purple'. The bands are equally represented in the dataset (179 songs for each band). This dataset was…
8 runs - 0 likes - 1 downloads - 1 reach - 10 impact
358 instances - 2 features - 2 classes - 0 missing values
Classes: 0. airplane, 1. automobile, 2. bird, 3. cat, 4. deer, 5. dog, 6. frog, 7. horse, 8. ship, 9. truck. CIFAR-10 contains 6000 images per class. The original train-test split randomly divided these into 5000 train…
77 runs - 0 likes - 3 downloads - 3 reach - 10 impact
60000 instances - 3073 features - 10 classes - 0 missing values
ARFF version of UCI dataset 'flags'. Creators: Collected primarily from the "Collins Gem Guide to Flags": Collins Publishers (1986). Donor: Richard S. Forsyth. Date 5/15/1990 This data file contains…
103 runs - 0 likes - 8 downloads - 8 reach - 9 impact
194 instances - 30 features - 8 classes - 0 missing values
The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonic prices and the demand for clean air', J. Environ. Economics & Management, vol.5, 81-102, 1978. Used in Belsley, Kuh & Welsch,…
6 runs - 0 likes - 5 downloads - 5 reach - 9 impact
506 instances - 14 features - 0 classes - 0 missing values
Datasets from ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php). KDD Cup 2009: http://www.kddcup-orange.com. Converted to ARFF format by TunedIT. Customer Relationship Management (CRM) is a key element…
218 runs - 0 likes - 16 downloads - 16 reach - 9 impact
50000 instances - 231 features - 2 classes - 8024152 missing values
One of the NASA Metrics Data Program defect data sets. The specific type of software is unknown. Data comes from McCabe and Halstead feature extractors of source code. These features were defined in…
815 runs - 0 likes - 14 downloads - 14 reach - 9 impact
9466 instances - 39 features - 2 classes - 0 missing values
This simple domain contains 7 Boolean attributes and 10 classes, the set of decimal digits. Recall that LED displays contain 7 light-emitting diodes -- hence the reason for 7 attributes. The class…
12691 runs - 0 likes - 5 downloads - 5 reach - 9 impact
500 instances - 8 features - 10 classes - 0 missing values
This dataset classifies people described by a set of attributes as good or bad credit risks. This dataset comes with a cost matrix:
```
      Good  Bad  (predicted)
Good     0    1  (actual)
Bad      5    0
```
It is worse…
502699 runs - 12 likes - 151 downloads - 163 reach - 9 impact
1000 instances - 21 features - 2 classes - 0 missing values
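The cost matrix in the credit-risk entry above can be applied to a set of predictions to get a total misclassification cost. The following is a minimal sketch, assuming labels are encoded as the strings "good" and "bad"; the function name `total_cost` and the sample label lists are hypothetical and not part of the dataset.

```python
# Cost matrix from the dataset description, keyed as (actual, predicted).
# The "good"/"bad" string encoding is an assumption made for illustration.
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 1,   # a good customer predicted as bad
    ("bad", "good"): 5,   # a bad customer predicted as good
    ("bad", "bad"): 0,
}


def total_cost(actual, predicted):
    """Sum the cost-matrix entries over paired actual/predicted labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(COST[(a, p)] for a, p in zip(actual, predicted))


# Hypothetical labels, used only to exercise the function.
actual = ["good", "bad", "bad", "good"]
predicted = ["bad", "good", "bad", "good"]
print(total_cost(actual, predicted))  # 1 + 5 + 0 + 0 = 6
```

Comparing models on this total rather than on plain accuracy reflects the asymmetry in the matrix, where one of the two error types costs five times as much as the other.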
"The speech dataset was also provided by (see citation request) and contains real world data from recorded English language. The normal class contains data from persons having an American accent…
1599 runs - 0 likes - 4 downloads - 4 reach - 9 impact
3686 instances - 401 features - 2 classes - 0 missing values
Vehicle classification in distributed sensor networks. Journal of Parallel and Distributed Computing, 64(7):826-838, July 2004. This is the SensIT Vehicle (combined) dataset, retrieved 2013-11-14 from…
403 runs - 0 likes - 22 downloads - 22 reach - 8 impact
98528 instances - 101 features - 2 classes - 0 missing values
SPECT heart data. This is a merged version of the separate train and test sets that are usually distributed. On OpenML this train-test split can be found as one of the possible tasks. Sources: --…
1296 runs - 1 likes - 12 downloads - 13 reach - 8 impact
267 instances - 23 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
72 runs - 1 likes - 6 downloads - 7 reach - 8 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
59 runs - 0 likes - 6 downloads - 6 reach - 8 impact
1545 instances - 10937 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
866 runs - 1 likes - 11 downloads - 12 reach - 8 impact
7129 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
708 runs - 0 likes - 4 downloads - 4 reach - 8 impact
286 instances - 10 features - 2 classes - 9 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch). Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php. Modified by TunedIT (converted to ARFF…
778 runs - 0 likes - 8 downloads - 8 reach - 8 impact
4562 instances - 15 features - 2 classes - 88 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch). Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php. Modified by TunedIT (converted to ARFF…
406 runs - 1 likes - 11 downloads - 12 reach - 8 impact
4229 instances - 1618 features - 2 classes - 0 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch). Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php. Modified by TunedIT (converted to ARFF…
486 runs - 0 likes - 14 downloads - 14 reach - 8 impact
14395 instances - 109 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
733 runs - 0 likes - 9 downloads - 9 reach - 8 impact
7485 instances - 56 features - 2 classes - 32427 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
169 runs - 0 likes - 8 downloads - 8 reach - 8 impact
600 instances - 62 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
744 runs - 0 likes - 8 downloads - 8 reach - 8 impact
7019 instances - 61 features - 2 classes - 43814 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
754 runs - 0 likes - 10 downloads - 10 reach - 8 impact
8844 instances - 57 features - 2 classes - 34843 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
857 runs - 0 likes - 12 downloads - 12 reach - 8 impact
9961 instances - 15 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
622 runs - 0 likes - 6 downloads - 6 reach - 8 impact
10108 instances - 69 features - 2 classes - 2699 missing values