OpenML
Filter results by:
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12840, and it has 1608 rows and 1026 features…
1 runs0 likes1 downloads1 reach3 impact
1608 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11711, and it has 11 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
11 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11902, and it has 1211 rows and 1026 features…
1 runs0 likes1 downloads1 reach3 impact
1211 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101338, and it has 79 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
79 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 30035, and it has 678 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
678 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100969, and it has 79 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
79 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 102913, and it has 10 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
10 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11055, and it has 46 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
46 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 186, and it has 14 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
14 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101411, and it has 74 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
74 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101021, and it has 81 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
81 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12474, and it has 360 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
360 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 104138, and it has 65 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
65 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100851, and it has 526 rows and 1026 features…
1 runs0 likes1 downloads1 reach3 impact
526 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 72, and it has 5354 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
5354 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 20109, and it has 44 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
44 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10901, and it has 541 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
541 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 102389, and it has 12 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
12 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12476, and it has 1023 rows and 1026 features…
1 runs0 likes1 downloads1 reach3 impact
1023 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100976, and it has 79 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
79 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12591, and it has 10 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
10 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 90, and it has 2055 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
2055 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101152, and it has 101 rows and 1026 features…
1 runs0 likes1 downloads1 reach3 impact
101 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100859, and it has 25 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
25 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12336, and it has 17 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
17 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 17134, and it has 32 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
32 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10648, and it has 21 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
21 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100436, and it has 163 rows and 1026 features…
1 runs0 likes1 downloads1 reach3 impact
163 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11870, and it has 129 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
129 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11977, and it has 44 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
44 instances - 1026 features - 0 classes - 0 missing values
No data.
69 runs0 likes4 downloads4 reach1 impact
1000000 instances - 20 features - 2 classes - 0 missing values
No data.
31 runs0 likes1 downloads1 reach2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
libSVM","AAD group #Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs0 likes3 downloads3 reach5 impact
48842 instances - 124 features - 0 classes - 0 missing values
Mean While 1
0 runs0 likes3 downloads3 reach3 impact
253 instances - 38 features - 2 classes - 0 missing values
No data.
33 runs0 likes4 downloads4 reach3 impact
1000000 instances - 70 features - 24 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes1 downloads1 reach5 impact
8 instances - 1143 features - 0 classes - 0 missing values
Klaverjas is an example of the Jack-Nine card games, which are characterized as trick-taking games where the the Jack and nine of the trump suit are the highest-ranking trumps, and the tens and aces…
0 runs0 likes1 downloads1 reach1 impact
981541 instances - 33 features - 2 classes - 0 missing values
No data.
71 runs0 likes5 downloads5 reach2 impact
1000000 instances - 17 features - 2 classes - 0 missing values
No data.
356 runs0 likes7 downloads7 reach1 impact
131072 instances - 17 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
66 runs0 likes1 downloads1 reach7 impact
386 instances - 10937 features - 2 classes - 0 missing values
No data.
50 runs0 likes3 downloads3 reach1 impact
1000000 instances - 61 features - 2 classes - 0 missing values
No data.
65 runs0 likes5 downloads5 reach1 impact
1000000 instances - 30 features - 4 classes - 0 missing values
No data.
143 runs0 likes4 downloads4 reach2 impact
1000000 instances - 39 features - 6 classes - 0 missing values
This is the famous covertype dataset in its binary version, retrieved 2013-11-13 from the libSVM site (called covtype.binary there). Additional to the preprocessing done there (see LibSVM site for…
22 runs0 likes9 downloads9 reach7 impact
581012 instances - 55 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs0 likes2 downloads2 reach7 impact
10000 instances - 2001 features - 5 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs0 likes0 downloads0 reach7 impact
8237 instances - 801 features - 7 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs0 likes0 downloads0 reach6 impact
416188 instances - 61 features - 355 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
5 runs0 likes0 downloads0 reach7 impact
10000 instances - 7201 features - 10 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
6 runs0 likes1 downloads1 reach7 impact
83733 instances - 55 features - 4 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
7 runs0 likes1 downloads1 reach7 impact
58310 instances - 181 features - 10 classes - 0 missing values
Datasets from ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php) KDD Cup 2009 http://www.kddcup-orange.com Converted to ARFF format by TunedIT Customer Relationship Management (CRM) is a key element…
223 runs0 likes17 downloads17 reach9 impact
50000 instances - 231 features - 2 classes - 8024152 missing values
shuttle-pmlb
6 runs0 likes2 downloads2 reach13 impact
58000 instances - 10 features - 7 classes - 0 missing values
Automated file upload of BNG(credit-g)
99 runs0 likes3 downloads3 reach2 impact
1000000 instances - 21 features - 2 classes - 0 missing values
Automated file upload of BNG(spambase)
98 runs0 likes3 downloads3 reach2 impact
1000000 instances - 58 features - 2 classes - 0 missing values
Automated file upload of 20_newsgroups.drift
124 runs0 likes1 downloads1 reach6 impact
399940 instances - 1001 features - 2 classes - 0 missing values
Automated file upload of BNG(segment)
99 runs0 likes1 downloads1 reach2 impact
1000000 instances - 20 features - 7 classes - 0 missing values
Automated file upload of BNG(anneal)
100 runs0 likes3 downloads3 reach3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
Multi-label dataset. The genbase dataset contains protein sequences that can be assigned to several classes of protein families.
0 runs0 likes1 downloads1 reach3 impact
662 instances - 1213 features - 2 classes - 0 missing values
The langLog dataset includes 1004 textual predictors and was originally compiled in the doctorial thesis of Read (2010). It consists of 956 text samples that can be assigned to one or more topics such…
0 runs0 likes4 downloads4 reach3 impact
1460 instances - 1079 features - 2 classes - 0 missing values
Multi-label dataset. A subset of the reuters dataset includes 2000 observations for text classification.
0 runs0 likes8 downloads8 reach4 impact
2000 instances - 250 features - 2 classes - 0 missing values
Multi-label dataset. The scene dataset is an image classification task where labels like Beach, Mountain, Field, Urban are assigned to each image.
0 runs0 likes12 downloads12 reach3 impact
2407 instances - 300 features - 2 classes - 0 missing values
Multi-label dataset. The yeast dataset (Elisseeff and Weston, 2002) consists of micro-array expression data, as well as phylogenetic profiles of yeast, and includes 2417 genes and 103 predictors. In…
0 runs0 likes2 downloads2 reach3 impact
2417 instances - 117 features - 2 classes - 0 missing values
This directory contains Thyroid datasets. "ann-train.data" contains 3772 learning examples and "ann-test.data" contains 3428 testing examples. I have obtained this data from…
31 runs0 likes2 downloads2 reach6 impact
3772 instances - 22 features - 3 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
92 runs0 likes5 downloads5 reach6 impact
2800 instances - 27 features - 5 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
32 runs0 likes7 downloads7 reach5 impact
2800 instances - 27 features - 5 classes - 0 missing values
Data on the population density of tree pipits, Anthus trivialis, in Franconian oak forests including variables describing the forest ecosystem. This data is taken from R package coin. This study is…
0 runs0 likes2 downloads2 reach4 impact
86 instances - 10 features - 0 classes - 0 missing values
Data from https://doi.org/10.5281/zenodo.269636
0 runs0 likes4 downloads4 reach5 impact
4758 instances - 39 features - classes - 0 missing values
#study_1
0 runs0 likes0 downloads0 reach2 impact
944 instances - 17 features - classes - 0 missing values
GAMETES_Epistasis_2-Way_1000atts_0.4H_EDM-1_EDM-1_1-pmlb
0 runs0 likes1 downloads1 reach12 impact
1600 instances - 1001 features - 2 classes - 0 missing values
GAMETES_Epistasis_2-Way_20atts_0.1H_EDM-1_1-pmlb
31 runs0 likes0 downloads0 reach12 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Epistasis_2-Way_20atts_0.4H_EDM-1_1-pmlb
31 runs0 likes0 downloads0 reach12 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Epistasis_3-Way_20atts_0.2H_EDM-1_1-pmlb
31 runs0 likes0 downloads0 reach12 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_50_EDM-2_001-pmlb
0 runs0 likes0 downloads0 reach12 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_75_EDM-2_001-pmlb
31 runs0 likes0 downloads0 reach12 impact
1600 instances - 21 features - 2 classes - 0 missing values
analcatdata_fraud-pmlb
32 runs0 likes0 downloads0 reach11 impact
42 instances - 12 features - 2 classes - 0 missing values
Dataset used by Buntine and Niblett (1992). Composed of 10 features, one of which is irrelevant. The target is a disjunctive normal form formula over the nine other attributes, with additional…
31 runs0 likes0 downloads0 reach12 impact
973 instances - 10 features - 2 classes - 0 missing values
calendarDOW-pmlb
31 runs0 likes1 downloads1 reach11 impact
399 instances - 33 features - 5 classes - 0 missing values
car-evaluation-pmlb
31 runs0 likes1 downloads1 reach11 impact
1728 instances - 22 features - 4 classes - 0 missing values
Derived from the Musk dataset: https://www.openml.org/d/1116
31 runs0 likes1 downloads1 reach12 impact
476 instances - 169 features - 2 classes - 0 missing values
Derived from the Musk dataset: https://www.openml.org/d/1116
31 runs0 likes0 downloads0 reach12 impact
6598 instances - 169 features - 2 classes - 0 missing values
corral-pmlb
31 runs0 likes1 downloads1 reach12 impact
160 instances - 7 features - 2 classes - 0 missing values
flare-pmlb
32 runs0 likes0 downloads0 reach12 impact
1066 instances - 11 features - 2 classes - 0 missing values
PMLB version of the Titanic dataset, which only uses 3 features. See version 1 for the complete version: https://www.openml.org/d/40945
35 runs0 likes0 downloads0 reach12 impact
2201 instances - 4 features - 2 classes - 0 missing values
)), [PMLB](https://github.com/EpistasisLab/penn-ml-benchmarks/tree/master/datasets/classification/tokyo1) This is Performance co-pilot (PCP) data for the Tokyo server at Silicon Graphics International…
35 runs0 likes1 downloads1 reach12 impact
959 instances - 45 features - 2 classes - 0 missing values
parity5_plus_5-pmlb
31 runs0 likes0 downloads0 reach12 impact
1124 instances - 11 features - 2 classes - 0 missing values
allbp-pmlb
31 runs0 likes1 downloads1 reach11 impact
3772 instances - 30 features - 3 classes - 0 missing values
allrep-pmlb
31 runs0 likes0 downloads0 reach11 impact
3772 instances - 30 features - 4 classes - 0 missing values
analcatdata_happiness-pmlb
31 runs0 likes0 downloads0 reach11 impact
60 instances - 4 features - 3 classes - 0 missing values
cleve-pmlb
32 runs0 likes1 downloads1 reach11 impact
303 instances - 14 features - 2 classes - 0 missing values
ecoli-pmlb
31 runs0 likes1 downloads1 reach11 impact
327 instances - 8 features - 5 classes - 0 missing values
Re-upload of the dataset as it is present in the Penn ML Benchmark (https://github.com/EpistasisLab/penn-ml-benchmarks/tree/master/datasets/classification/fars). It's a dataset on traffic accidents,…
1 runs0 likes0 downloads0 reach13 impact
100968 instances - 30 features - 8 classes - 0 missing values
led24-pmlb
31 runs0 likes1 downloads1 reach12 impact
3200 instances - 25 features - 10 classes - 0 missing values
led7-pmlb
31 runs0 likes0 downloads0 reach12 impact
3200 instances - 8 features - 10 classes - 0 missing values
The origin is not clear, but presumably this is an artificial problem representing M-of-N rules. The target is 1 if a certain M 'bits' are '1'? (Joaquin Vanschoren)
31 runs0 likes0 downloads0 reach12 impact
1324 instances - 11 features - 2 classes - 0 missing values
mux6-pmlb
31 runs0 likes1 downloads1 reach11 impact
128 instances - 7 features - 2 classes - 0 missing values
new-thyroid-pmlb
31 runs0 likes2 downloads2 reach11 impact
215 instances - 6 features - 3 classes - 0 missing values
postoperative-patient-data-pmlb
26 runs0 likes1 downloads1 reach11 impact
88 instances - 9 features - 2 classes - 0 missing values
Relevant Information: -- The database contains 3 potential classes, one for the number of times a certain type of solar flare occured in a 24 hour period. -- Each instance represents captured features…
31 runs0 likes1 downloads1 reach11 impact
315 instances - 13 features - 5 classes - 0 missing values
Relevant Information: -- The database contains 3 potential classes, one for the number of times a certain type of solar flare occured in a 24 hour period. -- Each instance represents captured features…
31 runs0 likes0 downloads0 reach11 impact
1066 instances - 13 features - 6 classes - 0 missing values
threeOf9-pmlb
31 runs0 likes0 downloads0 reach12 impact
512 instances - 10 features - 2 classes - 0 missing values