OpenML
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 2 downloads - 2 reach - 6 impact
10000 instances - 2001 features - 5 classes - 0 missing values
No data.
45 runs - 0 likes - 2 downloads - 2 reach - 1 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
50 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 65 features - 10 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag.
4 runs - 0 likes - 1 downloads - 1 reach - 1 impact
61 instances - 3 features - 0 classes - 0 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
2 runs - 1 likes - 1 downloads - 2 reach - 1 impact
8192 instances - 22 features - 0 classes - 0 missing values
Case number deleted. X treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric…
10 runs - 0 likes - 1 downloads - 1 reach - 1 impact
418 instances - 19 features - 0 classes - 1239 missing values
Horsepower treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
The problem is to learn a regression equation/rule/tree to predict the activity from the descriptive structural attributes. The data and methodology are described in detail in: - King, Ross D., Hurst,…
5 runs - 0 likes - 1 downloads - 1 reach - 1 impact
186 instances - 61 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag.
2 runs - 0 likes - 1 downloads - 1 reach - 1 impact
2178 instances - 4 features - 0 classes - 0 missing values
As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in Connectionist-Based Information Systems.…
2 runs - 0 likes - 1 downloads - 1 reach - 1 impact
200 instances - 11 features - 0 classes - 0 missing values
1. Title: Ozone Level Detection 2. Source: Kun Zhang zhang.kun05 '@' gmail.com Department of Computer Science, Xavier University of Louisiana Wei Fan wei.fan '@' gmail.com IBM T.J.Watson Research…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
2536 instances - 73 features - 0 classes - 0 missing values
Data from StatLib (ftp stat.cmu.edu/datasets) The infamous Longley data, "An appraisal of least-squares programs from the point of view of the user", JASA, 62(1967) p819-841. Variables are: Number of…
3 runs - 0 likes - 1 downloads - 1 reach - 1 impact
16 instances - 7 features - 0 classes - 0 missing values
The task consists of learning Quantitative Structure Activity Relationships (QSARs): the inhibition of dihydrofolate reductase by pyrimidines. The data are described in: King, Ross D., Muggleton,…
6 runs - 0 likes - 1 downloads - 1 reach - 1 impact
74 instances - 28 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 1 likes - 1 downloads - 2 reach - 5 impact
4450 instances - 203 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
25 instances - 10 features - 0 classes - 0 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs - 0 likes - 1 downloads - 1 reach - 10 impact
131 instances - 3 features - 3 classes - 0 missing values
Pittsburgh bridges This version is derived from version 1 by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally not a classification dataset,…
31 runs - 0 likes - 1 downloads - 1 reach - 7 impact
105 instances - 13 features - 6 classes - 61 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
20 instances - 10 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
16 instances - 34 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
365 instances - 4 features - 0 classes - 30 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
66 runs - 0 likes - 1 downloads - 1 reach - 7 impact
386 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs - 0 likes - 1 downloads - 1 reach - 7 impact
185 instances - 10937 features - 2 classes - 0 missing values
A family of datasets synthetically generated from a simulation of how bank-customers choose their banks. Tasks are based on predicting the fraction of bank customers who leave the bank because of full…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
8192 instances - 33 features - 0 classes - 0 missing values
Systematic determination of genetic network architecture. Nature Genetics, 1999 Jul;22(3):281-5. Data also used in Biclustering of Expression Data, by Yizong Cheng and George M. Church (web…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
17 instances - 2884 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs - 0 likes - 1 downloads - 1 reach - 3 impact
366 instances - 5 features - classes - 2 missing values
1. Title: Faults in a urban waste water treatment plant 2. Source Information: -- Creators: Manel Poch (igte2@cc.uab.es) Unitat d'Enginyeria Quimica Universitat Autonoma de Barcelona. Bellaterra.…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
67 runs - 0 likes - 1 downloads - 1 reach - 7 impact
458 instances - 10937 features - 2 classes - 0 missing values
chscase: A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
Data used in "A BAYESIAN APPROACH TO DATA DISCLOSURE: OPTIMAL INTRUDER BEHAVIOR FOR CONTINUOUS DATA" by Stephen E. Fienberg, Udi E. Makov, and Ashish P. Sanil. Background: In this paper we…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
662 instances - 4 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1000000 instances - 33 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1000000 instances - 14 features - 0 classes - 0 missing values
No data.
50 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 18 features - 22 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
17496 instances - 10 features - 0 classes - 0 missing values
1. Title: Employee Rejection\Acceptance (Ordinal ERA) 2. Source Information: Donor: Arie Ben David MIS, Dept. of Technology Management Holon Academic Inst. of Technology 52 Golomb St. Holon 58102 Israel…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 5 features - 0 classes - 0 missing values
No data.
10 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
7 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
7 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
6 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
7 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
31 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
31 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
177147 instances - 11 features - 0 classes - 0 missing values
No data.
44 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1000000 instances - 13 features - 11 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1000000 instances - 33 features - 0 classes - 0 missing values
No data.
47 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1000000 instances - 45 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
144 instances - 77 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10478, and it has 86 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
86 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10009, and it has 714 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
714 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10659, and it has 81 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
81 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101097, and it has 59 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
59 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101105, and it has 10 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
10 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 52, and it has 877 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
877 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 20174, and it has 1201 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1201 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100908, and it has 84 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
84 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100479, and it has 11 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
11 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10530, and it has 90 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
90 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 30049, and it has 733 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
733 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101505, and it has 15 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
15 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 250, and it has 2446 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
2446 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10075, and it has 161 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
161 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11300, and it has 1616 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1616 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 19904, and it has 584 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
584 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12078, and it has 70 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
70 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10506, and it has 10 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
10 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10227, and it has 15 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
15 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10766, and it has 122 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
122 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 234, and it has 2145 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
2145 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11694, and it has 157 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
157 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 103169, and it has 10 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
10 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 103561, and it has 47 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
47 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10250, and it has 124 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
124 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 30007, and it has 534 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
534 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101124, and it has 10 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
10 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11451, and it has 2442 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
2442 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10051, and it has 1007 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1007 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10981, and it has 262 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
262 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 102406, and it has 23 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
23 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12407, and it has 66 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
66 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100080, and it has 1157 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1157 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11866, and it has 47 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
47 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11242, and it has 1107 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1107 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 30000, and it has 83 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
83 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11017, and it has 1211 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1211 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 17084, and it has 1863 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1863 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 227, and it has 1238 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1238 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11036, and it has 396 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
396 instances - 1026 features - 0 classes - 0 missing values
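Each entry in this listing ends with a summary line of the form "N instances - N features - N classes - N missing values". As a rough sketch only, such a line could be turned into structured counts as shown below. The function name parse_summary and the choice to return None for a field whose number is absent are assumptions made for illustration; they are not part of OpenML or its API.

```python
import re

# Hypothetical helper (not part of OpenML): one optional count followed by
# one of the four field labels used in the listing's summary lines.
_FIELD = re.compile(r"(\d+)?\s*(instances|features|classes|missing values)")


def parse_summary(line):
    """Parse a listing summary line into a dict mapping label -> count."""
    counts = {}
    for part in line.split(" - "):
        match = _FIELD.fullmatch(part.strip())
        if match is None:
            continue  # ignore anything that is not a known field
        number, name = match.groups()
        # A label with no number in front of it is recorded as None.
        counts[name] = int(number) if number is not None else None
    return counts


summary = parse_summary("396 instances - 1026 features - 0 classes - 0 missing values")
```

Recording an absent count as None rather than 0 keeps an entry whose class count is blank distinguishable from one that reports 0 classes.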