OpenML
diabetes
0 runs0 likes4 downloads4 reach6 impact
768 instances - 9 features - classes - 0 missing values
Author: Gregory Gay, Tim Menzies, Misty Davies, Karen Gundy-Burlet Source: [Zenodo](https://zenodo.org/record/322475) Please cite: Misty Davies. (2009). bike [Data set]. Zenodo. DOI:…
0 runs0 likes4 downloads4 reach10 impact
4435 instances - 11 features - classes - 0 missing values
Automated file upload of BNG(credit-g)
99 runs0 likes4 downloads4 reach12 impact
1000000 instances - 21 features - 2 classes - 0 missing values
No data.
117 runs0 likes4 downloads4 reach9 impact
1000000 instances - 20 features - 5 classes - 0 missing values
Citation Request: This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
66 runs0 likes4 downloads4 reach14 impact
277 instances - 10 features - 2 classes - 0 missing values
This is sensor data for testing; it is not complete.
0 runs0 likes4 downloads4 reach11 impact
127591 instances - 27 features - classes - 0 missing values
Major changes w.r.t. version 1: deactivated first two variables as they describe the batch of the experiments and should not be used for prediction. Also transformed the target from numeric to…
8809 runs0 likes4 downloads4 reach13 impact
540 instances - 21 features - 2 classes - 0 missing values
No data.
211 runs0 likes4 downloads4 reach21 impact
313 instances - 5805 features - 8 classes - 0 missing values
Squash Harvest Unstored Data source: Winna Harvey Crop and Food Research, Christchurch, New Zealand The purpose of the research was to determine the changes taking place in squash fruit during the…
876 runs0 likes4 downloads4 reach15 impact
52 instances - 24 features - 3 classes - 39 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
537 runs0 likes4 downloads4 reach14 impact
285 instances - 8 features - 7 classes - 27 missing values
The Committee on Statistical Graphics of the American Statistical Association (ASA) invites you to participate in its Second (1983) Exposition of Statistical Graphics Technology. The purposes of the…
164 runs0 likes4 downloads4 reach14 impact
406 instances - 8 features - 3 classes - 14 missing values
No data.
949 runs0 likes4 downloads4 reach12 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
996 runs0 likes4 downloads4 reach12 impact
74 instances - 63 features - 4 classes - 0 missing values
Squash Harvest Stored Data source: Winna Harvey Crop and Food Research, Christchurch, New Zealand The purpose of the research was to determine the changes taking place in squash fruit during the…
867 runs0 likes4 downloads4 reach15 impact
52 instances - 25 features - 3 classes - 7 missing values
Hayes-Roth Database This is a merged version of the separate train and test set which are usually distributed. On OpenML this train-test split can be found as one of the possible tasks. Source…
384 runs0 likes4 downloads4 reach26 impact
160 instances - 5 features - 3 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
103 runs0 likes4 downloads4 reach14 impact
52 instances - 9 features - 2 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
490 runs0 likes4 downloads4 reach13 impact
364 instances - 33 features - 6 classes - 101 missing values
CODING: ITEM 1 = BUSINESS CONDITIONS 6 MONTHS FROM NOW (CONFERENCE BOARD) ITEM 2 = JOBS 6 MONTHS FROM NOW (CONFERENCE BOARD) ITEM 3 = FAMILY INCOME 6 MONTHS FROM NOW (CONFERENCE BOARD) ITEM 4 =…
560 runs0 likes4 downloads4 reach14 impact
72 instances - 4 features - 6 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
708 runs0 likes4 downloads4 reach16 impact
286 instances - 10 features - 2 classes - 9 missing values
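The mean-based binarization described in these entries can be sketched as follows. This is a minimal illustration, not the procedure used to build the datasets: the listing truncates the rule, so the direction of the split (values below the mean become 'P') and the 'N' label are assumptions, and `binarize_by_mean` is a hypothetical helper name.

```python
from statistics import mean
from typing import List, Sequence


def binarize_by_mean(target: Sequence[float]) -> List[str]:
    """Turn a numeric target into a two-class nominal target using its mean."""
    threshold = mean(target)
    # Assumption: values below the mean are labeled 'P', all others 'N'.
    # The truncated description does not confirm which side is which.
    return ["P" if value < threshold else "N" for value in target]


# Hypothetical target values; their mean is 4.0.
labels = binarize_by_mean([1.0, 2.0, 3.0, 10.0])
print(labels)
```

Because the threshold is the mean rather than the median, a skewed target such as this one can yield unbalanced classes.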
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
700 runs0 likes4 downloads4 reach15 impact
294 instances - 14 features - 2 classes - 782 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
672 runs0 likes4 downloads4 reach15 impact
158 instances - 8 features - 2 classes - 87 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
688 runs0 likes4 downloads4 reach14 impact
294 instances - 14 features - 2 classes - 782 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
755 runs0 likes4 downloads4 reach14 impact
54 instances - 8 features - 2 classes - 120 missing values
County data from the 2000 Presidential Election in Florida. Compiled by Brett Presnell Department of Statistics, University of Florida These data are derived from three sources, described below. As…
32 runs0 likes4 downloads4 reach14 impact
67 instances - 16 features - 5 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
102 runs0 likes4 downloads4 reach14 impact
67 instances - 15 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
107 runs0 likes4 downloads4 reach15 impact
66 instances - 12 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
119 runs0 likes4 downloads4 reach14 impact
95 instances - 9 features - 2 classes - 9 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
670 runs0 likes4 downloads4 reach14 impact
62 instances - 8 features - 2 classes - 8 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
103 runs0 likes4 downloads4 reach14 impact
92 instances - 10 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
714 runs0 likes4 downloads4 reach15 impact
303 instances - 14 features - 2 classes - 6 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
117 runs0 likes4 downloads4 reach14 impact
50 instances - 5 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
733 runs0 likes4 downloads4 reach14 impact
87 instances - 11 features - 2 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
11 runs0 likes3 downloads3 reach14 impact
214 instances - 45102 features - 7 classes - 0 missing values
No data.
44 runs0 likes3 downloads3 reach12 impact
1000000 instances - 15 features - 2 classes - 0 missing values
No data.
337 runs1 likes2 downloads3 reach12 impact
1000000 instances - 13 features - 3 classes - 0 missing values
No data.
28 runs0 likes3 downloads3 reach10 impact
1000000 instances - 26 features - 7 classes - 0 missing values
A copy of the data set proposed in: S. M. Weiss, and C. A. Kulikowski, Computer Systems That Learn (1991).
30 runs0 likes3 downloads3 reach13 impact
106 instances - 8 features - classes - 0 missing values
libSVM, AAD group. A simple and efficient algorithm for gene selection using sparse logistic regression. Bioinformatics, 19(17):2246-2253, 2003. #Dataset from the LIBSVM data repository.…
0 runs0 likes3 downloads3 reach16 impact
86 instances - 7130 features - 0 classes - 0 missing values
Building projectable classifiers of arbitrary complexity. In Proceedings of the 13th International Conference on Pattern Recognition, pages 880-885, Vienna, Austria, August 1996. #Dataset from the…
0 runs0 likes3 downloads3 reach16 impact
862 instances - 3 features - 0 classes - 0 missing values
Mind Cave 2
0 runs0 likes3 downloads3 reach11 impact
125 instances - 40 features - 2 classes - 0 missing values
libSVM, AAD group. #Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs0 likes3 downloads3 reach16 impact
48842 instances - 124 features - 0 classes - 0 missing values
No data.
6 runs0 likes3 downloads3 reach13 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
305 runs0 likes3 downloads3 reach12 impact
1000000 instances - 4 features - 2 classes - 0 missing values
No data.
313 runs0 likes3 downloads3 reach9 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
0 runs0 likes3 downloads3 reach12 impact
116640 instances - 10 features - 0 classes - 0 missing values
Michel Lang fRMA-normalized. Only "Kratz-genes"*. * (see: A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international…
3 runs0 likes3 downloads3 reach12 impact
442 instances - 24 features - 0 classes - 0 missing values
This data is derived from the 2012 KDD Cup. The data is subsampled to 1% of the original number of instances, downsampling the majority class (click=0) so that the target feature is reasonably…
0 runs1 likes2 downloads3 reach10 impact
798964 instances - 10 features - 3 classes - 399482 missing values
No data.
0 runs0 likes3 downloads3 reach12 impact
31104 instances - 10 features - 0 classes - 0 missing values
No data.
0 runs0 likes3 downloads3 reach9 impact
59049 instances - 10 features - 0 classes - 0 missing values
No data.
0 runs0 likes3 downloads3 reach18 impact
697641 instances - 47237 features - 0 classes - 0 missing values
libSVM, AAD group. #Dataset from the LIBSVM data repository. Preprocessing: Vikas Sindhwani for the SVMlin project.
0 runs0 likes3 downloads3 reach16 impact
72309 instances - 20959 features - 0 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-bodies-stemmed * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
0 runs0 likes3 downloads3 reach12 impact
64 instances - 3722 features - 2 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-subjects-stemmed * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
71 runs0 likes3 downloads3 reach13 impact
64 instances - 230 features - 2 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-subjects * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
40 runs0 likes3 downloads3 reach13 impact
64 instances - 243 features - 2 classes - 0 missing values
* Abstract: 9-class version of the poker-hand dataset, with the minority class removed.
1 runs0 likes3 downloads3 reach14 impact
1025000 instances - 11 features - 9 classes - 0 missing values
libSVM, AAD group. #Dataset from the LIBSVM data repository. Preprocessing: We used binary encoding for each feature (o, b, x), so the number of features is 42*3 = 126.
0 runs0 likes3 downloads3 reach16 impact
67557 instances - 127 features - 0 classes - 0 missing values
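The binary encoding described for this entry (three indicator columns per original feature, giving 42*3 = 126) can be illustrated with a short sketch. The function name `encode_cells` and the column order (o, b, x) are assumptions made for illustration; the listing does not say how the repository ordered the columns, and the 127 features reported above presumably include one additional column such as the label.

```python
from typing import List

# The three values each original feature can take, as named in the description.
STATES = ("o", "b", "x")


def encode_cells(cells: List[str]) -> List[int]:
    """One-hot encode 42 categorical features into 42 * 3 = 126 binary features."""
    if len(cells) != 42:
        raise ValueError("expected exactly 42 feature values")
    features: List[int] = []
    for cell in cells:
        if cell not in STATES:
            raise ValueError(f"unexpected value: {cell!r}")
        # Exactly one of the three indicator columns is set for each feature.
        features.extend(1 if cell == state else 0 for state in STATES)
    return features


# Hypothetical input: every feature takes the value 'b'.
encoded = encode_cells(["b"] * 42)
print(len(encoded))
```

Each original feature contributes exactly one set bit, so every encoded row sums to 42.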
#Dataset from the LIBSVM multiclass data repository.
0 runs0 likes3 downloads3 reach18 impact
108000 instances - 129 features - 0 classes - 0 missing values
This is a corrected version of the previous data file in version 1, which contained a dataset (349 instances) incorrectly merged from the original training and test sets available on UCI (there are…
0 runs0 likes3 downloads3 reach12 impact
267 instances - 45 features - 2 classes - 0 missing values
No data.
328 runs0 likes3 downloads3 reach12 impact
1000000 instances - 4 features - 2 classes - 0 missing values
No data.
298 runs0 likes3 downloads3 reach12 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
309 runs0 likes3 downloads3 reach12 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
307 runs0 likes3 downloads3 reach12 impact
1000000 instances - 41 features - 3 classes - 0 missing values
No data.
70 runs0 likes3 downloads3 reach9 impact
1000000 instances - 28 features - 2 classes - 0 missing values
No data.
72 runs0 likes3 downloads3 reach9 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
194 runs0 likes3 downloads3 reach12 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
66 runs0 likes3 downloads3 reach13 impact
1000000 instances - 35 features - 6 classes - 0 missing values
No data.
211 runs0 likes3 downloads3 reach12 impact
1000000 instances - 20 features - 7 classes - 0 missing values
No data.
65 runs1 likes2 downloads3 reach9 impact
1000000 instances - 18 features - 7 classes - 0 missing values
No data.
65 runs0 likes3 downloads3 reach9 impact
1000000 instances - 40 features - 2 classes - 0 missing values
No data.
75 runs0 likes3 downloads3 reach9 impact
137781 instances - 10 features - 7 classes - 0 missing values
No data.
304 runs0 likes3 downloads3 reach9 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
306 runs0 likes3 downloads3 reach9 impact
1000000 instances - 13 features - 6 classes - 0 missing values
No data.
52 runs0 likes3 downloads3 reach12 impact
1000000 instances - 48 features - 10 classes - 0 missing values
No data.
50 runs0 likes3 downloads3 reach9 impact
1000000 instances - 61 features - 2 classes - 0 missing values
No data.
63 runs0 likes3 downloads3 reach9 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
67 runs0 likes3 downloads3 reach9 impact
1000000 instances - 13 features - 6 classes - 0 missing values
No data.
66 runs0 likes3 downloads3 reach9 impact
1000000 instances - 13 features - 6 classes - 0 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs0 likes3 downloads3 reach21 impact
65 instances - 3 features - 2 classes - 0 missing values
Information about customers consists of 86 variables and includes product usage data and socio-demographic data derived from zip area codes. The data was supplied by the Dutch data mining company…
0 runs0 likes3 downloads3 reach13 impact
9822 instances - 86 features - 0 classes - 0 missing values
Geographical Analysis Spatial Data This georeferenced data set was used in: Pace, R. Kelley, and Ronald Barry, Quick Computation of Regressions with a Spatially Autoregressive Dependent Variable,…
4 runs1 likes2 downloads3 reach15 impact
3107 instances - 7 features - 0 classes - 0 missing values
Determinants of Wages from the 1985 Current Population Survey Summary: The Current Population Survey (CPS) is used to supplement census information between census years. These data consist of a random…
2 runs0 likes3 downloads3 reach14 impact
534 instances - 11 features - 0 classes - 0 missing values
The data consist of 2001 observations taken from a balloon about 30 kilometres above the surface of the earth. In the section of the flight shown here the balloon increases in height. As radiation…
0 runs1 likes2 downloads3 reach14 impact
2001 instances - 2 features - 0 classes - 0 missing values
No data.
206 runs0 likes3 downloads3 reach12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
311 runs0 likes3 downloads3 reach12 impact
1000000 instances - 17 features - 26 classes - 0 missing values
Pittsburgh bridges This version is derived from version 2 (the discretized version) by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally not…
31 runs0 likes3 downloads3 reach15 impact
105 instances - 12 features - 6 classes - 61 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
8 runs0 likes3 downloads3 reach14 impact
113 instances - 54676 features - 5 classes - 0 missing values
Annual salary information including gross pay and overtime pay for all active, permanent employees of Montgomery County, MD paid in calendar year 2016. This information will be published annually each…
0 runs0 likes3 downloads3 reach8 impact
9228 instances - 13 features - 0 classes - 11169 missing values
1. Title: Lecturers Evaluation (Ordinal LEV) 2. Source Information: Donor: Arie Ben David MIS, Dept. of Technology Management Holon Academic Inst. of Technology 52 Golomb St. Holon 58102 Israel…
0 runs1 likes2 downloads3 reach14 impact
1000 instances - 5 features - 0 classes - 0 missing values
Datasets of the Data And Story Library, a project illustrating the use of basic statistical methods, converted to ARFF format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
3 runs0 likes3 downloads3 reach13 impact
50 instances - 5 features - 0 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
78 runs0 likes3 downloads3 reach14 impact
130 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
78 runs0 likes3 downloads3 reach15 impact
363 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach15 impact
201 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes3 downloads3 reach14 impact
146 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes3 downloads3 reach15 impact
412 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes3 downloads3 reach15 impact
347 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach15 impact
203 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach15 impact
329 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach15 impact
337 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach15 impact
413 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
79 runs0 likes3 downloads3 reach15 impact
322 instances - 10936 features - 2 classes - 0 missing values