OpenML
Information about the dataset CLASSTYPE: numeric CLASSINDEX: last
2 runs - 0 likes - 1 downloads - 1 reach - 14 impact
559 instances - 5 features - 0 classes - 0 missing values
Primary Biliary Cirrhosis This data set is a follow-up to the original PBC data set, as discussed in appendix D of Fleming and Harrington, Counting Processes and Survival Analysis, Wiley, 1991. An…
0 runs - 0 likes - 5 downloads - 5 reach - 13 impact
1945 instances - 19 features - 0 classes - 1133 missing values
Data on fluctuating proportions of marked cells in marrow from heterozygous Safari cats from a study of early hematopoiesis. The data included below are 11 time series of proportions of marked…
2 runs - 0 likes - 2 downloads - 2 reach - 13 impact
140 instances - 4 features - 0 classes - 0 missing values
These data are estimated correlations between daily 3 p.m. wind measurements during September and October 1997 for a network of 45 stations in the Sydney region. The first column below gives a list of…
0 runs - 0 likes - 0 downloads - 0 reach - 11 impact
45 instances - 47 features - classes - 0 missing values
Veteran's Administration Lung Cancer Trial. Taken from Kalbfleisch and Prentice, pages 223-224. Variables: Treatment (1=standard, 2=test); Celltype (1=squamous, 2=smallcell, 3=adeno, 4=large); Survival in days…
2 runs - 0 likes - 1 downloads - 1 reach - 13 impact
137 instances - 8 features - 0 classes - 0 missing values
Title: Communities and Crime Abstract: Communities within the United States. The data combines socio-economic data from the 1990 US Census, law enforcement data from the 1990 US LEMAS survey, and…
0 runs - 1 likes - 3 downloads - 4 reach - 13 impact
1994 instances - 128 features - 0 classes - 39202 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag.
7 runs - 0 likes - 2 downloads - 2 reach - 9 impact
61 instances - 3 features - 0 classes - 0 missing values
No data.
206 runs - 0 likes - 3 downloads - 3 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
67 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
332 runs - 0 likes - 4 downloads - 4 reach - 12 impact
1000000 instances - 17 features - 2 classes - 0 missing values
No data.
311 runs - 0 likes - 3 downloads - 3 reach - 12 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
65 runs - 0 likes - 8 downloads - 8 reach - 9 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
310 runs - 0 likes - 4 downloads - 4 reach - 12 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
290 runs - 0 likes - 5 downloads - 5 reach - 12 impact
1000000 instances - 77 features - 10 classes - 0 missing values
This is an artificial data set with dependencies between the attribute values. The cases are generated using the following method: X1: uniformly distributed over [-5,5]; X2: uniformly distributed…
3 runs - 1 likes - 5 downloads - 6 reach - 14 impact
40768 instances - 11 features - 0 classes - 0 missing values
Data originating from the book "Analyzing Categorical Data" by Jeffrey S. Simonoff.
1087 runs - 0 likes - 9 downloads - 9 reach - 15 impact
50 instances - 5 features - 2 classes - 0 missing values
1. Title: Ozone Level Detection 2. Source: Kun Zhang zhang.kun05 '@' gmail.com Department of Computer Science, Xavier University of Louisiana Wei Fan wei.fan '@' gmail.com IBM T.J.Watson Research…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
2536 instances - 73 features - 0 classes - 0 missing values
This is one of a family of datasets synthetically generated from a realistic simulation of the dynamics of a Unimation Puma 560 robot arm. There are eight datasets in this family. In this repository…
0 runs - 0 likes - 6 downloads - 6 reach - 14 impact
8192 instances - 33 features - 0 classes - 0 missing values
1. Title: Wine Quality 2. Sources Created by: Paulo Cortez (Univ. Minho), Antonio Cerdeira, Fernando Almeida, Telmo Matos and Jose Reis (CVRVV) @ 2009 3. Past Usage: P. Cortez, A. Cerdeira, F.…
0 runs - 1 likes - 13 downloads - 14 reach - 15 impact
6497 instances - 12 features - 0 classes - 0 missing values
Pittsburgh bridges This version is derived from version 1 by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally not a classification dataset,…
34 runs - 0 likes - 1 downloads - 1 reach - 15 impact
105 instances - 12 features - 6 classes - 61 missing values
Pittsburgh bridges This version is derived from version 2 (the discretized version) by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally not…
31 runs - 0 likes - 3 downloads - 3 reach - 15 impact
105 instances - 12 features - 6 classes - 61 missing values
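The filtering step described for the two Pittsburgh bridges versions above (removing every instance whose last, target attribute is missing) could be sketched as follows. This is a minimal illustration, not the procedure actually used to build these versions: the function name, the toy rows, and the attribute names are hypothetical, and it assumes missing values are encoded as None or as the ARFF marker "?".

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def drop_missing_target(rows: List[Row], target: str) -> List[Row]:
    """Keep only rows whose target attribute is present.

    A value counts as missing if it is None or the ARFF marker "?".
    Missing values in other attributes are left untouched.
    """
    return [r for r in rows if r.get(target) not in (None, "?")]


# Hypothetical toy rows; attribute names and values are illustrative only.
rows = [
    {"MATERIAL": "STEEL", "TYPE": "ARCH"},
    {"MATERIAL": None, "TYPE": "SIMPLE-T"},  # non-target value missing: kept
    {"MATERIAL": "WOOD", "TYPE": None},      # target missing: dropped
    {"MATERIAL": "IRON", "TYPE": "?"},       # target missing: dropped
]
kept = drop_missing_target(rows, target="TYPE")
```

Only the target attribute is checked, so rows with gaps in other attributes survive. That is consistent with the summary lines above, which still report 61 missing values after filtering.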
No data.
268 runs - 0 likes - 9 downloads - 9 reach - 47 impact
3075 instances - 12433 features - 6 classes - 0 missing values
No data.
373 runs - 0 likes - 10 downloads - 10 reach - 62 impact
918 instances - 3013 features - 10 classes - 0 missing values
No data.
159 runs - 0 likes - 11 downloads - 11 reach - 22 impact
1657 instances - 3759 features - 25 classes - 0 missing values
No data.
264 runs - 0 likes - 11 downloads - 11 reach - 47 impact
3204 instances - 13196 features - 6 classes - 0 missing values
No data.
163 runs - 0 likes - 13 downloads - 13 reach - 22 impact
1560 instances - 8461 features - 20 classes - 0 missing values
No data.
216 runs - 0 likes - 12 downloads - 12 reach - 63 impact
11162 instances - 11466 features - 10 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
31 instances - 54 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
13 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
13 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
20 instances - 10 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
16 instances - 34 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
22 instances - 40 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
19 instances - 10 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
274 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
12 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
11 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
13 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 1 likes - 1 downloads - 2 reach - 14 impact
4450 instances - 203 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
25 instances - 10 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
30 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
26 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
79 instances - 321 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
37 instances - 1143 features - 0 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs - 1 likes - 3 downloads - 4 reach - 14 impact
95 instances - 22278 features - 5 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
8 runs - 0 likes - 3 downloads - 3 reach - 14 impact
113 instances - 54676 features - 5 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs - 0 likes - 0 downloads - 0 reach - 14 impact
89 instances - 54614 features - 4 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs - 0 likes - 4 downloads - 4 reach - 14 impact
92 instances - 59005 features - 5 classes - 0 missing values
Phishing website 1
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
11055 instances - 31 features - 0 classes - 0 missing values
Email dataset 1a
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
4585 instances - 4 features - 0 classes - 0 missing values
Email dataset 1b
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
4585 instances - 24 features - 0 classes - 161 missing values
Email dataset 1c
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
4585 instances - 792 features - 0 classes - 0 missing values
Email dataset 1d
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
4585 instances - 11 features - 0 classes - 0 missing values
Email dataset 1e
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
4585 instances - 580 features - 0 classes - 0 missing values
Email dataset 2
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
11507 instances - 4 features - 0 classes - 0 missing values
Testing dataset
0 runs - 0 likes - 1 downloads - 1 reach - 3 impact
134731 instances - 31 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 7 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
178 instances - 16 features - classes - 0 missing values
This dataset describes 100,000 realistic, synthetically generated worker compensation insurance claims. Along with the ultimate financial losses, each claim is described by the initial case estimate, date…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100000 instances - 14 features - 0 classes - 0 missing values
Autistic Spectrum Disorder (ASD) is a neurodevelopmental condition associated with significant healthcare costs, and early diagnosis can significantly reduce these. Unfortunately, waiting times for an…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
704 instances - 21 features - classes - 192 missing values
The dataset is about bankruptcy prediction of Polish companies. The data was collected from the Emerging Markets Information Service (EMIS), which is a database containing information on…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
7027 instances - 65 features - classes - 5835 missing values
This hourly data set contains the PM2.5 data of the US Embassy in Beijing. Meteorological data from Beijing Capital International Airport are also included.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
43824 instances - 13 features - classes - 2067 missing values
The database was created with records of behavior of the urban traffic of the city of Sao Paulo in Brazil from December 14, 2009 to December 18, 2009 (From Monday to Friday). Registered from 7:00 to…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
135 instances - 18 features - classes - 0 missing values
We choose age, delivery number, delivery time, blood pressure and heart status. We classify delivery time as Premature, Timely or Latecomer. As with delivery time, we consider blood pressure in…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
80 instances - 6 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 7 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
20000 instances - 42 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
360 instances - 105 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10992 instances - 26 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10992 instances - 26 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
6435 instances - 42 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2310 instances - 25 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2310 instances - 25 features - classes - 0 missing values
1987 National Indonesia Contraceptive Prevalence Survey
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1473 instances - 10 features - classes - 0 missing values
dgf_test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3415 instances - 5 features - 2 classes - 1 missing values
dgf_test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3415 instances - 5 features - 2 classes - 1 missing values
We scraped a large number of eBay auctions of a popular product. After preprocessing the auction data, we built the SB dataset. The goal is to share the labelled SB dataset with researchers.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
6321 instances - 13 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2465 instances - 35 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
950 instances - 10 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
846 instances - 22 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
846 instances - 22 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
528 instances - 21 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
528 instances - 21 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
178 instances - 16 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
194 instances - 32 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1484 instances - 18 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
8192 instances - 11 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2465 instances - 31 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2465 instances - 28 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
336 instances - 15 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
16599 instances - 18 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
40768 instances - 14 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
214 instances - 15 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
214 instances - 15 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2465 instances - 30 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
506 instances - 12 features - classes - 0 missing values
Subset of KITS dataset with 100 images
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 27649 features - 0 classes - 0 missing values
Subset of KITS dataset with 100 images
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 27649 features - 2 classes - 0 missing values
Subset of KITS dataset with 100 images
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 27649 features - 0 classes - 0 missing values
Subset of KITS dataset with 100 images and nominal target
5 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 27649 features - 2 classes - 0 missing values
Survey to know if people self-identify as Midwesterners.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2778 instances - 28 features - 10 classes - 1737 missing values
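Each entry in this listing ends with a summary line of the form "N instances - N features - N classes - N missing values", and some entries leave the class count blank. A small parser for that line might look like the sketch below. The regular expression and the function name are assumptions based only on the lines shown here, not on any OpenML API.

```python
import re
from typing import Dict, Optional

# Pattern inferred from the summary lines in this listing (an assumption).
# The class count is optional because some entries omit it.
SUMMARY_RE = re.compile(
    r"(?P<instances>\d+) instances - (?P<features>\d+) features - "
    r"(?P<classes>\d*) ?classes - (?P<missing>\d+) missing values"
)


def parse_summary(line: str) -> Dict[str, Optional[int]]:
    """Parse one summary line; a blank class count becomes None."""
    match = SUMMARY_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognized summary line: {line!r}")
    return {key: int(value) if value else None
            for key, value in match.groupdict().items()}


full = parse_summary("559 instances - 5 features - 0 classes - 0 missing values")
blank = parse_summary("45 instances - 47 features - classes - 0 missing values")
```

Returning None for a blank count keeps "no class count listed" distinct from "0 classes", which the listing reports explicitly for other entries.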