Data
Filter results by:
No data.
306 runs0 likes4 downloads4 reach2 impact
1000000 instances - 4 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
76 runs0 likes4 downloads4 reach7 impact
187 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2860 runs0 likes4 downloads4 reach16 impact
604 instances - 10937 features - 2 classes - 0 missing values
* Abstract: Purpose is to predict poker hands * Source - Creators: Robert Cattral (cattral '@' gmail.com) Franz Oppacher (oppacher '@' scs.carleton.ca) Carleton University, Department of Computer…
1 runs0 likes4 downloads4 reach6 impact
1025009 instances - 11 features - 10 classes - 0 missing values
Newsweeder: Learning to filter netnews. In Proceedings of the Twelfth International Conference on Machine Learning, pages 331-339, 1995. #Dataset from the LIBSVM data repository. Preprocessing: First…
0 runs0 likes4 downloads4 reach5 impact
19928 instances - 62062 features - 0 classes - 0 missing values
This is the Tecator data set: The task is to predict the fat content of a meat sample on the basis of its near infrared absorbance spectrum. 1. Statement of permission from Tecator (the original data…
0 runs0 likes4 downloads4 reach5 impact
240 instances - 125 features - 0 classes - 0 missing values
No data.
33 runs0 likes4 downloads4 reach3 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
27 runs1 likes4 downloads5 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
Kung chi
1 runs0 likes4 downloads4 reach4 impact
123 instances - 40 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2855 runs0 likes4 downloads4 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
Dataset Title: Localization Data for Person Activity Data Set Abstract: Data contains recordings of five people performing different activities. Each person wore four sensors (tags) while performing…
6 runs0 likes4 downloads4 reach6 impact
164860 instances - 8 features - 11 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes4 downloads4 reach7 impact
250 instances - 10937 features - 2 classes - 0 missing values
No data.
33 runs0 likes4 downloads4 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
Multi-label dataset. The birds dataset consists of 327 audio recordings of 12 different vocalizing bird species. Each sound can be assigned to various bird species.
0 runs0 likes4 downloads4 reach3 impact
645 instances - 279 features - 2 classes - 0 missing values
Automated file upload of BNG(ionosphere)
99 runs1 likes4 downloads5 reach3 impact
1000000 instances - 35 features - 2 classes - 0 missing values
"The speech dataset was also provided by (see citation request) and contains real world data from recorded English language. The normal class contains data from persons having an American accent…
1599 runs0 likes4 downloads4 reach9 impact
3686 instances - 401 features - 2 classes - 0 missing values
The langLog dataset includes 1004 textual predictors and was originally compiled in the doctorial thesis of Read (2010). It consists of 956 text samples that can be assigned to one or more topics such…
0 runs0 likes4 downloads4 reach3 impact
1460 instances - 1079 features - 2 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: E5 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
105 runs0 likes4 downloads4 reach6 impact
1112 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: D4 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
121 runs0 likes4 downloads4 reach6 impact
8654 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: D1 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
125 runs0 likes4 downloads4 reach6 impact
8753 instances - 4 features - 5 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10907, and it has 1443 rows and 1026 features…
1 runs0 likes4 downloads4 reach3 impact
1443 instances - 1026 features - 0 classes - 0 missing values
* Dataset Title: MicroMass - Mixed (mixed spectra version) * Abstract: A dataset to explore machine learning approaches for the identification of microorganisms from mass-spectrometry data. * Source:…
64 runs1 likes4 downloads5 reach5 impact
360 instances - 1301 features - 10 classes - 0 missing values
* Donor: David W. Aha (aha '@' ics.uci.edu) (714) 856-8779 * Data Set Information: This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In…
159 runs1 likes4 downloads5 reach5 impact
200 instances - 14 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: A1 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
274 runs0 likes4 downloads4 reach6 impact
3252 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B2 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
120 runs0 likes4 downloads4 reach6 impact
10668 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B3 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
119 runs0 likes4 downloads4 reach6 impact
10386 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B1 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
119 runs0 likes4 downloads4 reach6 impact
10176 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: A2 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
119 runs0 likes4 downloads4 reach6 impact
1623 instances - 4 features - 5 classes - 0 missing values
Source: The dataset was created by Angeliki Xifara (angxifara @ gmail.com, Civil/Structural Engineer) and was processed by Athanasios Tsanas (tsanasthanasis @ gmail.com, Oxford Centre for Industrial…
103 runs1 likes4 downloads5 reach5 impact
768 instances - 10 features - 37 classes - 0 missing values
No data.
90 runs0 likes4 downloads4 reach1 impact
137781 instances - 10 features - 7 classes - 0 missing values
No data.
68 runs0 likes4 downloads4 reach2 impact
1000000 instances - 21 features - 2 classes - 0 missing values
Synthetic dataset. Almost identical to [dataset 152](https://www.openml.org/d/153/edit)
319 runs0 likes4 downloads4 reach2 impact
1000000 instances - 11 features - 2 classes - 0 missing values
This is a 20,000 instance sample of the original CIFAR-10 dataset. Sampled randomly and stratified, with 2000 examples per class. Training and test set are merged. Find the corresponding task for the…
380 runs0 likes4 downloads4 reach11 impact
20000 instances - 3073 features - 10 classes - 0 missing values
0. airplane 1. automobile 2. bird 3. cat 4. deer 5. dog 6. frog 7. horse 8. ship 9. truck CIFAR-10 contains 6000 images per class. The original train-test split randomly divided these into 5000 train…
82 runs0 likes4 downloads4 reach10 impact
60000 instances - 3073 features - 10 classes - 0 missing values
No data.
65 runs0 likes4 downloads4 reach1 impact
1000000 instances - 40 features - 2 classes - 0 missing values
No data.
310 runs0 likes4 downloads4 reach2 impact
1000000 instances - 11 features - 2 classes - 0 missing values
This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics; (b) its assigned insurance risk rating,; (c) its normalized losses in use as…
7 runs1 likes4 downloads5 reach1 impact
159 instances - 16 features - 0 classes - 0 missing values
This dataset was retrieved 2014-11-14 from the UCI site and converted to the ARFF format. __Major changes w.r.t. version 3: dataset from UCI that matches description and data types__ ### Feature…
4196 runs0 likes4 downloads4 reach5 impact
690 instances - 15 features - 2 classes - 0 missing values
No data.
63 runs0 likes4 downloads4 reach2 impact
1000000 instances - 19 features - 4 classes - 0 missing values
Data from https://doi.org/10.5281/zenodo.269636
0 runs0 likes4 downloads4 reach5 impact
4758 instances - 39 features - classes - 0 missing values
Primary Biliary Cirrhosis This data set is a follow-up to the original PBC data set, as discussed in appendix D of Fleming and Harrington, Counting Processes and Survival Analysis, Wiley, 1991. An…
0 runs0 likes4 downloads4 reach5 impact
1945 instances - 19 features - 0 classes - 1133 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes4 downloads4 reach7 impact
275 instances - 10937 features - 2 classes - 0 missing values
No data.
29 runs0 likes4 downloads4 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
941 runs0 likes4 downloads4 reach2 impact
74 instances - 63 features - 4 classes - 0 missing values
White Clover Persistence Trials Data source: Ian Tarbotton AgResearch, Whatawhata Research Centre, Hamilton, New Zealand The objective was to determine the mechanisms which influence the persistence…
858 runs0 likes4 downloads4 reach7 impact
63 instances - 32 features - 4 classes - 0 missing values
Squash Harvest Stored Data source: Winna Harvey Crop and Food Research, Christchurch, New Zealand The purpose of the research was to determine the changes taking place in squash fruit during the…
867 runs0 likes4 downloads4 reach7 impact
52 instances - 25 features - 3 classes - 7 missing values
Squash Harvest Unstored Data source: Winna Harvey Crop and Food Research, Christchurch, New Zealand The purpose of the research was to determine the changes taking place in squash fruit during the…
876 runs0 likes4 downloads4 reach7 impact
52 instances - 24 features - 3 classes - 39 missing values
No data.
211 runs0 likes4 downloads4 reach10 impact
313 instances - 5805 features - 8 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
537 runs0 likes4 downloads4 reach6 impact
285 instances - 8 features - 7 classes - 27 missing values
County data from the 2000 Presidential Election in Florida. Compiled by Brett Presnell Department of Statistics, University of Florida These data are derived from three sources, described below. As…
32 runs0 likes4 downloads4 reach6 impact
67 instances - 17 features - 5 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
103 runs0 likes4 downloads4 reach6 impact
92 instances - 11 features - 2 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
490 runs0 likes4 downloads4 reach5 impact
364 instances - 33 features - 6 classes - 101 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
102 runs0 likes4 downloads4 reach6 impact
52 instances - 10 features - 2 classes - 0 missing values
CODING: ITEM 1 = BUSINESS CONDIDIONS 6 MONTHS FROM NOW (CONFERENCE BOARD) ITEM 2 = JOBS 6 MONTHS FROM NOW (CONFERENCE BOARD) ITEM 3 = FAMILY INCOME 6 MONTHS FROM NOW (CONFERENCE BOARD) ITEM 4 =…
560 runs0 likes4 downloads4 reach6 impact
72 instances - 4 features - 6 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
670 runs0 likes4 downloads4 reach6 impact
62 instances - 8 features - 2 classes - 8 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
102 runs0 likes4 downloads4 reach6 impact
67 instances - 16 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
107 runs0 likes4 downloads4 reach6 impact
66 instances - 13 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
714 runs0 likes4 downloads4 reach7 impact
303 instances - 14 features - 2 classes - 6 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
115 runs0 likes4 downloads4 reach6 impact
50 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
700 runs0 likes4 downloads4 reach7 impact
294 instances - 14 features - 2 classes - 782 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
672 runs0 likes4 downloads4 reach7 impact
158 instances - 8 features - 2 classes - 87 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
708 runs0 likes4 downloads4 reach8 impact
286 instances - 10 features - 2 classes - 9 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
755 runs0 likes4 downloads4 reach6 impact
54 instances - 8 features - 2 classes - 120 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
113 runs0 likes4 downloads4 reach6 impact
40 instances - 3 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
688 runs0 likes4 downloads4 reach6 impact
294 instances - 14 features - 2 classes - 782 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
119 runs0 likes4 downloads4 reach6 impact
95 instances - 10 features - 2 classes - 9 missing values
Abstract: CART book's waveform domains Source: Original Owners: Breiman,L., Friedman,J.H., Olshen,R.A., & Stone,C.J. (1984). Classification and Regression Trees. Wadsworth International Group:…
0 runs1 likes3 downloads4 reach3 impact
5000 instances - 22 features - classes - 0 missing values
This dataset summarizes a heterogeneous set of features about articles published by Mashable in a period of two years. The goal is to predict the number of shares in social networks (popularity). *…
0 runs0 likes3 downloads3 reach3 impact
39644 instances - 61 features - 0 classes - 0 missing values
The data was collected retrospectively at Wroclaw Thoracic Surgery Centre for patients who underwent major lung resections for primary lung cancer in the years 2007 - 2011. The Centre is associated…
31 runs0 likes3 downloads3 reach4 impact
470 instances - 17 features - 2 classes - 0 missing values
DBpedia with top-474 most frequent YAGO types HMC dataset for type prediction. Ingoing and outgoing properties as features
0 runs0 likes3 downloads3 reach3 impact
2886305 instances - 2401 features - classes - 0 missing values
No data.
298 runs0 likes3 downloads3 reach2 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
194 runs0 likes3 downloads3 reach2 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
211 runs0 likes3 downloads3 reach2 impact
1000000 instances - 20 features - 7 classes - 0 missing values
No data.
66 runs0 likes3 downloads3 reach1 impact
1000000 instances - 13 features - 6 classes - 0 missing values
No data.
143 runs0 likes3 downloads3 reach2 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
328 runs0 likes3 downloads3 reach2 impact
1000000 instances - 4 features - 2 classes - 0 missing values
This is a commercial application described in Weiss & Indurkhya (1995). The data describes a telecommunication problem. No further information is available. Characteristics: (10000+5000) cases, 49…
2 runs0 likes3 downloads3 reach1 impact
15000 instances - 49 features - 0 classes - 0 missing values
No data.
307 runs0 likes3 downloads3 reach2 impact
1000000 instances - 41 features - 3 classes - 0 missing values
No data.
65 runs0 likes3 downloads3 reach1 impact
1000000 instances - 40 features - 2 classes - 0 missing values
No data.
304 runs0 likes3 downloads3 reach1 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
306 runs0 likes3 downloads3 reach1 impact
1000000 instances - 13 features - 6 classes - 0 missing values
No data.
52 runs0 likes3 downloads3 reach2 impact
1000000 instances - 48 features - 10 classes - 0 missing values
No data.
206 runs0 likes3 downloads3 reach2 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
311 runs0 likes3 downloads3 reach2 impact
1000000 instances - 17 features - 26 classes - 0 missing values
This database was designed on the basis of data provided by US Census Bureau [http://www.census.gov] (under Lookup Access [http://www.census.gov/cdrom/lookup]: Summary Tape File 1). The data were…
2 runs1 likes3 downloads4 reach1 impact
22784 instances - 9 features - 0 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2834 runs0 likes3 downloads3 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2841 runs0 likes3 downloads3 reach16 impact
630 instances - 10937 features - 2 classes - 0 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
12 runs0 likes3 downloads3 reach6 impact
8192 instances - 13 features - 0 classes - 0 missing values
El Nino Data Data Type spatio-temporal Abstract The data set contains oceanographic and surface meteorological readings taken from a series of buoys positioned throughout the equatorial Pacific. The…
0 runs1 likes3 downloads4 reach5 impact
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach7 impact
337 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes3 downloads3 reach7 impact
470 instances - 10937 features - 2 classes - 0 missing values
No data.
90 runs2 likes3 downloads5 reach2 impact
663552 instances - 13 features - 2 classes - 0 missing values
No data.
6 runs0 likes3 downloads3 reach3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
0 runs0 likes3 downloads3 reach1 impact
116640 instances - 10 features - 0 classes - 0 missing values
Even smaller sample of version 1
0 runs0 likes3 downloads3 reach4 impact
149639 instances - 12 features - 2 classes - 0 missing values
No data.
305 runs0 likes3 downloads3 reach2 impact
1000000 instances - 4 features - 2 classes - 0 missing values
Michel Lang fRMA-normalized. Only "Kratz-genes"*. \* (see: A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international…
3 runs0 likes3 downloads3 reach4 impact
442 instances - 24 features - 0 classes - 0 missing values
No data.
68 runs0 likes3 downloads3 reach1 impact
20000 instances - 17 features - 3 classes - 10000 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes3 downloads3 reach6 impact
138 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes3 downloads3 reach7 impact
267 instances - 10937 features - 2 classes - 0 missing values