OpenML
Donated by P. Savicky, Institute of Computer Science, AS of CR, Czech Republic. Methods for multidimensional event classification: a case study using images from a Cherenkov gamma-ray telescope.…
64660 runs - 1 likes - 30 downloads - 31 reach - 26 impact
19020 instances - 12 features - 2 classes - 0 missing values
URL dataset 3
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
18982 instances - 80 features - 5 classes - 0 missing values
Rows with NaN and inf values removed. Data converted from CSV to ARFF.
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
18982 instances - 80 features - classes - 0 missing values
Rows with NaN and inf values removed. Converted file format from CSV to ARFF.
0 runs - 0 likes - 1 downloads - 1 reach - 6 impact
18982 instances - 80 features - 5 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 12 impact
17496 instances - 10 features - 0 classes - 0 missing values
Bike sharing systems are a new generation of traditional bike rentals where the whole process, from membership to rental and return, has become automatic. Through these systems, a user is able to easily rent…
0 runs - 0 likes - 2 downloads - 2 reach - 3 impact
17379 instances - 13 features - 0 classes - 0 missing values
Bike sharing systems are a new generation of traditional bike rentals where the whole process, from membership to rental and return, has become automatic. Through these systems, a user is able to easily rent…
0 runs - 0 likes - 1 downloads - 1 reach - 3 impact
17379 instances - 13 features - 0 classes - 0 missing values
Historical Rainfall data of Bangladesh
0 runs - 0 likes - 0 downloads - 0 reach - 12 impact
16755 instances - 4 features - 0 classes - 0 missing values
This data set is also obtained from the task of controlling an F16 aircraft, although the target variable and attributes are different from the ailerons domain. In this case the goal variable is…
2 runs - 0 likes - 7 downloads - 7 reach - 11 impact
16599 instances - 19 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
16599 instances - 18 features - classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1176 runs - 0 likes - 12 downloads - 12 reach - 15 impact
16599 instances - 19 features - 2 classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
16598 instances - 11 features - classes - 329 missing values
CD4 count prediction date
0 runs - 0 likes - 0 downloads - 0 reach - 10 impact
16484 instances - 62 features - classes - 0 missing values
nfl_games
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
16274 instances - 12 features - classes - 0 missing values
No data.
2 runs - 0 likes - 0 downloads - 0 reach - 0 impact
16000 instances - 27649 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
16000 instances - 27649 features - 2 classes - 0 missing values
A subset of the 3D dataset from Princeton's COS 429 Computer Vision course. The dataset consists of 40 models organised into 4 classes of 10 objects each.
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
16000 instances - 4 features - classes - 0 missing values
Test dataset
0 runs - 0 likes - 1 downloads - 1 reach - 14 impact
15547 instances - 61 features - 0 classes - 280 missing values
Test dataset
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
15547 instances - 61 features - 0 classes - 280 missing values
Test dataset
3 runs - 0 likes - 0 downloads - 0 reach - 16 impact
15547 instances - 61 features - 2 classes - 280 missing values
Test dataset
0 runs - 0 likes - 2 downloads - 2 reach - 13 impact
15547 instances - 61 features - 0 classes - 280 missing values
This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
109963 runs - 1 likes - 20 downloads - 21 reach - 28 impact
15545 instances - 6 features - 2 classes - 0 missing values
This is a commercial application described in Weiss & Indurkhya (1995). The data describes a telecommunication problem. No further information is available. Characteristics: (10000+5000) cases, 49…
2 runs - 0 likes - 4 downloads - 4 reach - 11 impact
15000 instances - 49 features - 0 classes - 0 missing values
exercises
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
15000 instances - 8 features - classes - 0 missing values
exercises
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
15000 instances - 8 features - classes - 0 missing values
Title: Flora (source: https://automl.chalearn.org/data). Dataset from the first ChaLearn AutoML challenge (2014). Only the training data is included, as there were no labels for validation and…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
15000 instances - 200001 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
624 runs - 0 likes - 10 downloads - 10 reach - 15 impact
15000 instances - 49 features - 2 classes - 0 missing values
All data is from one continuous EEG measurement with the Emotiv EEG Neuroheadset. The duration of the measurement was 117 seconds. The eye state was detected via a camera during the EEG measurement…
165845 runs - 3 likes - 94 downloads - 97 reach - 29 impact
14980 instances - 15 features - 2 classes - 0 missing values
Major changes w.r.t. version 1: changed binary features to data type factor. Dataset from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch), which consisted of…
0 runs - 0 likes - 0 downloads - 0 reach - 10 impact
14395 instances - 217 features - classes - 0 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch) Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php Modified by TunedIT (converted to ARFF…
486 runs - 0 likes - 14 downloads - 14 reach - 16 impact
14395 instances - 109 features - 2 classes - 0 missing values
To test algorithms
74 runs - 0 likes - 6 downloads - 6 reach - 15 impact
14240 instances - 31 features - 2 classes - 0 missing values
Author: Marius Lindauer. Date: 27.02.2014. This data set was generated for a publication about claspfolio 2.0, i.e., an algorithm selector for ASP. The algorithm portfolio of clasp (2.1.4)…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
14234 instances - 143 features - 0 classes - 200838 missing values
The dataset contains information on 13,932 single-family homes sold in Miami in 2016. Besides publicly available information, the dataset creator Steven C. Bourassa has added distance variables,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
13932 instances - 17 features - 0 classes - 0 missing values
Description: Gas Sensor Array Drift Dataset Data Set. Sources: (a) Creators: Alexander Vergara (vergara '@' ucsd.edu), BioCircuits Institute, University of California San Diego, San Diego,…
18354 runs - 1 likes - 20 downloads - 21 reach - 44 impact
13910 instances - 129 features - 6 classes - 0 missing values
A Vergara, S Vembu, T Ayhan, M Ryan, M Homer, R Huerta. "Chemical gas sensor drift compensation using classifier ensembles." Sensors and Actuators B: Chemical 166 (2012): 320-329. I Rodriguez-Lujan, J…
68 runs - 1 likes - 10 downloads - 11 reach - 13 impact
13910 instances - 130 features - 6 classes - 0 missing values
This data set addresses a control problem, namely flying an F16 aircraft. The attributes describe the status of the aeroplane, while the goal is to predict the control action on the ailerons of the…
0 runs - 0 likes - 6 downloads - 6 reach - 14 impact
13750 instances - 41 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
602 runs - 1 likes - 12 downloads - 13 reach - 15 impact
13750 instances - 41 features - 2 classes - 0 missing values
CIFAR-10 dataset but with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image…
40 runs - 0 likes - 0 downloads - 0 reach - 15 impact
13000 instances - 27649 features - 10 classes - 0 missing values
1. Title: Nursery Database 2. Sources: (a) Creator: Vladislav Rajkovic et al. (13 experts) (b) Donors: Marko Bohanec (marko.bohanec@ijs.si) Blaz Zupan (blaz.zupan@ijs.si) (c) Date: June, 1997 3. Past…
2210 runs - 0 likes - 18 downloads - 18 reach - 12 impact
12960 instances - 9 features - 5 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
652 runs - 0 likes - 17 downloads - 17 reach - 15 impact
12960 instances - 9 features - 2 classes - 0 missing values
* Title: Nursery Database * Abstract: 4-class version of the original Nursery dataset
121 runs - 0 likes - 7 downloads - 7 reach - 14 impact
12958 instances - 9 features - 4 classes - 0 missing values
This is an experimental data set for trying to classify numbers in a lottery as "Highly likely to be picked" or "Not very likely to be picked". It is based on a little more than a…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
12528 instances - 36 features - classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
12330 instances - 18 features - classes - 0 missing values
Email dataset 2
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
11507 instances - 4 features - 0 classes - 0 missing values
Mammography dataset Past Usage: 1. Woods, K., Doss, C., Bowyer, K., Solka, J., Priebe, C.,
218 runs - 5 likes - 48 downloads - 53 reach - 25 impact
11183 instances - 7 features - 2 classes - 0 missing values
No data.
216 runs - 0 likes - 12 downloads - 12 reach - 63 impact
11162 instances - 11466 features - 10 classes - 0 missing values
Phishing website 1
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
11055 instances - 31 features - 0 classes - 0 missing values
Source: Rami Mustafa A Mohammad (University of Huddersfield, rami.mohammad '@' hud.ac.uk, rami.mustafa.a '@' gmail.com), Lee McCluskey (University of Huddersfield, t.l.mccluskey '@' hud.ac.uk), Fadi…
51520 runs - 1 likes - 27 downloads - 28 reach - 27 impact
11055 instances - 31 features - 2 classes - 0 missing values
This dataset contains all the player names and player ids, taken from Sofifa
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
11009 instances - 3 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10992 instances - 26 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10992 instances - 26 features - classes - 0 missing values
We create a digit database by collecting 250 samples from 44 writers. The samples written by 30 writers are used for training, cross-validation and writer dependent testing, and the digits written by…
37228 runs - 0 likes - 21 downloads - 21 reach - 12 impact
10992 instances - 17 features - 10 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
676 runs - 0 likes - 14 downloads - 14 reach - 15 impact
10992 instances - 17 features - 2 classes - 0 missing values
Jarkko Salojarvi, Kai Puolamaki, Jaana Simola, Lauri Kovanen, Ilpo Kojo, Samuel Kaski. Inferring Relevance from Eye Movements: Feature Extraction. Helsinki University of Technology, Publications in…
440 runs - 0 likes - 12 downloads - 12 reach - 15 impact
10936 instances - 28 features - 3 classes - 0 missing values
Modified version of the training dataset of the Bike Sharing Demand challenge running on Kaggle (http://www.kaggle.com/c/bike-sharing-demand/) If you use the problem in publication, please cite:…
0 runs - 0 likes - 3 downloads - 3 reach - 12 impact
10886 instances - 11 features - 0 classes - 0 missing values
This is a PROMISE data set made publicly available in order to encourage repeatable, verifiable, refutable, and/or improvable predictive models of software engineering. If you publish material based…
21918 runs - 0 likes - 21 downloads - 21 reach - 27 impact
10885 instances - 22 features - 2 classes - 25 missing values
Dataset sales
0 runs - 0 likes - 0 downloads - 0 reach - 11 impact
10738 instances - 15 features - 0 classes - 0 missing values
This dataset contains 10962 houses to rent with 13 different features. Some values in the dataset can be considered as outliers for further analyses. Bear in mind that the Web Crawler was used only to…
0 runs - 0 likes - 0 downloads - 0 reach - 5 impact
10692 instances - 13 features - 0 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B2 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
120 runs - 0 likes - 4 downloads - 4 reach - 14 impact
10668 instances - 4 features - 5 classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10503 instances - 65 features - classes - 9888 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B3 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
119 runs - 0 likes - 4 downloads - 4 reach - 14 impact
10386 instances - 4 features - 5 classes - 0 missing values
Human Activity Recognition (HAR) database built from the recordings of 30 subjects performing activities of daily living (ADL) while carrying a waist-mounted smartphone with embedded inertial sensors.…
24379 runs - 1 likes - 26 downloads - 27 reach - 42 impact
10299 instances - 562 features - 6 classes - 0 missing values
This database has been artificially generated. It describes the structure of the capital letters A, C, D, E, F, G, H, L, P, R, indicated by a number 1-10, in that order (A=1,C=2,...). Each letter's…
24309 runs - 0 likes - 10 downloads - 10 reach - 57 impact
10218 instances - 8 features - 10 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B4 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
123 runs - 0 likes - 3 downloads - 3 reach - 14 impact
10190 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B1 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
119 runs - 0 likes - 4 downloads - 4 reach - 14 impact
10176 instances - 4 features - 5 classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10173 instances - 65 features - classes - 12157 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B6 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
111 runs - 0 likes - 2 downloads - 2 reach - 14 impact
10130 instances - 4 features - 5 classes - 0 missing values
Internet Usage Data. Data Type: multivariate. Abstract: This data contains general demographic information on internet users in 1997. Sources: Original Owner: Graphics, Visualization, & Usability Center…
0 runs - 1 likes - 6 downloads - 7 reach - 12 impact
10108 instances - 72 features - 46 classes - 2699 missing values
This data contains general demographic information on internet users in 1997. Original Owner: Graphics, Visualization, & Usability Center, College of Computing, Georgia Institute of Technology…
0 runs - 0 likes - 0 downloads - 0 reach - 12 impact
10108 instances - 72 features - classes - 2699 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
622 runs - 0 likes - 6 downloads - 6 reach - 17 impact
10108 instances - 69 features - 2 classes - 2699 missing values
"The sulfur recovery unit (SRU) removes environmental pollutants from acid gas streams before they are released into the atmosphere. Furthermore, elemental sulfur is recovered as a valuable…
0 runs - 0 likes - 2 downloads - 2 reach - 12 impact
10081 instances - 7 features - 0 classes - 0 missing values
The AI4I 2020 Predictive Maintenance Dataset is a synthetic dataset that reflects real predictive maintenance data encountered in industry. Since real predictive maintenance datasets are generally…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10000 instances - 14 features - classes - 0 missing values
The analysis is performed for different sets of input values using a methodology similar to that described in [Schafer, Benjamin, et al. 'Taming instabilities in power grid networks by decentralized…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10000 instances - 14 features - classes - 0 missing values
#sbox
0 runs - 0 likes - 0 downloads - 0 reach - 7 impact
10000 instances - 32 features - classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
3 runs - 0 likes - 2 downloads - 2 reach - 18 impact
10000 instances - 2001 features - 5 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
11 runs - 0 likes - 0 downloads - 0 reach - 20 impact
10000 instances - 7201 features - 10 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B5 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
107 runs - 0 likes - 2 downloads - 2 reach - 14 impact
9989 instances - 4 features - 5 classes - 0 missing values
This dataset is an artificial simulation of the Duffing system with one phase transition to the chaotic regime.
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
9983 instances - 4 features - classes - 0 missing values
This dataset records 640 time series of 12 LPC cepstrum coefficients taken from nine male speakers. The data was collected for examining our newly developed classifier for multidimensional curves…
23161 runs - 0 likes - 12 downloads - 12 reach - 54 impact
9961 instances - 15 features - 9 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
857 runs - 0 likes - 13 downloads - 13 reach - 17 impact
9961 instances - 15 features - 2 classes - 0 missing values
10% stratified subsample of the original SVHN data
0 runs - 0 likes - 1 downloads - 1 reach - 11 impact
9927 instances - 3073 features - 10 classes - 0 missing values
Creators: Renata Cristina Barros Madeo (Madeo, R. C. B.) Priscilla Koch Wagner (Wagner, P. K.) Sarajane Marques Peres (Peres, S. M.) {renata.si, priscilla.wagner, sarajane} at usp.br…
26332 runs - 1 likes - 16 downloads - 17 reach - 38 impact
9873 instances - 33 features - 5 classes - 0 missing values
Information about customers consists of 86 variables and includes product usage data and socio-demographic data derived from zip area codes. The data was supplied by the Dutch data mining company…
0 runs - 0 likes - 3 downloads - 3 reach - 13 impact
9822 instances - 86 features - 0 classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Supply Chain Management datasets are derived from the Trading Agent Competition in Supply…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
9803 instances - 296 features - classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
9792 instances - 65 features - classes - 8776 missing values
No data.
67 runs - 0 likes - 12 downloads - 12 reach - 22 impact
9558 instances - 26833 features - 44 classes - 0 missing values
This data set is also obtained from the task of controlling the ailerons of an F16 aircraft, although the target variable and attributes are different from the ailerons domain. The target variable here…
7 runs - 0 likes - 3 downloads - 3 reach - 10 impact
9517 instances - 7 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
774 runs - 0 likes - 14 downloads - 14 reach - 15 impact
9517 instances - 7 features - 2 classes - 0 missing values
One of the NASA Metrics Data Program defect data sets. The specific type of software is unknown. Data comes from McCabe and Halstead features extractors of source code. These features were defined in…
815 runs - 0 likes - 15 downloads - 15 reach - 18 impact
9466 instances - 39 features - 2 classes - 0 missing values
The dataset and this description are made available on http://www-stat.stanford.edu/~tibs/ElemStatLearn/data.html. Normalized handwritten digits, automatically scanned from envelopes by the U.S. Postal…
57 runs - 0 likes - 1 downloads - 1 reach - 11 impact
9298 instances - 257 features - 10 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: D3 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
126 runs - 0 likes - 3 downloads - 3 reach - 14 impact
9285 instances - 4 features - 5 classes - 0 missing values
Annual salary information including gross pay and overtime pay for all active, permanent employees of Montgomery County, MD paid in calendar year 2016. This information will be published annually each…
0 runs - 0 likes - 3 downloads - 3 reach - 8 impact
9228 instances - 13 features - 0 classes - 11169 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: D2 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
118 runs - 0 likes - 3 downloads - 3 reach - 14 impact
9172 instances - 4 features - 5 classes - 0 missing values
Data contains the information of 9144 samples from 220 spectral bands. The classes represent land-use types: alfalfa, corn, grass, hay, oats, soybeans, trees, and wheat.
0 runs - 0 likes - 2 downloads - 2 reach - 11 impact
9144 instances - 221 features - 8 classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The river flow datasets concern the prediction of river network flows for 48 h in the future at…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
9125 instances - 72 features - classes - 3264 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The river flow datasets concern the prediction of river network flows for 48 h in the future at…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
9125 instances - 584 features - classes - 356160 missing values
This file contains 9 sets of sanitized user data drawn from the command histories of 8 UNIX computer users at Purdue over the course of up to 2 years (USER0 and USER1 were generated by the same…
11 runs - 0 likes - 9 downloads - 9 reach - 14 impact
9100 instances - 3 features - 9 classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Supply Chain Management datasets are derived from the Trading Agent Competition in Supply…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
8966 instances - 77 features - classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 1 likes - 0 downloads - 1 reach - 15 impact
8885 instances - 252 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
5 runs - 1 likes - 0 downloads - 1 reach - 15 impact
8885 instances - 267 features - 0 classes - 0 missing values