Data
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs - 0 likes - 3 downloads - 3 reach - 7 impact
337 instances - 10937 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 3 downloads - 3 reach - 7 impact
697641 instances - 47237 features - 0 classes - 0 missing values
No data.
44 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 15 features - 2 classes - 0 missing values
No data.
337 runs - 1 likes - 2 downloads - 3 reach - 2 impact
1000000 instances - 13 features - 3 classes - 0 missing values
1. Title: Lecturers Evaluation (Ordinal LEV) 2. Source Information: Donor: Arie Ben David MIS, Dept. of Technology Management Holon Academic Inst. of Technology 52 Golomb St. Holon 58102 Israel…
0 runs - 1 likes - 2 downloads - 3 reach - 5 impact
1000 instances - 5 features - 0 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs - 0 likes - 3 downloads - 3 reach - 6 impact
92 instances - 59005 features - 5 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
11 runs - 0 likes - 3 downloads - 3 reach - 6 impact
214 instances - 45102 features - 7 classes - 0 missing values
No data.
6 runs - 0 likes - 3 downloads - 3 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
Michel Lang fRMA-normalized. Only "Kratz-genes"*. \* (see: A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international…
3 runs - 0 likes - 3 downloads - 3 reach - 4 impact
442 instances - 24 features - 0 classes - 0 missing values
No data.
28 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
Building projectable classifiers of arbitrary complexity. In Proceedings of the 13th International Conference on Pattern Recognition, pages 880-885, Vienna, Austria, August 1996. Dataset from the…
0 runs - 0 likes - 3 downloads - 3 reach - 5 impact
862 instances - 3 features - 0 classes - 0 missing values
Modified version of the training dataset of the Bike Sharing Demand challenge running on Kaggle (http://www.kaggle.com/c/bike-sharing-demand/) If you use the problem in publication, please cite:…
0 runs - 0 likes - 3 downloads - 3 reach - 4 impact
10886 instances - 12 features - 0 classes - 0 missing values
No data.
305 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 4 features - 2 classes - 0 missing values
No data.
313 runs - 0 likes - 3 downloads - 3 reach - 1 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 3 downloads - 3 reach - 1 impact
116640 instances - 10 features - 0 classes - 0 missing values
No data.
68 runs - 0 likes - 3 downloads - 3 reach - 1 impact
20000 instances - 17 features - 3 classes - 10000 missing values
No data.
0 runs - 0 likes - 3 downloads - 3 reach - 1 impact
59049 instances - 10 features - 0 classes - 0 missing values
Determinants of Wages from the 1985 Current Population Survey Summary: The Current Population Survey (CPS) is used to supplement census information between census years. These data consist of a random…
2 runs - 0 likes - 3 downloads - 3 reach - 5 impact
534 instances - 11 features - 0 classes - 0 missing values
The data consist of 2001 observations taken from a balloon about 30 kilometres above the surface of the earth. In the section of the flight shown here the balloon increases in height. As radiation…
0 runs - 1 likes - 2 downloads - 3 reach - 5 impact
2001 instances - 3 features - 0 classes - 0 missing values
This dataset summarizes a heterogeneous set of features about articles published by Mashable in a period of two years. The goal is to predict the number of shares in social networks (popularity). *…
0 runs - 0 likes - 3 downloads - 3 reach - 3 impact
39644 instances - 61 features - 0 classes - 0 missing values
Abstract: This dataset contains timeseries of mel-frequency cepstrum coefficients (MFCCs) corresponding to spoken Arabic digits. Includes data from 44 male and 44 female native Arabic speakers.…
0 runs - 0 likes - 3 downloads - 3 reach - 3 impact
178526 instances - 13 features - classes - 57200 missing values
DBpedia with top-474 most frequent YAGO types HMC dataset for type prediction. Ingoing and outgoing properties as features
0 runs - 0 likes - 3 downloads - 3 reach - 3 impact
2886305 instances - 2401 features - classes - 0 missing values
Source: The dataset was created by Athanasios Tsanas (tsanasthanasis '@' gmail.com) and Max Little (littlem '@' physics.ox.ac.uk) of the University of Oxford, in collaboration with 10 medical centers…
0 runs - 1 likes - 2 downloads - 3 reach - 3 impact
5875 instances - 22 features - classes - 0 missing values
Source: 1. Olcay KURSUN, PhD., Istanbul University, Department of Computer Engineering, 34320, Istanbul, Turkey Phone: +90 (212) 473 7070 - 17827 Email: okursun '@' istanbul.edu.tr 2. Betul ERDOGDU…
0 runs - 0 likes - 3 downloads - 3 reach - 3 impact
1039 instances - 29 features - classes - 0 missing values
No data.
70 runs - 0 likes - 3 downloads - 3 reach - 1 impact
1000000 instances - 28 features - 2 classes - 0 missing values
No data.
72 runs - 0 likes - 3 downloads - 3 reach - 1 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
194 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
66 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 35 features - 6 classes - 0 missing values
No data.
211 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 20 features - 7 classes - 0 missing values
No data.
63 runs - 0 likes - 3 downloads - 3 reach - 1 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
67 runs - 0 likes - 3 downloads - 3 reach - 1 impact
1000000 instances - 13 features - 6 classes - 0 missing values
This data set is also obtained from the task of controlling the ailerons of an F16 aircraft, although the target variable and attributes are different from the ailerons domain. The target variable here…
2 runs - 0 likes - 3 downloads - 3 reach - 1 impact
9517 instances - 7 features - 0 classes - 0 missing values
Cholesterol treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
60 runs - 0 likes - 3 downloads - 3 reach - 1 impact
303 instances - 14 features - 0 classes - 6 missing values
Tumor-size treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
0 runs - 0 likes - 3 downloads - 3 reach - 1 impact
286 instances - 10 features - 0 classes - 9 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
5 runs - 1 likes - 2 downloads - 3 reach - 1 impact
8192 instances - 13 features - 0 classes - 0 missing values
Case number deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning…
10 runs - 1 likes - 2 downloads - 3 reach - 1 impact
195 instances - 12 features - 0 classes - 2 missing values
No data.
65 runs - 1 likes - 2 downloads - 3 reach - 1 impact
1000000 instances - 18 features - 7 classes - 0 missing values
No data.
66 runs - 0 likes - 3 downloads - 3 reach - 1 impact
1000000 instances - 13 features - 6 classes - 0 missing values
No data.
298 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
309 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
328 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 4 features - 2 classes - 0 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs - 0 likes - 3 downloads - 3 reach - 11 impact
65 instances - 3 features - 2 classes - 0 missing values
No data.
206 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
311 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
306 runs - 0 likes - 3 downloads - 3 reach - 1 impact
1000000 instances - 13 features - 6 classes - 0 missing values
No data.
52 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 48 features - 10 classes - 0 missing values
No data.
65 runs - 0 likes - 3 downloads - 3 reach - 1 impact
1000000 instances - 40 features - 2 classes - 0 missing values
No data.
304 runs - 0 likes - 3 downloads - 3 reach - 1 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
307 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 41 features - 3 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs - 0 likes - 3 downloads - 3 reach - 5 impact
48842 instances - 124 features - 0 classes - 0 missing values
Mean While 1
0 runs - 0 likes - 3 downloads - 3 reach - 3 impact
253 instances - 38 features - 2 classes - 0 missing values
No data.
50 runs - 0 likes - 3 downloads - 3 reach - 1 impact
1000000 instances - 61 features - 2 classes - 0 missing values
Automated file upload of BNG(credit-g)
99 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 21 features - 2 classes - 0 missing values
Automated file upload of BNG(spambase)
98 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 58 features - 2 classes - 0 missing values
Automated file upload of BNG(anneal)
100 runs - 0 likes - 3 downloads - 3 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
This dataset was gathered to detect whether a person is running or walking based on deep neural networks and sensor data collected from iOS devices. The dataset represents 88588 sensor data samples…
1 runs - 0 likes - 3 downloads - 3 reach - 6 impact
88588 instances - 7 features - 2 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12789, and it has 309 rows and 1026 features (including…
1 runs - 0 likes - 3 downloads - 3 reach - 3 impact
309 instances - 1026 features - 0 classes - 0 missing values
Mind Cave 2
0 runs - 0 likes - 3 downloads - 3 reach - 3 impact
125 instances - 40 features - 2 classes - 0 missing values
__Changes w.r.t. version 1: included one target factor with 7 levels as target variable for the classification. Also deleted the previous 7 binary target variables.__ A dataset of steel plates'…
7374 runs - 1 likes - 2 downloads - 3 reach - 8 impact
1941 instances - 28 features - 7 classes - 0 missing values
The instances were drawn randomly from a database of 7 outdoor images. The images were hand-segmented to create a classification for every pixel. Each instance is a 3x3 region. __Major changes w.r.t.…
7997 runs - 0 likes - 3 downloads - 3 reach - 14 impact
2310 instances - 20 features - 7 classes - 0 missing values
__Major changes w.r.t. version 1: deactivated first two variables as they describe the batch of the experiments and should not be used for prediction. Also transformed the target from numeric to…
6844 runs - 0 likes - 3 downloads - 3 reach - 6 impact
540 instances - 21 features - 2 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. The maps were scanned in 8 bit grey value at density of 400dpi,…
9398 runs - 1 likes - 2 downloads - 3 reach - 13 impact
2000 instances - 241 features - 10 classes - 0 missing values
Originally from the StatLog project. The raw data is still available on [UCI](https://archive.ics.uci.edu/ml/datasets/Molecular+Biology+(Splice-junction+Gene+Sequences)). The data consists of 3,186…
6313 runs - 0 likes - 3 downloads - 3 reach - 16 impact
3186 instances - 181 features - 3 classes - 0 missing values
The data is cleaned, regularized and encrypted global equity data. The first 21 columns (feature1 - feature21) are features, and target is the binary class you’re trying to predict.
882 runs - 1 likes - 2 downloads - 3 reach - 7 impact
96320 instances - 22 features - 2 classes - 0 missing values
Multi-label dataset. The UC Berkeley enron4 dataset represents a subset of the original enron5 dataset and consists of 1684 cases of emails with 21 labels and 1001 predictor variables.
1 runs - 0 likes - 3 downloads - 3 reach - 7 impact
1702 instances - 1054 features - 2 classes - 0 missing values
This data is derived from the 2012 KDD Cup. The data is subsampled to 1% of the original number of instances, downsampling the majority class (click=0) so that the target feature is reasonably…
0 runs - 1 likes - 2 downloads - 3 reach - 3 impact
798964 instances - 12 features - 3 classes - 399482 missing values
* Dataset: DBworld e-mails data set Task: dbworld-bodies-stemmed * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
0 runs - 0 likes - 3 downloads - 3 reach - 5 impact
64 instances - 3722 features - 2 classes - 0 missing values
Context It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. Content The…
0 runs - 0 likes - 3 downloads - 3 reach - 1 impact
284807 instances - 31 features - 0 classes - 0 missing values
shuttle-pmlb
10 runs - 0 likes - 3 downloads - 3 reach - 14 impact
58000 instances - 10 features - 7 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs - 0 likes - 3 downloads - 3 reach - 8 impact
203 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs - 0 likes - 3 downloads - 3 reach - 8 impact
412 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs - 0 likes - 3 downloads - 3 reach - 8 impact
201 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
79 runs - 0 likes - 3 downloads - 3 reach - 8 impact
322 instances - 10937 features - 2 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: Vikas Sindhwani for the SVMlin project.
0 runs - 0 likes - 3 downloads - 3 reach - 6 impact
72309 instances - 20959 features - 0 classes - 0 missing values
Even smaller sample of version 1
0 runs - 0 likes - 3 downloads - 3 reach - 5 impact
149639 instances - 12 features - 2 classes - 0 missing values
### Description __Changes to version 1:__ all categorical features transformed as such. This dataset represents a set of possible advertisements on Internet pages. ### Sources (a) Creator and donor:…
1430 runs - 0 likes - 3 downloads - 3 reach - 14 impact
3279 instances - 1559 features - 2 classes - 0 missing values
A 4-class version of breast-tissue dataset.
299 runs - 0 likes - 3 downloads - 3 reach - 6 impact
106 instances - 10 features - 4 classes - 0 missing values
One of the biggest challenges of an auto dealership purchasing a used car at an auto auction is the risk that the vehicle might have serious issues that prevent it from being sold to customers. The…
3 runs - 0 likes - 3 downloads - 3 reach - 5 impact
72983 instances - 33 features - 2 classes - 149271 missing values
Dataset from the LIBSVM multiclass data repository.
0 runs - 0 likes - 3 downloads - 3 reach - 7 impact
108000 instances - 129 features - 0 classes - 0 missing values
This is a corrected version of the previous data file in version 1, which contained a dataset (349 instances) incorrectly merged from the original training and test sets available on UCI (there are…
0 runs - 0 likes - 3 downloads - 3 reach - 5 impact
267 instances - 45 features - 2 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: C1 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
54 runs - 0 likes - 3 downloads - 3 reach - 7 impact
28626 instances - 4 features - 5 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs - 0 likes - 3 downloads - 3 reach - 8 impact
413 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs - 0 likes - 3 downloads - 3 reach - 8 impact
347 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
78 runs - 0 likes - 3 downloads - 3 reach - 8 impact
363 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs - 0 likes - 3 downloads - 3 reach - 7 impact
146 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
78 runs - 0 likes - 3 downloads - 3 reach - 7 impact
130 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs - 0 likes - 3 downloads - 3 reach - 8 impact
329 instances - 10937 features - 2 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs - 1 likes - 2 downloads - 3 reach - 7 impact
95 instances - 22278 features - 5 classes - 0 missing values
No data.
0 runs - 0 likes - 3 downloads - 3 reach - 2 impact
31104 instances - 10 features - 0 classes - 0 missing values
The AAUP dataset for the ASA Statistical Graphics Section's 1995 Data Analysis Exposition contains information on faculty salaries for 1161 American colleges and universities. The data may be obtained…
32 runs - 0 likes - 3 downloads - 3 reach - 7 impact
1161 instances - 17 features - 4 classes - 256 missing values
The Committee on Statistical Graphics of the American Statistical Association (ASA) invites you to participate in its Second (1983) Exposition of Statistical Graphics Technology. The purposes of the…
164 runs - 0 likes - 3 downloads - 3 reach - 7 impact
406 instances - 9 features - 3 classes - 14 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
118 runs - 0 likes - 3 downloads - 3 reach - 8 impact
195 instances - 12 features - 2 classes - 2 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
107 runs - 0 likes - 3 downloads - 3 reach - 7 impact
74 instances - 10 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
100 runs - 0 likes - 3 downloads - 3 reach - 7 impact
31 instances - 17 features - 2 classes - 150 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
118 runs - 0 likes - 3 downloads - 3 reach - 8 impact
228 instances - 10 features - 2 classes - 20 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
102 runs - 0 likes - 3 downloads - 3 reach - 8 impact
527 instances - 39 features - 2 classes - 560 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
113 runs - 0 likes - 3 downloads - 3 reach - 8 impact
366 instances - 6 features - 2 classes - 1 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
701 runs - 0 likes - 3 downloads - 3 reach - 8 impact
736 instances - 20 features - 2 classes - 448 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
104 runs - 0 likes - 3 downloads - 3 reach - 7 impact
57 instances - 12 features - 2 classes - 1 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au6-cd1-400 * Abstract: AutoUniv is an advanced data generator for classification tasks. The aim is to reflect the nuances and heterogeneity…
144 runs - 0 likes - 3 downloads - 3 reach - 6 impact
400 instances - 41 features - 8 classes - 0 missing values