Data
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12252, and it has 2998 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
2998 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 19905, and it has 3048 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
3048 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10197, and it has 3058 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
3058 instances - 1026 features - 0 classes - 0 missing values
No data.
268 runs - 0 likes - 9 downloads - 9 reach - 47 impact
3075 instances - 12433 features - 6 classes - 0 missing values
Geographical Analysis Spatial Data This georeferenced data set was used in: Pace, R. Kelley, and Ronald Barry, Quick Computation of Regressions with a Spatially Autoregressive Dependent Variable,…
4 runs - 1 likes - 2 downloads - 3 reach - 15 impact
3107 instances - 7 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
745 runs - 0 likes - 11 downloads - 11 reach - 15 impact
3107 instances - 7 features - 2 classes - 0 missing values
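The mean-based binarization described in the entry above can be sketched as follows. This is a minimal illustration, not the code used to build the dataset: the function name `binarize_by_mean` is hypothetical, and because the description is truncated before it says which side of the mean gets which label, the `'P'`/`'N'` assignment below is an assumption.

```python
from statistics import mean


def binarize_by_mean(targets, low_label="P", high_label="N"):
    """Map numeric targets to two nominal classes around their mean.

    Assumption: values below the mean get `low_label` and all others get
    `high_label`. The listing's description is cut off before it states
    the actual assignment, so treat the defaults as placeholders.
    """
    threshold = mean(targets)
    # Strictly-below-mean values go to one class, everything else to the other.
    return [low_label if t < threshold else high_label for t in targets]


if __name__ == "__main__":
    # Mean of the sample is 4.0, so only the last value lands in the second class.
    print(binarize_by_mean([1.0, 2.0, 3.0, 10.0]))
```

Swapping the two label arguments gives the opposite convention without changing the threshold logic.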
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 130, and it has 3133 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
3133 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10280, and it has 3134 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
3134 instances - 1026 features - 0 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
1 runs - 0 likes - 0 downloads - 0 reach - 17 impact
3140 instances - 260 features - 2 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 133, and it has 3151 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
3151 instances - 1026 features - 0 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 0 downloads - 0 reach - 18 impact
3153 instances - 971 features - 2 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: scaled to [-1,1]
0 runs - 0 likes - 0 downloads - 0 reach - 16 impact
3175 instances - 61 features - 0 classes - 0 missing values
Originally from the StatLog project. The raw data is still available on [UCI](https://archive.ics.uci.edu/ml/datasets/Molecular+Biology+(Splice-junction+Gene+Sequences)). The data consists of 3,186…
7063 runs - 0 likes - 7 downloads - 7 reach - 25 impact
3186 instances - 181 features - 3 classes - 0 missing values
Primate splice-junction gene sequences (DNA) with associated imperfect domain theory. Splice junctions are points on a DNA sequence at which 'superfluous' DNA is removed during the process of protein…
24189 runs - 1 likes - 17 downloads - 18 reach - 9 impact
3190 instances - 61 features - 3 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
135 runs - 0 likes - 9 downloads - 9 reach - 15 impact
3190 instances - 61 features - 2 classes - 0 missing values
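The majority-class relabeling described in the entry above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function name `binarize_by_majority` is hypothetical, and since the description is truncated after the positive label, using `'N'` for every non-majority class is an assumption.

```python
from collections import Counter


def binarize_by_majority(labels, positive="P", negative="N"):
    """Collapse a multi-class target into two classes.

    The most frequent class becomes `positive` ('P', as the description
    states). Every other class becomes `negative`; the 'N' default is an
    assumption because the description is cut off at that point.
    """
    # most_common(1) returns a one-element list of (label, count) pairs.
    majority, _count = Counter(labels).most_common(1)[0]
    return [positive if y == majority else negative for y in labels]


if __name__ == "__main__":
    # "b" occurs three times, so it is the majority class here.
    print(binarize_by_majority(["a", "b", "b", "c", "b"]))
```

When two classes are tied for most frequent, `Counter.most_common` keeps the one encountered first, so a real pipeline would want an explicit tie-breaking rule.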
Author: Alen Shapiro Source: [UCI](https://archive.ics.uci.edu/ml/datasets/Chess+(King-Rook+vs.+King-Pawn)) Please cite: [UCI citation policy](https://archive.ics.uci.edu/ml/citation_policy.html) 1.…
273625 runs - 1 likes - 43 downloads - 44 reach - 16 impact
3196 instances - 37 features - 2 classes - 0 missing values
led24-pmlb
31 runs - 0 likes - 2 downloads - 2 reach - 22 impact
3200 instances - 25 features - 10 classes - 0 missing values
led7-pmlb
31 runs - 0 likes - 0 downloads - 0 reach - 22 impact
3200 instances - 8 features - 10 classes - 0 missing values
No data.
264 runs - 0 likes - 11 downloads - 11 reach - 47 impact
3204 instances - 13196 features - 6 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: A1 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
273 runs - 0 likes - 4 downloads - 4 reach - 14 impact
3252 instances - 4 features - 5 classes - 0 missing values
testing
0 runs - 0 likes - 1 downloads - 1 reach - 2 impact
3279 instances - 1559 features - classes - 0 missing values
### Description __Changes to version 1:__ all categorical features transformed as such. This dataset represents a set of possible advertisements on Internet pages. ### Sources (a) Creator and donor:…
1432 runs - 0 likes - 5 downloads - 5 reach - 23 impact
3279 instances - 1559 features - 2 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 51, and it has 3356 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
3356 instances - 1026 features - 0 classes - 0 missing values
We aggregated screen movements into screen-fixations using a Salvucci & Goldberg (2000) dispersion-threshold algorithm, and defined Perception Action Cycles (PACs) as fixations with at least one…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3395 instances - 20 features - classes - 168 missing values
dgf_test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3415 instances - 5 features - 2 classes - 1 missing values
dgf_test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3415 instances - 5 features - 2 classes - 1 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11140, and it has 3429 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 12 impact
3429 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 280, and it has 3438 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
3438 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 13000, and it has 3459 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
3459 instances - 1026 features - 0 classes - 0 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch) Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php Modified by TunedIT (converted to ARFF…
548 runs - 0 likes - 9 downloads - 9 reach - 16 impact
3468 instances - 785 features - 2 classes - 0 missing values
Dataset from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch), which consisted of 5 different datasets (SYLVA, GINA, NOVA, HIVA, ADA). The purpose of the challenge…
69081 runs - 0 likes - 21 downloads - 21 reach - 26 impact
3468 instances - 971 features - 2 classes - 0 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch) Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php Modified by TunedIT (converted to ARFF…
396 runs - 0 likes - 17 downloads - 17 reach - 15 impact
3468 instances - 785 features - 10 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 114, and it has 3490 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
3490 instances - 1026 features - 0 classes - 0 missing values
Estimated article influence scores in 2015
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
3615 instances - 7 features - 3169 classes - 48 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10434, and it has 3650 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
3650 instances - 1026 features - 0 classes - 0 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
10 runs - 0 likes - 0 downloads - 0 reach - 14 impact
3660 instances - 47 features - 2 classes - 0 missing values
"The speech dataset was also provided by (see citation request) and contains real world data from recorded English language. The normal class contains data from persons having an American accent…
1599 runs - 0 likes - 6 downloads - 6 reach - 17 impact
3686 instances - 401 features - 2 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 137, and it has 3689 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
3689 instances - 1026 features - 0 classes - 0 missing values
Predict a biological response of molecules from their chemical properties. Each row in this data set represents a molecule. The first column contains experimental data describing an actual biological…
48340 runs - 2 likes - 38 downloads - 40 reach - 34 impact
3751 instances - 1777 features - 2 classes - 0 missing values
allbp-pmlb
31 runs - 0 likes - 2 downloads - 2 reach - 21 impact
3772 instances - 30 features - 3 classes - 0 missing values
allrep-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 21 impact
3772 instances - 30 features - 4 classes - 0 missing values
dis-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 22 impact
3772 instances - 30 features - 2 classes - 0 missing values
The sick dataset from the OpenCC18, with all categorical data label-encoded so that all data is numeric.
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
3772 instances - 30 features - classes - 0 missing values
Sick dataset from the opencc18 with all textual binary variables label encoded.
1 runs - 0 likes - 2 downloads - 2 reach - 9 impact
3772 instances - 30 features - 2 classes - 0 missing values
This directory contains Thyroid datasets. "ann-train.data" contains 3772 learning examples and "ann-test.data" contains 3428 testing examples. I have obtained this data from…
31 runs - 1 likes - 5 downloads - 6 reach - 14 impact
3772 instances - 22 features - 3 classes - 0 missing values
Thyroid disease records supplied by the Garavan Institute and J. Ross Quinlan, New South Wales Institute, Sydney, Australia. 1987. hypothyroid, primary hypothyroid, compensated…
883 runs - 0 likes - 13 downloads - 13 reach - 9 impact
3772 instances - 30 features - 4 classes - 6064 missing values
Attribute information: ``` sick, negative. | classes age: continuous. sex: M, F. on thyroxine: f, t. query on thyroxine: f, t. on antithyroid medication: f, t. sick: f, t. pregnant: f, t. thyroid…
19941 runs - 0 likes - 31 downloads - 31 reach - 9 impact
3772 instances - 30 features - 2 classes - 6064 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
737 runs - 0 likes - 10 downloads - 10 reach - 15 impact
3772 instances - 30 features - 2 classes - 6064 missing values
Multi-label dataset for text-classification. It consists of article titles and partial blurbs. Blurbs can be assigned to several categories (e.g. Science, News, Games) based on word predictors.
0 runs - 1 likes - 16 downloads - 17 reach - 16 impact
3782 instances - 1101 features - 2 classes - 0 missing values
Multi-label dataset for text-classification. It consists of article titles and partial blurbs. Blurbs can be assigned to several categories (e.g. Science, News, Games) based on word predictors.
0 runs - 0 likes - 3 downloads - 3 reach - 13 impact
3782 instances - 1101 features - classes - 0 missing values
This dataset is synthetic. It was generated by David Coleman at RCA Laboratories in Princeton, N.J. For convenience, we will refer to it as the POLLEN DATA. The first three variables are the lengths…
0 runs - 0 likes - 1 downloads - 1 reach - 14 impact
3848 instances - 5 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1767 runs - 0 likes - 15 downloads - 15 reach - 15 impact
3848 instances - 6 features - 2 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10188, and it has 3889 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
3889 instances - 1026 features - 0 classes - 0 missing values
student performance analysis 1
0 runs - 0 likes - 1 downloads - 1 reach - 6 impact
3892 instances - 36 features - classes - 0 missing values
mydata
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
3892 instances - 36 features - classes - 0 missing values
The OCLAR researchers, Marwan et al. (2019), gathered Arabic customer reviews from and Zomato website across a wide range of domains, including restaurants, hotels, hospitals, local shops, etc. The…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3916 instances - 3 features - classes - 0 missing values
Sanitized and anonymized Cargo 2000 (C2K) airfreight tracking and tracing events, covering five months of business execution (3,942 process instances, 7,932 transport legs, 56,082 activities). ###…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3943 instances - 98 features - classes - 210284 missing values
No data.
7 runs - 0 likes - 1 downloads - 1 reach - 0 impact
4000 instances - 27649 features - 2 classes - 0 missing values
artificial no anomaly
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4032 instances - 2 features - classes - 0 missing values
artificial with anomaly
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4032 instances - 3 features - 0 classes - 0 missing values
artificial with anomaly
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4032 instances - 3 features - classes - 0 missing values
artificial with anomaly
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4032 instances - 2 features - classes - 0 missing values
artificial no anomaly
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4032 instances - 2 features - 0 classes - 0 missing values
artificial with anomaly
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4032 instances - 3 features - 2 classes - 0 missing values
artificial with anomaly
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4032 instances - 3 features - classes - 0 missing values
artificial with anomaly
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4032 instances - 3 features - classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs - 0 likes - 0 downloads - 0 reach - 14 impact
4052 instances - 8 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
739 runs - 0 likes - 11 downloads - 11 reach - 15 impact
4052 instances - 8 features - 2 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 252, and it has 4081 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
4081 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 136, and it has 4085 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
4085 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 129, and it has 4089 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
4089 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10576, and it has 4103 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
4103 instances - 1026 features - 0 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
1 runs - 0 likes - 1 downloads - 1 reach - 17 impact
4147 instances - 49 features - 2 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 87, and it has 4160 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
4160 instances - 1026 features - 0 classes - 0 missing values
Make target (age) numeric. **Author**: 1. Title of Database: Abalone data 2. Sources: (a) Original owners of database: Marine Resources Division Marine Research Laboratories - Taroona Department of…
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
4177 instances - 9 features - 0 classes - 0 missing values
* Abstract: A 3-class version of abalone dataset. * Sources: (a) Original owners of database: Marine Resources Division Marine Research Laboratories - Taroona Department of Primary Industry and…
176 runs - 0 likes - 4 downloads - 4 reach - 14 impact
4177 instances - 9 features - 3 classes - 0 missing values
1. Title of Database: Abalone data 2. Sources: (a) Original owners of database: Marine Resources Division Marine Research Laboratories - Taroona Department of Primary Industry and Fisheries, Tasmania…
34899 runs - 0 likes - 18 downloads - 18 reach - 9 impact
4177 instances - 9 features - 28 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
747 runs - 0 likes - 14 downloads - 14 reach - 15 impact
4177 instances - 9 features - 2 classes - 0 missing values
Since the first automobile, the Benz Patent Motor Car in 1886, Mercedes-Benz has stood for important automotive innovations. These include, for example, the passenger safety cell with crumple zone,…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
4209 instances - 377 features - 0 classes - 0 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch) Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php Modified by TunedIT (converted to ARFF…
406 runs - 1 likes - 12 downloads - 13 reach - 16 impact
4229 instances - 1618 features - 2 classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
4324 instances - 9 features - classes - 3360 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 259, and it has 4332 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
4332 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 14037, and it has 4378 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
4378 instances - 1026 features - 0 classes - 0 missing values
Author: Gregory Gay, Tim Menzies, Misty Davies, Karen Gundy-Burlet Source: [Zenodo](https://zenodo.org/record/322475) Please cite: Misty Davies. (2009). bike [Data set]. Zenodo. DOI:…
0 runs - 0 likes - 4 downloads - 4 reach - 10 impact
4435 instances - 11 features - classes - 0 missing values
source: http://www.cs.ubc.ca/labs/beta/Projects/SATzilla/ authors: L. Xu, F. Hutter, H. Hoos, K. Leyton-Brown translator in coseal format: M. Lindauer with the help of Alexandre Frechette the data do…
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
4440 instances - 117 features - 0 classes - 27150 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 191, and it has 4442 rows and 1026 features (including…
1 runs - 0 likes - 2 downloads - 2 reach - 11 impact
4442 instances - 1026 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 1 likes - 1 downloads - 2 reach - 14 impact
4450 instances - 203 features - 0 classes - 0 missing values
According to Epsilon research, 80% of customers are more likely to do business with you if you provide personalized service. Banking is no exception. The digitalization of everyday lives means that…
0 runs - 0 likes - 2 downloads - 2 reach - 7 impact
4459 instances - 4992 features - 0 classes - 0 missing values
Data for a stock long position
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
4477 instances - 20 features - 0 classes - 0 missing values
* Dataset: Reduced version (10% of the examples) of the bank-marketing dataset.
1254 runs - 1 likes - 17 downloads - 18 reach - 15 impact
4521 instances - 17 features - 2 classes - 0 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch) Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php Modified by TunedIT (converted to ARFF…
778 runs - 0 likes - 8 downloads - 8 reach - 16 impact
4562 instances - 15 features - 2 classes - 88 missing values
__Major change w.r.t. version 1: updated data type of binary variables to factor type.__ Dataset from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch), which…
0 runs - 0 likes - 1 downloads - 1 reach - 10 impact
4562 instances - 49 features - classes - 0 missing values
Email dataset 1a
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
4585 instances - 4 features - 0 classes - 0 missing values
Email dataset 1b
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
4585 instances - 24 features - 0 classes - 161 missing values
Email dataset 1c
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
4585 instances - 792 features - 0 classes - 0 missing values
Email dataset 1d
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
4585 instances - 11 features - 0 classes - 0 missing values
Email dataset 1e
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
4585 instances - 580 features - 0 classes - 0 missing values
SPAM E-mail Database The "spam" concept is diverse: advertisements for products/websites, make money fast schemes, chain letters, pornography... Our collection of spam e-mails came from our postmaster…
161536 runs - 5 likes - 92 downloads - 97 reach - 12 impact
4601 instances - 58 features - 2 classes - 0 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
12 runs - 0 likes - 0 downloads - 0 reach - 14 impact
4704 instances - 47 features - 3 classes - 0 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
11 runs - 0 likes - 0 downloads - 0 reach - 14 impact
4704 instances - 47 features - 3 classes - 0 missing values