Data
Filter results by:
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11109, and it has 1976 rows and 1026 features…
1 runs0 likes1 downloads1 reach3 impact
1976 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 30001, and it has 717 rows and 1026 features (including…
1 runs0 likes1 downloads1 reach3 impact
717 instances - 1026 features - 0 classes - 0 missing values
Dataset sales
0 runs0 likes0 downloads0 reach0 impact
10738 instances - 15 features - 0 classes - 0 missing values
Context It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. Content The…
0 runs0 likes3 downloads3 reach0 impact
284807 instances - 31 features - 0 classes - 0 missing values
General Description 2015-current: greater than $200.00. The Commission categorizes contributions from individuals using the calendar year-to-date amount for political action committee (PAC) and party…
0 runs0 likes1 downloads1 reach0 impact
3348209 instances - 21 features - 0 classes - 10786577 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes1 downloads1 reach5 impact
5 instances - 1143 features - 0 classes - 0 missing values
This S dump contains 22 data sets from the book Visualizing Data published by Hobart Press (books@hobart.com). The dump was created by data.dump() and can be read back into S by data.restore(). The…
0 runs0 likes1 downloads1 reach5 impact
111 instances - 4 features - 0 classes - 0 missing values
Data on the recurrence times to infection, at the point of insertion of the catheter, for kidney patients using portable dialysis equipment. Catheters may be removed for reasons other than infection,…
2 runs0 likes1 downloads1 reach5 impact
76 instances - 7 features - 0 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
78 runs0 likes2 downloads2 reach6 impact
130 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
72 runs1 likes6 downloads7 reach8 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
78 runs0 likes2 downloads2 reach7 impact
363 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes2 downloads2 reach7 impact
329 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2841 runs0 likes3 downloads3 reach16 impact
630 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
59 runs0 likes6 downloads6 reach8 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach7 impact
337 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes3 downloads3 reach7 impact
468 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
67 runs0 likes1 downloads1 reach7 impact
458 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes3 downloads3 reach7 impact
470 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach7 impact
193 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes2 downloads2 reach7 impact
203 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes2 downloads2 reach7 impact
347 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2855 runs0 likes4 downloads4 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach7 impact
355 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes4 downloads4 reach7 impact
250 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2866 runs0 likes6 downloads6 reach16 impact
546 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2862 runs0 likes5 downloads5 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2849 runs0 likes5 downloads5 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes2 downloads2 reach7 impact
324 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes2 downloads2 reach7 impact
413 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
80 runs0 likes5 downloads5 reach7 impact
405 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes2 downloads2 reach7 impact
201 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes2 downloads2 reach6 impact
146 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes2 downloads2 reach7 impact
412 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
78 runs0 likes3 downloads3 reach7 impact
421 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2865 runs1 likes18 downloads19 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes2 downloads2 reach7 impact
384 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2853 runs1 likes7 downloads8 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
The KDD Cup 2009 offers the opportunity to work on large marketing databases from the French Telecom company Orange to predict the propensity of customers to switch provider (churn). Churn (wikipedia…
10982 runs0 likes15 downloads15 reach17 impact
50000 instances - 231 features - 2 classes - 8024152 missing values
Datasets from ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php) KDD Cup 2009 http://www.kddcup-orange.com Converted to ARFF format by TunedIT Customer Relationship Management (CRM) is a key element…
11301 runs0 likes12 downloads12 reach17 impact
50000 instances - 231 features - 2 classes - 8024152 missing values
ARCENE's task is to distinguish cancer versus normal patterns from mass-spectrometric data. This is a two-class classification problem with continuous input variables. This dataset is one of 5…
17 runs0 likes8 downloads8 reach6 impact
200 instances - 10001 features - 2 classes - 0 missing values
Kung chi
1 runs0 likes4 downloads4 reach4 impact
123 instances - 40 features - 2 classes - 0 missing values
knugget chase 3
0 runs0 likes2 downloads2 reach4 impact
194 instances - 40 features - 2 classes - 0 missing values
Mind Cave 2
0 runs0 likes3 downloads3 reach3 impact
125 instances - 40 features - 2 classes - 0 missing values
#### Abstract: MADELON is an artificial dataset, which was part of the NIPS 2003 feature selection challenge. This is a two-class classification problem with continuous input variables. The difficulty…
98021 runs0 likes17 downloads17 reach18 impact
2600 instances - 501 features - 2 classes - 0 missing values
The datasets contains transactions made by credit cards in September 2013 by european cardholders. This dataset present transactions that occurred in two days, where we have 492 frauds out of 284,807…
355 runs0 likes54 downloads54 reach11 impact
284807 instances - 31 features - 2 classes - 0 missing values
This is a corrected version of the previous data file in version 1, which contained a dataset (349 instances) incorrectly merged from the original training and test sets available on UCI (there are…
0 runs0 likes2 downloads2 reach4 impact
267 instances - 45 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2834 runs0 likes4 downloads4 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
79 runs0 likes2 downloads2 reach7 impact
322 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes1 downloads1 reach7 impact
185 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2855 runs0 likes8 downloads8 reach16 impact
542 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2862 runs0 likes7 downloads7 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes3 downloads3 reach6 impact
138 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes3 downloads3 reach7 impact
267 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach7 impact
484 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
76 runs0 likes4 downloads4 reach7 impact
187 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
66 runs0 likes2 downloads2 reach7 impact
195 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes4 downloads4 reach7 impact
275 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes1 downloads1 reach7 impact
321 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2860 runs0 likes5 downloads5 reach16 impact
604 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
66 runs0 likes3 downloads3 reach7 impact
259 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes1 downloads1 reach7 impact
410 instances - 10937 features - 2 classes - 0 missing values
No data.
43 runs0 likes2 downloads2 reach1 impact
1000000 instances - 45 features - 2 classes - 0 missing values
No data.
44 runs0 likes3 downloads3 reach2 impact
1000000 instances - 15 features - 2 classes - 0 missing values
No data.
90 runs2 likes3 downloads5 reach2 impact
663552 instances - 13 features - 2 classes - 0 missing values
No data.
45 runs0 likes2 downloads2 reach1 impact
1000000 instances - 23 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1078 runs0 likes11 downloads11 reach6 impact
108 instances - 8 features - 2 classes - 0 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch) Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php Modified by TunedIT (converted to ARFF…
778 runs0 likes8 downloads8 reach8 impact
4562 instances - 15 features - 2 classes - 88 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch) Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php Modified by TunedIT (converted to ARFF…
406 runs1 likes11 downloads12 reach8 impact
4229 instances - 1618 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
131 runs1 likes9 downloads10 reach7 impact
990 instances - 14 features - 2 classes - 0 missing values
* Title: Skin Segmentation Data Set * Abstract: The Skin Segmentation dataset is constructed over B, G, R color space. Skin and Nonskin dataset is generated using skin textures from face images of…
15 runs1 likes10 downloads11 reach6 impact
245057 instances - 4 features - 2 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
48 runs0 likes5 downloads5 reach7 impact
159 instances - 61360 features - 2 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
29 runs0 likes2 downloads2 reach2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
29 runs0 likes2 downloads2 reach2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
Normalized form of codrna (351) Andrew V Uzilov, Joshua M Keegan, and David H Mathews. Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change. BMC…
309 runs0 likes5 downloads5 reach1 impact
488565 instances - 9 features - 2 classes - 0 missing values
Normalized version of vehicle dataset (http://www.openml.org/d/54) NAME vehicle silhouettes PURPOSE to classify a given silhouette as one of four types of vehicle, using a set of features extracted…
372 runs0 likes10 downloads10 reach1 impact
98528 instances - 101 features - 2 classes - 0 missing values
Michel Lang fRMA-normalized. Only "Kratz-genes"*. \* (see: A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international…
0 runs0 likes7 downloads7 reach5 impact
226 instances - 24 features - 2 classes - 0 missing values
No data.
311 runs0 likes5 downloads5 reach2 impact
1000000 instances - 10 features - 2 classes - 0 missing values
No data.
307 runs0 likes5 downloads5 reach2 impact
1000000 instances - 4 features - 2 classes - 0 missing values
No data.
306 runs0 likes4 downloads4 reach2 impact
1000000 instances - 4 features - 2 classes - 0 missing values
No data.
305 runs0 likes3 downloads3 reach2 impact
1000000 instances - 4 features - 2 classes - 0 missing values
No data.
313 runs0 likes3 downloads3 reach1 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
47 runs0 likes1 downloads1 reach1 impact
1000000 instances - 45 features - 2 classes - 0 missing values
Data on predicting clicks on ads in a search engine.
0 runs0 likes7 downloads7 reach5 impact
1496391 instances - 12 features - 2 classes - 0 missing values
Even smaller sample of version 1
0 runs0 likes3 downloads3 reach4 impact
149639 instances - 12 features - 2 classes - 0 missing values
Balanced version of click prediction data
36 runs0 likes13 downloads13 reach5 impact
1997410 instances - 12 features - 2 classes - 0 missing values
This data is derived from the 2012 KDD Cup. The data is subsampled to 1% of the original number of instances, downsampling the majority class (click=0) so that the target feature is reasonably…
313 runs0 likes35 downloads35 reach5 impact
399482 instances - 12 features - 2 classes - 0 missing values
No data.
51 runs0 likes2 downloads2 reach1 impact
1000000 instances - 15 features - 2 classes - 0 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch) Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php Modified by TunedIT (converted to ARFF…
548 runs0 likes9 downloads9 reach8 impact
3468 instances - 785 features - 2 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-bodies-stemmed * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
0 runs0 likes3 downloads3 reach4 impact
64 instances - 3722 features - 2 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-bodies * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set Information:…
3 runs0 likes6 downloads6 reach5 impact
64 instances - 4703 features - 2 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-subjects-stemmed * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
71 runs0 likes2 downloads2 reach5 impact
64 instances - 230 features - 2 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-subjects * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
40 runs0 likes2 downloads2 reach5 impact
64 instances - 243 features - 2 classes - 0 missing values
DEXTER is a text classification problem in a bag-of-word representation. This is a two-class classification problem with sparse continuous input variables. This dataset is one of five datasets of the…
0 runs0 likes5 downloads5 reach11 impact
600 instances - 20001 features - 2 classes - 0 missing values
DOROTHEA is a drug discovery dataset. Chemical compounds represented by structural molecular features must be classified as active (binding to thrombin) or inactive. This is one of 5 datasets of the…
0 runs0 likes6 downloads6 reach11 impact
1150 instances - 100001 features - 2 classes - 0 missing values