Data
DOROTHEA is a drug discovery dataset. Chemical compounds represented by structural molecular features must be classified as active (binding to thrombin) or inactive. This is one of 5 datasets of the…
0 runs - 0 likes - 5 downloads - 5 reach - 8 impact
1150 instances - 100001 features - 2 classes - 0 missing values
Newsweeder: Learning to filter netnews. In Proceedings of the Twelfth International Conference on Machine Learning, pages 331-339, 1995. Dataset from the LIBSVM data repository. Preprocessing: First…
0 runs - 0 likes - 4 downloads - 4 reach - 4 impact
19928 instances - 62062 features - 0 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
46 runs - 0 likes - 5 downloads - 5 reach - 6 impact
159 instances - 61360 features - 2 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs - 0 likes - 3 downloads - 3 reach - 5 impact
92 instances - 59005 features - 5 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
0 runs - 0 likes - 2 downloads - 2 reach - 5 impact
383 instances - 54676 features - 9 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
8 runs - 0 likes - 2 downloads - 2 reach - 5 impact
113 instances - 54676 features - 5 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
9 runs - 0 likes - 1 downloads - 1 reach - 5 impact
283 instances - 54622 features - 3 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
8 runs - 0 likes - 2 downloads - 2 reach - 5 impact
283 instances - 54622 features - 3 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs - 0 likes - 0 downloads - 0 reach - 5 impact
89 instances - 54614 features - 4 classes - 0 missing values
No data.
0 runs - 0 likes - 3 downloads - 3 reach - 6 impact
697641 instances - 47237 features - 0 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
9 runs - 0 likes - 3 downloads - 3 reach - 5 impact
214 instances - 45102 features - 7 classes - 0 missing values
CIFAR-10 dataset but with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image…
40 runs - 0 likes - 0 downloads - 0 reach - 2 impact
13000 instances - 27649 features - 10 classes - 0 missing values
No data.
67 runs - 0 likes - 10 downloads - 10 reach - 10 impact
9558 instances - 26833 features - 44 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs - 0 likes - 0 downloads - 0 reach - 5 impact
105 instances - 22284 features - 3 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs - 0 likes - 0 downloads - 0 reach - 5 impact
105 instances - 22284 features - 3 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
9 runs - 0 likes - 3 downloads - 3 reach - 5 impact
220 instances - 22284 features - 3 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs - 0 likes - 1 downloads - 1 reach - 5 impact
95 instances - 22278 features - 5 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: Vikas Sindhwani for the SVMlin project.
0 runs - 0 likes - 3 downloads - 3 reach - 4 impact
72309 instances - 20959 features - 0 classes - 0 missing values
DEXTER is a text classification problem in a bag-of-words representation. This is a two-class classification problem with sparse continuous input variables. This dataset is one of five datasets of the…
0 runs - 0 likes - 4 downloads - 4 reach - 8 impact
600 instances - 20001 features - 2 classes - 0 missing values
Multiclass cancer diagnosis using 16063 tumor gene expression signatures. PNAS, VOL 98, no 26, pp. 15149-15154, December 18, 2001. S. Ramaswamy, P. Tamayo, R. Rifkin, S. Mukherjee, C.-H. Yeang, M.…
114 runs - 0 likes - 7 downloads - 7 reach - 11 impact
190 instances - 16064 features - 14 classes - 0 missing values
No data.
264 runs - 0 likes - 11 downloads - 11 reach - 35 impact
3204 instances - 13196 features - 6 classes - 0 missing values
No data.
268 runs - 0 likes - 9 downloads - 9 reach - 35 impact
3075 instances - 12433 features - 6 classes - 0 missing values
No data.
216 runs - 0 likes - 12 downloads - 12 reach - 51 impact
11162 instances - 11466 features - 10 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
75 runs - 0 likes - 3 downloads - 3 reach - 6 impact
193 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
75 runs - 0 likes - 2 downloads - 2 reach - 6 impact
203 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
63 runs - 0 likes - 2 downloads - 2 reach - 6 impact
347 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2853 runs - 0 likes - 4 downloads - 4 reach - 15 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
75 runs - 0 likes - 3 downloads - 3 reach - 6 impact
355 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
75 runs - 0 likes - 3 downloads - 3 reach - 6 impact
250 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2864 runs - 0 likes - 6 downloads - 6 reach - 15 impact
546 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2860 runs - 0 likes - 5 downloads - 5 reach - 15 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2847 runs - 0 likes - 5 downloads - 5 reach - 15 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
63 runs - 0 likes - 2 downloads - 2 reach - 6 impact
324 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
75 runs - 0 likes - 2 downloads - 2 reach - 6 impact
413 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
78 runs - 0 likes - 5 downloads - 5 reach - 6 impact
405 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
75 runs - 0 likes - 2 downloads - 2 reach - 6 impact
201 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
63 runs - 0 likes - 2 downloads - 2 reach - 5 impact
146 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
63 runs - 0 likes - 1 downloads - 1 reach - 6 impact
412 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
76 runs - 0 likes - 3 downloads - 3 reach - 6 impact
421 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2863 runs - 1 likes - 16 downloads - 17 reach - 15 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
75 runs - 0 likes - 2 downloads - 2 reach - 6 impact
384 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2851 runs - 1 likes - 7 downloads - 8 reach - 15 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
76 runs - 0 likes - 2 downloads - 2 reach - 5 impact
130 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
70 runs - 1 likes - 6 downloads - 7 reach - 7 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
76 runs - 0 likes - 2 downloads - 2 reach - 6 impact
363 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
75 runs - 0 likes - 2 downloads - 2 reach - 6 impact
329 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2839 runs - 0 likes - 3 downloads - 3 reach - 15 impact
630 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
57 runs - 0 likes - 6 downloads - 6 reach - 7 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
75 runs - 0 likes - 3 downloads - 3 reach - 6 impact
337 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
63 runs - 0 likes - 2 downloads - 2 reach - 6 impact
468 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs - 0 likes - 1 downloads - 1 reach - 6 impact
458 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
63 runs - 0 likes - 3 downloads - 3 reach - 6 impact
470 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
63 runs - 0 likes - 3 downloads - 3 reach - 5 impact
138 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
63 runs - 0 likes - 2 downloads - 2 reach - 6 impact
267 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
75 runs - 0 likes - 3 downloads - 3 reach - 6 impact
484 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
74 runs - 0 likes - 4 downloads - 4 reach - 6 impact
187 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
64 runs - 0 likes - 2 downloads - 2 reach - 6 impact
195 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
75 runs - 0 likes - 4 downloads - 4 reach - 6 impact
275 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
63 runs - 0 likes - 1 downloads - 1 reach - 6 impact
321 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2858 runs - 0 likes - 4 downloads - 4 reach - 15 impact
604 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
64 runs - 0 likes - 3 downloads - 3 reach - 6 impact
259 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
63 runs - 0 likes - 1 downloads - 1 reach - 6 impact
410 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2832 runs - 0 likes - 3 downloads - 3 reach - 15 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs - 0 likes - 2 downloads - 2 reach - 6 impact
322 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
64 runs - 0 likes - 1 downloads - 1 reach - 6 impact
386 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
63 runs - 0 likes - 1 downloads - 1 reach - 6 impact
185 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2853 runs - 0 likes - 8 downloads - 8 reach - 15 impact
542 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2860 runs - 0 likes - 7 downloads - 7 reach - 15 impact
1545 instances - 10937 features - 2 classes - 0 missing values
The Sheffield (previously UMIST) Face Database consists of 564 images of 20 individuals (mixed race/gender/appearance). Each individual is shown in a range of poses from profile to frontal views -…
53 runs - 0 likes - 0 downloads - 0 reach - 2 impact
575 instances - 10305 features - 20 classes - 0 missing values
No data.
108 runs - 0 likes - 4 downloads - 4 reach - 9 impact
927 instances - 10129 features - 7 classes - 0 missing values
Dataset creator and donator: Zhi Liu, e-mail: liuzhi8673 '@' gmail.com, institution: National Engineering Research Center for E-Learning, Hubei Wuhan, China Data Set Information: dataset are derived…
65168 runs - 2 likes - 37 downloads - 39 reach - 205 impact
1500 instances - 10001 features - 50 classes - 0 missing values
ARCENE's task is to distinguish cancer versus normal patterns from mass-spectrometric data. This is a two-class classification problem with continuous input variables. This dataset is one of 5…
16 runs - 0 likes - 8 downloads - 8 reach - 5 impact
200 instances - 10001 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
100 instances - 10001 features - 2 classes - 0 missing values
No data.
163 runs - 0 likes - 13 downloads - 13 reach - 10 impact
1560 instances - 8461 features - 20 classes - 0 missing values
No data.
414 runs - 0 likes - 8 downloads - 8 reach - 50 impact
690 instances - 8262 features - 10 classes - 0 missing values
No data.
220 runs - 0 likes - 6 downloads - 6 reach - 9 impact
336 instances - 7903 features - 6 classes - 0 missing values
No data.
203 runs - 0 likes - 5 downloads - 5 reach - 9 impact
878 instances - 7455 features - 10 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
10000 instances - 7201 features - 10 classes - 0 missing values
Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring. Science, VOL 286, pp. 531-537, 15 October 1999. Web supplement to the article T.R. Golub, D. K.…
449 runs - 0 likes - 12 downloads - 12 reach - 5 impact
72 instances - 7130 features - 2 classes - 0 missing values
Embryonal tumours of the central nervous system. Prediction of Central Nervous System Embryonal Tumour Outcome based on Gene Expression. Nature, VOL 415, pp. 436-442, 24 January 2002. Scott L. Pomeroy,…
343 runs - 0 likes - 6 downloads - 6 reach - 5 impact
60 instances - 7130 features - 2 classes - 0 missing values
libSVM, AAD group. A simple and efficient algorithm for gene selection using sparse logistic regression. Bioinformatics, 19(17):2246-2253, 2003. Dataset from the LIBSVM data repository.…
0 runs - 0 likes - 2 downloads - 2 reach - 4 impact
86 instances - 7130 features - 0 classes - 0 missing values
No data.
219 runs - 0 likes - 5 downloads - 5 reach - 9 impact
414 instances - 6430 features - 9 classes - 0 missing values
eating
9413 runs - 0 likes - 14 downloads - 14 reach - 40 impact
945 instances - 6374 features - 7 classes - 0 missing values
No data.
215 runs - 0 likes - 7 downloads - 7 reach - 9 impact
204 instances - 5833 features - 6 classes - 0 missing values
No data.
211 runs - 0 likes - 4 downloads - 4 reach - 9 impact
313 instances - 5805 features - 8 classes - 0 missing values
GISETTE is a handwritten digit recognition problem. The problem is to separate the highly confusable digits '4' and '9'. This dataset is one of five datasets of the NIPS 2003 feature selection…
466 runs - 0 likes - 51 downloads - 51 reach - 13 impact
7000 instances - 5001 features - 2 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-bodies * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set Information:…
3 runs - 0 likes - 6 downloads - 6 reach - 4 impact
64 instances - 4703 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
20000 instances - 4297 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
20000 instances - 4297 features - 2 classes - 0 missing values
This dataset contains a set of face images taken between April 1992 and April 1994 at AT&T Laboratories Cambridge. As described on the original website: There are ten different images of each of 40…
53 runs - 0 likes - 0 downloads - 0 reach - 2 impact
400 instances - 4097 features - 40 classes - 0 missing values
No data.
496 runs - 0 likes - 6 downloads - 6 reach - 11 impact
45 instances - 4027 features - 2 classes - 5948 missing values
No data.
296 runs - 0 likes - 5 downloads - 5 reach - 11 impact
96 instances - 4027 features - 9 classes - 19667 missing values
No data.
283 runs - 0 likes - 5 downloads - 5 reach - 11 impact
96 instances - 4027 features - 11 classes - 19667 missing values
No data.
159 runs - 0 likes - 11 downloads - 11 reach - 10 impact
1657 instances - 3759 features - 25 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-bodies-stemmed * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
0 runs - 0 likes - 3 downloads - 3 reach - 3 impact
64 instances - 3722 features - 2 classes - 0 missing values
No data.
416 runs - 1 likes - 13 downloads - 14 reach - 51 impact
1050 instances - 3239 features - 10 classes - 0 missing values
No data.
428 runs - 0 likes - 12 downloads - 12 reach - 51 impact
1003 instances - 3183 features - 10 classes - 0 missing values
No data.
377 runs - 0 likes - 9 downloads - 9 reach - 50 impact
913 instances - 3101 features - 10 classes - 0 missing values
The Street View House Numbers (SVHN) Dataset. SVHN is a real-world image dataset for developing machine learning and object recognition algorithms with minimal requirement on data preprocessing and…
52 runs - 0 likes - 0 downloads - 0 reach - 2 impact
99289 instances - 3073 features - 10 classes - 0 missing values
This is a 20,000 instance sample of the original CIFAR-10 dataset. Sampled randomly and stratified, with 2000 examples per class. Training and test set are merged. Find the corresponding task for the…
380 runs - 0 likes - 3 downloads - 3 reach - 9 impact
20000 instances - 3073 features - 10 classes - 0 missing values