Data
No data.
219 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1000000 instances - 58 features - 2 classes - 0 missing values
No data.
68 runs - 0 likes - 2 downloads - 2 reach - 9 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
334 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1000000 instances - 33 features - 2 classes - 0 missing values
Automated file upload of BNG(spambase)
98 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 58 features - 2 classes - 0 missing values
Automated file upload of BNG(optdigits)
100 runs - 1 likes - 1 downloads - 2 reach - 11 impact
1000000 instances - 65 features - 10 classes - 0 missing values
Automated file upload of 20_newsgroups.drift
124 runs - 0 likes - 2 downloads - 2 reach - 15 impact
399940 instances - 1001 features - 2 classes - 0 missing values
Automated file upload of BNG(ionosphere)
99 runs - 1 likes - 4 downloads - 5 reach - 12 impact
1000000 instances - 35 features - 2 classes - 0 missing values
Automated file upload of BNG(segment)
99 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 20 features - 7 classes - 0 missing values
Multi-label dataset. The UC Berkeley enron4 dataset represents a subset of the original enron5 dataset and consists of 1684 cases of emails with 21 labels and 1001 predictor variables.
1 runs - 0 likes - 4 downloads - 4 reach - 14 impact
1702 instances - 1054 features - 2 classes - 0 missing values
Multi-label dataset. The UC Berkeley enron4 dataset represents a subset of the original enron5 dataset and consists of 1684 cases of emails with 21 labels and 1001 predictor variables.
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1702 instances - 1054 features - classes - 0 missing values
Multi-label dataset. The genbase dataset contains protein sequences that can be assigned to several classes of protein families.
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
662 instances - 1212 features - classes - 0 missing values
No data.
307 runs - 0 likes - 5 downloads - 5 reach - 11 impact
1000000 instances - 4 features - 2 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
29 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
45 runs - 0 likes - 2 downloads - 2 reach - 9 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
30 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
31 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
305 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 4 features - 2 classes - 0 missing values
No data.
337 runs - 1 likes - 2 downloads - 3 reach - 11 impact
1000000 instances - 13 features - 3 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
31 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
29 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
66 runs - 0 likes - 3 downloads - 3 reach - 9 impact
1000000 instances - 13 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
313 runs - 0 likes - 3 downloads - 3 reach - 9 impact
1000000 instances - 23 features - 2 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-bodies-stemmed * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
0 runs - 0 likes - 3 downloads - 3 reach - 12 impact
64 instances - 3722 features - 2 classes - 0 missing values
No data.
37 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 11 impact
24 instances - 5 features - classes - 0 missing values
No data.
306 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1000000 instances - 4 features - 2 classes - 0 missing values
Payments given by healthcare manufacturing companies to medical doctors or hospitals
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
73558 instances - 6 features - 2 classes - 83182 missing values
Survey to know if people self-identify as Midwesterners.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2778 instances - 28 features - 10 classes - 1737 missing values
Subset of KITS dataset with 100 images
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 27649 features - 2 classes - 0 missing values
Originally from the StatLog project. The raw data is still available on [UCI](https://archive.ics.uci.edu/ml/datasets/Molecular+Biology+(Splice-junction+Gene+Sequences)). The data consists of 3,186…
7055 runs - 0 likes - 5 downloads - 5 reach - 24 impact
3186 instances - 181 features - 3 classes - 0 missing values
Source: Rami Mustafa A Mohammad (University of Huddersfield, rami.mohammad '@' hud.ac.uk, rami.mustafa.a '@' gmail.com) Lee McCluskey (University of Huddersfield, t.l.mccluskey '@' hud.ac.uk) Fadi…
51512 runs - 1 likes - 25 downloads - 26 reach - 27 impact
11055 instances - 31 features - 2 classes - 0 missing values
### Description The data consists of real historical data collected from 2010 & 2011. Employees are manually allowed or denied access to resources over time. The data is used to create an algorithm…
35323 runs - 0 likes - 22 downloads - 22 reach - 28 impact
32769 instances - 10 features - 2 classes - 0 missing values
This database was derived from a simple hierarchical decision model originally developed for the demonstration of DEX (M. Bohanec, V. Rajkovic: Expert system for decision making. Sistemica 1(1), pp.…
7179 runs - 0 likes - 10 downloads - 10 reach - 23 impact
1728 instances - 7 features - 4 classes - 0 missing values
50 Danish words with their pronunciation from Dansk Ordbog
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
51 instances - 2 features - classes - 2 missing values
This database contains all legal 8-ply positions in the game of connect-4 in which neither player has won yet, and in which the next move is not forced. Attributes represent board positions on a 6x6…
9607 runs - 0 likes - 10 downloads - 10 reach - 25 impact
67557 instances - 43 features - 3 classes - 0 missing values
Relevant Information: -- The database contains 3 potential classes, one for the number of times a certain type of solar flare occurred in a 24-hour period. -- Each instance represents captured features…
31 runs - 0 likes - 1 downloads - 1 reach - 20 impact
1066 instances - 13 features - 6 classes - 0 missing values
Survey to know if people self-identify as Midwesterners.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2778 instances - 28 features - 10 classes - 1737 missing values
Survey to know if people self-identify as Midwesterners.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2494 instances - 28 features - 9 classes - 99 missing values
No data.
48 runs - 1 likes - 4 downloads - 5 reach - 12 impact
1000000 instances - 77 features - 10 classes - 0 missing values
### Internet Usage Data #### Data Type multivariate #### Abstract This data contains general demographic information on internet users in 1997. ### Data Characteristics This data comes from a survey…
0 runs - 1 likes - 6 downloads - 7 reach - 12 impact
10108 instances - 72 features - 46 classes - 2699 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs - 0 likes - 2 downloads - 2 reach - 21 impact
131 instances - 3 features - 3 classes - 0 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs - 0 likes - 1 downloads - 1 reach - 21 impact
61 instances - 3 features - 3 classes - 0 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs - 0 likes - 1 downloads - 1 reach - 21 impact
70 instances - 3 features - 3 classes - 0 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs - 0 likes - 3 downloads - 3 reach - 21 impact
65 instances - 3 features - 2 classes - 0 missing values
No data.
304 runs - 0 likes - 7 downloads - 7 reach - 11 impact
1000000 instances - 25 features - 10 classes - 0 missing values
This dataset is a collection of newsgroup documents. The 20 newsgroups collection has become a popular data set for experiments in text applications of machine learning techniques, such as text…
167 runs - 0 likes - 8 downloads - 8 reach - 12 impact
399940 instances - 1002 features - 2 classes - 0 missing values
### Pittsburgh bridges This version is derived from version 2 (the discretized version) by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally…
31 runs - 0 likes - 3 downloads - 3 reach - 15 impact
105 instances - 12 features - 6 classes - 61 missing values
* Dataset: DBworld e-mails data set Task: dbworld-subjects * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
40 runs - 0 likes - 3 downloads - 3 reach - 13 impact
64 instances - 243 features - 2 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-subjects-stemmed * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
71 runs - 0 likes - 3 downloads - 3 reach - 13 impact
64 instances - 230 features - 2 classes - 0 missing values
This database is a standardized version of the original audiology database (see audiology.* in this directory). The non-standard set of attributes have been converted to a standard set of attributes…
7303 runs - 0 likes - 12 downloads - 12 reach - 12 impact
226 instances - 70 features - 24 classes - 317 missing values
1. Title: Chess End-Game -- King+Rook versus King+Pawn on a7 (usually abbreviated KRKPA7). The pawn on a7 means it is one square away from queening. It is the King+Rook's side (white) to move. 2.…
273623 runs - 1 likes - 42 downloads - 43 reach - 16 impact
3196 instances - 37 features - 2 classes - 0 missing values
1. Title: Postoperative Patient Data 2. Source Information: -- Creators: Sharon Summers, School of Nursing, University of Kansas Medical Center, Kansas City, KS 66160 Linda Woolery, School of Nursing,…
1758 runs - 0 likes - 10 downloads - 10 reach - 9 impact
90 instances - 9 features - 3 classes - 3 missing values
### Description This dataset describes mushrooms in terms of their physical characteristics. They are classified into: poisonous or edible. ### Source ``` (a) Origin: Mushroom records are drawn from…
16392 runs - 1 likes - 42 downloads - 43 reach - 13 impact
8124 instances - 23 features - 2 classes - 2480 missing values
1. Title: Nursery Database 2. Sources: (a) Creator: Vladislav Rajkovic et al. (13 experts) (b) Donors: Marko Bohanec (marko.bohanec@ijs.si) Blaz Zupan (blaz.zupan@ijs.si) (c) Date: June, 1997 3. Past…
2210 runs - 0 likes - 18 downloads - 18 reach - 12 impact
12960 instances - 9 features - 5 classes - 0 missing values
Compilation of promoters with known transcriptional start points for E. coli genes. The task is to recognize promoters in strings that represent nucleotides (one of A, G, T, or C). A promoter is a…
138 runs - 1 likes - 9 downloads - 10 reach - 12 impact
106 instances - 58 features - 2 classes - 0 missing values
Primary Tumor Domain - Donors: - I. Kononenko, University E.Kardelj, Faculty for electrical engineering - B. Cestnik, Jozef Stefan Institute - Past Usage: (several) 1. Cestnik, G., Kononenko, I., &…
1261 runs - 0 likes - 16 downloads - 16 reach - 12 impact
339 instances - 18 features - 21 classes - 225 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. The maps were scanned in 8 bit grey value at density of 400dpi,…
26229 runs - 0 likes - 18 downloads - 18 reach - 13 impact
2000 instances - 241 features - 10 classes - 0 missing values
Citation Request: This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
2009 runs - 1 likes - 36 downloads - 37 reach - 9 impact
286 instances - 10 features - 2 classes - 9 missing values
Classify a chess game based on the position of the white king, the white rook and the black king.
1777 runs - 0 likes - 16 downloads - 16 reach - 9 impact
28056 instances - 7 features - 18 classes - 0 missing values
Lung Cancer Data * Past Usage: - Hong, Z.Q. and Yang, J.Y. "Optimal… Number of Samples and Design Method of Classifier on the Plane", Pattern Recognition, Vol. 24, No. 4, pp. 317-324, 1991.
1238 runs - 0 likes - 19 downloads - 19 reach - 13 impact
32 instances - 57 features - 3 classes - 5 missing values
No data.
1038 runs - 0 likes - 10 downloads - 10 reach - 9 impact
55296 instances - 10 features - 3 classes - 0 missing values
No data.
1457 runs - 0 likes - 13 downloads - 13 reach - 9 impact
39366 instances - 10 features - 2 classes - 0 missing values
Space Shuttle Autolanding Domain - Donor: B. Cestnik Jozef Stefan Institute - Past Usage: (several, it appears) Example: Michie,D. (1988). The Fifth Generation's Unbridged Gap. In Rolf Herken (Ed.)…
1466 runs - 0 likes - 9 downloads - 9 reach - 9 impact
15 instances - 7 features - 2 classes - 26 missing values
1. Title: INDUCE Trains Data set 2. Sources: - Donor: GMU, Center for AI, Software Librarian, Eric E. Bloedorn (bloedorn@aic.gmu.edu) - Original owners: Ryszard S. Michalski (michalski@aic.gmu.edu)…
1973 runs - 0 likes - 9 downloads - 9 reach - 15 impact
10 instances - 33 features - 2 classes - 51 missing values
1. Title: 1984 United States Congressional Voting Records Database 2. Source Information: (a) Source: Congressional Quarterly Almanac, 98th Congress, 2nd session 1984, Volume XL: Congressional…
2262 runs - 0 likes - 17 downloads - 17 reach - 9 impact
435 instances - 17 features - 2 classes - 392 missing values
This database encodes the complete set of possible board configurations at the end of tic-tac-toe games, where "x" is assumed to have played first. The target concept is "win for x" (i.e., true when…
386330 runs - 2 likes - 83 downloads - 85 reach - 10 impact
958 instances - 10 features - 2 classes - 0 missing values
A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
379 instances - 8 features - 4 classes - 1418 missing values
A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on…
490 runs - 0 likes - 4 downloads - 4 reach - 13 impact
364 instances - 33 features - 6 classes - 101 missing values
A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on…
27710 runs - 0 likes - 13 downloads - 13 reach - 42 impact
797 instances - 5 features - 6 classes - 0 missing values
A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on…
1030 runs - 0 likes - 8 downloads - 8 reach - 14 impact
132 instances - 4 features - 2 classes - 0 missing values
A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on…
1116 runs - 0 likes - 9 downloads - 9 reach - 14 impact
120 instances - 4 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1058 runs - 0 likes - 9 downloads - 9 reach - 15 impact
167 instances - 5 features - 2 classes - 0 missing values
This is the large soybean database from the UCI repository, with its training and test database combined into a single file. There are 19 classes, only the first 15 of which have been used in prior…
40719 runs - 1 likes - 53 downloads - 54 reach - 13 impact
683 instances - 36 features - 19 classes - 2337 missing values
Primate splice-junction gene sequences (DNA) with associated imperfect domain theory. Splice junctions are points on a DNA sequence at which 'superfluous' DNA is removed during the process of protein…
24188 runs - 1 likes - 17 downloads - 18 reach - 9 impact
3190 instances - 61 features - 3 classes - 0 missing values
### SPECT heart data This is a merged version of the separate train and test set which are usually distributed. On OpenML this train-test split can be found as one of the possible tasks. ### Sources:…
1296 runs - 1 likes - 12 downloads - 13 reach - 16 impact
267 instances - 23 features - 2 classes - 0 missing values
Once upon a time, in July 1991, the monks of Corsendonk Priory were faced with a school held in their priory, namely the 2nd European Summer School on Machine Learning. After listening more than one…
394294 runs - 2 likes - 30 downloads - 32 reach - 38 impact
601 instances - 7 features - 2 classes - 0 missing values
Once upon a time, in July 1991, the monks of Corsendonk Priory were faced with a school held in their priory, namely the 2nd European Summer School on Machine Learning. After listening more than one…
108666 runs - 1 likes - 15 downloads - 16 reach - 35 impact
554 instances - 7 features - 2 classes - 0 missing values
Once upon a time, in July 1991, the monks of Corsendonk Priory were faced with a school held in their priory, namely the 2nd European Summer School on Machine Learning. After listening more than one…
358450 runs - 2 likes - 20 downloads - 22 reach - 40 impact
556 instances - 7 features - 2 classes - 0 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
434 runs - 0 likes - 10 downloads - 10 reach - 14 impact
7019 instances - 61 features - 8 classes - 48089 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
366 runs - 0 likes - 10 downloads - 10 reach - 14 impact
8844 instances - 61 features - 7 classes - 51515 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
354 runs - 0 likes - 7 downloads - 7 reach - 14 impact
7485 instances - 61 features - 7 classes - 52048 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
622 runs - 0 likes - 6 downloads - 6 reach - 17 impact
10108 instances - 69 features - 2 classes - 2699 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
736 runs - 0 likes - 6 downloads - 6 reach - 15 impact
364 instances - 33 features - 2 classes - 80 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
728 runs - 0 likes - 7 downloads - 7 reach - 15 impact
2000 instances - 241 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
722 runs - 0 likes - 6 downloads - 6 reach - 15 impact
683 instances - 36 features - 2 classes - 2337 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
757 runs - 0 likes - 8 downloads - 8 reach - 15 impact
400 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
726 runs - 0 likes - 10 downloads - 10 reach - 15 impact
576 instances - 12 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
765 runs - 0 likes - 13 downloads - 13 reach - 15 impact
1728 instances - 7 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1202 runs - 0 likes - 9 downloads - 9 reach - 14 impact
100 instances - 4 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
709 runs - 0 likes - 9 downloads - 9 reach - 14 impact
48 instances - 5 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
135 runs - 0 likes - 9 downloads - 9 reach - 15 impact
3190 instances - 61 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
173 runs - 0 likes - 6 downloads - 6 reach - 24 impact
106 instances - 58 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
717 runs - 0 likes - 5 downloads - 5 reach - 14 impact
90 instances - 9 features - 2 classes - 3 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
652 runs - 0 likes - 17 downloads - 17 reach - 15 impact
12960 instances - 9 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
106 runs - 0 likes - 5 downloads - 5 reach - 14 impact
76 instances - 45 features - 2 classes - 22 missing values
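Several entries in this listing are described as a "Binarized version of the original data set", in which the majority class of a multi-class target is re-labeled as positive ('P'). The descriptions are truncated, so the sketch below is only an illustration under an assumption: that every other class becomes negative ('N'). The function name `binarize_majority` and the sample labels are hypothetical and do not come from any listed dataset.

```python
from collections import Counter


def binarize_majority(labels):
    """Relabel the most frequent class as 'P' and all others as 'N'.

    Assumption: the non-majority classes map to 'N'; the listing's
    description is cut off before it states this.
    """
    # most_common(1) returns [(label, count)] for the most frequent label
    majority, _ = Counter(labels).most_common(1)[0]
    return ["P" if y == majority else "N" for y in labels]


# Hypothetical multi-class target column
labels = ["unacc", "acc", "unacc", "good", "unacc", "vgood"]
binary = binarize_majority(labels)
print(binary)
```

With a tie for the most frequent class, `Counter.most_common` picks the label seen first, so a real pipeline would need an explicit tie-breaking rule.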