Data
Filter results by:
SPECT heart data This is a merged version of the separate train and test set which are usually distributed. On OpenML this train-test split can be found as one of the possible tasks. Sources: --…
1296 runs1 likes12 downloads13 reach8 impact
267 instances - 23 features - 2 classes - 0 missing values
### Description __Changes to version 1:__ all categorical features transformed as such. This dataset represents a set of possible advertisements on Internet pages. ### Sources (a) Creator and donor:…
1421 runs0 likes2 downloads2 reach12 impact
3279 instances - 1559 features - 2 classes - 0 missing values
No data.
1457 runs0 likes12 downloads12 reach1 impact
39366 instances - 10 features - 2 classes - 0 missing values
1. Title: Space Shuttle Autolanding Domain 2. Sources: (a) Original source: unknown -- NASA: Mr. Roger Burke's autolander design team (b) Donor: Bojan Cestnik Jozef Stefan Institute Jamova 39 61000…
1466 runs0 likes9 downloads9 reach1 impact
15 instances - 7 features - 2 classes - 26 missing values
"The speech dataset was also provided by (see citation request) and contains real world data from recorded English language. The normal class contains data from persons having an American accent…
1599 runs0 likes5 downloads5 reach9 impact
3686 instances - 401 features - 2 classes - 0 missing values
1. Title: Dermatology Database 2. Source Information: (a) Original owners: -- 1. Nilsel Ilter, M.D., Ph.D., Gazi University, School of Medicine 06510 Ankara, Turkey Phone: +90 (312) 214 1080 -- 2. H.…
1752 runs0 likes13 downloads13 reach2 impact
366 instances - 35 features - 6 classes - 8 missing values
1. Title: Postoperative Patient Data 2. Source Information: -- Creators: Sharon Summers, School of Nursing, University of Kansas Medical Center, Kansas City, KS 66160 Linda Woolery, School of Nursing,…
1758 runs0 likes9 downloads9 reach1 impact
90 instances - 9 features - 3 classes - 3 missing values
Publication Request: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This file describes the contents of the heart-disease directory. This directory contains 4 databases…
1761 runs0 likes10 downloads10 reach1 impact
303 instances - 14 features - 2 classes - 7 missing values
1. Title: Glass Identification Database 2. Sources: (a) Creator: B. German -- Central Research Establishment Home Office Forensic Science Service Aldermaston, Reading, Berkshire RG7 4PN (b) Donor:…
1776 runs0 likes50 downloads50 reach1 impact
214 instances - 10 features - 6 classes - 0 missing values
No data.
1777 runs0 likes15 downloads15 reach1 impact
28056 instances - 7 features - 18 classes - 0 missing values
Publication Request: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This file describes the contents of the heart-disease directory. This directory contains 4 databases…
1787 runs0 likes10 downloads10 reach1 impact
294 instances - 14 features - 2 classes - 782 missing values
1. Title: Protein Localization Sites 2. Creator and Maintainer: Kenta Nakai Institue of Molecular and Cellular Biology Osaka, University 1-3 Yamada-oka, Suita 565 Japan nakai@imcb.osaka-u.ac.jp…
1803 runs0 likes12 downloads12 reach2 impact
336 instances - 8 features - 8 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1821 runs0 likes9 downloads9 reach6 impact
120 instances - 3 features - 2 classes - 0 missing values
Donor: Will Taylor (taylor@pluto.arc.nasa.gov) In this version (version 2), some features were removed. It is unclear why of how this was done.
1883 runs0 likes9 downloads9 reach1 impact
368 instances - 23 features - 2 classes - 1927 missing values
Citation Request: This lymphography domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
1972 runs0 likes30 downloads30 reach2 impact
148 instances - 19 features - 4 classes - 0 missing values
1. Title: INDUCE Trains Data set 2. Sources: - Donor: GMU, Center for AI, Software Librarian, Eric E. Bloedorn (bloedorn@aic.gmu.edu) - Original owners: Ryszard S. Michalski (michalski@aic.gmu.edu)…
1973 runs0 likes9 downloads9 reach3 impact
10 instances - 33 features - 2 classes - 51 missing values
Citation Request: This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
2007 runs1 likes32 downloads33 reach1 impact
286 instances - 10 features - 2 classes - 9 missing values
1. Title: Teaching Assistant Evaluation 2. Sources: (a) Collector: Wei-Yin Loh (Department of Statistics, UW-Madison) (b) Donor: Tjen-Sien Lim (limt@stat.wisc.edu) (b) Date: June 7, 1997 3. Past…
2028 runs0 likes12 downloads12 reach1 impact
151 instances - 6 features - 3 classes - 0 missing values
Data used in an analysis of the Brown and Frown corpora for my doctoral dissertation titled ``Variations in Written English: Characterizing Authors' Rhetorical Language Choices Across Corpora of…
2046 runs0 likes0 downloads0 reach4 impact
1000 instances - 24 features - 30 classes - 0 missing values
The satellite dataset comprises of features extracted from satellite observations. In particular, each image was taken under four different light wavelength, two in visible light (green and red) and…
2074 runs3 likes67 downloads70 reach21 impact
5100 instances - 37 features - 2 classes - 0 missing values
1. Title: Hepatitis Domain 2. Sources: (a) unknown (b) Donor: G.Gong (Carnegie-Mellon University) via Bojan Cestnik Jozef Stefan Institute Jamova 39 61000 Ljubljana Yugoslavia (tel.: (38)(+61) 214-399…
2134 runs1 likes12 downloads13 reach1 impact
155 instances - 20 features - 2 classes - 167 missing values
No data.
2193 runs0 likes15 downloads15 reach1 impact
1484 instances - 9 features - 10 classes - 0 missing values
1. Title: Nursery Database 2. Sources: (a) Creator: Vladislav Rajkovic et al. (13 experts) (b) Donors: Marko Bohanec (marko.bohanec@ijs.si) Blaz Zupan (blaz.zupan@ijs.si) (c) Date: June, 1997 3. Past…
2210 runs0 likes17 downloads17 reach2 impact
12960 instances - 9 features - 5 classes - 0 missing values
1. Title: 1984 United States Congressional Voting Records Database 2. Source Information: (a) Source: Congressional Quarterly Almanac, 98th Congress, 2nd session 1984, Volume XL: Congressional…
2262 runs0 likes17 downloads17 reach1 impact
435 instances - 17 features - 2 classes - 392 missing values
NAME: Sonar, Mines vs. Rocks SUMMARY: This is the data set used by Gorman and Sejnowski in their study of the classification of sonar signals using a neural network [1]. The task is to train a network…
2366 runs1 likes24 downloads25 reach1 impact
208 instances - 61 features - 2 classes - 0 missing values
This radar data was collected by a system in Goose Bay, Labrador. This system consists of a phased array of 16 high-frequency antennas with a total transmitted power on the order of 6.4 kilowatts. See…
2484 runs3 likes27 downloads30 reach2 impact
351 instances - 35 features - 2 classes - 0 missing values
Prediction task is to determine whether a person makes over 50K a year. Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the…
2671 runs1 likes30 downloads31 reach2 impact
48842 instances - 15 features - 2 classes - 6465 missing values
1. Title of Database: Blocks Classification 2. Sources: (a) Donato Malerba Dipartimento di Informatica University of Bari via Orabona 4 70126 Bari - Italy phone: +39 - 80 - 5443269 fax: +39 - 80 -…
2719 runs0 likes17 downloads17 reach2 impact
5473 instances - 11 features - 5 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2834 runs0 likes4 downloads4 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2841 runs0 likes3 downloads3 reach16 impact
630 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2849 runs0 likes5 downloads5 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2853 runs1 likes7 downloads8 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2855 runs0 likes4 downloads4 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2855 runs0 likes8 downloads8 reach16 impact
542 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2860 runs0 likes5 downloads5 reach16 impact
604 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2862 runs0 likes5 downloads5 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2862 runs0 likes7 downloads7 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2865 runs1 likes18 downloads19 reach16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2866 runs0 likes6 downloads6 reach16 impact
546 instances - 10937 features - 2 classes - 0 missing values
This database contains 13 attributes (which have been extracted from a larger set of 75) Attribute Information: ------------------------ -- 1. age -- 2. sex -- 3. chest pain type (4 values) -- 4.…
3214 runs0 likes17 downloads17 reach2 impact
270 instances - 14 features - 2 classes - 0 missing values
1. Title: Haberman's Survival Data 2. Sources: (a) Donor: Tjen-Sien Lim (limt@stat.wisc.edu) (b) Date: March 4, 1999 3. Past Usage: 1. Haberman, S. J. (1976). Generalized Residuals for Log-Linear…
3241 runs1 likes19 downloads20 reach1 impact
306 instances - 4 features - 2 classes - 0 missing values
This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics, (b) its assigned insurance risk rating, (c) its normalized losses in use as…
3252 runs2 likes25 downloads27 reach2 impact
205 instances - 26 features - 6 classes - 59 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au1-1000 * Abstract: AutoUniv is an advanced data generator for classifications tasks. The aim is to reflect the nuances and heterogeneity of…
3255 runs0 likes8 downloads8 reach15 impact
1000 instances - 21 features - 2 classes - 0 missing values
This dataset was retrieved 2014-11-14 from the UCI site and converted to the ARFF format. __Major changes w.r.t. version 3: dataset from UCI that matches description and data types__ ### Feature…
4196 runs0 likes4 downloads4 reach5 impact
690 instances - 15 features - 2 classes - 0 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au4-2500 * Abstract: AutoUniv is an advanced data generator for classifications tasks. The aim is to reflect the nuances and heterogeneity of…
4222 runs0 likes7 downloads7 reach19 impact
2500 instances - 101 features - 3 classes - 0 missing values
The aim is to determine the type of arrhythmia from the ECG recordings. This database contains 279 attributes, 206 of which are linear valued and the rest are nominal. Concerning the study of H. Altay…
4428 runs0 likes48 downloads48 reach2 impact
452 instances - 280 features - 13 classes - 408 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au7-700 * Abstract: AutoUniv is an advanced data generator for classifications tasks. The aim is to reflect the nuances and heterogeneity of…
4537 runs0 likes7 downloads7 reach19 impact
700 instances - 13 features - 3 classes - 0 missing values
### Description ### This dataset is part of a collection datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
4653 runs0 likes2 downloads2 reach6 impact
44819 instances - 7 features - 3 classes - 0 missing values
This database was derived from a simple hierarchical decision model originally developed for the demonstration of DEX (M. Bohanec, V. Rajkovic: Expert system for decision making. Sistemica 1(1), pp.…
5828 runs0 likes5 downloads5 reach12 impact
1728 instances - 7 features - 4 classes - 0 missing values
Originally from the StatLog project. The raw data is still available on [UCI](https://archive.ics.uci.edu/ml/datasets/Molecular+Biology+(Splice-junction+Gene+Sequences)). The data consists of 3,186…
5988 runs0 likes3 downloads3 reach14 impact
3186 instances - 181 features - 3 classes - 0 missing values
__Major changes w.r.t. version 1: deactivated first two variables as they describe the batch of the experiments and should not be used for prediction. Also transformed the target from numeric to…
6501 runs0 likes3 downloads3 reach5 impact
540 instances - 21 features - 2 classes - 0 missing values
A dataset relating characteristics of telephony account features and usage and whether or not the customer churned. Originally used in [Discovering Knowledge in Data: An Introduction to Data…
6658 runs1 likes5 downloads6 reach15 impact
5000 instances - 21 features - 2 classes - 0 missing values
Expression levels of 77 proteins measured in the cerebral cortex of 8 classes of control and Down syndrome mice exposed to context fear conditioning, a task used to assess associative learning. The…
6779 runs0 likes0 downloads0 reach11 impact
1080 instances - 82 features - 8 classes - 1396 missing values
__Changes w.r.t. version 1: included one target factor with 7 levels as target variable for the classification. Also deleted the previous 7 binary target variables.__ A dataset of steel plates'…
7076 runs1 likes2 downloads3 reach7 impact
1941 instances - 28 features - 7 classes - 0 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au7-300-drift-au7-cpd1-800 * Abstract: AutoUniv is an advanced data generator for classifications tasks. The aim is to reflect the nuances and…
7130 runs0 likes11 downloads11 reach27 impact
1100 instances - 13 features - 5 classes - 0 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au7-cpd1-500 * Abstract: AutoUniv is an advanced data generator for classifications tasks. The aim is to reflect the nuances and heterogeneity…
7145 runs0 likes7 downloads7 reach27 impact
500 instances - 13 features - 5 classes - 0 missing values
No data.
7302 runs0 likes10 downloads10 reach2 impact
226 instances - 70 features - 24 classes - 317 missing values
The instances were drawn randomly from a database of 7 outdoor images. The images were hand-segmented to create a classification for every pixel. Each instance is a 3x3 region. __Major changes w.r.t.…
7680 runs0 likes2 downloads2 reach13 impact
2310 instances - 20 features - 7 classes - 0 missing values
Date: Tue, 15 Nov 88 15:44:08 EST From: stan To: aha@ICS.UCI.EDU 1. Title: Final settlements in labor negotitions in Canadian industry 2. Source Information -- Creators:…
7681 runs0 likes16 downloads16 reach2 impact
57 instances - 17 features - 2 classes - 326 missing values
__Changes w.r.t. version 1: renamed variables such that they match description.__ ### Dataset: Wilt Data Set ### Abstract: High-resolution Remote Sensing data set (Quickbird). Small number of training…
8029 runs0 likes1 downloads1 reach11 impact
4839 instances - 6 features - 2 classes - 0 missing values
This is perhaps the best known database to be found in the pattern recognition literature. Fisher's paper is a classic in the field and is referenced frequently to this day. (See Duda & Hart, for…
8061 runs8 likes112 downloads120 reach3 impact
150 instances - 5 features - 3 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. The maps were scanned in 8 bit grey value at density of 400dpi,…
9091 runs0 likes1 downloads1 reach12 impact
2000 instances - 241 features - 10 classes - 0 missing values
This database contains all legal 8-ply positions in the game of connect-4 in which neither player has won yet, and in which the next move is not forced. Attributes represent board positions on a 6x6…
9156 runs0 likes6 downloads6 reach15 impact
67557 instances - 43 features - 3 classes - 0 missing values
eating
9413 runs0 likes15 downloads15 reach43 impact
945 instances - 6374 features - 7 classes - 0 missing values
The KDD Cup 2009 offers the opportunity to work on large marketing databases from the French Telecom company Orange to predict the propensity of customers to switch provider (churn). Churn (wikipedia…
10982 runs0 likes15 downloads15 reach17 impact
50000 instances - 231 features - 2 classes - 8024152 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au6-1000 * Abstract: AutoUniv is an advanced data generator for classifications tasks. The aim is to reflect the nuances and heterogeneity of…
11010 runs0 likes16 downloads16 reach39 impact
1000 instances - 41 features - 8 classes - 0 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au6-250-drift-au6-cd1-500 * Abstract: AutoUniv is an advanced data generator for classifications tasks. The aim is to reflect the nuances and…
11011 runs0 likes9 downloads9 reach39 impact
750 instances - 41 features - 8 classes - 0 missing values
Datasets from ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php) KDD Cup 2009 http://www.kddcup-orange.com Converted to ARFF format by TunedIT Customer Relationship Management (CRM) is a key element…
11301 runs0 likes12 downloads12 reach17 impact
50000 instances - 231 features - 2 classes - 8024152 missing values
This simple domain contains 7 Boolean attributes and 10 classes, the set of decimal digits. Recall that LED displays contain 7 light-emitting diodes -- hence the reason for 7 attributes. The class…
13005 runs0 likes7 downloads7 reach10 impact
500 instances - 8 features - 10 classes - 0 missing values
The MNIST database of handwritten digits with 784 features, raw data available at: http://yann.lecun.com/exdb/mnist/. It can be split in a training set of the first 60,000 examples, and a test set of…
13219 runs2 likes61 downloads63 reach21 impact
70000 instances - 785 features - 10 classes - 0 missing values
Prediction task is to determine whether a person makes over 50K a year. Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the…
13600 runs1 likes17 downloads18 reach26 impact
48842 instances - 15 features - 2 classes - 6465 missing values
The original Annealing dataset from UCI. The exact meaning of the features and classes is largely unknown. Annealing, in metallurgy and materials science, is a heat treatment that alters the physical…
13767 runs0 likes16 downloads16 reach2 impact
898 instances - 39 features - 5 classes - 22175 missing values
### Attribute Information * The first column is the class label (1 for signal, 0 for background) * 21 low-level features (kinematic properties): lepton pT, lepton eta, lepton phi, missing energy…
14235 runs1 likes7 downloads8 reach17 impact
98050 instances - 29 features - 2 classes - 9 missing values
PRO FOOTBALL SCORES (raw data appears after the description below) How well do the oddsmakers of Las Vegas predict the outcome of professional football games? Is there really a home field advantage -…
15927 runs0 likes19 downloads19 reach17 impact
672 instances - 10 features - 2 classes - 1200 missing values
Data on educational transitions for a sample of 500 Irish schoolchildren aged 11 in 1967. The data were collected by Greaney and Kelleghan (1984), and reanalyzed by Raftery and Hout (1985, 1993). ###…
16022 runs0 likes15 downloads15 reach17 impact
500 instances - 6 features - 2 classes - 32 missing values
### Description This dataset describes mushrooms in terms of their physical characteristics. They are classified into: poisonous or edible. ### Source ``` (a) Origin: Mushroom records are drawn from…
16392 runs1 likes40 downloads41 reach3 impact
8124 instances - 23 features - 2 classes - 2480 missing values
####1. Summary This dataset contain attributes of dresses and their recommendations according to their sales. Sales are monitor on the basis of alternate days. The attributes present analyzed are:…
18096 runs0 likes5 downloads5 reach10 impact
500 instances - 13 features - 2 classes - 835 missing values
####1. Summary This database was generated by the Laboratory of Image Processing and Pattern Recognition (INPG-LTIRF) in the development of the Esprit project ELENA No. 6891 and the Esprit working…
18284 runs0 likes11 downloads11 reach10 impact
5500 instances - 41 features - 11 classes - 0 missing values
### Description Gas Sensor Array Drift Dataset Data Set ### Sources ``` (a) Creators: Alexander Vergara (vergara '@' ucsd.edu) BioCircutis Institute University of California San Diego San Diego,…
18352 runs0 likes20 downloads20 reach34 impact
13910 instances - 129 features - 6 classes - 0 missing values
Data on tree growth used in the Case Study published in the September, 1995 issue of the Canadian Journal of Statistics. This data set was been provided by Dr. Fernando Camacho, Ontario Hydro…
18456 runs1 likes15 downloads16 reach31 impact
2796 instances - 35 features - 6 classes - 68100 missing values
This is a PROMISE data set made publicly available in order to encourage repeatable, verifiable, refutable, and/or improvable predictive models of software engineering. If you publish material based…
19125 runs0 likes18 downloads18 reach19 impact
10885 instances - 22 features - 2 classes - 25 missing values
Attribute information: ``` sick, negative. | classes age: continuous. sex: M, F. on thyroxine: f, t. query on thyroxine: f, t. on antithyroid medication: f, t. sick: f, t. pregnant: f, t. thyroid…
19125 runs0 likes31 downloads31 reach1 impact
3772 instances - 30 features - 2 classes - 6064 missing values
Generator generating 3 classes of waves. Each class is generated from a combination of 2 of 3 "base" waves. For details, see Breiman,L., Friedman,J.H., Olshen,R.A., and Stone,C.J. (1984).…
19670 runs1 likes53 downloads54 reach2 impact
5000 instances - 41 features - 3 classes - 0 missing values
### Description Synthetic Control Chart Time Series. This is actually time series classification. ### Sources ``` * Original Owner and Donor Dr Robert Alcock rob@skyblue.csd.auth.gr ``` ### Dataset…
20354 runs0 likes10 downloads10 reach40 impact
600 instances - 62 features - 6 classes - 0 missing values
### Description Cylinder bands UCI dataset - Process delays known as cylinder banding in rotogravure printing were substantially mitigated using control rules discovered by decision tree induction.…
20639 runs0 likes8 downloads8 reach18 impact
540 instances - 40 features - 2 classes - 999 missing values
Human Activity Recognition (HAR) database built from the recordings of 30 subjects performing activities of daily living (ADL) while carrying a waist-mounted smartphone with embedded inertial sensors.…
21322 runs0 likes23 downloads23 reach34 impact
10299 instances - 562 features - 6 classes - 0 missing values
The data were collected as the SCITOS G5 robot navigates through the room following the wall in a clockwise direction, for 4 rounds, using 24 ultrasound sensors arranged circularly around its 'waist'.…
22309 runs0 likes20 downloads20 reach26 impact
5456 instances - 25 features - 4 classes - 0 missing values
Primate splice-junction gene sequences (DNA) with associated imperfect domain theory. Splice junctions are points on a DNA sequence at which 'superfluous' DNA is removed during the process of protein…
22842 runs1 likes14 downloads15 reach1 impact
3190 instances - 61 features - 3 classes - 0 missing values
1. Title: Contraceptive Method Choice 2. Sources: (a) Origin: This dataset is a subset of the 1987 National Indonesia Contraceptive Prevalence Survey (b) Creator: Tjen-Sien Lim (limt@stat.wisc.edu)…
23111 runs0 likes19 downloads19 reach1 impact
1473 instances - 10 features - 3 classes - 0 missing values
The instances were drawn randomly from a database of 7 outdoor images. The images were hand-segmented to create a classification for every pixel. Each instance is a 3x3 region. ### Attribute…
23138 runs0 likes22 downloads22 reach2 impact
2310 instances - 20 features - 7 classes - 0 missing values
This dataset records 640 time series of 12 LPC cepstrum coefficients taken from nine male speakers. The data was collected for examining our newly developed classifier for multidimensional curves…
23156 runs0 likes11 downloads11 reach46 impact
9961 instances - 15 features - 9 classes - 0 missing values
Creators: Renata Cristina Barros Madeo (Madeo, R. C. B.) Priscilla Koch Wagner (Wagner, P. K.) Sarajane Marques Peres (Peres, S. M.) {renata.si, priscilla.wagner, sarajane} at usp.br…
23168 runs1 likes14 downloads15 reach29 impact
9873 instances - 33 features - 5 classes - 0 missing values
2126 fetal cardiotocograms (CTGs) were automatically processed and the respective diagnostic features measured. The CTGs were also classified by three expert obstetricians and a consensus…
24175 runs4 likes27 downloads31 reach48 impact
2126 instances - 36 features - 10 classes - 0 missing values
This file concerns credit card applications. All attribute names and values have been changed to meaningless symbols to protect the confidentiality of the data. This dataset is interesting because…
24246 runs1 likes31 downloads32 reach2 impact
690 instances - 16 features - 2 classes - 67 missing values
This database has been artificially generated. It describes the structure of the capital letters A, C, D, E, F, G, H, L, P, R, indicated by a number 1-10, in that order (A=1,C=2,...). Each letter's…
24305 runs0 likes10 downloads10 reach49 impact
10218 instances - 8 features - 10 classes - 0 missing values
Source: James P Bridge, Sean B Holden and Lawrence C Paulson University of Cambridge Computer Laboratory William Gates Building 15 JJ Thomson Avenue Cambridge CB3 0FD UK +44 (0)1223 763500…
24399 runs1 likes20 downloads21 reach34 impact
6118 instances - 52 features - 6 classes - 0 missing values
Current dataset was adapted to ARFF format from the UCI version. Sample code ID's were removed. ! Note that there is also a related Breast Cancer Wisconsin (Diagnosis) Data Set with a different set of…
25224 runs1 likes18 downloads19 reach1 impact
699 instances - 10 features - 2 classes - 16 missing values
Speaker independent recognition of the eleven steady state vowels of British English using a specified training set of lpc derived log area ratios. Collected by David Deterding (data and…
25626 runs0 likes14 downloads14 reach35 impact
990 instances - 13 features - 11 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
25956 runs0 likes7 downloads7 reach26 impact
841 instances - 71 features - 4 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. The maps were scanned in 8 bit grey value at density of 400dpi,…
26227 runs0 likes17 downloads17 reach3 impact
2000 instances - 241 features - 10 classes - 0 missing values