OpenML
This data was gathered from participants in experimental speed dating events from 2002-2004. During the events, the attendees would have a four-minute "first date" with every other participant of the…
27285 runs - 16 likes - 144 downloads - 160 reach - 23 impact
8378 instances - 123 features - 2 classes - 18372 missing values
This dataset classifies people described by a set of attributes as good or bad credit risks. This dataset comes with a cost matrix:
```
        Good  Bad  (predicted)
Good    0     1    (actual)
Bad     5     0
```
It is worse…
501903 runs - 10 likes - 132 downloads - 142 reach - 2 impact
1000 instances - 21 features - 2 classes - 0 missing values
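The cost matrix in the credit-risk entry above can be applied to a set of predictions as in the minimal sketch below. It assumes the matrix is read with rows as actual classes and columns as predicted classes, as the "(actual)" and "(predicted)" markers suggest. The function name `total_cost` and the toy labels are hypothetical and are not part of the dataset.

```python
# Cost matrix from the listing, keyed as (actual, predicted).
# Assumption: rows are actual classes, columns are predicted classes.
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 1,
    ("bad", "good"): 5,
    ("bad", "bad"): 0,
}


def total_cost(actual, predicted):
    """Sum the misclassification cost over paired actual/predicted labels."""
    return sum(COST[(a, p)] for a, p in zip(actual, predicted))


# Toy example (hypothetical labels, not taken from the dataset).
actual = ["good", "good", "bad", "bad"]
predicted = ["good", "bad", "good", "bad"]
print(total_cost(actual, predicted))
```

Under this reading, predicting "good" for a customer who is actually "bad" costs five times as much as the opposite mistake.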
This is perhaps the best known database to be found in the pattern recognition literature. Fisher's paper is a classic in the field and is referenced frequently to this day. (See Duda & Hart, for…
8045 runs - 6 likes - 96 downloads - 102 reach - 2 impact
150 instances - 5 features - 3 classes - 0 missing values
Mammography dataset Past Usage: 1. Woods, K., Doss, C., Bowyer, K., Solka, J., Priebe, C.,
215 runs - 4 likes - 46 downloads - 50 reach - 13 impact
11183 instances - 7 features - 2 classes - 0 missing values
The aim of this dataset is to distinguish between nasal (class 0) and oral sounds (class 1). Five different attributes were chosen to characterize each vowel: they are the amplitudes of the five first…
212714 runs - 4 likes - 24 downloads - 28 reach - 19 impact
5404 instances - 6 features - 2 classes - 0 missing values
1. Title: Pima Indians Diabetes Database 2. Sources: (a) Original owners: National Institute of Diabetes and Digestive and Kidney Diseases (b) Donor of database: Vincent Sigillito…
196376 runs - 4 likes - 71 downloads - 75 reach - 0 impact
768 instances - 9 features - 2 classes - 0 missing values
This radar data was collected by a system in Goose Bay, Labrador. This system consists of a phased array of 16 high-frequency antennas with a total transmitted power on the order of 6.4 kilowatts. See…
2484 runs - 3 likes - 26 downloads - 29 reach - 0 impact
351 instances - 35 features - 2 classes - 0 missing values
The dataset (originally named ELEC2) contains 45,312 instances dated from 7 May 1996 to 5 December 1998. Each example of the dataset refers to a period of 30 minutes, i.e. there are 48 instances for…
104426 runs - 3 likes - 26 downloads - 29 reach - 0 impact
45312 instances - 9 features - 2 classes - 0 missing values
SPAM E-mail Database The "spam" concept is diverse: advertisements for products/websites, make money fast schemes, chain letters, pornography... Our collection of spam e-mails came from our postmaster…
155338 runs - 3 likes - 76 downloads - 79 reach - 0 impact
4601 instances - 58 features - 2 classes - 0 missing values
2126 fetal cardiotocograms (CTGs) were automatically processed and the respective diagnostic features measured. The CTGs were also classified by three expert obstetricians and a consensus…
23009 runs - 3 likes - 25 downloads - 28 reach - 47 impact
2126 instances - 36 features - 10 classes - 0 missing values
Data taken from the Blood Transfusion Service Center in Hsin-Chu City in Taiwan -- this is a classification problem. To demonstrate the RFMTC marketing model (a modified version of RFM), this study…
461234 runs - 3 likes - 48 downloads - 51 reach - 24 impact
748 instances - 5 features - 2 classes - 0 missing values
A simple database containing 17 Boolean-valued attributes describing animals. The "type" attribute appears to be the class attribute. Notes: * I find it unusual that there are 2 instances of "frog"…
168 runs - 2 likes - 14 downloads - 16 reach - 0 impact
101 instances - 18 features - 7 classes - 0 missing values
The database consists of the multi-spectral values of pixels in 3x3 neighbourhoods in a satellite image, and the classification associated with the central pixel in each neighbourhood. The aim is to…
21038 runs - 2 likes - 23 downloads - 25 reach - 0 impact
6430 instances - 37 features - 6 classes - 0 missing values
Oil dataset Past Usage: 1. Kubat, M., Holte, R.,
200 runs - 2 likes - 16 downloads - 18 reach - 13 impact
937 instances - 50 features - 2 classes - 0 missing values
Description: One-hundred plant species leaves dataset (Class = Texture). Sources: (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
139260 runs - 2 likes - 58 downloads - 60 reach - 408 impact
1599 instances - 65 features - 100 classes - 0 missing values
No data.
90 runs - 2 likes - 3 downloads - 5 reach - 0 impact
663552 instances - 13 features - 2 classes - 0 missing values
All data is from one continuous EEG measurement with the Emotiv EEG Neuroheadset. The duration of the measurement was 117 seconds. The eye state was detected via a camera during the EEG measurement…
162175 runs - 2 likes - 85 downloads - 87 reach - 17 impact
14980 instances - 15 features - 2 classes - 0 missing values
An artificial data set where instances belong to several clusters with a banana shape. There are two attributes At1 and At2 corresponding to the x and y axis, respectively. The class label (-1 and 1)…
163 runs - 2 likes - 15 downloads - 17 reach - 5 impact
5300 instances - 3 features - 2 classes - 0 missing values
This is an artificial data set used in Friedman (1991) and also described in Breiman (1996,p.139). The cases are generated using the following method: Generate the values of 10 attributes, X1, ...,…
0 runs - 2 likes - 7 downloads - 9 reach - 4 impact
40768 instances - 11 features - 0 classes - 0 missing values
The MNIST database of handwritten digits with 784 features, raw data available at: http://yann.lecun.com/exdb/mnist/. It can be split in a training set of the first 60,000 examples, and a test set of…
12662 runs - 2 likes - 57 downloads - 59 reach - 16 impact
70000 instances - 785 features - 10 classes - 0 missing values
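The split described in the MNIST entry above can be sketched as a plain positional slice. The listing truncates the description of the test set, so this sketch assumes the test set is simply every row after the first 60,000 (10,000 rows out of the 70000 instances). The function name `split_first_n` and the integer stand-in data are hypothetical.

```python
def split_first_n(rows, n_train=60_000):
    """Return (train, test): the first n_train rows and everything after them."""
    return rows[:n_train], rows[n_train:]


# Stand-in for the 70000 instances; real rows would hold pixel values and a label.
rows = list(range(70_000))
train, test = split_first_n(rows)
print(len(train), len(test))
```

Because the split is positional, it only reproduces the intended partition if the rows keep their original order.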
The first 5 variables are all blood tests which are thought to be sensitive to liver disorders that might arise from excessive alcohol consumption. Each line in the dataset constitutes the record of a…
150 runs - 2 likes - 30 downloads - 32 reach - 0 impact
345 instances - 7 features - 0 classes - 0 missing values
The satellite dataset comprises features extracted from satellite observations. In particular, each image was taken under four different light wavelengths, two in visible light (green and red) and…
2074 runs - 2 likes - 64 downloads - 66 reach - 18 impact
5100 instances - 37 features - 2 classes - 0 missing values
Dataset creator and donator: Zhi Liu, e-mail: liuzhi8673 '@' gmail.com, institution: National Engineering Research Center for E-Learning, Hubei Wuhan, China Data Set Information: dataset are derived…
65168 runs - 2 likes - 40 downloads - 42 reach - 206 impact
1500 instances - 10001 features - 50 classes - 0 missing values
Current dataset was adapted to ARFF format from the UCI version. Sample code IDs were removed. Note that there is also a related Breast Cancer Wisconsin (Diagnosis) Data Set with a different set of…
18908 runs - 1 likes - 16 downloads - 17 reach - 0 impact
699 instances - 10 features - 2 classes - 16 missing values
No data.
48 runs - 1 likes - 4 downloads - 5 reach - 0 impact
1000000 instances - 77 features - 10 classes - 0 missing values
NAME vehicle silhouettes PURPOSE to classify a given silhouette as one of four types of vehicle, using a set of features extracted from the silhouette. The vehicle may be viewed from one of many…
20914 runs - 1 likes - 23 downloads - 24 reach - 0 impact
846 instances - 19 features - 4 classes - 0 missing values
Generator generating 3 classes of waves. Each class is generated from a combination of 2 of 3 "base" waves. For details, see Breiman,L., Friedman,J.H., Olshen,R.A., and Stone,C.J. (1984).…
15586 runs - 1 likes - 53 downloads - 54 reach - 0 impact
5000 instances - 41 features - 3 classes - 0 missing values
This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics, (b) its assigned insurance risk rating, (c) its normalized losses in use as…
2493 runs - 1 likes - 24 downloads - 25 reach - 0 impact
205 instances - 26 features - 6 classes - 59 missing values
No data.
51 runs - 1 likes - 4 downloads - 5 reach - 0 impact
1000000 instances - 48 features - 10 classes - 0 missing values
No data.
326 runs - 1 likes - 4 downloads - 5 reach - 0 impact
1000000 instances - 23 features - 2 classes - 0 missing values
NAME: Sonar, Mines vs. Rocks SUMMARY: This is the data set used by Gorman and Sejnowski in their study of the classification of sonar signals using a neural network [1]. The task is to train a network…
2366 runs - 1 likes - 22 downloads - 23 reach - 0 impact
208 instances - 61 features - 2 classes - 0 missing values
1. Title: Haberman's Survival Data 2. Sources: (a) Donor: Tjen-Sien Lim (limt@stat.wisc.edu) (b) Date: March 4, 1999 3. Past Usage: 1. Haberman, S. J. (1976). Generalized Residuals for Log-Linear…
3241 runs - 1 likes - 18 downloads - 19 reach - 0 impact
306 instances - 4 features - 2 classes - 0 missing values
No data.
65 runs - 1 likes - 2 downloads - 3 reach - 0 impact
1000000 instances - 18 features - 7 classes - 0 missing values
Normalized version of the Forest Covertype dataset (see version 1), so that the numerical values are between 0 and 1. Contains the forest cover type for 30 x 30 meter cells obtained from US Forest…
319 runs - 1 likes - 39 downloads - 40 reach - 0 impact
581012 instances - 55 features - 7 classes - 0 missing values
No data.
314 runs - 1 likes - 8 downloads - 9 reach - 0 impact
1000000 instances - 36 features - 19 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
1187 runs - 1 likes - 10 downloads - 11 reach - 0 impact
412 instances - 9 features - 7 classes - 96 missing values
This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics; (b) its assigned insurance risk rating; (c) its normalized losses in use as…
6 runs - 1 likes - 4 downloads - 5 reach - 0 impact
159 instances - 16 features - 0 classes - 0 missing values
Compilation of promoters with known transcriptional start points for E. coli genes. The task is to recognize promoters in strings that represent nucleotides (one of A, G, T, or C). A promoter is a…
138 runs - 1 likes - 9 downloads - 10 reach - 0 impact
106 instances - 59 features - 2 classes - 0 missing values
Case number deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning…
10 runs - 1 likes - 2 downloads - 3 reach - 0 impact
195 instances - 12 features - 0 classes - 2 missing values
This database was designed on the basis of data provided by US Census Bureau [http://www.census.gov] (under Lookup Access [http://www.census.gov/cdrom/lookup]: Summary Tape File 1). The data were…
2 runs - 1 likes - 3 downloads - 4 reach - 0 impact
22784 instances - 9 features - 0 classes - 0 missing values
Title: Communities and Crime Abstract: Communities within the United States. The data combines socio-economic data from the 1990 US Census, law enforcement data from the 1990 US LEMAS survey, and…
0 runs - 1 likes - 2 downloads - 3 reach - 4 impact
1994 instances - 128 features - 0 classes - 39202 missing values
Source: Ashwin Srinivasan Department of Statistics and Data Modeling University of Strathclyde Glasgow Scotland UK ross '@' uk.ac.turing The original Landsat data for this database was generated from…
1 runs - 1 likes - 6 downloads - 7 reach - 6 impact
6435 instances - 37 features - 0 classes - 0 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
2 runs - 1 likes - 2 downloads - 3 reach - 0 impact
8192 instances - 13 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 1 likes - 1 downloads - 2 reach - 4 impact
4450 instances - 203 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 1 likes - 0 downloads - 1 reach - 4 impact
8885 instances - 252 features - 0 classes - 0 missing values
Internet Usage Data. Data Type: multivariate. Abstract: This data contains general demographic information on internet users in 1997. Sources: Original Owner: Graphics, Visualization, & Usability Center…
0 runs - 1 likes - 5 downloads - 6 reach - 2 impact
10108 instances - 72 features - 46 classes - 2699 missing values
SPECT heart data This is a merged version of the separate train and test set which are usually distributed. On OpenML this train-test split can be found as one of the possible tasks. Sources: --…
1296 runs - 1 likes - 12 downloads - 13 reach - 7 impact
267 instances - 23 features - 2 classes - 0 missing values
El Nino Data. Data Type: spatio-temporal. Abstract: The data set contains oceanographic and surface meteorological readings taken from a series of buoys positioned throughout the equatorial Pacific. The…
0 runs - 1 likes - 3 downloads - 4 reach - 4 impact
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2851 runs - 1 likes - 7 downloads - 8 reach - 15 impact
1545 instances - 10937 features - 2 classes - 0 missing values
One of the NASA Metrics Data Program defect data sets. Data from software for storage management for receiving and processing ground data. Data comes from McCabe and Halstead features extractors of…
150513 runs - 1 likes - 20 downloads - 21 reach - 18 impact
2109 instances - 22 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
70 runs - 1 likes - 6 downloads - 7 reach - 7 impact
1545 instances - 10937 features - 2 classes - 0 missing values
No data.
337 runs - 1 likes - 2 downloads - 3 reach - 0 impact
1000000 instances - 13 features - 3 classes - 0 missing values
* Dataset Title: Wall-Following Robot Navigation Data Data Set (version with 4 Attributes) * Abstract: The data were collected as the SCITOS G5 robot navigates through the room following the wall in a…
138 runs - 1 likes - 6 downloads - 7 reach - 5 impact
5456 instances - 5 features - 4 classes - 0 missing values
No data.
27 runs - 1 likes - 4 downloads - 5 reach - 0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs - 1 likes - 3 downloads - 4 reach - 0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
* Donor: David W. Aha (aha '@' ics.uci.edu) (714) 856-8779 * Data Set Information: This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In…
159 runs - 1 likes - 4 downloads - 5 reach - 4 impact
200 instances - 14 features - 5 classes - 0 missing values
* Dataset Title: MicroMass - Mixed (mixed spectra version) * Abstract: A dataset to explore machine learning approaches for the identification of microorganisms from mass-spectrometry data. * Source:…
64 runs - 1 likes - 4 downloads - 5 reach - 4 impact
360 instances - 1301 features - 10 classes - 0 missing values
Description: MicroMass (pure spectra version) is a dataset to explore machine learning approaches for the identification of microorganisms from mass-spectrometry data. Source: Pierre Mahé,…
36484 runs - 1 likes - 14 downloads - 15 reach - 87 impact
571 instances - 1301 features - 20 classes - 0 missing values
Source: The dataset was created by Angeliki Xifara (angxifara @ gmail.com, Civil/Structural Engineer) and was processed by Athanasios Tsanas (tsanasthanasis @ gmail.com, Oxford Centre for Industrial…
103 runs - 1 likes - 4 downloads - 5 reach - 4 impact
768 instances - 10 features - 37 classes - 0 missing values
* Title: Skin Segmentation Data Set * Abstract: The Skin Segmentation dataset is constructed over B, G, R color space. Skin and Nonskin dataset is generated using skin textures from face images of…
15 runs - 1 likes - 9 downloads - 10 reach - 5 impact
245057 instances - 4 features - 2 classes - 0 missing values
Description: One-hundred plant species leaves dataset (Class = Margin). Sources: (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
139168 runs - 1 likes - 13 downloads - 14 reach - 409 impact
1600 instances - 65 features - 100 classes - 0 missing values
Description: One-hundred plant species leaves dataset (Class = Shape). Sources: (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
139377 runs - 1 likes - 32 downloads - 33 reach - 407 impact
1600 instances - 65 features - 100 classes - 0 missing values
Data Set Information: The data has been produced using Monte Carlo simulations. The first 21 features (columns 2-22) are kinematic properties measured by the particle detectors in the accelerator. The…
0 runs - 1 likes - 5 downloads - 6 reach - 4 impact
98050 instances - 29 features - 0 classes - 9 missing values
Source: Original Owner: U.S. Census Bureau http://www.census.gov/ United States Department of Commerce Donor: Terran Lane and Ronny Kohavi Data Mining and Visualization Silicon Graphics. terran '@'…
0 runs - 1 likes - 5 downloads - 6 reach - 4 impact
299285 instances - 42 features - classes - 0 missing values
Creators: Renata Cristina Barros Madeo (Madeo, R. C. B.) Priscilla Koch Wagner (Wagner, P. K.) Sarajane Marques Peres (Peres, S. M.) {renata.si, priscilla.wagner, sarajane} at usp.br…
17505 runs - 1 likes - 14 downloads - 15 reach - 27 impact
9873 instances - 33 features - 5 classes - 0 missing values
Concrete is the most important material in civil engineering. The concrete compressive strength is a highly nonlinear function of age and ingredients. These ingredients include cement, blast furnace…
0 runs - 1 likes - 2 downloads - 3 reach - 2 impact
1030 instances - 9 features - classes - 0 missing values
Data on tree growth used in the Case Study published in the September, 1995 issue of the Canadian Journal of Statistics. This data set was provided by Dr. Fernando Camacho, Ontario Hydro…
14764 runs - 1 likes - 14 downloads - 15 reach - 30 impact
2796 instances - 35 features - 6 classes - 68100 missing values
### Attribute Information * The first column is the class label (1 for signal, 0 for background) * 21 low-level features (kinematic properties): lepton pT, lepton eta, lepton phi, missing energy…
10775 runs - 1 likes - 6 downloads - 7 reach - 13 impact
98050 instances - 29 features - 2 classes - 9 missing values
wine-quality-red-pmlb
31 runs - 1 likes - 0 downloads - 1 reach - 8 impact
1599 instances - 12 features - 6 classes - 0 missing values
Over 92 thousand images (32x32 pixels) of 46 characters from Devanagari script. Includes the alphabet as well as the numbers. Devanagari is an Indic script and forms a basis for over 100 languages…
9 runs - 1 likes - 6 downloads - 7 reach - 3 impact
92000 instances - 1025 features - 46 classes - 0 missing values
Author: Volker Lohweg (University of Applied Sciences, Ostwestfalen-Lippe) Source: [UCI](https://archive.ics.uci.edu/ml/datasets/banknote+authentication) - 2012 Please cite:…
127792 runs - 1 likes - 19 downloads - 20 reach - 19 impact
1372 instances - 5 features - 2 classes - 0 missing values
This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
106580 runs - 1 likes - 18 downloads - 19 reach - 18 impact
15545 instances - 6 features - 2 classes - 0 missing values
This data set was generated to model psychological experimental results. Each example is classified as having the balance scale tip to the right, tip to the left, or be balanced. The attributes are…
20364 runs - 1 likes - 14 downloads - 15 reach - 0 impact
625 instances - 5 features - 3 classes - 0 missing values
This is a 10% stratified subsample of the data from the 1999 ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php). Modified by TunedIT (converted to ARFF format)…
25 runs - 1 likes - 33 downloads - 34 reach - 5 impact
494020 instances - 42 features - 23 classes - 0 missing values
* Abstract: Oxford Parkinson's Disease Detection Dataset * Source: The dataset was created by Max Little of the University of Oxford, in collaboration with the National Centre for Voice and Speech,…
179 runs - 1 likes - 14 downloads - 15 reach - 5 impact
195 instances - 23 features - 2 classes - 0 missing values
No data.
416 runs - 1 likes - 13 downloads - 14 reach - 51 impact
1050 instances - 3239 features - 10 classes - 0 missing values
This database was designed on the basis of data provided by US Census Bureau [http://www.census.gov] (under Lookup Access [http://www.census.gov/cdrom/lookup]: Summary Tape File 1). The data were…
0 runs - 1 likes - 6 downloads - 7 reach - 4 impact
22784 instances - 17 features - 0 classes - 0 missing values
The data consist of 2001 observations taken from a balloon about 30 kilometres above the surface of the earth. In the section of the flight shown here the balloon increases in height. As radiation…
0 runs - 1 likes - 2 downloads - 3 reach - 4 impact
2001 instances - 3 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
608 runs - 1 likes - 9 downloads - 10 reach - 6 impact
1000 instances - 26 features - 2 classes - 0 missing values
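The mean-based binarization described in the entry above can be sketched as follows. The listing truncates which side of the mean receives which label, so the labels "low" and "high" and the treatment of values exactly equal to the mean are assumptions made for illustration only. The function name `binarize_by_mean` is hypothetical.

```python
from statistics import mean


def binarize_by_mean(targets, low_label="low", high_label="high"):
    """Map each numeric target to one of two labels by comparing it to the mean.

    Assumption: values strictly below the mean get low_label, all others
    (including values equal to the mean) get high_label.
    """
    threshold = mean(targets)
    return [low_label if t < threshold else high_label for t in targets]


# Toy targets (hypothetical); their mean is 4.0.
labels = binarize_by_mean([1.0, 2.0, 3.0, 10.0])
print(labels)
```

A single large target value shifts the mean upward, so the two resulting classes can be quite unbalanced.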
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
808 runs - 1 likes - 9 downloads - 10 reach - 5 impact
100 instances - 26 features - 2 classes - 0 missing values
Automated file upload of BNG(optdigits)
100 runs - 1 likes - 1 downloads - 2 reach - 0 impact
1000000 instances - 65 features - 10 classes - 0 missing values
Automated file upload of BNG(ionosphere)
99 runs - 1 likes - 4 downloads - 5 reach - 0 impact
1000000 instances - 35 features - 2 classes - 0 missing values
Multi-label dataset. Audio dataset (emotions) consists of 593 musical files with 6 clustered emotional labels and 72 predictors. Each song can be labeled with one or more of the labels…
0 runs - 1 likes - 5 downloads - 6 reach - 2 impact
593 instances - 78 features - 2 classes - 0 missing values
Citation Request: This dataset is publicly available for research. The details are described in [Cortez et al., 2009]. Please include this citation if you plan to use this database: P. Cortez, A.…
64 runs - 1 likes - 5 downloads - 6 reach - 4 impact
4898 instances - 12 features - 7 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
866 runs - 1 likes - 11 downloads - 12 reach - 7 impact
7129 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
143 runs - 1 likes - 10 downloads - 11 reach - 6 impact
531 instances - 103 features - 2 classes - 0 missing values
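The majority-class re-labeling described in the entry above can be sketched as follows. The listing states that the majority class becomes positive ('P') but truncates what happens to the remaining classes, so mapping every other class to 'N' is an assumption. The function name `relabel_majority` is hypothetical.

```python
from collections import Counter


def relabel_majority(labels, positive="P", negative="N"):
    """Relabel the most frequent class as positive and all other classes as negative.

    Assumption: every non-majority class maps to the single label `negative`.
    If several classes tie for most frequent, the one seen first is used.
    """
    majority, _count = Counter(labels).most_common(1)[0]
    return [positive if y == majority else negative for y in labels]


# Toy multi-class labels (hypothetical); "a" is the majority class.
binary = relabel_majority(["a", "b", "a", "c", "a"])
print(binary)
```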
Predict a biological response of molecules from their chemical properties. Each row in this data set represents a molecule. The first column contains experimental data describing an actual biological…
40845 runs - 1 likes - 32 downloads - 33 reach - 19 impact
3751 instances - 1777 features - 2 classes - 0 missing values
Multi-label dataset for text-classification. It consists of article titles and partial blurbs. Blurbs can be assigned to several categories (e.g. Science, News, Games) based on word predictors.
0 runs - 1 likes - 9 downloads - 10 reach - 4 impact
3782 instances - 1101 features - 2 classes - 0 missing values
These weekly averages are ultimately based on measurements of 4 air samples per hour taken atop intake lines on several towers during steady periods of CO2 concentration of not less than 6 hours per…
0 runs - 1 likes - 1 downloads - 2 reach - 0 impact
2225 instances - 7 features - 0 classes - 0 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch) Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php Modified by TunedIT (converted to ARFF…
406 runs - 1 likes - 11 downloads - 12 reach - 7 impact
4229 instances - 1618 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
131 runs - 1 likes - 9 downloads - 10 reach - 6 impact
990 instances - 14 features - 2 classes - 0 missing values
The data is cleaned, regularized and encrypted global equity data. The first 21 columns (feature1 - feature21) are features, and target is the binary class you’re trying to predict.
284 runs - 1 likes - 1 downloads - 2 reach - 4 impact
96320 instances - 22 features - 2 classes - 0 missing values
Abstract: CART book's waveform domains Source: Original Owners: Breiman,L., Friedman,J.H., Olshen,R.A., & Stone,C.J. (1984). Classification and Regression Trees. Wadsworth International Group:…
0 runs - 1 likes - 3 downloads - 4 reach - 2 impact
5000 instances - 22 features - classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
736 runs - 1 likes - 5 downloads - 6 reach - 6 impact
452 instances - 280 features - 2 classes - 408 missing values
Citation Request: This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
2006 runs - 1 likes - 30 downloads - 31 reach - 0 impact
286 instances - 10 features - 2 classes - 9 missing values
A dataset of steel plates' faults, classified into 7 different types. The goal was to train machine learning for automatic pattern recognition. The dataset consists of 27 features describing each…
276158 runs - 1 likes - 32 downloads - 33 reach - 15 impact
1941 instances - 34 features - 2 classes - 0 missing values
QSAR biodegradation Data Set * Abstract: Data set containing values for 41 attributes (molecular descriptors) used to classify 1055 chemicals into 2 classes (ready and not ready biodegradable). *…
260170 runs - 1 likes - 16 downloads - 17 reach - 16 impact
1055 instances - 42 features - 2 classes - 0 missing values
Current dataset was adapted to ARFF format from the UCI version. Sample code IDs were removed. Note that there is also a related Breast Cancer Wisconsin (Original) Data Set with a different set of…
221394 runs - 1 likes - 32 downloads - 33 reach - 16 impact
569 instances - 31 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2865 runs - 1 likes - 16 downloads - 17 reach - 15 impact
1545 instances - 10937 features - 2 classes - 0 missing values
Dataset from the MLRR repository: http://axon.cs.byu.edu:5000/ More info: https://archive.ics.uci.edu/ml/datasets/Musk+(Version+2)
82514 runs - 1 likes - 17 downloads - 18 reach - 21 impact
6598 instances - 170 features - 2 classes - 0 missing values