OpenML
The task consists of Learning Quantitative Structure Activity Relationships (QSARs). The Inhibition of Dihydrofolate Reductase by Pyrimidines. The data are described in: King, Ross D., Muggleton,…
2 runs - 0 likes - 1 downloads - 1 reach - 0 impact
74 instances - 28 features - 0 classes - 0 missing values
This database was designed on the basis of data provided by US Census Bureau [http://www.census.gov] (under Lookup Access [http://www.census.gov/cdrom/lookup]: Summary Tape File 1). The data were…
2 runs - 1 likes - 3 downloads - 4 reach - 0 impact
22784 instances - 9 features - 0 classes - 0 missing values
No data.
307 runs - 0 likes - 3 downloads - 3 reach - 0 impact
1000000 instances - 41 features - 3 classes - 0 missing values
No data.
291 runs - 0 likes - 4 downloads - 4 reach - 0 impact
1000000 instances - 18 features - 7 classes - 0 missing values
No data.
167 runs - 0 likes - 8 downloads - 8 reach - 0 impact
399940 instances - 1002 features - 2 classes - 0 missing values
No data.
874 runs - 0 likes - 6 downloads - 6 reach - 0 impact
71 instances - 63 features - 6 classes - 0 missing values
No data.
940 runs - 0 likes - 5 downloads - 5 reach - 0 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
988 runs - 0 likes - 3 downloads - 3 reach - 0 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
400 runs - 0 likes - 6 downloads - 6 reach - 0 impact
45164 instances - 75 features - 11 classes - 0 missing values
No data.
293 runs - 0 likes - 2 downloads - 2 reach - 0 impact
1000000 instances - 17 features - 10 classes - 0 missing values
No data.
65 runs - 0 likes - 3 downloads - 3 reach - 0 impact
1000000 instances - 40 features - 2 classes - 0 missing values
No data.
309 runs - 0 likes - 6 downloads - 6 reach - 0 impact
1000000 instances - 35 features - 6 classes - 0 missing values
No data.
296 runs - 0 likes - 7 downloads - 7 reach - 0 impact
1000000 instances - 61 features - 2 classes - 0 missing values
No data.
75 runs - 0 likes - 2 downloads - 2 reach - 0 impact
137781 instances - 10 features - 7 classes - 0 missing values
No data.
310 runs - 0 likes - 2 downloads - 2 reach - 0 impact
1000000 instances - 14 features - 5 classes - 0 missing values
No data.
326 runs - 0 likes - 4 downloads - 4 reach - 0 impact
1000000 instances - 14 features - 2 classes - 0 missing values
No data.
304 runs - 0 likes - 3 downloads - 3 reach - 0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
331 runs - 0 likes - 7 downloads - 7 reach - 0 impact
1000000 instances - 20 features - 2 classes - 0 missing values
### Description Scene recognition dataset - It contains characteristics about images and their classes. The original dataset is a multi-label classification problem with 6 different labels: {Beach,…
86252 runs - 0 likes - 21 downloads - 21 reach - 16 impact
2407 instances - 300 features - 2 classes - 0 missing values
1. Title: Part of the IRAS Low Resolution Spectrometer Database 2. Sources: (a) Originator: Infra-Red Astronomy Satellite Project Database (b) Donor: John Stutz (c) Date:…
1243 runs - 0 likes - 43 downloads - 43 reach - 6 impact
531 instances - 103 features - 48 classes - 0 missing values
Title: Communities and Crime Abstract: Communities within the United States. The data combines socio-economic data from the 1990 US Census, law enforcement data from the 1990 US LEMAS survey, and…
0 runs - 1 likes - 2 downloads - 3 reach - 4 impact
1994 instances - 128 features - 0 classes - 39202 missing values
Pittsburgh bridges This version is derived from version 1 by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally not a classification dataset,…
31 runs - 0 likes - 1 downloads - 1 reach - 6 impact
105 instances - 13 features - 6 classes - 61 missing values
Pittsburgh bridges This version is derived from version 2 (the discretized version) by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally not…
31 runs - 0 likes - 2 downloads - 2 reach - 6 impact
105 instances - 13 features - 6 classes - 61 missing values
Hayes-Roth Database This is a merged version of the separate train and test set which are usually distributed. On OpenML this train-test split can be found as one of the possible tasks. Source…
380 runs - 0 likes - 3 downloads - 3 reach - 14 impact
160 instances - 5 features - 3 classes - 0 missing values
No data.
863 runs - 0 likes - 11 downloads - 11 reach - 0 impact
39366 instances - 10 features - 2 classes - 0 missing values
No data.
52 runs - 0 likes - 2 downloads - 2 reach - 0 impact
No data.
960 runs - 0 likes - 8 downloads - 8 reach - 0 impact
55296 instances - 10 features - 3 classes - 0 missing values
No data.
68 runs - 0 likes - 4 downloads - 4 reach - 0 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
326 runs - 0 likes - 4 downloads - 4 reach - 0 impact
1000000 instances - 16 features - 2 classes - 0 missing values
No data.
315 runs - 0 likes - 2 downloads - 2 reach - 0 impact
295245 instances - 11 features - 5 classes - 0 missing values
Source: Ashwin Srinivasan Department of Statistics and Data Modeling University of Strathclyde Glasgow Scotland UK ross '@' uk.ac.turing The original Landsat data for this database was generated from…
1 runs - 1 likes - 6 downloads - 7 reach - 6 impact
6435 instances - 37 features - 0 classes - 0 missing values
Information about customers consists of 86 variables and includes product usage data and socio-demographic data derived from zip area codes. The data was supplied by the Dutch data mining company…
0 runs - 0 likes - 2 downloads - 2 reach - 4 impact
9822 instances - 86 features - 0 classes - 0 missing values
University of Sao Paulo, School of Arts, Sciences and Humanities, Sao Paulo, SP, Brazil ### LIBRAS Movement Database LIBRAS, acronym of the Portuguese name "LIngua BRAsileira de Sinais", is the…
0 runs - 0 likes - 4 downloads - 4 reach - 6 impact
360 instances - 91 features - 0 classes - 0 missing values
### Description ISOLET (Isolated Letter Speech Recognition) dataset was generated as follows: 150 subjects spoke the name of each letter of the alphabet twice. Hence, there are 52 training examples…
43283 runs - 0 likes - 67 downloads - 67 reach - 120 impact
7797 instances - 618 features - 26 classes - 0 missing values
No data.
206 runs - 0 likes - 3 downloads - 3 reach - 0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
67 runs - 0 likes - 2 downloads - 2 reach - 0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
311 runs - 0 likes - 3 downloads - 3 reach - 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
65 runs - 0 likes - 8 downloads - 8 reach - 0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
310 runs - 0 likes - 4 downloads - 4 reach - 0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
Survival treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
12 runs - 0 likes - 2 downloads - 2 reach - 0 impact
130 instances - 10 features - 0 classes - 97 missing values
Tumor-size treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
0 runs - 0 likes - 3 downloads - 3 reach - 0 impact
286 instances - 10 features - 0 classes - 9 missing values
This is a family of datasets synthetically generated from a realistic simulation of the dynamics of a Unimation Puma 560 robot arm. There are eight datasets in this family. In this repository we…
2 runs - 0 likes - 5 downloads - 5 reach - 0 impact
8192 instances - 9 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Gasoline consumption is being treated as…
2 runs - 0 likes - 0 downloads - 0 reach - 0 impact
27 instances - 5 features - 0 classes - 0 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
2 runs - 1 likes - 2 downloads - 3 reach - 0 impact
8192 instances - 13 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Electricity usage is being treated as the…
4 runs - 0 likes - 0 downloads - 0 reach - 0 impact
55 instances - 3 features - 0 classes - 0 missing values
Weight treated as the class attribute. Identifier deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric…
10 runs - 0 likes - 2 downloads - 2 reach - 0 impact
158 instances - 8 features - 0 classes - 87 missing values
1. Title: Ozone Level Detection 2. Source: Kun Zhang zhang.kun05 '@' gmail.com Department of Computer Science, Xavier University of Louisiana Wei Fan wei.fan '@' gmail.com IBM T.J.Watson Research…
0 runs - 0 likes - 1 downloads - 1 reach - 4 impact
2536 instances - 73 features - 0 classes - 0 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
354 runs - 0 likes - 7 downloads - 7 reach - 5 impact
7485 instances - 61 features - 7 classes - 52048 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
434 runs - 0 likes - 10 downloads - 10 reach - 5 impact
7019 instances - 61 features - 8 classes - 48089 missing values
No data.
414 runs - 0 likes - 8 downloads - 8 reach - 50 impact
690 instances - 8262 features - 10 classes - 0 missing values
No data.
219 runs - 0 likes - 5 downloads - 5 reach - 9 impact
414 instances - 6430 features - 9 classes - 0 missing values
No data.
215 runs - 0 likes - 7 downloads - 7 reach - 9 impact
204 instances - 5833 features - 6 classes - 0 missing values
No data.
222 runs - 0 likes - 10 downloads - 10 reach - 6 impact
1504 instances - 2887 features - 13 classes - 0 missing values
No data.
428 runs - 0 likes - 12 downloads - 12 reach - 51 impact
1003 instances - 3183 features - 10 classes - 0 missing values
No data.
268 runs - 0 likes - 9 downloads - 9 reach - 35 impact
3075 instances - 12433 features - 6 classes - 0 missing values
No data.
159 runs - 0 likes - 11 downloads - 11 reach - 10 impact
1657 instances - 3759 features - 25 classes - 0 missing values
No data.
264 runs - 0 likes - 11 downloads - 11 reach - 35 impact
3204 instances - 13196 features - 6 classes - 0 missing values
No data.
211 runs - 0 likes - 4 downloads - 4 reach - 9 impact
313 instances - 5805 features - 8 classes - 0 missing values
No data.
163 runs - 0 likes - 13 downloads - 13 reach - 10 impact
1560 instances - 8461 features - 20 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
13 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 1 likes - 1 downloads - 2 reach - 4 impact
4450 instances - 203 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 1 downloads - 1 reach - 4 impact
25 instances - 10 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
30 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
26 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
79 instances - 321 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
37 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
14 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
10 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
8 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 1 likes - 0 downloads - 1 reach - 4 impact
8885 instances - 252 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
13 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
34 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
32 instances - 1143 features - 0 classes - 0 missing values
Internet Usage Data Data Type multivariate Abstract This data contains general demographic information on internet users in 1997. Sources Original Owner [1]Graphics, Visualization, & Usability Center…
0 runs - 1 likes - 5 downloads - 6 reach - 2 impact
10108 instances - 72 features - 46 classes - 2699 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
131 instances - 3 features - 3 classes - 0 missing values
This dataset records 640 time series of 12 LPC cepstrum coefficients taken from nine male speakers. The data was collected for examining our newly developed classifier for multidimensional curves…
19127 runs - 0 likes - 10 downloads - 10 reach - 45 impact
9961 instances - 15 features - 9 classes - 0 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs - 0 likes - 3 downloads - 3 reach - 8 impact
65 instances - 3 features - 2 classes - 0 missing values
### Description Synthetic Control Chart Time Series. This is actually time series classification. ### Sources ``` * Original Owner and Donor Dr Robert Alcock rob@skyblue.csd.auth.gr ``` ### Dataset…
16584 runs - 0 likes - 10 downloads - 10 reach - 38 impact
600 instances - 62 features - 6 classes - 0 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
366 runs - 0 likes - 10 downloads - 10 reach - 5 impact
8844 instances - 61 features - 7 classes - 51515 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
61 instances - 3 features - 3 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
31 instances - 54 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
80 instances - 113 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
13 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 1 downloads - 1 reach - 4 impact
20 instances - 10 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
22 instances - 40 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
274 instances - 1143 features - 0 classes - 0 missing values
Squash Harvest Unstored Data source: Winna Harvey Crop and Food Research, Christchurch, New Zealand The purpose of the research was to determine the changes taking place in squash fruit during the…
876 runs - 0 likes - 4 downloads - 4 reach - 6 impact
52 instances - 24 features - 3 classes - 39 missing values
White Clover Persistence Trials Data source: Ian Tarbotton AgResearch, Whatawhata Research Centre, Hamilton, New Zealand The objective was to determine the mechanisms which influence the persistence…
858 runs - 0 likes - 4 downloads - 4 reach - 6 impact
63 instances - 32 features - 4 classes - 0 missing values
Data originating from the book "Analyzing Categorical Data" by Jeffrey S. Simonoff.
1085 runs - 0 likes - 9 downloads - 9 reach - 6 impact
50 instances - 5 features - 2 classes - 0 missing values
Fast training of support vector machines using sequential minimal optimization. In Bernhard Schölkopf, Christopher J. C. Burges, and Alexander J. Smola, editors, Advances in Kernel Methods - Support…
564 runs - 0 likes - 11 downloads - 11 reach - 14 impact
36974 instances - 124 features - 2 classes - 0 missing values
Once upon a time, in July 1991, the monks of Corsendonk Priory were faced with a school held in their priory, namely the 2nd European Summer School on Machine Learning. After listening more than one…
104971 runs - 0 likes - 13 downloads - 13 reach - 23 impact
554 instances - 7 features - 2 classes - 0 missing values
SPECT heart data This is a merged version of the separate train and test set which are usually distributed. On OpenML this train-test split can be found as one of the possible tasks. Sources: --…
1296 runs - 1 likes - 12 downloads - 13 reach - 7 impact
267 instances - 23 features - 2 classes - 0 missing values
Grass Grubs and Damage Ranking Data source: R. J. Townsend AgResearch, Lincoln, New Zealand Grass grubs are one of the major insect pests of pasture in Canterbury and can cause severe pasture damage…
988 runs - 0 likes - 8 downloads - 8 reach - 6 impact
155 instances - 9 features - 4 classes - 0 missing values
Pasture Production Data source: Dave Barker AgResearch Grasslands, Palmerston North, New Zealand The objective was to predict pasture production from a variety of biophysical factors. Vegetation and…
878 runs - 0 likes - 6 downloads - 6 reach - 6 impact
36 instances - 23 features - 3 classes - 0 missing values
Squash Harvest Stored Data source: Winna Harvey Crop and Food Research, Christchurch, New Zealand The purpose of the research was to determine the changes taking place in squash fruit during the…
867 runs - 0 likes - 4 downloads - 4 reach - 6 impact
52 instances - 25 features - 3 classes - 7 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
16 instances - 24 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 2 downloads - 2 reach - 4 impact
195 instances - 33 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
9 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
10 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
6 instances - 1143 features - 0 classes - 0 missing values