Data
No data.
310 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1000000 instances - 11 features - 2 classes - 0 missing values
Synthetic dataset. Almost identical to [dataset 152](https://www.openml.org/d/153/edit)
319 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1000000 instances - 11 features - 2 classes - 0 missing values
No data.
304 runs - 0 likes - 7 downloads - 7 reach - 11 impact
1000000 instances - 25 features - 10 classes - 0 missing values
Normalized version of the pokerhand data set. Automated file upload of pokerhand-normalized.arff ### Data Set Information: Each record is an example of a hand consisting of five playing cards drawn…
314 runs - 0 likes - 12 downloads - 12 reach - 12 impact
829201 instances - 11 features - 10 classes - 0 missing values
No data.
298 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
305 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
308 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
307 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
309 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
328 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 4 features - 2 classes - 0 missing values
No data.
330 runs - 0 likes - 5 downloads - 5 reach - 11 impact
1000000 instances - 4 features - 2 classes - 0 missing values
Number of Samples and Design Method of Classifier on the Plane", Pattern Recognition, Vol. 24, No. 4, pp. 317-324, 1991. Lung Cancer Data * Past Usage: - Hong, Z.Q. and Yang, J.Y. "Optimal…
1238 runs - 0 likes - 19 downloads - 19 reach - 13 impact
32 instances - 57 features - 3 classes - 5 missing values
Compilation of promoters with known transcriptional start points for E. coli genes. The task is to recognize promoters in strings that represent nucleotides (one of A, G, T, or C). A promoter is a…
138 runs - 1 likes - 9 downloads - 10 reach - 12 impact
106 instances - 58 features - 2 classes - 0 missing values
Primary Tumor Domain - Donors: - I. Kononenko, University E.Kardelj, Faculty for electrical engineering - B. Cestnik, Jozef Stefan Institute - Past Usage: (several) 1. Cestnik, G., Kononenko, I., &…
1261 runs - 0 likes - 16 downloads - 16 reach - 12 impact
339 instances - 18 features - 21 classes - 225 missing values
Space Shuttle Autolanding Domain - Donor: B. Cestnik Jozef Stefan Institute - Past Usage: (several, it appears) Example: Michie,D. (1988). The Fifth Generation's Unbridged Gap. In Rolf Herken (Ed.)…
1466 runs - 0 likes - 9 downloads - 9 reach - 9 impact
15 instances - 7 features - 2 classes - 26 missing values
Prediction task is to determine whether a person makes over 50K a year. Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the…
2671 runs - 1 likes - 33 downloads - 34 reach - 12 impact
48842 instances - 15 features - 2 classes - 6465 missing values
Predicting forest cover type from cartographic variables only (no remotely sensed data). The actual forest cover type for a given observation (30 x 30 meter cell) was determined from US Forest Service…
216 runs - 0 likes - 11 downloads - 11 reach - 12 impact
110393 instances - 55 features - 7 classes - 0 missing values
No data.
2198 runs - 1 likes - 17 downloads - 18 reach - 9 impact
1484 instances - 9 features - 10 classes - 0 missing values
The database consists of the multi-spectral values of pixels in 3x3 neighbourhoods in a satellite image, and the classification associated with the central pixel in each neighbourhood. The aim is to…
29713 runs - 2 likes - 24 downloads - 26 reach - 12 impact
6430 instances - 37 features - 6 classes - 0 missing values
**Source**: Marine Resources Division Marine Research Laboratories - Taroona Department of Primary Industry and Fisheries, Tasmania Abalone data - Past Usage: - Sam Waugh (1995) "Extending and…
34899 runs - 0 likes - 18 downloads - 18 reach - 9 impact
4177 instances - 9 features - 28 classes - 0 missing values
Classify a chess game based on the position of the white king, the white rook and the black king.
1777 runs - 0 likes - 16 downloads - 16 reach - 9 impact
28056 instances - 7 features - 18 classes - 0 missing values
Database of baseball players and play statistics, including 'Games_played', 'At_bats', 'Runs', 'Hits', 'Doubles', 'Triples', 'Home_runs', 'RBIs', 'Walks', 'Strikeouts', 'Batting_average',…
795 runs - 0 likes - 11 downloads - 11 reach - 10 impact
1340 instances - 17 features - 3 classes - 20 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
1187 runs - 1 likes - 10 downloads - 11 reach - 9 impact
412 instances - 9 features - 7 classes - 96 missing values
- S. Aeberhard, D. Coomans and O. de Vel, Comparison of Classifiers in High Dimensional Settings, Tech. Rep. no. 92-02, (1992), Dept. of Computer Science and Dept. of Mathematics and Statistics, James…
1187 runs - 1 likes - 20 downloads - 21 reach - 12 impact
178 instances - 14 features - 3 classes - 0 missing values
The objective was to determine which seedlots in a species are best for soil conservation in seasonally dry hill country. Determination is found by measurement of height, diameter by height, survival,…
27229 runs - 0 likes - 11 downloads - 11 reach - 10 impact
736 instances - 20 features - 5 classes - 448 missing values
This data set is concerned with the forward kinematics of an 8-link robot arm. Among the existing variants of this data set, we have used the variant 8nm, which is known to be highly non-linear and…
19 runs - 0 likes - 7 downloads - 7 reach - 9 impact
8192 instances - 9 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag.
7 runs - 0 likes - 2 downloads - 2 reach - 9 impact
61 instances - 3 features - 0 classes - 0 missing values
Wisconsin Prognostic Breast Cancer (WPBC) Various versions of this data have been used in the following publications: - W. N. Street, O. L. Mangasarian, and W.H. Wolberg. An inductive learning…
5 runs - 0 likes - 4 downloads - 4 reach - 9 impact
194 instances - 33 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag.
2 runs - 0 likes - 0 downloads - 0 reach - 9 impact
52 instances - 3 features - 0 classes - 0 missing values
Data from StatLib A manufacturer of automotive accessories provides hardware, e.g. nuts, bolts, washers and screws, to fasten the accessory to the car or truck. Hardware is counted and packaged…
4 runs - 0 likes - 0 downloads - 0 reach - 9 impact
40 instances - 7 features - 0 classes - 0 missing values
This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to…
37 runs - 0 likes - 5 downloads - 5 reach - 9 impact
303 instances - 14 features - 0 classes - 6 missing values
This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics; (b) its assigned insurance risk rating; (c) its normalized losses in use as…
11 runs - 1 likes - 4 downloads - 5 reach - 10 impact
159 instances - 16 features - 0 classes - 0 missing values
Auto-Mpg Data This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University. The dataset was used in the 1983 American Statistical Association Exposition. -…
2 runs - 0 likes - 2 downloads - 2 reach - 12 impact
398 instances - 8 features - 0 classes - 6 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
2 runs - 1 likes - 1 downloads - 2 reach - 9 impact
8192 instances - 22 features - 0 classes - 0 missing values
This data set is also obtained from the task of controlling the ailerons of an F16 aircraft, although the target variable and attributes are different from the ailerons domain. The target variable here…
2 runs - 0 likes - 3 downloads - 3 reach - 10 impact
9517 instances - 7 features - 0 classes - 0 missing values
Fruitflies" by Linda Partridge and Marion Farquhar. _Nature_, 294, 580-581, 1981. NAME: Sexual activity and the lifespan of male fruitflies. TYPE: Designed (almost factorial) experiment. SIZE: 125…
4 runs - 0 likes - 2 downloads - 2 reach - 12 impact
125 instances - 5 features - 0 classes - 0 missing values
D. Harrington\ D. Harrington, (1991), published by John Wiley & Sons NAME: PBC Data SIZE: 418 observations, 20 variables DESCRIPTIVE ABSTRACT: Below is a description of the variables recorded from the…
10 runs - 0 likes - 1 downloads - 1 reach - 12 impact
418 instances - 19 features - 0 classes - 1239 missing values
This is a commercial application described in Weiss & Indurkhya (1995). The data describes a telecommunication problem. No further information is available. Characteristics: (10000+5000) cases, 49…
2 runs - 0 likes - 4 downloads - 4 reach - 11 impact
15000 instances - 49 features - 0 classes - 0 missing values
Horsepower treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
0 runs - 0 likes - 2 downloads - 2 reach - 9 impact
Massachusetts\ Massachusetts As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in Connectionist-Based…
4 runs - 1 likes - 0 downloads - 1 reach - 12 impact
189 instances - 10 features - 0 classes - 0 missing values
University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D. V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D. Heart Disease Databases Cholesterol…
160 runs - 0 likes - 4 downloads - 4 reach - 9 impact
303 instances - 14 features - 0 classes - 6 missing values
Cicchetti, D.\ Data from which conclusions were drawn in the article "Sleep in Mammals: Ecological and Constitutional Correlates" by Allison, T. and Cicchetti, D. (1976), _Science_, November 12, vol.…
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
62 instances - 8 features - 0 classes - 12 missing values
The problem is to learn a regression equation/rule/tree to predict the activity from the descriptive structural attributes. The data and methodology are described in detail in: - King, Ross D., Hurst,…
5 runs - 0 likes - 1 downloads - 1 reach - 9 impact
186 instances - 61 features - 0 classes - 0 missing values
1) 1985 Model Import Car and Truck Specifications, 1985 Ward's Automotive Yearbook. 2) Personal Auto Manuals, Insurance Services Office, 160 Water Street, New York, NY 10038 3) Insurance Collision…
2 runs - 0 likes - 2 downloads - 2 reach - 12 impact
159 instances - 16 features - 0 classes - 0 missing values
Detroit: The Role of Firearms", Criminology, vol.14, 387-400 (1976) This is the data set called 'DETROIT' in the book 'Subset selection in regression' by Alan J. Miller published in the Chapman & Hall…
2 runs - 0 likes - 0 downloads - 0 reach - 10 impact
13 instances - 14 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag.
2 runs - 0 likes - 2 downloads - 2 reach - 9 impact
2178 instances - 4 features - 0 classes - 0 missing values
These data are those collected in a cloud-seeding experiment in Tasmania between mid-1964 and January 1971. Their analysis, using regression techniques and permutation tests, is discussed in: Miller,…
66 runs - 0 likes - 2 downloads - 2 reach - 9 impact
108 instances - 6 features - 0 classes - 0 missing values
Data from StatLib (ftp stat.cmu.edu/datasets) The infamous Longley data, "An appraisal of least-squares programs from the point of view of the user", JASA, 62(1967) p819-841. Variables are: Number of…
3 runs - 0 likes - 1 downloads - 1 reach - 9 impact
16 instances - 7 features - 0 classes - 0 missing values
This data set concerns the study of the factors affecting patterns of insulin-dependent diabetes mellitus in children. The objective is to investigate the dependence of the level of serum C-peptide on…
2 runs - 0 likes - 1 downloads - 1 reach - 9 impact
43 instances - 3 features - 0 classes - 0 missing values
As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in Connectionist-Based Information Systems.…
10 runs - 1 likes - 2 downloads - 3 reach - 12 impact
195 instances - 11 features - 0 classes - 2 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Points scored per minute is being treated as…
2 runs - 0 likes - 0 downloads - 0 reach - 9 impact
96 instances - 5 features - 0 classes - 0 missing values
This is an artificial data set described in Breiman et al. (1984, p. 238) (with variance 1 instead of 2). Generate the values of the 10 attributes independently using the…
2 runs - 1 likes - 4 downloads - 5 reach - 11 impact
40768 instances - 11 features - 0 classes - 0 missing values
This data set is also obtained from the task of controlling an F16 aircraft, although the target variable and attributes are different from the ailerons domain. In this case the goal variable is…
2 runs - 0 likes - 7 downloads - 7 reach - 11 impact
16599 instances - 19 features - 0 classes - 0 missing values
The task consists of learning Quantitative Structure Activity Relationships (QSARs): the inhibition of dihydrofolate reductase by pyrimidines. The data are described in: King, Ross D., Muggleton,…
6 runs - 0 likes - 2 downloads - 2 reach - 9 impact
74 instances - 28 features - 0 classes - 0 missing values
This database was designed on the basis of data provided by US Census Bureau [http://www.census.gov] (under Lookup Access [http://www.census.gov/cdrom/lookup]: Summary Tape File 1). The data were…
2 runs - 1 likes - 3 downloads - 4 reach - 9 impact
22784 instances - 9 features - 0 classes - 0 missing values
Survival treated as the class attribute As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in…
12 runs - 0 likes - 2 downloads - 2 reach - 12 impact
130 instances - 10 features - 0 classes - 97 missing values
This is a dataset obtained from the StatLib repository. Here is the included description: The data provided are daily stock prices from January 1988 through October 1991, for ten aerospace companies.…
5 runs - 1 likes - 9 downloads - 10 reach - 9 impact
950 instances - 10 features - 0 classes - 0 missing values
Tumor-size treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in…
0 runs - 0 likes - 3 downloads - 3 reach - 12 impact
286 instances - 10 features - 0 classes - 9 missing values
This is a family of datasets synthetically generated from a realistic simulation of the dynamics of a Unimation Puma 560 robot arm. There are eight datasets in this family. In this repository we only…
2 runs - 0 likes - 5 downloads - 5 reach - 10 impact
8192 instances - 9 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Gasoline consumption is being treated as…
2 runs - 0 likes - 0 downloads - 0 reach - 9 impact
27 instances - 5 features - 0 classes - 0 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
5 runs - 1 likes - 2 downloads - 3 reach - 9 impact
8192 instances - 13 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Electricity usage is being treated as the…
4 runs - 0 likes - 0 downloads - 0 reach - 9 impact
55 instances - 3 features - 0 classes - 0 missing values
As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in Connectionist-Based Information Systems.…
2 runs - 0 likes - 1 downloads - 1 reach - 12 impact
200 instances - 11 features - 0 classes - 0 missing values
The problem concerns Relative CPU Performance Data. The used attributes are : ``` MYCT: machine cycle time in nanoseconds (integer) MMIN: minimum main memory in kilobytes (integer) MMAX: maximum main…
2 runs - 0 likes - 2 downloads - 2 reach - 12 impact
209 instances - 7 features - 0 classes - 0 missing values
1. Hungarian Institute of Cardiology. Budapest: Andras Janosi, M.D. 2. University Hospital, Zurich, Switzerland: William Steinbrunn, M.D. 3. University Hospital, Basel, Switzerland: Matthias…
10 runs - 0 likes - 0 downloads - 0 reach - 9 impact
294 instances - 14 features - 0 classes - 782 missing values
sjoear. Laengelmaevesi. T.H.Jaervi: Finlands Fiskeriet Band 4, Meddelanden utgivna av fiskerifoereningen i Finland. Helsingfors 1917 Weight treated as the class attribute. Identifier deleted. As used…
10 runs - 0 likes - 2 downloads - 2 reach - 12 impact
158 instances - 8 features - 0 classes - 87 missing values
No data.
206 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
67 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
332 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1000000 instances - 17 features - 2 classes - 0 missing values
No data.
311 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
65 runs - 0 likes - 8 downloads - 8 reach - 9 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
310 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
290 runs - 0 likes - 5 downloads - 5 reach - 11 impact
1000000 instances - 77 features - 10 classes - 0 missing values
No data.
867 runs - 0 likes - 12 downloads - 12 reach - 9 impact
39366 instances - 10 features - 2 classes - 0 missing values
No data.
52 runs - 0 likes - 2 downloads - 2 reach - 10 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
306 runs - 0 likes - 3 downloads - 3 reach - 9 impact
1000000 instances - 13 features - 6 classes - 0 missing values
No data.
52 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 48 features - 10 classes - 0 missing values
No data.
965 runs - 0 likes - 9 downloads - 9 reach - 9 impact
55296 instances - 10 features - 3 classes - 0 missing values
No data.
163 runs - 0 likes - 5 downloads - 5 reach - 9 impact
1000000 instances - 28 features - 2 classes - 0 missing values
No data.
68 runs - 0 likes - 4 downloads - 4 reach - 9 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
326 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1000000 instances - 16 features - 2 classes - 0 missing values
No data.
315 runs - 0 likes - 2 downloads - 2 reach - 11 impact
295245 instances - 11 features - 5 classes - 0 missing values
No data.
225 runs - 0 likes - 7 downloads - 7 reach - 11 impact
1000000 instances - 21 features - 2 classes - 0 missing values
No data.
293 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 17 features - 10 classes - 0 missing values
No data.
65 runs - 0 likes - 3 downloads - 3 reach - 9 impact
1000000 instances - 40 features - 2 classes - 0 missing values
No data.
309 runs - 0 likes - 6 downloads - 6 reach - 11 impact
1000000 instances - 35 features - 6 classes - 0 missing values
No data.
296 runs - 0 likes - 7 downloads - 7 reach - 9 impact
1000000 instances - 61 features - 2 classes - 0 missing values
No data.
75 runs - 0 likes - 3 downloads - 3 reach - 9 impact
137781 instances - 10 features - 7 classes - 0 missing values
No data.
310 runs - 0 likes - 2 downloads - 2 reach - 9 impact
1000000 instances - 14 features - 5 classes - 0 missing values
No data.
326 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1000000 instances - 14 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
304 runs - 0 likes - 3 downloads - 3 reach - 9 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
331 runs - 0 likes - 7 downloads - 7 reach - 9 impact
1000000 instances - 20 features - 2 classes - 0 missing values
No data.
307 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 41 features - 3 classes - 0 missing values
No data.
291 runs - 0 likes - 4 downloads - 4 reach - 9 impact
1000000 instances - 17 features - 7 classes - 0 missing values
No data.
353 runs - 0 likes - 18 downloads - 18 reach - 13 impact
120919 instances - 1002 features - 2 classes - 0 missing values
This dataset is a collection of newsgroup documents. The 20 newsgroups collection has become a popular data set for experiments in text applications of machine learning techniques, such as text…
167 runs - 0 likes - 8 downloads - 8 reach - 12 impact
399940 instances - 1002 features - 2 classes - 0 missing values
No data.
882 runs - 0 likes - 6 downloads - 6 reach - 12 impact
71 instances - 63 features - 6 classes - 0 missing values
No data.
948 runs - 0 likes - 5 downloads - 5 reach - 12 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
949 runs - 0 likes - 4 downloads - 4 reach - 12 impact
74 instances - 63 features - 4 classes - 0 missing values