OpenML
This research examines the case of customers’ default payments in Taiwan and compares the predictive accuracy of probability of default among six data mining methods. From…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
30000 instances - 24 features - 2 classes - 0 missing values
This dataset addresses student achievement in secondary education at two Portuguese schools. The data attributes include student grades and demographic, social, and school-related features, and it was…
0 runs - 0 likes - 1 downloads - 1 reach - 2 impact
395 instances - 33 features - 0 classes - 0 missing values
e fvr
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2 instances - 11 features - classes - 0 missing values
This dataset summarizes a heterogeneous set of features about articles published by Mashable in a period of two years. The goal is to predict the number of shares in social networks (popularity). *…
0 runs - 0 likes - 4 downloads - 4 reach - 5 impact
39644 instances - 61 features - 0 classes - 0 missing values
sdsw frfr
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1556 instances - 3 features - classes - 0 missing values
swd dced
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
589 instances - 3 features - classes - 0 missing values
frf r
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2 instances - 3 features - classes - 0 missing values
e3r4vr t4r
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2 instances - 5 features - classes - 0 missing values
e eded
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2 instances - 4 features - classes - 0 missing values
Airlines Dataset. Inspired by the regression dataset from Elena Ikonomovska. The task is to predict whether a given flight will be delayed, given the information of the scheduled departure. For this…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
26969 instances - 8 features - 2 classes - 0 missing values
An artificial data set where instances belong to several clusters with a banana shape. There are two attributes, At1 and At2, corresponding to the x and y axes, respectively. The class label (-1 and 1)…
163 runs - 2 likes - 17 downloads - 19 reach - 8 impact
5300 instances - 3 features - 2 classes - 0 missing values
b gtrg
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 7 features - classes - 0 missing values
sqs efrf
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 5 features - classes - 0 missing values
f fr
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2 instances - 5 features - classes - 0 missing values
dd efrg
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1556 instances - 5629 features - classes - 0 missing values
The Inpatient Utilization and Payment Public Use File (Inpatient PUF) provides information on inpatient discharges for Medicare fee-for-service beneficiaries. The Inpatient PUF includes information on…
0 runs - 1 likes - 1 downloads - 2 reach - 2 impact
163065 instances - 12 features - 0 classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
Each record represents 100 points on a two-dimensional graph. When plotted in order (from 1 through 100) as the Y coordinate, the points will create either a Hill (a “bump” in the terrain) or a…
183264 runs - 0 likes - 23 downloads - 23 reach - 19 impact
1212 instances - 101 features - 2 classes - 0 missing values
wind: daily average wind speeds for 1961-1978 at 12 synoptic meteorological stations in the Republic of Ireland (Haslett and Raftery 1989). These data were analyzed in detail in the following article:…
0 runs - 0 likes - 6 downloads - 6 reach - 7 impact
6574 instances - 15 features - 0 classes - 0 missing values
This is the Tecator data set: The task is to predict the fat content of a meat sample on the basis of its near infrared absorbance spectrum. 1. Statement of permission from Tecator (the original data…
0 runs - 0 likes - 4 downloads - 4 reach - 7 impact
240 instances - 125 features - 0 classes - 0 missing values
Human Development Index [DATA] United Nations Development Program compiled an Index of Human Development. Column 1: Country(character) 2: Index 3: GNP GNP PER CAPITA RANK RANK - RANK HDI 1987 GNP RANK…
2 runs - 0 likes - 0 downloads - 0 reach - 7 impact
130 instances - 4 features - 0 classes - 0 missing values
The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonic prices and the demand for clean air', J. Environ. Economics & Management, vol.5, 81-102, 1978. Used in Belsley, Kuh & Welsch,…
6 runs - 0 likes - 5 downloads - 5 reach - 11 impact
506 instances - 14 features - 0 classes - 0 missing values
smoothmeth: A collection of the data sets used in the book "Smoothing Methods in Statistics," by Jeffrey S. Simonoff, Springer-Verlag, New York, 1996. Submitted by Jeff Simonoff…
0 runs - 0 likes - 0 downloads - 0 reach - 7 impact
2178 instances - 4 features - 0 classes - 0 missing values
A family of datasets synthetically generated from a simulation of how bank-customers choose their banks. Tasks are based on predicting the fraction of bank customers who leave the bank because of full…
0 runs - 0 likes - 2 downloads - 2 reach - 7 impact
8192 instances - 33 features - 0 classes - 0 missing values
17x17x2x2 tables of counts in GLIM-ready format used for the analyses in Biblarz, Timothy J., and Adrian E. Raftery. 1993. "The Effects of Family Disruption on Social Mobility." American Sociological…
3 runs - 0 likes - 1 downloads - 1 reach - 7 impact
1156 instances - 6 features - 0 classes - 0 missing values
Data for the sensory evaluation experiment in Brien, C.J. and Payne, R.W. (1996) Tiers, structure formulae and the analysis of complicated experiments. submitted for publication. The experiment…
2 runs - 0 likes - 2 downloads - 2 reach - 7 impact
576 instances - 12 features - 0 classes - 0 missing values
Geographical Analysis Spatial Data This georeferenced data set was used in: Pace, R. Kelley, and Ronald Barry, Quick Computation of Regressions with a Spatially Autoregressive Dependent Variable,…
4 runs - 1 likes - 1 downloads - 2 reach - 7 impact
3107 instances - 7 features - 0 classes - 0 missing values
The data consist of 2001 observations taken from a balloon about 30 kilometres above the surface of the earth. In the section of the flight shown here the balloon increases in height. As radiation…
0 runs - 1 likes - 2 downloads - 3 reach - 7 impact
2001 instances - 3 features - 0 classes - 0 missing values
The data are a subsample of 500 observations from a data set that originates in a study where air pollution at a road is related to traffic volume and meteorological variables, collected by the…
2 runs - 0 likes - 1 downloads - 1 reach - 7 impact
500 instances - 8 features - 0 classes - 0 missing values
The data consist of annual observations on the level of strike volume (days lost due to industrial disputes per 1000 wage and salary earners) and its covariates in 18 OECD countries from 1951-1985. The…
0 runs - 0 likes - 2 downloads - 2 reach - 7 impact
625 instances - 7 features - 0 classes - 0 missing values
This data set is also obtained from the task of controlling an F16 aircraft, although the target variable and attributes are different from the ailerons domain. In this case the goal variable is…
2 runs - 0 likes - 7 downloads - 7 reach - 3 impact
16599 instances - 19 features - 0 classes - 0 missing values
S&P Letters Data. We collected information on the variables using all the block groups in California from the 1990 Census. In this sample a block group on average includes 1425.5 individuals living in…
0 runs - 0 likes - 6 downloads - 6 reach - 7 impact
20640 instances - 9 features - 0 classes - 0 missing values
Veteran's Administration Lung Cancer Trial. Taken from Kalbfleisch and Prentice, pages 223-224. Variables: Treatment (1=standard, 2=test); Celltype (1=squamous, 2=smallcell, 3=adeno, 4=large); Survival in days…
2 runs - 0 likes - 1 downloads - 1 reach - 7 impact
137 instances - 8 features - 0 classes - 0 missing values
This is an artificial data set used in Friedman (1991) and also described in Breiman (1996,p.139). The cases are generated using the following method: Generate the values of 10 attributes, X1, ...,…
0 runs - 2 likes - 7 downloads - 9 reach - 7 impact
40768 instances - 11 features - 0 classes - 0 missing values
eevrr der
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1557 instances - 5629 features - classes - 0 missing values
ede wey
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
589 instances - 2909 features - classes - 0 missing values
This database contains all legal 8-ply positions in the game of connect-4 in which neither player has won yet, and in which the next move is not forced. Attributes represent board positions on a 6x6…
9329 runs - 0 likes - 8 downloads - 8 reach - 19 impact
67557 instances - 43 features - 3 classes - 0 missing values
Dataset from the LIBSVM data repository (libSVM, AAD group). Preprocessing: We used binary encoding for each feature (o, b, x), so the number of features is 42*3 = 126.
0 runs - 0 likes - 3 downloads - 3 reach - 10 impact
67557 instances - 127 features - 0 classes - 0 missing values
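The binary encoding mentioned in the entry above can be illustrated with a small sketch. It assumes 42 categorical cells that each take one of the symbols o, b, or x, and expands each cell into three 0/1 indicators, giving 42*3 = 126 values. The function name `encode_board`, the indicator order, and the example input are hypothetical and are not taken from the dataset's actual preprocessing script.

```python
# Hypothetical sketch of a binary (one-hot) encoding for 42 three-valued cells.
# The symbol order below is an assumption; the real preprocessing may differ.
STATES = ("o", "b", "x")


def encode_board(cells):
    """Expand 42 symbols into a flat list of 42 * 3 = 126 binary indicators."""
    if len(cells) != 42:
        raise ValueError("expected exactly 42 cells")
    encoded = []
    for cell in cells:
        if cell not in STATES:
            raise ValueError(f"unknown symbol: {cell!r}")
        # One indicator per possible symbol; exactly one of them is 1.
        encoded.extend(1 if cell == state else 0 for state in STATES)
    return encoded


# Made-up input for illustration only: 41 cells set to "b" and one set to "x".
example = ["b"] * 42
example[0] = "x"
vector = encode_board(example)
print(len(vector))  # 126
```

The listing reports 127 features, which would be consistent with 126 indicators plus one target column, though that reading is an assumption.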
Touch samples 2
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
265 instances - 11 features - 8 classes - 0 missing values
Uniform issuance values with temperatures, hires, and dismissals
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
6277 instances - 7 features - 0 classes - 0 missing values
rrvrf 4rr
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 49 features - classes - 0 missing values
ef f
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 49 features - classes - 0 missing values
Touch Signals
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
265 instances - 11 features - classes - 0 missing values
punch sound
0 runs - 0 likes - 1 downloads - 1 reach - 2 impact
221 instances - 1 features - classes - 0 missing values
dsd efe
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
601 instances - 7 features - classes - 0 missing values
fr frf
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1556 instances - 5629 features - classes - 0 missing values
sde c
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1556 instances - 5629 features - classes - 0 missing values
de d
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1556 instances - 5628 features - classes - 0 missing values
This data set is concerned with the forward kinematics of an 8 link robot arm. Among the existing variants of this data set we have used the variant 8nm, which is known to be highly non-linear and…
19 runs - 0 likes - 7 downloads - 7 reach - 3 impact
8192 instances - 9 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 1 likes - 1 downloads - 2 reach - 7 impact
4450 instances - 203 features - 0 classes - 0 missing values
Multi-label dataset. The birds dataset consists of 327 audio recordings of 12 different vocalizing bird species. Each sound can be assigned to various bird species.
0 runs - 0 likes - 6 downloads - 6 reach - 5 impact
645 instances - 279 features - 2 classes - 0 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
5 runs - 1 likes - 2 downloads - 3 reach - 3 impact
8192 instances - 13 features - 0 classes - 0 missing values
1. Title: Wine Quality 2. Sources Created by: Paulo Cortez (Univ. Minho), Antonio Cerdeira, Fernando Almeida, Telmo Matos and Jose Reis (CVRVV) @ 2009 3. Past Usage: P. Cortez, A. Cerdeira, F.…
0 runs - 1 likes - 13 downloads - 14 reach - 7 impact
6497 instances - 12 features - 0 classes - 0 missing values
This is one of a family of datasets synthetically generated from a realistic simulation of the dynamics of a Unimation Puma 560 robot arm. There are eight datasets in this family. In this repository…
0 runs - 0 likes - 6 downloads - 6 reach - 7 impact
8192 instances - 33 features - 0 classes - 0 missing values
This is an artificial data set with dependencies between the attribute values. The cases are generated using the following method: X1 : uniformly distributed over [-5,5] X2 : uniformly distributed…
3 runs - 1 likes - 5 downloads - 6 reach - 7 impact
40768 instances - 11 features - 0 classes - 0 missing values
This is an artificial data set described in Breiman et al. (1984,p.238) (with variance 1 instead of 2). Generate the values of the 10 attributes independently using the following probabilities: P(X_1…
2 runs - 1 likes - 4 downloads - 5 reach - 4 impact
40768 instances - 11 features - 0 classes - 0 missing values
efe def
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 49 features - classes - 0 missing values
as cscs
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1557 instances - 5629 features - classes - 0 missing values
This database was designed on the basis of data provided by US Census Bureau [http://www.census.gov] (under Lookup Access [http://www.census.gov/cdrom/lookup]: Summary Tape File 1). The data were…
0 runs - 1 likes - 6 downloads - 7 reach - 7 impact
22784 instances - 17 features - 0 classes - 0 missing values
sd vfv
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 50 features - 2 classes - 0 missing values
Wikidata with top-474 most frequent types and ingoing/outgoing properties as features
0 runs - 0 likes - 15 downloads - 15 reach - 5 impact
19254100 instances - 2331 features - classes - 0 missing values
A dataset of steel plates' faults, classified into 7 different types. The goal was to train machine learning for automatic pattern recognition. The dataset consists of 27 features describing each…
277313 runs - 1 likes - 42 downloads - 43 reach - 19 impact
1941 instances - 34 features - 2 classes - 0 missing values
mydata
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3892 instances - 36 features - classes - 0 missing values
Dataset showing data from matches played by RB Leipzig prior to 14.06.2020
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
102 instances - 1 features - classes - 0 missing values
According to Epsilon research, 80% of customers are more likely to do business with you if you provide personalized service. Banking is no exception. The digitalization of everyday lives means that…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4459 instances - 4992 features - 0 classes - 0 missing values
Since the first automobile, the Benz Patent Motor Car in 1886, Mercedes-Benz has stood for important automotive innovations. These include, for example, the passenger safety cell with crumple zone,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4209 instances - 377 features - 0 classes - 0 missing values
as dwd
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1557 instances - 5629 features - classes - 0 missing values
ef r
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1557 instances - 5629 features - classes - 0 missing values
BitcoinHeist Ransomware Dataset Akcora, C.G., Li, Y., Gel, Y.R. and Kantarcioglu, M., 2019. BitcoinHeist. Topological Data Analysis for Ransomware Detection on the Bitcoin Blockchain. IJCAI-PRICAI…
0 runs - 1 likes - 0 downloads - 1 reach - 0 impact
2916697 instances - 10 features - 29 classes - 0 missing values
The Inpatient Utilization and Payment Public Use File (Inpatient PUF) provides information on inpatient discharges for Medicare fee-for-service beneficiaries. The Inpatient PUF includes…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
163065 instances - 12 features - 0 classes - 0 missing values
r rg
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 50 features - classes - 0 missing values
dd ref
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 50 features - classes - 0 missing values
When you've been devastated by a serious car accident, your focus is on the things that matter the most: family, friends, and other loved ones. Pushing paper with your insurance agent is the last…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
188318 instances - 131 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 20158, and it has 257 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
257 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 103169, and it has 10 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
10 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 103456, and it has 73 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
73 instances - 1026 features - 0 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-subjects-stemmed * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
71 runs - 0 likes - 2 downloads - 2 reach - 7 impact
64 instances - 230 features - 2 classes - 0 missing values
Dataset from the LIBSVM data repository (libSVM, AAD group).
0 runs - 0 likes - 0 downloads - 0 reach - 10 impact
64700 instances - 301 features - 0 classes - 0 missing values
hydraulic
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2205 instances - 22 features - classes - 0 missing values
Source: Ashwin Srinivasan Department of Statistics and Data Modeling University of Strathclyde Glasgow Scotland UK ross '@' uk.ac.turing The original Landsat data for this database was generated from…
1 runs - 1 likes - 7 downloads - 8 reach - 13 impact
6435 instances - 37 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 266, and it has 137 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
137 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100848, and it has 60 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
60 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101508, and it has 532 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
532 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 136, and it has 4085 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
4085 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 20014, and it has 2625 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
2625 instances - 1026 features - 0 classes - 0 missing values
No data.
32 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000000 instances - 17 features - 26 classes - 0 missing values
Multi-label dataset. Audio dataset (emotions) consists of 593 musical files with 6 clustered emotional labels and 72 predictors. Each song can be labeled with one or more of the labels…
0 runs - 0 likes - 1 downloads - 1 reach - 3 impact
593 instances - 78 features - classes - 0 missing values
Water stress dataset for an Indian variety of wheat crop: The data consist of file system-based data for the Raj 3765 variety of wheat. There are twenty-four chlorophyll fluorescence images captured every…
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1188 instances - 23 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10871, and it has 30 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
30 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100107, and it has 41 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
41 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 103101, and it has 10 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
10 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10227, and it has 15 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
15 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 30021, and it has 92 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
92 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12689, and it has 575 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
575 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10918, and it has 1238 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1238 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10849, and it has 1580 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1580 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12014, and it has 22 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
22 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11403, and it has 20 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
20 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100120, and it has 18 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
18 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 13005, and it has 124 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 5 impact
124 instances - 1026 features - 0 classes - 0 missing values