OpenML
test
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
891 instances - 12 features - classes - 866 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
891 instances - 12 features - classes - 866 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
891 instances - 12 features - classes - 866 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
891 instances - 12 features - classes - 866 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
891 instances - 12 features - classes - 866 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
891 instances - 12 features - classes - 866 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
891 instances - 12 features - classes - 866 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
891 instances - 12 features - classes - 866 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
891 instances - 12 features - classes - 866 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
891 instances - 12 features - classes - 866 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
891 instances - 12 features - classes - 866 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
891 instances - 12 features - classes - 866 missing values
No data.
65 runs - 0 likes - 3 downloads - 3 reach - 9 impact
1000000 instances - 40 features - 2 classes - 0 missing values
No data.
67 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
70 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 14 features - 2 classes - 0 missing values
No data.
314 runs - 1 likes - 8 downloads - 9 reach - 11 impact
1000000 instances - 36 features - 19 classes - 0 missing values
No data.
326 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1000000 instances - 16 features - 2 classes - 0 missing values
No data.
307 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 41 features - 3 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
8 instances - 1143 features - 0 classes - 0 missing values
Information about customers consists of 86 variables and includes product usage data and socio-demographic data derived from zip area codes. The data was supplied by the Dutch data mining company…
0 runs - 0 likes - 3 downloads - 3 reach - 13 impact
9822 instances - 86 features - 0 classes - 0 missing values
No data.
68 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1000000 instances - 21 features - 2 classes - 0 missing values
As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in Connectionist-Based Information Systems.…
2 runs - 0 likes - 1 downloads - 1 reach - 12 impact
200 instances - 11 features - 0 classes - 0 missing values
No data.
332 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1000000 instances - 17 features - 2 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Electricity usage is being treated as the…
4 runs - 0 likes - 0 downloads - 0 reach - 9 impact
55 instances - 3 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
14 instances - 1143 features - 0 classes - 0 missing values
No data.
306 runs - 0 likes - 3 downloads - 3 reach - 9 impact
1000000 instances - 13 features - 6 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
30 instances - 1143 features - 0 classes - 0 missing values
No data.
60 runs - 0 likes - 2 downloads - 2 reach - 11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
34 instances - 1143 features - 0 classes - 0 missing values
The data are a subsample of 500 observations from a data set that originates in a study where air pollution at a road is related to traffic volume and meteorological variables, collected by the…
2 runs - 0 likes - 1 downloads - 1 reach - 13 impact
500 instances - 8 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
0 runs - 0 likes - 0 downloads - 0 reach - 11 impact
2796 instances - 35 features - 2 classes - 68100 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs - 0 likes - 4 downloads - 4 reach - 14 impact
138 instances - 10936 features - 2 classes - 0 missing values
No data.
70 runs - 0 likes - 3 downloads - 3 reach - 9 impact
1000000 instances - 28 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs - 0 likes - 5 downloads - 5 reach - 15 impact
250 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
66 runs - 0 likes - 4 downloads - 4 reach - 15 impact
195 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
66 runs - 0 likes - 1 downloads - 1 reach - 15 impact
386 instances - 10936 features - 2 classes - 0 missing values
No data.
90 runs - 2 likes - 3 downloads - 5 reach - 11 impact
663552 instances - 13 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX EUR/SEK from Dukascopy. One instance (row) is one candlestick of one minute. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
375840 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX AUD/SGD from Dukascopy. One instance (row) is one candlestick of one hour. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
43825 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX AUD/CAD from Dukascopy. One instance (row) is one candlestick of one hour. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
43825 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX GBP/USD from Dukascopy. One instance (row) is one candlestick of one day. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1834 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX EUR/TRY from Dukascopy. One instance (row) is one candlestick of one day. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1832 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX EUR/GBP from Dukascopy. One instance (row) is one candlestick of one day. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1835 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX CAD/JPY from Dukascopy. One instance (row) is one candlestick of one hour. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 2 downloads - 2 reach - 8 impact
43825 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX EUR/NOK from Dukascopy. One instance (row) is one candlestick of one hour. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
43825 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX AUD/CAD from Dukascopy. One instance (row) is one candlestick of one day. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1834 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX AUD/JPY from Dukascopy. One instance (row) is one candlestick of one minute. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
375840 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX EUR/SEK from Dukascopy. One instance (row) is one candlestick of one day. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1837 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX USD/JPY from Dukascopy. One instance (row) is one candlestick of one minute. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
375840 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX EUR/HKD from Dukascopy. One instance (row) is one candlestick of one day. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1832 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX EUR/PLN from Dukascopy. One instance (row) is one candlestick of one hour. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
43825 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX EUR/CAD from Dukascopy. One instance (row) is one candlestick of one minute. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
375840 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX EUR/HUF from Dukascopy. One instance (row) is one candlestick of one hour. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
43825 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX AUD/USD from Dukascopy. One instance (row) is one candlestick of one day. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1834 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX CHF/SGD from Dukascopy. One instance (row) is one candlestick of one minute. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
375840 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX EUR/HKD from Dukascopy. One instance (row) is one candlestick of one hour. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
43825 instances - 12 features - 2 classes - 0 missing values
# Data Description This is the historical price data of the FOREX AUD/CHF from Dukascopy. One instance (row) is one candlestick of one day. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1833 instances - 12 features - 2 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
14 instances - 5 features - 2 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
14 instances - 5 features - 2 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
14 instances - 5 features - 2 classes - 0 missing values
source: An Algorithm Selection Benchmark for the Container Pre-Marshalling Problem (CPMP) authors: K. Tierney and Y. Malitsky (features) / K. Tierney and D. Pacino and S. Voss (algorithms) translator…
20 runs - 0 likes - 0 downloads - 0 reach - 10 impact
2108 instances - 27 features - 0 classes - 0 missing values
analysis of stocks
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
245 instances - 15 features - classes - 0 missing values
This dataset is an artificial simulation of the Duffing system with random changes from the chaotic to the non-chaotic regime at different noise levels.
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
2493200 instances - 26 features - classes - 0 missing values
This dataset is an artificial simulation of the Duffing system with one phase transition to the chaotic regime.
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
9983 instances - 4 features - classes - 0 missing values
punch sound
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
221 instances - 1 features - classes - 0 missing values
Training dataset of the 'Porto Seguro's Safe Driver Prediction' Kaggle challenge [https://www.kaggle.com/c/porto-seguro-safe-driver-prediction]. The goal was to predict whether a driver will file an…
2 runs - 0 likes - 0 downloads - 0 reach - 12 impact
595212 instances - 38 features - 2 classes - 846458 missing values
Hourly particulate matter air pollution data of Great Britain for the year 2017, provided by Ricardo Energy and Environment on behalf of the UK Department for Environment, Food and Rural Affairs…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
394299 instances - 10 features - 0 classes - 0 missing values
Trip Record Data provided by the New York City Taxi and Limousine Commission (TLC) [http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml]. The dataset includes TLC trips of the green line in…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
581835 instances - 15 features - 0 classes - 0 missing values
#modelage
31 runs - 0 likes - 0 downloads - 0 reach - 8 impact
202 instances - 20 features - 2 classes - 17 missing values
Source: C. Okan Sakar a, Gorkem Serbes b, Aysegul Gunduz c, Hunkar C. Tunc a, Hatice Nizam d, Betul Erdogdu Sakar e, Melih Tutuncu c, Tarkan Aydin a, M. Erdem Isenkul d, Hulya Apaydin c a Department…
0 runs - 0 likes - 0 downloads - 0 reach - 12 impact
756 instances - 754 features - 0 classes - 0 missing values
1. Title: Echocardiogram Data 2. Source Information: -- Donor: Steven Salzberg (salzberg@cs.jhu.edu) -- Collector: -- Dr. Evlin Kinney -- The Reed Institute -- P.O. Box 402603 -- Miami, FL 33140-0603…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
132 instances - 8 features - 4 classes - 103 missing values
Context "Predict behavior to retain customers. You can analyze all relevant customer data and develop focused customer retention programs." [IBM Sample Data Sets] Content Each row represents a…
0 runs - 1 likes - 2 downloads - 3 reach - 8 impact
7043 instances - 20 features - 2 classes - 0 missing values
The Inpatient Utilization and Payment Public Use File (Inpatient PUF) provides information on inpatient discharges for Medicare fee-for-service beneficiaries. The Inpatient PUF includes information on…
0 runs - 0 likes - 2 downloads - 2 reach - 8 impact
163065 instances - 12 features - 0 classes - 0 missing values
This dataset contains traffic violation information from all electronic traffic violations issued in the County. Any information that can be used to uniquely identify the vehicle, the vehicle owner or…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1578154 instances - 43 features - 4 classes - 8006541 missing values
Chocolate Bar Ratings. Expert ratings of over 1,700 chocolate bars. Each chocolate is evaluated from a combination of both objective qualities and subjective interpretation. A rating here only…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1795 instances - 9 features - 42 classes - 1 missing values
Groups information on about 7,800 US colleges, including geographical information, statistics about the attending population, and post-graduation career earnings.
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
This dataset reflects incidents of crime in the City of Los Angeles dating back to 2010. This data is transcribed from original crime reports that are typed on paper and therefore there may be some…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
Public procurement data for the European Economic Area, Switzerland, and Macedonia, 2015.
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
10% stratified subsample of the original SVHN data
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
9927 instances - 3073 features - 10 classes - 0 missing values
Public procurement data for the European Economic Area, Switzerland, and Macedonia, 2015.
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
565163 instances - 75 features - 0 classes - 15247061 missing values
Anonymized data of dating profiles from OkCupid
0 runs - 0 likes - 3 downloads - 3 reach - 8 impact
59946 instances - 31 features - 0 classes - 273249 missing values
Chocolate Bar Ratings. Expert ratings of over 1,700 chocolate bars. Each chocolate is evaluated from a combination of both objective qualities and subjective interpretation. A rating here only…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1794 instances - 9 features - 41 classes - 0 missing values
#modelage
28 runs - 0 likes - 0 downloads - 0 reach - 8 impact
202 instances - 13 features - 3 classes - 202 missing values
#modelage
87 runs - 0 likes - 0 downloads - 0 reach - 8 impact
224 instances - 20 features - 6 classes - 205 missing values
50% stratified subsample of the original SVHN data
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
49644 instances - 3073 features - 10 classes - 0 missing values
nfl_games
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
16274 instances - 12 features - classes - 0 missing values
nominal features and target for COMPAS
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
5278 instances - 14 features - 2 classes - 0 missing values
Original data from https://github.com/propublica/compas-analysis/ by ProPublica. The data was subsequently preprocessed and reduced to relevant features for classification. The target variable is…
0 runs - 0 likes - 1 downloads - 1 reach - 10 impact
5278 instances - 14 features - 2 classes - 0 missing values
The dataset contains all the statistics for each player from 2008 to 2016.
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
183978 instances - 42 features - classes - 47301 missing values
The dataset contains the Serie A matches for the 2015-2016 season.
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
379 instances - 38 features - classes - 44 missing values
This dataset contains all Premier League matches from 2008 to 2016, with player statistics taken from Sofifa.
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
2961 instances - 17 features - classes - 0 missing values
This dataset contains, for each Premier League match of 2014-2015, the probabilities generated with the L2F models, as well as match odds.
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
323 instances - 11 features - classes - 0 missing values
This dataset contains all the player names and player ids, taken from Sofifa
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
11009 instances - 3 features - classes - 0 missing values
dataset for feature extraction
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
69 instances - 37 features - classes - 0 missing values
Groups information on about 7,800 US colleges, including geographical information, statistics about the attending population, and post-graduation career earnings.
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
7063 instances - 50 features - 0 classes - 125494 missing values
This dataset reflects incidents of crime in the City of Los Angeles dating back to 2010. This data is transcribed from original crime reports that are typed on paper and therefore there may be some…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
1468825 instances - 26 features - 0 classes - 7881776 missing values
This dataset contains a simulation of the Lorenz attractor with the parameter $\rho$ varying in time. The stable and chaotic regimes alternate.
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
4942 instances - 4 features - classes - 0 missing values
Dataset sales
0 runs - 0 likes - 0 downloads - 0 reach - 10 impact
10738 instances - 15 features - 0 classes - 0 missing values
Test file for ML training
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1599 instances - 12 features - classes - 0 missing values