OpenML
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
Employee remuneration and expenses (earning over 75,000 CAD per year). This data set includes remuneration and expenses from employees earning over 75,000 CAD per year. Attributes: NAME: Name of…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
uci adult partitioned
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
48844 instances - 17 features - classes - 6495 missing values
uci
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
30000 instances - 27 features - classes - 0 missing values
uci
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
101766 instances - 52 features - classes - 192849 missing values
hmeq_p,BAD,binary
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5960 instances - 15 features - classes - 5271 missing values
kaggle_santander_p
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
200000 instances - 203 features - classes - 0 missing values
Synthetic 2-d data with N=5000 vectors and k=15 Gaussian clusters with different degrees of cluster overlap. P. Fränti and O. Virmajoki, "Iterative shrinking method for clustering…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5000 instances - 3 features - 0 classes - 0 missing values
Public procurement data for the European Economic Area, Switzerland, and Macedonia, 2015.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
565163 instances - 75 features - 0 classes - 15247061 missing values
Anonymized data of dating profiles from OkCupid
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
59946 instances - 31 features - 0 classes - 273249 missing values
Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1460 instances - 81 features - 0 classes - 6965 missing values
Chocolate Bar Ratings. Expert ratings of over 1,700 chocolate bars. Each chocolate is evaluated from a combination of both objective qualities and subjective interpretation. A rating here only…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1794 instances - 9 features - 41 classes - 0 missing values
#modelage
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
202 instances - 13 features - 3 classes - 202 missing values
#modelage
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
224 instances - 20 features - 6 classes - 205 missing values
#modelage
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
202 instances - 20 features - 2 classes - 17 missing values
Data Description: This is the historical price data of the FOREX EUR/CAD from Dukascopy. One instance (row) is one candlestick of one hour. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
43825 instances - 12 features - 2 classes - 0 missing values
Data Description: This is the historical price data of the FOREX USD/CHF from Dukascopy. One instance (row) is one candlestick of one minute. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
375840 instances - 12 features - 2 classes - 0 missing values
Data Description: This is the historical price data of the FOREX EUR/SGD from Dukascopy. One instance (row) is one candlestick of one hour. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
43825 instances - 12 features - 2 classes - 0 missing values
Data Description: This is the historical price data of the FOREX EUR/SEK from Dukascopy. One instance (row) is one candlestick of one hour. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
43825 instances - 12 features - 2 classes - 0 missing values
Data Description: This is the historical price data of the FOREX USD/DKK from Dukascopy. One instance (row) is one candlestick of one day. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1832 instances - 12 features - 2 classes - 0 missing values
Data set of articles in around 45 languages and 25 categories.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
65428 instances - 3 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - 3 classes - 0 missing values
The BoT-IoT dataset was created by designing a realistic network environment in the Cyber Range Lab of the center of UNSW Canberra Cyber. The environment incorporates a combination of normal and…
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
Wine data gathered by https://www.kaggle.com/zynicide . The data was scraped from WineEnthusiast during the week of June 15th, 2017. The code for the scraper can be found at…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150930 instances - 10 features - classes - 174477 missing values
Data collected from the Kickstarter platform. You'll find the most useful data for project analysis. Columns are self-explanatory except: usd_pledged: conversion in US dollars of the pledged column…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
331675 instances - 14 features - classes - 210 missing values
The ILPD dataset from the OpenCC18 with all categorical variables label encoded
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
583 instances - 11 features - 0 classes - 0 missing values
The sick dataset from the OpenCC18 with all categorical data label encoded so all data is numeric
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3772 instances - 30 features - classes - 0 missing values
The ILPD liver dataset from the OpenCC18 with the gender binary encoded so all features are numeric
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
583 instances - 11 features - 2 classes - 0 missing values
The sick dataset from the OpenCC18 with all textual binary variables label encoded.
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
3772 instances - 30 features - 2 classes - 0 missing values
test openml upload
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - 3 classes - 0 missing values
UserID
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1974675 instances - 10 features - classes - 1974675 missing values
web services evaluations in this table
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
1974675 instances - 10 features - classes - 1974675 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - 3 classes - 0 missing values
Daily air quality measurements in New York, May to September 1973. This data is taken from R.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
Daily air quality measurements in New York, May to September 1973. This data is taken from R.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
Daily air quality measurements in New York, May to September 1973. This data is taken from R.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
Daily air quality measurements in New York, May to September 1973. This data is taken from R.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
The Inpatient Utilization and Payment Public Use File (Inpatient PUF) provides information on inpatient discharges for Medicare fee-for-service beneficiaries. The Inpatient PUF includes information on…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
163065 instances - 12 features - 0 classes - 0 missing values
This dataset contains traffic violation information from all electronic traffic violations issued in the County. Any information that can be used to uniquely identify the vehicle, the vehicle owner or…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1578154 instances - 43 features - 4 classes - 8006541 missing values
Chocolate Bar Ratings. Expert ratings of over 1,700 chocolate bars. Each chocolate is evaluated from a combination of both objective qualities and subjective interpretation. A rating here only…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1795 instances - 9 features - 42 classes - 1 missing values
Groups information on about 7800 different US colleges, including geographical information, statistics about the attending population, and post-graduation career earnings.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
This dataset reflects incidents of crime in the City of Los Angeles dating back to 2010. This data is transcribed from original crime reports that are typed on paper and therefore there may be some…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
Public procurement data for the European Economic Area, Switzerland, and Macedonia, 2015.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10% stratified subsample of the original SVHN data
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
9927 instances - 3073 features - 10 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
5 runs - 0 likes - 0 downloads - 0 reach - 7 impact
10000 instances - 7201 features - 10 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
4 runs - 0 likes - 0 downloads - 0 reach - 7 impact
65196 instances - 28 features - 100 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
6 runs - 0 likes - 0 downloads - 0 reach - 8 impact
20000 instances - 4297 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
6 runs - 0 likes - 0 downloads - 0 reach - 8 impact
20000 instances - 4297 features - 2 classes - 0 missing values
This is the dataset used for the 2016 IDA Industrial Challenge, courtesy of Scania. For a full description, see http://archive.ics.uci.edu/ml/datasets/IDA2016Challenge . This dataset contains both the…
7 runs - 0 likes - 0 downloads - 0 reach - 7 impact
76000 instances - 171 features - 2 classes - 1078695 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
163 instances - 28 features - 5 classes - 9 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs - 0 likes - 0 downloads - 0 reach - 5 impact
379 instances - 9 features - 4 classes - 1418 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101079, and it has 125 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
125 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11154, and it has 688 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
688 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10450, and it has 214 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
214 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 137, and it has 3689 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
3689 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 118, and it has 1362 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1362 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101602, and it has 79 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
79 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101324, and it has 79 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
79 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100424, and it has 97 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
97 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11868, and it has 519 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
519 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 103441, and it has 74 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
74 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11402, and it has 413 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
413 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12787, and it has 10 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
10 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12090, and it has 1312 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1312 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 20023, and it has 60 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
60 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12131, and it has 111 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
111 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11199, and it has 104 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
104 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11942, and it has 1002 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1002 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100962, and it has 35 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
35 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12227, and it has 1510 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1510 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12735, and it has 646 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
646 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101222, and it has 78 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
78 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101237, and it has 10 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
10 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 216, and it has 520 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
520 instances - 1026 features - 0 classes - 0 missing values