OpenML
Electrical Length data set. This problem, with only two input variables, involves a small search space (low complexity). However, it is still an interesting problem since the system is strongly…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
495 instances - 3 features - 0 classes - 0 missing values
Electrical-Maintenance data set. This problem consists of four input variables, and the available data set comprises a representative number of well-distributed examples. In this case, the…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
1056 instances - 5 features - 0 classes - 0 missing values
Forest Fires Data Set. This is a difficult regression task, where the aim is to predict the burned area of forest fires in the northeast region of Portugal by using meteorological and other data.…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
517 instances - 13 features - 0 classes - 0 missing values
This data set was originally a univariate time record of a single observed quantity, recorded from a Far-Infrared-Laser in a chaotic state. The original set of 1000 points has been adapted for regression…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
993 instances - 5 features - 0 classes - 0 missing values
Dataset includes construction cost, sale prices, project variables, and economic variables corresponding to real estate single-family residential apartments in Tehran, Iran. Totally 105: 8 project…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
372 instances - 109 features - 0 classes - 0 missing values
This file contains economic data for the USA from 01/04/1980 to 02/04/2000 on a weekly basis. From the given features, the goal is to predict the 1 Month CD Rate. 1. 1Y-CMaturityRate real [77.055,…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
1049 instances - 16 features - 0 classes - 0 missing values
This file contains weather information for Ankara from 01/01/1994 to 28/05/1998. From the given features, the goal is to predict the mean temperature. 1. Max_temperature real [23.0, 100.0] 2.…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
321 instances - 10 features - 0 classes - 0 missing values
This file contains weather information for Izmir from 01/01/1994 to 31/12/1997. From the given features, the goal is to predict the mean temperature. 1. Max_temperature real [36.7, 105.0] 2.…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
1461 instances - 10 features - 0 classes - 0 missing values
Prediction of residuary resistance of sailing yachts at the initial design stage is of great value for evaluating the ship’s performance and for estimating the required propulsive…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
308 instances - 7 features - 0 classes - 0 missing values
This data addresses student achievement in secondary education at two Portuguese schools. The data attributes include student grades and demographic, social, and school-related features, and it was…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
649 instances - 33 features - 0 classes - 0 missing values
This data addresses student achievement in secondary education at two Portuguese schools. The data attributes include student grades and demographic, social, and school-related features, and it was…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
395 instances - 33 features - 0 classes - 0 missing values
Daily electric energy dataset. The dee problem involves predicting the daily average price of TkWhe electricity energy in Spain. The data set contains real values from 2003 about the daily consumption…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
365 instances - 7 features - 0 classes - 0 missing values
Auto MPG (6 variables) dataset. The data concerns city-cycle fuel consumption in miles per gallon (Mpg), to be predicted in terms of 1 multivalued discrete and 5 continuous attributes (two multivalued…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
392 instances - 6 features - 0 classes - 0 missing values
This is an artificial data set with dependencies between the attribute values. The cases are generated using the following method: X1 : uniformly distributed over [-5,5] X2 : uniformly distributed…
3 runs - 1 likes - 5 downloads - 6 reach - 13 impact
40768 instances - 11 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
9 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
10 instances - 1143 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag.
2 runs - 0 likes - 2 downloads - 2 reach - 9 impact
2178 instances - 4 features - 0 classes - 0 missing values
The task consists of learning Quantitative Structure Activity Relationships (QSARs): the inhibition of dihydrofolate reductase by pyrimidines. The data are described in: King, Ross D., Muggleton,…
6 runs - 0 likes - 2 downloads - 2 reach - 9 impact
74 instances - 28 features - 0 classes - 0 missing values
Information about the dataset CLASSTYPE: numeric CLASSINDEX: last
2 runs - 0 likes - 0 downloads - 0 reach - 13 impact
559 instances - 5 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
80 instances - 113 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
34 instances - 1143 features - 0 classes - 0 missing values
Datasets of the Data And Story Library, a project illustrating the use of basic statistical methods, converted to arff format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
2 runs - 0 likes - 0 downloads - 0 reach - 14 impact
59 instances - 16 features - 0 classes - 0 missing values
Titanic survival prediction
0 runs - 0 likes - 1 downloads - 1 reach - 7 impact
891 instances - 8 features - 0 classes - 0 missing values
AutoML challenge 2014. Original task: regression. Test and validation sets can be obtained on the ChaLearn website: https://automl.chalearn.org/data
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
99 instances - 200001 features - 0 classes - 0 missing values
Title: Flora. Source: https://automl.chalearn.org/data. Dataset from the first ChaLearn AutoML challenge (2014). Only the training data is included, as there were no labels for validation and…
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
15000 instances - 200001 features - 0 classes - 0 missing values
Version with corrected feature types. 'PrivacySuppressed' values are converted to None. Gathers information on about 7800 different US colleges, including geographical information, stats about the…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
7063 instances - 47 features - 0 classes - 104305 missing values
Bike sharing systems are a new generation of traditional bike rentals in which the whole process, from membership to rental and return, has become automatic. Through these systems, a user is able to easily rent…
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
17379 instances - 13 features - 0 classes - 0 missing values
Bike sharing systems are a new generation of traditional bike rentals in which the whole process, from membership to rental and return, has become automatic. Through these systems, a user is able to easily rent…
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
17379 instances - 13 features - 0 classes - 0 missing values
The Inpatient Utilization and Payment Public Use File (Inpatient PUF) provides information on inpatient discharges for Medicare fee-for-service beneficiaries. The Inpatient PUF includes information on…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
163065 instances - 12 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
32 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
22 instances - 111 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 1 likes - 1 downloads - 2 reach - 13 impact
4450 instances - 203 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
19 instances - 10 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
26 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
13 instances - 1143 features - 0 classes - 0 missing values
The problem is to learn a regression equation/rule/tree to predict the activity from the descriptive structural attributes. The data and methodology are described in detail in: - King, Ross D., Hurst,…
5 runs - 0 likes - 1 downloads - 1 reach - 9 impact
186 instances - 61 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Gasoline consumption is being treated as…
2 runs - 0 likes - 0 downloads - 0 reach - 9 impact
27 instances - 5 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
37 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
15 instances - 10 features - 0 classes - 0 missing values
Information about the dataset CLASSTYPE: numeric CLASSINDEX: last
2 runs - 0 likes - 1 downloads - 1 reach - 13 impact
559 instances - 5 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
22 instances - 40 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
144 instances - 77 features - 0 classes - 0 missing values
Data on the population density of tree pipits, Anthus trivialis, in Franconian oak forests, including variables describing the forest ecosystem. This data is taken from the R package coin. This study is…
0 runs - 0 likes - 2 downloads - 2 reach - 13 impact
86 instances - 10 features - 0 classes - 0 missing values
It has 3 attributes (ID, tweet, label) and 91298 tweets: 39998 non-sarcastic and 51300 sarcastic.
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
91298 instances - 2 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
1000000 instances - 33 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
78732 instances - 11 features - 0 classes - 0 missing values
A family of datasets synthetically generated from a simulation of how bank-customers choose their banks. Tasks are based on predicting the fraction of bank customers who leave the bank because of full…
0 runs - 0 likes - 2 downloads - 2 reach - 13 impact
8192 instances - 33 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 12 impact
177147 instances - 11 features - 0 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: Vikas Sindhwani for the SVMlin project.
0 runs - 0 likes - 3 downloads - 3 reach - 16 impact
72309 instances - 20959 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 37 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1000000 instances - 14 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 3 downloads - 3 reach - 12 impact
116640 instances - 10 features - 0 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs - 0 likes - 1 downloads - 1 reach - 16 impact
32561 instances - 124 features - 0 classes - 0 missing values
1. Title: Lecturers Evaluation (Ordinal LEV) 2. Source Information: Donor: Arie Ben David MIS, Dept. of Technology Management Holon Academic Inst. of Technology 52 Golomb St. Holon 58102 Israel…
0 runs - 1 likes - 2 downloads - 3 reach - 13 impact
1000 instances - 5 features - 0 classes - 0 missing values
Building projectable classifiers of arbitrary complexity. In Proceedings of the 13th International Conference on Pattern Recognition, pages 880-885, Vienna, Austria, August 1996. Dataset from the…
0 runs - 0 likes - 3 downloads - 3 reach - 16 impact
862 instances - 3 features - 0 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs - 0 likes - 3 downloads - 3 reach - 16 impact
48842 instances - 124 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
500 instances - 6 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
177147 instances - 11 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 12 impact
1000000 instances - 26 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
1000000 instances - 33 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 12 impact
1000000 instances - 91 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 3 downloads - 3 reach - 12 impact
31104 instances - 10 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
500 instances - 26 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 3 downloads - 3 reach - 9 impact
59049 instances - 10 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
20 instances - 10 features - 0 classes - 0 missing values
Michel Lang fRMA-normalized. Only "Kratz-genes"*. * (see: A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international…
3 runs - 0 likes - 3 downloads - 3 reach - 12 impact
442 instances - 24 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
531441 instances - 12 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 12 impact
1000000 instances - 19 features - 0 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs - 0 likes - 2 downloads - 2 reach - 16 impact
32561 instances - 124 features - 0 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs - 0 likes - 0 downloads - 0 reach - 16 impact
32561 instances - 124 features - 0 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: Regenerate features by the authors' matlab scripts (see Sec. C of Appendix A), then randomly select 10% instances from the…
0 runs - 0 likes - 2 downloads - 2 reach - 16 impact
98528 instances - 101 features - 0 classes - 0 missing values
The BoT-IoT dataset was created by designing a realistic network environment in the Cyber Range Lab of the center of UNSW Canberra Cyber. The environment incorporates a combination of normal and…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
3668522 instances - 45 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
1000000 instances - 16 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
1000000 instances - 14 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 12 impact
17496 instances - 10 features - 0 classes - 0 missing values
Datasets of the Data And Story Library, a project illustrating the use of basic statistical methods, converted to arff format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
200 instances - 20 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1000000 instances - 22 features - 0 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository.
0 runs - 0 likes - 1 downloads - 1 reach - 16 impact
49749 instances - 301 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 17 impact
1000 instances - 25 features - 0 classes - 0 missing values
Datasets of the Data And Story Library, a project illustrating the use of basic statistical methods, converted to arff format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
3 runs - 0 likes - 3 downloads - 3 reach - 13 impact
50 instances - 5 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 20157, and it has 63 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
63 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100817, and it has 14 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
14 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 12786, and it has 624 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
624 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10116, and it has 399 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
399 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101582, and it has 36 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
36 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 193, and it has 199 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
199 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 30007, and it has 534 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
534 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 61, and it has 2076 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
2076 instances - 1026 features - 0 classes - 0 missing values
Subset of KITS dataset with 100 images
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 27649 features - 0 classes - 0 missing values
Subset of KITS dataset with 100 images
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 27649 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11758, and it has 213 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
213 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100843, and it has 16 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
16 instances - 1026 features - 0 classes - 0 missing values
Since the first automobile, the Benz Patent Motor Car in 1886, Mercedes-Benz has stood for important automotive innovations. These include, for example, the passenger safety cell with crumple zone,…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
4209 instances - 377 features - 0 classes - 0 missing values
AutoML challenge 2014. Original task: regression. Test and validation sets can be obtained on the ChaLearn website: https://automl.chalearn.org/data
0 runs - 0 likes - 0 downloads - 0 reach - 4 impact
400000 instances - 101 features - 0 classes - 0 missing values
Abstract: This data-set contains examples of buzz events from two different social networks: Twitter, and Tom's Hardware, a forum network focusing on new technology with more conservative dynamics.…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
583250 instances - 78 features - 0 classes - 0 missing values
When you've been devastated by a serious car accident, your focus is on the things that matter the most: family, friends, and other loved ones. Pushing paper with your insurance agent is the last…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
188318 instances - 131 features - 0 classes - 0 missing values
File README: smoothmeth. A collection of the data sets used in the book "Smoothing Methods in Statistics," by Jeffrey S. Simonoff, Springer-Verlag, New York, 1996. Submitted by Jeff Simonoff…
0 runs - 0 likes - 0 downloads - 0 reach - 15 impact
2178 instances - 4 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 11, and it has 5742 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 12 impact
5742 instances - 1026 features - 0 classes - 0 missing values
This classic dataset contains the prices and other attributes of almost 54,000 diamonds. It's a great dataset for beginners learning to work with data analysis and visualization. Content price price…
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
53940 instances - 10 features - 0 classes - 0 missing values
source: http://plato.asu.edu/ftp/solvable.html authors: Rolf-David Bergdoll PAR10 performances of modern solvers on the solvable instances of MIPLIB2010. http://miplib.zib.de/ The algorithm runtime…
0 runs - 0 likes - 0 downloads - 0 reach - 10 impact
1090 instances - 148 features - 0 classes - 0 missing values
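The last entry above reports PAR10 performances. As a rough illustration only, PAR10 (penalized average runtime) is commonly computed by averaging runtimes while counting every run that fails or reaches the time limit as ten times that limit. The sketch below assumes this common definition; the function name `par10`, the cutoff value, and the sample runtimes are hypothetical and are not taken from the dataset.

```python
def par10(runtimes, cutoff):
    """Penalized average runtime with a penalty factor of 10.

    runtimes: runtimes in seconds; None marks a run that failed outright.
    cutoff:   time limit in seconds; runs at or above it count as timeouts.
    """
    if not runtimes:
        raise ValueError("runtimes must not be empty")
    penalized = [
        t if t is not None and t < cutoff else 10 * cutoff
        for t in runtimes
    ]
    return sum(penalized) / len(penalized)


# Hypothetical runtimes for one solver on four instances (not real data).
sample_runtimes = [12.0, 3600.0, 450.5, None]
score = par10(sample_runtimes, cutoff=3600.0)
print(f"PAR10 = {score:.3f} s")
```

With this convention, a solver that times out on even a few instances is penalized heavily, which is why PAR10 is often preferred over a plain mean runtime when comparing solvers.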