OpenML
Machine learning, better, together

Data sets: find or add data to analyse
Tasks: download or create scientific tasks
Flows: find or add data analysis flows
Runs: upload and explore all results online

Democratizing Machine Learning

As machine learning is enhancing our ability to understand nature and build a better future, it is crucial that we make it transparent and easily accessible to everyone in research, education and industry. The Open Machine Learning project is an inclusive movement to build an open, organized, online ecosystem for machine learning. We build open source tools to discover (and share) open data from any domain, easily draw them into your favourite machine learning environments, quickly build models alongside (and together with) thousands of other data scientists, analyse your results against the state of the art, and even get automatic advice on how to build better models. Stand on the shoulders of giants and make the world a better place.

Signing up is free and brings you lots of powerful features.
All public data is always openly available.

Open science Machine learning

Identifying the most appropriate machine learning techniques and using them optimally can be challenging for the best of us. OpenML is a place where you can share interesting datasets with the people who love to analyse data, and build the best solutions together, saving you valuable time, increasing your visibility, and speeding up discovery. OpenML links data to algorithms and people, so you can build on the state of the art and learn to teach machines to learn better.

It starts with data

Upload your datasets, or link to them in existing repositories. OpenML automatically versions and analyses each dataset and annotates it with rich metadata to streamline analysis. Search thousands of datasets and import them directly into your code or toolboxes, or quickly find similar datasets together with the machine learning approaches that perform best on them.
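As a rough illustration of importing a dataset directly into code, the sketch below uses the third-party `openml` Python package (`pip install openml`). The wrapper name `load_openml_dataset` is hypothetical, and the call needs network access to the OpenML server; treat it as one possible way to do this, not the only one.

```python
def load_openml_dataset(dataset_id):
    """Download one OpenML dataset and return (features, target, attribute names)."""
    # Imported lazily so that this sketch can be loaded even when the
    # third-party `openml` package is not installed.
    import openml

    # Fetches the dataset description and caches the data locally.
    dataset = openml.datasets.get_dataset(dataset_id)

    # get_data returns (X, y, categorical_indicator, attribute_names);
    # the target column is the one the dataset declares as its default.
    X, y, _categorical, names = dataset.get_data(
        target=dataset.default_target_attribute
    )
    return X, y, names
```

Calling `load_openml_dataset(61)` would, assuming the server is reachable, return the features and target of the dataset with ID 61; the ID is only an example.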

Collaborative science - easy benchmarking

Create tasks that tell people what needs to be done with the data (e.g. classification). OpenML creates machine-readable protocols to train and evaluate models, so you can focus on the science. Try many techniques on the same task and compare directly to the state of the art, or evaluate one technique on hundreds of tasks at once.
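To see why a machine-readable protocol makes results comparable, consider the simplified sketch below. It is not how OpenML generates its splits; `make_folds` is a hypothetical helper showing that once the folds are fixed by the task (here through a seed), everyone trains and tests on exactly the same rows.

```python
import random


def make_folds(n_rows, n_folds, seed):
    """Return one (train_indices, test_indices) pair per fold.

    The same (n_rows, n_folds, seed) always yields the same splits, which is
    what lets independent experiments be compared directly.
    """
    if not 2 <= n_folds <= n_rows:
        raise ValueError("n_folds must be between 2 and n_rows")

    indices = list(range(n_rows))
    # A private Random instance keeps the shuffle reproducible and isolated
    # from any other use of the global random state.
    random.Random(seed).shuffle(indices)

    # Deal the shuffled rows into n_folds groups, round-robin.
    folds = [indices[i::n_folds] for i in range(n_folds)]

    splits = []
    for k, test in enumerate(folds):
        # Every row that is not in the held-out fold is used for training.
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((sorted(train), sorted(test)))
    return splits
```

Two people calling `make_folds(1000, 10, seed=42)` obtain identical folds, so their scores refer to the same held-out rows.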

Code Integrations

OpenML integrates seamlessly into existing data science environments, so you can readily use it. With a few lines of code or a few clicks, you can import datasets, build algorithms locally, upload models, and (at any time) download your own and other people's workflows, models and evaluations for reuse and further analysis.
OpenML is directly integrated into the most popular machine learning tools, but you can also build your own integrations with the Python, R, Java, and C++ APIs, or program against the REST API.
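For programming against the REST API, a minimal standard-library sketch might look like the following. It assumes the JSON endpoints follow the pattern `https://www.openml.org/api/v1/json/<kind>/<id>`; the helper names `entity_url` and `fetch_entity` are hypothetical, and the fetch needs network access.

```python
import json
import urllib.request

# Assumed base URL of the JSON flavour of the REST API.
BASE_URL = "https://www.openml.org/api/v1/json"

# The four entity types described on this page, as they appear in URLs.
KINDS = {"data", "task", "flow", "run"}


def entity_url(kind, entity_id):
    """Build the URL of the JSON description of one entity."""
    if kind not in KINDS:
        raise ValueError(f"unknown entity kind: {kind!r}")
    return f"{BASE_URL}/{kind}/{int(entity_id)}"


def fetch_entity(kind, entity_id, timeout=30):
    """Download and parse the JSON description (requires network access)."""
    url = entity_url(kind, entity_id)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Under these assumptions, `fetch_entity("data", 61)` would return the parsed description of the dataset with ID 61 as a Python dictionary.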

Reproducible, reusable, transparent research

The OpenML integrations make sure that all uploaded results are linked to the exact versions of the datasets, workflows, and software used, and to the people involved. We generate predictions locally following exact procedures, and evaluate them server-side so that results are directly comparable and reusable in further work. Wherever possible, we extract clear descriptions of machine learning workflows and models.
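The idea of linking a result to exact versions, and of scoring locally generated predictions in one central place, can be sketched as follows. The `RunRecord` fields and the `accuracy` function are hypothetical illustrations and do not reflect OpenML's actual schema or evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """Everything needed to trace a result back to its origin (illustrative)."""
    task_id: int
    dataset_version: int
    flow_name: str
    flow_version: str
    uploader: str
    predictions: tuple  # one predicted label per test row, produced locally


def accuracy(predictions, truth):
    """Score predictions against the true labels, as a central evaluator could."""
    if len(predictions) != len(truth):
        raise ValueError("predictions and truth must have the same length")
    if not truth:
        raise ValueError("cannot evaluate an empty set of predictions")
    correct = sum(p == t for p, t in zip(predictions, truth))
    return correct / len(truth)


# Example: a run uploads its predictions; the evaluator holds the true labels.
run = RunRecord(
    task_id=1,
    dataset_version=1,
    flow_name="example.classifier",
    flow_version="0.1",
    uploader="alice",
    predictions=("a", "b", "a", "b"),
)
score = accuracy(run.predictions, ("a", "b", "a", "a"))  # 3 of 4 correct
```

Because every run is scored by the same function on the same labels, two scores differ only because the predictions differ, not because the evaluation did.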