General-Purpose Datasets

Overview: This category covers datasets that were originally created for general machine learning research. They are nonetheless commonly used in federated learning experiments because high-quality datasets built specifically for federated learning are scarce.

  • Datasets From FedML – 2020
    Overview: FedML is a research library that provides both a federated learning framework and benchmarking functionality. As a benchmark, it provides comprehensive baseline implementations for multiple ML models and FL […]

  • Datasets From LEAF – 2018
    Overview: LEAF is one of the earliest dataset proposals for federated learning. It contains six datasets covering different domains, including image classification, sentiment analysis and next-character prediction. A set of […]

  • Datasets From FedScale – 2021
    Description: This repository contains scripts and instructions for building FedScale, a diverse set of challenging and realistic benchmark datasets to facilitate scalable, comprehensive, and reproducible federated learning (FL) research. FedScale […]
