Nimble is a new columnar file format for efficiently storing and processing large, wide datasets, particularly those used in machine learning workflows.
Nimble, a C++ columnar file format by Meta, aims to replace formats like Apache Parquet for extremely wide datasets in machine learning and data analytics. It features an extensible encoding system, allowing users to define and apply cascading encodings, and uses Flatbuffers for lighter metadata and block encoding for predictable memory usage. The project encourages using its unified C++ library, with details available in the `README.md`.
Nimble is a new columnar file format for efficiently storing and processing large, wide datasets, particularly those used in machine learning workflows.
Developers building data systems that handle vast, multi-column datasets for machine learning or analytics, seeking an alternative to existing columnar formats.