Ultrafast NVMe Sampler

Bartłomiej Romański, March 2, 2018

Last Updated on: 13th June 2024, 08:14 pm

Random batch generation at 6 GB/s (or 5M records/s).

Tired of shuffling your training data each and every epoch? You really need to check this out!

Paweł from our data science team just open-sourced his NVMe Sampler, a library we use at RTB House while training our PyTorch models. With a bunch of NVMe drives, libaio, and some black performance magic, this little tool can generate random batches for you at an astonishing speed: over 6 GB/s (or 5M records/s).
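
To make the idea concrete, here is a minimal sketch of random batch generation from a file of fixed-size records. It is not the NVMe Sampler API: the record layout, `RECORD_SIZE`, and the `write_records`, `sample_batch`, and `demo` helpers are all hypothetical, it samples with replacement (an assumption), and it uses plain synchronous reads on a single file where the real library relies on libaio and several NVMe drives. It only shows why no per-epoch shuffle is needed: each batch is assembled by reading records at random offsets.

```python
import os
import random
import struct
import tempfile

RECORD_SIZE = 16  # bytes per record; hypothetical fixed-size layout


def write_records(path, num_records):
    """Write fixed-size records: an 8-byte little-endian id plus 8 padding bytes."""
    with open(path, "wb") as f:
        for i in range(num_records):
            f.write(struct.pack("<Q8x", i))


def sample_batch(f, num_records, batch_size, rng):
    """Read batch_size records at uniformly random positions (with replacement)."""
    batch = []
    for _ in range(batch_size):
        index = rng.randrange(num_records)
        f.seek(index * RECORD_SIZE)  # jump straight to the chosen record
        batch.append(f.read(RECORD_SIZE))
    return batch


def demo(num_records=1000, batch_size=32, seed=0):
    """Build a small temporary dataset and return the ids of one random batch."""
    rng = random.Random(seed)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "records.bin")
        write_records(path, num_records)
        with open(path, "rb") as f:
            batch = sample_batch(f, num_records, batch_size, rng)
    # Decode the record id stored in the first 8 bytes of each record.
    return [struct.unpack("<Q8x", rec)[0] for rec in batch]


if __name__ == "__main__":
    print(demo())
```

On a spinning disk this access pattern would be painfully slow. It becomes practical on NVMe drives because they handle many small random reads well, especially when the reads are issued asynchronously and in parallel.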

See the README for all the details. Here’s just a little architecture preview:

Never wait for data shuffling again!
