Introduction

The data.table package enhances R with fast data wrangling for small and large data sets. Since data.table inherits from data.frame, it can for the most part be used as a drop-in replacement for data.frame. The design of data.table is influenced by the desire to have an efficient tool for SQL-like operations from within R. Here, efficiency refers not only to speed but also to reducing the amount of code the user has to write.
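
To make the two claims above concrete, here is a minimal sketch, assuming the data.table package is installed. The table `dt` and its columns `group` and `value` are made up for illustration. It checks that a data.table is still a data.frame and runs one query whose parts roughly correspond to WHERE, SELECT and GROUP BY in SQL.

```r
library(data.table)

# Hypothetical toy data, invented for this illustration
dt <- data.table(
  group = c("a", "a", "b", "b", "c"),
  value = c(1, 2, 3, 4, 5)
)

# A data.table inherits from data.frame, so this check succeeds
is_df <- is.data.frame(dt)

# General form dt[i, j, by]:
#   i  = value > 1              (filter rows, like WHERE)
#   j  = .(total = sum(value))  (compute columns, like SELECT)
#   by = group                  (grouping, like GROUP BY)
result <- dt[value > 1, .(total = sum(value)), by = group]

print(is_df)
print(result)
```

The whole filter, aggregate and group step fits in a single bracket expression, which is the kind of code reduction the paragraph above refers to.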

The main motivation for using data.table is typically its speed. It is fast enough that some of the authors of dplyr created dtplyr, a data.table back-end for dplyr, to harness the speed of data.table.

Benchmarks

The page https://h2oai.github.io/db-benchmark/ provides several benchmarks that compare the performance of data.table with that of similar software.

Matt Dowle’s view on data.table

Matt Dowle is the creator and still an active maintainer of data.table. In the video below (recorded at the useR! 2014 conference) he shares his motivation for creating data.table.