Creating Data Frames

data.frame(___ = ___,
           ___ = ___,
           ...)

Data frames hold tabular data organized in columns, also called attributes. Each column is a vector, and different columns can hold different data types, such as numbers or characters. The data.frame() function constructs a data frame object by combining vectors into a table. To form a table, the vectors must have equal lengths. A data frame can therefore be seen as a collection of equal-length vectors placed side by side.
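
The equal-length requirement can be checked with a small sketch. The vectors id and label below are hypothetical and not part of the tutorial's data; tryCatch() is used only to capture the error message instead of stopping the script.

```r
# Hypothetical example vectors (not taken from the tutorial's data)
id    <- c(1, 2, 3)
label <- c("a", "b", "c")

# Equal lengths: the vectors combine into a table with 3 rows and 2 columns.
ok <- data.frame(id = id, label = label)
print(dim(ok))

# Lengths 3 and 2 do not fit together, so data.frame() signals an error.
# tryCatch() turns that error into its message text.
result <- tryCatch(
  data.frame(id = id, label = c("a", "b")),
  error = function(e) conditionMessage(e)
)
print(result)
```

One caveat: R recycles a shorter vector when its length divides the number of rows evenly, so a length mismatch is only rejected when recycling is not possible.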

Let’s create our first data frame describing four people by their ids, their names and an indicator of whether they are female. Each attribute is created as a separate vector with its own data type (numeric, character and logical). The attributes are then combined into a table using the data.frame() function:

data.frame(
  c(1, 2, 3, 4),
  c("Louisa", "Jonathan", "Luigi", "Rachel"),
  c(TRUE, FALSE, FALSE, TRUE)
)

  c.1..2..3..4. c..Louisa....Jonathan....Luigi....Rachel..
1             1                                     Louisa
2             2                                   Jonathan
3             3                                      Luigi
4             4                                     Rachel
  c.TRUE..FALSE..FALSE..TRUE.
1                        TRUE
2                       FALSE
3                       FALSE
4                        TRUE

The resulting data frame stores the values of each vector in its own column, giving four rows and three columns. However, the column names printed above the values are unhelpful: because no names were supplied, R derived them from the code of each vector, replacing characters such as parentheses, commas and quotes with dots.
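
To look at the generated names and the table's dimensions directly, a short sketch like the following may help. The variable name people is an assumption made for illustration; the tutorial itself does not assign the data frame to a variable.

```r
# Build the data frame without column names, as described above.
# The variable name `people` is hypothetical.
people <- data.frame(
  c(1, 2, 3, 4),
  c("Louisa", "Jonathan", "Luigi", "Rachel"),
  c(TRUE, FALSE, FALSE, TRUE)
)

# Column names that R generated from the unnamed arguments
print(names(people))

# Number of rows and columns
print(dim(people))
```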

Column names can be supplied in the data.frame() call as argument names placed before each column vector. To give the previous data frame meaningful column names, we can write:

data.frame(
  id = c(1, 2, 3, 4),
  name = c("Louisa", "Jonathan", "Luigi", "Rachel"),
  female = c(TRUE, FALSE, FALSE, TRUE)
)

  id     name female
1  1   Louisa   TRUE
2  2 Jonathan  FALSE
3  3    Luigi  FALSE
4  4   Rachel   TRUE

The resulting data frame includes the column names needed to see the actual meaning of the different columns.
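
As a rough illustration of what named columns make possible, the sketch below stores the data frame in a variable and inspects it by name. The variable name people and the explicit stringsAsFactors = FALSE argument are assumptions added here; the latter keeps the name column a character vector in R versions before 4.0, where strings were converted to factors by default.

```r
# The variable name `people` is hypothetical; the columns match the example above.
people <- data.frame(
  id = c(1, 2, 3, 4),
  name = c("Louisa", "Jonathan", "Luigi", "Rachel"),
  female = c(TRUE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE  # assumption: avoid factor conversion in R < 4.0
)

# The column names are now exactly the argument names.
print(names(people))

# A single column can be accessed by its name with `$`.
print(people$name)

# Each column keeps the data type of the vector it was built from.
print(sapply(people, class))
```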