Creating Data Frames
data.frame(___ = ___, ___ = ___, ...)
Data frames hold tabular data, with each column representing one attribute. Each column is a vector with a single data type, such as numeric or character, and different columns may have different types. The data.frame() function constructs a data frame object by combining vectors into a table. To form a table, the vectors must have equal lengths. A data frame can therefore be seen as a collection of equal-length vectors joined side by side.
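As a minimal sketch of the equal-length requirement, the following builds a small data frame from three vectors and then tries a combination whose lengths do not fit. The variable names and city values are hypothetical and chosen only for illustration.

```r
# Three vectors of equal length, one per attribute (hypothetical values)
id      <- c(1, 2, 3)
city    <- c("Oslo", "Lima", "Cairo")
coastal <- c(TRUE, TRUE, FALSE)

# Combine the vectors into a table; each vector becomes one column
cities <- data.frame(id, city, coastal)
dim(cities)   # number of rows, then number of columns

# Vectors of length 3 and 2 cannot form a table, so data.frame() raises
# an error; tryCatch() captures the error message instead of stopping
result <- tryCatch(
  data.frame(id = c(1, 2, 3), city = c("Oslo", "Lima")),
  error = function(e) conditionMessage(e)
)
result
```

Note that R recycles a shorter vector when its length divides the longest one evenly (for example, a single value is repeated for every row), so only lengths that do not fit produce the error.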
Let’s create our first data frame describing four people: their ids, their names and an indicator of whether they are female. Each attribute is stored in its own vector with its own data type (numeric, character and logical). The vectors are then combined into a table using the data.frame() function:
data.frame(
  c(1, 2, 3, 4),
  c("Louisa", "Jonathan", "Luigi", "Rachel"),
  c(TRUE, FALSE, FALSE, TRUE)
)
  c.1..2..3..4. c..Louisa....Jonathan....Luigi....Rachel..
1             1                                     Louisa
2             2                                   Jonathan
3             3                                      Luigi
4             4                                     Rachel
  c.TRUE..FALSE..FALSE..TRUE.
1                        TRUE
2                       FALSE
3                       FALSE
4                        TRUE
The resulting data frame stores the values of each vector in its own column. It has four rows and three columns. However, because no column names were given, R derived them from the code of each vector, replacing characters such as parentheses, commas and spaces with dots, which makes for a very strange naming scheme!
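The dotted names can be reproduced by hand. The sketch below assumes that data.frame() takes the text of each unnamed argument and sanitizes it with make.names(), which is its default behaviour as far as the printed output suggests; the variable expr_text is introduced here only for illustration.

```r
# Text of the first unnamed argument, as R deparses it
expr_text <- deparse(quote(c(1, 2, 3, 4)))
expr_text               # "c(1, 2, 3, 4)"

# make.names() replaces characters that are not valid in a name
# (parentheses, commas, spaces) with dots
make.names(expr_text)   # "c.1..2..3..4."
```

The result matches the first column header printed above, which is why the values appear inside the names.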
Column names can be supplied in the data.frame() call as argument names preceding the column vectors. To improve the column naming of the previous data frame we can write:
data.frame(
  id = c(1, 2, 3, 4),
  name = c("Louisa", "Jonathan", "Luigi", "Rachel"),
  female = c(TRUE, FALSE, FALSE, TRUE)
)
  id     name female
1  1   Louisa   TRUE
2  2 Jonathan  FALSE
3  3    Luigi  FALSE
4  4   Rachel   TRUE
The resulting data frame now carries column names that convey the meaning of each column.
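Meaningful names also make the columns easy to refer to. As a brief sketch, the example below stores the same data frame in a variable (the name people is an assumption made here, not part of the original example) and uses the column names to inspect it.

```r
# Same data as above, stored in a variable so it can be inspected
people <- data.frame(
  id = c(1, 2, 3, 4),
  name = c("Louisa", "Jonathan", "Luigi", "Rachel"),
  female = c(TRUE, FALSE, FALSE, TRUE)
)

names(people)    # the three column names
nrow(people)     # number of rows
ncol(people)     # number of columns

# Select one column by name, then keep the rows where female is TRUE
people$name[people$female]
```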