Some convenience functions to manipulate data.frames are subset(), transform(), with() and within().
The subset() function allows you to subset a data.frame in a more convenient way (subset() also works with other classes, such as vectors and matrices):
subset(mtcars, subset = cyl == 6, select = c("mpg", "hp"))
                mpg  hp
Mazda RX4      21.0 110
Mazda RX4 Wag  21.0 110
Hornet 4 Drive 21.4 110
Valiant        18.1 105
Merc 280       19.2 123
Merc 280C      17.8 123
Ferrari Dino   19.7 175
In the code above we are asking only for the rows in which cyl == 6 and for the columns mpg and hp. You could achieve the same result using [] with the following code:
mtcars[mtcars$cyl == 6, c("mpg", "hp")]
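One difference between the two forms is worth knowing: subset() drops rows for which the condition evaluates to NA, whereas [] keeps them as rows filled with NA. A minimal sketch, using the built-in airquality data set (its Ozone column contains missing values); the object names via_subset, via_bracket and via_which are only illustrative:

```r
# subset() treats an NA condition as FALSE and drops those rows
via_subset <- subset(airquality, Ozone > 100)

# [] keeps one all-NA row for every NA in the condition
via_bracket <- airquality[airquality$Ozone > 100, ]

anyNA(via_subset$Ozone)   # checks whether any missing Ozone values remain
anyNA(via_bracket$Ozone)  # checks whether NA rows were carried along

# which() converts the logical condition to row positions and discards NA,
# which is one way to get the subset() behaviour with []
via_which <- airquality[which(airquality$Ozone > 100), ]
all.equal(via_subset, via_which)
```

If the column you filter on may contain NA, prefer subset() or wrap the condition in which() when using [].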
The transform() function is a convenience function to add or change columns inside a data.frame. For instance, the following code adds another column named mpg2 with the result of mpg^2 to the mtcars data.frame:
mtcars <- transform(mtcars, mpg2 = mpg^2)
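transform() can also overwrite an existing column, and several columns can be set in one call. All expressions are evaluated against the original data.frame, so a column created in a call cannot be used by another expression in that same call; chain two calls instead. A small sketch (the names cars2, cars3 and mpg4 are illustrative, not part of the example above):

```r
# Work on the built-in data set and keep the results under new names
cars2 <- transform(datasets::mtcars,
                   mpg2 = mpg^2,     # adds a new column
                   hp   = hp / 100)  # overwrites an existing column

# mpg2 only exists in cars2, so a column derived from it needs a second call
cars3 <- transform(cars2, mpg4 = mpg2^2)

head(cars3[, c("mpg", "mpg2", "mpg4", "hp")], 3)
```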
Both with() and within() let you evaluate expressions inside the data.frame environment, allowing a somewhat cleaner syntax and saving you the use of some $ or [].
For example, within() lets you create, change and/or remove multiple columns of the airquality data.frame in a single call:
aq <- within(airquality, {
    lOzone <- log(Ozone)                  # creates new column
    Month <- factor(month.abb[Month])     # changes Month column
    cTemp <- round((Temp - 32) * 5/9, 1)  # creates new column
    S.cT <- Solar.R / cTemp               # creates new column
    rm(Day, Temp)                         # removes columns
})
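The difference between the two functions lies in what they return: with() returns the value of the expression, while within() returns a modified copy of the data.frame and leaves the original untouched. A brief sketch on the built-in mtcars data (the names cars, avg_mpg_6, cars_pw and pw are made up for illustration):

```r
cars <- datasets::mtcars

# with(): returns the result of the expression, here a single number
avg_mpg_6 <- with(cars, mean(mpg[cyl == 6]))

# within(): returns a copy of the data.frame with the changes applied
cars_pw <- within(cars, {
  pw <- hp / wt  # creates a new column
  rm(qsec)       # removes a column
})

avg_mpg_6
names(cars_pw)
names(cars)      # the original still has qsec and no pw column
```

As a rule of thumb, reach for with() when you need a computed value and for within() when you need a changed data.frame.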