R Language Convenience functions to manipulate data.frames


Some convenience functions to manipulate data.frames are subset(), transform(), with() and within().


The subset() function lets you select rows and columns of a data.frame in a more convenient way (subset() also has methods for vectors and matrices):

subset(mtcars, subset = cyl == 6, select = c("mpg", "hp"))
                mpg  hp
Mazda RX4      21.0 110
Mazda RX4 Wag  21.0 110
Hornet 4 Drive 21.4 110
Valiant        18.1 105
Merc 280       19.2 123
Merc 280C      17.8 123
Ferrari Dino   19.7 175

In the code above we are asking only for the rows in which cyl == 6 and for the columns mpg and hp. You could achieve the same result using [] with the following code:

mtcars[mtcars$cyl == 6, c("mpg", "hp")]
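The two forms agree for mtcars because cyl has no missing values. As a hedged illustration of where they can differ, the sketch below uses the built-in airquality data, whose Ozone column contains NAs; the variable names (by_subset, by_bracket, by_which) are made up for this example.

```r
# airquality$Ozone contains missing values, so the two idioms diverge here
by_subset  <- subset(airquality, Ozone > 100, select = c(Ozone, Month))
by_bracket <- airquality[airquality$Ozone > 100, c("Ozone", "Month")]

nrow(by_subset)   # only rows where the condition is TRUE
nrow(by_bracket)  # additionally one all-NA row per missing Ozone value

# Wrapping the condition in which() drops the NA cases from the [] version
by_which <- airquality[which(airquality$Ozone > 100), c("Ozone", "Month")]
nrow(by_which)    # same number of rows as by_subset
```

subset() treats an NA in the condition as FALSE, which is usually what you want when filtering interactively.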


The transform() function is a convenience function to add or modify columns inside a data.frame. For instance, the following code adds another column named mpg2 with the result of mpg^2 to the mtcars data.frame:

mtcars <- transform(mtcars, mpg2 = mpg^2)
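A slightly fuller sketch of transform(), assuming you want to keep the built-in mtcars untouched: it works on a copy, overwrites an existing column and adds new ones. The names cars and mpg4 are made up for this example. Because the expressions are evaluated against the original data.frame, a column created in a call cannot be used by another expression in that same call, so the dependent column is added in a second call.

```r
# Several columns can be added or modified in one call;
# the result is assigned to a new object instead of overwriting mtcars
cars <- transform(mtcars,
                  mpg2 = mpg^2,       # new column
                  wt   = wt * 1000)   # existing column overwritten (wt is stored in 1000 lbs)

# mpg4 depends on mpg2, so it needs a second transform() call
cars <- transform(cars, mpg4 = mpg2^2)

head(cars[, c("mpg", "mpg2", "mpg4", "wt")], 3)
```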

with and within

Both with() and within() let you evaluate expressions inside the data.frame environment, allowing a somewhat cleaner syntax and saving you the use of some $ or [].
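The main difference between the two is what they return: with() returns the value of the expression, while within() returns a modified copy of the data.frame. A minimal sketch of with(), using the built-in airquality data (the names avg_temp and avg_temp_dollar are made up for this example):

```r
# with() lets the expression refer to columns directly and returns its value
avg_temp <- with(airquality, tapply(Temp, Month, mean))

# The equivalent call without with() has to repeat the data.frame name
avg_temp_dollar <- tapply(airquality$Temp, airquality$Month, mean)

avg_temp  # mean temperature per month, one value for each of months 5 to 9
```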

For example, if you want to create, change and/or remove multiple columns in the airquality data.frame:

aq <- within(airquality, {     
    lOzone <- log(Ozone) # creates new column
    Month <- factor(month.abb[Month]) # changes Month Column
    cTemp <- round((Temp - 32) * 5/9, 1) # creates new column
    S.cT <- Solar.R / cTemp  # creates new column
    rm(Day, Temp) # removes columns