Tutorial by Examples

A feature with near-zero variance is a good candidate for removal. You can manually detect numerical variance below a threshold of your own choosing:

    library(caret)  # provides the GermanCredit data set
    data("GermanCredit")
    variances <- apply(GermanCredit, 2, var)
    variances[which(variances <= 0.0025)]

Or, you can use the caret package to find...
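The manual variance check described above can be sketched in base R on a small made-up data frame. The column names, the data, and the 0.05 threshold are all assumptions chosen for illustration. Restricting the check to numeric columns is a design choice of this sketch, not something the original snippet does.

```r
# Toy data frame (hypothetical, for illustration only)
df <- data.frame(
  almost_constant = c(rep(0, 99), 1),              # nearly all zeros
  varied          = seq(1, 100),                   # clearly varying
  label           = factor(rep(c("a", "b"), 50))   # non-numeric, skipped below
)

# Keep only numeric columns so that var() is well defined
numeric_cols <- df[sapply(df, is.numeric)]

# Variance of each numeric column
variances <- sapply(numeric_cols, var)

# Names of the columns at or below an arbitrary threshold (an assumption)
threshold <- 0.05
low_variance <- names(variances)[variances <= threshold]

# Drop the flagged columns from the original data frame
df_reduced <- df[, !(names(df) %in% low_variance), drop = FALSE]
```

A sensible threshold depends on the scale of each feature, so it may be worth rescaling the columns before comparing their variances.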
If most of a feature's values are missing, it is a good candidate for removal:

    library(VIM)
    data(sleep)
    colMeans(is.na(sleep))

       BodyWgt   BrainWgt       NonD      Dream      Sleep       Span       Gest
    0.00000000 0.00000000 0.22580645 0.19354839 0.06451613 0.06451613 0.06451613
          Pred ...
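As a minimal sketch of acting on those missing-value fractions, the following base R example drops every column whose share of NA values exceeds a cutoff. The data frame, the column names, and the 0.5 cutoff are assumptions made up for illustration.

```r
# Toy data frame with different amounts of missing data (hypothetical)
df <- data.frame(
  complete = c(1, 2, 3, 4, 5),
  sparse   = c(NA, NA, NA, 4, NA),
  partial  = c(1, NA, 3, 4, 5)
)

# Fraction of missing values per column
na_fraction <- colMeans(is.na(df))

# Keep columns whose missing fraction is at or below the cutoff (an assumption)
max_missing <- 0.5
cols_to_keep <- names(na_fraction)[na_fraction <= max_missing]

df_reduced <- df[, cols_to_keep, drop = FALSE]
```

Here `sparse` is removed because four of its five values are missing, while `partial` stays because only one is.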
Closely correlated features may add variance to your model, and removing one feature from each correlated pair may help reduce it. There are many ways to detect correlation. Here is one:

    library(purrr)  # in order to use keep()

    # select correlatable vars
    toCorrelate <- mtcars %>% keep(is.numeric)
    ...
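Because the snippet above is cut off, here is a separate base R sketch of one possible way to list highly correlated pairs. It is not the original author's method. The 0.85 cutoff and the variable names are assumptions chosen for illustration.

```r
# mtcars ships with base R, and all of its columns are numeric
cor_matrix <- cor(mtcars)

# Blank out the diagonal and lower triangle so each pair appears only once
cor_matrix[lower.tri(cor_matrix, diag = TRUE)] <- NA

# Positions of pairs whose absolute correlation exceeds the cutoff (an assumption)
cutoff <- 0.85
high <- which(abs(cor_matrix) > cutoff, arr.ind = TRUE)

# Readable table of the flagged pairs
high_pairs <- data.frame(
  feature_1   = rownames(cor_matrix)[high[, "row"]],
  feature_2   = colnames(cor_matrix)[high[, "col"]],
  correlation = cor_matrix[high]
)
```

Each row of `high_pairs` names two features that carry largely overlapping information. Which member of a pair to drop is a judgment call that depends on the model and the domain.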
