Variable importance in subgroup identification for predictive variables.
MrSImp(
  dataframe,
  role,
  B = 100,
  bestK = 1,
  maxDepth = 5,
  minTrt = 5,
  minData = max(c(minTrt * maxDepth, NROW(Y)/20)),
  batchNum = 1L,
  faster = FALSE,
  display = FALSE,
  treeName = paste0("tree_", format(Sys.time(), "%m%d"), ".yaml"),
  nodeName = paste0("node_", format(Sys.time(), "%m%d"), ".txt"),
  impName = paste0("imp_", format(Sys.time(), "%m%d"), ".txt")
)
Argument | Description |
---|---|
dataframe | Training data frame |
role | Role of each column in dataframe, following the GUIDE role codes |
B | Number of bootstrap samples (default: 100) |
bestK | Number of covariates in the regression model |
maxDepth | Maximum tree depth |
minTrt | Minimum number of treatment and placebo observations in each node |
minData | Minimum number of observations in each node |
batchNum | Related to the exhaustive search over numerical split variables |
faster | Related to the tree split search |
display | Whether to display the tree at the end |
treeName | Name of the YAML file in which the tree is saved |
nodeName | Name of the node file |
impName | Name of the variable importance file |
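The default for minData in the usage signature is max(c(minTrt * maxDepth, NROW(Y)/20)). As a worked sketch, assuming NROW(Y) is the number of observations (the signature does not define Y), the default can be evaluated by hand for a data set of 200 rows, the size used in the example on this page. The name nObs is a stand-in introduced here for illustration.

```r
# Hand evaluation of the default minData expression from the usage signature.
# Assumption: NROW(Y) equals the number of observations; nObs stands in for it.
minTrt <- 5     # default from the signature
maxDepth <- 5   # default from the signature
nObs <- 200     # number of rows in the page's example

# Larger of the depth-based term (5 * 5) and the sample-size term (200 / 20).
minData <- max(c(minTrt * maxDepth, nObs / 20))
print(minData)
```

With these defaults the depth-based term (25) exceeds the sample-size term (10). Under the same assumption, the sample-size term would only dominate for data sets with more than 500 observations.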
A list containing importance scores, variable names, and roles
Importance score data frame
Role for each variable
Settings used to build the tree
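In the example on this page, role has one entry per column of the data frame, in column order. The sketch below shows one way to check that pairing before fitting, using base R only. The role codes are copied from the example; the two-row data frame and the name roleTable are illustrative and not part of the package.

```r
# Illustrative data frame with the same column layout as the page's example.
train <- data.frame(
  X1 = c(0.1, -0.4), X2 = c(1.2, 0.3), X3 = c(-0.7, 0.9),
  gender = c("Male", "Female"), country = c("US", "UK"),
  z = c(0, 1), y1 = c(0.5, 1.1), y2 = c(-0.2, 2.3),
  stringsAsFactors = FALSE
)

# Role codes exactly as written in the example on this page.
role <- c(rep("n", 3), "c", "c", "r", "d", "d")

# There should be exactly one role code per column.
stopifnot(length(role) == ncol(train))

# Side-by-side view of each column name and its role code.
roleTable <- data.frame(variable = names(train), role = role,
                        stringsAsFactors = FALSE)
print(roleTable)
```

Printing the pairing makes it easy to spot a shifted or missing code, for instance the treatment column z not lining up with 'r'.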
MrSGUIDE variable importance
library(MrSGUIDE)
set.seed(1234)

N = 200
np = 3

numX <- matrix(rnorm(N * np), N, np) ## numerical features
gender <- sample(c('Male', 'Female'), N, replace = TRUE)
country <- sample(c('US', 'UK', 'China', 'Japan'), N, replace = TRUE)

z <- sample(c(0, 1), N, replace = TRUE) # Binary treatment assignment

y1 <- numX[, 1] + 1 * z * (gender == 'Female') + rnorm(N)
y2 <- numX[, 2] + 2 * z * (gender == 'Female') + rnorm(N)

train <- data.frame(numX, gender, country, z, y1, y2)
role <- c(rep('n', 3), 'c', 'c', 'r', 'd', 'd')

mrsobj <- MrSImp(dataframe = train, role = role, B = 10)
mrsobj$imp
#>   Importance Feature
#> 1 0.0000e+00      X1
#> 3 0.0000e+00      X3
#> 5 0.0000e+00 country
#> 2 4.8052e+01      X2
#> 4 2.0000e+12  gender
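The printed importance table is ordered from least to most important. A minimal sketch of ranking the features in decreasing order follows. It uses a hand-copied version of the printed output rather than a live MrSImp call, so the data frame imp is an illustrative stand-in for mrsobj$imp, and the names ranked and topFeature are introduced here.

```r
# Hand-copied values from the printed output on this page
# (a stand-in for mrsobj$imp, not produced by calling MrSImp here).
imp <- data.frame(
  Importance = c(0, 0, 0, 48.052, 2e12),
  Feature = c("X1", "X3", "country", "X2", "gender"),
  stringsAsFactors = FALSE
)

# Sort rows so the most important feature comes first.
ranked <- imp[order(imp$Importance, decreasing = TRUE), ]
print(ranked)

# Name of the highest-ranked feature.
topFeature <- ranked$Feature[1]
```

Here gender ranks first, which is consistent with the simulated data: the treatment effect in both y1 and y2 depends on gender == 'Female'.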