Variable importance in subgroup identification for predictive variables.

MrSImp(
  dataframe,
  role,
  B = 100,
  bestK = 1,
  maxDepth = 5,
  minTrt = 5,
  minData = max(c(minTrt * maxDepth, NROW(Y)/20)),
  batchNum = 1L,
  faster = FALSE,
  display = FALSE,
  treeName = paste0("tree_", format(Sys.time(), "%m%d"), ".yaml"),
  nodeName = paste0("node_", format(Sys.time(), "%m%d"), ".txt"),
  impName = paste0("imp_", format(Sys.time(), "%m%d"), ".txt")
)

Arguments

dataframe

Training data frame.

role

Role of each column in dataframe, following the GUIDE role conventions (the example below uses 'n', 'c', 'r' and 'd').

B

Number of bootstrap samples. Default is 100.

bestK

Number of covariates in the regression model. Default is 1.

maxDepth

Maximum tree depth. Default is 5.

minTrt

Minimum number of treatment and placebo observations in each node.

minData

Minimum number of observations in each node.

batchNum

Related to the exhaustive search for numerical split variables.

faster

Related to the tree split search.

display

Whether to display the tree at the end.

treeName

Name of the YAML file in which the tree is saved.

nodeName

Name of the file saved for the nodes.

impName

Name of the file in which the variable importance scores are saved.
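The defaults for minData and the three file names are expressions rather than constants, so a small worked example may help. The sketch below evaluates them by hand in base R. It assumes that NROW(Y) in the usage signature is the number of rows of the response (200 in the example further down), and it uses a fixed, made-up timestamp in place of Sys.time() so the result is reproducible. The names n_rows, min_data, stamp and tree_name are illustrative only and are not part of the package.

```r
# Values taken from the usage signature and the example data (N = 200).
minTrt <- 5
maxDepth <- 5
n_rows <- 200  # assumption: what NROW(Y) evaluates to for the example data

# Default minData: the larger of minTrt * maxDepth and 5% of the rows.
min_data <- max(c(minTrt * maxDepth, n_rows / 20))

# Default file names embed month and day ("%m%d") only.
# A fixed timestamp stands in for Sys.time() here (hypothetical date).
stamp <- format(as.POSIXct("2024-03-07 12:00:00", tz = "UTC"), "%m%d")
tree_name <- paste0("tree_", stamp, ".yaml")

print(min_data)
print(tree_name)
```

Because the stamp holds only the month and day, two runs on the same day would share the same default file names. Passing explicit treeName, nodeName and impName values avoids that.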

Value

A list containing the importance scores, variable names, and roles:

imp

Importance score data frame

role

Role for each variable

Settings

Settings used to build the tree

Details

MrSGUIDE variable importance
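In the example on this page, the role vector has one entry per column of the data frame, in column order. A minimal sketch of one way to build such a vector by column name is shown here. It assumes that role must line up positionally with the columns, which is what the example suggests but is not stated explicitly on this page. The names role_by_column and columns are hypothetical helpers, not part of MrSGUIDE.

```r
# Column names as produced by data.frame(numX, gender, country, z, y1, y2)
# when numX is an unnamed three-column matrix.
columns <- c("X1", "X2", "X3", "gender", "country", "z", "y1", "y2")

# Role codes as used in the example on this page.
role_by_column <- c(
  X1 = "n", X2 = "n", X3 = "n",     # numerical covariates
  gender = "c", country = "c",      # categorical covariates
  z = "r",                          # treatment assignment
  y1 = "d", y2 = "d"                # responses
)

# Reorder by column name so that role matches the column order.
role <- unname(role_by_column[columns])

# Guard against a missing or misspelled column before fitting.
stopifnot(length(role) == length(columns), !anyNA(role))
```

Naming the roles keeps the vector correct if the columns are later reordered, which a positional c(rep('n', 3), ...) call does not.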

Examples

library(MrSGUIDE)
set.seed(1234)

N = 200
np = 3

numX <- matrix(rnorm(N * np), N, np) ## numerical features
gender <- sample(c('Male', 'Female'), N, replace = TRUE)
country <- sample(c('US', 'UK', 'China', 'Japan'), N, replace = TRUE)

z <- sample(c(0, 1), N, replace = TRUE) # Binary treatment assignment

y1 <- numX[, 1] + 1 * z * (gender == 'Female') + rnorm(N)
y2 <- numX[, 2] + 2 * z * (gender == 'Female') + rnorm(N)

train <- data.frame(numX, gender, country, z, y1, y2)
role <- c(rep('n', 3), 'c', 'c', 'r', 'd', 'd')

mrsobj <- MrSImp(dataframe = train, role = role, B = 10)
mrsobj$imp
#>   Importance Feature
#> 1 0.0000e+00      X1
#> 3 0.0000e+00      X3
#> 5 0.0000e+00 country
#> 2 4.8052e+01      X2
#> 4 2.0000e+12  gender
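The printed imp data frame is ordered from lowest to highest score. To rank the variables from most to least important, something like the following base R sketch could be used. It rebuilds the printed output by hand as a stand-in for mrsobj$imp so that it runs without the package. The names ranked and top are illustrative.

```r
# Stand-in for mrsobj$imp, copied from the printed output above.
imp <- data.frame(
  Importance = c(0, 0, 0, 48.052, 2e12),
  Feature = c("X1", "X3", "country", "X2", "gender"),
  stringsAsFactors = FALSE
)

# Sort from highest to lowest importance score.
ranked <- imp[order(imp$Importance, decreasing = TRUE), ]

# Keep only the variables with a nonzero score.
top <- ranked$Feature[ranked$Importance > 0]
print(top)
```

With the values shown, gender ranks first and X2 second, which agrees with the simulation: the treatment effect in y1 and y2 depends on gender.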