
Hierarchical Shrinkage: improving the accuracy and interpretability of tree-based methods


R package for HSTree

Paper Authors: Abhineet Agarwal, Yan Shuo Tan, Omer Ronen, Chandan Singh, Bin Yu

R Package Author: Haoxue Wang ([email protected]), University of Cambridge

This package is the R version of the Hierarchical Shrinkage algorithm, originally implemented in Python. There is also an R package for the FIGS algorithm, and hopefully more R versions of imodels methods will be developed in the future. The introduction manual of the package is in Manual.
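For intuition, hierarchical shrinkage does not change the tree structure: it only regularizes each node's prediction by shrinking it toward its ancestors. Below is a minimal sketch of that update along a single root-to-leaf path; hs_path_prediction, node_means and node_sizes are hypothetical names for illustration, and the package applies the same update to every node of a fitted tree.

# Hierarchical shrinkage along one root-to-leaf path (illustrative sketch)
# node_means: mean training response at each node on the path (root first)
# node_sizes: number of training samples falling in each node
# lambda:     regularization strength (reg_param in this package)
hs_path_prediction <- function(node_means, node_sizes, lambda) {
  pred <- node_means[1]  # start from the root prediction
  for (l in 2:length(node_means)) {
    # each child-minus-parent difference is shrunk toward zero,
    # more strongly when the parent node contains few samples
    pred <- pred + (node_means[l] - node_means[l - 1]) /
      (1 + lambda / node_sizes[l - 1])
  }
  pred
}

# example: a depth-2 path with 100, 40 and 12 training samples
hs_path_prediction(node_means = c(5.0, 6.5, 8.0),
                   node_sizes = c(100, 40, 12),
                   lambda = 10)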

Plot results with the HSTree R package

HSTree decreases the MSPE (mean squared prediction error) under all shrinkage methods, and the generalization gain is largest for random forests.

Install the required packages
library(devtools)
install_github("wanghaoxue0/HSTree")
library(rpart)          # CART
library(randomForest)   # random forests
library(gbm)            # gradient boosting
library(HSTree)
source("fit.R")
source("fitCV.R")
Use cross-validation
set.seed(2023)
X=read.csv("X.csv",header = FALSE)
y=read.csv("y.csv",header = FALSE)
colnames(y) <-"y"
fit <- HSTreeRegressorCV(X, y, reg_param=c(0.1, 1, 10, 20, 50, 100, 500), cv=4, verbose=TRUE, shrinkage="constant") # the default estimator is CART
# output: the best regularization parameter is 1, its mean squared error is 908.8491
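Conceptually, HSTreeRegressorCV fits the shrinkage tree for each candidate reg_param on the training folds, scores it on the held-out fold, and reports the grid value with the lowest CV error. A rough sketch of that loop, assuming (hypothetically) that HSTreeRegressor accepts a single reg_param value:

# sketch of the 4-fold grid search performed inside HSTreeRegressorCV
# (assumes a scalar reg_param argument for HSTreeRegressor; hypothetical)
grid <- c(0.1, 1, 10, 20, 50, 100, 500)
folds <- sample(rep(1:4, length.out = nrow(X)))
cv_mse <- sapply(grid, function(lam) {
  mean(sapply(1:4, function(k) {
    fit_k <- HSTreeRegressor(X[folds != k, ],
                             data.frame(y = y$y[folds != k]),
                             reg_param = lam, shrinkage = "constant")
    mean((predict(fit_k, X[folds == k, ]) - y$y[folds == k])^2)
  }))
})
grid[which.min(cv_mse)]  # best regularization parameter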
Split the dataset
# split the data 3:1 into training and test sets
smp_size <- floor(0.75 * nrow(X))
train_ind <- sample(seq_len(nrow(X)), size = smp_size)
X_train <- X[train_ind, ]
y_train <- data.frame(y[train_ind, ])
colnames(y_train) <-"y"
X_test <- X[-train_ind, ]
y_test <- data.frame(y[-train_ind, ])
colnames(y_test) <-"y"
Compare the original CART with the hierarchical shrinkage tree
# original decision tree model
fit <- rpart(y~., data=data.frame(X_train,y_train), control = rpart.control(maxdepth = 5))
fit1 <- HSTreeRegressor(X_train, y_train, shrinkage="constant") # the default estimator is CART
fit2 <- HSTreeRegressor(X_train, y_train, estimator="CART") # the default shrinkage method is node_based
msep  <- mean((predict(fit,  X_test) - y_test[[1]])^2)  # plain CART
msep1 <- mean((predict(fit1, X_test) - y_test[[1]])^2)  # HS with constant shrinkage
msep2 <- mean((predict(fit2, X_test) - y_test[[1]])^2)  # HS with node_based shrinkage
plot(fit1)
text(fit1, use.n = TRUE)
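To compare the fits at a glance, the msep values computed above can be printed side by side (the labels are just illustrative):

round(c(CART = msep, HS_constant = msep1, HS_node_based = msep2), 3)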
Compare the original random forest with hierarchical shrinkage
fit <- randomForest(X_train, y_train[[1]], ntree=50, maxnodes=5)
fit1 <- HSTreeRegressor(X_train, y_train, estimator="RandomForest")  # the default shrinkage method is node_based
fit2 <- HSTreeRegressor(X_train, y_train, estimator="RandomForest", shrinkage="constant")
fit3 <- HSTreeRegressor(X_train, y_train, estimator="RandomForest", shrinkage="leaf_based")

msep  <- mean((predict(fit,  X_test) - y_test[[1]])^2)  # plain random forest
msep1 <- mean((predict(fit1, X_test) - y_test[[1]])^2)  # HS with node_based shrinkage
msep2 <- mean((predict(fit2, X_test) - y_test[[1]])^2)  # HS with constant shrinkage
msep3 <- mean((predict(fit3, X_test) - y_test[[1]])^2)  # HS with leaf_based shrinkage
Compare the original gradient boosting model with hierarchical shrinkage
fit <- gbm(y~., data = data.frame(X_train,y_train), n.trees=100, interaction.depth=2)
fit1 <- HSTreeRegressor(X_train, y_train, interaction.depth=2, estimator="GradientBoosting")  # the default shrinkage method is node_based
fit2 <- HSTreeRegressor(X_train, y_train, interaction.depth=2, estimator="GradientBoosting", shrinkage="constant")
msep  <- mean((predict(fit,  X_test, n.trees = 100) - y_test[[1]])^2)  # plain gbm; predict.gbm needs n.trees
msep1 <- mean((predict(fit1, X_test) - y_test[[1]])^2)  # HS with node_based shrinkage
msep2 <- mean((predict(fit2, X_test) - y_test[[1]])^2)  # HS with constant shrinkage

# to print the structure of a single tree
# single <- pretty.gbm.tree(fit, i.tree = 1)
