Monday, August 1, 2016

Top of my reading-and-exploring list: xgboost, ensembles, preprocessing.

1. xgboost - one of the packages behind many Kaggle-winning solutions, with code efficient enough for home users.
http://www.r-bloggers.com/an-introduction-to-xgboost-r-package/
https://github.com/rachar1/DataAnalysis/blob/master/xgboost_Classification.R
Awesome XGBoost: https://github.com/dmlc/xgboost/blob/master/demo/README.md#features-walkthrough
https://github.com/dmlc/xgboost/blob/master/R-package/vignettes/xgboostPresentation.Rmd
http://wiselily.com/2015/07/12/xgboost-data-mining-example-1/


Applying XGBoost in Kaggle competitions (video): https://m.youtube.com/watch?v=zwKFyMkvNXE

Tuning parameters:
https://rpubs.com/flyingdisc/practical-machine-learning-xgboost
http://stackoverflow.com/questions/33949735/tuning-xgboost-parameters-in-r
http://www.r-bloggers.com/r-setup-a-grid-search-for-xgboost/
https://www.kaggle.com/c/otto-group-product-classification-challenge/forums/t/14009/script-understanding-xgboost-model-on-otto-dataset
https://www.kaggle.com/tqchen/otto-group-product-classification-challenge/understanding-xgboost-model-on-otto-data/code
http://stats.stackexchange.com/questions/171043/how-to-tune-hyperparameters-of-xgboost-trees
https://www.kaggle.com/c/bnp-paribas-cardif-claims-management/forums/t/19922/how-to-tune-xgboost-using-r
https://www.kaggle.com/forums/f/15/kaggle-forum/t/17120/how-to-tuning-xgboost-in-an-efficient-way
https://www.analyticsvidhya.com/blog/2016/01/xgboost-algorithm-easy-steps/
https://github.com/topepo/caret/issues/336
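
A minimal grid-search sketch of my own (not from the links above), using xgb.cv over eta and max_depth; X and y below are placeholders for a numeric predictor matrix and a 0/1 label:

library(xgboost)

# placeholder data: X is a numeric matrix, y a 0/1 vector
dtrain <- xgb.DMatrix(as.matrix(X), label = y)

grid <- expand.grid(eta = c(0.05, 0.1, 0.3), max_depth = c(2, 4, 6))

cv_error <- apply(grid, 1, function(g) {
  cv <- xgb.cv(params = list(objective = "binary:logistic",
                             eta = g["eta"], max_depth = g["max_depth"]),
               data = dtrain, nrounds = 50, nfold = 5, verbose = FALSE)
  # column names in the evaluation log depend on the xgboost version
  min(cv$evaluation_log$test_error_mean)
})

grid[which.min(cv_error), ]  # best eta / max_depth combination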

Plot:
http://rpackages.ianhowson.com/cran/xgboost/man/xgb.plot.tree.html
(off topic: rearrange your correlations) https://drsimonj.svbtle.com/rearrange-your-correlations-with-corrr
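
Besides single trees, feature importance is worth plotting too; a short sketch, assuming a fitted booster named model and the training matrix train_mat:

library(xgboost)

# importance of each feature in a fitted booster (model and train_mat assumed)
imp <- xgb.importance(feature_names = colnames(train_mat), model = model)
xgb.plot.importance(imp)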

Also on random forests: http://www.analyticsvidhya.com/blog/2015/09/random-forest-algorithm-multiple-challenges/
https://cran.r-project.org/web/packages/xgboost/vignettes/xgboostPresentation.html
https://www.kaggle.com/rajivranjansingh/liberty-mutual-group-property-inspection-prediction/xgboost-in-caret/run/44015/code
https://cran.r-project.org/web/packages/xgboost/vignettes/discoverYourData.html

2. ensemble learning
http://www.vikparuchuri.com/blog/intro-to-ensemble-learning-in-r/
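
A toy averaging-ensemble sketch of my own (not from the post): blend the predicted probabilities of two different models; train, test, and the 0/1 response y are placeholders:

library(randomForest)

# two different base models fit on the same data
m_glm <- glm(y ~ ., data = train, family = binomial)
m_rf  <- randomForest(factor(y) ~ ., data = train)

p_glm <- predict(m_glm, newdata = test, type = "response")
p_rf  <- predict(m_rf,  newdata = test, type = "prob")[, 2]

# unweighted average of the two probability estimates
p_ens <- (p_glm + p_rf) / 2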

3. preprocessing
http://topepo.github.io/caret/preprocess.html
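
A minimal caret preprocessing sketch (train_df is a placeholder data frame): learn the centering/scaling parameters on the training data, then apply them with predict:

library(caret)

# learn the transformation from the training data...
pp <- preProcess(train_df, method = c("center", "scale"))

# ...and apply the same transformation to any data set
train_scaled <- predict(pp, train_df)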

4. Another free course:
https://lagunita.stanford.edu/courses/HumanitiesScience/StatLearning/Winter2014/about

5. Deep learning in R resources: https://discuss.analyticsvidhya.com/t/deep-learning-in-r/10349

Do not read this - my first attempts:

library(xgboost)

# read tab-separated data from the clipboard (Windows)
mydata <- read.csv(file = "clipboard", sep = "\t", header = TRUE)

# convert all columns to numeric; lapply keeps the data.frame structure
mydata <- as.data.frame(lapply(mydata, as.numeric))

# or convert only selected columns:
# mydata <- as.data.frame(sapply(mydata[, 1:3], as.numeric))

# the label must be 0/1 for binary logistic regression
mydata$Gatunek <- mydata$Gatunek - 1

model <- xgboost(data = as.matrix(mydata[, 1:2]), label = mydata$Gatunek,
                 nrounds = 2, objective = "binary:logistic")

# install.packages("DiagrammeR")  # needed by xgb.plot.tree
xgb.plot.tree(model = model)
