Friday, 12 August 2016

R: data exploration nomad's top reading this month, mlR

Alternative to caret: mlR
To practice: https://www.analyticsvidhya.com/blog/2016/08/practicing-machine-learning-techniques-in-r-with-mlr-package/
Official tutorial: http://mlr-org.github.io/mlr-tutorial/devel/html/


Other resolutions:
Learn R control structures: http://www.statmethods.net/management/controlstructures.html
Learn bare xgboost, without the convenience of caret (prepare sparse/dense matrix data, fine-tune): https://github.com/dmlc/xgboost/blob/master/R-package/vignettes/xgboostPresentation.Rmd
http://xgboost.readthedocs.io/en/latest/R-package/xgboostPresentation.html
Use my own model in train in caret (or learn to put in the grid the parameters not originally supported in caret)
https://github.com/dmlc/xgboost/blob/master/R-package/vignettes/xgboostPresentation.Rmd
Cleaning data, operations on text, imputation of NAs, etc.!
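As a starting point for the cleaning and imputation resolution, here is a minimal base R sketch. The data frame, its column names and the choice of median imputation are all assumptions made for illustration; it also exercises the `for` and `if` control structures mentioned above.

```r
# Toy data frame; column names and values are made up for illustration
df <- data.frame(
  name  = c("  Alice ", "BOB", NA, "carol"),
  score = c(10, NA, 30, 50),
  age   = c(NA, 20, 40, 60),
  stringsAsFactors = FALSE
)

for (col in names(df)) {
  if (is.numeric(df[[col]])) {
    # Replace missing numbers with the column median
    df[[col]][is.na(df[[col]])] <- median(df[[col]], na.rm = TRUE)
  } else if (is.character(df[[col]])) {
    # Trim whitespace, lower-case, then mark missing text explicitly
    df[[col]] <- tolower(trimws(df[[col]]))
    df[[col]][is.na(df[[col]])] <- "unknown"
  }
}

print(df)
```

Median imputation is only one simple option; the caret preprocessing page linked further down covers alternatives such as k-nearest-neighbour imputation.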

Other:
Time series wanderings:
http://swmprats.net/forum/potm-single-variable-exploration/35-august-potm-time-series-decomposition-with-swmpr


Excel trivia: learn how to find multiple matches:
 https://fiveminutelessons.com/learn-microsoft-excel/use-index-lookup-multiple-values-list
http://stackoverflow.com/questions/26424226/excel-return-multiple-matching-values-from-a-column-horizontally-in-one-row
http://eimagine.com/how-to-return-multiple-match-values-in-excel-using-index-match-or-vlookup/
 

Monday, 1 August 2016

Top reading and exploring on my list. xgboost, ensembles, preprocessing.

1. xgboost: one of the packages favored by Kaggle winners, with code efficient enough for home users.
http://www.r-bloggers.com/an-introduction-to-xgboost-r-package/
https://github.com/rachar1/DataAnalysis/blob/master/xgboost_Classification.R
Awesome XGBoost: https://github.com/dmlc/xgboost/blob/master/demo/README.md#features-walkthrough
https://github.com/dmlc/xgboost/blob/master/R-package/vignettes/xgboostPresentation.Rmd
http://wiselily.com/2015/07/12/xgboost-data-mining-example-1/


apply Xgboost in Kaggle with us https://m.youtube.com/watch?v=zwKFyMkvNXE

Tuning parameters:
https://rpubs.com/flyingdisc/practical-machine-learning-xgboost
http://stackoverflow.com/questions/33949735/tuning-xgboost-parameters-in-r
http://www.r-bloggers.com/r-setup-a-grid-search-for-xgboost/
https://www.kaggle.com/c/otto-group-product-classification-challenge/forums/t/14009/script-understanding-xgboost-model-on-otto-dataset
https://www.kaggle.com/tqchen/otto-group-product-classification-challenge/understanding-xgboost-model-on-otto-data/code
http://stats.stackexchange.com/questions/171043/how-to-tune-hyperparameters-of-xgboost-trees
https://www.kaggle.com/c/bnp-paribas-cardif-claims-management/forums/t/19922/how-to-tune-xgboost-using-r
https://www.kaggle.com/forums/f/15/kaggle-forum/t/17120/how-to-tuning-xgboost-in-an-efficient-way
https://www.analyticsvidhya.com/blog/2016/01/xgboost-algorithm-easy-steps/
https://github.com/topepo/caret/issues/336
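The links above mostly come down to trying a grid of parameters under cross-validation. A rough sketch of that idea, assuming the agaricus example data shipped with xgboost, an arbitrary two-by-two grid, and that the `evaluation_log` returned by `xgb.cv` has a `test_logloss_mean` column (true for the versions I know, but worth checking on yours):

```r
library(xgboost)

# Example data shipped with the xgboost package: sparse features plus a 0/1 label
data(agaricus.train, package = "xgboost")
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)

# Small illustrative grid; the values are arbitrary, not recommendations
grid <- expand.grid(max_depth = c(2, 4), eta = c(0.1, 0.3))
grid$logloss <- NA_real_

set.seed(1)
for (i in seq_len(nrow(grid))) {
  params <- list(objective = "binary:logistic", eval_metric = "logloss",
                 max_depth = grid$max_depth[i], eta = grid$eta[i])
  # 3-fold cross-validation for this parameter combination
  cv <- xgb.cv(params = params, data = dtrain, nrounds = 10, nfold = 3,
               verbose = FALSE)
  cv_log <- as.data.frame(cv$evaluation_log)
  # Keep the best (lowest) mean test log loss over the boosting rounds
  grid$logloss[i] <- min(cv_log$test_logloss_mean)
}

# Parameter combinations sorted from best to worst
print(grid[order(grid$logloss), ])
```

A real search would use a larger grid and more rounds, probably with early stopping; this only shows the loop structure.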

Plot:
http://rpackages.ianhowson.com/cran/xgboost/man/xgb.plot.tree.html
(off topic, rearrange correlations) https://drsimonj.svbtle.com/rearrange-your-correlations-with-corrr

 http://www.analyticsvidhya.com/blog/2016/01/xgboost-algorithm-easy-steps/
(also RF: http://www.analyticsvidhya.com/blog/2015/09/random-forest-algorithm-multiple-challenges/)
https://cran.r-project.org/web/packages/xgboost/vignettes/xgboostPresentation.html
https://www.kaggle.com/rajivranjansingh/liberty-mutual-group-property-inspection-prediction/xgboost-in-caret/run/44015/code
https://cran.r-project.org/web/packages/xgboost/vignettes/discoverYourData.html

2. ensemble learning
http://www.vikparuchuri.com/blog/intro-to-ensemble-learning-in-r/

3. preprocessing
http://topepo.github.io/caret/preprocess.html

4. Another free course:
https://lagunita.stanford.edu/courses/HumanitiesScience/StatLearning/Winter2014/about

5. Deep learning in R resources: https://discuss.analyticsvidhya.com/t/deep-learning-in-r/10349

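Do not read this. My first attempts:

library(xgboost)

# read tab-separated data copied to the clipboard; text columns become factors
mydata <- read.csv(file = "clipboard", sep = "\t", header = TRUE, stringsAsFactors = TRUE)

# all columns as numeric (factors turn into their integer codes 1, 2, ...)
mydata <- as.data.frame(lapply(mydata, as.numeric))

# or only the first three columns
mydata <- as.data.frame(sapply(mydata[, 1:3], as.numeric))

# label must be 0 or 1 for the binary:logistic objective
mydata$Gatunek <- mydata$Gatunek - 1

# train on the first two columns as features
model <- xgboost(data = as.matrix(mydata[, 1:2]), label = mydata$Gatunek,
                 nrounds = 2, objective = "binary:logistic")

# plotting the trees needs DiagrammeR: install.packages("DiagrammeR")
xgb.plot.tree(model = model)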
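
The attempt above reads from the clipboard, so it cannot be rerun as is. Below is a reproducible sketch of the same idea on the built-in iris data, assuming that Gatunek plays the role of Species and that two classes are enough for a binary model. It uses `xgb.DMatrix` and `xgb.train` instead of the `xgboost()` convenience wrapper.

```r
library(xgboost)

# Keep two species only, so the problem is binary (an assumption for this sketch)
two <- iris[iris$Species != "setosa", ]

# Factor codes are 1 and 2 after dropping the unused level; shift them to 0 and 1
label <- as.numeric(droplevels(two$Species)) - 1

# First two measurement columns as a dense numeric feature matrix
features <- as.matrix(two[, 1:2])

dtrain <- xgb.DMatrix(features, label = label)
bst <- xgb.train(params = list(objective = "binary:logistic"),
                 data = dtrain, nrounds = 2)

# Predicted probabilities of class 1 (virginica) on the training data
pred <- predict(bst, dtrain)
print(head(pred))
```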

My NO to the use of the 1944 Warsaw Uprising in current politics.

Am I entitled to take a stand on behalf of the AK too? Can I wear their symbols? An armband?

My granddad was in the Home Army (Armia Krajowa) in Czortków and then in the 2nd People's Army (2 Armia LWP). In Czortków he witnessed Polish scouts, priests, railwaymen and elites murdered by the NKVD, with bags over their heads, run over by a tracked tractor. When the Russians pushed the Germans back, he joined the Soviet-controlled Polish Army to fight the Germans and (this was an order from his superior) to avoid arrest at the hands of the communist SMERSH. I was very proud of his bravery. After the war his children had a pretty hard time from the socialist "democrats". Chances were lost. Interrogated and beaten, he did not reveal his contacts. He was protected against major persecution and death thanks to his brave rescue of a high-ranking Soviet artillery officer in battle. His colleagues were not so lucky...

After the war he could not buy farming equipment without begging commie bureaucrats, writing letters and showing his war medals. Then his children had various small and large problems in education and in finding adequate employment. My childhood in a gray communist world can humbly be added; where would I be now were it not for the limitations of the "socialist/communist" heaven?

My grandma's father had a big farm, earned through hard work in Argentina. It was nationalised by the commies; a "friendly neighbor" with a red band warned them they would be arrested before dawn and deported to Siberia. "I am preparing these lists. I removed you once, but I cannot do it again. Please go." All the Poles were to be deported; this was the main criterion. They loaded a handful of necessities on a cart and fled to the forest. After Germany attacked the Soviets, she ended up in a German concentration camp near Koenigsberg. She told me she witnessed a girl inmate publicly executed for stealing potato peels. They were just teens forced to work for the German war effort, working in two shifts. Initially these camps had a food-rationing regulation calculated to exploit a worker and put him or her to death within 3 months, but it was inefficient and had to be relaxed. This is the story I learned from her.

I am also entitled to take a stand because I served my free country and I have a mobilization card. If my country is threatened I will not run and say "Sorry Poland" or even "Sorry NATO countries". I will not be happy to take anyone's life and I am not prejudiced against any nation, but I will defend my country.



Personally I seem to have socialist views: poor people should be helped and not fought against, employment should not exploit, education and healthcare should be for all, salaries should be fair, and inequality should be moderated by giving people chances. But after the 20th-century totalitarian experiment, how can I trust anything that has "socialist" in its name or refers to Marxism?


I am not sure I can properly express my anger at the games current politicians play around the Warsaw Uprising and the AK heroes. And at the media teaching the young that they should perhaps not wear AK/WP armbands, or that they are not entitled to commemorate the heroes of the Warsaw Uprising. Or that perhaps the AK was antisemitic. Or that it was responsible for the suffering of Warsaw civilians. Responsibility is debatable, but is there a sincere debate without arguments copied verbatim from propagandists serving the Stalinist or Nazi regimes?
I am sure of what my grandma would say, but it would be extremely explicit language, so I will use Orwell's precise words instead:

First of all, a message to [...] left-wing journalists and intellectuals generally: ‘Do remember that dishonesty and cowardice always have to be paid for. Don’t imagine that for years on end you can make yourself the boot-licking propagandist of the Soviet régime, or any other régime, and then suddenly return to mental decency. Once a whore, always a whore.’ 

George Orwell "As I Please", 1 Sep. 1944
http://www.telelib.com/authors/O/OrwellGeorge/essay/tribune/AsIPlease19440901.html

Sorry for the language; this is one of those days when I should quote it.