Tuesday, 12 September 2017

Why R? hclust

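hclust: an attempt to analyze the answers from the WhyR conference registration form. Column: "Chcesz.podzielić.się.odpowiedzią.na.to.pytanie..Chętnie.przedstawimy.najciekawsze.odpowiedzi" ("Would you like to share your answer to this question? We will gladly present the most interesting answers")
- because it is the lingua franca,
- because of its enormous possibilities,
- because of ggplot2,
- because it is not Python, because it is not SAS,
- because it is free,
- because of Data Science...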
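
The post does not include the clustering code, so here is only a rough sketch of the idea behind hierarchical clustering of short text answers. It is written in plain Python rather than R, uses single linkage with a Jaccard word-overlap distance (R's hclust defaults to complete linkage on a distance matrix you supply), and the `answers` list, `jaccard_distance` and `single_linkage` are hypothetical stand-ins, not the real survey data or the author's method.

```python
# Toy agglomerative (single-linkage) clustering of short answers.
# The answers are hypothetical examples, not the real survey responses.
from itertools import combinations

answers = [
    "free and open source",
    "free of charge",
    "ggplot2 plots",
    "beautiful ggplot2 plots",
]


def jaccard_distance(a: str, b: str) -> float:
    """1 - (shared words / all words) for two answers."""
    words_a, words_b = set(a.split()), set(b.split())
    return 1 - len(words_a & words_b) / len(words_a | words_b)


def single_linkage(items, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[item] for item in items]
    while len(clusters) > n_clusters:
        # Cluster-to-cluster distance = distance between their closest members.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: min(
                jaccard_distance(a, b)
                for a in clusters[pair[0]]
                for b in clusters[pair[1]]
            ),
        )
        # i < j, so removing cluster j does not shift cluster i.
        clusters[i] += clusters.pop(j)
    return clusters


groups = single_linkage(answers, 2)
for group in groups:
    print(group)
```

Answers that share words end up in the same group; naming the groups is still a manual, subjective step.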

Subjective interpretation, done in Paint :-)


R is free, offers enormous possibilities, and lets you grow and communicate (lingua franca), especially thanks to ggplot2. It is an alternative to SAS and Python.

Edited in EzGif.com

Friday, 9 June 2017

Introductory Python ML, a short 2-day course.

Conclusions:
1. R is so much easier, more portable, supercool. Jupyter Notebooks are far behind RStudio's R Notebooks.
2. Python can be learned and is similar; just remember that indentation is part of the syntax :-)
3. I must learn pandas, scikit-learn, and seaborn! Maybe by comparison with dplyr and ggplot2.

Microsoft, please allow more R/Excel VBA/Python interoperability for everyone.

Excel: conditions in number format strings...

Why had I never discovered this until today?!

[=0]"";rrrr-mm-dd

"If the value is 0, show nothing instead of an erroneous date" (Excel otherwise insists that serial 0 is the date 1900-01-00). Here rrrr is the Polish-locale year code; in an English-locale Excel write yyyy. Is there more? Like coloring the font depending on the conditions met? Supercool!

Saturday, 6 May 2017

Big Data on a laptop, some options.

Data mining on streams.
http://moa.cms.waikato.ac.nz/rmoa-massive-online-data-stream-classifications-with-r-moa/
http://jwijffels.github.io/RMOA/
https://cran.r-project.org/web/packages/stream/stream.pdf

Database light backend.
https://www.monetdb.org/blog/monetdblite-r

Data mining
https://rdrr.io/cran/ffbase/man/bigglm.ffdf.html
https://cran.r-project.org/web/packages/speedglm/speedglm.pdf
https://cran.r-project.org/web/packages/randomForest.ddR/randomForest.ddR.pdf

Friday, 5 May 2017

Bayes for beginners videos.

How to explain it in plain English:
https://www.khanacademy.org/math/statistics-probability/probability-library/conditional-probability-independence/v/calculating-conditional-probability
http://www.watchknowlearn.org/Video.aspx?VideoID=16751&CategoryID=4457

https://www.khanacademy.org/math/ap-statistics/probability-ap/stats-conditional-probability/v/bayes-theorem-visualized
https://www.khanacademy.org/partner-content/wi-phi/wiphi-critical-thinking/wiphi-fundamentals/v/bayes-theorem
https://www.youtube.com/watch?v=Y-V4rfdl3NI
https://brilliant.org/wiki/bayes-theorem/
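
As a companion to the videos, here is a minimal worked example of Bayes' theorem, P(H|E) = P(E|H) * P(H) / P(E). The scenario (a medical test) and all the numbers are hypothetical, chosen only to make the arithmetic easy to follow; the function name `posterior` is also just illustrative.

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Bayes' theorem: P(H | E) from P(H), P(E | H) and P(E | not H)."""
    # Total probability of the evidence, over both "H" and "not H".
    p_evidence = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_evidence


# Hypothetical numbers: 1% of people have the condition,
# the test detects it 90% of the time, and gives a false alarm 5% of the time.
p_sick_given_positive = posterior(prior=0.01, p_e_given_h=0.90, p_e_given_not_h=0.05)
print(round(p_sick_given_positive, 3))
```

With these made-up numbers a positive result still leaves the probability of being sick at roughly 15%, because healthy people vastly outnumber sick ones.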

Friday, 24 March 2017

Risk matrix in R (interesting readings)



1) Risk matrix examples http://davidmeza1.github.io/2015/12/17/2015-12-17-Creating-a-Risk-Matrix-in-R.html

2) Use ggrepel instead of jitter for multiple points https://cran.r-project.org/web/packages/ggrepel/vignettes/ggrepel.html

3) Is there a ggrepel for Excel? http://stackoverflow.com/questions/30294041/excel-bubble-chart-overlapping-data-label

Tuesday, 21 March 2017

Stats Day 2: Covariance

Covariance 
Cov(x, y) for a sample = sum[(x - mean(x)) * (y - mean(y))] / (n - 1) ... we are interested in the sign. It does not tell us the strength of the relationship, because it is not standardized.

Covariance matrix... the diagonal shows the variance of each variable, the off-diagonal entries show the covariance between each pair of variables.
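
A small sketch of the sample covariance (dividing by n - 1) and of a covariance matrix, using only the Python standard library. The three variables `x`, `y`, `z` are made-up numbers for illustration, and `sample_cov` is just a name chosen here.

```python
from statistics import mean, variance


def sample_cov(x, y):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    mean_x, mean_y = mean(x), mean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


# Made-up data: y rises with x, z falls as x rises.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
z = [4.0, 3.0, 2.0, 1.0]

print(sample_cov(x, y) > 0)  # positive sign: x and y move together
print(sample_cov(x, z) < 0)  # negative sign: x and z move in opposite directions

# Covariance matrix: entry [i][j] is the covariance of variable i with variable j.
columns = [x, y, z]
cov_matrix = [[sample_cov(a, b) for b in columns] for a in columns]
for row in cov_matrix:
    print([round(value, 3) for value in row])
```

The diagonal entries equal `variance(x)`, `variance(y)` and `variance(z)`, and the matrix is symmetric because Cov(x, y) = Cov(y, x).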

Correlation (Pearson, r)
r = Cov(x, y) / [SD(x) * SD(y)]
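
To see why r is the standardized version of covariance, here is a minimal sketch using only the Python standard library. The data are made up, and `sample_cov` and `pearson_r` are names chosen for this example. Rescaling one variable changes the covariance but leaves r untouched.

```python
from statistics import mean, stdev


def sample_cov(x, y):
    """Sample covariance with the n - 1 denominator."""
    mean_x, mean_y = mean(x), mean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of sample SDs."""
    return sample_cov(x, y) / (stdev(x) * stdev(y))


# Made-up data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_scaled = [value * 100 for value in y]  # same data in different units

r = pearson_r(x, y)
print(round(r, 4))
print(sample_cov(x, y), sample_cov(x, y_scaled))  # covariance depends on units
print(round(pearson_r(x, y_scaled), 4))           # r does not
```

Because r always lies between -1 and 1, it gives both the sign and the strength of a linear relationship, which covariance alone cannot.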