piątek, 29 stycznia 2016

What u play with on smartphone? I play R stats :-)

I followed this tut, thanks! http://www.r-bloggers.com/install-r-in-android-via-gnuroot-no-root-required/ 
1. Installed GnuRoot app and GnuRoot Wheezy (no real rooting required, it creates its own world!)
2. Refreshed software: apt-get update & apt-get upgrade.
3. Installed R: apt-get install R-base-core. Lots of space is needed, so delete all trash apps from your android phone!

===========
I typed R, created a dataframe to play with, run kmeans on it, with no issues. There is no graphics yet, you will not see plots (probably you can generate them as files), unless you install another heavy MB of linux GUI like lxde. But for my bus trips I use txtplot - it produces density, boxplot, scatter, acf in ASCII.
To see kmeans clusters: txtplot(x,y,pch=cluster_no)
R used phone's internet and downloaded it with no issues - it means you can also download dataframes from a repository online.

 A keyboard with arrows is recommended, like Hacker's Keyboard, so that it should be easy to go through previous commands, modify them and correct typo's inside the string..


GnuRoot autor Corbin Champion also offers a solution to run Octave on your phone. I highly recommend him.  https://play.google.com/store/apps/details?id=champion.gnuroot

More: Google: install r in android:
http://www.r-ohjelmointi.org/?p=1434
http://stackoverflow.com/questions/13047994/are-there-any-android-version-of-r-without-rooting-the-device

czwartek, 28 stycznia 2016

Watched "Interstellar" with my son. Impressive

Can we chase and reprogram an old drone just like in a movie? What is tesseract? Event horison? Singularity? Will Earth turn against us? Can we leave Earth now and go, confortably, sustainably live in Space? Hybernate and go towards other planetary systems? How do we know Earth turned against life in the past? Geological layers? How do we know there are other planets around other stars and there is oxygen on them? Spectroscopy? We will probably talk for several weeks about it. Great movie. https://en.wikipedia.org/wiki/Tesseract https://pl.wikipedia.org/wiki/Horyzont_zdarze%C5%84 https://en.wikipedia.org/wiki/Event_horizon https://en.wikipedia.org/wiki/Gravitational_singularity
 
"Mantra" repeated frequently in the movie: 
 
Do not go gentle into that good night
Dylan Thomas, 1914 - 1953

Do not go gentle into that good night,
Old age should burn and rave at close of day;
Rage, rage against the dying of the light.

Though wise men at their end know dark is right,
Because their words had forked no lightning they
Do not go gentle into that good night.

Good men, the last wave by, crying how bright
Their frail deeds might have danced in a green bay,
Rage, rage against the dying of the light.

Wild men who caught and sang the sun in flight,
And learn, too late, they grieved it on its way,
Do not go gentle into that good night.

Grave men, near death, who see with blinding sight
Blind eyes could blaze like meteors and be gay,
Rage, rage against the dying of the light.

--------------------------------

Nie wchodź łagodnie do tej dobrej nocy,
Starość u kresu dnia niech płonie, krwawi;
Buntuj się, buntuj, gdy światło się mroczy.

Mędrcy, choć wiedzą, że ciemność w nich wkroczy –
Bo nie rozszczepią słowami błyskawic –
Nie wchodzą cicho do tej dobrej nocy.

Cnotliwi, płacząc kiedy ich otoczy
Wspomnienie czynów w kruchym wieńcu sławy,
Niech się buntują, gdy światło się mroczy.

Szaleni słońce chwytający w locie,
Wasz śpiew radosny był mu trenem łzawym;
Nie wchodźcie cicho do tej dobrej nocy.

Posępnym, którym śmierć oślepia oczy,
Niech wzrok się w blasku jak meteor pławi;
Niech się buntują, gdy światło się mroczy.

Błogosławieństwem i klątwą niech broczy
Łza twoja, ojcze w niebie niełaskawym.
Nie wchodź łagodnie do tej dobrej nocy.
Buntuj się, buntuj, gdy światło się mroczy.

(wiersz Dylana Thomasa w przekładzie Stanisława Barańczaka)

piątek, 22 stycznia 2016

Coffee break: VBA Class refresher

Very simple and recommendable: http://analystcave.com/vba-vba-class-tutorial/
Where to use: http://stackoverflow.com/questions/118863/when-to-use-a-class-in-vba
Some Access vba reading: http://sourcedaddy.com/ms-access/object-oriented-programming-with-vba.html
http://vbadeveloper.net/optionpricerclassesdesignvba.pdf

very interesting: http://www.thelandbeyondspreadsheets.com/a-simple-example-of-how-to-use-class-modules-for-something-useful-in-excel-vba/
Microsoft article: https://www.microsoftpressstore.com/articles/article.aspx?p=2225071
Indeed Wise Owl: http://www.wiseowl.co.uk/blog/s237/classes.htm

czwartek, 21 stycznia 2016

VBA: cleaning data from ListobjectTable without destroying it.

I highly recommend:
http://stackoverflow.com/questions/10220906/how-to-select-clear-table-contents-without-destroying-the-table Version improved for my needs checks if data has not been cleaned already:
    With .ListObjects("Table3")

        .Range.AutoFilter
            If .DataBodyRange.Rows.Count > 1 Then
            .DataBodyRange.Offset(1).Resize(.DataBodyRange.Rows.Count - 1, .DataBodyRange.Columns.Count).Rows.Delete
            End If
        .DataBodyRange.Rows(1).Cells.ClearContents
        
    End With

Poland's S&P Ratings Downgrade by M.Tyrmand

http://www.breitbart.com/london/2016/01/18/polands-sp-ratings-downgrade/

środa, 20 stycznia 2016

Excel - fast vlookup with Dictionary (MSScriptingRuntime)

I found an interesting entry: http://stackoverflow.com/questions/18656808/how-to-optimize-vlookup-for-high-search-count-alternatives-to-vlookup The formula is a table one and is entered with Ctrl+Shift+Enter. I added Application.Volatile to see live updates. Small limitation: It needs to be entered every 65536 rows if the column is longer.

R vs Python - nice code comparison.

I recommend https://www.dataquest.io/blog/python-vs-r/

poniedziałek, 18 stycznia 2016

Excel VBA to hide and unhide a sheet on entering the right password.

Read this post: http://www.mrexcel.com/forum/excel-questions/380770-visual-basic-applications-hiding%5Cunhiding-very-hidden-sheet-via-password.html to have a nice solution to protect a sheet using "very hidden" object property. Attention: One must also remember to protect VBA project itself with a password, so that the sheet cannot be explored and unhidden in VBA editor.

niedziela, 17 stycznia 2016

What is happening in Poland?

http://www.politico.eu/sponsored-content/what-is-really-happening-in-poland/

Data nomad reporting from Sahara

Another day with numbers comforting me. I am Mistah W.Kurtz in my cave amidst wilderness of big city.

Numbers are from only God.
Pure truth. Devoid of adjectives and exclamations. Beautiful beyond any girl you would like to love in your dreams.

Existing beyond death. Will take you to the stars...

Mistah Kurtz contemplates:

A doberman mistreated badly by a former owner tells the new one how happy it is by forming geometrical shapes in the garden with its toys. I contemplate it daily. Thinking about my  family incarcerated, crushed and blacklisted by scientific ideology, in the name of the progress of man.

Adjectives of man are forbidden among us, a taboo.

Numbers are from God. Will protect and help calculate the way from the stars.

The desert conforms me.
I am a Beduin nomad on a camel. I eat dates with my coffee and meet other nomads once a year in a wilderness. Like lone sailors meeting mid-ocean and we talk simple geometry of the universe. finding regularities in the clear sky above.

Greetings from a digital nomad from xxi c. desert in Warsaw. Central Europe.

środa, 13 stycznia 2016

Excel formula to sort into bins.

 With VBA

Function MyBins(MyValue As Single) As String

Application.Volatile

        If SGPValue > 1 Then
               MyBins = "> 100%"
        
       
        ElseIf MyValue <= 1 And MyValue > 0.8 Then
           MyBins = "100%-80%"
        ElseIf MyValue <= 0.8 And MyValue > 0.65 Then
           MyBins = "80%-65%"
        ElseIf MyValue <= 0.65 And MyValue > 0.5 Then
           MyBins = "65%-50%"
        ElseIf MyValue <= 0.5 And MyValue > 0.2 Then
           MyBins = "50%-20%"
        ElseIf MyValue <= 0.2 And MyValue >= 0 Then
           MyBins = "20%-0%"
        ElseIf MyValue < 0 Then
           MyBins = "< 0"
        
       Else
           MyBins = "Please Verify"

           
    End If
 
    
End Function







Without?
Would you use IF for that purpose? And when there are several bins you will have plenty of nested IFs?
The answer is to use LOOKUP. cf.: http://www.mrexcel.com/forum/excel-questions/631008-excel-formula-sort-into-bins.html#post3130377 In this example you can also see how an array, a table, can be encoded in a formula.
I think it is very neat to use ListObject Table to keep the parameters in, then the formula would look like:
=LOOKUP(B3;Table1[a];Table1[b]) where [a] is a column with upper limits and [b] contains the corresponding names of the bins.

Table 1
a            b
-99999  < 1
1           1 to 1,9999
2           2 to 4,9999
5           5 to 9,9999
10         10 and avobe

To count bins to generate a histogram it is probably best to use countifs:
http://exceluser.com/formulas/countifs-frequency-distributions.htm 




Survey Science: an interesting excel macro collection for researchers.

See: http://www.surveyscience.co.uk/index.html

Plus, an interesting article on outliers' detection: http://fernandonunezserrano.com/Outlier_Detector.pdf
(and in R: http://www.rdatamining.com/examples/outlier-detection)

wtorek, 12 stycznia 2016

R: mosaic plots with labels? Other graph musings.

Googled out this solution: http://stackoverflow.com/questions/20228326/mosaic-plot-with-labels-in-each-box-showing-a-name-and-percentage-of-all-observa

I looked at these pages as well:
https://ds4ci.files.wordpress.com/2013/09/user08_jimp_custseg_revnov08.pdf
http://tables2graphs.com/doku.php?id=03_descriptive_statistics#figure_2
https://cran.r-project.org/web/packages/vcd/vignettes/residual-shadings.pdf
http://cran.us.r-project.org/web/packages/vcdExtra/vignettes/vcd-tutorial.pdf
http://www.math.ucla.edu/~anderson/rw1001/library/base/html/mosaicplot.html
summary of cat. values http://stackoverflow.com/questions/14235096/data-summary-based-on-multiple-categorical-variables

Clustering segmentation:
http://marketing-yogi.blogspot.com/2012/12/segmentation-tools-in-r-session-5.html
https://ds4ci.files.wordpress.com/2013/09/jimporzak_rfwithr_dmaac_jan07_webinar.pdf


OT: other graphs
http://www.cyclismo.org/tutorial/R/intermediatePlotting.html
http://stats.stackexchange.com/questions/4089/graphical-data-overview-summary-function-in-r?rq=1
http://www.statmethods.net/advstats/cart.html
http://ww2.coastal.edu/kingw/statistics/R-tutorials/graphically.html


Breaking data into bins: http://stackoverflow.com/questions/12979456/r-code-to-categorize-age-into-group-bins-breaks

Aggregating: http://www.statmethods.net/management/aggregate.html
http://davetang.org/muse/2013/05/22/using-aggregate-and-apply-in-r/
http://www.r-statistics.com/2012/01/aggregation-and-restructuring-data-from-r-in-action/
https://www.soa.org/News-and-Publications/Newsletters/Compact/2012/august/The-R-Corner-%E2%80%93-Aggregate-Function.aspx

poniedziałek, 11 stycznia 2016

Early morning R readings: knn imputation of missing values

Keywords: imputation of missing values in r with knn Googled out and briefly read about of using k nearest neigbors for filling missing values: http://stats.stackexchange.com/questions/61110/knn-imputation-r-packages http://finzi.psych.upenn.edu/library/clustergas/html/missvalue.knn.html http://artax.karlin.mff.cuni.cz/r-help/library/scrime/html/knncatimputeLarge.html

czwartek, 7 stycznia 2016

Excel VBA: changing value calculates discount, changing discount calculates value.

We want to enter values and have discount calculated in another column. On the other hand we want to change discount and have value calculated... We need VBA and sheet events:

Private oListObj As ListObject
      
'Private Sub Worksheet_Activate()
'    ActiveSheet.Protect "pwd", AllowFiltering:=True
'End Sub

    Private Sub Worksheet_BeforeDoubleClick(ByVal Target As Range, Cancel As Boolean)
      
    Set oListObj = Worksheets("Quotation").ListObjects("tblProForma")
      
    Application.EnableEvents = True
      
        If Not Intersect(Target, oListObj.ListColumns("Price").DataBodyRange) Is Nothing Then
        Application.EnableEvents = False
        Application.AutoCorrect.AutoFillFormulasInLists = False
        Target.Formula = Target.Value
        Application.EnableEvents = True
        End If
          
        If Not Intersect(Target, oListObj.ListColumns("Discount").DataBodyRange) Is Nothing Then
        Application.EnableEvents = False
        Application.AutoCorrect.AutoFillFormulasInLists = False
        Target.Formula = Round(Target.Value, 5)
        Application.EnableEvents = True
        End If

    End Sub


      
    Private Sub Worksheet_Change(ByVal Target As Range)
      
    Dim PriceDiscountOffset As Integer: PriceDiscountOffset = ActiveSheet.Range("tblProForma[[#All],[Price]:[Discount]]").Columns.Count - 1
          
    Set oListObj = Worksheets("Quotation").ListObjects("tblProForma")
      
    Application.EnableEvents = True
    
    '=============
    'This code prevents macro from ruining table if user resizes it.
    '=============http://www.mrexcel.com/forum/excel-questions/844654-can-i-tell-excel-run-macro-every-time-i-manually-resize-listobject-excel-table.html
    Dim sLastUndoStackItem  As String
  
        On Error Resume Next
        sLastUndoStackItem = Application.CommandBars("Standard").FindControl(ID:=128).List(1)
        On Error GoTo 0
    
     If sLastUndoStackItem = "Table Resize" Then
       Exit Sub
     End If
    
    '=============
    
      
        If Not Intersect(Target, oListObj.ListColumns("Price").DataBodyRange) Is Nothing Then
        Application.EnableEvents = False
        Application.AutoCorrect.AutoFillFormulasInLists = False
        
        
        Target.Offset(0, PriceDiscountOffset).Formula = "=IF([@[Price]]<>"""", -([@[Price]]-[@[Pricelist]])/[@[Price]],"""")"
        Application.EnableEvents = True
        End If
          
        If Not Intersect(Target, oListObj.ListColumns("Discount").DataBodyRange) Is Nothing Then
        Application.EnableEvents = False
        
             
        Application.AutoCorrect.AutoFillFormulasInLists = False
        Target.Offset(0, -PriceDiscountOffset).Formula = "=[@[Pricelist]]-([@[Pricelist]]*[@[Discount]])"
        Application.EnableEvents = True
        End If

      
    End Sub

wtorek, 5 stycznia 2016

Coffee break. Trees, some reading and stupidly playing with tree clusters...

http://www.exegetic.biz/blog/2013/05/package-party-conditional-inference-trees/ very good read.
http://www.r-bloggers.com/package-party-conditional-inference-trees/
http://www.statmethods.net/advstats/cart.html
https://stat.ethz.ch/R-manual/R-devel/library/stats/html/hclust.html

Played like that:
> drzewko<-ctree(mojedane$y~.,data = mojedane[,c(1:3)])
> predict(drzewko, type = "response") 
 

R: quick view at data - correlogram, margin plot etc.

Quick get it from excel
 
> mojedane<-read.table(file="clipboard",dec = ",",header=TRUE) 
 
Then, after: http://www.statmethods.net/advgraphs/correlograms.html 
 
> library(corrgram) 
 
> corrgram(mojedane, order=TRUE, lower.panel=panel.ellipse,
+          upper.panel=panel.pie, text.panel=panel.txt,
+          diag.panel=panel.minmax,
+          main="Korelacje - moje dane") 
 
 
To see where the missing data goes.
 
http://www.inside-r.org/packages/cran/VIM/docs/marginplot
 
> library("VIM")
> mojedane<-read.table(file="clipboard",dec = ",",header=TRUE,row.names = 1)
> marginplot(mojedane[,c(1,2)])

R: xts and zoo package for missing data; two plots

Dealing with incomplete data:

Useful and basic entry into the topic:
http://nicercode.github.io/intro/missing-data.html and
http://thomasleeper.com/Rcourse/Tutorials/NAhandling.html
http://www.ats.ucla.edu/stat/r/faq/missing.htm

==Fill NA==
A very interesting reading on the use of zoo/xts package here (page 1 and the following): http://publish.illinois.edu/spencer-guerrero/2014/12/11/2-dealing-with-missing-data-in-r-omit-approx-or-spline-part-1/
 

My dirty notebook:

> is.na (
> mean(Data&Variable1, na.rm=TRUE)

Remove incomplete
> na.omit(Data$Variable1)
> complete.cases(Data)
> complete.data <- complete.cases(Data)

Fill continuous/time series data
> require (xts)
> prices <-na.locf(prices)
> prices <-na.fill(prices, fill="extend") are there other fills?
check out na.approx, na.spline!!!
na.contiguous (to find longest contiguous na's?) 

For plotting
na.fill etc functions turn data frame into atomic vectors...
It means that if we plot, we cannot refer to column names but to column numbers.

Plot several plots:
> par(mfrow=(c(2,1))
> plot(prices[,1], type='l')
> plot(prices[,2], type='l')




 ====
For correlation
cor ()... use the argument use='pairwise', then only the pairs with na will be ommited and not the whole observations.
 ====
Serious approach: imputation:
http://www.stat.columbia.edu/~gelman/arm/missing.pdf
e.g.: http://stackoverflow.com/questions/11990463/match-fitted-values-from-lm-with-a-data-frame-in-case-of-na-values
https://stat.ethz.ch/pipermail/r-help/2002-January/017756.html

Good article for a study: http://www.r-bloggers.com/imputing-missing-data-with-r-mice-package/
http://research.stowers-institute.org/efg/R/Statistics/loess.htm
http://www.biostat.jhsph.edu/~iruczins/teaching/jf/ch14.pdf
http://www.ats.ucla.edu/stat/r/faq/R_pmm_mi.htm
http://www.stefvanbuuren.nl/publications/MICE%20in%20R%20-%20Draft.pdf
http://stackoverflow.com/questions/2613420/handling-missing-incomplete-data-in-r-is-there-function-to-mask-but-not-remove

poniedziałek, 4 stycznia 2016

Maps of Poland in R

To start with:
http://www.gauss.pl/analityk/drupal/node/43
http://www.kep.uni.lodz.pl/em/materialy/BANKI/BANKI_DANYCH_cw_10.doc

Most comprehensive:
http://quantup.pl/2015/03/13/analiza-danych/wizualizacja-danych-na-mapach-w-r/

http://www.goldenline.pl/grupy/Komputery_Internet/r/mapa-polski,1523596/
http://www.gis-support.pl/baza-wiedzy/dane/dane-do-pobrania/
http://www.wais.kamil.rzeszow.pl/pakiet-r-openpoland-tutorial/

In Excel
http://www.apocotenexcel.pl/mapa2.htm

sobota, 2 stycznia 2016

R: Just discovered Swirl, interactive courses in R console.

To install: Choose a course name from https://github.com/swirldev/swirl_courses and paste it in R command, eg:
> install_from_swirl("Regression_Models", dev = FALSE, mirror = "github")

This night I completed R programming course, an intro.

Some of my unordered notes, useful perhaps only to myself:

Working directory:
list.files() or dir()
> args(list.files)
> dir.create("testdir")
> file.create("mytest.R")
> file.exists("mytest.R")
> file.rename("mytest.R","mytest2.R")
> file.copy("mytest2.R","mytest3.R")
> file.path("mytest3.R") - good to construct relative, independent of system, paths in a procedure.
> file.path("folder1","folder2")
> dir.create(file.path("testdir2","testdir3"),recursive = TRUE) - creating dir and subdir in one shot
> unlink("testdir2", recursive=TRUE) - deleting directory with all its content, note use Recursive
> unlink("testdir", recursive=TRUE)



Sequences of numbers along some vector:
> seq(along.with = my_seq)
> seq_along(my_seq)
> rep(c(0,1,2),times = 40)
> rep(c(0,1,2), each = 10)
> file.info("mytest.R")$mode --- to grab specific info


Vectors
> paste(my_char, collapse = " ")  - collapsing string vector to one string.
> paste("Hello", "world!", sep=" ")

> paste(1:3,c("X","Y","Z"), sep="")
[1] "1X" "2Y" "3Z"

> paste(LETTERS, 1:4, sep="-")
 [1] "A-1" "B-2" "C-3" "D-4" "E-1" "F-2" "G-3" "H-4" "I-1" "J-2" "K-3" "L-4" "M-1" "N-2" "O-3" "P-4" "Q-1" "R-2" "S-3" "T-4" "U-1" "V-2" "W-3" "X-4" "Y-1"
[26] "Z-2"


indexing
> x[-c(2,10)] exclude these numbers, use - once before concatenate

==data.frame==
> colnames(my_data)<-cnames - giving names to columns


> which(ints>7) Which are less then 7?
[1] 1 3 5
> any(ints<0) are there any lesser than 7
[1] FALSE
> all(ints>0)
[1] TRUE

functions

ellipsis arguments...

mad_libs <- function(...){
  # Do your argument unpacking here!

  args <- list(...)


  place <- args[["place"]]
  adjective <- args[["adjective"]]
  noun <- args[["noun"]]

  paste("News from", place, "today where", adjective, "students took to the streets in protest of the new", noun, "being installed on campus.")
}

your own operators, eg. add texts to each other.

"%p%" <- function(left,right){
  paste(left,right)
 
}

> lapply(unique_vals, function(elem) elem[2])  - one liner function

Table apply?

> table(flags$animate) - count flags which are animate :-)

> tapply(flags$animate, flags$landmass, mean)  - get the proportion of animate elements in each landmass.

>  tapply(flags$population, flags$red, summary) - get the population summary for red in the flag.

>  tapply(flags$population, flags$landmass, summary) - get the population summary for each landmass


Aquaint with data
> summary(plants)

if categorical data gets truncated and I need to analyse it:
> table (plants$Active_Growth_Period)

But the most concise info on data one gets via
> str(plants)

Simulation:
Dice throwing 4 times:
> sample(1:6,4, replace = TRUE)
Unfair coin flips:
> flips<- sample(c(0,1),100,replace=TRUE, prob=c(0.3,0.7))
> rbinom(1,size=100, prob=0.7) - number of heads
> rbinom(100, size=1, prob=0.7) - set of all observations


 Each probability distribution in R has an
r*** function (for "random"), a d*** function (for "density"), a p*** (for "probability"), and q*** (for
| "quantile").


Time:
> str(unclass(t2)) to have a more compact view of all components of a structure.
> t4<- strptime(t3,  "%B %d, %Y %H:%M") - check it out, it converts txt dates into POSIXlt
> difftime(Sys.time(),t1, units='days') time difference in days



Plot:
> plot(cars, xlim = c(10,15)) - limit x to 10..15 range
> plot(cars, pch = 2) - change tick symbols (here to triangles)

> boxplot(mpg ~ cyl, data = mtcars)   y mpg to x cyl from data = mtcars
> hist(mtcars$mpg)