Interesting Excel interface to R via RExcel: https://sourceforge.net/projects/imdev
Some interesting tutorials to study: https://github.com/dgrapov/TeachingDemos
https://sourceforge.net/projects/teachingdemos/files/
The author responded next day. I need to follow up!
piątek, 26 lutego 2016
środa, 24 lutego 2016
Excel VBA: Evaluate() or square brackets shorthand.
I recommend this trick to shorten the code for entering/copying/calculating data in spreadsheet ranges:
http://www.ozgrid.com/forum/showthread.php?t=52372&page=2
Example:
http://www.ozgrid.com/forum/showthread.php?t=52372&page=2
Example:
Sub Test() [A1].Offset(0, 1) = [=SUM(Table1[A])] End Sub
sobota, 13 lutego 2016
R appeal to Ministry of Digital Affairs of Poland
Dear Minister Anna Streżyńska
I would like humbly to propose that Polish Ministry of Digital Affairs should support the development of R community in Poland with a dedicated government program.
R community is a dynamically growing number of users of free R statistical environment which facilitates (indeed makes it easy for everyone) programming, data cleaning, exploration and data mining. In Warsaw alone it includes growing number of enthusiasts, Warsaw Open University alumni of R courses, regular meetups (http://www.spotkania-entuzjastow-r.pl/), Smarter Poland activities and many others.
R, can provide a single environment for:
- online access to public statistical data (medical, social, crime, economic, weather),
- a remote interactive teaching data science and programming in schools - teaching the IT skills that will be of immediate use to anyone from business startup-per to hobbyist building a drone capable of machine learning),
- visualizing and representing analysis results related to Poland (choropleths or subplots imposed on the map of Poland)
- the environment can also be set up for quick generating very high quality and publication ready reports that can be immediately printed and published on-line.
R can facilitate government transparency and promoting Open Government programme. (http://radar.oreilly.com/2009/07/making-government-transparent.html, http://ropengov.github.io/)
Some ideas of what Ministry of Digital Affairs can coordinate and assist with is:
- development of a dedicated R package for accessing public data, analysing and visualising it on an administrative map of Poland,
-development of an R distance learning package teaching statistics to Polish students (it can be the the matter of preparing a localized quality content, the tool "swirl" is available for that purpose),
- encouragement of public offices to publish current data that can be easily analyzed in R (csv, . This can be current information from treasury, statistical offices, police (geography of crime incidents), weather, health diagnosis and procedures (NHF and Ministry of Health data) - information devoid of personal information but fresh and detailed enough for Polish citizen to make informed decisions.
Advantages for Polish administration and economy:
Free R can be compared to expensive SPSS, Statistica and SAS packages - it can save millions giving powerful mining capabilities to government at any level and all citizens. It has been popular with world best universities, now it is used by big media houses, banks and research labs.
R data mining can be used for evectively analysing any data and drawing right conclusions in health management, poverty, tax evasion, crime incidence, even counterintelligence and anti-terrorism. R literacy can generate high quality jobs and bring measurable profit because it is a tool facilitating innovation and right decisions based on science.
It gives Poland a unique chance to be among the first to embrace the ideas of true modern democracy expressed in access to information and true modern freedom of making informed choices.
Sincerely Yours,
Jacek Kotowski
I would like humbly to propose that Polish Ministry of Digital Affairs should support the development of R community in Poland with a dedicated government program.
R community is a dynamically growing number of users of free R statistical environment which facilitates (indeed makes it easy for everyone) programming, data cleaning, exploration and data mining. In Warsaw alone it includes growing number of enthusiasts, Warsaw Open University alumni of R courses, regular meetups (http://www.spotkania-entuzjastow-r.pl/), Smarter Poland activities and many others.
R, can provide a single environment for:
- online access to public statistical data (medical, social, crime, economic, weather),
- a remote interactive teaching data science and programming in schools - teaching the IT skills that will be of immediate use to anyone from business startup-per to hobbyist building a drone capable of machine learning),
- visualizing and representing analysis results related to Poland (choropleths or subplots imposed on the map of Poland)
- the environment can also be set up for quick generating very high quality and publication ready reports that can be immediately printed and published on-line.
R can facilitate government transparency and promoting Open Government programme. (http://radar.oreilly.com/2009/07/making-government-transparent.html, http://ropengov.github.io/)
Some ideas of what Ministry of Digital Affairs can coordinate and assist with is:
- development of a dedicated R package for accessing public data, analysing and visualising it on an administrative map of Poland,
-development of an R distance learning package teaching statistics to Polish students (it can be the the matter of preparing a localized quality content, the tool "swirl" is available for that purpose),
- encouragement of public offices to publish current data that can be easily analyzed in R (csv, . This can be current information from treasury, statistical offices, police (geography of crime incidents), weather, health diagnosis and procedures (NHF and Ministry of Health data) - information devoid of personal information but fresh and detailed enough for Polish citizen to make informed decisions.
Advantages for Polish administration and economy:
Free R can be compared to expensive SPSS, Statistica and SAS packages - it can save millions giving powerful mining capabilities to government at any level and all citizens. It has been popular with world best universities, now it is used by big media houses, banks and research labs.
R data mining can be used for evectively analysing any data and drawing right conclusions in health management, poverty, tax evasion, crime incidence, even counterintelligence and anti-terrorism. R literacy can generate high quality jobs and bring measurable profit because it is a tool facilitating innovation and right decisions based on science.
It gives Poland a unique chance to be among the first to embrace the ideas of true modern democracy expressed in access to information and true modern freedom of making informed choices.
Sincerely Yours,
Jacek Kotowski
czwartek, 11 lutego 2016
Coffee break: Lesson ideas: R and Titanic disaster - trees and forests.
Great ideas for a beginner like me to play with data mining.
Data and challenge proposed by kaggle.com:
https://www.kaggle.com/c/titanic/dat
Good tutorials and proposed solutins here:
http://trevorstephens.com/post/72916401642/titanic-getting-started-with-r and
https://github.com/trevorstephens/titanic
Also: https://www.kaggle.com/amoyakd/titanic/randomforest-method-v1-0
https://www.kaggle.com/c/titanic/forums
Some reading on R capabilities:
http://www.edureka.co/blog/implementation-of-decision-tree/
http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/
Other tutorials:
Iris data: http://rischanlab.github.io/RandomForest.html
(R Basics worth exploring: http://rischanlab.github.io/)
http://dni-institute.in/blogs/random-forest-using-r-step-by-step-tutorial/ (might be even better for a start
http://www.edureka.co/blog/implementation-of-decision-tree/
Biostars Tutorial: Machine Learning For Cancer Classification - Part 1 - Preparing The Data Sets
https://www.biostars.org/p/85124/
http://www.tutorialspoint.com/r/r_random_forest.htm
http://www.analyticsvidhya.com/blog/2015/09/random-forest-algorithm-multiple-challenges/
Other important topics to explore:
http://www.analyticsvidhya.com/blog/2015/12/faster-data-manipulation-7-packages/
Data and challenge proposed by kaggle.com:
https://www.kaggle.com/c/titanic/dat
Good tutorials and proposed solutins here:
http://trevorstephens.com/post/72916401642/titanic-getting-started-with-r and
https://github.com/trevorstephens/titanic
Also: https://www.kaggle.com/amoyakd/titanic/randomforest-method-v1-0
https://www.kaggle.com/c/titanic/forums
Some reading on R capabilities:
http://www.edureka.co/blog/implementation-of-decision-tree/
http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/
Other tutorials:
Iris data: http://rischanlab.github.io/RandomForest.html
(R Basics worth exploring: http://rischanlab.github.io/)
http://dni-institute.in/blogs/random-forest-using-r-step-by-step-tutorial/ (might be even better for a start
http://www.edureka.co/blog/implementation-of-decision-tree/
Biostars Tutorial: Machine Learning For Cancer Classification - Part 1 - Preparing The Data Sets
https://www.biostars.org/p/85124/
http://www.tutorialspoint.com/r/r_random_forest.htm
http://www.analyticsvidhya.com/blog/2015/09/random-forest-algorithm-multiple-challenges/
Other important topics to explore:
http://www.analyticsvidhya.com/blog/2015/12/faster-data-manipulation-7-packages/
wtorek, 9 lutego 2016
Rafał Wójcikowski -constructive critique of gov pronatalist policy in Polish Parliament
Coffee break, listened briefly to Polish Parliament debate on pronatalist policy 500PLN/month for each child.
Very good alternative proposed by Rafał Wójcikowski: https://pl.wikipedia.org/wiki/Rafa%C5%82_W%C3%B3jcikowski Based on Milton Friedman's negative income tax https://en.wikipedia.org/wiki/Negative_income_tax Implemented in Israel. Need to read more on this. Will PiS gov have courage and guts to consider it? I wonder what min. Morawiecki thinks. Point for Kukiz15 party.
Very good alternative proposed by Rafał Wójcikowski: https://pl.wikipedia.org/wiki/Rafa%C5%82_W%C3%B3jcikowski Based on Milton Friedman's negative income tax https://en.wikipedia.org/wiki/Negative_income_tax Implemented in Israel. Need to read more on this. Will PiS gov have courage and guts to consider it? I wonder what min. Morawiecki thinks. Point for Kukiz15 party.
niedziela, 7 lutego 2016
Small plots on a plot: R
Pie charts instead of points or bubbles in x-y/scatter-plot/bubble plot:
Uncle google found this for me:
http://blog.revolutionanalytics.com/2012/09/visualize-complex-data-with-subplots.html
http://www.r-bloggers.com/embeding-a-subplot-in-ggplot-via-subview/
http://vita.had.co.nz/papers/embedded-plots.pdf
ggsubplot
http://stackoverflow.com/questions/26577669/tiny-pie-charts-to-represent-each-point-in-an-scatterplot-using-ggplot2
http://stackoverflow.com/questions/7714677/r-scatterplot-with-too-many-points/16122003#16122003
pieGlyph
http://stackoverflow.com/questions/20465070/barplots-on-a-map/20468836#20468836
Maybe it will be possible in ggvis??
http://ggvis.rstudio.com/ggvis-basics.html
===========
Here a solution without ggplot (p.11):
https://www.r-project.org/doc/Rnews/Rnews_2003-2.pdf
====
plotrix and floating.pie?
http://ask.programmershare.com/2269_10385068/
====
http://gis.stackexchange.com/questions/4568/effectively-displaying-demographic-data-on-a-printed-map
====
ggtree
http://stackoverflow.com/questions/10368180/plotting-pie-graphs-on-map-in-ggplot
mapplots
http://www.molecularecologist.com/2012/09/making-maps-with-r/
Rozwiazanie w Excelu:
http://www.andypope.info/charts/piedatamarkers.htm
http://blog.revolutionanalytics.com/2012/09/visualize-complex-data-with-subplots.html
http://www.r-bloggers.com/embeding-a-subplot-in-ggplot-via-subview/
http://vita.had.co.nz/papers/embedded-plots.pdf
ggsubplot
http://stackoverflow.com/questions/26577669/tiny-pie-charts-to-represent-each-point-in-an-scatterplot-using-ggplot2
http://stackoverflow.com/questions/7714677/r-scatterplot-with-too-many-points/16122003#16122003
pieGlyph
http://stackoverflow.com/questions/20465070/barplots-on-a-map/20468836#20468836
Maybe it will be possible in ggvis??
http://ggvis.rstudio.com/ggvis-basics.html
===========
Here a solution without ggplot (p.11):
https://www.r-project.org/doc/Rnews/Rnews_2003-2.pdf
====
plotrix and floating.pie?
http://ask.programmershare.com/2269_10385068/
====
http://gis.stackexchange.com/questions/4568/effectively-displaying-demographic-data-on-a-printed-map
====
ggtree
http://stackoverflow.com/questions/10368180/plotting-pie-graphs-on-map-in-ggplot
mapplots
http://www.molecularecologist.com/2012/09/making-maps-with-r/
Rozwiazanie w Excelu:
http://www.andypope.info/charts/piedatamarkers.htm
piątek, 5 lutego 2016
R - Machine learning cheatsheet
A syntax for applying some machine learning procedures in R. Misleadingly.... simple :-)
http://vitalflux.com/cheat-sheet-10-machine-learning-algorithms-r-commands/
środa, 3 lutego 2016
Excel: fill in table from another table.
UPDATE:
I am now working on replacing offsets with absolute or structured references, ie. references to Listobject Table.
This will make my vba immune from inserting columns and allow hiding the source table:
Rozwiązanie, np. użycie VBA
Inny temat: chcemy skopiować tylko widoczne wiersze do innej tabeli (niewidoczne nas nie interesują).
Private Sub FindCopyValues() Dim strKodRange As Range Dim lObjSource As ListObject: Set lObjSource = Sheets("Pricelist").ListObjects("tblPricelist") Dim lObjTarget As ListObject: Set lObjTarget = Sheets("Quote").ListObjects("tblQuote") Dim iTableRowNum As Integer 'set cursor in first cell of column Code in tblPricelist Dim curAddr As Range: Set curAddr = lObjTarget.DataBodyRange.Cells(1, lObjTarget.ListColumns("Code").Index) 'set what to look for Dim strKod As String: strKod = curAddr.Value 'what offsets between Code and Price With lObjTarget Dim Code2Price As Integer: Code2Price = .ListColumns("Price").Index - .ListColumns("Code").Index Dim Code2Description As Integer: Code2Description = .ListColumns("Description").Index - .ListColumns("Code").Index End With Do Until strKod = "" Set Cell = lObjSource.ListColumns("Code").DataBodyRange.Cells.Find(What:=strKod, MatchCase:=False, LookAt:= _ xlWhole, SearchFormat:=False) iTableRowNum = Cell.Row - lObjSource.DataBodyRange.Rows(1).Row + 1 Debug.Print iTableRowNum ' With Cell ' ' curAddr.Offset(0, Code2Price).Value = .Offset(0, 1).Value ' curAddr.Offset(0, Code2Description).Value = .Offset(0, 2).Value ' ' End With 'Albo curAddr.Offset(0, Code2Price).Value = lObjSource.DataBodyRange.Cells(iTableRowNum, lObjSource.ListColumns("Price").Index) curAddr.Offset(0, Code2Description).Value = lObjSource.DataBodyRange.Cells(iTableRowNum, lObjSource.ListColumns("Description").Index) Set curAddr = curAddr.Offset(1, 0) strKod = curAddr.Text Loop Application.ScreenUpdating = True Application.EnableEvents = True End SubProblem z vlookupem, nie zwraca wartości wraz z linkiem do strony internetowej, zdjęcia czy dokumentu.
Rozwiązanie, np. użycie VBA
Private Sub Worksheet_Change(ByVal Target As Range) Dim strKod As String strKod = Sheets("Baza").Range("G4").Text If Target.Address = "$G$4" Then Range("$G$4").Select With Sheets("Aku").Cells.Find(What:=strKod, after:=ActiveCell, LookIn:=xlValues, LookAt:= _ xlWhole, SearchOrder:=xlByColumns, SearchDirection:=xlNext, MatchCase:= _ False) .Offset(0, 1).Copy Destination:=Sheets("Baza").Range("H4") .Offset(0, 2).Copy Destination:=Sheets("Baza").Range("I4") End With End If End SubPowtarza się kod .Offset... więc można go połączyć w jedną linię za pomocą Union. Ponadto, nie chcę szukać w całym arkuszu a jedynie w kolumnie tabeli.
Private Sub Worksheet_Change(ByVal Target As Range) Dim strKod As String strKod = Sheets("Baza").Range("G4").Text oListObj = Sheets("Aku").ListObjects("Tabela1").ListColumns("Nazwa aku.") If Target.Address = "$G$4" Then Range("$G$4").Select With oListObj.DataBodyRange.Cells.Find(What:=strKod, after:=ActiveCell, LookIn:=xlValues, LookAt:= _ xlWhole, SearchOrder:=xlByColumns, SearchDirection:=xlNext, MatchCase:= _ False) Union(.Offset(0, 1), .Offset(0, 2)).Copy Destination:=Sheets("Baza").Range("H4") End With End If End SubA co jeśli chcemy wyszukać nie tylko w komórce G4 ale w całej kolumnie?
Private Sub FindCopyValues() Dim strKod As String Dim strKodRange As Range Dim objList As ListObject Dim curAddr As Object 'Application.ScreenUpdating = False 'Application.EnableEvents = False Set objList = Sheets("Aku").ListObjects("Tabela1") Set curAddr = Worksheets("Baza").Range("$G$4") strKod = curAddr.Value Do Until strKod = "" Set cell = objList.ListColumns("Nazwa aku.").DataBodyRange.Cells.Find(What:=strKod, MatchCase:=False, LookAt:= _ xlWhole, SearchFormat:=False) With cell Union(.Offset(0, 1), .Offset(0, 2)).Copy Destination:=curAddr.Offset(0, 1) End With Set curAddr = curAddr.Offset(1, 0) strKod = curAddr.Text Loop Application.ScreenUpdating = True Application.EnableEvents = True End Sub
Inny temat: chcemy skopiować tylko widoczne wiersze do innej tabeli (niewidoczne nas nie interesują).
Sub CopyVisibleProducts() Sheets("cennik").ListObjects("Table1").DataBodyRange.SpecialCells(xlCellTypeVisible).Copy Sheets("pro_forma").Range("b5") Application.CutCopyMode = False End Sub
poniedziałek, 1 lutego 2016
Coffee break: imputation
From CranR:
Simple mean imputation: http://www.r-bloggers.com/example-2014-5-simple-mean-imputation/
knnImputation {DMwR} Function that fills in all NA values using the k Nearest Neighbours of each case with NA values.
centralImputation {DMwR} This function fills in any NA value in all columns of a data frame with the statistic of centrality (given by the function centralvalue()) of the respective column.
Other: kNNImpute {imputation} Imputation using k-nearest neighbors. For each record, identify missinng features. http://www.bioconductor.org/packages/release/bioc/html/impute.html
impute {Hmisc} http://www.inside-r.org/packages/cran/hmisc/docs/impute http://www.jstatsoft.org/article/view/v023i10 http://docs.zeligproject.org/en/latest/using-Zelig-with-Amelia.html
Simple mean imputation: http://www.r-bloggers.com/example-2014-5-simple-mean-imputation/
knnImputation {DMwR} Function that fills in all NA values using the k Nearest Neighbours of each case with NA values.
centralImputation {DMwR} This function fills in any NA value in all columns of a data frame with the statistic of centrality (given by the function centralvalue()) of the respective column.
Other: kNNImpute {imputation} Imputation using k-nearest neighbors. For each record, identify missinng features. http://www.bioconductor.org/packages/release/bioc/html/impute.html
impute {Hmisc} http://www.inside-r.org/packages/cran/hmisc/docs/impute http://www.jstatsoft.org/article/view/v023i10 http://docs.zeligproject.org/en/latest/using-Zelig-with-Amelia.html
Subskrybuj:
Posty (Atom)