Interesting Excel interface to R via RExcel: https://sourceforge.net/projects/imdev
Some interesting tutorials to study: https://github.com/dgrapov/TeachingDemos
https://sourceforge.net/projects/teachingdemos/files/
The author responded next day. I need to follow up!
piątek, 26 lutego 2016
środa, 24 lutego 2016
Excel VBA: Evaluate() or square brackets shorthand.
I recommend this trick to shorten the code for entering/copying/calculating data in spreadsheet ranges:
http://www.ozgrid.com/forum/showthread.php?t=52372&page=2
Example:
http://www.ozgrid.com/forum/showthread.php?t=52372&page=2
Example:
Sub Test()
[A1].Offset(0, 1) = [=SUM(Table1[A])]
End Sub
sobota, 13 lutego 2016
R appeal to Ministry of Digital Affairs of Poland
Dear Minister Anna Streżyńska
I would like humbly to propose that Polish Ministry of Digital Affairs should support the development of R community in Poland with a dedicated government program.
R community is a dynamically growing number of users of free R statistical environment which facilitates (indeed makes it easy for everyone) programming, data cleaning, exploration and data mining. In Warsaw alone it includes growing number of enthusiasts, Warsaw Open University alumni of R courses, regular meetups (http://www.spotkania-entuzjastow-r.pl/), Smarter Poland activities and many others.
R, can provide a single environment for:
- online access to public statistical data (medical, social, crime, economic, weather),
- a remote interactive teaching data science and programming in schools - teaching the IT skills that will be of immediate use to anyone from business startup-per to hobbyist building a drone capable of machine learning),
- visualizing and representing analysis results related to Poland (choropleths or subplots imposed on the map of Poland)
- the environment can also be set up for quick generating very high quality and publication ready reports that can be immediately printed and published on-line.
R can facilitate government transparency and promoting Open Government programme. (http://radar.oreilly.com/2009/07/making-government-transparent.html, http://ropengov.github.io/)
Some ideas of what Ministry of Digital Affairs can coordinate and assist with is:
- development of a dedicated R package for accessing public data, analysing and visualising it on an administrative map of Poland,
-development of an R distance learning package teaching statistics to Polish students (it can be the the matter of preparing a localized quality content, the tool "swirl" is available for that purpose),
- encouragement of public offices to publish current data that can be easily analyzed in R (csv, . This can be current information from treasury, statistical offices, police (geography of crime incidents), weather, health diagnosis and procedures (NHF and Ministry of Health data) - information devoid of personal information but fresh and detailed enough for Polish citizen to make informed decisions.
Advantages for Polish administration and economy:
Free R can be compared to expensive SPSS, Statistica and SAS packages - it can save millions giving powerful mining capabilities to government at any level and all citizens. It has been popular with world best universities, now it is used by big media houses, banks and research labs.
R data mining can be used for evectively analysing any data and drawing right conclusions in health management, poverty, tax evasion, crime incidence, even counterintelligence and anti-terrorism. R literacy can generate high quality jobs and bring measurable profit because it is a tool facilitating innovation and right decisions based on science.
It gives Poland a unique chance to be among the first to embrace the ideas of true modern democracy expressed in access to information and true modern freedom of making informed choices.
Sincerely Yours,
Jacek Kotowski
I would like humbly to propose that Polish Ministry of Digital Affairs should support the development of R community in Poland with a dedicated government program.
R community is a dynamically growing number of users of free R statistical environment which facilitates (indeed makes it easy for everyone) programming, data cleaning, exploration and data mining. In Warsaw alone it includes growing number of enthusiasts, Warsaw Open University alumni of R courses, regular meetups (http://www.spotkania-entuzjastow-r.pl/), Smarter Poland activities and many others.
R, can provide a single environment for:
- online access to public statistical data (medical, social, crime, economic, weather),
- a remote interactive teaching data science and programming in schools - teaching the IT skills that will be of immediate use to anyone from business startup-per to hobbyist building a drone capable of machine learning),
- visualizing and representing analysis results related to Poland (choropleths or subplots imposed on the map of Poland)
- the environment can also be set up for quick generating very high quality and publication ready reports that can be immediately printed and published on-line.
R can facilitate government transparency and promoting Open Government programme. (http://radar.oreilly.com/2009/07/making-government-transparent.html, http://ropengov.github.io/)
Some ideas of what Ministry of Digital Affairs can coordinate and assist with is:
- development of a dedicated R package for accessing public data, analysing and visualising it on an administrative map of Poland,
-development of an R distance learning package teaching statistics to Polish students (it can be the the matter of preparing a localized quality content, the tool "swirl" is available for that purpose),
- encouragement of public offices to publish current data that can be easily analyzed in R (csv, . This can be current information from treasury, statistical offices, police (geography of crime incidents), weather, health diagnosis and procedures (NHF and Ministry of Health data) - information devoid of personal information but fresh and detailed enough for Polish citizen to make informed decisions.
Advantages for Polish administration and economy:
Free R can be compared to expensive SPSS, Statistica and SAS packages - it can save millions giving powerful mining capabilities to government at any level and all citizens. It has been popular with world best universities, now it is used by big media houses, banks and research labs.
R data mining can be used for evectively analysing any data and drawing right conclusions in health management, poverty, tax evasion, crime incidence, even counterintelligence and anti-terrorism. R literacy can generate high quality jobs and bring measurable profit because it is a tool facilitating innovation and right decisions based on science.
It gives Poland a unique chance to be among the first to embrace the ideas of true modern democracy expressed in access to information and true modern freedom of making informed choices.
Sincerely Yours,
Jacek Kotowski
czwartek, 11 lutego 2016
Coffee break: Lesson ideas: R and Titanic disaster - trees and forests.
Great ideas for a beginner like me to play with data mining.
Data and challenge proposed by kaggle.com:
https://www.kaggle.com/c/titanic/dat
Good tutorials and proposed solutins here:
http://trevorstephens.com/post/72916401642/titanic-getting-started-with-r and
https://github.com/trevorstephens/titanic
Also: https://www.kaggle.com/amoyakd/titanic/randomforest-method-v1-0
https://www.kaggle.com/c/titanic/forums
Some reading on R capabilities:
http://www.edureka.co/blog/implementation-of-decision-tree/
http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/
Other tutorials:
Iris data: http://rischanlab.github.io/RandomForest.html
(R Basics worth exploring: http://rischanlab.github.io/)
http://dni-institute.in/blogs/random-forest-using-r-step-by-step-tutorial/ (might be even better for a start
http://www.edureka.co/blog/implementation-of-decision-tree/
Biostars Tutorial: Machine Learning For Cancer Classification - Part 1 - Preparing The Data Sets
https://www.biostars.org/p/85124/
http://www.tutorialspoint.com/r/r_random_forest.htm
http://www.analyticsvidhya.com/blog/2015/09/random-forest-algorithm-multiple-challenges/
Other important topics to explore:
http://www.analyticsvidhya.com/blog/2015/12/faster-data-manipulation-7-packages/
Data and challenge proposed by kaggle.com:
https://www.kaggle.com/c/titanic/dat
Good tutorials and proposed solutins here:
http://trevorstephens.com/post/72916401642/titanic-getting-started-with-r and
https://github.com/trevorstephens/titanic
Also: https://www.kaggle.com/amoyakd/titanic/randomforest-method-v1-0
https://www.kaggle.com/c/titanic/forums
Some reading on R capabilities:
http://www.edureka.co/blog/implementation-of-decision-tree/
http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/
Other tutorials:
Iris data: http://rischanlab.github.io/RandomForest.html
(R Basics worth exploring: http://rischanlab.github.io/)
http://dni-institute.in/blogs/random-forest-using-r-step-by-step-tutorial/ (might be even better for a start
http://www.edureka.co/blog/implementation-of-decision-tree/
Biostars Tutorial: Machine Learning For Cancer Classification - Part 1 - Preparing The Data Sets
https://www.biostars.org/p/85124/
http://www.tutorialspoint.com/r/r_random_forest.htm
http://www.analyticsvidhya.com/blog/2015/09/random-forest-algorithm-multiple-challenges/
Other important topics to explore:
http://www.analyticsvidhya.com/blog/2015/12/faster-data-manipulation-7-packages/
wtorek, 9 lutego 2016
Rafał Wójcikowski -constructive critique of gov pronatalist policy in Polish Parliament
Coffee break, listened briefly to Polish Parliament debate on pronatalist policy 500PLN/month for each child.
Very good alternative proposed by Rafał Wójcikowski: https://pl.wikipedia.org/wiki/Rafa%C5%82_W%C3%B3jcikowski Based on Milton Friedman's negative income tax https://en.wikipedia.org/wiki/Negative_income_tax Implemented in Israel. Need to read more on this. Will PiS gov have courage and guts to consider it? I wonder what min. Morawiecki thinks. Point for Kukiz15 party.
Very good alternative proposed by Rafał Wójcikowski: https://pl.wikipedia.org/wiki/Rafa%C5%82_W%C3%B3jcikowski Based on Milton Friedman's negative income tax https://en.wikipedia.org/wiki/Negative_income_tax Implemented in Israel. Need to read more on this. Will PiS gov have courage and guts to consider it? I wonder what min. Morawiecki thinks. Point for Kukiz15 party.
niedziela, 7 lutego 2016
Small plots on a plot: R
Pie charts instead of points or bubbles in x-y/scatter-plot/bubble plot:
Uncle google found this for me:
http://blog.revolutionanalytics.com/2012/09/visualize-complex-data-with-subplots.html
http://www.r-bloggers.com/embeding-a-subplot-in-ggplot-via-subview/
http://vita.had.co.nz/papers/embedded-plots.pdf
ggsubplot
http://stackoverflow.com/questions/26577669/tiny-pie-charts-to-represent-each-point-in-an-scatterplot-using-ggplot2
http://stackoverflow.com/questions/7714677/r-scatterplot-with-too-many-points/16122003#16122003
pieGlyph
http://stackoverflow.com/questions/20465070/barplots-on-a-map/20468836#20468836
Maybe it will be possible in ggvis??
http://ggvis.rstudio.com/ggvis-basics.html
===========
Here a solution without ggplot (p.11):
https://www.r-project.org/doc/Rnews/Rnews_2003-2.pdf
====
plotrix and floating.pie?
http://ask.programmershare.com/2269_10385068/
====
http://gis.stackexchange.com/questions/4568/effectively-displaying-demographic-data-on-a-printed-map
====
ggtree
http://stackoverflow.com/questions/10368180/plotting-pie-graphs-on-map-in-ggplot
mapplots
http://www.molecularecologist.com/2012/09/making-maps-with-r/
Rozwiazanie w Excelu:
http://www.andypope.info/charts/piedatamarkers.htm
http://blog.revolutionanalytics.com/2012/09/visualize-complex-data-with-subplots.html
http://www.r-bloggers.com/embeding-a-subplot-in-ggplot-via-subview/
http://vita.had.co.nz/papers/embedded-plots.pdf
ggsubplot
http://stackoverflow.com/questions/26577669/tiny-pie-charts-to-represent-each-point-in-an-scatterplot-using-ggplot2
http://stackoverflow.com/questions/7714677/r-scatterplot-with-too-many-points/16122003#16122003
pieGlyph
http://stackoverflow.com/questions/20465070/barplots-on-a-map/20468836#20468836
Maybe it will be possible in ggvis??
http://ggvis.rstudio.com/ggvis-basics.html
===========
Here a solution without ggplot (p.11):
https://www.r-project.org/doc/Rnews/Rnews_2003-2.pdf
====
plotrix and floating.pie?
http://ask.programmershare.com/2269_10385068/
====
http://gis.stackexchange.com/questions/4568/effectively-displaying-demographic-data-on-a-printed-map
====
ggtree
http://stackoverflow.com/questions/10368180/plotting-pie-graphs-on-map-in-ggplot
mapplots
http://www.molecularecologist.com/2012/09/making-maps-with-r/
Rozwiazanie w Excelu:
http://www.andypope.info/charts/piedatamarkers.htm
piątek, 5 lutego 2016
R - Machine learning cheatsheet
A syntax for applying some machine learning procedures in R. Misleadingly.... simple :-)
http://vitalflux.com/cheat-sheet-10-machine-learning-algorithms-r-commands/
środa, 3 lutego 2016
Excel: fill in table from another table.
UPDATE:
I am now working on replacing offsets with absolute or structured references, ie. references to Listobject Table.
This will make my vba immune from inserting columns and allow hiding the source table:
Rozwiązanie, np. użycie VBA
Inny temat: chcemy skopiować tylko widoczne wiersze do innej tabeli (niewidoczne nas nie interesują).
Private Sub FindCopyValues()
Dim strKodRange As Range
Dim lObjSource As ListObject: Set lObjSource = Sheets("Pricelist").ListObjects("tblPricelist")
Dim lObjTarget As ListObject: Set lObjTarget = Sheets("Quote").ListObjects("tblQuote")
Dim iTableRowNum As Integer
'set cursor in first cell of column Code in tblPricelist
Dim curAddr As Range: Set curAddr = lObjTarget.DataBodyRange.Cells(1, lObjTarget.ListColumns("Code").Index)
'set what to look for
Dim strKod As String: strKod = curAddr.Value
'what offsets between Code and Price
With lObjTarget
Dim Code2Price As Integer: Code2Price = .ListColumns("Price").Index - .ListColumns("Code").Index
Dim Code2Description As Integer: Code2Description = .ListColumns("Description").Index - .ListColumns("Code").Index
End With
Do Until strKod = ""
Set Cell = lObjSource.ListColumns("Code").DataBodyRange.Cells.Find(What:=strKod, MatchCase:=False, LookAt:= _
xlWhole, SearchFormat:=False)
iTableRowNum = Cell.Row - lObjSource.DataBodyRange.Rows(1).Row + 1
Debug.Print iTableRowNum
' With Cell
'
' curAddr.Offset(0, Code2Price).Value = .Offset(0, 1).Value
' curAddr.Offset(0, Code2Description).Value = .Offset(0, 2).Value
'
' End With
'Albo
curAddr.Offset(0, Code2Price).Value = lObjSource.DataBodyRange.Cells(iTableRowNum, lObjSource.ListColumns("Price").Index)
curAddr.Offset(0, Code2Description).Value = lObjSource.DataBodyRange.Cells(iTableRowNum, lObjSource.ListColumns("Description").Index)
Set curAddr = curAddr.Offset(1, 0)
strKod = curAddr.Text
Loop
Application.ScreenUpdating = True
Application.EnableEvents = True
End Sub
Problem z vlookupem, nie zwraca wartości wraz z linkiem do strony internetowej, zdjęcia czy dokumentu.Rozwiązanie, np. użycie VBA
Private Sub Worksheet_Change(ByVal Target As Range)
Dim strKod As String
strKod = Sheets("Baza").Range("G4").Text
If Target.Address = "$G$4" Then
Range("$G$4").Select
With Sheets("Aku").Cells.Find(What:=strKod, after:=ActiveCell, LookIn:=xlValues, LookAt:= _
xlWhole, SearchOrder:=xlByColumns, SearchDirection:=xlNext, MatchCase:= _
False)
.Offset(0, 1).Copy Destination:=Sheets("Baza").Range("H4")
.Offset(0, 2).Copy Destination:=Sheets("Baza").Range("I4")
End With
End If
End Sub
Powtarza się kod .Offset... więc można go połączyć w jedną linię za pomocą Union.
Ponadto, nie chcę szukać w całym arkuszu a jedynie w kolumnie tabeli.
Private Sub Worksheet_Change(ByVal Target As Range)
Dim strKod As String
strKod = Sheets("Baza").Range("G4").Text
oListObj = Sheets("Aku").ListObjects("Tabela1").ListColumns("Nazwa aku.")
If Target.Address = "$G$4" Then
Range("$G$4").Select
With oListObj.DataBodyRange.Cells.Find(What:=strKod, after:=ActiveCell, LookIn:=xlValues, LookAt:= _
xlWhole, SearchOrder:=xlByColumns, SearchDirection:=xlNext, MatchCase:= _
False)
Union(.Offset(0, 1), .Offset(0, 2)).Copy Destination:=Sheets("Baza").Range("H4")
End With
End If
End Sub
A co jeśli chcemy wyszukać nie tylko w komórce G4 ale w całej kolumnie?
Private Sub FindCopyValues()
Dim strKod As String
Dim strKodRange As Range
Dim objList As ListObject
Dim curAddr As Object
'Application.ScreenUpdating = False
'Application.EnableEvents = False
Set objList = Sheets("Aku").ListObjects("Tabela1")
Set curAddr = Worksheets("Baza").Range("$G$4")
strKod = curAddr.Value
Do Until strKod = ""
Set cell = objList.ListColumns("Nazwa aku.").DataBodyRange.Cells.Find(What:=strKod, MatchCase:=False, LookAt:= _
xlWhole, SearchFormat:=False)
With cell
Union(.Offset(0, 1), .Offset(0, 2)).Copy Destination:=curAddr.Offset(0, 1)
End With
Set curAddr = curAddr.Offset(1, 0)
strKod = curAddr.Text
Loop
Application.ScreenUpdating = True
Application.EnableEvents = True
End Sub
Inny temat: chcemy skopiować tylko widoczne wiersze do innej tabeli (niewidoczne nas nie interesują).
Sub CopyVisibleProducts()
Sheets("cennik").ListObjects("Table1").DataBodyRange.SpecialCells(xlCellTypeVisible).Copy Sheets("pro_forma").Range("b5")
Application.CutCopyMode = False
End Sub
poniedziałek, 1 lutego 2016
Coffee break: imputation
From CranR:
Simple mean imputation: http://www.r-bloggers.com/example-2014-5-simple-mean-imputation/
knnImputation {DMwR} Function that fills in all NA values using the k Nearest Neighbours of each case with NA values.
centralImputation {DMwR} This function fills in any NA value in all columns of a data frame with the statistic of centrality (given by the function centralvalue()) of the respective column.
Other: kNNImpute {imputation} Imputation using k-nearest neighbors. For each record, identify missinng features. http://www.bioconductor.org/packages/release/bioc/html/impute.html
impute {Hmisc} http://www.inside-r.org/packages/cran/hmisc/docs/impute http://www.jstatsoft.org/article/view/v023i10 http://docs.zeligproject.org/en/latest/using-Zelig-with-Amelia.html
Simple mean imputation: http://www.r-bloggers.com/example-2014-5-simple-mean-imputation/
knnImputation {DMwR} Function that fills in all NA values using the k Nearest Neighbours of each case with NA values.
centralImputation {DMwR} This function fills in any NA value in all columns of a data frame with the statistic of centrality (given by the function centralvalue()) of the respective column.
Other: kNNImpute {imputation} Imputation using k-nearest neighbors. For each record, identify missinng features. http://www.bioconductor.org/packages/release/bioc/html/impute.html
impute {Hmisc} http://www.inside-r.org/packages/cran/hmisc/docs/impute http://www.jstatsoft.org/article/view/v023i10 http://docs.zeligproject.org/en/latest/using-Zelig-with-Amelia.html
Subskrybuj:
Komentarze (Atom)