poniedziałek, 28 listopada 2016
Music from my dream...
Ives: The Unanswered Question / Premil Petrovic / No Borders Orchestra Similar:
piątek, 25 listopada 2016
Marketing with R, very interesting articles.
My search started with the question, how to reduce the huge amount of information about the history of purchases, contacts of hundred thousands of customers to manageable few variables.
I came across Recency, Frequency, Monetary value (RFM) analysis.
I came across Recency, Frequency, Monetary value (RFM) analysis.
Review articles of these users on R bloggers.
https://www.r-bloggers.com/ author/jack-han/
https://www.r-bloggers.com/ author/analyze-core-r- language/
https://www.r-bloggers.com/
https://www.r-bloggers.com/
piątek, 4 listopada 2016
Cleaning data with Regex in Excel (then in R)...
I have a dirty column that should contain duration in months as integer, but it contains variants of strings: due to end on dd-mm-yy, duration x days, from dd.mm.yyyy through/to dd.mm.yyyy y.
Headache.
This is the day I started playing with regex.
First, how to find a date after a specific word:
http://stackoverflow.com/questions/19193251/regex-to-get-the-words-after-matching-string
After some reading (eg. http://stackoverflow.com/questions/18139493/finding-first-word-after-regex-match, http://stackoverflow.com/questions/546220/how-to-match-the-first-word-after-an-expression-with-regex
modifications and experimentations I reached the following compact solution:
[\n\r].*[skończy się:|until:|do:|zakończenia:]\s*(\w+.\w+.\w+)
Tested with: "The contract will last until: dd.mm.yyy or to the [...]" it will return just the date "dd.mm.yyy
Cute!
Macros, that work with Excel:
http://analystcave.com/excel-regex-tutorial/ (I used RegexExecute macro with success)
http://www.vbaexpress.com/kb/getarticle.php?kb_id=841
http://stackoverflow.com/questions/22542834/how-to-use-regular-expressions-regex-in-microsoft-excel-both-in-cell-and-loops
http://www.vbaexpress.com/kb/getarticle.php?kb_id=841
http://superuser.com/questions/978532/how-do-i-use-regex-using-instr-in-vba
http://stackoverflow.com/questions/19481175/general-purpose-udfs-for-using-regular-expressions-in-excel (set of very useful tools)
R code to test:
http://stackoverflow.com/questions/34804708/matching-a-word-after-another-word-in-r-regex
https://renkun.me/blog/2014/02/20/extract-information-from-texts-with-regular-expressions-in-r.html
In R the following worked with me:
library(stringr)
str_extract(string=df$duration, pattern= "(?<=(dnia:|zakończenia:|until:)\\s)(\\w+.\\w+.\\w+)" )
Headache.
This is the day I started playing with regex.
First, how to find a date after a specific word:
http://stackoverflow.com/questions/19193251/regex-to-get-the-words-after-matching-string
After some reading (eg. http://stackoverflow.com/questions/18139493/finding-first-word-after-regex-match, http://stackoverflow.com/questions/546220/how-to-match-the-first-word-after-an-expression-with-regex
modifications and experimentations I reached the following compact solution:
[\n\r].*[skończy się:|until:|do:|zakończenia:]\s*(\w+.\w+.\w+)
Tested with: "The contract will last until: dd.mm.yyy or to the [...]" it will return just the date "dd.mm.yyy
Cute!
Macros, that work with Excel:
http://analystcave.com/excel-regex-tutorial/ (I used RegexExecute macro with success)
http://www.vbaexpress.com/kb/getarticle.php?kb_id=841
http://stackoverflow.com/questions/22542834/how-to-use-regular-expressions-regex-in-microsoft-excel-both-in-cell-and-loops
http://www.vbaexpress.com/kb/getarticle.php?kb_id=841
http://superuser.com/questions/978532/how-do-i-use-regex-using-instr-in-vba
http://stackoverflow.com/questions/19481175/general-purpose-udfs-for-using-regular-expressions-in-excel (set of very useful tools)
R code to test:
http://stackoverflow.com/questions/34804708/matching-a-word-after-another-word-in-r-regex
https://renkun.me/blog/2014/02/20/extract-information-from-texts-with-regular-expressions-in-r.html
In R the following worked with me:
library(stringr)
str_extract(string=df$duration, pattern= "(?<=(dnia:|zakończenia:|until:)\\s)(\\w+.\\w+.\\w+)" )
Subskrybuj:
Posty (Atom)