Interesting topics, see
https://www.r-bloggers.com/self-organising-maps-for-customer-segmentation-using-r/
http://www.slideshare.net/shanelynn/2014-0117-dublin-r-selforganising-maps-for-customer-segmentation-shane-lynn
Other interesting reading
https://www.r-bloggers.com/r-an-integrated-statistical-programming-environment-and-gis/
https://www.r-bloggers.com/how-to-perform-pca-with-r/
piątek, 17 lutego 2017
czwartek, 16 lutego 2017
SQL playground, remove duplicates and count students that passed exam.
Patent na usuwanie duplikatów z tabeli “in
place” tj bez nadpisywania jej, bez usuwania i wstawiania innej:
Do przetestowania:
Kolumny ID (nie powtarza się). imie,
nazwisko, dane (może duplikować się).
DELETE
*
FROM Tabela1
WHERE [id] NOT IN
(SELECT Max(Tabela1.id) AS id
FROM Tabela1
GROUP BY Tabela1.imie, Tabela1.nazwisko, Tabela1.dane);
FROM Tabela1
WHERE [id] NOT IN
(SELECT Max(Tabela1.id) AS id
FROM Tabela1
GROUP BY Tabela1.imie, Tabela1.nazwisko, Tabela1.dane);
w linii GROUP BY można określić gdzie
szukamy duplikatów, w jakich kolumnach
Select Max... można użyć First(), albo Min(), żeby określić, które zduplikowane rekordy zachować.
Select Max... można użyć First(), albo Min(), żeby określić, które zduplikowane rekordy zachować.
Są trzy tabele
1. Studenci (_indeks_, imie, nazwisko)
2. Kursy (_id_, tytul, godzin, punktow), gdzie punktow oznacza ile punktow za zaliczenie kursu się dostaje
3. Szkolenia( osoba, kurs, zaliczenie) gdzie osoba to numer indeksu studenta, zaliczenie to data zaliczenia, a kurs to numer kursu
i pierwsze zadanie z matury brzmi: Ilu studentów zaliczyło w pierwszym terminie? (do 30 czerwca 2016)<pre>
Select count([_indeks_Stud]) AS ilu_zdalo From
(Select [Studenci$].[_indeks_Stud]
From ([Kursy$]
Inner Join [Szkolenia$] On [Kursy$].[_id_kurs] = [Szkolenia$].kurs)
Inner Join [Studenci$] On [Studenci$].[_indeks_Stud] = [Szkolenia$].osoba
Where [Szkolenia$].zaliczenie <= #2016-06-30#
Group By [Studenci$].[_indeks_Stud]
Having Sum([Kursy$].punktow) >= 15)
<./pre>
Kod powstał w ide Flyspeed SQL Query na bazie stworzonej w zakładkach w Excelu, stąd znaki dolara.... kod więc powinien działać po wklejeniu do Microsoft Query.
Python in RStudio
Data used:
Titanic data: https://www.kaggle.com/c/titanic/data
and tutorial: http://nbviewer.jupyter.org/github/savarin/pyconuk-introtutorial/blob/master/notebooks/Section%201-0%20-%20First%20Cut.ipynb
Flights data: http://ucl.ac.uk/~uctqiax/data/flights.csv
Software used:
Portable scientific winpython (with pandas scikit-learn):
https://sourceforge.net/projects/winpython/?source=typ_redirect
To work it needed windows updates (my OS is windows 7):
https://www.microsoft.com/en-us/download/confirmation.aspx?id=49093
To install packages from source it needed:
http://landinghub.visualstudio.com/visual-cpp-build-tools
I needed feather package so I dowloaded it and used python command: pip install
as taught here: https://github.com/winpython/winpython/wiki/Installing-Additional-Packages
and installed from source: https://github.com/wesm/feather/tree/master/python
To learn how to use other languages in RStudio: http://rmarkdown.rstudio.com/authoring_knitr_engines.html
I also wanted to try if some portable version of bash would work. No problem:
http://win-bash.sourceforge.net/
Code for my playground.
Titanic data: https://www.kaggle.com/c/titanic/data
and tutorial: http://nbviewer.jupyter.org/github/savarin/pyconuk-introtutorial/blob/master/notebooks/Section%201-0%20-%20First%20Cut.ipynb
Flights data: http://ucl.ac.uk/~uctqiax/data/flights.csv
Software used:
Portable scientific winpython (with pandas scikit-learn):
https://sourceforge.net/projects/winpython/?source=typ_redirect
To work it needed windows updates (my OS is windows 7):
https://www.microsoft.com/en-us/download/confirmation.aspx?id=49093
To install packages from source it needed:
http://landinghub.visualstudio.com/visual-cpp-build-tools
I needed feather package so I dowloaded it and used python command: pip install
as taught here: https://github.com/winpython/winpython/wiki/Installing-Additional-Packages
and installed from source: https://github.com/wesm/feather/tree/master/python
To learn how to use other languages in RStudio: http://rmarkdown.rstudio.com/authoring_knitr_engines.html
I also wanted to try if some portable version of bash would work. No problem:
http://win-bash.sourceforge.net/
Code for my playground.
--- title: "R Notebook" output: html_notebook --- ## Bash ```{bash, engine.path="C:\\Users\\jkotows2\\Desktop\\shell.w32-ix86\\bash.exe"} cat flights1.csv flights2.csv flights3.csv > flights.csv ``` ## Python http://rmarkdown.rstudio.com/authoring_knitr_engines.html ```{python, engine.path="C:\\Users\\jkotows2\\Desktop\\WinPython\\python-3.6.0.amd64\\python.exe"} import pandas import feather # Read flights data and select flights to O'Hare flights = pandas.read_csv("C:\\Users\\jkotows2\\Desktop\\_flights\\flights.csv") flights = flights[flights['dest'] == "ORD"] # Select carrier and delay columns and drop rows with missing values flights = flights[['carrier', 'dep_delay', 'arr_delay']] flights = flights.dropna() print (flights.head(10)) # Write to feather file for reading from R feather.write_dataframe(flights, "C:\\Users\\jkotows2\\Desktop\\_flights\\flights.feather") ``` ## Back to R ```{r} library(feather) library(ggplot2) # Read from feather and plot flights <- read_feather("C:\\Users\\jkotows2\\Desktop\\_flights\\flights.feather") ggplot(flights, aes(carrier, arr_delay)) + geom_point() + geom_jitter() ```
środa, 15 lutego 2017
lpsolve - solver in R
To study:
http://flovv.github.io/From_descritpive_to_prescriptive/
https://icyrock.com/blog/2013/12/linear-programming-in-r-using-lpsolve/
http://lpsolve.r-forge.r-project.org/
http://horicky.blogspot.co.uk/2013/01/optimization-in-r.html
http://lpsolve.sourceforge.net/5.5/R.htm
http://flovv.github.io/From_descritpive_to_prescriptive/
https://icyrock.com/blog/2013/12/linear-programming-in-r-using-lpsolve/
http://lpsolve.r-forge.r-project.org/
http://horicky.blogspot.co.uk/2013/01/optimization-in-r.html
http://lpsolve.sourceforge.net/5.5/R.htm
Subskrybuj:
Posty (Atom)