Tuesday, 26 July 2016

Open Intro - free introductory statistics handbook.

A beautiful project: free books that can be used with the free R and RStudio software, suitable for high schools and colleges: https://www.openintro.org/stat/textbook.php?stat_book=os

Friday, 15 July 2016

A holiday book for a data miner?

The story of Bayesian theory and its applications? Sharon Bertsch McGrayne: "The Theory That Would Not Die" | Talks at Google.



Villard noise cancelling shortwave antenna loop demo - Don't quit SWL ov...







I am not the only one who has tried it :-) My attempt: https://www.youtube.com/watch?v=qGipnTk-fyk&feature=youtu.be&a=



I wonder if it can also transmit. I know that a high voltage builds up across the capacitor formed by the overlapping foil. Would it withstand 3 W of power from a homemade CW QRP transceiver on 7 MHz?
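
A rough back-of-the-envelope estimate may help with that question. For a small loop at resonance, the RMS voltage across the tuning capacitor is approximately sqrt(P * Q * X_L). The loop inductance (2 uH) and the loaded Q (200) below are assumed, illustrative values and not measurements of this antenna, so treat the result as an order of magnitude only.

```r
# Rough estimate of the voltage across the tuning capacitor of a small loop.
# ASSUMPTIONS (not measured): l_henry and q_loaded are illustrative guesses.
p_watts  <- 3       # transmitter power, from the question above
f_hz     <- 7e6     # operating frequency, 7 MHz
l_henry  <- 2e-6    # ASSUMED loop inductance (2 uH)
q_loaded <- 200     # ASSUMED loaded Q of the loop

x_l    <- 2 * pi * f_hz * l_henry          # inductive reactance in ohms
v_rms  <- sqrt(p_watts * q_loaded * x_l)   # RMS voltage across the capacitor
v_peak <- v_rms * sqrt(2)                  # peak voltage the dielectric must survive

cat(sprintf("X_L = %.1f ohm, V_rms = %.0f V, V_peak = %.0f V\n",
            x_l, v_rms, v_peak))
```

With these guesses the result is a few hundred volts even at QRP power, so the thickness of the insulation between the foil layers matters. A higher real Q or inductance pushes the voltage up further.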

Here is a transmitting loop made of copper tube: https://www.youtube.com/watch?v=Cv_RnLpZ9gw
Note the use of coax to form a capacitor and tune the antenna. In fact, the entire antenna can be made from coax alone.

Another solution: http://www.rfa.org/english/about-old/help/Anti-Jamming.html?searchterm:utf8:ustring=antenna




Saturday, 9 July 2016

R: Braindump for this week: classification, ETL, etc.

1. I am pondering the use of clustering to classify unknown data.
  • first, prepare the data (what to do with categorical and ordinal variables? mutate, i.e. create new variables?)
    • for this I found some interesting tutorials:
      http://www.r-bloggers.com/clustering-mixed-data-types-in-r-2/
      http://www.sthda.com/english/wiki/partitioning-cluster-analysis-quick-start-guide-unsupervised-machine-learning
    • I also found an algorithm that proved to be very precise:
  • then run the data through a clustering algorithm (k-means, local density, other?)
    • first I chose to play with various algorithms to see how they detect known clusters,
      e.g. the three species in the famous iris dataset. I played with "cclust", "cluster", and "densityClust", and discovered "factoextra" for quick graphs/diagnostics
  • finally, fit a decision tree (or another non-black-box tool) to show the rules and dependencies.
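
The three steps above can be sketched roughly as follows. This is only one possible reading of the idea, not a tested recipe: it swaps in base R's kmeans() plus daisy() and pam() from the "cluster" package for the other packages mentioned, and "rpart" for the decision tree. The choice of k = 3, the seed, and the three "unknown" rows are illustrative assumptions.

```r
library(cluster)  # daisy() and pam(); a recommended package shipped with R
library(rpart)    # decision trees; also a recommended package

set.seed(42)      # k-means starts from random centers

# Step 1: prepare the data. Scale numeric columns so that no single
# variable dominates the Euclidean distance.
x <- scale(iris[, 1:4])

# Step 2: cluster, then compare the clusters with the known species.
km <- kmeans(x, centers = 3, nstart = 25)
print(table(cluster = km$cluster, species = iris$Species))

# Variant for mixed data types: Gower distance + PAM (k-medoids).
# Species is kept here purely to illustrate a categorical column.
d_gower <- daisy(iris, metric = "gower")
pm <- pam(d_gower, k = 3)
print(table(cluster = pm$clustering, species = iris$Species))

# Step 3: fit a decision tree that predicts the cluster label, so the
# cluster assignment can be read as explicit rules.
labelled <- data.frame(iris[, 1:4], cluster = factor(km$cluster))
tree <- rpart(cluster ~ ., data = labelled, method = "class")
print(tree)

# Classify "unknown" rows (hypothetical example rows) with the tree.
new_rows <- iris[c(1, 51, 101), 1:4]
print(predict(tree, newdata = new_rows, type = "class"))
```

The tree is trained on the cluster labels, not on the species, so its rules describe what the clustering found; how well that matches the real classes is what the cross-tabulation is for.
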
2. Getting tidy output: broom (in .R scripts), pander (in .Rmd documents), memisc::mtable + pander to compare linear models (http://stackoverflow.com/questions/24342162/regression-tables-in-markdown-format-for-flexible-use-in-r-markdown-v2)
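
A minimal sketch of the broom + pander part, assuming both packages are installed. The two models on mtcars are made up for illustration, and the model comparison is done by stacking glance() rows, which is a swapped-in alternative to memisc::mtable and not the approach from the linked answer.

```r
library(broom)    # tidy() and glance() turn model objects into data frames
library(pander)   # pander() prints data frames as markdown tables

# Two illustrative models (hypothetical choice of predictors).
fit1 <- lm(mpg ~ wt, data = mtcars)
fit2 <- lm(mpg ~ wt + hp, data = mtcars)

# One row per coefficient: term, estimate, std.error, statistic, p.value.
coef_table <- tidy(fit2)
pander(coef_table)

# One row per model, stacked to compare the fits side by side.
comparison <- rbind(
  data.frame(model = "mpg ~ wt",      glance(fit1)),
  data.frame(model = "mpg ~ wt + hp", glance(fit2))
)
pander(comparison[, c("model", "r.squared", "adj.r.squared", "AIC")])
```

In an .Rmd document the chunk would typically need results='asis' so that the markdown produced by pander() is rendered as a table.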

3. ETL concept:
https://cran.r-project.org/web/packages/dplyr/vignettes/databases.html
https://github.com/beanumber/etl
http://www.r-bloggers.com/r-and-sqlite-part-1/
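
To make the ETL idea concrete, here is a small sketch with dplyr on top of an in-memory SQLite database. It assumes the DBI, RSQLite, dplyr and dbplyr packages are installed. The table names "cars_raw" and "cars_summary" and the hp > 100 filter are invented for illustration, and the "etl" package from the second link is not used here.

```r
library(DBI)
library(dplyr)    # database backends also need the dbplyr package installed

# An in-memory database, so nothing is written to disk.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Extract: load the raw data into a staging table (hypothetical name).
dbWriteTable(con, "cars_raw", mtcars)

# Transform: dplyr translates these verbs to SQL and runs them in SQLite;
# collect() pulls the result back into R.
summary_tbl <- tbl(con, "cars_raw") %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE), n = n()) %>%
  collect()

# Load: write the transformed result to its own table.
dbWriteTable(con, "cars_summary", summary_tbl)
print(dbListTables(con))
print(summary_tbl)

dbDisconnect(con)
```

Replacing ":memory:" with a file path would make the database persistent between sessions.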