http://datapigtechnologies.com/blog/index.php/highlighting-outliers-in-your-data-with-the-tukey-method/
http://brownmath.com/stat/nchkxl.htm
esp. see worksheet: http://brownmath.com/stat/prog/normalitycheck.xlsm
Using the SD is not a good idea (the mean and SD are themselves distorted by the outliers), but this is a good macro to start with:
http://www.mrexcel.com/forum/excel-questions/732424-how-remove-outliers-data-set-2.html
Sub outliers_mod2()
    Dim dblAverage As Double, dblStdDev As Double
    Dim NoStdDevs As Integer
    Dim rTest As Range, Rng As Range

    'Application.ScreenUpdating = False
    NoStdDevs = 3 'adjust to your outlier preference of sigma
    Set rTest = Selection 'Application.InputBox("Select a range", "Get Range", Type:=8)

    dblAverage = WorksheetFunction.Average(rTest)
    dblStdDev = WorksheetFunction.StDev(rTest)

    For Each Rng In rTest
        If Rng > dblAverage + NoStdDevs * dblStdDev Or Rng < dblAverage - NoStdDevs * dblStdDev Then
            Rng.Interior.Color = RGB(255, 0, 0) '.Value = "Outlier" 'or delete the data with Rng.clearcontents
        End If
    Next
    'Application.ScreenUpdating = True
End Sub
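Since the SD approach is not recommended, here is a rough sketch of the same macro using Tukey fences (Q1 - k*IQR, Q3 + k*IQR) instead. It is not taken from any of the linked pages: the names outliers_tukey and IsTukeyOutlier and the default multiplier k = 1.5 are assumptions made for illustration.

```vba
' Hypothetical helper: True when x lies outside the Tukey fences
' [q1 - k*IQR, q3 + k*IQR]. k = 1.5 is the conventional default.
Function IsTukeyOutlier(x As Double, q1 As Double, q3 As Double, _
                        Optional k As Double = 1.5) As Boolean
    Dim iqr As Double
    iqr = q3 - q1
    IsTukeyOutlier = (x < q1 - k * iqr) Or (x > q3 + k * iqr)
End Function

' Sketch: highlight Tukey outliers in the current selection in red.
Sub outliers_tukey()
    Dim q1 As Double, q3 As Double
    Dim rTest As Range, Rng As Range

    Set rTest = Selection
    ' Quartile raises an error when the range holds no numbers
    If WorksheetFunction.Count(rTest) = 0 Then Exit Sub

    q1 = WorksheetFunction.Quartile(rTest, 1)
    q3 = WorksheetFunction.Quartile(rTest, 3)

    For Each Rng In rTest.Cells
        ' Skip blanks and text so they are not compared as numbers
        If Not IsEmpty(Rng.Value) And IsNumeric(Rng.Value) Then
            If IsTukeyOutlier(CDbl(Rng.Value), q1, q3) Then
                Rng.Interior.Color = RGB(255, 0, 0)
            End If
        End If
    Next Rng
End Sub
```

Select the data range and run outliers_tukey. Because the quartiles barely move when a few extreme values are added, the fences stay stable where mean +/- 3 SD would widen.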
====
Normalize data in Excel (simple linear normalization)
http://stats.stackexchange.com/questions/70801/how-to-normalize-data-to-0-1-range?newreg=f530ce192d144b109ec26a077cab00af
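Linear normalization to the 0-1 range is (x - min) / (max - min). A minimal sketch as a worksheet UDF follows; the function names MinMaxScale and NormalizeMinMax are made up for this note and are not from the linked answer.

```vba
' Hypothetical helper: map x from [lo, hi] onto [0, 1].
' The caller must make sure hi <> lo.
Function MinMaxScale(x As Double, lo As Double, hi As Double) As Double
    MinMaxScale = (x - lo) / (hi - lo)
End Function

' Hypothetical UDF: normalize x against the min and max of rData.
' Returns #DIV/0! when all values in rData are equal.
Function NormalizeMinMax(x As Double, rData As Range) As Variant
    Dim lo As Double, hi As Double
    lo = WorksheetFunction.Min(rData)
    hi = WorksheetFunction.Max(rData)
    If hi = lo Then
        NormalizeMinMax = CVErr(xlErrDiv0)
    Else
        NormalizeMinMax = MinMaxScale(x, lo, hi)
    End If
End Function
```

In a cell this would be used as =NormalizeMinMax(A1, $A$1:$A$100) (the range is just an example). The same thing without VBA: =(A1-MIN($A$1:$A$100))/(MAX($A$1:$A$100)-MIN($A$1:$A$100)). Note that min and max are themselves sensitive to outliers, so a single extreme value compresses everything else toward 0 or 1.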
Kernel Density/Regression, alternatives to histogram.
====
good article: http://www.stat-d.si/mz/mz4.1/vidmar.pdf http://people.revoledu.com/kardi/tutorial/index.html
2d: http://www.r-bloggers.com/recipe-for-computing-and-sampling-multivariate-kernel-density-estimates-and-plotting-contours-for-2d-kdes/
Density plugin: http://www.prodomosua.eu/ppage02.html
Good VBA example for density plots (incl. a UDF):
http://www.iimahd.ernet.in/~jrvarma/software.php
(found in this page: http://www.mathfinance.cn/category/vba/1/5/)
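The linked workbook is not reproduced here. As a rough idea of what a density UDF could look like, below is a minimal Gaussian kernel density estimate, f(x) = 1/(n*h) * sum of K((x - x_i)/h). The names GaussKernel and GaussKDE are hypothetical, and the bandwidth h is left to the caller (choosing it well is the hard part and is not covered by this sketch).

```vba
' Standard normal density, used as the kernel K(u).
Function GaussKernel(u As Double) As Double
    GaussKernel = Exp(-0.5 * u * u) / Sqr(2 * 3.14159265358979)
End Function

' Hypothetical UDF: Gaussian KDE evaluated at x.
'   rData : range holding the sample
'   h     : bandwidth, must be > 0 (assumed to be supplied by the user)
' Returns #NUM! for a bad bandwidth and #N/A when rData has no numbers.
Function GaussKDE(x As Double, rData As Range, h As Double) As Variant
    Dim c As Range, total As Double, n As Long

    If h <= 0 Then
        GaussKDE = CVErr(xlErrNum)
        Exit Function
    End If

    For Each c In rData.Cells
        ' Skip blanks, text and error cells
        If Not IsEmpty(c.Value) And IsNumeric(c.Value) Then
            total = total + GaussKernel((x - CDbl(c.Value)) / h)
            n = n + 1
        End If
    Next c

    If n = 0 Then
        GaussKDE = CVErr(xlErrNA)
    Else
        GaussKDE = total / (n * h)
    End If
End Function
```

To plot the curve, fill a column with evenly spaced x values, put =GaussKDE(C1, $A$1:$A$100, 0.5) next to each (range and bandwidth are example values), and chart the two columns as an XY scatter with lines.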
Some R code http://www.wessa.net/rwasp_density.wasp#output
Another article: http://www.rsc.org/images/data-distributions-kernel-density-technical-brief-4_tcm18-214836.pdf
Some plugin/vba to check: http://www.rsc.org/Membership/Networking/InterestGroups/Analytical/AMC/Software/RobustStatistics.asp
Cheatsheets for R: distributions: http://www.r-bloggers.com/ggplot2-cheatsheet-for-visualizing-distributions/
Good Wikipedia entry: https://en.wikipedia.org/wiki/Outlier
Advanced article: http://d-scholarship.pitt.edu/7948/1/Seo.pdf
For beginners: https://www.dataz.io/display/Public/2013/03/20/Describing+Data%3A+Why+median+and+IQR+are+often+better+than+mean+and+standard+deviation
====
Simple MAD solution with Excel formulas: http://www.codeproject.com/Tips/214330/Statistical-Outliers-detection
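For reference, a VBA sketch of the MAD idea (MAD = median of |x - median|). This is not the code from the linked tip: the function names, the 0.6745 scaling constant and the 3.5 cutoff for the modified z-score are assumptions based on the commonly cited rule of thumb.

```vba
' Modified z-score: 0.6745 * (x - median) / MAD.
' The caller must make sure mad > 0.
Function ModifiedZ(x As Double, med As Double, mad As Double) As Double
    ModifiedZ = 0.6745 * (x - med) / mad
End Function

' Hypothetical helper: median absolute deviation of the numbers in rData.
' Returns 0 when rData holds no numbers.
Function MedianAbsDev(rData As Range) As Double
    Dim med As Double, devs() As Double, c As Range, n As Long

    If WorksheetFunction.Count(rData) = 0 Then Exit Function

    med = WorksheetFunction.Median(rData)
    ReDim devs(1 To rData.Cells.Count)
    For Each c In rData.Cells
        If Not IsEmpty(c.Value) And IsNumeric(c.Value) Then
            n = n + 1
            devs(n) = Abs(CDbl(c.Value) - med)
        End If
    Next c
    ReDim Preserve devs(1 To n)
    MedianAbsDev = WorksheetFunction.Median(devs)
End Function

' Sketch: highlight cells whose |modified z-score| exceeds 3.5.
Sub outliers_mad()
    Dim rTest As Range, Rng As Range
    Dim med As Double, mad As Double

    Set rTest = Selection
    If WorksheetFunction.Count(rTest) = 0 Then Exit Sub

    med = WorksheetFunction.Median(rTest)
    mad = MedianAbsDev(rTest)
    If mad = 0 Then Exit Sub 'e.g. most values are identical; score undefined

    For Each Rng In rTest.Cells
        If Not IsEmpty(Rng.Value) And IsNumeric(Rng.Value) Then
            If Abs(ModifiedZ(CDbl(Rng.Value), med, mad)) > 3.5 Then
                Rng.Interior.Color = RGB(255, 0, 0)
            End If
        End If
    Next Rng
End Sub
```

The formula-only equivalent of MedianAbsDev is the array formula =MEDIAN(ABS(A1:A100-MEDIAN(A1:A100))), entered with Ctrl+Shift+Enter in older Excel versions (the range is an example).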