Monthly Archives: November 2014
Testing Logistic Regression on linearly inseparable data
Generate data:
    x=rnorm(1000,mean=0)
    y=rnorm(1000,mean=10)
    obs1=data.frame(rbind(cbind(rnorm(1000),rnorm(1000)),cbind(rnorm(1000,mean=10),rnorm(1000,mean=10))),as.factor("Class A"))
    obs2=data.frame(rbind(cbind(x,y),cbind(y,x)),as.factor("Class B"))
    colnames(obs1) = c("x", "y", "class")
    colnames(obs2) = c("x", "y", "class")
    df=rbind(obs1,obs2)
Make scatter plot:
    #scatter plot
    #dev.new()
    png(file="scatter.png")
    plot(obs1$x,obs1$y,col=colors[[1]],xlab="x",ylab="y",main="scatter plot")
    points(obs2$x,obs2$y,col=colors[[2]])
    dev.off()
See histograms:
    for(i in 1:2) {
    #dev.new()
    png(file=paste("hist-",names[[i]],".png",sep=""))
    hist(obs1[,i],col=colors[[1]])
    hist(obs2[,i],col=colors[[2]],add=TRUE)
    …
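As a rough illustration of what the title is getting at, here is a minimal sketch that fits a plain logistic regression to the same XOR-like layout (Class A around (0,0) and (10,10), Class B around (0,10) and (10,0)). The seed, the variable names a, b, fit, pred and accuracy, and the 0.5 cutoff are my own assumptions and are not taken from the post. Since no single straight line separates the two classes, the training accuracy should land near chance.

```r
set.seed(1)   # assumption: fixed seed for reproducibility
n <- 1000

# Class A: two clusters, near (0, 0) and (10, 10)
a <- rbind(cbind(rnorm(n), rnorm(n)),
           cbind(rnorm(n, mean = 10), rnorm(n, mean = 10)))
# Class B: two clusters, near (0, 10) and (10, 0)
b <- rbind(cbind(rnorm(n), rnorm(n, mean = 10)),
           cbind(rnorm(n, mean = 10), rnorm(n)))

df <- data.frame(x = c(a[, 1], b[, 1]),
                 y = c(a[, 2], b[, 2]),
                 class = factor(rep(c("Class A", "Class B"), each = 2 * n)))

# Logistic regression with a linear decision boundary in x and y
fit <- glm(class ~ x + y, data = df, family = binomial)

# glm models the probability of the second factor level ("Class B")
pred <- ifelse(predict(fit, type = "response") > 0.5, "Class B", "Class A")
accuracy <- mean(pred == df$class)
print(accuracy)
```

Adding an interaction term (class ~ x * y) would be one way to give the model a chance on this layout, but that goes beyond what the excerpt shows.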
Posted in Software
Testing Logistic Regression
create training data:
visualize:
result:
D = \frac{|\mu_1-\mu_2|}{\sigma_1+\sigma_2}:
histograms of x and y:
build classifier:
Output:
algorithm fails to converge because of perfect separation:
compute AUC:
where:
plot AUC:
result:
LDA:
finally, you add some noise to make logistic …
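The code for these steps is not part of this excerpt, so here is a minimal sketch of the same ideas on one-dimensional data. It is not the post's code: the sample size, the seed, the use of sample means and standard deviations for the symbols in D, and the rank-based (Mann-Whitney) AUC formula are all my assumptions. With two classes this far apart, glm is expected to emit warnings about non-convergence or about fitted probabilities of 0 or 1, which is the perfect-separation problem mentioned above.

```r
set.seed(42)  # assumption: fixed seed for reproducibility
n <- 100
x1 <- rnorm(n, mean = 0)    # class A
x2 <- rnorm(n, mean = 10)   # class B

# Separation measure D = |mu1 - mu2| / (sigma1 + sigma2),
# estimated here with sample means and standard deviations
D <- abs(mean(x1) - mean(x2)) / (sd(x1) + sd(x2))

df <- data.frame(x = c(x1, x2),
                 class = factor(rep(c("A", "B"), each = n)))

# Fit logistic regression and collect warnings instead of printing them
msgs <- character(0)
fit <- withCallingHandlers(
  glm(class ~ x, data = df, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# Rank-based AUC on the linear predictor (assumed formula, not the post's)
scores <- predict(fit, type = "link")
is_pos <- df$class == "B"
n_pos <- sum(is_pos)
n_neg <- sum(!is_pos)
auc <- (sum(rank(scores)[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(D)
print(msgs)
print(auc)
```

Adding noise so that the two classes overlap should make the warnings go away, which matches the last step of the outline.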
Posted in Software
Performance in R
Here is a function that computes the (false positive, true positive) pair given the response, ground truth, classes, and threshold:
This function is 100x slower than the one below, which does the same thing:
To benchmark, install the rbenchmark package and use it as shown below:
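The two functions themselves are not included in this excerpt. As a stand-in, here is a hypothetical pair that contrasts an element-by-element loop with a vectorized version. The names fp_tp_loop and fp_tp_vec, the single positive argument (in place of the classes argument mentioned above), and the test data are my assumptions. The sketch makes no claim about the exact speed ratio; it only shows how the comparison could be set up with rbenchmark::benchmark.

```r
# Hypothetical loop-based version: walks the vectors element by element
fp_tp_loop <- function(response, truth, positive, threshold) {
  tp <- 0; fp <- 0; pos <- 0; neg <- 0
  for (i in seq_along(response)) {
    called <- response[i] >= threshold
    if (truth[i] == positive) {
      pos <- pos + 1
      if (called) tp <- tp + 1
    } else {
      neg <- neg + 1
      if (called) fp <- fp + 1
    }
  }
  c(fpr = fp / neg, tpr = tp / pos)
}

# Hypothetical vectorized version: same result using logical vectors
fp_tp_vec <- function(response, truth, positive, threshold) {
  called <- response >= threshold
  is_pos <- truth == positive
  c(fpr = sum(called & !is_pos) / sum(!is_pos),
    tpr = sum(called & is_pos) / sum(is_pos))
}

# Made-up test data: random scores and random labels
set.seed(7)
response <- runif(10000)
truth <- sample(c("A", "B"), 10000, replace = TRUE)

# Benchmark only if rbenchmark is installed
if (requireNamespace("rbenchmark", quietly = TRUE)) {
  print(rbenchmark::benchmark(
    loop = fp_tp_loop(response, truth, "B", 0.5),
    vectorized = fp_tp_vec(response, truth, "B", 0.5),
    replications = 100,
    columns = c("test", "replications", "elapsed", "relative")
  ))
}
```

The relative column of the benchmark output reports each test's elapsed time as a multiple of the fastest one, which is a convenient way to read off the slowdown.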
Posted in Software