Thanks for the post, mortardata:
Since we do a lot of experimenting with data, we’re always excited to find new datasets to use with Mortar. We’re saving bookmarks and sharing datasets with our team on a nearly-daily basis.
There are tons of resources throughout the web, but given our love for the data scientist community, we thought we’d pick out a few of the best dataset lists curated by data scientists.
Below is a collection of six great dataset lists, curated both by famous data scientists and by ones who aren’t as well-known:
Nooo! Don’t attack p < .05!!
One-quarter of studies that meet a commonly used statistical cutoff may be false.
by Erika Check Hayden at Nature News
The plague of non-reproducibility in science may be mostly due to scientists’ use of weak statistical tests, as shown by an innovative method developed by statistician Valen Johnson, at Texas A&M University in College Station.
Johnson compared the strength of two types of tests: frequentist tests, which measure how unlikely a finding is to occur by chance, and Bayesian tests, which measure the likelihood that a particular hypothesis is correct given data collected in the study. The strength of the results given by these two types of tests had not been compared before, because they ask slightly different types of questions.
So Johnson developed a method that makes the results given by the tests — the P value in the frequentist paradigm, and the Bayes factor in the Bayesian paradigm — directly comparable. Unlike frequentist tests, which use objective calculations to reject a null hypothesis, Bayesian tests require the tester to define an alternative hypothesis to be tested — a subjective process. But Johnson developed a ‘uniformly most powerful’ Bayesian test that defines the alternative hypothesis in a standard way, so that it “maximizes the probability that the Bayes factor in favor of the alternate hypothesis exceeds a specified threshold,” he writes in his paper. This threshold can be chosen so that Bayesian tests and frequentist tests will both reject the null hypothesis for the same test results.
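To make the correspondence concrete, here is a small sketch (my own illustration, not Johnson’s code) for the simplest case: a one-sided z-test of H0: μ = 0 against μ > 0 with known σ. For this case, the uniformly most powerful Bayesian test places the point alternative at μ₁ = σ·√(2·ln γ / n), which makes “Bayes factor > γ” coincide exactly with “z > √(2·ln γ)”, so choosing γ = exp(z²_α / 2) aligns the Bayesian rejection region with a level-α frequentist test. All function names and the example numbers below are mine.

```python
# Sketch of the UMPBT correspondence for a one-sided z-test (known sigma).
# Assumption: H0: mu = 0 vs H1: mu > 0; the UMPBT(gamma) point alternative
# is mu1 = sigma * sqrt(2*ln(gamma)/n), so BF > gamma iff z > sqrt(2*ln(gamma)).
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist()  # standard normal

def ump_bayes_factor(xbar, n, sigma, gamma):
    """Bayes factor for the UMPBT(gamma) point alternative vs H0: mu = 0."""
    mu1 = sigma * sqrt(2 * log(gamma) / n)               # UMPBT alternative
    return exp(n * (2 * xbar * mu1 - mu1 ** 2) / (2 * sigma ** 2))

def matching_gamma(alpha):
    """Evidence threshold whose rejection region matches a one-sided
    frequentist test at level alpha: gamma = exp(z_alpha**2 / 2)."""
    z = N.inv_cdf(1 - alpha)
    return exp(z * z / 2)

gamma = matching_gamma(0.05)   # about 3.87 for a one-sided 5% test
n, sigma = 25, 1.0
for xbar in (0.20, 0.33, 0.50):
    z = xbar * sqrt(n) / sigma
    p = 1 - N.cdf(z)                                     # one-sided P value
    bf = ump_bayes_factor(xbar, n, sigma, gamma)
    # With gamma chosen this way, BF > gamma exactly when p < 0.05:
    print(f"xbar={xbar:.2f}  p={p:.4f}  BF={bf:.2f}  agree={(p < 0.05) == (bf > gamma)}")
```

The point of the sketch is the last column: under this matching choice of γ, the frequentist and Bayesian tests always agree on rejection, which is what lets Johnson translate a P-value cutoff into a minimum Bayes factor and ask how strong the evidence behind p < .05 really is.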