Joshua D. Habiger - Homepage




  Joshua D. Habiger's Website




My current research interests include high-dimensional (HD) statistical inference and categorical data analysis. HD data sets are increasingly common because high-throughput technologies, such as brain imaging, pyrosequencing, and data mining software, are now routinely used to generate them. For example, in Anderson and Habiger (2012), rRNA pyrosequencing software was used to identify and measure the abundance of rhizobacteria in wheat (bacteria near the roots of the plant) at the DNA level across several productivity groups. The statistical inference problem amounted to determining which of the many identified rhizobacteria are associated with productivity, via the simultaneous testing of multiple statistical hypotheses.

The challenges that arise in this analysis are typical of many HD multiple testing problems. In particular, multiplicity corrections must be made because of the large number of hypotheses tested, but most existing methods are designed for independent and identically distributed test statistics with continuous distributions. In practice, 1) test statistics are not independent and little is known a priori about the dependence structure, 2) test statistics have discrete distributions, and 3) test statistics are heterogeneous. Hence, there is a need to develop HD multiple testing methods that exploit, or at the very least allow for, correlation, heterogeneity, and discrete data.
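To make "multiplicity correction" concrete, here is a minimal Python sketch of the standard Benjamini-Hochberg step-up procedure. It illustrates the kind of existing method referred to above and is not the approach developed in this research. Its false discovery rate guarantee is usually stated for independent (or positively dependent) test statistics with continuous p-values, which are the assumptions questioned above. The function name, the example p-values, and the level alpha = 0.05 are all made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Standard Benjamini-Hochberg step-up rule: reject the k smallest
    p-values, where k is the largest rank with p_(k) <= k * alpha / m.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest 1-based rank whose p-value falls under its threshold.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank

    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from ten simultaneous tests (illustrative only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.060, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p_values, alpha=0.05))
```

With these made-up inputs, only the two smallest p-values fall under their thresholds, so only those two hypotheses are rejected. Applying an unadjusted 0.05 cutoff to each test would instead reject five.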

Of course, as data-generating technology evolves, we may anticipate even larger and more complex data sets with unforeseen challenges. For this reason, it is imperative that we lay a firm foundation for HD statistical inference so that methods are adaptable and the field of Statistics can continue to efficiently serve the scientific community. What an exciting time to be a Statistician!



Peer Reviewed Publications
  • The influence of misspecified covariance on false discovery control when using posterior probabilities. Statistical Theory and Related Fields (in press). With Y. Liang and X. Min.
  • A multiple testing protocol for exploratory data analysis and the local misclassification rate. Communications in Statistics: Theory and Methods (in press). With D. Watts.
Current Student Research
  • Tina Shi (PhD student).
    • Tina is working on statistical methods for RNA-seq data analysis.
Former Graduate Students
  • David Watts (PhD student).
    • David's research focuses on false discovery rate and clustering methods for HD heterogeneous data.
  • Zhesi Chen (MS student).
    • Zhesi worked on a jackknife approach to oracle parameter estimation for false discovery rate methods.
  • Tamanna Hossain (MS student).
    • Tamanna studied the robustness of temporal correlation models in the analysis of functional magnetic resonance imaging (fMRI) data.
  • Ana Tehranni (McNair student).
    • Ana focused on the effect of conditioning on ancillary statistics in contingency table analysis.