  1. Beyond Dual-Supervision: the Many Benefits of Annotator Rationales for Relevance Judgments.
    In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI): Sister Conference Best Paper Track, 2017. (to appear)
  2. Why Is That Relevant? Collecting Annotator Rationales for Relevance Judgments.
    In Proceedings of the Fourth AAAI Conference on Human Computation and Crowdsourcing (HCOMP), 2016. Best Paper Award.
    [ pdf ] [ slides ] [ data ] [ blog ] [ press ]
  3. Neural Information Retrieval: A Literature Review.
    Technical Report (pre-print), University of Texas at Austin, 2016. arXiv:1611.06792.
    [ pdf ]
  4. An Empirical Study of API Stability and Adoption in the Android Ecosystem.
    In Proceedings of the Twenty-Ninth IEEE International Conference on Software Maintenance (ICSM), 2013.
    [ pdf ] [ slides ]



  • Multi-Task Deep Representation Learning, 2016
    One oft-cited but rarely explored benefit of deep learning is the potential for sharing representations between tasks. In many domains, disparate tasks (e.g., document classification, object recognition, or video categorization) may share a common low-level representation (e.g., bag-of-words vectors, pixels in an image, or frames in a video). With deep learning, we may choose which layers to share between different but potentially related tasks. Ideally, this allows for more effective generalization to a new domain (i.e., transfer learning), but it may also be used to build a single network that handles multiple tasks at once. We evaluate such representation sharing using three deep architectures (low-level sharing, high-level sharing, and interspersed sharing) on a variety of related natural language tasks.
    [ Paper ] [ Code ]
  • Crowdsourcing Relevance Judgments Using Reinforcement Learning, 2016
    While repetition and aggregation have done much to address quality control concerns in crowdsourcing, asking many people to complete the same task undermines its scalability and cost-effectiveness. To address this, I explore the use of reinforcement learning techniques for crowdsourced label collection. In contrast with previous techniques, which exclusively attempt to model worker quality and balance task load among trusted workers, I focus on building a state space that models the discriminating features of tasks and responses using function approximation. The end result is a model that can optimize label collection for a particular task domain and cost model.
    [ Paper ]
  • Synonymy and Antonymy Detection in Distributional Models, 2016
    Distributional models are flexible representations of semantic meaning, in which words are represented through the textual contexts in which they appear. Here we show that sentiment features outperform pattern-based and narrow-context approaches for differentiating synonyms and antonyms in distributional spaces. We also introduce an unsupervised method based on our Distributional Sentiment Hypothesis.
    [ Paper ]
  • GASP: The Graduate Admission Support Program, 2013
    A statistical machine learning system built for the Electrical & Computer Engineering Department at the University of Texas at Austin that uses historical admissions data to classify incoming graduate school applicants.
    [ Poster ] [ Related ]
  • Automatic Segmentation of Neurons in Electron Micrograph Images, 2013
    Developed a pipeline for building 3D models of neurons from high-resolution electron micrograph images.
    [ Slides ]
  • Crystal Growth and Analysis of Select Oligothiophenes for Integration in Photovoltaics, 2010
    An investigation of organic materials that might enable higher-efficiency, lower-cost solar panels.
    [ Poster ]