28 - Data Programming: Creating Large Training Sets, Quickly - a podcast by Allen Institute for Artificial Intelligence

from 2017-07-11T22:46:50


NIPS 2016 paper by Alexander Ratner and coauthors in Chris Ré's group at Stanford, presented by Waleed.

The paper presents a method for generating labels for an unlabeled dataset by combining a number of weak labelers. This shifts the annotation effort from labeling individual examples to writing a large number of noisy labeling heuristics, a task the authors call "data programming". A model is then learned that aggregates the weak labelers' outputs according to their estimated accuracies, producing a weighted "supervised" training set. We talk about this method, how it works, how it relates to ideas like co-training, and when you might want to use it.
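To make the workflow concrete, here is a minimal sketch of the idea in Python. The toy task, the labeling functions, and the simple agreement-based accuracy weighting are all illustrative assumptions; the paper itself learns a generative model over the labeling functions, and this is not the authors' implementation or the Snorkel API.

```python
# Minimal sketch of the data-programming workflow (hypothetical labeling
# functions plus a crude accuracy-weighted vote, not the paper's exact model).

import numpy as np

# Unlabeled examples for a toy task: does the sentence mention a spouse relation?
examples = [
    "Alice married Bob in 2010.",
    "Carol is the CEO of Acme.",
    "Dan and his wife Eve live in Rome.",
    "Frank met Grace at a conference.",
]

# Weak labeling heuristics: return +1 (positive), -1 (negative), or 0 (abstain).
def lf_keyword_married(x):
    return 1 if "married" in x.lower() else 0

def lf_keyword_spouse_terms(x):
    return 1 if ("wife" in x.lower() or "husband" in x.lower()) else 0

def lf_business_context(x):
    return -1 if ("ceo" in x.lower() or "conference" in x.lower()) else 0

labeling_functions = [lf_keyword_married, lf_keyword_spouse_terms, lf_business_context]

# Label matrix: one row per example, one column per labeling function.
L = np.array([[lf(x) for lf in labeling_functions] for x in examples])

# Estimate each labeling function's reliability from its agreement with the
# unweighted majority vote (a stand-in for the paper's learned generative model).
majority = np.sign(L.sum(axis=1))
weights = np.array([
    (L[:, j][L[:, j] != 0] == majority[L[:, j] != 0]).mean() if (L[:, j] != 0).any() else 0.0
    for j in range(L.shape[1])
])

# Accuracy-weighted vote yields probabilistic labels for training a downstream model.
scores = L @ weights
prob_positive = 1.0 / (1.0 + np.exp(-scores))
for x, p in zip(examples, prob_positive):
    print(f"{p:.2f}  {x}")
```

The resulting probabilistic labels would then be used as a noise-aware training set for an ordinary supervised model, which is the step the paper's discriminative stage covers.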

https://www.semanticscholar.org/paper/Data-Programming-Creating-Large-Training-Sets-Quic-Ratner-Sa/37acbbbcfe9d8eb89e5b01da28dac6d44c3903ee
