Named Entity Recognition using structured generation
Structured generation is a method that constrains the output of a language model to a given format. The idea is to represent the desired format (e.g. JSON) as a Finite State Machine (FSM) and, at each decoding step, mask the probabilities of the tokens that would break the format, so that only valid tokens can be generated. In this post, we will use the outlines library to perform Named Entity Recognition (NER) on the book Dune by Frank Herbert. Our goal is to extract characters, locations, and organizations, and hopefully to infer clusters from their interactions in the text....
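To make the masking idea concrete, here is a minimal toy sketch in plain Python. It does not use the outlines API: the vocabulary, the hand-written FSM, and the `toy_logits` function are all hypothetical stand-ins for a real tokenizer, a compiled format, and a real model. The FSM only accepts one of three entity labels (`PER`, `LOC`, `ORG`), while the fake model strongly prefers an invalid token.

```python
import math

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["P", "ER", "L", "OC", "O", "RG", "hello", "<eos>"]

# Hand-written FSM for the format PER | LOC | ORG.
# Maps state -> {allowed token: next state}.
FSM = {
    0: {"P": 1, "L": 2, "O": 3},
    1: {"ER": 4},
    2: {"OC": 4},
    3: {"RG": 4},
    4: {"<eos>": 5},
}
FINAL_STATE = 5


def toy_logits(step):
    """Stand-in for a language model: it always scores 'hello' highest."""
    return [
        0.1 * ((i + step) % 5) + (5.0 if tok == "hello" else 0.0)
        for i, tok in enumerate(VOCAB)
    ]


def constrained_greedy_decode(max_steps=10):
    """Greedy decoding where tokens the FSM forbids are masked to -inf."""
    state, output = 0, []
    for step in range(max_steps):
        if state == FINAL_STATE:
            break
        logits = toy_logits(step)
        allowed = FSM[state]
        # Mask every token that has no transition from the current state.
        masked = [
            logit if tok in allowed else -math.inf
            for tok, logit in zip(VOCAB, logits)
        ]
        best = max(range(len(VOCAB)), key=masked.__getitem__)
        token = VOCAB[best]
        state = allowed[token]  # advance the FSM
        if token != "<eos>":
            output.append(token)
    return "".join(output)


result = constrained_greedy_decode()
print(result)  # ORG
```

Without the mask, greedy decoding would pick `hello` at every step. With it, the output is guaranteed to be one of the three labels, whatever the model prefers. Libraries such as outlines apply the same principle to much richer formats, building the FSM from a schema or a regular expression instead of writing it by hand.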