Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples


In research on interpreting neural networks, one approach is to approximate a complex model with a simpler, more transparent one. This paper proposes a method for extracting an automaton from an LSTM using queries and counterexamples. The technique can reveal cases where the LSTM has failed to learn the intended generalization: the extracted automaton is then overly complex relative to the simple target language.
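To make the "queries and counterexamples" idea concrete, here is a minimal Python sketch. It is not the paper's algorithm: the exhaustive search below is a simplified stand-in for a counterexample search, and `network_accepts`, the two-letter alphabet, and the "even number of a's" target language are all hypothetical choices made for illustration. The black-box function plays the role of a trained network answering membership queries, and a candidate automaton is checked against it until a disagreeing word is found.

```python
from itertools import product

ALPHABET = "ab"  # hypothetical alphabet, chosen for illustration


def network_accepts(word: str) -> bool:
    """Hypothetical stand-in for a trained RNN classifier.

    Answers a membership query; here it accepts words with an
    even number of 'a' symbols.
    """
    return word.count("a") % 2 == 0


def dfa_accepts(transitions, accepting, start, word):
    """Run a candidate automaton on a word and report acceptance."""
    state = start
    for symbol in word:
        state = transitions[state][symbol]
    return state in accepting


def find_counterexample(transitions, accepting, start, max_len=6):
    """Return the shortest word (up to max_len) on which the candidate
    automaton and the black box disagree, or None if they agree on all
    of them. Brute-force enumeration is an assumption of this sketch."""
    for length in range(max_len + 1):
        for letters in product(ALPHABET, repeat=length):
            word = "".join(letters)
            if dfa_accepts(transitions, accepting, start, word) != network_accepts(word):
                return word
    return None


# A too-coarse hypothesis: one state that accepts every word.
coarse = {0: {"a": 0, "b": 0}}
# A refined hypothesis: two states tracking the parity of 'a'.
parity = {0: {"a": 1, "b": 0}, 1: {"a": 0, "b": 1}}

print(find_counterexample(coarse, {0}, 0))  # prints: a
print(find_counterexample(parity, {0}, 0))  # prints: None
```

In this toy setting the one-state hypothesis is refuted by the word `a`, while the two-state parity automaton survives every query up to the length bound. A network that had not learned the intended rule would instead keep producing counterexamples, which is one way an overly complex extracted automaton can arise.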

