[1707.03389] SCAN: Learning Abstract Hierarchical Compositional Visual Concepts


Compositionality is an inherent part of natural language. Researchers from DeepMind propose a model that learns abstract hierarchical compositional visual concepts (a mouthful). Building on an existing model (the beta-VAE), which learns disentangled representations from visual input, their model learns abstractions from a small number of symbol-image pairs, e.g. example images paired with the symbol "apple". Using logical operators (AND, IN COMMON, IGNORE), the model then learns to compose these concepts. Blog post.
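To make the composition idea concrete, here is a toy sketch (not the paper's actual implementation, which trains a neural recombination operator): assume each concept is a diagonal Gaussian over disentangled latent dimensions, where near-unit variance marks an attribute as unspecified, and AND keeps the more specified (lower-variance) factor per dimension.

```python
import numpy as np

# Toy illustration of composing disentangled concepts.
# Assumption (hypothetical encoding, not from the paper verbatim):
# a concept is (mean, variance) per latent dimension; variance ~1.0
# means "unspecified", low variance encodes a defining attribute.

def and_op(c1, c2):
    """AND: per dimension, keep the more specified (lower-variance) factor."""
    mu1, var1 = c1
    mu2, var2 = c2
    take1 = var1 < var2
    return np.where(take1, mu1, mu2), np.where(take1, var1, var2)

# 3 latent dims: [color, shape, size]
red   = (np.array([0.9, 0.0, 0.0]),  np.array([0.01, 1.0, 1.0]))
small = (np.array([0.0, 0.0, -0.8]), np.array([1.0, 1.0, 0.01]))

mu, var = and_op(red, small)
print(mu)   # color and size now specified, shape left unspecified
print(var)
```

Here "red AND small" inherits the color dimension from `red` and the size dimension from `small`, while shape stays unconstrained, so sampling from the result yields red, small objects of any shape.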
