Compositionality With Variation Reliably Emerges in Neural Networks
Henry Conklin, Kenny Smith
We re-evaluated how to look for compositional structure in response to recent work claiming that compositionality isn’t needed for generalization. While natural languages are compositional, they’re also rich with variation. By introducing four explicit measures of variation, we showed that models reliably converge to compositional representations, just with a degree of variation that skewed previous measures. We also showed that at the start of training variation correlates strongly with generalization, but that this effect goes away as representations become regular enough for the task. Converging to highly variable representations is similar to what we see in human languages, and in a final set of experiments we show that model capacity, thought to condition variation in human language, has a similar conditioning effect in neural networks.
presented as a paper at the ICLR 2023 main conference
Anaphoric Structures Emerge Between Neural Networks
Nicholas Edwards*, Hannah Rohde, Henry Conklin*
Anaphors are ubiquitous in human language; structures like pronouns and ellipsis are present in virtually every language. This is in spite of the fact that they seem to introduce ambiguity: the “they” in “they left a parcel for you” could refer to virtually anyone, and needs to be disambiguated by context. Many accounts of why anaphors exist are tied to efficiency: they enable brevity, which lowers the effort of communicating. We show that anaphoric structures emerge between communicating neural networks whether or not there’s any pressure for efficiency, with efficiency pressures increasing the prevalence of anaphoric structures already present. This points to the relationship between semantics and pragmatics, rather than efficiency, as a major causal factor.
accepted as a paper at CogSci 2023
Meta-Learning to Compositionally Generalize
Henry Conklin*, Bailin Wang*, Kenny Smith, Ivan Titov
We looked at how to use meta-learning to introduce biases during training that help neural models generalize out-of-distribution. Using this technique, we inhibited models’ memory in a way analogous to humans’ limited memory, resulting in substantial improvement on compositional generalization tasks. This aligns with work in cognitive science that ties humans’ remarkable ability to generalize, in part, to our limited memory: if we can’t remember everything we’ve seen before, we need a more compact system for representing information. Compositionality, where we break things into a small set of reusable pieces, seems to be the result. This parallel between how we generalize and what helps a model points to the importance of domain-general constraints, like limited memory, in developing human-like models.