An Entropy and Noisy-Channel Model for Rule Induction

Research output: Thesis › Doctoral thesis 1 (Research UU / Graduation UU)

Abstract

This dissertation is a collection of articles presenting the results of a research project that investigated linguistic rule induction from an information-theoretic perspective. The main goal of this project was to propose and test an innovative entropy model for rule induction based on Shannon’s noisy-channel coding theory (Shannon, 1948). Rule induction (generalization or regularization) is an essential language acquisition mechanism that enables language learners not only to memorize specific items (e.g. phonemes, words) encountered in linguistic input, but also to acquire relations between these items. For example, when people learn languages, they not only memorize combinations of words like ‘Mom walked slowly’, but also learn generalized rules about how categories of words can be combined (e.g. Noun-Verb-Adverb). These relations range from statistical patterns between specific items present in the linguistic input (Saffran, Aslin, & Newport, 1996; Thiessen & Saffran, 2007) to more abstract category/rule induction (Marcus, Vijayan, Rao, & Vishton, 1999; Smith & Wonnacott, 2010; Wonnacott & Newport, 2005). This research addressed the inductive steps from memorizing specific items, to inferring rules (or statistical patterns) between these specific items (item-bound generalization), and to forming rules that apply to categories of items (category-based generalization). The main research questions were: (1) whether the two forms of generalization are outcomes of the same learning mechanism or of two different mechanisms, and (2) what factors drive rule induction in its two forms of generalization. To answer these questions, I proposed an innovative theoretical model – an entropy and noisy-channel capacity model – that makes predictions about the transition from memorization to rule induction.
In this model, two factors drive rule learning: (1) entropy (a measure of information content, which quantifies the richness and unpredictability of the language) and (2) channel capacity (the amount of information, including noise, that learners can process per second, since learning happens in time and in noisy environments). I defined the brain’s encoding capacity as channel capacity at the computational level, in the sense of Marr (1982): the finite rate of information encoding (bits per second). At the algorithmic level, channel capacity may be supported by cognitive capacities involved in processing and encoding information, e.g. memory and attention. I tested the entropy model across multiple grammar learning experiments, with both adults and infants. Findings showed that when entropy increases (e.g. when the language has a richer vocabulary or more diverse combinations of words), learners are more likely to generalize rules than to memorize combinations of words. Contrary to intuition, the same happens when channel capacity is pushed to its limit by supplying information faster, and also when background noise distracts from the language. These findings provide evidence in favor of the entropy model. The dissertation also sketches the first joint information-theoretic and thermodynamic model of rule induction, proposing that the second law of thermodynamics and the constructal law can answer why and how rule induction happens.
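The entropy factor in the model is standard Shannon entropy over the learner's input. As a minimal sketch (not the dissertation's actual stimuli or analysis code), the following computes the entropy in bits per item of two hypothetical toy input "languages", showing that a richer, more varied vocabulary yields higher entropy:

```python
import math
from collections import Counter

def shannon_entropy(items):
    """Shannon entropy H = -sum(p * log2 p), in bits per item."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical input languages: a small, repetitive vocabulary
# versus a richer, more diverse one.
small = ["mom walked", "mom walked", "dad ran", "mom walked"]
rich = ["mom walked", "dad ran", "kids played", "dog barked"]

h_small = shannon_entropy(small)  # ~0.81 bits per item
h_rich = shannon_entropy(rich)    # 2.0 bits per item (4 equiprobable items)
```

Under the model, the higher-entropy input (h_rich) is the condition in which learners are predicted to shift from memorizing specific combinations toward rule generalization.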
Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Utrecht University
Supervisors/Advisors
  • Avrutin, Sergey, Primary supervisor
  • Wijnen, Frank, Supervisor
Award date: 12 Nov 2021
Place of Publication: Amsterdam, The Netherlands
Print ISBNs: 978-94-6093-392-9
Publication status: Published - 12 Nov 2021

Keywords

  • rule induction
  • rule learning
  • entropy
  • channel capacity
  • bit rate
  • generalization
  • category formation
  • transmission rate
  • noise
  • regularization
