
Automatically Expressing the Meaning of Logical Formulae in Natural Language

  • Eduardo Calò

Research output: Thesis, Doctoral thesis 1 (Research UU / Graduation UU)

Abstract

For as long as natural language generation (NLG) has existed, "logic-to-text" generation, i.e., NLG from mathematical logic formulae, has attracted attention due to its many potential applications, including intelligent tutoring systems, ontology verbalization, and explainable AI. This thesis investigates the question of how to build, and especially how to evaluate, systems for logic-to-text generation. Its contributions center on logic-to-text NLG, but also extend to the broader field of data-to-text NLG.

Chapters 3 and 4 study generation mechanisms and input manipulations. In Chapter 3, I present LoLa, a rule-based logic-to-text generation system that produces English text from first-order logic formulae. Its main innovation lies in simplifying input formulae through logical equivalences prior to translation. Extensive evaluations using both standard automatic metrics and human judgments show that LoLa is effective at generating natural language translations of logical formulae. In Chapter 4, I turn to the question of identifying the most amenable input for a logic-to-text system. Focusing on the role of brevity, I introduce an algorithm for computing the shortest equivalents of an input formula and report automatic and human evaluations to show that manipulating input formulae actually improves output quality. However, it is unclear whether the shortest equivalent to a given formula is always the best input.

Chapters 5 through 8 focus on evaluation. In Chapter 5, I conduct a meta-evaluation in which user interfaces employed in past human evaluations are assessed by user experience experts. Building on their insights, I derive recommendations for designing more effective and engaging annotation interfaces. In Chapter 6, I present a survey of human evaluations of hallucinations, investigating how these evaluations are designed. The analysis reveals several methodological shortcomings, including the frequent omission of crucial details such as annotation guidelines, user interface design, inter-annotator agreement metrics, and annotator demographics and compensation. In Chapter 7, I present the first implementation of a logic-based framework for hallucination analysis in real-world data-to-text domains, including logic-to-text, experimenting with both human and LLM annotators. The results show that applying the framework to concrete data-to-text domains is feasible, but not straightforward. Human annotators achieve only low to modest accuracies, depending on the domain. By contrast, models demonstrate potential to perform the annotation task, thereby enabling scalability. In Chapter 8, I introduce the concept of formulaicness, a measure of how strongly an output text mirrors the structural form of its input, and I propose its use as an enhancement for automatic evaluation of naturalness, again with a focus on logic-to-text NLG. The results suggest that formulaicness is a valuable addition to the automatic evaluation of naturalness, aligning well with human judgments and consistently improving the correlation of baseline metrics with human ratings.

Overall, the thesis contributes to logic-to-text NLG in two main ways: by highlighting its specific challenges, such as the identification of more amenable inputs, and by taking steps to improve existing approaches to generation and evaluation. More broadly, it also offers more general insights for improving the evaluation of data-to-text NLG systems.

Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Utrecht University
Supervisors/Advisors
  • van Deemter, Kees, Supervisor
  • Gatt, Albert, Supervisor
Award date: 17 Jun 2026
Place of Publication: Utrecht
Print ISBNs: 978-90-393-8044-4
Publication status: Published - 17 Jun 2026

Keywords

  • natural language generation
  • logic-to-text
  • data-to-text
  • NLG
  • logic
  • evaluation
  • hallucinations
