The Influence of Dimensions on the Complexity of Computing Decision Trees

Stephen G. Kobourov, Maarten Löffler, Fabrizio Montecchiani, Marcin Pilipczuk, Ignaz Rutter, Raimund Seidel, Manuel Sorge, Jules Wulms

Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review

Abstract

A decision tree recursively splits a feature space Rd and then assigns class labels based on the resulting partition. Decision trees have been part of the basic machine-learning toolkit for decades. A large body of work considers heuristic algorithms that compute a decision tree from training data, usually aiming to minimize in particular the size of the resulting tree. In contrast, little is known about the complexity of the underlying computational problem of computing a minimum-size tree for the given training data. We study this problem with respect to the number d of dimensions of the feature space Rd, which contains n training examples. We show that it can be solved in O(n2d+1) time, but under reasonable complexity-theoretic assumptions it is not possible to achieve f(d) · no(d/ log d) running time. The problem is solvable in (dR)O(dR) · n1+o(1) time, if there are exactly two classes and R is an upper bound on the number of tree leaves labeled with the first class.

Original languageEnglish
Title of host publicationAAAI'23/IAAI'23/EAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence
EditorsBrian Williams, Yiling Chen, Jennifer Neville
PublisherAAAI Press
Chapter937
Pages8343-8350
Number of pages8
ISBN (Electronic)9781577358800
DOIs
Publication statusPublished - 7 Feb 2023

Bibliographical note

Publisher Copyright:
Copyright © 2023, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Funding

Robert Ganian, Fabrizio Montecchiani, Martin Nöllenburg, and Meirav Zehavi. Stephen Kobourov acknowledges funding by the National Science Foundation, grant number NSF-CCF-2212130. Fabrizio Montecchiani acknowledges funding by University of Perugia, Fondi di Ricerca di Ate-neo, edizione 2021, project “AIDMIX - Artificial Intelligence for Decision making: Methods for Interpretability and eXplainability”. Manuel Sorge acknowledges funding by the Alexander von Humboldt Foundation. Jules Wulms acknowledges funding by the Vienna Science and Technology Fund (WWTF) under grant ICT19-035.

FundersFunder number
Fondi di Ricerca di Ate-neo
National Science FoundationNSF-CCF-2212130
Alexander von Humboldt-Stiftung
Vienna Science and Technology FundICT19-035
Università degli Studi di Perugia

    Keywords

    • CG
    • AI
    • DS
    • FPT

    Fingerprint

    Dive into the research topics of 'The Influence of Dimensions on the Complexity of Computing Decision Trees'. Together they form a unique fingerprint.

    Cite this