Predictive spreadsheet autocompletion with constraints / Kolb, Samuel; Teso, Stefano; Dries, Anton; De Raedt, Luc. - In: MACHINE LEARNING. - ISSN 0885-6125. - 109:2(2020), pp. 307-325. [10.1007/s10994-019-05841-y]

Predictive spreadsheet autocompletion with constraints

Kolb, Samuel; Teso, Stefano; Dries, Anton; De Raedt, Luc
2020-01-01

Abstract

Spreadsheets are arguably the most accessible data-analysis tool and are used by millions of people. Despite the fact that they lie at the core of most business practices, working with spreadsheets can be error-prone, using formulas requires training, and, crucially, spreadsheet users do not have access to state-of-the-art analysis techniques offered by machine learning. To tackle these issues, we introduce the novel task of predictive spreadsheet autocompletion, where the goal is to automatically predict the missing entries in a spreadsheet. This task is highly non-trivial: cells can hold heterogeneous data types and there might be unobserved relationships between their values, such as constraints or probabilistic dependencies. Critically, the exact prediction task itself is not given. We consider a simplified, yet non-trivial, setting and propose a principled probabilistic model to solve it. Our approach combines black-box predictive models specialized for different predictive tasks (e.g., classification, regression) with constraints and formulas detected by a constraint learner, and produces a maximally likely prediction for all target cells that is consistent with the constraints. Overall, our approach brings us one step closer to allowing end users to leverage machine learning in their workflows without writing a single line of code.
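The idea of a "maximally likely prediction that is consistent with the constraints" can be illustrated with a small sketch. This is not the model from the paper, whose details are not given in this record; it is a brute-force toy under stated assumptions. The function name `most_likely_consistent`, the cell names, the probabilities, and the constraint are all hypothetical, and the per-cell distributions stand in for the output of black-box predictors that are assumed to be independent.

```python
from itertools import product
from math import prod


def most_likely_consistent(candidates, constraints):
    """Return the highest-probability joint assignment that satisfies all constraints.

    candidates:  dict mapping each target cell to {value: probability}, as might be
                 produced by a per-cell classifier (hypothetical input format).
    constraints: list of functions taking an assignment dict and returning True/False.
    Returns None if no assignment satisfies every constraint.
    """
    cells = list(candidates)
    best, best_score = None, -1.0
    # Enumerate every combination of candidate values (only feasible for toy sizes).
    for values in product(*(candidates[c] for c in cells)):
        assignment = dict(zip(cells, values))
        if not all(check(assignment) for check in constraints):
            continue  # discard predictions that violate a learned constraint
        # Independence assumption: joint score is the product of per-cell probabilities.
        score = prod(candidates[c][v] for c, v in assignment.items())
        if score > best_score:
            best, best_score = assignment, score
    return best


# Hypothetical predictor outputs for two empty cells in one row.
candidates = {
    "tier": {"gold": 0.6, "silver": 0.4},
    "discount": {0.10: 0.55, 0.20: 0.45},
}

# Hypothetical learned constraint: gold customers always get a 20% discount.
constraints = [lambda a: a["tier"] != "gold" or a["discount"] == 0.20]

prediction = most_likely_consistent(candidates, constraints)
```

Taken separately, the two predictors would choose `gold` and `0.10`, which violates the constraint. The joint search instead selects `gold` with `0.20`, the consistent assignment with the highest product of probabilities.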
Files in this record:

psyche___mlj (1).pdf
Open Access from 26/10/2020
Description: Main article
Type: Refereed author's manuscript (post-print)
License: All rights reserved
Size: 2.2 MB
Format: Adobe PDF

Kolb2020_Article_PredictiveSpreadsheetAutocompl.pdf
Access: Repository managers only
Type: Publisher's layout (published version)
License: All rights reserved
Size: 704.09 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/290519