"This Sounds Unclear": Evaluating ChatGPT capability in translating end-user prompts into ready-to-deploy Python Code / Andrao, Margherita; Morra, Diego; Paccosi, Teresa; Matera, Maristella; Treccani, Barbara; Zancanaro, Massimo. - (2024). (Paper presented at the AVI 2024 conference, held in Arenzano, Genoa, Italy, 3-7 June 2024) [10.1145/3656650.3656693].

"This Sounds Unclear": Evaluating ChatGPT capability in translating end-user prompts into ready-to-deploy Python Code

Andrao, Margherita; Paccosi, Teresa; Treccani, Barbara; Zancanaro, Massimo
2024-01-01

Abstract

In this paper, we present a study aimed at evaluating how ChatGPT-4 understands end-users’ natural language instructions to express automation rules for smart home applications and how it translates them into Python code ready to be deployed. Our study used 34 natural language instructions written by end users who were asked to automate scenarios presented as visual animations. The results show that ChatGPT-4 can produce coherent and effective code even if the instructions present ambiguities or unclear elements, understanding natural language instructions and autonomously resolving 94% of them. However, the generated code still contains numerous ambiguities that could potentially affect safety and security aspects. Nevertheless, when appropriately prompted, ChatGPT-4 can subsequently identify those ambiguities. This prompts a discussion about prospective interaction paradigms that may significantly improve the immediate usability of the generated code.
2024
AVI '24: Proceedings of the 2024 International Conference on Advanced Visual Interfaces
New York, NY, USA
Association for Computing Machinery
Andrao, Margherita; Morra, Diego; Paccosi, Teresa; Matera, Maristella; Treccani, Barbara; Zancanaro, Massimo
Files in this item:
File: Andrao et al., 2024 - This Sounds Unclear.pdf
Access: open access
Type: Publisher’s version (Publisher’s layout)
License: Creative Commons
Size: 572.44 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/413850