"This Sounds Unclear": Evaluating ChatGPT capability in translating end-user prompts into ready-to-deploy Python Code / Andrao, Margherita; Morra, Diego; Paccosi, Teresa; Matera, Maristella; Treccani, Barbara; Zancanaro, Massimo. - (2024). (Paper presented at the 2024 International Conference on Advanced Visual Interfaces, AVI 2024, held in Arenzano, Genoa, Italy, 3rd-7th June 2024) [10.1145/3656650.3656693].
"This Sounds Unclear": Evaluating ChatGPT capability in translating end-user prompts into ready-to-deploy Python Code
Andrao, Margherita; Paccosi, Teresa; Treccani, Barbara; Zancanaro, Massimo
2024-01-01
Abstract
In this paper, we present a study aimed at evaluating how ChatGPT-4 understands end-users’ natural language instructions to express automation rules for smart home applications and how it translates them into Python code ready to be deployed. Our study used 34 natural language instructions written by end users who were asked to automate scenarios presented as visual animations. The results show that ChatGPT-4 can produce coherent and effective code even if the instructions present ambiguities or unclear elements, understanding natural language instructions and autonomously resolving 94% of them. However, the generated code still contains numerous ambiguities that could potentially affect safety and security aspects. Nevertheless, when appropriately prompted, ChatGPT-4 can subsequently identify those ambiguities. This prompts a discussion about prospective interaction paradigms that may significantly improve the immediate usability of the generated code.

File | Size | Format |
---|---|---|
Andrao et al., 2024 - This Sounds Unclear.pdf (open access; Type: Publisher's version (Publisher’s layout); License: Creative Commons) | 572.44 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.