Comparing Large Language Models and Grammatical Evolution for Code Generation / Custode, Leonardo Lucio; Rambaldi Migliore, Chiara Camilla; Roveri, Marco; Iacca, Giovanni. - (2024), pp. 1830-1837. (2024 Genetic and Evolutionary Computation Conference Companion, GECCO 2024 Companion, Melbourne, 14th July - 18th July 2024) [10.1145/3638530.3664162].
Comparing Large Language Models and Grammatical Evolution for Code Generation
Leonardo Lucio Custode;Chiara Camilla Migliore Rambaldi;Marco Roveri;Giovanni Iacca
2024-01-01
Abstract
Code generation is one of the most valuable applications of AI, as it allows for automated programming and "self-building" programs. Both Large Language Models (LLMs) and evolutionary methods, such as Genetic Programming (GP) and Grammatical Evolution (GE), are known to be capable of generating code with reasonable performance. However, to the best of our knowledge, little work has been done so far on a systematic comparison between the two approaches. Most importantly, the only studies that conducted such comparisons used benchmarks from the GP community, which, in our opinion, may have produced GP-biased results. In this work, we compare LLMs and evolutionary methods, in particular GE, using instead a well-known benchmark originating from the LLM community. Our results show that, in this scenario, LLMs can solve significantly more tasks than GE, indicating that GE struggles to match the performance of LLMs on code generation tasks that have differe...

| File | Size | Format |
|---|---|---|
| 3638530.3664162.pdf (open access; type: publisher's version, Publisher's layout; license: Creative Commons) | 1.63 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.



