A Comparative Evaluation of Function-Calling LLMs in a Cognitive Architecture
Pardini, Marco; Galatolo, Federico A.; Cominelli, Lorenzo; Cimino, Mario G. C. A.; Greco, Alberto; Scilingo, Enzo Pasquale
2025-01-01
Abstract
Large Language Models (LLMs) now possess function-calling capabilities, enabling them to interact with external tools and APIs. Within cognitive architectures for social robotics, this provides a robust mechanism for an LLM to orchestrate a set of discrete functions (conceptually echoing the brain's functional specificity) managing operations such as visual perception, auditory processing, speech generation, and memory access. However, LLMs exhibit varying propensities and strategies when issuing these function calls. This paper presents a comparative evaluation of four LLMs that differ significantly in parameter scale and origin: Llama3 70b, Gemma2 9b, Mixtral 8x7b, and Phi3 mini 3.8b, each acting as the orchestrator in such an architecture. Our testbed was a human-participant study (N=20) in which individuals engaged in ambiguous social scenarios, interacting with the architecture driven by each of the four LLMs (four trials per participant). Results revealed statistically significant differences in the frequency of 'Look' (F=13.62, p<0.001), 'Talk' (F=9.29, p<0.001), and 'Hear' (F=10.34, p<0.001) calls across LLMs. Notably, Llama3 70b made significantly more 'Look' calls (M=3.45), a behavior that corresponded with strong user preference (18/20 participants), suggesting its interaction style was perceived as more natural and contextually aware. Mixtral 8x7b, in contrast, favored 'Talk' (M=11.05) and 'Hear' (M=11.15) calls. These findings demonstrate that analyzing function-call patterns offers a quantitative lens for understanding and comparing the interaction strategies of different LLMs in orchestrating robotic behavior.
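The orchestration pattern described in the abstract, an LLM emitting named function calls that the architecture routes to perception and action modules, and an analysis that counts calls per function, can be sketched minimally as below. The function names mirror the paper's 'Look', 'Talk', and 'Hear' calls; the function bodies, the call-record format, and the helper names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of function-call dispatch and per-function frequency
# counting, assuming the LLM emits calls as {"name": ..., "arguments": ...}
# records (an assumed format, not the paper's actual interface).
from collections import Counter

def look(target: str) -> str:
    # Stand-in for the visual perception module.
    return f"visual description of {target}"

def talk(utterance: str) -> str:
    # Stand-in for the speech generation module.
    return f"spoke: {utterance}"

def hear() -> str:
    # Stand-in for the auditory processing module.
    return "transcribed user speech"

# Registry mapping LLM-visible function names to modules.
TOOLS = {"Look": look, "Talk": talk, "Hear": hear}

def dispatch(call: dict) -> str:
    """Route one LLM-emitted function call to the matching module."""
    return TOOLS[call["name"]](**call.get("arguments", {}))

def call_frequencies(calls: list) -> Counter:
    """Per-function call counts, the metric compared across LLMs."""
    return Counter(c["name"] for c in calls)

# Example trace from one hypothetical interaction turn.
calls = [
    {"name": "Look", "arguments": {"target": "the user"}},
    {"name": "Hear"},
    {"name": "Talk", "arguments": {"utterance": "Hello!"}},
    {"name": "Look", "arguments": {"target": "the room"}},
]
for c in calls:
    dispatch(c)
print(call_frequencies(calls))  # Counter({'Look': 2, 'Hear': 1, 'Talk': 1})
```

Counting calls this way is what makes the comparison quantitative: each LLM's orchestration style reduces to a frequency distribution over the registered functions, which can then be tested for significance across models.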


