CINECA IRIS Institutional Research Information System

Grasp It Like a Pro 3.0: An Expert-Based Data-Driven Algorithm for Grasping Unknown Objects in Cluttered Environments

Gambino, Gabriele;Tolomei, Simone;Angelini, Franco;Garabini, Manolo

2026-01-01

Abstract

Grasping unknown objects in clutter remains challenging due to partial occlusions, self-occlusions, and frequent object interactions during execution. In this paper, we present Grasp It Like a Pro 3.0 (GILP 3.0), a lightweight learning-from-demonstration pipeline for closed-loop clutter clearing with unknown objects. The method segments the scene from raw RGB-D point clouds, approximates candidate regions via minimum volume bounding box (MVBB) decomposition, and predicts grasp pose and interaction wrench from compact MVBB descriptors using two histogram-based gradient-boosted decision-tree (GBDT) regressors trained from a limited number of human demonstrations. Grasp candidates are ranked with execution-oriented feasibility and collision-aware scoring, and the pipeline iteratively reacquires the scene and replans after each attempt to handle object motion and occlusions in clutter. Real-robot experiments on a Franka Emika Panda with Franka Hand in 12 cluttered scenes (47 objects) achieve 95.7% per-object success and a 91.7% scene clearing rate, demonstrating a replicable and data-efficient solution for grasping unknown everyday objects in cluttered environments.

Note to Practitioners: We present a grasping pipeline for cluttered, unknown scenes that outputs collision-aware, executable poses with high first-attempt success. It suits industrial cells handling varied, unlabeled items and frequent product changes, since it relies on coarse box descriptors rather than precise CAD models. Deep learning-based methods often need large training sets and extra feasibility checks; purely analytic methods depend on accurate geometry. The pipeline segments the scene (3D clustering and MVBB) and uses two histogram-based GBDT regressors to map box dimensions to a 6-DoF pose and an interaction-wrench metric. It returns a single scored pose that can be processed by standard planners. Requirements: RGB-D/3D sensing with basic calibration; typical tuning is limited to density thresholds and a few score weights. Known failure cases are transparent/specular surfaces and heavy occlusions; light filtering and periodic calibration help.
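The ranking and replanning loop summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `GraspCandidate` fields, the weighted-sum score, the weight values, the feasibility threshold, and the `perceive`/`execute` callbacks are all hypothetical placeholders chosen for illustration, and the perception and regression stages (segmentation, MVBB decomposition, GBDT prediction) are replaced by a toy scene.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class GraspCandidate:
    """One candidate grasp. All fields are illustrative assumptions."""
    label: str
    feasibility: float   # 0..1, e.g. how reachable the pose is (assumed)
    clearance: float     # 0..1, higher means farther from collisions (assumed)
    wrench_score: float  # 0..1, quality derived from the predicted wrench (assumed)


def score(c: GraspCandidate, w_feas: float = 0.4,
          w_clear: float = 0.4, w_wrench: float = 0.2) -> float:
    """Weighted sum of the terms; the weights are hypothetical tuning knobs."""
    return (w_feas * c.feasibility + w_clear * c.clearance
            + w_wrench * c.wrench_score)


def best_candidate(cands: List[GraspCandidate],
                   min_feasibility: float = 0.5) -> Optional[GraspCandidate]:
    """Drop candidates below the feasibility threshold, return the top-scored one."""
    executable = [c for c in cands if c.feasibility >= min_feasibility]
    return max(executable, key=score, default=None)


def clear_scene(perceive: Callable[[], List[GraspCandidate]],
                execute: Callable[[GraspCandidate], bool],
                max_attempts: int = 20) -> List[Tuple[str, bool]]:
    """Closed loop: reacquire the scene and re-rank after every attempt."""
    log: List[Tuple[str, bool]] = []
    for _ in range(max_attempts):
        best = best_candidate(perceive())  # fresh candidates each iteration
        if best is None:                   # nothing executable is left
            break
        log.append((best.label, execute(best)))
    return log


def demo() -> List[Tuple[str, bool]]:
    """Toy scene: every grasp succeeds and removes the object."""
    scene: Dict[str, GraspCandidate] = {
        "box": GraspCandidate("box", 0.9, 0.8, 0.7),
        "can": GraspCandidate("can", 0.8, 0.3, 0.9),
        "cup": GraspCandidate("cup", 0.6, 0.9, 0.5),
    }

    def perceive() -> List[GraspCandidate]:
        return list(scene.values())

    def execute(c: GraspCandidate) -> bool:
        scene.pop(c.label)
        return True

    return clear_scene(perceive, execute)
```

Calling `demo()` clears the toy scene in descending score order (box, then cup, then can). Because candidates are recomputed on every iteration, a failed attempt or an object displaced by a previous grasp is simply picked up by the next perception pass, bounded by `max_attempts`.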

Year: 2026

DOI: https://dx.doi.org/10.1109/tase.2026.3674643

All authors: Gambino, Gabriele; Tolomei, Simone; Angelini, Franco; Garabini, Manolo

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1359487

Warning: the data displayed have not been validated by the university.
