We present a hierarchical planning framework for dexterous robotic manipulation (HiDex). The framework exploits both in-hand and extrinsic dexterity by actively exploring contacts, generating rigid-body motions and complex contact sequences.
Our framework is based on Monte-Carlo Tree Search (MCTS) and has three levels: (1) planning object motions and environment contact modes; (2) planning robot contacts; (3) path evaluation and control optimization that passes the rewards to the upper levels.
This framework offers two main advantages. First, it allows efficient global reasoning over the high-dimensional, complex space created by contacts. It solves a diverse set of manipulation tasks that require dexterity, both intrinsic (using the fingers) and extrinsic (also using the environment), mostly in seconds. Second, our framework allows the incorporation of expert knowledge and customizable setups in task mechanics and models. Expert knowledge can be flexibly encoded into the search through MCTS action policies, value estimations, and rewards. Hence, it could provide a generalizable solution for various manipulation tasks.
We instantiate this framework on manipulation with extrinsic dexterity and in-hand manipulation. Example tasks include card pick-up, book-out-of-bookshelf, peg-out-of-hole, block flipping, occluded grasping, upward peg-in-hole, sideways peg-in-hole, planar reorientation, planar block passing, and in-hand reorientation. We also demonstrate some of them on two robot platforms. As future work, we envision this framework being extended towards general manipulation planning that incorporates global reasoning, mechanics, learning, and optimization.
The framework requires only minor modifications to accommodate different scenarios and robots: in our code, setting up a new scenario and adjusting the search parameters only requires editing a single setup.yaml file.
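To illustrate what such a configuration file might contain, here is a hypothetical sketch; the field names below are illustrative assumptions, and the keys in the actual code release may differ:

```yaml
# Hypothetical setup.yaml sketch; actual keys in the release may differ.
task: card_pickup
object:
  mesh: assets/card.obj
  start_pose: [0.0, 0.0, 0.01, 0, 0, 0, 1]   # x y z + quaternion
  goal_pose:  [0.1, 0.0, 0.05, 0, 0, 0, 1]
robot:
  contact_points: 2          # number of fingertip contacts
  friction: 0.8
search:
  max_iterations: 2000       # MCTS budget at Level 1
  ucb_exploration: 1.4
  rrt_goal_bias: 0.3
```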
An overview of our framework, with an example of picking up a card. In the example, the robot pulls the card to the edge of the table and then grasps it.
The following processes run iteratively.
Level 1 plans object trajectories, interleaving searches over discrete contact modes (▭ nodes) and continuous object poses (○ nodes). Level 1 is based on MCTS, and we employ Rapidly-exploring Random Trees (RRT) as the MCTS rollout policy to enhance exploration of the object configuration space.
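A minimal sketch of this combination, assuming standard UCB selection and an RRT-style rollout that extends toward randomly sampled targets; the 1-D state, step sizes, and reward here are toy stand-ins for the paper's object poses and mechanics-based evaluation:

```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state = state          # toy 1-D stand-in for an object pose
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0            # running sum of rollout rewards

def ucb(child, c=1.4):
    # Upper-confidence bound used for selection; unvisited nodes first.
    if child.visits == 0:
        return float("inf")
    return child.value / child.visits + c * math.sqrt(
        math.log(child.parent.visits) / child.visits)

def select(node):
    # Descend to a leaf, always following the best-UCB child.
    while node.children:
        node = max(node.children, key=ucb)
    return node

def rrt_rollout(state, goal, steps=20, step_size=0.05):
    # RRT-flavoured rollout: sample a target (goal-biased) and extend
    # the state toward it; reward is the negative final goal distance.
    for _ in range(steps):
        target = goal if random.random() < 0.3 else random.uniform(-1, 1)
        state += step_size * (1 if target > state else -1)
    return -abs(goal - state)

def backprop(node, reward):
    # Add the reward to every node on the path back to the root.
    while node is not None:
        node.visits += 1
        node.value += reward
        node = node.parent

def mcts(root_state, goal, iters=200):
    root = Node(root_state)
    for _ in range(iters):
        leaf = select(root)
        for _ in range(3):          # toy expansion: sample 3 successors
            leaf.children.append(
                Node(leaf.state + random.uniform(-0.2, 0.2), leaf))
        child = random.choice(leaf.children)
        backprop(child, rrt_rollout(child.state, goal))
    return max(root.children, key=lambda n: n.visits)
```

In the actual planner, expansion alternates between discrete contact-mode nodes and continuous pose nodes, and the rollout explores the object's configuration space rather than a scalar state.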
An object trajectory is passed to Level 2 (⇩) to plan robot contact sequences on the object surface (◌ nodes). Level 2 is also based on MCTS. Our representation plans robot contact transitions rather than explicit contacts at every timestep, which greatly improves search efficiency.
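The efficiency gain comes from the size of the search space: choosing a contact independently at each of T timesteps gives C^T plans over C candidate contacts, while a plan that keeps each contact until one of k sparse relocations needs only on the order of T^k * C^(k+1) choices. A small illustrative enumeration, assuming a toy plan representation of (start_timestep, contact) segments that is not the paper's actual data structure:

```python
from itertools import combinations, product

def transition_plans(horizon, contact_points, max_relocations=1):
    """Enumerate contact plans as sparse transition sequences.

    Each plan is a list of (start_timestep, contact) segments: the robot
    keeps a contact until the next relocation timestep. With k relocations
    the space is O(T^k * C^(k+1)) instead of C^T for per-timestep choices.
    """
    plans = []
    for k in range(max_relocations + 1):
        for times in combinations(range(1, horizon), k):
            for contacts in product(contact_points, repeat=k + 1):
                plans.append(list(zip((0,) + times, contacts)))
    return plans
```

For a horizon of 4 and two candidate contacts, this yields 2 zero-relocation plans plus 12 single-relocation plans, versus 2^4 = 16 per-timestep assignments; the gap widens quickly with more contacts and longer horizons.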
The full trajectory of object motions and robot contacts is passed to Level 3 (⇩) for evaluation and control optimization.
After evaluation, Level 3 passes the reward back to the upper levels (↻). The reward is updated for every node in the path (bold nodes).
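This backup step can be sketched as a walk along parent pointers from the evaluated leaf to the root, updating the statistics of every node on the path; the node class below is a minimal stand-in, not the paper's implementation:

```python
class PathNode:
    """Minimal tree node holding MCTS statistics."""
    def __init__(self, parent=None):
        self.parent = parent
        self.visits = 0
        self.total_reward = 0.0

def backup(leaf, reward):
    # Walk parent pointers from the evaluated leaf to the root,
    # adding the Level-3 reward to every node on the path.
    node = leaf
    while node is not None:
        node.visits += 1
        node.total_reward += reward
        node = node.parent

# A three-node path: root -> mid -> leaf.
root = PathNode()
mid = PathNode(root)
leaf = PathNode(mid)
backup(leaf, 1.0)   # all three nodes receive the reward
```

The same update is applied at both upper levels, so a single Level-3 evaluation improves the value estimates of the object-motion nodes (Level 1) and the contact nodes (Level 2) that produced the trajectory.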