Episodic RRT
Stop Sampling. Start Exploring. The Episodic RRT (ERRT) revolutionizes motion planning by replacing the inefficient, random sampling of classical planners with learned, multi-step "episodes." Our deep reinforcement learning agent proactively generates coherent, branch-like paths, resulting in a planner that is dramatically faster and more robust, and that finds higher-quality solutions from the start, especially in complex, high-dimensional spaces.

Tired of Planners That Wander Aimlessly?
Classical motion planners like RRT explore by scattering thousands of random points, like a mist cloud slowly filling a room. This approach is inefficient, scales poorly in complex, high-dimensional spaces, and wastes immense computational power checking irrelevant or dead-end paths. As environments get harder, random exploration simply breaks down.
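To make the contrast concrete, here is a minimal sketch of the classical RRT loop described above; the `collision_free` helper, step size, and bookkeeping are illustrative placeholders, not ERRT code:

```python
import numpy as np

def vanilla_rrt(start, goal, collision_free, bounds, step=0.5, max_iters=5000, goal_tol=0.5):
    """Baseline RRT: each iteration adds at most one node from a blind random sample."""
    nodes = [np.asarray(start, dtype=float)]
    parents = {0: None}
    goal = np.asarray(goal, dtype=float)
    lo, hi = np.asarray(bounds[0], dtype=float), np.asarray(bounds[1], dtype=float)
    rng = np.random.default_rng()
    for _ in range(max_iters):
        q_rand = rng.uniform(lo, hi)                       # scatter a random point
        i_near = min(range(len(nodes)), key=lambda i: np.linalg.norm(nodes[i] - q_rand))
        q_near = nodes[i_near]
        direction = q_rand - q_near
        q_new = q_near + step * direction / (np.linalg.norm(direction) + 1e-9)
        if not collision_free(q_near, q_new):              # most samples are wasted here
            continue
        parents[len(nodes)] = i_near
        nodes.append(q_new)
        if np.linalg.norm(q_new - goal) < goal_tol:
            break
    return nodes, parents
```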

Introducing ERRT: Motion Planning with an AI Co-Pilot
The Episodic RRT (ERRT) flips the script. We replace the primitive of a random point with a learned exploratory episode. Our framework embeds a Deep Reinforcement Learning agent that acts as an expert explorer. Instead of sampling blindly, the agent proactively generates coherent, multi-step path segments that intelligently navigate around obstacles. ERRT transforms the search from a random process into a directed, organic growth, like a real tree extending its branches toward the light.
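In code terms, the change is in the exploration primitive: instead of drawing one random configuration per iteration, the planner asks a learned policy for an entire segment. A rough sketch, where `agent.rollout` and the node's `.q` attribute are illustrative placeholders rather than the actual ERRT API:

```python
def generate_episode(tree, agent, sample_free, nearest_node, goal):
    """One exploration primitive of ERRT: a learned, multi-step episode."""
    q_target = sample_free()                 # where should the tree expand next?
    root = nearest_node(tree, q_target)      # the closest node becomes the episode's origin
    # The policy proposes a coherent, obstacle-aware sequence of states,
    # rather than a single blind sample.
    episode_states = agent.rollout(start=root.q, goal=goal)
    return root, episode_states
```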
Key Advantages: Why ERRT is Better
Beats the Curse of Dimensionality
ERRT's focused, near-linear growth is vastly more efficient than the volumetric expansion of random sampling, allowing it to succeed where others fail in high-dimensional spaces.
By learning to avoid obstacles, our agent proposes locally valid paths from the start. This inverts the costly "sample-first, check-later" paradigm, reducing collision checks by over 99.6% in complex scenes.
An episode is an ordered sequence of states, providing its own via-points. This makes integration into the search tree trivial compared to the difficult and often unsuccessful task of connecting to a single, isolated random point.
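Because an episode already arrives as an ordered list of states, wiring it into the tree is just a chain of parent links, whereas a single isolated random sample needs its own nearest-neighbor query, steering, and collision check before it can join. A hedged sketch with a hypothetical `tree.add_node` interface:

```python
def integrate_episode(tree, origin_node, episode_states):
    """Append an episode to the search tree as a parent-linked chain of via-points."""
    parent = origin_node
    for q in episode_states:          # each state is already an ordered, locally valid via-point
        parent = tree.add_node(q, parent=parent)
    return parent                     # tip of the new branch, reused by later episodes
```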
Pushed to the Limit in Diverse Scenarios
2D: Cluttered Obstacle Fields: Navigating dense, 60x60 meter maps with approximately 400 randomly placed obstacles of varying sizes. A classic test of robust collision avoidance and pathfinding.
3D: Dense Volumetric Mazes: Exploring vast, 60x60x60 meter spaces packed with over 17,000 obstacles of different volumes. This scenario tests the planner's ability to find a path through extremely constrained 3D spaces.
6D: Complex Robotic Manipulation: Planning for a 6-DOF robotic arm among randomly placed obstacles, including thin panels. This high-dimensional scenario is a true test of scalability for real-world robotics.
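For readers who want to build environments of a similar flavor, the descriptions above can be captured in a small configuration; the field names and structure below are only an illustrative restatement of this list, not the benchmark's actual configuration format:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Scenario:
    """Illustrative restatement of the three benchmark families listed above."""
    name: str
    dof: int                            # dimensionality of the planning space
    workspace_m: Tuple[int, ...]        # environment extent in meters, where given
    n_obstacles: Optional[int]          # approximate obstacle count, where given

SCENARIOS = [
    Scenario("2D cluttered obstacle field", dof=2, workspace_m=(60, 60), n_obstacles=400),
    Scenario("3D dense volumetric maze", dof=3, workspace_m=(60, 60, 60), n_obstacles=17_000),
    Scenario("6-DOF robotic manipulation", dof=6, workspace_m=(), n_obstacles=None),
]
```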
Superior Initial Solution
Superior Optimization Performance

Full Path Demonstration
How ERRT Works: The Four-Step Exploration Loop

1. Initialize Episode:
An episode ends when it reaches its maximum length or is blocked by an obstacle; a new one then begins from a periodic randomized restart. To ensure full coverage, the algorithm episodically samples a random point and selects the nearest node in the tree as the start of the next exploration "episode." This prevents the agent from getting stuck and preserves probabilistic completeness.
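A minimal sketch of this restart step, assuming the tree is stored as a plain list of configurations (not the paper's implementation):

```python
import numpy as np

def initialize_episode(tree_nodes, sample_free):
    """Episodic restart: pick a random target and start the next episode from the
    tree node closest to it, so exploration keeps covering the whole space."""
    q_rand = sample_free()                                    # random restart target
    dists = [np.linalg.norm(q - q_rand) for q in tree_nodes]
    start_index = int(np.argmin(dists))                       # nearest existing node
    return start_index, q_rand                                # episode grows from this node
```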

2. Generate Path:
The DRL agent generates a multi-step, locally optimal path segment starting from the current node on the tree. The action path produced by the neural network serves as the control points for constructing a cubic B-spline. The arc length along this B-spline is then computed, and the curve is re-parameterized so that the knot points are equidistant with respect to arc length. This re-sampling yields a uniform distribution of knot points, makes the point density independent of the neural network's direct output, and contributes to a smoother path.
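The B-spline re-sampling described above can be approximated with standard SciPy routines; the dense-evaluation resolution and knot count below are illustrative choices, not values from the paper:

```python
import numpy as np
from scipy.interpolate import splprep, splev

def resample_episode(control_points, n_knots=10):
    """Fit a cubic B-spline through the agent's action path and re-sample it at
    (approximately) equal arc-length spacing."""
    pts = np.asarray(control_points, dtype=float)             # shape (n_points, dim)
    tck, _ = splprep(pts.T, s=0, k=3)                         # interpolating cubic B-spline

    # Evaluate densely and accumulate chord lengths to approximate arc length.
    u_dense = np.linspace(0.0, 1.0, 500)
    dense = np.stack(splev(u_dense, tck), axis=1)
    seg = np.linalg.norm(np.diff(dense, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg)])

    # Invert the arc-length map: find parameters u at equally spaced arc lengths.
    targets = np.linspace(0.0, arc[-1], n_knots)
    u_equal = np.interp(targets, arc, u_dense)
    return np.stack(splev(u_equal, tck), axis=1)              # equidistant knot points
```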

4. Update & Jump:
The algorithm checks whether the episode is complete. If the tree is near the goal, it attempts a direct "One Step Jump" to connect and find a solution. This provides a final, robust connection mechanism that overcomes the precision limits of the neural network.
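A hedged sketch of such a final connection attempt; the `jump_radius` threshold and `collision_free` helper are illustrative assumptions:

```python
import numpy as np

def try_one_step_jump(tip, goal, collision_free, jump_radius=1.0):
    """If the newest branch tip is close enough to the goal, try to connect them
    with one straight segment, bridging the residual gap the policy cannot close."""
    tip = np.asarray(tip, dtype=float)
    goal = np.asarray(goal, dtype=float)
    if np.linalg.norm(goal - tip) > jump_radius:
        return None                           # still too far; keep exploring
    if collision_free(tip, goal):
        return goal                           # solution found: append the goal to the branch
    return None
```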