This project demonstrates the Value Iteration algorithm applied to the classic Frozen Lake problem from Gymnasium. It provides a robust implementation for solving Markov Decision Processes (MDPs) in both deterministic and stochastic environments. The project includes scripts for headless computation and interactive Pygame-based rendering, allowing users to visualize how an optimal policy is derived and executed on custom map configurations.
value_iteration.py: Runs the algorithm on a custom map without GUI overhead.
value_iteration_pygame.py: Interactive script exposing the is_slippery flag for stochastic transitions.
Configurable gamma (discount factor) and convergence thresholds.
Custom maps defined through the MY_MAP constant.
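The agent must navigate from the start tile (S) to the goal (G) across frozen tiles while avoiding holes (H). In slippery mode (is_slippery=True), movement is stochastic: under Gymnasium's default dynamics the agent moves in the intended direction with probability 1/3 and slides in each of the two perpendicular directions with probability 1/3.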
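The difference between the two movement modes can be sketched as a small transition function. This is a minimal illustration, not the project's code: the names MOVES, EXAMPLE_MAP, step, and transitions are hypothetical, the map is a placeholder rather than the project's MY_MAP, and the 1/3 split assumes Gymnasium's documented default slippery behaviour.

```python
# Action order follows Gymnasium's FrozenLake convention:
# 0 = left, 1 = down, 2 = right, 3 = up.
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}

# Illustrative map only (assumption): S = start, F = frozen, H = hole, G = goal.
EXAMPLE_MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]


def step(grid, row, col, action):
    """Move one cell in the given direction; stay in place at the border."""
    d_row, d_col = MOVES[action]
    new_row = min(max(row + d_row, 0), len(grid) - 1)
    new_col = min(max(col + d_col, 0), len(grid[0]) - 1)
    return new_row, new_col


def transitions(grid, row, col, action, is_slippery):
    """Return a list of (probability, (row, col)) outcomes for one action."""
    if grid[row][col] in "HG":
        # Holes and the goal are terminal: the agent stays where it is.
        return [(1.0, (row, col))]
    if not is_slippery:
        # Deterministic mode: the chosen action always succeeds.
        return [(1.0, step(grid, row, col, action))]
    # Slippery mode: the intended direction or either perpendicular
    # direction, each with probability 1/3.
    candidates = [(action - 1) % 4, action, (action + 1) % 4]
    return [(1.0 / 3.0, step(grid, row, col, a)) for a in candidates]
```

For example, transitions(EXAMPLE_MAP, 0, 0, 2, True) yields three equally likely outcomes for "right" from the start tile: sliding down, moving right, or bumping into the top border and staying put.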
Choose between a fast, headless script for quick convergence checks or a Pygame-based visualizer to watch the agent's behavior in real time.
Toggle the is_slippery flag to switch between deterministic and stochastic movement and test how well the computed policy holds up.
Direct access to tuning parameters like the discount factor (gamma) and convergence tolerance allows for experimentation with planning horizons and accuracy.
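To make the role of these two parameters concrete, here is a minimal value iteration sketch. It is not the project's implementation: the function name value_iteration, the parameter name theta, and the three-state toy MDP are assumptions made for illustration. The transition table uses the shape P[s][a] = [(prob, next_state, reward, terminated), ...], which is the layout Gymnasium's FrozenLake is understood to expose as env.unwrapped.P.

```python
def value_iteration(P, gamma=0.9, theta=1e-8):
    """Return (V, policy) for a table P[s][a] of
    (prob, next_state, reward, terminated) tuples."""
    n_states = len(P)
    V = [0.0] * n_states

    def q_value(s, a):
        # Expected return of taking action a in state s, then following V.
        # Terminal successors contribute no future value.
        return sum(
            prob * (reward + gamma * (0.0 if done else V[s2]))
            for prob, s2, reward, done in P[s][a]
        )

    while True:
        delta = 0.0
        for s in range(n_states):
            best = max(q_value(s, a) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:  # stop once the largest update is below tolerance
            break

    # Greedy policy with respect to the converged values.
    policy = [max(P[s], key=lambda a: q_value(s, a)) for s in range(n_states)]
    return V, policy


# Toy chain (hypothetical): 0 -> 1 -> 2 (goal). Action 0 stays, action 1 advances.
TOY_P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 0.0, False)]},
    1: {0: [(1.0, 1, 0.0, False)], 1: [(1.0, 2, 1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},
}

V, policy = value_iteration(TOY_P, gamma=0.9, theta=1e-8)
```

A lower gamma shrinks the value of distant rewards, so states far from the goal get values closer to zero; a looser theta ends the sweep loop earlier at the cost of less accurate values.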
The implementation shows Dynamic Programming solving a finite MDP with a known transition model. The resulting policy guides the agent across hole-ridden maps and changes its path depending on whether the environment is deterministic or slippery.
Achieves optimal policy convergence within seconds for standard grid sizes.
Provides a clear, modifiable sandbox for understanding Reinforcement Learning fundamentals.
Visual tools allow immediate verification of policy safety and efficiency.
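A policy can also be checked without Pygame by printing it as a grid of arrows. The helper below is a hypothetical sketch, not part of the project's scripts; it assumes states are numbered row by row (state = row * n_cols + col) and that actions follow Gymnasium's FrozenLake order (0 = left, 1 = down, 2 = right, 3 = up).

```python
ARROWS = {0: "<", 1: "v", 2: ">", 3: "^"}


def render_policy(grid, policy):
    """Return the map as text, with an arrow on every non-terminal tile."""
    n_cols = len(grid[0])
    lines = []
    for r, row in enumerate(grid):
        cells = [
            # Keep holes and the goal visible; show the chosen action elsewhere.
            tile if tile in "HG" else ARROWS[policy[r * n_cols + c]]
            for c, tile in enumerate(row)
        ]
        lines.append("".join(cells))
    return "\n".join(lines)


# Tiny illustrative map and policy (both made up for this example).
print(render_policy(["SF", "HG"], [2, 1, 0, 0]))
```

Arrows that point straight at an H tile in deterministic mode are an immediate sign that something is wrong with the computed policy.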
This project serves as a foundational implementation of Value Iteration, bridging the gap between theoretical MDP concepts and practical code. By offering both computational and visual interfaces, it enables a deeper understanding of how agents plan and adapt in uncertain environments. The modular design allows for easy extension to more complex grid worlds and algorithms.