Optimising LMAPF guidance graphs using Evolutionary algorithms: Advice needed [R]
Our take
The quest to optimize Multi-Agent Path Finding (MAPF), particularly in the challenging Lifelong MAPF (LMAPF) setting, sits at the intersection of algorithmic efficiency and real-world application. This dissertation work, detailed in the recent Reddit post, takes a clever approach: using evolutionary algorithms to tune guidance graphs (the edge weights that determine path costs) without altering the core MAPF algorithm itself. It's a compelling idea, echoing the broader trend of meta-optimization across AI domains. The challenge, as the author rightly points out, lies in the inherent difficulties of evolutionary algorithms, specifically exploring enough of the search space while still reliably selecting the fitter candidates. This mirrors challenges in other areas, such as the recent "Dev Log on Steam Recommender[P]", which highlights the iterative nature of algorithm refinement, a process that often requires significant time and computational resources. And, as with submissions to conferences like ECCV 2026, mentioned in "ECCV 2026 camera-ready deadline: June 27 or June 30? [D]", the efficient handling of data and resources is paramount to success.
The issues raised by the author are insightful. With 5000-timestep simulations, the same guidance graph evaluated on different seeds gives fitness scores with a coefficient of variation (CV) of only 0.006, and the author notes that graphs derived from the same parent would need to differ by much more than that for selection to be reliable. In other words, the mutations are not yet producing differences large enough to stand out from seed noise. The computational cost (roughly 30 seconds per candidate, so 300 seconds for a generation of 10) exacerbates the problem, making it hard to iterate quickly and test a wider range of mutations. The author's experimentation with different mutation strategies reveals a crucial point: simply generating random changes isn't enough. The third strategy, which biases weights along shortest paths between node pairs, shows promise, increasing throughput for high agent counts by mitigating congestion (while decreasing it for low agent counts), but the author attributes that success to the mutation strategy itself, not to the evolutionary process. This suggests that the selection pressure applied by the evolutionary algorithm may be too weak, or that the fitness function isn't adequately capturing the desired behavior.
The dissertation's focus on guidance graph optimization is particularly relevant to the broader landscape of AI-powered data management. It's a recognition that the underlying structure of data—how it’s connected and prioritized—plays a crucial role in the efficiency of algorithms operating on that data. This resonates with the movement toward AI-native spreadsheet technology, where the underlying data structures and algorithms are deeply intertwined to maximize productivity. The challenge, as the author experiences, is finding the right balance between customizing the data structure (the guidance graph) and maintaining the flexibility and generalizability of the core algorithm (the MAPF solver). The need to find a mutation strategy that generates high enough variation, while also producing beneficial offspring, highlights the delicate interplay between exploration and exploitation that’s central to any effective evolutionary algorithm.
Ultimately, the author's doubts about the viability of this approach are understandable, but their careful analysis and experimentation demonstrate a strong grasp of the underlying principles. The key moving forward will likely involve refining the fitness function to better reflect the desired performance characteristics, exploring more sophisticated mutation strategies (perhaps incorporating a degree of learning or memory), and potentially increasing the population size to promote greater diversity. The question remains: can an evolutionary algorithm truly unlock the potential of guidance graphs in LMAPF, or are we reaching the limits of what can be achieved through this approach? The answer will likely depend on a deeper understanding of the complex interplay between the algorithm, the graph, and the underlying problem structure.
Hello,
I'm currently working on my dissertation and feel like I could really use some advice from someone who looks at the problem with fresh eyes. I appreciate all input.
The Problem:
Multi-Agent Path Finding (MAPF) is the problem of finding paths for several agents to their destinations. Lifelong MAPF (LMAPF) is the same, except that an agent is assigned a new task as soon as it completes its current one. For my dissertation (as is usual in research), agents move on a grid-like graph and time is discrete. At each timestep an agent can move to an adjacent tile or wait. A good LMAPF algorithm creates paths that maximise the average number of jobs completed per timestep.
Some LMAPF algorithms can also work on weighted graphs where each edge to an adjacent node (or to the node itself) has its own cost. Such a graph is called a guidance graph, and the choice of edge weights influences which paths the LMAPF algorithm creates, which in turn affects throughput.
My supervisor wanted to explore whether evolutionary algorithms are suitable for finding a guidance graph that improves throughput without changing the underlying LMAPF algorithm. A guidance graph is scenario-specific, meaning it is optimised for a specific LMAPF algorithm, map, and agent count.
My algorithm so far:
So far I've implemented a very basic evolutionary algorithm. An initial population of guidance graphs is randomly initialised (limited to 10 at the moment). Each candidate is then plugged into the LMAPF algorithm for a fixed number of timesteps, and the completed jobs are counted to give that candidate's fitness score. The top 2 candidates are selected and the rest are discarded. The top candidates are used to create a new set of candidates by mutation (no crossover). These steps are repeated indefinitely.
Issues I've had so far:
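The loop described above can be sketched as follows. This is a minimal illustration, not the author's code: `evolve`, `toy_fitness` and `toy_mutate` are hypothetical names, the toy fitness stands in for running the LMAPF simulator and counting completed jobs, and carrying the selected parents over into the next generation is an assumption the post does not confirm.

```python
import random


def evolve(population, fitness, mutate, n_parents=2, generations=5, rng=None):
    """Truncation selection without crossover.

    `fitness(graph)` and `mutate(graph, rng)` are hypothetical callables;
    in the real setting `fitness` would run the LMAPF simulation.
    """
    rng = rng or random.Random(0)
    population = list(population)
    best_history = []
    for _ in range(generations):
        # Evaluate every candidate once, then sort best-first.
        scored = sorted(((fitness(g), g) for g in population),
                        key=lambda pair: pair[0], reverse=True)
        parents = [g for _, g in scored[:n_parents]]
        best_history.append(scored[0][0])
        # Assumption: parents survive unchanged; the rest are mutated copies.
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(len(population) - n_parents)]
        population = parents + children
    return max(population, key=fitness), best_history


def toy_fitness(weights):
    # Placeholder for the simulator: higher is better, best at all weights == 1.0.
    return -sum((w - 1.0) ** 2 for w in weights)


def toy_mutate(weights, rng):
    # Perturb one randomly chosen weight by a small random amount.
    child = list(weights)
    i = rng.randrange(len(child))
    child[i] += rng.uniform(-0.5, 0.5)
    return child


rng = random.Random(42)
start = [[rng.uniform(0.0, 2.0) for _ in range(8)] for _ in range(10)]
best, history = evolve(start, toy_fitness, toy_mutate, rng=rng)
```

With a deterministic fitness and surviving parents, `history` can never decrease. Once the fitness is noisy (different seeds per evaluation), that guarantee disappears, which is the problem described next.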
The simulation takes a seed and is deterministic for a given seed. The seed determines which nodes the jobs appear on. Using the same guidance graph with different seeds yields different fitness scores. The longer the simulation, the lower the coefficient of variation (standard deviation / mean); for 5000 steps the CV is 0.006. For the selection of the best candidates to be somewhat reliable, guidance graphs derived from the same parent and run on different seeds should yield throughputs with a much higher CV than 0.006. You could argue that, given enough time, the best candidate will statistically tend towards a better guidance graph, but if 9 out of 10 of the candidates I create are worse than the best of the last generation, then the solution will tend to get worse with each generation.
It seems there are so many ingredients of a working evolutionary algorithm that I am missing: I need a mutation strategy that creates solutions with enough variation, but that produces better offspring more often than once in a blue moon. Also, simulating 5000 timesteps takes roughly 30 seconds, so 300 seconds for one tiny generation of 10 candidates. If my guidance graph is a 25x25 grid -> 625 tiles -> 3125 weights. If my mutation strategy changes 10 weights at a time, it will take years to go through enough iterations to even touch every weight once. If the mutation strategy changes more than 10 weights at a time, the chance of bad changes cancelling out good ones increases.
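The comparison described above (seed noise versus the spread between candidates) can be made concrete with a small sketch. The function names are hypothetical and the job counts are made-up illustrative numbers, not measurements from the post.

```python
from statistics import mean, stdev


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean."""
    return stdev(samples) / mean(samples)


def spread_to_noise(candidate_means, seed_cv):
    """CV between candidates divided by the CV caused by seeds alone.

    Values well above 1 suggest candidates differ by more than seed noise;
    values below 1 suggest ranking them is close to a coin flip.
    """
    return coefficient_of_variation(candidate_means) / seed_cv


# Illustrative only: completed jobs for ONE graph on five different seeds.
same_graph_runs = [1000, 1006, 994, 1003, 997]
seed_cv = coefficient_of_variation(same_graph_runs)

# Illustrative only: mean completed jobs for five sibling graphs.
sibling_means = [1000, 1001, 999, 1002, 998]
ratio = spread_to_noise(sibling_means, seed_cv)
```

In this made-up example the ratio is below 1, so picking the "top 2" would mostly select on seed luck rather than on graph quality.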
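The figures in the paragraph above can be checked with a few lines of arithmetic. The constants come from the post; the final step is a best-case floor under an assumption of my own, namely that a single lineage keeps every mutation and each mutation touches 10 weights that were never touched before.

```python
import math

GRID_SIDE = 25
MOVES_PER_TILE = 5         # four neighbours plus waiting, matching the post's 3125
SECONDS_PER_SIM = 30       # one 5000-timestep simulation
POPULATION = 10
WEIGHTS_PER_MUTATION = 10

n_tiles = GRID_SIDE ** 2
n_weights = n_tiles * MOVES_PER_TILE
seconds_per_generation = SECONDS_PER_SIM * POPULATION

# Best case: every generation's mutation is kept and hits 10 fresh weights.
min_generations = math.ceil(n_weights / WEIGHTS_PER_MUTATION)
min_hours = min_generations * seconds_per_generation / 3600
```

This floor only covers touching each weight once. Finding a good value for each weight needs many evaluated attempts, most of which are rejected or drowned in seed noise, so the practical cost is far higher than the floor suggests.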
Mutation strategies I've tried are:
1. Iterate through each weight. Each has a certain chance of getting mutated by a random amount.
2. Select n tiles. Mutate the 3x3 area around each selected tile. Each tile in the area gets the same changes.
3. Create n pairs of nodes. Calculate the shortest path connecting the nodes of each pair, and lower the weights of the edges along that path in one direction while increasing the weights in the opposite direction.
The third method has worked best so far, decreasing throughput for low agent counts but increasing throughput for high agent counts by avoiding congestion. However, I can't attribute this "success" to the evolutionary algorithm at all, only to the mutation strategy. The other strategies have only produced worse results than a guidance graph with uniform weights.
My supervisor is convinced that there is a way to make this work, but I have doubts. Any advice would be much appreciated.
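The third mutation strategy can be sketched as below. This is an illustration under stated assumptions, not the author's implementation: the graph is an obstacle-free grid stored as a dict keyed by `(from_tile, to_tile)`, the shortest path is found by hop count with breadth-first search (the post does not say whether current weights are used), and `delta` and `min_weight` are hypothetical parameters.

```python
import random
from collections import deque


def grid_neighbours(tile, width, height):
    x, y = tile
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)


def make_uniform_graph(width, height, weight=1.0):
    """Directed edge weights; the self-loop (tile, tile) models waiting."""
    graph = {}
    for x in range(width):
        for y in range(height):
            tile = (x, y)
            graph[(tile, tile)] = weight
            for nb in grid_neighbours(tile, width, height):
                graph[(tile, nb)] = weight
    return graph


def shortest_path(start, goal, width, height):
    """Breadth-first search by hop count (assumption: no obstacles)."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        tile = queue.popleft()
        if tile == goal:
            break
        for nb in grid_neighbours(tile, width, height):
            if nb not in parent:
                parent[nb] = tile
                queue.append(nb)
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]


def mutate_along_paths(graph, width, height, n_pairs, delta, rng, min_weight=0.1):
    """Return a mutated copy; the parent graph is left untouched."""
    child = dict(graph)
    tiles = [(x, y) for x in range(width) for y in range(height)]
    for _ in range(n_pairs):
        start, goal = rng.sample(tiles, 2)
        path = shortest_path(start, goal, width, height)
        for a, b in zip(path, path[1:]):
            child[(a, b)] = max(min_weight, child[(a, b)] - delta)  # along the path
            child[(b, a)] = child[(b, a)] + delta                   # against it
    return child


parent_graph = make_uniform_graph(5, 5)
child_graph = mutate_along_paths(parent_graph, 5, 5, n_pairs=1, delta=0.2,
                                 rng=random.Random(1))
```

Each pair effectively carves a one-way "lane" into the graph, which may be why this operator helps with congestion at high agent counts regardless of what the selection step does.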