Building a 9-ball AI player: Candidate generation for direct cut shots [P]
![Building a 9-ball AI player: Candidate generation for direct cut shots [P]](/_next/image?url=https%3A%2F%2Fpreview.redd.it%2Faw3nvcz9b7zg1.png%3Fwidth%3D640%26crop%3Dsmart%26auto%3Dwebp%26s%3D290f1b7c9c7f46325041938b2b4069997e202143&w=3840&q=75)
| I'm building a 9-ball player to help with pattern play. There are often many ways to make the next ball, sometimes into more than one obvious pocket. Which one you should choose depends on the probability of making the shot AND of ending up in a favorable spot for the next shot, one that is also amenable to getting good position for the shot after. To that end, I have built the following components:
The ground truth: pooltool

Pool physics is well-modeled but expensive. I use the pooltool Python library, a solid open-source billiards simulator with accurate ball-cushion-pocket-felt interactions. A single shot takes ~5–15 ms to simulate end-to-end on one CPU thread for the typical 1–3 object-ball layouts that come up in shot evaluation; full racks (9 object balls) push that to ~20–50 ms because there are more pairwise collisions to track.

Sounds fast until you do the math. For each layout I want candidate shots into 6 pockets, and each pocket has a 5-dimensional parameter space to search: speed, aim angle, cue stick elevation, side spin, and follow/draw. A naive grid sweep over even a coarse 10 steps per dimension is 10^5 = 100K combinations × 10 ms ≈ 17 minutes per pocket per decision. Iterative optimizers like CMA-ES bring that down to ~500–1000 sims per pocket, but that's still ~5–10 seconds per pocket, or ~30–60 seconds per layout. For training a value network on millions of decisions, that's months of compute.

Faster evaluation of candidates

Shot selection needs to know whether a shot will go without simulating every possible shot. But we don't need the final position of the table just yet. I approached the problem by splitting the shot into what the object ball needs to do and how to hit the cue ball to accomplish that. So the first component for shot making is an [...] Then I created a [...] That was an improvement, but it has holes due to discretization. To cover these holes, I built a [...]

The beauty of it is that I can use the shot index to get a decent starting parameter set for a shot, apply small perturbations across the different parameters, and evaluate them in a batch using the throw model on a GPU really fast. The speedup in my setup was around 10,000x compared to simulating all those shots through the physics engine, which makes a world of difference in generating enough self-play data. A batch of 1000 candidate shots takes 1 ms to evaluate.
Compare that to 1000 simulations × 10 ms on average.

I then cluster all the shots that are predicted to fall within the acceptance window of the intended pocket by bucketing on speed, spin, and draw. I evaluate representatives from each cluster in the physics engine with noisy simulation that adds execution noise to the shots. We don't want to find that 1-in-a-million shot that can't be executed reliably. Then I use the maximum expected value of the table state after the shot using the [...] Given that I still run physics simulations once I find my candidates, the end-to-end speedup was around 50–100x.

Shot selection visualization

To make things more concrete, I set up an 8-9 ball layout where the cue ball is in the center of the table, the 8-ball is toward the top left, and the 9-ball is at the bottom rail. The colors represent p(win) given the 9-ball position (provided the 9-ball is not moved during the shot). For this post, I simulated each of the 10 selected shots 20 times. 6 of the 10 shots were made all 20 times, 3 went 19/20, and 1 went 15/20. The colors of the cue ball paths reflect the make rate over those 20 sims. I only plotted one of the 20 noisy sims for each of the 10 shots; the others end up pretty close. The black region around the 9-ball covers positions less than 1 ball away from the 9-ball, which are invalid for the cue ball because it would intrude on the 9-ball's space.

In this post I only talked about direct shots, but I also have templated bank shots, kick shots, caroms, and combination shots, which are baked into the p(win) heatmap plot. Obviously carom and combination shots don't apply here for the 9-ball-only case.

What's next?

I'm working on curriculum learning. The p(win) model using only the 9-ball is straightforward: pocket the 9 and you win (if you don't scratch). If you scratch, you lose, since any half-decent opponent will make the 9-ball with ball in hand. If you miss, the reward is (1 - p(win)) evaluated from the resulting state, since the opponent shoots from it.
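A minimal sketch of the 9-ball-only value rules just described, combined with averaging over the noisy simulations of one candidate shot. The names `ShotOutcome`, `outcome_value`, `expected_value`, and the `p_win` callable are hypothetical and are not taken from my actual code or from pooltool; the table state encoding is left opaque on purpose.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass(frozen=True)
class ShotOutcome:
    """Result of one noisy simulation (hypothetical structure)."""
    potted_nine: bool
    scratched: bool
    next_state: Any = None  # table state left for the opponent after a miss


def outcome_value(outcome: ShotOutcome, p_win: Callable[[Any], float]) -> float:
    """Value of a single outcome for the shooter under the 9-ball-only rules."""
    if outcome.scratched:
        # Opponent gets ball in hand and is assumed to make the 9.
        return 0.0
    if outcome.potted_nine:
        return 1.0
    # Miss: the opponent shoots next, so our value is 1 minus their p(win).
    return 1.0 - p_win(outcome.next_state)


def expected_value(outcomes: Sequence[ShotOutcome],
                   p_win: Callable[[Any], float]) -> float:
    """Average outcome value over the noisy simulations of one candidate."""
    if not outcomes:
        raise ValueError("need at least one simulated outcome")
    return sum(outcome_value(o, p_win) for o in outcomes) / len(outcomes)


def constant_model(state: Any) -> float:
    """Stand-in for the learned p(win) model (assumption: always 0.3)."""
    return 0.3


# Four noisy sims of the same candidate: two makes, one miss, one scratch.
sims = [
    ShotOutcome(potted_nine=True, scratched=False),
    ShotOutcome(potted_nine=True, scratched=False),
    ShotOutcome(potted_nine=False, scratched=False, next_state="after miss"),
    ShotOutcome(potted_nine=False, scratched=True),
]
value = expected_value(sims, constant_model)
```

Ranking cluster representatives then amounts to taking the candidate with the largest `expected_value`. Checking `scratched` before `potted_nine` encodes the assumption that a scratch loses even when the 9 drops.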
I have simulated ~100k shots with full shot selection options and used 4x symmetry for the p(win) model. As my model updates, I redo the shot selection for any shot that isn't a 100% make, since the update could lead to a different shot selection or safety position. Once the single-ball scenario is "solved", I'll move to 2-ball scenarios, where making the on-ball results in a solved state whose value we look up from the model. Misses get re-evaluated between iterations of the model. I'll advance the curriculum as the model masters the <n-ball scenarios, then master n-ball setups all the way up to 9.

I tried lots of things that didn't work, and some that did. For example, the bank model improved quite a bit when I gave it the ghost pocket angle (based on mirroring) as a feature (physics-informed ML). Happy to share details about any of it if there's interest. |
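For anyone curious about the ghost pocket angle feature mentioned above, here is a rough sketch of the idealized mirror construction: reflect the target pocket across the cushion line and take the direction from the object ball to that mirrored point. The function names, the axis-aligned rail representation, and the coordinates are assumptions made for illustration, not my actual feature code. Real banks deviate from this ideal because of speed, spin, and cushion compression, which is presumably what the learned model corrects for.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def ghost_pocket(pocket: Point, rail_axis: str, rail_coord: float) -> Point:
    """Mirror a pocket across an axis-aligned cushion line.

    rail_axis == "x" means the cushion is the vertical line x = rail_coord;
    rail_axis == "y" means the cushion is the horizontal line y = rail_coord.
    """
    px, py = pocket
    if rail_axis == "x":
        return (2.0 * rail_coord - px, py)
    if rail_axis == "y":
        return (px, 2.0 * rail_coord - py)
    raise ValueError("rail_axis must be 'x' or 'y'")


def ghost_pocket_angle(ball: Point, pocket: Point,
                       rail_axis: str, rail_coord: float) -> float:
    """Angle in radians from the object ball toward the mirrored pocket."""
    gx, gy = ghost_pocket(pocket, rail_axis, rail_coord)
    return math.atan2(gy - ball[1], gx - ball[0])


# Object ball at the origin, pocket at (2, 0), bank off the rail y = 1.
# The mirrored pocket sits at (2, 2), so the ideal line is at 45 degrees.
mirrored = ghost_pocket((2.0, 0.0), "y", 1.0)
angle = ghost_pocket_angle((0.0, 0.0), (2.0, 0.0), "y", 1.0)
```

Feeding this angle to the model as an input gives it the geometric baseline directly, so it only has to learn the deviation from the ideal mirror line.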