1 min read · from Machine Learning
[P] I trained an AI to play Resident Evil 4 Remake using Behavioral Cloning + LSTM
Our take
In this project, I trained an AI to play the village section of Resident Evil 4 Remake using Behavioral Cloning with an LSTM. I recorded gameplay trajectories (running, shooting, reloading, dodging) and trained a model to imitate my decisions. The LSTM lets the AI carry memory across time steps instead of reacting only to the current frame. It handled single enemies reasonably well but struggled with the fight-or-flee decision when multiple enemies were on screen.
I recorded gameplay trajectories in RE4's village (running, shooting, reloading, dodging) and used Behavioral Cloning to train a model to imitate my decisions. I added an LSTM so the AI could carry memory across time steps, not just react to the current frame.

The most interesting result: the AI handled single enemies reasonably well, but struggled with the fight-or-flee decision when multiple enemies were on screen simultaneously. That nuance was hard to imitate without more data.

Full video breakdown on YouTube. Source code and notebooks here: https://github.com/paulo101977/notebooks-rl/tree/main/re4

Happy to answer questions about the approach.
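Training a recurrent policy with Behavioral Cloning usually means feeding the network short runs of consecutive (observation, action) pairs instead of isolated frames, so the LSTM has something to remember. Below is a minimal sketch of that data-preparation step only. It assumes each recorded trajectory is a list of (observation, action) pairs; the function name `make_sequences`, the window length, and the action labels are illustrative and are not taken from the linked repository.

```python
from typing import List, Sequence, Tuple

# One recorded step: (observation features, action label). Hypothetical layout.
Step = Tuple[Sequence[float], str]


def make_sequences(
    trajectory: List[Step], seq_len: int, stride: int = 1
) -> List[Tuple[List[Sequence[float]], List[str]]]:
    """Slice one trajectory into overlapping fixed-length windows.

    Each window keeps observations and actions aligned index by index,
    so step t of the observations is paired with the action taken at step t.
    """
    if seq_len <= 0 or stride <= 0:
        raise ValueError("seq_len and stride must be positive")
    windows = []
    for start in range(0, len(trajectory) - seq_len + 1, stride):
        chunk = trajectory[start:start + seq_len]
        observations = [obs for obs, _ in chunk]
        actions = [act for _, act in chunk]
        windows.append((observations, actions))
    return windows


# Toy trajectory with made-up one-dimensional observations.
demo = [
    ([0.0], "run"),
    ([0.1], "run"),
    ([0.2], "aim"),
    ([0.3], "shoot"),
    ([0.4], "reload"),
]
windows = make_sequences(demo, seq_len=3, stride=1)
```

With five steps and a window of three, this yields three overlapping windows. A sequence model would then be trained to predict each window's actions from its observations; the window length controls how far back the policy can look.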
Related Articles
- Training an AI to play Resident Evil Requiem using Behavior Cloning + HG-DAgger [P]

  Code of Project: https://github.com/paulo101977/notebooks-rl/tree/main/re_requiem

  I’ve been working on training an agent to play a segment of Resident Evil Requiem, focusing on a fast-paced, semi-linear escape sequence with enemies and time pressure. Instead of going fully reinforcement learning from scratch, I used a hybrid approach:

  - Behavior Cloning (BC) for initial policy learning from human demonstrations
  - HG-DAgger to iteratively improve performance and reduce compounding errors

  The environment is based on gameplay capture, where I map controller inputs into a discretized action space. Observations are extracted directly from frames (with some preprocessing), and the agent learns to mimic and then refine behavior over time.

  One of the main challenges was the instability early on, especially when the agent deviates slightly from the demonstrated trajectories (classic BC issue). HG-DAgger helped a lot by correcting those off-distribution states. Another tricky part was synchronizing actions with what’s actually happening on screen, since even small timing mismatches can completely break learning in this kind of game.

  After training, the agent is able to:

  - Navigate the sequence consistently
  - React to enemies in real time
  - Recover from small deviations (to some extent)

  I’m still experimenting with improving robustness and generalization (right now it’s quite specialized to this segment). Happy to share more details (training setup, preprocessing, action space, etc.) if anyone’s interested.

  submitted by /u/AgeOfEmpires4AOE4
- I Trained an AI to Beat Final Fight… Here’s What Happened [P]

  Hey everyone, I’ve been experimenting with Behavior Cloning on a classic arcade game (Final Fight), and I wanted to share the results and get some feedback from the community.

  The setup is fairly simple: I trained an agent purely from demonstrations (no reward shaping initially), then evaluated how far it could go in the first stage. I also plan to extend this with GAIL + PPO to see how much performance improves beyond imitation.

  A couple of interesting challenges came up:

  - Action space remapping (MultiBinary → emulator input)
  - Trajectory alignment issues (obs/action offset bugs 😅)
  - LSTM policy behaving differently under evaluation vs manual rollout
  - Managing rollouts efficiently without loading everything into memory

  The agent can already make some progress, but still struggles with consistency and survival.

  I’d love to hear thoughts on:

  - Improving BC performance with limited trajectories
  - Best practices for transitioning BC → PPO
  - Handling partial observability in these environments

  Here’s the code if you want to see the full process and results: notebooks-rl/final_fight at main · paulo101977/notebooks-rl

  Any feedback is very welcome!

  submitted by /u/AgeOfEmpires4AOE4
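The "MultiBinary → emulator input" remapping mentioned in the Final Fight post can be pictured as translating a small per-button 0/1 vector from the policy into the full button array an emulator expects. The sketch below is only an illustration of that idea. The button names, their order, the `BUTTON_MAP` table, and the rule that cancels opposing directions are all assumptions made for the example, not the layout used in the linked notebooks.

```python
from typing import List

# Hypothetical reduced action set the policy predicts (one bit per entry).
POLICY_BUTTONS = ["LEFT", "RIGHT", "UP", "DOWN", "ATTACK", "JUMP"]

# Hypothetical full button order expected by the emulator.
EMULATOR_BUTTONS = [
    "B", "Y", "SELECT", "START", "UP", "DOWN",
    "LEFT", "RIGHT", "A", "X", "L", "R",
]

# Hypothetical mapping from policy-level names to emulator buttons.
BUTTON_MAP = {
    "LEFT": "LEFT", "RIGHT": "RIGHT", "UP": "UP", "DOWN": "DOWN",
    "ATTACK": "Y", "JUMP": "B",
}


def to_emulator_input(policy_action: List[int]) -> List[int]:
    """Expand a MultiBinary-style policy action into the emulator's button array."""
    if len(policy_action) != len(POLICY_BUTTONS):
        raise ValueError("policy_action has the wrong length")
    pressed = {
        BUTTON_MAP[name]
        for name, bit in zip(POLICY_BUTTONS, policy_action)
        if bit
    }
    # Independent bits can request LEFT+RIGHT or UP+DOWN together;
    # this sketch simply cancels such contradictory pairs.
    for first, second in (("LEFT", "RIGHT"), ("UP", "DOWN")):
        if first in pressed and second in pressed:
            pressed -= {first, second}
    return [1 if name in pressed else 0 for name in EMULATOR_BUTTONS]


walk_and_attack = to_emulator_input([0, 1, 0, 0, 1, 0])  # RIGHT + ATTACK
conflicting = to_emulator_input([1, 1, 0, 0, 0, 1])      # LEFT + RIGHT + JUMP
```

Keeping the mapping in one table makes it easier to check that the actions recorded in demonstrations and the actions replayed at evaluation time go through the same translation.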