
[P] Vibecoded on a home PC: building a ~2700 Elo browser-playable neural chess engine with a Karpathy-inspired AI-assisted research loop

I built Autochess NN, a browser-playable neural chess engine that started as a personal experiment in understanding AlphaZero-style systems by actually building one end to end.

This project was unapologetically vibecoded - but not in the “thin wrapper around an API” sense. I used AI heavily as a research/coding assistant in a Karpathy-inspired autoresearch workflow: read papers, inspect ideas, prototype, ablate, optimize, repeat. The interesting part for me was seeing how far that loop could go on home hardware (just an ordinary gaming RTX 4090).

Current public V3:

  • residual CNN + transformer
  • learned thought tokens
  • ~16M parameters
  • 19-plane 8x8 input
  • 4672-move policy head + value head
  • trained on 100M+ positions
  • pipeline: supervised pretraining on 2200+-rated Lichess games -> Syzygy endgame fine-tuning -> self-play RL with search distillation
  • CPU inference + shallow 1-ply lookahead / quiescence (below 2ms)
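
The 19-plane input and 4672-way policy head aren't spelled out in the post, but the numbers match the standard AlphaZero-style 8x8 encoding (64 from-squares x 73 move types = 4672 logits). A minimal sketch of those shapes, with hypothetical plane assignments (the exact layout is my assumption, not the post's):

```python
import numpy as np

# Hypothetical 19-plane board encoding: 12 piece planes
# (6 piece types x 2 colors) + 7 auxiliary planes such as
# side-to-move, castling rights, and en-passant file.
N_PLANES = 19
board = np.zeros((N_PLANES, 8, 8), dtype=np.float32)
board[0, 1, :] = 1.0   # e.g. plane 0 = white pawns on rank 2
board[18, :, :] = 1.0  # e.g. plane 18 = side-to-move (all ones = white)

# AlphaZero-style flat policy: 64 from-squares x 73 move types = 4672.
N_MOVE_TYPES = 73
POLICY_SIZE = 64 * N_MOVE_TYPES

def policy_index(from_square: int, move_type: int) -> int:
    """Flatten a (from-square, move-type) pair into a policy-head index."""
    return from_square * N_MOVE_TYPES + move_type
```

The two heads then share the CNN/transformer trunk: the policy head emits `POLICY_SIZE` logits (masked to legal moves), the value head a single scalar.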

I also wrapped it in a browser app so the model is inspectable, not just benchmarked: play vs AI, board editor, PGN import/replay, puzzles, and move analysis showing top-move probabilities and how the “thinking” step shifts them.
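
The shallow 1-ply lookahead mentioned above can be sketched as a greedy pass over the value head: play each legal move, score the child position from the opponent's perspective, and keep the move that leaves the opponent worst off. This is a hypothetical sketch; `legal_moves` and `value_after` stand in for the real engine internals:

```python
def one_ply_lookahead(position, legal_moves, value_after):
    """Pick the move whose resulting position is worst for the opponent.

    value_after(position, move) -> opponent's expected score in [-1, 1]
    after the move is played; lower is better for the side to move.
    """
    best_move, best_score = None, float("inf")
    for move in legal_moves(position):
        score = value_after(position, move)  # value head on the child position
        if score < best_score:
            best_move, best_score = move, score
    return best_move

# Toy usage with stubbed evaluations (not real engine output):
moves = {"start": ["e4", "d4", "a3"]}
values = {("start", "e4"): -0.2, ("start", "d4"): -0.1, ("start", "a3"): 0.3}
best = one_ply_lookahead("start", lambda p: moves[p], lambda p, m: values[(p, m)])
```

A quiescence extension would simply keep recursing on the capture moves of the best child instead of stopping at depth one.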

What surprised me is that, after a lot of optimization, this may have ended up being unusually compute-efficient for its strength - possibly one of the more efficient hobbyist neural chess engines above 2500 Elo. I’m saying that as a hypothesis to pressure-test, not as a marketing claim, and I’d genuinely welcome criticism on evaluation methodology.

I’m now working on V4 with a different architecture:

  • CNN + Transformer + Thought Tokens + DAB (Dynamic Attention Bias) @ 50M parameters

For V5, I want to test something more speculative that I’m calling Temporal Look-Ahead: the network internally represents future moves and propagates that information backward through attention to inform the current decision.

Demo: https://games.jesion.pl

Project details: https://games.jesion.pl/about

Price: free browser demo. A nickname/email is only needed if you want to appear on the public leaderboard.

The feedback I’d value most:

  1. Best ablation setup for thought tokens / DAB
  2. Better methodology for measuring Elo-vs-compute efficiency on home hardware
  3. Whether the Temporal Look-Ahead framing sounds genuinely useful or just fancy rebranding of something already known
  4. Ideas for stronger evaluation against classical engines without overclaiming

Cheers, Adam

submitted by /u/Adam_Jesion
