I created an LLM post-training method called RPS. Preliminary results show that it improved Qwen3-8b's program synthesis reliability. [R]

Our take

RPS (Regressive Plasticity Schedule) is a post-training method for large language models inspired by neuroscience. Preliminary results suggest that it improves the program synthesis reliability of Qwen3-8b: 95.4% of program executions completed without error (1145/1200), compared with 72.5% (870/1200) when both training stages use the same learning rate. RPS trains in two stages: first on easier data at a high learning rate, then on harder data at 10% of that learning rate.

The Regressive Plasticity Schedule (RPS) is a post-training method for large language models (LLMs) that borrows from the idea of neuroplasticity. It uses a two-stage training approach loosely modeled on human learning. In the first stage, the model is trained on simpler data at a high learning rate so that it can pick up foundational skills quickly. In the second stage, the model is trained on harder data at 10% of the stage 1 learning rate. As the author notes, RPS is essentially a combination of two existing ideas, curriculum learning and learning rate decay, applied here to program synthesis.

The preliminary results are encouraging but modest in absolute terms. On the ARC-AGI 1 public eval, Qwen3-8b trained with RPS scored 4%, compared with 2.4% for EPS, the baseline that uses an equal learning rate in both stages. In program synthesis, 1145 out of 1200 program executions ran without error under RPS, compared with 870 out of 1200 under EPS. Note that running without error is a measure of reliability, not of whether the program solved the task. These findings suggest that RPS improves program synthesis reliability, although the post does not report variance across runs, so the size of the effect remains to be confirmed.

If the results hold up, the implications go beyond the reported metrics. A schedule that pairs data difficulty with decreasing plasticity brings LLM training a little closer to how people are taught complex tasks: basics first and quickly, advanced material later and more carefully. That could prompt researchers and practitioners to revisit how curriculum ordering and learning rate schedules are combined in post-training, and it fits a broader interest in biologically inspired techniques that prioritize adaptability and reliability.

Looking ahead, RPS raises open questions about training methodology. Will more methods borrow learning strategies from human cognition? Which other fields of study might inform the next generation of models? The current evidence is preliminary, covering one base model and one baseline, so the most useful next step is replication on other models and tasks.

RPS is inspired by neuroscience. As humans, we learn basic skills as kids, when neuroplasticity is high. We then learn advanced skills as teens and adults, when neuroplasticity is lower. RPS trains a model in 2 stages. In stage 1, the model is trained on easy data with a high learning rate. In stage 2, the model is trained on hard data with 10% of the stage 1 learning rate. RPS is basically a combination of existing ideas: curriculum learning + learning rate decay.
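The two-stage schedule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code from the linked repository: the names `Stage`, `rps_stages`, and `iter_steps` are hypothetical, and the base learning rate of `1e-4` is an assumed placeholder, since the post only states that stage 2 uses 10% of the stage 1 rate.

```python
from dataclasses import dataclass
from typing import Iterator, List, Sequence, Tuple


@dataclass(frozen=True)
class Stage:
    """One training stage: a dataset and the learning rate used for it."""
    name: str
    examples: Sequence[str]
    learning_rate: float


def rps_stages(
    easy: Sequence[str],
    hard: Sequence[str],
    base_lr: float = 1e-4,       # assumed placeholder, not from the post
    stage2_ratio: float = 0.1,   # stage 2 uses 10% of the stage 1 rate
) -> List[Stage]:
    """Build the two stages: easy data at base_lr, then hard data at a lower rate.

    Setting stage2_ratio=1.0 gives the EPS baseline (equal learning rate
    in both stages).
    """
    return [
        Stage("stage1_easy", easy, base_lr),
        Stage("stage2_hard", hard, base_lr * stage2_ratio),
    ]


def iter_steps(stages: Sequence[Stage]) -> Iterator[Tuple[str, str, float]]:
    """Yield (stage name, example, learning rate) in training order."""
    for stage in stages:
        for example in stage.examples:
            yield stage.name, example, stage.learning_rate


if __name__ == "__main__":
    stages = rps_stages(easy=["easy_task_1", "easy_task_2"], hard=["hard_task_1"])
    for name, example, lr in iter_steps(stages):
        print(f"{name}: train on {example} with lr={lr:.0e}")
```

In a real training loop, each yielded learning rate would be passed to the optimizer before the corresponding update step. The only difference between RPS and the EPS baseline in this sketch is the value of `stage2_ratio`.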

ARC-AGI 1 public eval scores (base model: Qwen3-8b):

RPS: 4%

EPS (equal learning rate in both stages): 2.4%

Program synthesis stats (program executions without error):

RPS: 1145/1200

EPS: 870/1200
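For reference, the execution counts above convert to percentages as follows. This is plain arithmetic on the reported figures; the function and variable names are illustrative only.

```python
# Reported counts of program executions that ran without error.
results = {
    "RPS": (1145, 1200),
    "EPS": (870, 1200),
}


def error_free_rate(ok: int, total: int) -> float:
    """Fraction of executions that completed without raising an error."""
    return ok / total


for method, (ok, total) in results.items():
    print(f"{method}: {ok}/{total} = {error_free_rate(ok, total):.1%}")

# Gap between the two methods, in percentage points.
gap = (error_free_rate(*results["RPS"]) - error_free_rate(*results["EPS"])) * 100
print(f"Difference: {gap:.1f} percentage points")
```

This works out to 95.4% for RPS and 72.5% for EPS, a gap of about 22.9 percentage points. These figures measure whether the generated programs ran, not whether they produced the correct answer.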

https://iamjasonfeng.blogspot.com/2026/05/regressive-plasticity-schedule.html

https://github.com/iamjasonfeng/RPS

submitted by /u/iamjasonfeng