May 24, 2026•3 min read•from Machine Learning

PapersWithCode new features - week 1 [P]

Our take

One week after its revival, PapersWithCode has shipped new features that make it easier to track state-of-the-art AI results: multiple metrics per leaderboard, external paper submissions, paper lineage displays, and newly supported popular methods. Leaderboards can now also be copied as images for sharing on social media. For more on AI applications, see our article, "Beyond the Model: Why Data Scientists Must Embrace APIs and API Documentation."


The recent updates from Niels and the Hugging Face team on the revival of PapersWithCode are more than incremental changes; they affect how researchers and developers keep up with a rapidly evolving field. By reviving a platform that tracks the state-of-the-art across domains, Hugging Face is placing itself at the center of how AI results are compared. Features in this week's announcement, such as multiple metrics per benchmark and external paper submissions, make research results easier to connect and easier to use. This matches a broader industry trend in which accessibility and collaboration drive progress. The importance of APIs in enabling this kind of tooling cannot be overstated, as discussed in our article, "Beyond the Model: Why Data Scientists Must Embrace APIs and API Documentation."

One of the standout features is support for multiple metrics per benchmark, which gives a more nuanced picture of model performance. This matters in fields like automatic speech recognition and object detection, where accuracy and speed are separate concerns: the Open ASR Leaderboard reports both Word Error Rate (WER) and the Inverse Real-Time Factor (RTFx), and the Object Detection leaderboard reports frames-per-second (FPS) alongside mean average precision (mAP) on COCO. With several indicators on one leaderboard, researchers can evaluate their work more completely and see the trade-offs a single number hides. This is a clear change from the traditional approach, where a single metric often dictated the narrative around model effectiveness.
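To make the two speech recognition metrics concrete, here is a minimal Python sketch. It assumes the standard definitions: WER is the word-level edit distance between a reference and a hypothesis transcript divided by the number of reference words (lower is better), and RTFx is seconds of audio divided by seconds of processing time (higher is better). The function names and the example transcripts are illustrative, not taken from the leaderboard's own evaluation code, which may also normalize text before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def inverse_real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """RTFx = audio duration / processing time; higher means faster."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Illustrative inputs: one substitution ("sat" -> "sit") and one
# deletion (the second "the") against a six-word reference.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
rtfx = inverse_real_time_factor(audio_seconds=60.0, processing_seconds=0.5)
print(f"WER = {wer:.3f}, RTFx = {rtfx:.1f}")
```

The two numbers pull in different directions: a larger model may lower WER while also lowering RTFx, which is why a leaderboard that shows both is more informative than one sorted on accuracy alone.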

Moreover, accepting papers from outside traditional repositories like arXiv opens the door to a broader range of research contributions. The move reflects the many ways knowledge is now shared in the AI community. By enabling submissions such as a GitHub repo, a blog post, or a bioRxiv preprint, PapersWithCode acknowledges the value of interdisciplinary research and the rapid dissemination of ideas, and lets researchers showcase work without waiting on conventional publishing timelines. This resonates with our article, "The Ultimate Beginners’ Guide to Building an AI Agent in Python," where we emphasize the importance of accessibility in learning and development within the AI space.

Paper lineage tracking is another noteworthy addition. When a paper has a follow-up or a predecessor, PapersWithCode now shows that relationship in a banner above the abstract. This makes it easier to trace how an idea has progressed and helps newcomers find their way through a large body of AI research. In a discipline characterized by rapid change, knowing how current research builds on previous work is invaluable.
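One way to picture paper lineage is as a chain of predecessor links that can be walked in either direction. The sketch below is a hypothetical data model, not PapersWithCode's actual schema: the `Paper` class, the `predecessor` field, and the two helper functions are assumptions made for illustration, and the Mamba entries are included only as a familiar example of a paper series.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Paper:
    title: str
    # Title of the paper this one follows up on, if any (hypothetical field).
    predecessor: Optional[str] = None


def lineage(papers: Dict[str, Paper], title: str) -> List[str]:
    """Return the chain from the earliest known predecessor up to `title`."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = title
    # Stop at a missing link, an unknown title, or a cycle in the data.
    while current is not None and current in papers and current not in seen:
        seen.add(current)
        chain.append(current)
        current = papers[current].predecessor
    return list(reversed(chain))


def follow_ups(papers: Dict[str, Paper], title: str) -> List[str]:
    """Return titles of papers that name `title` as their predecessor."""
    return sorted(p.title for p in papers.values() if p.predecessor == title)


# Illustrative entries only.
PAPERS = {
    "Mamba": Paper("Mamba"),
    "Mamba-2": Paper("Mamba-2", predecessor="Mamba"),
    "Mamba-3": Paper("Mamba-3", predecessor="Mamba-2"),
}

print(lineage(PAPERS, "Mamba-3"))   # ['Mamba', 'Mamba-2', 'Mamba-3']
print(follow_ups(PAPERS, "Mamba"))  # ['Mamba-2']
```

Storing only the predecessor on each paper keeps the data simple; follow-ups are derived by scanning for papers that point back, which is enough for a banner that shows one step in each direction.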

Looking ahead, the community's response to these features will be crucial. As more users provide feedback and suggest new capabilities, the platform can evolve further, aligning even more closely with the needs of its users. The establishment of a dedicated communication channel on Discord will likely enhance this collaborative spirit. As we witness the landscape of AI research continue to mature, one question remains: how will platforms like PapersWithCode adapt to the ever-growing demands for transparency and accessibility in AI? The future of AI research is not just about developing models; it’s about creating an ecosystem that nurtures innovation through collaboration and shared knowledge.

Hi,

Niels here from the open-source team at Hugging Face. It's been one week since I launched paperswithcode.co, a revival of the website we all loved. It allows us to keep track of the state-of-the-art (SOTA) across various domains of AI, from agents to computer vision and time-series forecasting.

The reception has been great, and I'm excited to extend this over the next few months.

This week, I've added the following features:

- Support for multiple metrics for a given benchmark: leaderboards now support multiple metrics, see e.g., the Open ASR Leaderboard for automatic speech recognition, which supports both Word Error Rate (WER) and the Inverse Real-Time Factor (RTFx) metrics, or the Object Detection leaderboard, which now also reports frames-per-second (FPS) besides mean average precision (mAP) on COCO.

https://preview.redd.it/owlxn0b5u23h1.png?width=2878&format=png&auto=webp&s=1dff2f8feab4f160f77c97ceeb5d90e82382e63c

- Support for external papers: you can now submit papers from outside arXiv, such as a GitHub repo, a blog post, bioRxiv, and more. You can submit a paper at paperswithcode.co/submit. AI will automatically enrich it with task and method tags, the GitHub repo, evals, and more. See e.g. DeepSeek-v4 below, which is not on arXiv:

https://preview.redd.it/uogbt0fjw23h1.png?width=2928&format=png&auto=webp&s=8b81e48af69b8935ddeb569d882d866b3e9ba216

- Support for paper lineage: whenever a paper has a follow-up or predecessor, this will be displayed with a small banner above the abstract. See e.g. Mamba-3, DINOv2 and GLM-4.5.

https://preview.redd.it/f6vgtd1du23h1.png?width=2228&format=png&auto=webp&s=f8627f7669405f1766eecfd3322e925e15b4806d

- New methods: support for new methods based on popularity, including Gated DeltaNet, Kimi Delta Attention, Mamba-2, and more. Each method also lists all papers that cite it. Find all supported methods here.

https://preview.redd.it/6pzagifvu23h1.png?width=2984&format=png&auto=webp&s=400efdc9677d1fbd369eedf684e622dd8c807973

- Support for screenshotting a leaderboard for easy sharing on social media: each benchmark now includes a "copy image" button both on the scatter plot and table, which can be shared on social media. Try it on ClawEval, for example.

https://preview.redd.it/w7y7t7xnw23h1.png?width=2950&format=png&auto=webp&s=cb70ad91c6ba075e49b743d6e34f157d22266f04

- Added many more evals: we are adding evals gradually, starting with all models supported in the Transformers library. So far, we have about 3k evals! Find them at the bottom of each paper page, e.g. Qwen 3.6.

https://preview.redd.it/zao056s9x23h1.png?width=2218&format=png&auto=webp&s=540d87f473be05cb6f9c0aca88afa74fd4373e15

Happy to hear more feature requests and feedback!

I will also launch a channel on the Hugging Face Discord server for easier communication. You can also chime in on the GitHub thread here.

Cheers,

Niels

submitted by /u/NielsRogge

Reviving PapersWithCode (by Hugging Face) [P]

Hi,

Niels here from the open-source team at Hugging Face. Like many others, I was a huge fan of paperswithcode. Sadly, that website is no longer maintained after its acquisition by Meta. Hence, I've been working on reviving it.

I obviously use AI agents to parse papers at scale and automatically generate leaderboards (for now I'm the one verifying results). So far, I've only parsed high-impact papers for which I know they're SOTA, like Qwen 3.5 and 3.6, RF-DETR for object detection, DINOv3, SOTA embedding models from the MTEB leaderboard, the Open ASR Leaderboard for automatic speech recognition models, etc.

For now, it includes the following:

- trending papers by default based on GitHub star velocity
- categorization by domain, e.g., OCR
- methods, which PwC used to have, e.g., RLVR
- eval results for high-impact papers, see e.g., Qwen 3.5 at the bottom
- leaderboards for each domain, e.g., MMTEB or COCO val 2017
- support for citation counts (you can also see the most cited papers by domain!)
- automated linked GitHub, project page URLs, and artifacts (+ multiple repos are supported on a paper page)
- support for external papers beyond arXiv, see e.g., DeepSeek v4
- Harness reports for coding agent benchmarks, e.g., Terminal Bench
- "Sign in with HF" and Storage Buckets are used to store thumbnails, paper PDFs, and overall data backups.

I'm curious about your feedback + feature requests! Try it at paperswithcode.co

https://preview.redd.it/whwji560fw1h1.png?width=3452&format=png&auto=webp&s=55bb7a30c1be58d140f7efcb07a31c6dac5693c7

See e.g. the SOTA leaderboard for Terminal Bench 2.0:

https://preview.redd.it/98w9pi89fw1h1.png?width=3456&format=png&auto=webp&s=408fb64b0ba85ba24f55daa81d547d7c68e73951

A paper page looks like this: https://paperswithcode.co/paper/2602.15763

https://preview.redd.it/fiizit6dfw1h1.png?width=3450&format=png&auto=webp&s=9ea05a77ca5583a2fb395dccc95ba52c433362c5

submitted by /u/NielsRogge
