Some new updates to Papers with Code [P]
Our take
The revival of Papers with Code, led by Niels Rogge of Hugging Face's open-source team, is a welcome development for the AI research community. Citing Ilya Sutskever, Rogge argues that we are back in an “age of research,” which makes it important to discover and build on each other's work. The revival is more than a nostalgic return to a useful tool; it represents a renewed commitment to open science and collaborative progress. SOTA badges, shown whenever a paper scores within the top 3 of a given benchmark, immediately improve the site's utility: researchers and practitioners can quickly spot the strongest results and gauge the competitive landscape. This is relevant to ongoing efforts to bolster model security, as discussed in “Are model security risks (extraction, poisoning) actually being tested in production?”, which highlights the need for robust evaluation and validation, something Papers with Code now makes easier to track. The new trending score, which incorporates Hugging Face artifacts such as models, datasets, and Spaces, further refines discovery and surfaces influential techniques such as IndexCache, which the original post describes as a core technique behind the GLM-5.2 model.
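The badge rule from the original post (a badge appears whenever a paper scores within the top 3 of a given benchmark) can be illustrated with a minimal sketch. This is not the site's implementation: the function name `sota_badges`, the `(paper, benchmark, score)` input shape, the `higher_is_better` flag, and the tie handling are all assumptions made for illustration.

```python
from collections import defaultdict


def sota_badges(results, top_k=3, higher_is_better=True):
    """Return {paper: [(benchmark, rank), ...]} for papers ranked in the top_k.

    `results` is a list of (paper, benchmark, score) tuples. This input
    shape is hypothetical; the real site's data model is not described
    in the post.
    """
    # Group all reported scores by benchmark.
    by_benchmark = defaultdict(list)
    for paper, benchmark, score in results:
        by_benchmark[benchmark].append((score, paper))

    badges = defaultdict(list)
    for benchmark, entries in by_benchmark.items():
        # Best score first; ties keep their input order (stable sort).
        ranked = sorted(entries, key=lambda e: e[0], reverse=higher_is_better)
        for rank, (_score, paper) in enumerate(ranked[:top_k], start=1):
            badges[paper].append((benchmark, rank))
    return dict(badges)
```

A paper that places fourth or lower on a benchmark gets no entry for that benchmark. Metrics where lower is better (error rates, for example) would need `higher_is_better=False`.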
Beyond the core functionality, support for "external" evaluations is a significant step forward. The legacy Papers with Code website lacked this capability, which limited the scope of available data. Users can now see third-party assessments alongside the results reported in a paper itself, giving a more complete picture of its performance. The examples provided (FrontierSWE and PostTrainBench numbers for GLM-5.2, and Artificial Analysis data on CritPt) show the potential for deeper analysis and comparison. This expanded view is especially valuable when preparing for roles in the field, a topic discussed in “Just landed a Computer Vision internship, here’s the preparation list I used,” where a grasp of benchmark performance is often a key requirement. The platform’s continual expansion of tasks, benchmarks, and evaluations, including the addition of ImageNet, 3D semantic segmentation, and object counting, underscores its ambition to become a central hub for AI research tracking. Even a seemingly minor detail, the new domain paperswithco.de, gives users another way to reach the site.
The shift in ranking methodology reflects a pragmatic approach to gauging impact. The trending score previously used GitHub star velocity alone; it now combines that signal with the trending scores of linked Hugging Face artifacts. This moves beyond citations and academic recognition to capture real-world adoption and community engagement, which matters in a field where practical application and reproducibility are increasingly important. The open invitation to report bugs, request features, and contribute is also commendable, since it lets users help shape the platform’s future. The WACV conference submission deadline, mentioned in “WACV supp. mat. video,” is a timely reminder of the rapid pace of research and the importance of keeping up with the latest advances. The revitalized Papers with Code offers a much-needed tool for doing so.
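The post does not publish the actual formula, so the following is only a sketch of what "a combination of GitHub star velocity and artifact trending score" could look like. The weighted sum, the weights `w_github` and `w_artifacts`, the `Paper` fields, and their units are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    github_star_velocity: float  # assumed unit: stars gained per day
    artifact_trending: float     # assumed: summed trending score of linked models, datasets, Spaces


def trending_score(paper, w_github=0.5, w_artifacts=0.5):
    """Hypothetical weighted sum; the real combination is not documented."""
    return (w_github * paper.github_star_velocity
            + w_artifacts * paper.artifact_trending)


def rank_papers(papers, **weights):
    """Sort papers from most to least trending under the assumed formula."""
    return sorted(papers, key=lambda p: trending_score(p, **weights), reverse=True)
```

Setting `w_artifacts=0.0` reproduces the older behaviour described in the post, where only GitHub star velocity counted. With a non-zero artifact weight, a paper with few stars but heavily used models or Spaces can move up the feed.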
Ultimately, the revival of Papers with Code is more than a website update; it signals a renewed emphasis on open collaboration and accessible knowledge within the AI community. The platform’s evolution reflects the changing landscape of AI research, moving beyond traditional academic metrics to incorporate real-world impact and community engagement. As AI models proliferate and their applications expand, the ability to track, evaluate, and build on the latest research will be essential. A key question to watch is whether the platform can scale its curation to match the ever-increasing volume of published AI research, so that it remains a valuable resource for researchers and practitioners.
Hi folks, Niels here from the open-source team at Hugging Face.

I continue working on a revival of paperswithcode.co as we're back to the "age of research" per Ilya Sutskever! Hence, it's important to discover each other's research and build on each other's work, so we can collectively build the next Transformer.

Below, I'll go over each of the new features that were recently added.

## Support for SOTA badges

Yes, that's right, totally like the old website. You can see that GLM-5.2, for instance, is obviously the hottest blog post today, achieves SOTA on PostTrainBench, and performs well on many other benchmarks. It is displayed whenever a paper gets a score within the top 3 of a given benchmark. Note that these are displayed on any paper feed, including https://paperswithcode.co/tasks/video-classification, for example.

## New trending score

The papers are now ranked based on a new trending metric. This is a combination of the GitHub star velocity and the trending score of the linked Hugging Face artifacts (models, datasets, and Spaces). Previously, this only took into account GitHub star velocity. Thanks to this, papers like IndexCache are now trending, which is a core technique behind the trending GLM-5.2 model.

## Support for external evals

Second, I've added support for "external" evals. This is a feature the legacy PwC website didn't actually have. Oftentimes, a paper has way more evals than the ones introduced in the paper itself. You can now view these third-party evals. Some examples:

## More tasks, benchmarks and evals

I'm adding more benchmarks and adding evals of more papers. This happens gradually, based on the legacy PwC data available on the hub. Some new benchmarks include: and a lot more. Browse all of them at https://paperswithcode.co/tasks

## New domain

Papers with Code is now also available from paperswithco.de :)

Let me know what is missing, bug/feature requests, and whether you want to contribute!

Kind regards,
Niels