3 min read, from Machine Learning

Some new updates to Papers with Code [P]

Our take

Papers with Code (paperswithcode.co) is back, and the revival adds several features for discovering and building on existing work. Papers that score within the top 3 of a benchmark now carry SOTA badges, the trending ranking combines GitHub star velocity with the trending score of linked Hugging Face artifacts, and papers can show external evaluations beyond those reported in the paper itself. Task and benchmark coverage is expanding, including ImageNet, 3D semantic segmentation, and object counting, and the site is now also reachable at paperswithco.de.

The revival of Papers with Code, led by Niels Rogge of the open-source team at Hugging Face, is a welcome development for the AI research community. Rogge, citing Ilya Sutskever, argues that we are back in an “age of research,” which makes it important to track and build on the latest findings. The new SOTA badges, shown when a paper scores within the top 3 of a benchmark, make the site more useful right away: researchers and practitioners can quickly see which papers lead on a given benchmark. Easier access to benchmark results also matters for adjacent concerns such as model security, discussed in Are model security risks (extraction, poisoning) actually being tested in production?, where robust evaluation and validation are needed. The trending score now also incorporates Hugging Face artifacts (models, datasets, and Spaces), which surfaces papers like IndexCache, a core technique behind the recent GLM-5.2 model.

Support for "external" evaluations is a significant addition. The legacy Papers with Code website lacked this capability, so evaluations beyond those introduced in a paper itself were not shown. Users can now view third-party assessments as well and get a fuller picture of a paper's performance. The examples provided (FrontierSWE and PostTrainBench numbers for GLM-5.2, and Artificial Analysis data on CritPt) show the potential for deeper comparison. This expanded view is also useful when preparing for roles in the field, a topic discussed in Just landed a Computer Vision internship, here’s the preparation list I used, where a grasp of benchmark performance is often a key requirement. The gradual addition of tasks, benchmarks, and evaluations, including ImageNet, 3D semantic segmentation, and object counting, underscores the platform's ambition to become a central hub for tracking AI research. The site is also reachable from a new domain, paperswithco.de.

The change in ranking methodology, which now combines GitHub star velocity with the trending scores of linked Hugging Face artifacts, is a pragmatic way to gauge impact. Previously the ranking considered GitHub star velocity alone; the new metric also reflects interest in released models, datasets, and Spaces. That matters in a field where practical application and reproducibility are increasingly important. The invitation to contribute and to send bug and feature requests is also welcome, since it lets users help shape the platform's future. The WACV conference submission deadline, mentioned in WACV supp. mat. video, is a timely reminder of the rapid pace of research and of how hard it is to keep up, which the revived platform makes easier.

Ultimately, the revival of Papers with Code is more than a website update: it puts renewed emphasis on open collaboration and accessible knowledge within the AI community. By combining benchmark results with signals of adoption and community engagement, the platform reflects how AI research is actually consumed today. As models proliferate and their applications expand, the ability to track, evaluate, and build on the latest research will only grow in importance. The key question is whether the platform can scale its curation to the ever-increasing volume of published AI research and remain a valuable resource for researchers and practitioners.

Some new updates to Papers with Code [P]

Hi folks,

Niels here from the open-source team at Hugging Face. I'm continuing to work on the revival of paperswithcode.co, as we're back in the "age of research" per Ilya Sutskever! Hence, it's important to discover each other's research and build on each other's work, so we can collectively build the next Transformer. Below, I'll go over each of the recently added features.

## Support for SOTA badges

Yes, that's right, totally like the old website. You can see that GLM-5.2, for instance, which is obviously the hottest blog post today, achieves SOTA on PostTrainBench and performs well on many other benchmarks. A badge is displayed whenever a paper gets a score within the top 3 of a given benchmark.

Note that badges are displayed on any paper feed, including https://paperswithcode.co/tasks/video-classification, for example.

https://preview.redd.it/wawma8paeu8h1.png?width=2418&format=png&auto=webp&s=0ba3b6a0eaef231b7f3ca468cc3db4120f1b9e4d
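To make the top-3 rule concrete, here is a minimal sketch of how such a badge check could work. This is not the site's actual code: the `Result` type, the `badge_holders` function, the `higher_is_better` flag, and the choice to keep each paper's best score are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One reported score (hypothetical schema, not the site's data model)."""
    paper: str
    benchmark: str
    score: float


def badge_holders(results, benchmark, higher_is_better=True, top_k=3):
    """Return the papers ranked within the top_k of a benchmark, best first."""
    best = {}
    for r in results:
        if r.benchmark != benchmark:
            continue
        current = best.get(r.paper)
        # Assumption: a paper is ranked by its best score on the benchmark.
        if current is None:
            best[r.paper] = r.score
        elif higher_is_better and r.score > current:
            best[r.paper] = r.score
        elif not higher_is_better and r.score < current:
            best[r.paper] = r.score
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=higher_is_better)
    return [paper for paper, _ in ranked[:top_k]]


# Placeholder papers and made-up scores, purely for illustration.
demo_results = [
    Result("paper-a", "bench-x", 71.0),
    Result("paper-b", "bench-x", 69.5),
    Result("paper-c", "bench-x", 80.2),
    Result("paper-d", "bench-x", 60.0),
    Result("paper-a", "bench-x", 75.0),
    Result("paper-d", "bench-y", 99.0),
]
demo_badges = badge_holders(demo_results, "bench-x")
```

In this toy data, `paper-d` misses the badge on `bench-x` because three papers score higher. For metrics where lower is better (an error rate, say), passing `higher_is_better=False` flips the ranking.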

## New trending score

The papers are now ranked based on a new trending metric. This is a combination of the GitHub star velocity and the trending score of the linked Hugging Face artifacts (models, datasets, and Spaces). Previously, this only took into account GitHub star velocity.

Thanks to this, papers like IndexCache, a core technique behind the trending GLM-5.2 model, are now trending.

https://preview.redd.it/b6g04w2ogu8h1.png?width=2380&format=png&auto=webp&s=13d59bbadd5f8e8295deac2ee6e1e0e3dbc0f40f
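The post does not give the exact formula for the new trending metric, so the following is only a sketch of one plausible way to combine the two signals. The weighted sum, the weights `w_github` and `w_hf`, the per-day definition of star velocity, and both function names are assumptions, not the site's implementation.

```python
def star_velocity(stars_then, stars_now, days):
    """Stars gained per day over a window (assumed definition)."""
    if days <= 0:
        raise ValueError("days must be positive")
    # Assumption: a shrinking star count is treated as zero velocity.
    return max(stars_now - stars_then, 0) / days


def trending_score(github_velocity, artifact_scores, w_github=1.0, w_hf=1.0):
    """Combine GitHub star velocity with the trending scores of linked
    Hugging Face artifacts (models, datasets, Spaces).

    The weighted sum and its weights are illustrative assumptions.
    """
    return w_github * github_velocity + w_hf * sum(artifact_scores)


# Made-up numbers: 70 new stars in a week, two linked artifacts.
demo_velocity = star_velocity(stars_then=100, stars_now=170, days=7)
demo_score = trending_score(demo_velocity, artifact_scores=[3.0, 2.0])

# A paper with a quiet repository can still trend through its artifacts.
artifact_only_score = trending_score(0.0, artifact_scores=[4.0])
```

Under a star-only ranking the last paper would score zero. With the combined metric it gets a nonzero score from its linked artifacts, which matches the behavior described above.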

## Support for external evals

I've also added support for "external" evals. This is a feature the legacy PwC website didn't actually have. Oftentimes, a paper has way more evals than the ones introduced in the paper itself. You can now view these third-party evals. Some examples:

https://preview.redd.it/mfnfdzxpeu8h1.png?width=1914&format=png&auto=webp&s=2b909ecf7c6e3fc088fd0a46fbc56f6859dfaf17
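One way to picture the distinction is a record that tracks who reported each score. The sketch below is a hypothetical data model, not the site's schema: the `Eval` type, its `reported_by` field, and the `split_evals` helper are all invented for illustration, and the data is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Eval:
    """One evaluation result (hypothetical schema)."""
    paper: str
    benchmark: str
    score: float
    # Assumption: None means the score comes from the paper itself,
    # otherwise this names the third party that ran the eval.
    reported_by: Optional[str] = None


def split_evals(evals, paper):
    """Group a paper's evals into self-reported and external ones."""
    groups = {"paper": [], "external": []}
    for e in evals:
        if e.paper != paper:
            continue
        key = "paper" if e.reported_by is None else "external"
        groups[key].append(e)
    return groups


# Placeholder data, purely for illustration.
demo_evals = [
    Eval("paper-a", "bench-x", 71.0),
    Eval("paper-a", "bench-y", 55.3, reported_by="third-party-lab"),
    Eval("paper-b", "bench-x", 69.5),
]
demo_groups = split_evals(demo_evals, "paper-a")
```

Keeping the reporter on each record lets a paper page list both kinds of results together while still showing where each number came from.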

## More tasks, benchmarks and evals

I'm adding more benchmarks, as well as evals for more papers. This happens gradually, based on the legacy PwC data available on the Hub.

Some new benchmarks include:

- ImageNet - 10% of the data

https://preview.redd.it/wr55g27ofu8h1.png?width=2880&format=png&auto=webp&s=e6e5ef7e3a36cd5aa6d2841b149194239f4ad1e0

- 3D semantic segmentation:

https://preview.redd.it/zxgobrnqfu8h1.png?width=2880&format=png&auto=webp&s=6ee2935981825d5d7825709294ddb84a4b7a3ac9

- object counting:

https://preview.redd.it/uhv4wbrsfu8h1.png?width=2880&format=png&auto=webp&s=183decb144d9779e41bf12ca58fbaab66cd29cbf

and a lot more. Browse all of them at https://paperswithcode.co/tasks

## New domain

Papers with Code is now also available from paperswithco.de :)

Let me know what's missing, send any bug/feature requests, and tell me if you want to contribute!

Kind regards,

Niels

submitted by /u/NielsRogge