4 min read, from VentureBeat

Pinterest cut AI costs 90% by gutting a frontier model's vision layer

Our take

Pinterest has cut AI costs by 90% by reengineering the vision layer of the Qwen3-VL model, a move led by CTO Matt Madrigal to improve efficiency for the platform's 620 million monthly users. By replacing the model's vision encoder with proprietary embeddings, Pinterest not only slashed expenses but also improved accuracy by 30%. Madrigal emphasizes the power of customizing open-source models with unique data to enable a more personalized and efficient visual discovery experience.

Pinterest's recent overhaul of the open-source Qwen3-VL vision-language model illustrates a shift in how AI-driven image recommendation systems are built. By cutting costs 90% while improving accuracy 30%, Pinterest CTO Matt Madrigal has made a compelling case for customizing open-source technologies. The move is emblematic of a broader trend in which companies are rethinking their reliance on large, pre-packaged AI models in favor of tailored solutions that are better aligned with their own unique data. The implications extend beyond Pinterest and point to an evolution in data management and visual search. As Madrigal put it in a recent podcast, data quality can outweigh model size.

The transformation of Qwen3-VL involved significant modifications, particularly to the vision encoder layer, which was replaced with proprietary embeddings. This customization not only improved performance but also reduced inference latency, a crucial factor for a platform serving 620 million monthly users. Working with open-source models allows for agile experimentation and adaptation, enabling Pinterest to respond better to user needs. The approach underlines that, as the technology advances, data quality and relevance remain decisive. Other companies might take a page from Pinterest's playbook and consider how they can use open-source frameworks to build tailored solutions that improve user engagement and satisfaction. For instance, companies in sectors like e-commerce or content curation might find value in creating their own "taste graphs," much like Pinterest's dynamic representation of user preferences.

Moreover, the creation of a "taste graph" to map evolving user interests is a fascinating development that underscores the potential for personalized user experiences in digital platforms. This graph not only captures what users engage with but also what they might aspire to explore next. It represents a shift from static metrics of user behavior to a more fluid understanding of user intent, which is particularly important in a space where users are often seeking inspiration rather than definitive answers. Pinterest's ability to facilitate “lateral exploration” as users transition from discovery to intent highlights the growing expectation that digital platforms should offer more than just basic functionality. They should provide a seamless journey that inspires creativity and drives engagement. The creation of such a nuanced understanding of user preferences could set a new standard in how technology interfaces with consumer behavior, potentially influencing everything from marketing strategies to product development.

As companies continue to navigate the complexities of AI and machine learning, the lessons learned from Pinterest’s approach could serve as a valuable guide. The focus on customization over mere adoption of large-scale models encourages a culture of innovation that prioritizes user experience. For organizations across various sectors, embracing this mindset may unlock new opportunities for enhancing engagement and improving service delivery.

Looking ahead, it will be intriguing to observe how this trend evolves. Will other tech giants follow Pinterest's lead in prioritizing customization and user-centric data strategies? The potential for revolutionizing user engagement through tailored AI solutions is immense, and as more companies recognize the value of unique data, we may witness a significant shift in how digital platforms operate. The landscape of AI is undeniably changing, and organizations must adapt or risk falling behind in this rapidly evolving market.

At 620 million monthly users, calling a frontier model for every image recommendation isn't a strategy — it's a bill. Pinterest CTO Matt Madrigal solved it by gutting Qwen3-VL's vision layer and rebuilding it with proprietary embeddings, cutting costs 90% and boosting accuracy 30%.

Madrigal’s team has been heavily investing in customizing open-source models “foundationally in-house.”

“If you've got really unique data that you can then fine-tune an open source model with, data quality will, frankly, outweigh or overcome model size,” Madrigal explained in a recent VB Beyond the Pilot podcast.

How Pinterest customized Qwen for visual discovery

Pinterest, which has around 620 million monthly active users, has long applied open source models to visual search and discovery, going back to Google’s BERT and OpenAI’s CLIP. The company built its own Pin CLIP by fine-tuning the latter, incorporating proprietary visual embeddings and image metadata.

Pinterest’s conversational shopping assistant, Navigator 1, was built on Qwen3-VL and customized in “pretty significant” ways. Madrigal’s team essentially “ripped out” Qwen’s vision encoder layer and fine-tuned the model on proprietary multimodal embeddings. This has allowed them to capture metadata around pins and images that can then be precomputed offline and regularly retrained on new information to deliver personalized experiences. 
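The article does not describe how the proprietary embeddings are fed into the model. One common pattern, shown here purely as an assumption, is to map each precomputed embedding into the language model's hidden size with a learned linear projection and place the result in front of the text token embeddings, where the vision encoder's output would otherwise go. The names `project`, `build_input_sequence`, and `WEIGHTS`, along with the toy dimensions and values, are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def project(embedding: Vector, weights: Matrix) -> Vector:
    """Linear map from the embedding space to the model's hidden size."""
    return [sum(w * x for w, x in zip(row, embedding)) for row in weights]


def build_input_sequence(
    pin_embeddings: List[Vector],
    text_token_embeddings: List[Vector],
    weights: Matrix,
) -> List[Vector]:
    """Prepend projected pin embeddings to the text token embeddings."""
    visual_tokens = [project(e, weights) for e in pin_embeddings]
    return visual_tokens + text_token_embeddings


# Toy sizes (assumed): 3-dimensional pin embeddings, hidden size 2.
# In practice the projection weights would be learned during fine-tuning.
WEIGHTS: Matrix = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
pins = [[1.0, 2.0, 2.0], [0.0, 1.0, 0.0]]
text = [[0.1, 0.2], [0.3, 0.4]]

sequence = build_input_sequence(pins, text, WEIGHTS)
```

In this sketch no image pixels are touched when the sequence is built; the only per-pin work is a small matrix multiply over vectors that already exist.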

“Open-source models, especially with open Apache licenses where you can truly tweak a lot of open weights and customize for unique use cases — that's where we've found open source to be so powerful for us,” Madrigal said. 

Bringing their own embeddings gives his team context around metadata, pins, and images; notably, the model also performs better at inference time. Without these embeddings, developers would have to call the encoder on each returned image at runtime, one at a time. That results in latency “20 times worse” from an inference perspective, Madrigal said.
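The difference between encoding at request time and looking up precomputed vectors can be illustrated with a minimal sketch. Everything here is hypothetical: `CountingEncoder` is a toy stand-in for an expensive vision encoder, and the in-memory dictionary stands in for whatever store holds the offline embeddings. The sketch only counts encoder calls; it does not reproduce the latency figure quoted above.

```python
from typing import Dict, List

Vector = List[float]


class CountingEncoder:
    """Toy stand-in for an expensive vision encoder; counts its invocations."""

    def __init__(self) -> None:
        self.calls = 0

    def encode(self, image_id: str) -> Vector:
        self.calls += 1
        # Deterministic toy "embedding"; a real encoder would run a
        # neural network over the image pixels here.
        return [float(ord(c) % 7) for c in image_id]


def build_offline_index(encoder: CountingEncoder, image_ids: List[str]) -> Dict[str, Vector]:
    """Batch job: encode every image once, ahead of any user request."""
    return {image_id: encoder.encode(image_id) for image_id in image_ids}


def serve_with_runtime_encoding(encoder: CountingEncoder, candidates: List[str]) -> List[Vector]:
    """Request path without precomputation: one encoder call per image."""
    return [encoder.encode(image_id) for image_id in candidates]


def serve_with_precomputed(index: Dict[str, Vector], candidates: List[str]) -> List[Vector]:
    """Request path with precomputation: dictionary lookups only."""
    return [index[image_id] for image_id in candidates]


catalog = ["pin_a", "pin_b", "pin_c"]

offline_encoder = CountingEncoder()
index = build_offline_index(offline_encoder, catalog)  # paid once, offline

request_encoder = CountingEncoder()
runtime_vectors = serve_with_runtime_encoding(request_encoder, catalog)
cached_vectors = serve_with_precomputed(index, catalog)
```

Both request paths return the same vectors, but the precomputed path makes no encoder calls while serving. The encoding cost moves to an offline job that can be rerun whenever the embeddings are retrained.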

“If it's something that's going to be critical for our end users, that's going to drive engagement, that will have to scale to over 600 million monthly active users, we're going to either probably build it or we're going to leverage open source and customize the heck out of it,” he said. 

How a taste graph captures evolving interests

To guide users from inspiration to purchase, Madrigal's team built a "taste graph": a dynamic representation of what individual users actually like, not just what they click on. “It's this representation of billions of people's evolving tastes,” he said. 

People go to Google or other search engines when they have a clear picture of what they want; Pinterest is for when they’re still in the discovery phase, Madrigal said. Pinterest’s goal is to encourage “lateral exploration” and turn discovery into intent (that is, clicking through ads or making purchases).

Under the hood, the architecture combines a graph structure with representational learning. User embeddings capture a user’s evolving tastes. These are constantly updated based on activity and new content and signals. “It's not a social graph,” Madrigal said. “It's much more of a preference graph: What's going to inspire you? What are you trying to do next?” 

For instance, one user may be into mid-century modern designs; another may prefer a Nantucket aesthetic. Those preferences are captured in user embeddings, and the taste graph surfaces specific, relevant products as a result.
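The embedding side of this idea can be sketched in a few lines. This is an illustration under stated assumptions, not Pinterest's method: the exponential-moving-average update, the cosine ranking, the two-dimensional vectors, and the item names are all hypothetical, and the graph structure itself is not modeled.

```python
import math
from typing import Dict, List

Vector = List[float]


def update_user_embedding(user: Vector, item: Vector, rate: float = 0.2) -> Vector:
    """Move the user vector a fraction `rate` of the way toward an engaged item."""
    return [(1.0 - rate) * u + rate * x for u, x in zip(user, item)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_items(user: Vector, items: Dict[str, Vector]) -> List[str]:
    """Order item names from most to least similar to the user vector."""
    return sorted(items, key=lambda name: cosine(user, items[name]), reverse=True)


# Hypothetical item embeddings: axis 0 roughly "mid-century modern",
# axis 1 roughly "Nantucket".
ITEMS: Dict[str, Vector] = {
    "walnut_sideboard": [1.0, 0.0],
    "wicker_armchair": [0.0, 1.0],
    "teak_coffee_table": [0.9, 0.1],
}

user: Vector = [0.5, 0.5]  # no clear preference yet
for engaged in ("walnut_sideboard", "teak_coffee_table"):
    user = update_user_embedding(user, ITEMS[engaged])

ranking = rank_items(user, ITEMS)
```

After two engagements with mid-century pieces, the user vector leans toward axis 0 and the wicker armchair ranks last. A smaller `rate` would make tastes drift more slowly, and a larger one would weight recent activity more heavily.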

“You go from the upper funnel, inspiration discovery, all the way through lower funnel intent,” Madrigal said. 

Listen to the full podcast to hear more about:

  • How Pinterest uses sandboxes to encourage creativity in a way that is secure and contained; 

  • Why a continuous feedback loop can prevent visual AI slop; 

  • The importance of constant benchmarking to gauge user engagement, performance, latency, and other factors. 

You can also listen and subscribe to Beyond the Pilot on Spotify, Apple or wherever you get your podcasts.
