1 min read · from Machine Learning
Why Is Table Extraction with VLM Models Still Challenging? [D]
Our take
Table extraction with Vision Language Models (VLMs) remains difficult, particularly for borderless tables and tables with more than five or six columns. The poster reports that open-source tools such as docling and marker fall short when converting PDFs to Markdown, especially for financial data. The one tool that worked well for them, LandingAI, is paid, so the open question is whether a comparable open-source alternative exists. If you have recommendations, they could help others facing the same problem.
![Why Is Table Extraction with VLM Models Still Challenging? [D]](/_next/image?url=https%3A%2F%2Fpreview.redd.it%2Ftajjcvjt5jyg1.png%3Fwidth%3D140%26height%3D113%26auto%3Dwebp%26s%3D093a8b5a7ba7d0e240beef7a6803c7d0023e73ae&w=3840&q=75)
Hey everyone, I’m struggling to find a good approach for converting PDFs to Markdown (especially for financial data). The main challenge is handling borderless tables and tables with more than 5–6 columns. I’ve tried docling, granite-docling, marker, etc., but haven’t found a solid open-source solution. The only thing that works well so far is LandingAI (but it’s paid). Does anyone know of a good open-source alternative? TIA! Sample: [link]
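To make the borderless-table problem concrete, here is a minimal sketch of the kind of whitespace heuristic a non-VLM pipeline might fall back on when a table has no ruling lines. It is not how docling, marker, or LandingAI work internally. The word boxes in `rows` are made-up coordinates standing in for what a PDF text layer would provide, and `find_gutters`, `to_markdown`, and `min_gap` are hypothetical names chosen for illustration.

```python
from typing import List, Tuple

# One word on the page: (x0, x1, text). Coordinates are hypothetical points.
Word = Tuple[float, float, str]


def find_gutters(rows: List[List[Word]], min_gap: float = 8.0) -> List[float]:
    """Return x-positions of vertical whitespace gaps shared by all rows."""
    # Collect every horizontal span, then merge spans closer than min_gap.
    spans = sorted((x0, x1) for row in rows for x0, x1, _ in row)
    merged: List[List[float]] = []
    for x0, x1 in spans:
        if merged and x0 - merged[-1][1] < min_gap:
            merged[-1][1] = max(merged[-1][1], x1)
        else:
            merged.append([x0, x1])
    # A column cut sits midway between two consecutive occupied bands.
    return [(a[1] + b[0]) / 2 for a, b in zip(merged, merged[1:])]


def to_markdown(rows: List[List[Word]], min_gap: float = 8.0) -> str:
    """Assign words to columns by their centre and emit a Markdown table."""
    cuts = find_gutters(rows, min_gap)
    table = []
    for row in rows:
        cells: List[List[str]] = [[] for _ in range(len(cuts) + 1)]
        for x0, x1, text in sorted(row):
            centre = (x0 + x1) / 2
            cells[sum(centre > c for c in cuts)].append(text)
        table.append([" ".join(cell) for cell in cells])
    header, *body = table
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(r) + " |" for r in body]
    return "\n".join(lines)


# Hypothetical word boxes for a three-column borderless table.
rows = [
    [(10, 40, "Item"), (120, 150, "2023"), (200, 230, "2024")],
    [(10, 30, "Net"), (33, 70, "revenue"), (115, 150, "1,200"), (195, 230, "1,350")],
    [(10, 45, "Costs"), (125, 150, "800"), (205, 230, "900")],
]
print(to_markdown(rows))
```

The heuristic only works while every column is separated by a clean vertical gap. A single long label or wrapped cell that crosses a gap merges two columns, and that gets more likely as columns are added and the gaps narrow, which fits the trouble reported above with tables wider than 5–6 columns.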