
Visual graph classification for blockchain security: Experiences fine-tuning Qwen2-VL on AMD MI300X [D]

Hi everyone,

I’ve been working on a computer vision approach to a specific security problem in the "Agentic Economy": identifying malicious transaction patterns that are mathematically obfuscated but topologically distinct.

The Problem

Traditional rule-based security engines, and even standard GNNs, often struggle with "splitting attacks," where a high-value transaction is fragmented into thousands of micro-transactions to slip under statistical thresholds. However, when these flows are projected as a 2D graph topology, they exhibit very specific adversarial signatures (star patterns, centralized hubs, mixing chains).
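To make the failure mode concrete, here is a toy stdlib-only illustration (the amounts, threshold, and wallet names are invented for the example): every micro-payment clears a naive per-transaction threshold, while the fan-out of the source wallet is a blunt topological giveaway.

```python
# Toy illustration: a $50,000 transfer split into 2,000 micro-transactions.
THRESHOLD = 100.0          # hypothetical per-transaction alert threshold
total, parts = 50_000.0, 2_000

# Each transaction is (sender, receiver, amount) = $25 per hop.
txs = [("attacker", f"mule_{i}", total / parts) for i in range(parts)]

# Rule-based check: nothing fires, every payment is far below the threshold.
flagged = [t for t in txs if t[2] >= THRESHOLD]

# Topological check: count out-edges per sender.
out_degree = {}
for src, _, _ in txs:
    out_degree[src] = out_degree.get(src, 0) + 1

print(len(flagged))              # 0    -> all micro-payments slip through
print(out_degree["attacker"])    # 2000 -> an obvious star hub
```

The same structure that defeats the threshold check is exactly what renders as an unmistakable star when the flow is drawn as an image.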

The Approach: VLM for Graph Classification

Instead of relying on graph embeddings, I’ve experimented with a Vision-Language approach using Qwen2-VL-2B-Instruct. The intuition is that VLMs have become increasingly adept at recognizing structural relationships in 2D layouts.

Technical Specs:

  • Base Model: Qwen2-VL-2B-Instruct.
  • Fine-tuning: LoRA (r=16, alpha=32) targeting attention projections (q, k, v, o).
  • Dataset (Dogon-10K): I generated 10,000 synthetic transaction graph images using NetworkX and Matplotlib. The dataset covers four classes: NORMAL, DRAIN_STAR, MIXING_CHAIN, and COORDINATED_CLUSTER.
  • Hardware / Stack: Trained on an AMD MI300X using the ROCm stack. This was a great opportunity to stress-test PEFT/TRL on AMD hardware for vision-centric tasks.
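For anyone reproducing the setup, the adapter hyperparameters listed above map naturally onto Hugging Face PEFT. This is a hedged reconstruction, not the post's actual training script: the dropout value and task type are my assumptions, and the target module names should be verified against the loaded Qwen2-VL checkpoint (e.g. via `model.named_modules()`).

```python
from peft import LoraConfig

# Reconstruction of the adapter config from the specs above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,       # assumption: not stated in the post
    task_type="CAUSAL_LM",   # assumption: typical choice for VLM SFT
)
```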

Why VLM over GNN?

While GNNs are the standard for graph data, the "image-based" approach allowed for faster prototyping of adversarial pattern recognition without the complexity of building a custom graph auto-encoder for every new chain's schema. The VLM’s ability to interpret "visual intent" proved highly effective at distinguishing a decentralized organic ecosystem from a coordinated sybil attack.
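The image-based prototyping loop described above is easy to reproduce. Below is a minimal, hypothetical sketch of what a Dogon-style generator might look like; the function names, sizes, and layout parameters are my assumptions, not the actual generator. The class constructors are pure stdlib edge lists, and NetworkX/Matplotlib are pulled in lazily only to rasterize a graph into a training image.

```python
import random

def drain_star(n_victims=30):
    """DRAIN_STAR: many wallets all sending to a single hub (edges as (src, dst))."""
    return [(f"w{i}", "attacker") for i in range(n_victims)]

def mixing_chain(length=20):
    """MIXING_CHAIN: funds hop through a long chain of fresh wallets."""
    return [(f"m{i}", f"m{i + 1}") for i in range(length)]

def normal(n=30, n_edges=40, seed=0):
    """NORMAL: sparse, roughly random organic transfers."""
    rng = random.Random(seed)
    edges = set()
    while len(edges) < n_edges:
        a, b = rng.sample(range(n), 2)
        edges.add((f"u{a}", f"u{b}"))
    return sorted(edges)

def render(edges, path):
    """Rasterize one transaction graph to an image file."""
    import networkx as nx
    import matplotlib
    matplotlib.use("Agg")          # headless rendering for batch generation
    import matplotlib.pyplot as plt
    G = nx.DiGraph(edges)
    nx.draw(G, nx.spring_layout(G, seed=42), node_size=40, arrows=False)
    plt.savefig(path)
    plt.close()

# Usage: render(drain_star(), "drain_star_0000.png")
```

A spring layout makes the topological signatures pop visually: the star collapses into a hub-and-spoke, the chain stretches into a line, and the organic graph spreads into a diffuse cloud, which is exactly the signal the VLM is being asked to read.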

Model & Code

The LoRA weights are available on Hugging Face for anyone interested in testing visual graph classification: 🔗 Hugging Face: https://huggingface.co/Ibonon/imina_na_lora

The full source code for the inference engine and the Dogon dataset generator is currently being cleaned up. 🔗 GitHub: [Under Construction]

I’m particularly interested in hearing if anyone else is using VLMs for visual anomaly detection in abstract data structures (like graphs or network logs).

submitted by /u/Any_Good_2682