
[P] QLoRA Fine-Tuning of Qwen2.5-1.5B for CEFR English Proficiency Classification (A1–C2)


I fine-tuned Qwen2.5-1.5B for multi-class CEFR English proficiency classification using QLoRA (4-bit NF4).

The goal was to classify English text into one of the 6 CEFR levels (A1 → C2), which can be useful for:

  • adaptive language learning systems,
  • placement testing,
  • readability estimation,
  • educational NLP applications.

Dataset

The dataset contains 1,785 English texts balanced across:

  • 6 CEFR levels,
  • 10 domains/topics.

The samples were synthetically generated using:

  • Groq API
  • Llama-3.3-70B

Generation constraints were designed to preserve:

  • vocabulary complexity,
  • grammatical progression,
  • sentence structure variation,
  • CEFR-specific linguistic patterns.
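The generation step can be sketched roughly as follows. The prompt wording, the `build_prompt` helper, and the `llama-3.3-70b-versatile` model id are assumptions for illustration, not the post's actual pipeline:

```python
from textwrap import dedent

def build_prompt(level: str, domain: str) -> str:
    """Constrain the generator to CEFR-specific vocabulary,
    grammar, and sentence structure for one (level, domain) cell."""
    return dedent(f"""\
        Write a short English text about {domain} at CEFR level {level}.
        Match the vocabulary complexity, grammatical progression, and
        sentence structure typical of {level}. Return only the text.""")

prompt = build_prompt("B1", "travel")
print(prompt)

# The actual generation call would go through the Groq Python client
# (requires GROQ_API_KEY):
# from groq import Groq
# client = Groq()
# resp = client.chat.completions.create(
#     model="llama-3.3-70b-versatile",
#     messages=[{"role": "user", "content": prompt}],
# )
# text = resp.choices[0].message.content
```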

Training Setup

Base model:

  • Qwen2.5-1.5B

Fine-tuning method:

  • QLoRA
  • 4-bit NF4 quantization
  • LoRA adapters

Only ~0.28% of model parameters were trained.
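As a rough sanity check on the ~0.28% figure, here is a back-of-envelope LoRA parameter count. The rank (r=16), the choice of target modules, and the Qwen2.5-1.5B dimensions used below are assumptions, not the post's reported hyperparameters:

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    # LoRA learns a low-rank update B @ A for a frozen d_out x d_in weight,
    # where A is r x d_in and B is d_out x r.
    return r * (d_in + d_out)

hidden = 1536   # Qwen2.5-1.5B hidden size
layers = 28     # Qwen2.5-1.5B transformer layers
r = 16          # assumed LoRA rank

# Treat the q/k/v/o attention projections as hidden -> hidden per layer
# (Qwen2.5 uses grouped-query attention, so k/v are smaller; this
# slightly overestimates).
per_layer = 4 * lora_param_count(hidden, hidden, r)
trainable = layers * per_layer
total = 1.54e9  # total parameters of Qwen2.5-1.5B
print(f"{trainable:,} trainable ({trainable / total:.2%} of 1.54B)")
```

Under these assumptions the estimate lands in the same ballpark as the reported ~0.28%, which is the point of the exercise rather than an exact reproduction.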

Results

Held-out test set:

  • 179 samples

Metrics:

  • Accuracy: 84.9%
  • Macro F1: 84.9%

Per-level recall:

  • A1: 96.6%
  • A2: 90.0%
  • B1: 90.0%
  • B2: 86.7%
  • C1: 86.7%
  • C2: 60.0%

Most errors come from C1/C2 confusion, which is expected due to the subtle linguistic boundary between those levels.
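For anyone commenting on the evaluation methodology, per-level recall is simply per-class hit rate over true labels; a minimal recomputation looks like this, where the (true, predicted) pairs are illustrative, not the actual test set:

```python
from collections import defaultdict

def per_level_recall(pairs):
    """Recall per CEFR level from (true_label, predicted_label) pairs."""
    hits, totals = defaultdict(int), defaultdict(int)
    for true, pred in pairs:
        totals[true] += 1
        hits[true] += (true == pred)
    return {lvl: hits[lvl] / totals[lvl] for lvl in totals}

# Illustrative C1/C2 confusion: each level classified correctly half the time.
pairs = [("C1", "C1"), ("C1", "C2"), ("C2", "C1"), ("C2", "C2")]
print(per_level_recall(pairs))  # {'C1': 0.5, 'C2': 0.5}
```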

Deployment

I also built:

  • a FastAPI inference API,
  • Docker deployment setup.
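A minimal sketch of what such an inference endpoint might look like. The route name, request schema, and the index → label order are assumptions; the FastAPI wiring is left in comments because it needs the loaded model, while the core `classify` step is framework-independent:

```python
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]  # assumed index -> label order

def classify(logits: list[float]) -> str:
    # argmax over the six CEFR class logits
    return LEVELS[max(range(len(logits)), key=logits.__getitem__)]

# FastAPI wiring (requires fastapi + uvicorn, and the model/tokenizer
# loaded as in the usage example below):
# from fastapi import FastAPI
# from pydantic import BaseModel
# app = FastAPI()
# class ClassifyRequest(BaseModel):
#     text: str
# @app.post("/classify")
# def endpoint(req: ClassifyRequest):
#     inputs = tokenizer(req.text, return_tensors="pt")
#     with torch.no_grad():
#         logits = model(**inputs).logits[0].tolist()
#     return {"level": classify(logits)}

print(classify([0.1, 0.2, 2.5, 0.3, 0.1, 0.0]))  # -> B1
```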

Example Usage

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained(
    "yanou16/cefr-english-classifier"
)
tokenizer = AutoTokenizer.from_pretrained(
    "yanou16/cefr-english-classifier"
)

text = "Artificial intelligence is transforming many industries."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

pred = outputs.logits.argmax(dim=-1).item()
print(pred)  # class index; map to a CEFR label via model.config.id2label

Feedback is welcome, especially regarding:

  • evaluation methodology,
  • synthetic data quality,
  • improving C2 classification performance,
  • better benchmarking approaches.
submitted by /u/Professional-Pie6704

