Qwen3.6-27B Abliteration Benchmarked: Five Techniques Under the Microscope

Five different groups abliterated the same AI model. When I ran the maths benchmarks, their scores ranged from 27.5% to 75.1%, a gap of 47.6 percentage points. It looks like some techniques made the model way better at maths and others broke it. But when I dug into why, it turned out none of the variants got smarter or dumber. The abliteration just changed how long each one thinks before answering. The real scores were all within 2.8 percentage points of each other.

May 17, 2026