Heretic is an open-source tool designed to automatically remove censorship from transformer-based language models without the need for expensive post-training.
Source: README
Heretic is gaining attention due to its innovative approach to censorship removal, which is both automatic and effective. It addresses the pain point of manual censorship removal, which is time-consuming and requires expertise. The project stands out for its use of directional ablation and TPE-based parameter optimization, which allows for high-quality abliteration without damaging the original model's intelligence.
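As an illustration of directional ablation in general, the sketch below removes the component along one direction from everything a weight matrix writes. It is a minimal pure-Python example, not Heretic's code: the names `ablate`, `strength`, and `refusal_dir`, and the toy matrix, are assumptions made for this example.

```python
import math

def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]

def ablate(weight, direction, strength=1.0):
    """Return W' = W - strength * r r^T W, where r is the unit direction.

    `weight` is a list of rows (one row per output dimension), and
    `direction` lives in the output space. `strength` is a hypothetical
    knob: 1.0 removes the direction fully, 0.0 leaves W unchanged.
    """
    r = unit(direction)
    rows, cols = len(weight), len(weight[0])
    # proj[j] = (r^T W)[j]: how much column j writes along the direction.
    proj = [sum(r[i] * weight[i][j] for i in range(rows)) for j in range(cols)]
    return [[weight[i][j] - strength * r[i] * proj[j] for j in range(cols)]
            for i in range(rows)]

def matvec(weight, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]

def component_along(v, direction):
    """Signed length of the projection of v onto the direction."""
    return sum(a * b for a, b in zip(v, unit(direction)))

# Toy data (assumed): a 3x2 matrix and a made-up "refusal" direction.
W = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
refusal_dir = [1.0, 2.0, 2.0]
x = [0.7, -1.3]

before = component_along(matvec(W, x), refusal_dir)
after = component_along(matvec(ablate(W, refusal_dir), x), refusal_dir)
print(f"component before: {before:.4f}, after: {after:.4f}")
```

With full strength the output no longer has any component along the chosen direction, whatever the input. Intermediate strengths only shrink that component, which is the kind of parameter an optimizer could tune.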
Source: Synthesis of README and project traits
Heretic automatically finds high-quality abliteration parameters by co-minimizing the number of refusals and KL divergence from the original model, resulting in a decensored model that retains much of the original model's intelligence.
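The two quantities being co-minimized can be sketched as follows. This is a hedged illustration rather than Heretic's implementation: the marker list, the function names, the direction of the KL term (original relative to modified), and the Pareto filter standing in for the optimizer's trial selection are all assumptions.

```python
import math

# Hypothetical refusal markers; a real list would be longer.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry")

def count_refusals(responses):
    """Count responses containing any refusal marker (case-insensitive)."""
    return sum(any(m in r.lower() for m in REFUSAL_MARKERS) for r in responses)

def softmax(logits):
    """Convert logits to probabilities, shifted by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same tokens."""
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0.0)

def score(responses, original_logits, modified_logits):
    """Return (refusal count, mean KL); lower is better on both."""
    kls = [kl_divergence(softmax(o), softmax(m))
           for o, m in zip(original_logits, modified_logits)]
    return count_refusals(responses), sum(kls) / len(kls)

def pareto_front(scores):
    """Keep the scores that no other score beats on both objectives."""
    def dominates(a, b):
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))
    return [s for s in scores if not any(dominates(o, s) for o in scores)]

# Toy trial results (assumed): (refusals, mean KL) per parameter setting.
trials = [(10, 0.1), (3, 0.5), (3, 0.9), (12, 0.2)]
print(pareto_front(trials))
```

Because the two objectives pull against each other, no single setting is best on both; the filter keeps the settings that represent a genuine trade-off and discards those that are worse on both counts.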
Source: README
Heretic supports most dense models, including many multimodal models, several different MoE architectures, and even some hybrid models like Qwen3.5.
Source: README
Heretic provides research features such as generating plots of residual vectors and printing details about residual geometry, which support the study of model semantics and interpretability.
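To give a sense of what residual geometry can mean in practice, the sketch below computes a difference-of-means direction and a cosine similarity for two small groups of vectors. The metrics chosen, the function names, and the two-dimensional toy vectors are assumptions for illustration; the source does not list which quantities Heretic actually prints.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

def refusal_direction(harmful, harmless):
    """Difference of the two group means (harmful minus harmless)."""
    return [h - l for h, l in zip(mean_vector(harmful), mean_vector(harmless))]

# Toy residual vectors (assumed); real ones have thousands of dimensions.
harmful = [[2.0, 0.0], [4.0, 0.0]]
harmless = [[0.0, 1.0], [0.0, 3.0]]

direction = refusal_direction(harmful, harmless)
similarity = cosine_similarity(mean_vector(harmful), mean_vector(harmless))
print("direction:", direction)
print("cosine similarity of group means:", similarity)
print("direction norm:", norm(direction))
```

A low cosine similarity between the group means suggests the two prompt groups occupy different regions of the residual space at that layer, which is the kind of structure a plot of residual vectors would make visible.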
Source: README
The architecture of Heretic is modular, with distinct components for analysis, configuration, evaluation, and system management. It uses a TPE-based parameter optimizer and directional ablation techniques. The code structure is organized into a clear hierarchy, with a focus on maintainability and scalability.
Source: Code tree + dependency files
[Figure: dependency diagram. Center: project; inner ring: core feature modules; outer ring: key dependencies.]
Key dependencies: accelerate, bitsandbytes, datasets, huggingface-hub, immutabledict, kernels, langdetect, lm-eval[hf], numpy, optuna, peft, psutil, py-cpuinfo, pydantic-settings, questionary, rich, tomli-w, tqdm, transformers.
Heretic is suitable for developers and researchers working with language models who need to remove censorship without manual intervention. It is useful in scenarios where models need to generate responses on sensitive topics, such as political or social issues.
Source: README
v1.3.0 (2026-05-05): Implemented reproducible runs.
Source: GitHub Releases
Heretic is a promising project for those working with language models who need to remove censorship efficiently. It is particularly suitable for developers and researchers who require high-quality abliteration without manual intervention and are willing to invest in computational resources.
Source: Synthesis