Instructions to use answerdotai/ModernBERT-base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use answerdotai/ModernBERT-base with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = AutoModelForMaskedLM.from_pretrained("answerdotai/ModernBERT-base")
```
- Notebooks
- Google Colab
- Kaggle
⚠️ Training Loss Drops to 0.0 and Validation Loss Becomes NaN When Fine-Tuning ModernBERT
#83
by onurulu17 - opened
Hi everyone,
I’m fine-tuning ModernBERT-base on a custom text corpus and running into a serious issue: after several thousand steps, the training loss suddenly collapses to 0.0 and the validation loss becomes NaN.
Here’s what I’m seeing in the logs:
| Step | Training Loss | Validation Loss |
| ---- | ------------- | --------------- |
| ...  | ...           | ...             |
| 4000 | 45.5973       | 43.5447         |
| 4500 | 35.7538       | 34.0837         |
| 5000 | 0.0000        | nan             |
| 5500 | 0.0000        | nan             |
| 6000 | 0.0000        | nan             |
| 6500 | 0.0000        | nan             |
| 7000 | 0.0000        | nan             |
After around step 5000, the loss becomes 0.0 and never recovers; the model stops updating entirely.
My training setup includes:
- bf16=True (on an A100 GPU)
- gradient_checkpointing=True
- lr_scheduler_type="cosine"
- learning_rate=3e-5
- per_device_train_batch_size=8
- gradient_accumulation_steps=9
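For reference, the setup above can be sketched as a `TrainingArguments` configuration (a minimal sketch: only the listed hyperparameters come from my run; `output_dir` is a placeholder, and everything else is left at Trainer defaults):

```python
from transformers import TrainingArguments

# Sketch of the reported fine-tuning setup; output_dir is a placeholder.
args = TrainingArguments(
    output_dir="modernbert-finetune",  # placeholder path
    bf16=True,                         # mixed precision on an A100 GPU
    gradient_checkpointing=True,
    lr_scheduler_type="cosine",
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=9,     # effective batch size: 8 * 9 = 72 per device
)
```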
Does anyone know the cause or have a solution for this issue?