Clarifications about the config properties with respect to the paper
In https://huggingface.co/answerdotai/ModernBERT-base/blob/main/config.json , we see "hidden_activation": "gelu" and "position_embedding_type": "absolute" (even though RoPE-related settings also appear in the config), whereas the paper says that GeGLU and RoPE are used, respectively. Is this expected, a quirk of the transformers library itself, or a misconfiguration/export issue? Thanks
As we mention in the paper, GeGLU is GLU with GeLU instead of sigmoid. "hidden_activation": "gelu" is correct.
We adopt GeGLU (Shazeer, 2020), a Gated-Linear Units (GLU)-based (Dauphin et al., 2017) activation function built on top of the original BERT’s GeLU.
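To make the relationship concrete, here is a minimal sketch in plain Python of a gated linear unit where only the gate activation changes. It is an illustration of the definition above, not the transformers implementation; the helper names (`matvec`, `gated_unit`, `glu`, `geglu`) and the weight matrices `W` and `V` are hypothetical, and biases are omitted for brevity.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def sigmoid(x: float) -> float:
    """Logistic sigmoid, the activation used by the original GLU."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def gated_unit(x, W, V, activation):
    """Generic gated unit: activation(W x) multiplied elementwise by (V x)."""
    activated = [activation(a) for a in matvec(W, x)]
    linear = matvec(V, x)
    return [a * b for a, b in zip(activated, linear)]


def glu(x, W, V):
    """GLU: the activated branch uses sigmoid."""
    return gated_unit(x, W, V, sigmoid)


def geglu(x, W, V):
    """GeGLU: same structure as GLU, with GELU in place of sigmoid."""
    return gated_unit(x, W, V, gelu)


# Hypothetical toy inputs: identity projections on a 2-dimensional vector.
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 2.0]
glu_out = glu(x, identity, identity)
geglu_out = geglu(x, identity, identity)
```

Read this way, a config entry naming "gelu" describes the activation applied inside the gated unit, while the gating itself is part of the layer structure rather than the activation name.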
I believe position_embedding_type is a default config argument in transformers. ModernBERT doesn't use it; I'll have to check whether we can remove it from the config.