Instructions to use answerdotai/ModernBERT-base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use answerdotai/ModernBERT-base with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = AutoModelForMaskedLM.from_pretrained("answerdotai/ModernBERT-base")
```
- Notebooks
- Google Colab
- Kaggle
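As a quick sanity check of the fill-mask snippet above, something along these lines should work (the example sentence is my own, not from the model card):

```python
from transformers import pipeline

pipe = pipeline("fill-mask", model="answerdotai/ModernBERT-base")
# ModernBERT uses the standard [MASK] token for masked-language modeling.
for pred in pipe("The capital of France is [MASK]."):
    print(pred["token_str"], pred["score"])
```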
Pretraining Using HF Tokenizers and Transformers
I am looking for an end-to-end example of pretraining a fresh ModernBERT model including the tokenizer (e.g. for a new language), or of fine-tuning an existing checkpoint (e.g. ModernBERT-base) with a custom tokenizer (to account for the different vocabulary of another language family).
A Hugging Face implementation is preferred (I saw this, but the current code is not working).
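Roughly, this is the first half of the flow I have in mind, just a sketch on my side; the vocab size, special tokens, and file names below are placeholders rather than anything from an official example:

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers
from transformers import PreTrainedTokenizerFast

# Train a fresh byte-level BPE tokenizer on a new-language corpus.
tok = Tokenizer(models.BPE(unk_token="[UNK]"))
tok.pre_tokenizer = pre_tokenizers.ByteLevel()
trainer = trainers.BpeTrainer(
    vocab_size=32000,  # placeholder
    special_tokens=["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"],
)
tok.train(files=["my_corpus.txt"], trainer=trainer)  # placeholder corpus file

# Wrap it so Transformers models and data collators can consume it.
hf_tokenizer = PreTrainedTokenizerFast(
    tokenizer_object=tok,
    unk_token="[UNK]", cls_token="[CLS]", sep_token="[SEP]",
    pad_token="[PAD]", mask_token="[MASK]",
)
hf_tokenizer.save_pretrained("my-new-tokenizer")
```

The second half would be plugging this tokenizer into a fresh or existing ModernBERT checkpoint for MLM training.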
Hello,
The pre-training codebase should do the trick; that is its main purpose, and it is optimized for it. While it uses Composer, you should be able to leverage HF models and tokenizers.
For continued pre-training, someone reported an issue with loading the ModernBERT weights, so we will investigate and potentially release Composer checkpoints alongside the HF ones when we release all the pre-training checkpoints (which, as stated in the issue, should be better starting points than the post-decay ones).
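If you want to stay purely on the HF side for continued pre-training, something along these lines should work as a starting point. This is a sketch, not our official recipe; the corpus path and hyperparameters are placeholders, and the custom-tokenizer step is optional:

```python
# Continued MLM pre-training of ModernBERT-base with plain Transformers.
from transformers import (
    AutoTokenizer, AutoModelForMaskedLM,
    DataCollatorForLanguageModeling, Trainer, TrainingArguments,
)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = AutoModelForMaskedLM.from_pretrained("answerdotai/ModernBERT-base")

# When swapping in a custom tokenizer for another language, the embedding
# matrix has to be resized (new token embeddings are randomly initialized):
# model.resize_token_embeddings(len(custom_tokenizer))

ds = load_dataset("text", data_files={"train": "my_corpus.txt"})  # placeholder corpus
ds = ds.map(
    lambda x: tokenizer(x["text"], truncation=True, max_length=1024),
    batched=True,
    remove_columns=["text"],
)

collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm_probability=0.3,  # 30% masking, following the ModernBERT setup
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="modernbert-continued",
        per_device_train_batch_size=8,  # placeholder
    ),
    train_dataset=ds["train"],
    data_collator=collator,
)
trainer.train()
```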
Thanks. I had another look at the repo and noticed that FlexBert uses the old bert-base tokenizer. I guess I should wait a bit, as the HF way of doing it may require some additional tweaks (e.g. issue 163).
Update: I took inspiration from this discussion and trained a tiny model.
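In case it helps anyone, the from-scratch part boils down to something like this (the sizes below are just an illustrative "tiny" configuration, not the exact one I used, and it requires a transformers release that ships ModernBERT); the MLM training loop is the same Trainer setup sketched earlier in the thread:

```python
from transformers import ModernBertConfig, ModernBertForMaskedLM

# A deliberately small ModernBERT configuration (illustrative values only).
config = ModernBertConfig(
    vocab_size=32000,        # should match the custom tokenizer
    hidden_size=256,
    intermediate_size=512,
    num_hidden_layers=4,
    num_attention_heads=4,
)
model = ModernBertForMaskedLM(config)  # fresh, randomly initialized weights
print(f"{model.num_parameters() / 1e6:.1f}M parameters")
```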