REX1 300M

REX is a recursive decoder-only Transformer language model. This repository uses custom Transformers code, so load it with trust_remote_code=True.

from transformers import AutoModelForCausalLM, AutoTokenizer

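# "." is a local checkout of this repository. trust_remote_code=True allows
# Transformers to import the custom REX model code that lives in the repo.
model = AutoModelForCausalLM.from_pretrained(".", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(".")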

.pt`.

Training Notes

  • Tokenizer: gpt2
  • Context length: 1024
  • Training output dir: runs/rex-300m-mixed-2
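Given the 1024-token context length listed above, long prompts may need to be clipped before generation. The following is a minimal sketch, assuming the model should never see more than 1024 tokens in total (prompt plus generated tokens). The helper name `clip_to_context` and the `max_new_tokens` budget are illustrative and not part of this repository.

```python
CONTEXT_LENGTH = 1024  # value taken from the training notes


def clip_to_context(token_ids, max_new_tokens=0, context_length=CONTEXT_LENGTH):
    """Keep only the most recent tokens so that the prompt plus
    max_new_tokens generated tokens fit inside the context window.

    Hypothetical helper: left-truncation is one common choice for
    causal language models, not something the repository prescribes.
    """
    if max_new_tokens < 0 or max_new_tokens >= context_length:
        raise ValueError("max_new_tokens must be in [0, context_length)")
    budget = context_length - max_new_tokens  # tokens available for the prompt
    return list(token_ids[-budget:])


# Example with plain integers standing in for token ids.
ids = list(range(2000))
clipped = clip_to_context(ids, max_new_tokens=24)
print(len(clipped))  # 1000
```

With a real tokenizer, `token_ids` would be something like `tokenizer(text)["input_ids"]`. Dropping the oldest tokens keeps the text closest to the point where generation continues.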

This is a base language model checkpoint and is not instruction-aligned unless noted.
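Because the checkpoint is a base model, it continues text rather than following instructions, so tasks are usually framed as a document to complete. The sketch below shows one way to do that with a few-shot prompt. It assumes the custom model class supports the standard Transformers `generate` method, which this card does not confirm. The helper names `few_shot_prompt` and `complete` and the `Q:`/`A:` layout are illustrative choices.

```python
def few_shot_prompt(examples, query):
    """Build a completion-style prompt from (question, answer) pairs.

    Hypothetical helper: the Q:/A: layout is just one convention that
    base models tend to continue sensibly.
    """
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {query}\nA:")  # leave the final answer open
    return "\n\n".join(blocks)


def complete(prompt, model_dir=".", max_new_tokens=50):
    """Greedy continuation of `prompt`.

    Assumes `generate` works with the custom REX code. Imports are
    deferred so few_shot_prompt can be used without loading the libraries.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True)
    model.eval()

    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        output_ids = model.generate(
            input_ids,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding for reproducibility
            pad_token_id=tokenizer.eos_token_id,  # the gpt2 tokenizer has no pad token
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


prompt = few_shot_prompt([("What is 2 + 2?", "4")], "What is 3 + 3?")
# text = complete(prompt)  # requires the checkpoint in the current directory
```

The returned string includes the prompt itself. Slicing off `len(prompt)` characters is a simple way to isolate the continuation.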

Safetensors: model size 0.3B params, tensor type F32