Dataset: sahil2801/CodeAlpaca-20k
How to use JonathanMiddleton/daisy with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="JonathanMiddleton/daisy")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("JonathanMiddleton/daisy", dtype="auto")
How to use JonathanMiddleton/daisy with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "JonathanMiddleton/daisy"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "JonathanMiddleton/daisy",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Or run the model with Docker:
docker model run hf.co/JonathanMiddleton/daisy
How to use JonathanMiddleton/daisy with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "JonathanMiddleton/daisy" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "JonathanMiddleton/daisy",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Or start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "JonathanMiddleton/daisy" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "JonathanMiddleton/daisy",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
How to use JonathanMiddleton/daisy with Docker Model Runner:
docker model run hf.co/JonathanMiddleton/daisy
Custom byte-level BPE tokenizer trained for the Daisy language model, optimized for Python code and instruction-following tasks.
| Property | Value |
|---|---|
| Vocabulary size | 49,152 |
| Algorithm | Byte-level BPE |
| Pre-tokenizer | Llama-3 style regex |
| Chat format | ChatML |
| Max length | 131,072 tokens |
| Training date | 2026-01-14 |

- `<|tool_call|>` / `<|tool_result|>` patterns
- `<|python|>` / `<|output|>` for calculator-style reasoning
- `<|think|>` tokens for reasoning blocks

| Token | ID | Purpose |
|---|---|---|
| `<\|endoftext\|>` | 49131 | End of sequence / BOS |
| `<\|pad\|>` | 49132 | Padding token |
| `<\|im_start\|>` | 49133 | Start of message (ChatML) |
| `<\|im_end\|>` | 49134 | End of message (ChatML) |
| `<\|tool_call\|>` | 49135 | Start of tool call |
| `<\|/tool_call\|>` | 49136 | End of tool call |
| `<\|tool_result\|>` | 49137 | Start of tool result |
| `<\|/tool_result\|>` | 49138 | End of tool result |
| `<\|python\|>` | 49139 | Start of Python expression |
| `<\|/python\|>` | 49140 | End of Python expression |
| `<\|output\|>` | 49141 | Start of computed output |
| `<\|/output\|>` | 49142 | End of computed output |
| `<\|think\|>` | 49143 | Start of reasoning block |
| `<\|/think\|>` | 49144 | End of reasoning block |
| `<\|system\|>` | 49145 | System role marker |
| `<\|user\|>` | 49146 | User role marker |
| `<\|assistant\|>` | 49147 | Assistant role marker |
| `<\|reserved_0\|>` | 49148 | Reserved |
| `<\|reserved_1\|>` | 49149 | Reserved |
| `<\|reserved_2\|>` | 49150 | Reserved |
| `<\|reserved_3\|>` | 49151 | Reserved |

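
The special-token IDs listed above can be sanity-checked with a few lines of Python. The sketch below copies the IDs from the table and checks that they form one contiguous block at the top of the vocabulary. Reading the first special ID as the size of the ordinary (non-special) vocabulary is an assumption, although it would match the 49,131 figure that appears in the benchmark tables.

```python
# Special-token IDs copied from the table above.
SPECIAL_TOKENS = {
    "<|endoftext|>": 49131,
    "<|pad|>": 49132,
    "<|im_start|>": 49133,
    "<|im_end|>": 49134,
    "<|tool_call|>": 49135,
    "<|/tool_call|>": 49136,
    "<|tool_result|>": 49137,
    "<|/tool_result|>": 49138,
    "<|python|>": 49139,
    "<|/python|>": 49140,
    "<|output|>": 49141,
    "<|/output|>": 49142,
    "<|think|>": 49143,
    "<|/think|>": 49144,
    "<|system|>": 49145,
    "<|user|>": 49146,
    "<|assistant|>": 49147,
    "<|reserved_0|>": 49148,
    "<|reserved_1|>": 49149,
    "<|reserved_2|>": 49150,
    "<|reserved_3|>": 49151,
}

ids = sorted(SPECIAL_TOKENS.values())

# True when the IDs cover every integer between the smallest and largest ID.
is_contiguous = ids == list(range(ids[0], ids[-1] + 1))

# Assumption: every ID below the first special token is an ordinary BPE token.
base_vocab_size = ids[0]
total_vocab_size = ids[-1] + 1

print(is_contiguous, base_vocab_size, len(SPECIAL_TOKENS), total_vocab_size)
```

Under that assumption, 49,131 ordinary tokens plus 21 special tokens give the 49,152 total listed in the properties table.
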
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("jonathanmiddleton/daisy")
# Basic encoding
tokens = tokenizer.encode("Hello, world!")
# Chat formatting
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"},
{"role": "assistant", "content": "Hi there! How can I help you?"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False)
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
{assistant_message}<|im_end|>
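
The template above can be reproduced in plain Python for illustration. This is a minimal sketch, assuming a single newline after each role name and after each `<|im_end|>`; `render_chatml` is a hypothetical helper, and the tokenizer's own `apply_chat_template` remains the authoritative source for exact whitespace.

```python
def render_chatml(messages, add_generation_prompt=False):
    """Render {"role", "content"} dicts in the ChatML layout shown above.

    Hypothetical helper: the newline placement is an assumption, not taken
    from the tokenizer's actual chat template.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
prompt = render_chatml(messages, add_generation_prompt=True)
print(prompt)
```
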
<|im_start|>assistant
Let me calculate that for you.
<|tool_call|>{"name": "calculator", "arguments": {"expression": "2 + 2"}}<|/tool_call|>
<|tool_result|>4<|/tool_result|>
The answer is 4.<|im_end|>
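
On the consuming side, a caller has to pull the JSON payload out of the `<|tool_call|>` span and wrap the tool's answer in `<|tool_result|>` tags before generation continues. The sketch below shows one way to do that with the standard library. It assumes the payload is always valid JSON, as in the example above; `extract_tool_calls` and `format_tool_result` are hypothetical helper names.

```python
import json
import re

# Non-greedy match so several tool calls in one message are captured separately.
TOOL_CALL_RE = re.compile(r"<\|tool_call\|>(.*?)<\|/tool_call\|>", re.DOTALL)


def extract_tool_calls(text):
    """Return the parsed JSON payload of every tool-call span in `text`."""
    return [json.loads(payload) for payload in TOOL_CALL_RE.findall(text)]


def format_tool_result(result):
    """Wrap a tool's output in the result tags used in the example above."""
    return f"<|tool_result|>{result}<|/tool_result|>"


generated = (
    "Let me calculate that for you.\n"
    '<|tool_call|>{"name": "calculator", "arguments": {"expression": "2 + 2"}}<|/tool_call|>'
)

calls = extract_tool_calls(generated)
print(calls[0]["name"], calls[0]["arguments"])
print(format_tool_result(4))
```
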
Benchmarked against common tokenizers on Python code, prose, and instruction data:

Python code:

| Tokenizer | Vocab Size | Chars/Token | Tokens |
|---|---|---|---|
| meta-llama/Llama-3.2-3B-Instruct | 128,000 | 4.391 | 88,644 |
| Qwen/Qwen2.5-1.5B-Instruct | 151,643 | 4.366 | 89,139 |
| HuggingFaceTB/SmolLM2-135M-Instruct | 49,152 | 3.906 | 99,650 |
| JonathanMiddleton/daisy | 49,131 | 3.766 | 103,349 |
| microsoft/phi-2 | 50,257 | 3.628 | 107,290 |
| openai-community/gpt2 | 50,257 | 3.152 | 123,467 |
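
The Chars/Token and Tokens columns can be cross-checked against each other. The sketch below assumes Chars/Token means total characters divided by total tokens (the source does not state the formula) and multiplies the two columns of the table above to recover the implied corpus size for each tokenizer. If the assumption holds, every row should give roughly the same character count.

```python
# (chars_per_token, token_count) pairs copied from the table above.
python_results = {
    "meta-llama/Llama-3.2-3B-Instruct": (4.391, 88_644),
    "Qwen/Qwen2.5-1.5B-Instruct": (4.366, 89_139),
    "HuggingFaceTB/SmolLM2-135M-Instruct": (3.906, 99_650),
    "JonathanMiddleton/daisy": (3.766, 103_349),
    "microsoft/phi-2": (3.628, 107_290),
    "openai-community/gpt2": (3.152, 123_467),
}

# Assumption: chars_per_token = total_chars / total_tokens,
# so total_chars is approximately chars_per_token * total_tokens.
implied_chars = {
    name: round(chars_per_token * tokens)
    for name, (chars_per_token, tokens) in python_results.items()
}

for name, chars in implied_chars.items():
    print(f"{name}: ~{chars:,} characters")

# Difference between the largest and smallest estimate, caused by rounding
# of the published Chars/Token values.
spread = max(implied_chars.values()) - min(implied_chars.values())
print("spread:", spread)
```

All six rows land near 389,000 characters, which is consistent with every tokenizer having been run on the same text.
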

Prose:

| Tokenizer | Vocab Size | Chars/Token | Tokens |
|---|---|---|---|
| meta-llama/Llama-3.2-3B-Instruct | 128,000 | 4.681 | 466,617 |
| JonathanMiddleton/daisy | 49,131 | 4.594 | 475,422 |
| openai-community/gpt2 | 50,257 | 4.584 | 476,460 |
| microsoft/phi-2 | 50,257 | 4.584 | 476,461 |
| Qwen/Qwen2.5-1.5B-Instruct | 151,643 | 4.563 | 478,607 |
| HuggingFaceTB/SmolLM2-135M-Instruct | 49,152 | 4.475 | 488,120 |

Instruction data:

| Tokenizer | Vocab Size | Chars/Token | Tokens |
|---|---|---|---|
| meta-llama/Llama-3.2-3B-Instruct | 128,000 | 4.771 | 737,130 |
| Qwen/Qwen2.5-1.5B-Instruct | 151,643 | 4.731 | 743,360 |
| JonathanMiddleton/daisy | 49,131 | 4.487 | 783,803 |
| HuggingFaceTB/SmolLM2-135M-Instruct | 49,152 | 4.455 | 789,399 |
| microsoft/phi-2 | 50,257 | 4.437 | 792,658 |
| openai-community/gpt2 | 50,257 | 4.254 | 826,711 |

Chars/Token by category:

| Tokenizer | Python | Prose | Instruction | Average |
|---|---|---|---|---|
| meta-llama/Llama-3.2-3B-Instruct | 4.391 | 4.681 | 4.771 | 4.614 |
| Qwen/Qwen2.5-1.5B-Instruct | 4.366 | 4.563 | 4.731 | 4.554 |
| JonathanMiddleton/daisy | 3.766 | 4.594 | 4.487 | 4.282 |
| HuggingFaceTB/SmolLM2-135M-Instruct | 3.906 | 4.475 | 4.455 | 4.278 |
| microsoft/phi-2 | 3.628 | 4.584 | 4.437 | 4.216 |
| openai-community/gpt2 | 3.152 | 4.584 | 4.254 | 3.997 |
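
The rankings implied by these tables can be recomputed directly. The sketch below copies the Chars/Token values and vocabulary sizes from the tables and ranks Daisy both across all six tokenizers and within the group of similar-sized vocabularies. The 60,000 cut-off for "similar-sized" is an assumption chosen to separate the ~50K vocabularies from Llama-3.2 and Qwen2.5.

```python
# Vocabulary sizes and Chars/Token values copied from the tables above.
summary = {
    "meta-llama/Llama-3.2-3B-Instruct": {"vocab": 128_000, "python": 4.391, "prose": 4.681, "instruction": 4.771},
    "Qwen/Qwen2.5-1.5B-Instruct": {"vocab": 151_643, "python": 4.366, "prose": 4.563, "instruction": 4.731},
    "JonathanMiddleton/daisy": {"vocab": 49_131, "python": 3.766, "prose": 4.594, "instruction": 4.487},
    "HuggingFaceTB/SmolLM2-135M-Instruct": {"vocab": 49_152, "python": 3.906, "prose": 4.475, "instruction": 4.455},
    "microsoft/phi-2": {"vocab": 50_257, "python": 3.628, "prose": 4.584, "instruction": 4.437},
    "openai-community/gpt2": {"vocab": 50_257, "python": 3.152, "prose": 4.584, "instruction": 4.254},
}

SMALL_VOCAB_LIMIT = 60_000  # assumption: threshold for "similar-sized" vocabularies
DAISY = "JonathanMiddleton/daisy"


def rank_of(target, names, category):
    """1-based rank of `target` among `names`; higher Chars/Token ranks first."""
    ordered = sorted(names, key=lambda name: summary[name][category], reverse=True)
    return ordered.index(target) + 1


small = [name for name, row in summary.items() if row["vocab"] < SMALL_VOCAB_LIMIT]

# Maps category -> (rank among all six, rank among similar-sized tokenizers).
daisy_ranks = {
    category: (rank_of(DAISY, list(summary), category), rank_of(DAISY, small, category))
    for category in ("python", "prose", "instruction")
}

for category, (overall, among_small) in daisy_ranks.items():
    print(f"{category}: rank {overall} of {len(summary)} overall, "
          f"rank {among_small} of {len(small)} among similar-sized")
```
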
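Key findings: Daisy achieves competitive compression with a ~49K vocabulary. Among the four similar-sized (~50K) tokenizers tested, it ranks 1st on prose and instruction data and 2nd on Python code, behind SmolLM2. Across all six tokenizers it ranks 2nd on prose, 3rd on instruction data, and 4th on Python code; only the much larger Llama-3.2 and Qwen2.5 vocabularies score a higher average Chars/Token.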
Apache 2.0