Fine-tuned Models
This directory contains supervised fine-tuned models for our three-source interaction experiments. The models are trained on data constructed from either CommonsenseQA (CSQA) or GSM8K, using different source-interaction variants described in the paper.
Model Naming Convention
Each model folder follows the format:
{dataset}__{base_model}__{training_variant}_r{rank}_bs{batch_size}_lr{learning_rate}_e{epochs}
Example:
csqa__llama3_8b_instruct__all_variants_r8_bs4_lr1e5_e3
Fields
dataset
- csqa: training data constructed from the CommonsenseQA dataset.
- gsm8k: training data constructed from the multiple-choice GSM8K dataset.
base_model
- llama3_8b_instruct: Llama 3 8B Instruct model.
- qwen3_8b: Qwen3 8B non-thinking instruct model.
training_variant
- all_variants: mixed SFT using all source-interaction probe variants, including bare, single-source, and double-source patterns.
- bare100: standard SFT using only bare prompts, without external user or document assertions.
r
- LoRA rank. For example, r8 means LoRA rank 8.
bs
- Training batch size. For example, bs4 means batch size 4.
lr
- Learning rate. For example, lr1e5 means learning rate 1e-5.
e
- Number of training epochs. For example, e3 means 3 epochs.
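The naming convention above can be split back into its fields programmatically. The sketch below is illustrative only (the `parse_model_name` helper and its regex are not part of this repo); it assumes the format `{dataset}__{base_model}__{training_variant}_r{rank}_bs{batch_size}_lr{learning_rate}_e{epochs}` exactly as described above.

```python
import re

# Assumed layout: {dataset}__{base_model}__{variant}_r{rank}_bs{bs}_lr{lr}_e{epochs}
# Double underscores separate the three top-level fields; single underscores
# may appear inside each field, so the hyperparameter suffixes anchor the split.
NAME_RE = re.compile(
    r"^(?P<dataset>.+?)__(?P<base_model>.+?)__"
    r"(?P<variant>.+)_r(?P<rank>\d+)_bs(?P<batch_size>\d+)"
    r"_lr(?P<lr>.+?)_e(?P<epochs>\d+)$"
)

def parse_model_name(name: str) -> dict:
    """Split a model folder name into its fields (hypothetical helper)."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"not a valid model folder name: {name}")
    # Note: the lr field is kept verbatim; "1e5" in a folder name
    # stands for a learning rate of 1e-5.
    return m.groupdict()

print(parse_model_name("csqa__llama3_8b_instruct__all_variants_r8_bs4_lr1e5_e3"))
```

For example, the call above yields dataset `csqa`, base model `llama3_8b_instruct`, variant `all_variants`, rank 8, batch size 4, learning rate `1e5` (i.e. 1e-5), and 3 epochs.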
Training Variants
bare100
This setting fine-tunes the model only on standard question-answer examples without external assertions. It corresponds to the standard SFT baseline.
all_variants
This setting fine-tunes the model on diverse source-interaction patterns. The training data includes the bare prompt, single-source prompts, and double-source prompts involving user and document assertions. These variants are designed to teach the model to better distinguish helpful external information from harmful or misleading information.
For details about the probe construction and SFT setup, see the methodology and fine-tuning sections of the paper.
Available Models
- csqa__llama3_8b_instruct__all_variants_r8_bs4_lr1e5_e3
- csqa__llama3_8b_instruct__bare100_r8_bs4_lr1e5_e3
- csqa__qwen3_8b__all_variants_r8_bs4_lr1e5_e3
- csqa__qwen3_8b__bare100_r8_bs4_lr1e5_e3
- gsm8k__llama3_8b_instruct__all_variants_r8_bs4_lr1e5_e3
- gsm8k__llama3_8b_instruct__bare100_r8_bs4_lr1e5_e3
- gsm8k__qwen3_8b__all_variants_r8_bs4_lr1e5_e3
- gsm8k__qwen3_8b__bare100_r8_bs4_lr1e5_e3