How to get model architecture/parameter names from the previous version

#76

by zekeZZ - opened Jan 31, 2024

Jan 31, 2024

Hi. I previously fine-tuned a Phi model, before the update made in https://huggingface.co/microsoft/phi-1_5/commit/d3ba318b780bfb92942c28853066fe4036d1b496. Now, when I try to load my model using the same code as before, model = AutoModelForCausalLM.from_pretrained(model_path, use_flash_attention_2=False, trust_remote_code=True, device_map=device), I get an error saying that some weights are not used, as mentioned in https://huggingface.co/microsoft/phi-1_5/discussions/70. But I think my issue is different, since I am trying to load a previously trained model. Is there an easy way to fix this? Thanks!
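One way to see which parameter names changed between versions is to compare the key names stored in the fine-tuned checkpoint with the key names the currently loaded model class expects. The sketch below shows only the comparison step. The helper name diff_param_names and the example key names are illustrative assumptions, not taken from the actual Phi checkpoints.

```python
def diff_param_names(checkpoint_keys, model_keys):
    """Compare parameter names in a checkpoint with those a model expects.

    Returns a dict with two sorted lists:
      - "unused_in_checkpoint": names present in the checkpoint only
        (these would be reported as weights that are not used)
      - "missing_from_checkpoint": names the model expects but the
        checkpoint lacks (these would be newly initialized)
    """
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        "unused_in_checkpoint": sorted(ckpt - model),
        "missing_from_checkpoint": sorted(model - ckpt),
    }


# Illustrative key names only (assumption): stand-ins for an old-style
# and a new-style naming scheme, not read from real checkpoints.
old_keys = ["layers.0.wte.weight", "layers.1.mixer.Wqkv.weight"]
new_keys = ["model.embed_tokens.weight", "model.layers.0.self_attn.q_proj.weight"]

report = diff_param_names(old_keys, new_keys)
for name in report["unused_in_checkpoint"]:
    print("in checkpoint, not in model:", name)
for name in report["missing_from_checkpoint"]:
    print("in model, not in checkpoint:", name)
```

In practice, the checkpoint names would come from the keys of the saved state dict, and the model names from model.state_dict().keys() on a freshly instantiated model. If every name appears in both lists under a different spelling, the checkpoint was most likely saved with the older modeling code.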

zekeZZ

Jan 31, 2024

One thing I have tried is loading the config and the model with a pinned revision:
config = AutoConfig.from_pretrained('microsoft/phi-1_5', revision='24f9ea14df973a49a0d87c16d04df88d90067468', trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, config=config, use_flash_attention_2=False, torch_dtype=torch.float32, revision='24f9ea14df973a49a0d87c16d04df88d90067468', trust_remote_code=True, device_map=device)
However, this gives AttributeError: 'PhiConfig' object has no attribute 'attention_dropout'. This suggests that the config file is successfully retrieved at the given revision, but the model initialization is not using that revision.

shubham008

Apr 23, 2024

I am facing the following error with my fine-tuned model: "The repository for microsoft/phi-1_5 contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/microsoft/phi-1_5. Please pass the argument trust_remote_code=True to allow custom code to be run." Can anyone suggest a solution?
