SDMatte: Grafting Diffusion Models for Interactive Matting
This repository contains the model and code for SDMatte, as presented in the paper:
SDMatte: Grafting Diffusion Models for Interactive Matting
Abstract
Recent interactive matting methods have shown satisfactory performance in capturing the primary regions of objects, but they fall short in extracting fine-grained details in edge regions. Diffusion models, trained on billions of image-text pairs, demonstrate exceptional capability in modeling highly complex data distributions and synthesizing realistic texture details, while exhibiting robust text-driven interaction capabilities, making them an attractive solution for interactive matting. To this end, we propose SDMatte, a diffusion-driven interactive matting model, with three key contributions. First, we exploit the powerful priors of diffusion models and transform the text-driven interaction capability into a visual prompt-driven interaction capability to enable interactive matting. Second, we integrate coordinate embeddings of visual prompts and opacity embeddings of target objects into the U-Net, enhancing SDMatte's sensitivity to spatial position and opacity information. Third, we propose a masked self-attention mechanism that enables the model to focus on areas specified by visual prompts, leading to better performance. Extensive experiments on multiple datasets demonstrate the superior performance of our method, validating its effectiveness in interactive matting.
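The abstract's third contribution, masked self-attention, can be illustrated with a toy example. The sketch below is not the SDMatte implementation: it assumes the simplest form of masking, in which attention logits for key positions outside the visual prompt are set to negative infinity before the softmax, and it uses identity projections for queries, keys, and values. The function names `softmax` and `masked_self_attention` and the `prompt_mask` argument are hypothetical and chosen for illustration only.

```python
import math


def softmax(row):
    """Numerically stable softmax; entries equal to -inf receive weight 0."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def masked_self_attention(tokens, prompt_mask):
    """Single-head self-attention whose keys are restricted to prompt_mask.

    tokens: list of N feature vectors (lists of floats), one per spatial position.
    prompt_mask: list of N booleans, True where the visual prompt covers the position.
    Assumption: identity Q/K/V projections, which a real model would replace
    with learned linear layers.
    """
    if not any(prompt_mask):
        raise ValueError("prompt_mask must select at least one position")
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for query in tokens:
        logits = []
        for key, allowed in zip(tokens, prompt_mask):
            score = sum(q * k for q, k in zip(query, key)) * scale
            # Positions outside the prompt are excluded from the softmax.
            logits.append(score if allowed else float("-inf"))
        weights = softmax(logits)
        outputs.append([
            sum(w * value[d] for w, value in zip(weights, tokens))
            for d in range(dim)
        ])
    return outputs


if __name__ == "__main__":
    # Three positions; only the first two lie inside the (hypothetical) prompt.
    features = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    inside_prompt = [True, True, False]
    for row in masked_self_attention(features, inside_prompt):
        print(row)
```

Every output row is a weighted average of the features inside the prompt region only, so the third position, which lies outside it, contributes nothing to any output. In practice the same masking is usually applied to batched tensors rather than Python lists.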
Code and Usage
The official code and model are available at the following GitHub repository: https://github.com/vivoCameraResearch/SDMatte
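This card does not describe the inference entry points; those are in the GitHub repository. As a hedged illustration of what a matting result is typically used for, the sketch below applies the standard compositing equation, out = alpha * foreground + (1 - alpha) * background, to place a subject on a new background. It assumes an alpha matte with values in [0, 1] has already been produced by some means. The `composite` function and its nested-list image layout are hypothetical and are not part of the SDMatte code.

```python
def composite(foreground, background, alpha):
    """Blend two images per pixel with an alpha matte.

    foreground, background: lists of rows, each row a list of (r, g, b) tuples.
    alpha: list of rows of floats in [0, 1], same height and width as the images.
    Returns a new image in the same layout.
    """
    result = []
    for fg_row, bg_row, a_row in zip(foreground, background, alpha):
        out_row = []
        for fg_px, bg_px, a in zip(fg_row, bg_row, a_row):
            # out = alpha * fg + (1 - alpha) * bg, applied per channel
            out_row.append(tuple(a * f + (1.0 - a) * b for f, b in zip(fg_px, bg_px)))
        result.append(out_row)
    return result


if __name__ == "__main__":
    fg = [[(200.0, 100.0, 50.0), (200.0, 100.0, 50.0)]]
    bg = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
    matte = [[1.0, 0.5]]  # fully opaque pixel, then a half-transparent edge pixel
    print(composite(fg, bg, matte))
```

Fractional alpha values along edges such as hair or fur are what make the blend look natural, which is why the abstract stresses fine-grained detail in edge regions.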
Model tree for LongfeiHuang/SDMatte
- Base model: stabilityai/stable-diffusion-2