AI & ML interests

PII detection, data anonymization, privacy-preserving AI, LLM security, information extraction, multilingual NLP

Organization Card

DomainShield

Privacy protection for LLM pipelines

DomainShield is a research project focused on preventing sensitive data from leaking when applications call external large language model (LLM) APIs.

Overview

The system acts as a middleware firewall between an application and the external LLM API:

  • Masks sensitive information before sending data to external LLMs
  • Handles both general PII and domain-specific sensitive entities
  • Reconstructs the original content after receiving the response
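
The mask, send, and reconstruct steps above can be illustrated with a minimal sketch. It is not DomainShield's implementation: the function names (`mask`, `unmask`, `protected_call`), the placeholder format, the email regex, and the `domain_terms` list are all assumptions made for illustration, and a real detector would be far more capable than a regex plus a term list.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical, deliberately simple email pattern (assumption, not from the project).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask(text: str, domain_terms: List[str]) -> Tuple[str, Dict[str, str]]:
    """Replace emails and known domain terms with placeholders.

    Returns the masked text and a placeholder -> original value mapping.
    """
    mapping: Dict[str, str] = {}          # placeholder -> original value
    seen: Dict[str, str] = {}             # original value -> placeholder

    def placeholder_for(value: str, label: str) -> str:
        # Reuse the same placeholder when the same value appears again.
        if value not in seen:
            placeholder = f"[{label}_{len(seen) + 1}]"
            seen[value] = placeholder
            mapping[placeholder] = value
        return seen[value]

    masked = EMAIL_RE.sub(lambda m: placeholder_for(m.group(0), "EMAIL"), text)
    # Longest terms first so a short term does not split a longer one.
    for term in sorted(domain_terms, key=len, reverse=True):
        masked = re.sub(
            re.escape(term),
            lambda m: placeholder_for(m.group(0), "TERM"),
            masked,
        )
    return masked, mapping


def unmask(text: str, mapping: Dict[str, str]) -> str:
    """Put the original values back wherever a placeholder survived."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


def protected_call(
    prompt: str, llm: Callable[[str], str], domain_terms: List[str]
) -> str:
    """Mask the prompt, call the external model, then restore the response."""
    masked, mapping = mask(prompt, domain_terms)
    response = llm(masked)                # only masked text leaves the boundary
    return unmask(response, mapping)
```

Here `llm` stands in for any external API client. Keeping the mapping on the caller's side means the external service only ever sees placeholders, although reconstruction only works if the model copies those placeholders into its response unchanged.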

Key Focus

  • PII masking (names, emails, identifiers)
  • Domain-specific entity protection (internal terms, codes, private vocabularies)
  • Multilingual robustness under noisy conditions
  • Comparison of adaptation methods (prompting, RAG, fine-tuning, NER)

Approach

We evaluate multiple strategies for detecting and masking sensitive data:

  • Prompt-based methods
  • Retrieval-augmented approaches (RAG)
  • Supervised fine-tuning (LoRA)
  • Token classification (NER)
  • Hybrid and ensemble methods
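
As a rough illustration of the hybrid and ensemble idea, the sketch below combines several detectors by taking the union of their character spans and merging overlaps. The union rule, the `Span` tuple layout, and the two toy detectors (a regex and a lexicon lookup) are assumptions chosen for brevity; in practice a prompt-based, RAG-based, fine-tuned, or NER detector could plug in behind the same `Detector` signature, and other combination rules (voting, confidence weighting) are equally plausible.

```python
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]               # (start, end, label), end exclusive
Detector = Callable[[str], List[Span]]

# Hypothetical, deliberately simple email pattern (assumption).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def regex_email_detector(text: str) -> List[Span]:
    """Toy general-PII detector: emails only."""
    return [(m.start(), m.end(), "EMAIL") for m in EMAIL_RE.finditer(text)]


def make_lexicon_detector(terms: List[str], label: str = "TERM") -> Detector:
    """Toy domain-specific detector: case-insensitive lookup of known terms."""
    def detect(text: str) -> List[Span]:
        spans: List[Span] = []
        for term in terms:
            for m in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
                spans.append((m.start(), m.end(), label))
        return spans
    return detect


def merge_spans(spans: List[Span]) -> List[Span]:
    """Merge overlapping spans; the earliest-starting span keeps its label."""
    merged: List[Span] = []
    for start, end, label in sorted(spans):
        if merged and start < merged[-1][1]:
            prev_start, prev_end, prev_label = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_label)
        else:
            merged.append((start, end, label))
    return merged


def ensemble_detect(text: str, detectors: List[Detector]) -> List[Span]:
    """Union ensemble: anything flagged by any detector is treated as sensitive."""
    all_spans = [span for detect in detectors for span in detect(text)]
    return merge_spans(all_spans)


def redact(text: str, spans: List[Span]) -> str:
    """Replace each (non-overlapping, sorted) span with its label."""
    parts: List[str] = []
    cursor = 0
    for start, end, label in spans:
        parts.append(text[cursor:start])
        parts.append(f"[{label}]")
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts)
```

A union favors recall over precision: it masks more aggressively than any single detector, which suits a leakage-prevention setting but can over-mask harmless text.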

Status

Active research project. Models, benchmarks, and demos are coming soon; none are public yet.
