I’m exploring AI-enhanced Master Data Management (MDM) systems and whether transformer models are feasible for dynamic entity resolution. What strategies exist for minimizing inference latency while maintaining match precision in production-grade systems? I’d also appreciate hearing about any experience integrating models like BERT or DistilBERT into data pipelines.