FMD: Comprehensive Data Compression in Medical Domain via Fused Matching Distillation

Ju Heon Son, Jang Hwan Choi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Medical datasets are often large and contain sensitive information, presenting significant challenges for data sharing and storage. To address these issues, this paper introduces a novel method called Fused Matching Distillation (FMD), which combines multiple dataset distillation techniques to achieve both data compression and enhanced privacy. FMD synthesizes representative subsets that capture the essential information from the original dataset while anonymizing sensitive details during the distillation process. The proposed approach integrates two complementary methods: parameter matching, a technique that aligns the training trajectories of a teacher network trained on real data with those of a student network trained on synthetic data, and feature distribution matching, which ensures that the synthetic dataset closely approximates the feature distribution of the original data. By fusing these techniques, FMD maximizes the information density within each pixel of the distilled dataset, achieving a balance between compression and performance. Experimental evaluations on medical datasets, including COVID chest X-ray and Pancreas cancer CT, demonstrate that FMD achieves superior accuracy and privacy compared to existing methods. Furthermore, the proposed method is evaluated using metrics for both model performance and anonymity, showing that FMD not only maintains high diagnostic accuracy but also effectively anonymizes the data. This makes FMD a promising tool for secure and efficient medical data sharing. The code is available at https://github.com/juheonewha/FMD.git.
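
The two objectives named in the abstract can be illustrated with a minimal, framework-free sketch. This is not the authors' implementation: it assumes the forms commonly used in the dataset distillation literature (a normalized squared distance between student and teacher parameters, and a squared distance between mean feature embeddings), and the function names, the `weight` parameter, and the additive fusion are hypothetical choices made only for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Sum of squared element-wise differences between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(rows: Sequence[Vector]) -> List[float]:
    """Column-wise mean of a batch of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def parameter_matching_loss(student_end: Vector,
                            teacher_start: Vector,
                            teacher_end: Vector) -> float:
    """Distance from the student's parameters (trained on synthetic data)
    to the teacher's later checkpoint (trained on real data), normalized
    by how far the teacher itself moved. Assumed form, not taken from the paper."""
    return (squared_distance(student_end, teacher_end)
            / squared_distance(teacher_start, teacher_end))


def distribution_matching_loss(real_feats: Sequence[Vector],
                               syn_feats: Sequence[Vector]) -> float:
    """Squared distance between the mean feature embedding of a real batch
    and that of a synthetic batch. Assumed form, not taken from the paper."""
    return squared_distance(mean_vector(real_feats), mean_vector(syn_feats))


def fused_loss(student_end: Vector,
               teacher_start: Vector,
               teacher_end: Vector,
               real_feats: Sequence[Vector],
               syn_feats: Sequence[Vector],
               weight: float = 1.0) -> float:
    """Hypothetical fusion: a weighted sum of the two matching terms."""
    return (parameter_matching_loss(student_end, teacher_start, teacher_end)
            + weight * distribution_matching_loss(real_feats, syn_feats))


if __name__ == "__main__":
    # Toy values only; real use would take parameters and features from networks.
    loss = fused_loss(
        student_end=[0.5, 0.5, 0.5],
        teacher_start=[0.0, 0.0, 0.0],
        teacher_end=[1.0, 1.0, 1.0],
        real_feats=[[0.0, 0.0], [2.0, 2.0]],
        syn_feats=[[2.0, 1.0]],
        weight=0.5,
    )
    print(loss)
```

In an actual distillation loop the synthetic images would be updated by gradient descent on such a loss through an automatic differentiation framework; the sketch only shows how the two terms could be computed and combined.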

Original language: English
Title of host publication: Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3406-3415
Number of pages: 10
ISBN (Electronic): 9798331510831
State: Published - 2025
Event: 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025 - Tucson, United States
Duration: 28 Feb 2025 - 4 Mar 2025

Publication series

Name: Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025

Conference

Conference: 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025
Country/Territory: United States
City: Tucson
Period: 28/02/25 - 4/03/25

Bibliographical note

Publisher Copyright:
© 2025 IEEE.

Keywords

  • dataset distillation
  • distribution matching
  • medical dataset
  • parameter matching
  • privacy preservation
