BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//132.216.98.100//NONSGML kigkonsult.se iCalcreator 2.20.4//
BEGIN:VEVENT
UID:20250810T151948EDT-1801tZv6JC@132.216.98.100
DTSTAMP:20250810T191948Z
DESCRIPTION:Mitigating the impact of data bias through synthetic data generators\n\nLamin Juwara\, CHEO & University of Ottawa.\nTuesday March 28\, 12-1pm\nZoom Link: https://mcgill.zoom.us/j/86855481591\n\nAbstract: Data bias is a pervasive problem in biomedical research\, especially in large-scale observational studies. During statistical modeling\, underrepresentation of specific covariate categories (e.g.\, gender\, ethnicity) in the training cohort typically results in inconsistent estimations and imprecise predictions. While various bias-mitigating approaches have been proposed in recent years\, these methods are not always effective\, especially when the source of bias is unclear or the severity is extreme (e.g.\, more than 50% missing covariate category). We propose a novel bias-mitigating approach that combines the simplicity of random oversampling and the utility of synthetic data generation. The approach involves augmenting the biased training cohort with randomly selected synthetic samples of the minor covariate category in order to rebalance the covariate distributions. The approach is termed Synthetic Minor Augmentation (SMA) and is demonstrated through extensive simulations and applications to several real data examples.\n\nIn this talk\, I will review the current standards for mitigating data bias at the data analysis stage. I will then demonstrate how synthetic data generation could be simultaneously utilized to preserve data privacy and mitigate the impact of data bias. In particular\, I will show how the use of light gradient boosting machines for data synthesis could generate suitable supplementary samples of the underrepresented covariate categories. The resulting data model is compared to some current standards including subsampling and matching.\n
DTSTART:20230328T160000Z
DTEND:20230328T170000Z
SUMMARY:QLS Seminar Series - Lamin Juwara
URL:/qls/channels/event/qls-seminar-series-lamin-juwara-346554
END:VEVENT
END:VCALENDAR