Intel and Penn Medicine Announce Results of Largest Medical Federated Learning Study
Intel Labs and the Perelman School of Medicine at the University of Pennsylvania (Penn Medicine) have completed a joint research study using federated learning – a distributed machine learning (ML) approach to artificial intelligence (AI) – to help international healthcare and research institutions identify malignant brain tumors. The largest medical federated learning study to date, the project examined an unprecedented global dataset from 71 institutions across six continents and demonstrated the ability to improve brain tumor detection by 33%.
“Federated learning has tremendous potential across numerous domains, particularly within healthcare, as shown by our research with Penn Medicine. Its ability to protect sensitive information and data opens the door for future studies and collaboration, especially in cases where datasets would otherwise be inaccessible. Our work with Penn Medicine has the potential to positively impact patients across the globe and we look forward to continuing to explore the promise of federated learning.”
–Jason Martin, principal engineer, Intel Labs
Why It Matters: Data accessibility has long been an issue in healthcare because of state and national data privacy laws, including the Health Insurance Portability and Accountability Act (HIPAA). As a result, medical research and data sharing at scale have been almost impossible to achieve without compromising patient health information. Intel’s federated learning hardware and software address these data privacy concerns and preserve data integrity, privacy and security through confidential computing.
The Penn Medicine-Intel result was accomplished by processing high volumes of data in a decentralized system using Intel federated learning technology paired with Intel® Software Guard Extensions (SGX), which removes data-sharing barriers that have historically prevented collaboration on similar cancer and disease research. The system addresses numerous data privacy concerns by keeping raw data inside the data holders’ compute infrastructure and only allowing model updates computed from that data to be sent to a central server or aggregator, not the data itself.
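The data flow described above can be illustrated with a minimal sketch of federated averaging, a common aggregation scheme in federated learning. This is not the software used in the study, and the announcement does not say which aggregation method was applied; the function names (local_update, aggregate, federated_round), the toy linear model and the learning rate are all assumptions made for illustration. The confidential-computing layer provided by Intel SGX is not modeled here.

```python
from typing import List, Tuple

Weights = List[float]              # toy model: [slope, intercept]
Sample = Tuple[float, float]       # one private (x, y) record held by a site


def local_update(weights: Weights, data: List[Sample],
                 lr: float = 0.1) -> Tuple[Weights, int]:
    """Train on one site's private data and return only the new weights.

    The raw samples never leave this function; the caller receives the
    updated weights and the number of samples used (hypothetical design).
    """
    w, b = weights
    for x, y in data:
        err = (w * x + b) - y      # prediction error on this sample
        w -= lr * err * x          # gradient step for the slope
        b -= lr * err              # gradient step for the intercept
    return [w, b], len(data)


def aggregate(updates: List[Tuple[Weights, int]]) -> Weights:
    """Combine site updates into one model, weighted by sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(wts[i] * n for wts, n in updates) / total
            for i in range(dim)]


def federated_round(global_weights: Weights,
                    sites: List[List[Sample]]) -> Weights:
    """One round: each site trains locally, the aggregator averages."""
    updates = [local_update(list(global_weights), data) for data in sites]
    return aggregate(updates)
```

In this sketch the aggregator sees only weight vectors and sample counts. Repeating federated_round lets the shared model improve while each site's records stay within its own infrastructure.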
“All of the computing power in the world can’t do much without enough data to analyze,” said Rob Enderle, principal analyst, Enderle Group. “This inability to analyze data that has already been captured has significantly delayed the massive medical breakthroughs AI has promised. This federated learning study showcases a viable path for AI to advance and achieve its potential as the most powerful tool to fight our most difficult ailments.”
Senior author Spyridon Bakas, PhD, assistant professor of Pathology & Laboratory Medicine and Radiology at the Perelman School of Medicine, said, “In this study, federated learning shows its potential as a paradigm shift in securing multi-institutional collaborations by enabling access to the largest and most diverse dataset of glioblastoma patients ever considered in the literature, while all data are retained within each institution at all times. The more data we can feed into machine learning models, the more accurate they become, which in turn can improve our ability to understand and treat even rare diseases, such as glioblastoma.”
To advance the treatment of diseases, researchers must access large amounts of medical data – in most cases, datasets that exceed the threshold that one facility can produce. The research demonstrates the effectiveness of federated learning at scale and the potential benefits the healthcare industry can realize when multisite data silos are unlocked. Benefits include early detection of disease, which could improve quality of life or increase a patient’s lifespan.
The results of the Penn Medicine-Intel Labs research were published in the peer-reviewed journal Nature Communications.
Source: Intel