AI Briefing Paper

Executive Summary

Artificial Intelligence (AI) is being increasingly adopted across a variety of sectors in the United States, becoming a major priority for the U.S. Government. The healthcare industry has been one of the leading adopters of AI, as healthcare providers have applied AI to improve patient outcomes, increase organizational efficiency, and reduce costs. Much of this growth in AI can be attributed to the expanding availability and sharing of health data. The Office of the Chief Technology Officer (CTO) at the U.S. Department of Health and Human Services (HHS) is hosting a Roundtable on Sharing and Utilizing Health Data for AI Applications on April 16, 2019. This Roundtable will address priorities for applying AI at HHS and opportunities for HHS to enable the continued growth of AI in the healthcare sector.

This Briefing Paper serves as a primer for participants attending that Roundtable, and is divided into the following sections:

Introduction
Background and terminology
Emerging applications of AI in healthcare
Emerging applications of AI within HHS
Data requirements, accountability, and ethics
National policy context and international AI strategies

This briefing paper incorporates journal articles, blog posts, and news articles that feature overarching trends, applications, and challenges. This document is not intended to provide a complete overview of all AI applications and research in the healthcare space, but highlights useful background information for Roundtable participants that may help to frame the day’s discussions.

Introduction

Artificial Intelligence (AI) has the ability to help transform health care by improving diagnosis, treatment, and the provision of patient care. Much of this progress depends on sharing and utilizing large amounts of health data. While the private sector has driven much of the innovation in this field, the federal government can play a major role. As a major federal data steward, the U.S. Department of Health and Human Services (HHS) can support this transformation by enabling the application of AI inside and outside of government.

The HHS Office of the Chief Technology Officer (CTO) is now exploring the potential for a department-wide AI strategy to help realize these opportunities, and to establish policies and practices for facilitating AI development. This strategy comes in tandem with the White House Executive Order on Artificial Intelligence and the “State of Data Sharing at the US Department of Health and Human Services Report” published in September 2018. This Roundtable will bring together HHS leaders, and experts in AI and health data from other federal and state government agencies, industry, academia, and patient-centered research organizations. Together, they will identify high priority health applications of AI and the key issues for an HHS AI strategy to address. The Roundtable will inform an HHS AI strategic plan in two ways:

Priorities for Applying AI within HHS: AI approaches to help HHS manage its own data, facilitate HHS research, or help HHS achieve its mission in other ways.
Opportunities to Support External AI Development: Activities to help support data-driven AI applications in industry, academia, and research institutions, such as the release of specific datasets, issuing AI challenges, and other actions.

This Briefing Paper is not a complete overview of all AI applications and research in the healthcare space, but highlights useful background information for Roundtable participants.

Background and Terminology

Artificial Intelligence denotes the process by which a computer can be trained to successfully perform tasks that traditionally have been conducted by humans. This includes learning how to perform certain tasks either under supervision or autonomously using large quantities of data.

Machine Learning, Deep Learning & Natural Language Processing

Machine Learning is the process by which computers are trained to ‘learn’ by exposing them to data. Machine learning is a subset of AI and Deep Learning is a further subset of machine learning. Deep Learning is the process by which algorithms can learn to identify hierarchies within data that allow for truly complex understandings of data. Natural Language Processing (NLP) refers to the subfield of machine learning designed to allow computers to examine, extract, and interpret data that is structured within a language.

Algorithms

Most AI applications depend on algorithms, which describe a logical process that follows a set of rules. Computers can be taught a series of steps in order to process large amounts of data to produce a desired outcome. There are two forms of algorithm:

Supervised algorithms use ‘training data sets’ in which the input factors and output are known in advance. They can produce highly accurate models because the ‘right answers’ are already known. For example, scientists may feed the algorithm a dataset of retina images for which board-certified physicians have already identified and agreed upon a diagnosis for each image.
Unsupervised algorithms refer to a process whereby data is fed into the algorithm and the computer has to ‘learn’ what to look for. Unlike the training data sets fed into supervised algorithms, the data fed into unsupervised algorithms do not necessarily include the ‘right answers.’ Unsupervised algorithms are adept at finding clusters of relationships between observations in the data, but may identify erroneous relationships because they are not instructed what to look for.
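The contrast between the two forms can be illustrated with a minimal sketch. It is not drawn from any system described in this paper: the toy values, the labels, and the function names are all hypothetical, and a nearest-centroid classifier and two-cluster k-means are used only as simple stand-ins for supervised and unsupervised learning. The clustering function assumes the input contains at least two distinct values.

```python
from statistics import mean

# Hypothetical toy data: one numeric feature per record (e.g., a lab value).
# The labels are illustrative only and do not come from the briefing paper.
labeled = [(1.0, "healthy"), (1.2, "healthy"), (0.8, "healthy"),
           (3.9, "disease"), (4.1, "disease"), (4.3, "disease")]

def train_supervised(examples):
    """Supervised: learn one centroid per known label."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def predict(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def cluster_unsupervised(values, iterations=10):
    """Unsupervised: two-cluster k-means on values with no labels attached."""
    low, high = min(values), max(values)  # initial cluster centers
    for _ in range(iterations):
        groups = ([], [])
        for v in values:
            groups[0 if abs(v - low) <= abs(v - high) else 1].append(v)
        low, high = mean(groups[0]), mean(groups[1])
    return groups

centroids = train_supervised(labeled)
print(predict(centroids, 3.5))          # disease

unlabeled = [value for value, _ in labeled]
print(cluster_unsupervised(unlabeled))  # ([1.0, 1.2, 0.8], [3.9, 4.1, 4.3])
```

The supervised function can name its answer because the labels were supplied in advance. The unsupervised function only reports that two groups exist, and a person still has to decide what, if anything, those groups mean.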

Augmented Intelligence

Augmented Intelligence is a form of AI that enhances human capabilities rather than replacing physicians and healthcare providers. Augmented Intelligence has been embraced by physician organizations to underscore that emerging AI systems are designed to aid humans in clinical decision-making, implementation, and administration to scale health care. In a recent White Paper, Intel framed Augmented Intelligence as the AI tools that perform specific tasks and are designed to support users, rather than replace human users.

Data

At the core of AI is the need for high quality data. Algorithms are designed around data, which underscores the importance of providing clean and accurate data. Researchers emphasize the need for big, multifaceted datasets that allow machine learning processes to incorporate as many factors as possible into analysis. Doing so will allow researchers to better understand their data, which in turn allows them to improve their training of the algorithm. Artificial intelligence also demands clear, accountable data governance with defined data elements and processes for ensuring data quality and access. Researchers now are attempting to funnel large troves of health data - from electronic health records (EHRs) to data collected from wearable devices and sensors - to improve diagnostics and predictive analytics. More connected and interoperable data in the healthcare system will enable more transformative AI applications in the future.

Emerging Applications of AI in Healthcare

Deep Learning for Diagnostics

One of the more promising avenues for progress in healthcare is the application of AI for assisting physicians in accurately diagnosing potentially tricky medical conditions.

Diagnosing Cancers: High-capacity computing has led to a reduction of error in the diagnosis of breast cancer by 85 percent. High powered computing, in combination with intelligent AI software, allows physicians to better identify metastasized cancerous lymph nodes, and will support future researchers to diagnose and treat oncology patients early.
Diabetic Retinopathy: AI has been used to learn how to diagnose diabetic retinopathy with high levels of accuracy. Diabetic retinopathy is the world’s leading cause of blindness, and researchers at Google have trained algorithms to analyze images of retinas and diagnose the condition with over 90 percent accuracy.
Cardiology: Some cardiologists have integrated AI into their work, and recent applications of cardiology-oriented AI have passed medical exams when diagnosing echocardiograms.

As diagnostic tools, AI algorithms are emerging as powerful partners for physicians, supplementing their own diagnoses with an immediately-available ‘second opinion.’

Large-Scale Predictive Analytics

In addition to diagnosing active problems, AI and big data can be used to engage in predictive analytics designed to reduce infections and hospital readmissions. Below is a short list of some of the most notable use cases that apply predictive analytics:

Diagnosing Sepsis: Diagnosing and treating sepsis is extremely difficult. Recent research by Imperial College London demonstrated an AI application where algorithms learned how to best treat individuals with sepsis by analyzing data from many medical records and treatment plans rather than relying on the data found within the individual’s EHRs. This tool, known as the AI Clinician, was the result of collaboration between five Intensive Care Units (ICUs) and over 17,000 observed admissions for sepsis in the United States. In a recent trial, research showed that patients treated by a combination of medical doctors and AI algorithms had the lowest mortality rates of all sepsis patients within the study.
Avoiding Hospital Infections: Health Catalyst has focused on helping hospitals avoid infections that patients acquire while in the hospital. Another private firm, H20, uses predictive analytics to identify when patients should be transferred to and from intensive care units (ICUs), predict hospital acquired infections (HAI), and prevent hospital readmissions by comparing individual-level data against a large dataset.
Treatment Paths: Recent work by AI company Ayasdi produced their Clinical Variation Management tool, which developed a set of clusters of treatment paths to make recommendations about optimal treatment paths for future patients.

Electronic Health Record (EHR) Management

Health providers and researchers use NLP and other AI applications to better organize and analyze electronic health records (EHRs). For example, researchers at UC San Francisco, Stanford Medicine, and The University of Chicago Medicine, along with the Google AI team, published “Scalable and Accurate Deep Learning with Electronic Health Records” in Nature Partner Journals: Digital Medicine. The researchers propose integrating EHRs into the existing Fast Healthcare Interoperability Resources (FHIR) standard that was developed by HL7, a healthcare IT standards organization. The researchers demonstrate that by automatically integrating EHRs into the FHIR framework using machine learning, the AI application could make predictions about medical events on a per-patient basis with 93 percent accuracy.

Risk Stratification

AI-driven risk stratification has already garnered significant attention in the research community and the private sector. According to Eric Just, the Senior Vice President of Product Development at Health Catalyst, “[Accountable care organizations] have to be able to pinpoint which heart failure patients are at high risk for readmission as well as for sudden cardiac arrest. Armed with this knowledge, clinicians can schedule follow-up appointments and ensure those patients understand their medications and other aspects of the care plan.” Within the past few years, technology companies have released AI-driven risk stratification tools including:

Lumiata’s Risk Matrix: This tool is trained on 175 million patient records to produce a comprehensive evaluation of patient risk of disease and medical condition. Recent research by De Beule et al. shows that AI can classify patients into ‘low risk’ categories with nearly 90 percent accuracy, helping reduce the likelihood of future illness or complications, and to facilitate recovery.
NLP Applications: Amazon Web Services has used NLP to extract and interpret hand-written notes and text from medical records. NLP is particularly well-suited to deciphering physician input and patient records, since EHRs do not follow a single, unified structure, yet contain important information for understanding diagnostic trends and risk profiles of individuals.

The implementation of AI, when combined with access to large scale datasets, can vastly improve the ability of the medical community to categorize risk at the individual level. Several private sector companies are leading the path to deciphering unstructured health data.

Emerging Applications of AI within HHS

Emerging applications refer to late-stage prototypes and early-stage implementations of AI-driven tools. HHS already has a number of tools and is currently exploring additional applications for both internal and external use, including streamlining administrative processes and improving fraud detection. These are detailed below.

HHS Supporting AI Applications in Healthcare

HHS Tech Sprint for Health Innovation: HHS recently completed a “tech sprint” engaging outside teams on the use of AI to match patients to appropriate clinical trials, improving physicians’ ability to critically test health solutions. The challenge was modeled on The Opportunity Project (TOP), a government project that uses agile development methods to build digital tools with federal data. In the 14-week sprint led by the HHS Office of the CTO, tech teams developed seven AI-driven solutions that use federal data to “improve clinical trials, experimental therapies, and data-driven solutions for complex challenges from cancer to Lyme and tick-borne diseases.”

The Centers for Disease Control and Prevention (CDC) Workbench Web Services: The CDC, in partnership with the Food and Drug Administration (FDA), has created a suite of open source web services that use NLP to transform narrative text from clinical trials, laboratory reports, and medical records into coded, structured data. Given the time constraints of workforce members translating narrative text, the web services can free up staff time to better code text. Moreover, the CDC’s Informatics Innovation Unit hopes the initiative will connect engineers and medical end users to jointly develop Machine Learning and NLP tools to decode narrative text.

NIH Artificial Intelligence Workshop: The National Institutes of Health (NIH) hosted “Harnessing Artificial Intelligence and Machine Learning to Advance Biomedical Research” last June. The meeting convened healthcare and AI leaders to discuss practical applications of AI in different healthcare settings. Experts reviewed the use of AI to further in-home non-invasive patient monitoring, its application to radiology, and new biomedical research opportunities. Participants underscored the value of open data and the need for AI training tools.

BARDA DRIVe Program: The Biomedical Advanced Research and Development Authority (BARDA) currently spearheads the DRIVe program, which identifies and enables the use of disruptive technology to further solutions to today’s most pressing public health issues. As an example, the Early Notification to Act, Control, and Treat (ENACT) uses new, disruptive technologies, including AI, to improve the prediction and treatment of infectious diseases. The ENACT program employs AI-powered solutions like chatbots and telemedicine to “bring clinical care into the home” and provide patients with quick diagnosis and treatment for diseases like influenza.

FDA’s Software Precertification Program: With the advent of Software as a Medical Device, or SaMD, the FDA has established a nimbler set of principles to evaluate the integrity of medical organizations and streamline the premarket review for new medical products. The more iterative process ensures that products that meet the basic requirements of safety and effectiveness can quickly be adopted by patients who are seeking to manage their data. Many of these software functions will increasingly leverage machine learning and algorithmic decision-making processes to improve their efficacy.

Improving Administrative Processes and Addressing Fraud

BUYSMARTER: The BUYSMARTER initiative is a transformative, data-driven initiative leveraging the collective purchasing power of HHS to secure lower prices, achieve operational efficiencies, and generate cost savings on goods and services. BUYSMARTER uses AI technology to analyze department requirements based on current HHS-wide spend data. This helps identify opportunities to consolidate contract vehicles across agencies within HHS to leverage overlapping requirements at significant cost savings for the federal government.

CMS Improper Payment Fraud Review: The Centers for Medicare & Medicaid Services (CMS) aims to use statistical analysis to identify fraudulent and improper payments made to healthcare providers. In 2018, CMS determined that 8.12 percent of all Medicare payments were improper. In order to address this problem, CMS employs a testing methodology called the Comprehensive Error Rate Testing (CERT) and uses AI to predict fraudulent and improper healthcare payments. The use of AI to predict fraudulent Medicare and Medicaid claims has saved the government approximately $42 billion, according to CMS. In addition to improper payments, AI can also be used for detecting potentially fraudulent billing practices. In the United States, over 3.2 trillion dollars are spent on healthcare, leading to a large market for potential fraud. Companies are working to reduce fraud via the application of AI for evaluating claims and predicting fraud.

HHS Office of the Inspector General: The HHS Office of the Inspector General (OIG) has focused on harnessing the power of big data to spot fraudulent claims. By developing an Integrated Data Repository, OIG has collected “petabytes of data on patient claims, providers, risk scores and other topics which investigators continuously analyze to spot potential wrongdoers.” In June 2018, OIG was able to identify and arrest doctors inappropriately prescribing opioids. The advent of big databases on prescriptions and medical claims allows OIG to classify providers and patients according to its Fraud Risk Indicator scale.

CMMI Innovation Center: In its early stages, CMS has developed an Innovation Center designed to improve healthcare and reduce costs associated with Medicare and Medicaid programs. Two initiatives are worth mentioning:

Quality Payment Program (QPP): The QPP seeks to gather data from healthcare providers to evaluate how doctors use technology to support their medical practice and decisions. QPP allows for two-way communications between practitioners and CMS and seeks to improve its ability to pay practitioners by collecting and analyzing countrywide data.
CMS Innovation Center Artificial Intelligence Health Outcomes Challenge: The challenge invites participants to use AI to “predict health outcomes that are important to patients and clinicians, and to enhance care delivery.” HHS has used similar initiatives, such as the Opioid Symposium and Code-a-thon, to raise awareness of the importance of tackling the opioid problem and develop innovative solutions using existing data on opioids.

Data Requirements, Accountability, and Ethics

Data Quality & Access

Machine-learning algorithms require access to high-quality, unbiased data that can be used to teach the algorithm what to look for. Big Data is playing an increasingly central role in healthcare, but health data poses a unique challenge, as much of it is unavailable to the public due to privacy laws. Moreover, datasets from different sources are often structured differently to one another, and are not easily collated into a larger dataset. Moving forward, researchers, practitioners, and healthcare companies will need to develop methods for collecting and integrating data while protecting individual privacy.

Data Privacy & Security

One major risk in using machine-learning algorithms is that the identities of the individuals within datasets may be inadvertently revealed. Berkeley Professor Anil Aswani noted in his recent research that “It is possible to identify individuals by learning daily patterns in step data (like that collected by activity trackers, smartwatches and smartphones) and correlating it to geographic and demographic data. The mining of two years’ worth of data covering more than 15,000 Americans led to the conclusion that the privacy standards associated with 1996’s HIPAA (Health Insurance Portability and Accountability Act) legislation need to be revisited and reworked.”

Data de-identification has also emerged as a promising area to maintain individual privacy, allowing providers to analyze highly granular individual-level data without revealing the identity of each patient in the dataset. This approach would allow researchers to access large datasets of public and private health data to help train algorithms. As the needs of patient privacy grow, AI applications will also need to operate in environments in which data security is paramount. A recent report by the Healthcare Industry Cybersecurity Task Force indicated that data security is insufficient at most hospitals, leaving patient data open to exploitation.
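The trade-off between granular data and re-identification risk can be sketched in a few lines. The example below is a minimal illustration under stated assumptions and is not a HIPAA-compliant procedure: the records, field names, and helper functions are hypothetical, and generalization plus a k-anonymity style check are used here as one common way to reason about disclosure risk, not as a method named in the sources above.

```python
from collections import Counter

# Hypothetical records; field names and values are illustrative only.
records = [
    {"name": "A. Smith", "age": 34, "zip": "20852", "diagnosis": "asthma"},
    {"name": "B. Jones", "age": 37, "zip": "20850", "diagnosis": "diabetes"},
    {"name": "C. Lee", "age": 62, "zip": "20814", "diagnosis": "asthma"},
]

def deidentify(record):
    """Drop the direct identifier (name) and coarsen the quasi-identifiers."""
    decade = (record["age"] // 10) * 10
    return {
        "age_band": f"{decade}-{decade + 9}",  # exact age becomes a 10-year band
        "zip3": record["zip"][:3],             # keep only the first three ZIP digits
        "diagnosis": record["diagnosis"],
    }

def smallest_group(rows, keys=("age_band", "zip3")):
    """Size of the smallest group of rows sharing the same quasi-identifiers."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return min(counts.values())

released = [deidentify(r) for r in records]
print(released[0])               # {'age_band': '30-39', 'zip3': '208', 'diagnosis': 'asthma'}
print(smallest_group(released))  # 1
```

A smallest group of 1 means at least one person is still unique on age band and ZIP prefix even after names are removed, which is the kind of residual risk that arises when released data is correlated with outside geographic and demographic information.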

Accuracy

Beyond privacy and data concerns, AI must also address issues with hazardous outcomes and incorrect suggestions. Because algorithms are developed by individuals and use data collected by individuals, they are susceptible to human error and may make erroneous predictions, diagnoses, and suggestions. In some cases, this has included developing incorrect and unsafe treatment plans for patients. These outcomes suggest that AI algorithms should be vetted by board-certified clinicians who can learn from algorithmic suggestions but ignore them when necessary.

Accountability & Ethics

While AI has positive potential, it can also cause damage through miscalibrated algorithms or by researchers overestimating the quality of the data. Policymakers and healthcare providers must develop a system of governance and accountability as AI becomes more integrated into healthcare settings. This framework would assess the trustworthiness of algorithmic decisions, use path-tracing to understand how algorithms produced their outcome, and ensure that healthcare providers are responsible for how they utilize AI in their practice. This growing grey area of accountability is exemplified by decisions made in part by algorithms, including machine-assisted and machine-led surgeries. As AI plays an increasing role in managing patient records, diagnosing illnesses, and developing treatment plans, questions about accountability in healthcare and the limits of algorithms have come into focus.

Algorithm auditing is a potential solution whereby the outcomes of AI and machine learning are checked for bias and other potential systemic concerns. Recent legislation by the European Union, for example, “requires that organizations be able to explain their algorithmic decisions.” The Institute of Internal Auditors has released a framework for auditing artificial intelligence to discover how and why algorithms arrive at the solutions that they do.
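One simple form such an audit could take is comparing a model's error rates across patient groups. The sketch below is an assumption-laden illustration rather than the Institute of Internal Auditors' framework or any regulator's method: the audit log, the group names, and the choice of false negative rate as the metric are all hypothetical.

```python
# Hypothetical audit log of (group, true_outcome, predicted_outcome),
# where 1 means the condition is present. Values are illustrative only.
audit_log = [
    ("group_a", 1, 1), ("group_a", 1, 1), ("group_a", 1, 0), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 1, 0), ("group_b", 1, 1), ("group_b", 0, 0),
]

def false_negative_rates(log):
    """Share of truly positive cases the model missed, computed per group."""
    missed, positives = {}, {}
    for group, actual, predicted in log:
        if actual == 1:
            positives[group] = positives.get(group, 0) + 1
            if predicted == 0:
                missed[group] = missed.get(group, 0) + 1
    return {g: missed.get(g, 0) / positives[g] for g in positives}

rates = false_negative_rates(audit_log)
gap = max(rates.values()) - min(rates.values())
print(rates)  # group_a misses about 0.33 of positive cases, group_b about 0.67
print(gap)    # about 0.33
```

A large gap between groups does not by itself prove the algorithm is biased, but it flags where an auditor would look more closely at the training data and the decision logic.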

The development of an “Ethics & Algorithms Toolkit” and an “Algorithmic Impact Assessments toolkit” provides government officials and agencies with the necessary information to understand bias and algorithmic risks that could have tangible social impacts. A 2018 blog from Stanford Medicine highlighted four major ethical challenges to the use of AI in healthcare:

The possibility of racial or gender bias baked into an algorithm;
The possibility that doctors will rely too heavily on AI applications that have not properly been audited, and cannot be easily interpreted by practitioners;
The possibility that data collected using machine learning, deep learning, or natural language processing will be integrated into a larger ‘collective’ dataset without express permission of the patients whose information makes up the dataset; and
The introduction of an intermediary (the algorithm) that complicates the chain of responsibility for medical decisions.

These concerns, and others, are now being echoed throughout the medical profession. These considerations will guide HHS as it moves through the process of designing its own AI strategy.

National Policy Context and International AI Strategies

As AI has proliferated both inside and outside of government, the White House Office of Science and Technology Policy (OSTP) has responded by laying the framework for a national AI strategy. In 2016, the National Science and Technology Council published The National Artificial Intelligence Research and Development Strategic Plan. The Plan recommends that the Federal government develop a coordinated approach to maximize the impact of AI technology as it grows in scope. The Plan outlines the overarching principles for R&D foundations and identifies strategies that will spur long-term investment in key AI areas such as medicine, education, and transportation.

In February 2019, President Trump signed an Executive Order on Maintaining American Leadership in Artificial Intelligence. In coordination with this announcement, the OSTP published a fact sheet on AI. The Executive Order outlines five core principles to guide the national AI strategy and calls for an action plan to preserve the U.S. advantage in AI. Moreover, it requires agencies to “enhance access to high-quality and fully traceable Federal data, models, and computing resources to increase the value of such resources for AI R&D, while maintaining safety, security, privacy, and confidentiality protections consistent with applicable laws and policies.” The Executive Order also highlights connections with the President’s Management Agenda and the Cross-Agency Priority Goal: Leveraging Data as a Strategic Asset, emphasizing the need for agencies to “consider methods of improving the quality, usability, and appropriate access to priority data identified by the AI research community.”

Towards a Department-Wide Strategy at HHS

To align itself with national priorities, HHS has also shifted its data gathering priorities and mapped out the initial steps towards a comprehensive AI strategy. Undertaken in April 2017, ReImagine has become the guiding strategic plan for HHS that aligns with the priorities of the OMB and outlines the foundations to build an HHS better equipped to serve the American people. As part of this new strategic platform, HHS intends to leverage the power of data and disruptive technologies to make HHS more innovative and responsive. As part of leveraging data as a strategic asset, HHS has embarked on a strategic path to establish its data governance regime.

The State of Data Sharing at HHS

In September 2018, the Data Initiative Team of HHS completed its first comprehensive report of data sharing across HHS’s eleven operating divisions. By understanding the ecosystem of available data across its divisions, HHS aims to establish an enterprise data management system to harmonize data across its divisions and create a decision-making structure for the management of these data assets. The report identified 27 high-value data assets and their functionality for both HHS and the broader public. The report also identified five core challenges for HHS to address:

Process for Data Access: HHS does not have a consistent data requesting process from one agency to another, causing delays and a lack of accountability.
Technology for Data Access and Analysis: The data collected varies widely across agencies, as does its technical usability for the public.
Regulatory Environment: All data collection efforts have statutes and policies to govern the collection and access to that data, but many limit access and use.
Disclosure Risk Management: The risk of violating individual privacy increases as more granular data is collected and shared, which leads to limits on microdata access.
Norms and Resource Constraints: HHS agencies currently do not see the value of sharing restricted or non-public data and rarely address data discrepancies with other agencies.

International AI Strategies and Principles

Among the international community, Canada and the United Kingdom have led the way in adopting both national and healthcare-focused AI principles. These examples could inform future efforts by HHS to craft an AI strategic plan.

The United Kingdom’s Department of Health and Social Care has adopted a Code of Conduct to guide the adoption of its data and AI strategies.

United Kingdom’s Code of Conduct for Data-Driven Health and Care Technology

Understand users, their needs, and their context
Define the outcome and how the technology will contribute to it
Use data that is in line with appropriate guidelines for the purpose for which it is being used
Be fair, transparent and accountable about what data is being used
Make use of open data standards
Be transparent about the limitations of the data used
Show what type of algorithm is being developed or deployed, the ethical examination of how the data is used, how its performance will be validated and how it will be integrated into health and care provision
Generate evidence of effectiveness for the intended use and value for money
Make security integral to the design
Define the commercial strategy

The Canadian government has also established a set of guiding principles for the responsible use of Artificial Intelligence in government.

Canada’s Responsible Use of Artificial Intelligence

Understand and measure the impact of using AI by developing and sharing tools and approaches
Be transparent about how and when we are using AI, starting with a clear user need and public benefit
Provide meaningful explanations about AI decision making, while also offering opportunities to review results and challenge these decisions
Be as open as we can by sharing source code, training data, and other relevant information, all while protecting personal information, system integration, and national security and defence
Provide sufficient training so that government employees developing and using AI solutions have the responsible design, function, and implementation skills needed to make AI-based public services better

Next Steps

This Roundtable will bring diverse stakeholders together to inform the development of an HHS AI strategic plan. Drawing on the background research and use cases in this briefing paper, and the Roundtable’s interactive breakout sessions, participants will identify high-priority health applications of AI and key issues for an HHS AI strategy to address. The structure of the Roundtable will focus on prioritizing applications of AI in the healthcare sector, utilizing and improving health data for these applications, and outlining objectives for an HHS AI Strategy. The Roundtable will conclude with a presentation of highlights to the full group and HHS leadership. Following the Roundtable, CODE will prepare a public report of findings and recommendations from the work of the day.