Vikash: Building NLP models using clinical data and unstructured notes presents several challenges for us as researchers.
The contextual ambiguity of clinician notes makes things tricky because we must develop NLP algorithms that can correctly recognize and differentiate clinical elements such as medications, diseases, procedures and tests. The algorithms must also determine each element’s timing, whether it is affirmed or negated, and whether it refers to the patient or a family member. These models must handle broader context as well, such as idiomatic expressions, cultural references and domain-specific jargon, which requires advanced algorithms and diverse training data.
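To make that concrete, here’s a minimal, purely illustrative Python sketch of two of those context attributes: negation (“denies hypertension”) and experiencer (patient versus family member). The term list and cue patterns are hypothetical stand-ins; real systems use curated vocabularies (e.g., UMLS) and trained models rather than regex lookups, and would handle timing as well.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-lexicon for illustration only; a real system would use
# curated vocabularies and trained models, not substring lookups.
TERMS = {
    "metformin": "medication",
    "hypertension": "condition",
    "breast cancer": "condition",
    "diabetes": "condition",
}

NEGATION_CUES = re.compile(r"\b(denies|no evidence of|without)\b", re.I)
FAMILY_CUES = re.compile(r"\b(mother|father|sibling|family history)\b", re.I)

@dataclass
class ClinicalMention:
    text: str
    category: str     # "medication" or "condition"
    negated: bool     # explicitly denied or absent?
    experiencer: str  # "patient" or "family"

def extract_mentions(sentence: str) -> list[ClinicalMention]:
    """Naive dictionary matching with sentence-level context cues."""
    lowered = sentence.lower()
    negated = bool(NEGATION_CUES.search(sentence))
    experiencer = "family" if FAMILY_CUES.search(sentence) else "patient"
    return [ClinicalMention(term, cat, negated, experiencer)
            for term, cat in TERMS.items() if term in lowered]

for note in ["Family history of breast cancer.",
             "Patient denies hypertension.",
             "Started metformin for diabetes."]:
    # "breast cancer" is attributed to family, "hypertension" is negated.
    print(extract_mentions(note))
```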
Language complexity is a factor too: clinicians often write hastily in patient records, so notes may deviate from standard English, with irregular grammar, medical jargon, acronyms, abbreviations and languages other than English. We can’t forget that these notes exist for reimbursement purposes and clinical care documentation, not life sciences research.
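As a toy illustration of the shorthand problem, here’s a sketch that expands a few common abbreviations. The table and the token-level approach are deliberate simplifications; in practice many clinical abbreviations are ambiguous (“pt” can mean patient or physical therapy) and need context-aware disambiguation to resolve.

```python
import re

# Hypothetical shorthand table for illustration; real pipelines rely on
# curated resources and context to pick the right expansion.
ABBREVIATIONS = {
    "pt": "patient",
    "c/o": "complains of",
    "sob": "shortness of breath",
    "hx": "history",
    "htn": "hypertension",
}

def expand_shorthand(note: str) -> str:
    """Expand known abbreviations; everything else passes through unchanged."""
    return re.sub(r"[A-Za-z/]+",
                  lambda m: ABBREVIATIONS.get(m.group(0).lower(), m.group(0)),
                  note)

print(expand_shorthand("Pt c/o SOB, hx of HTN"))
# -> patient complains of shortness of breath, history of hypertension
```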
Then there’s the diversity and quality of unstructured data. Notes from EHRs differ greatly in structure and format across individual clinicians, health systems and EHR vendors. There’s no set “standard” for the quality of data contained in clinical notes, and documentation gaps for certain clinical elements are common. The data extracted from notes can also differ significantly from academic and publicly available data sets: real-world data (RWD) tends to be more variable and noisier.
The lack of standardized guidelines for training NLP models on RWD, coupled with the variability in medical nomenclature, poses significant challenges. The same condition may appear as “heart attack,” “MI” or “myocardial infarction,” depending on who wrote the note. NLP models need extensive training to work effectively with clinical data, yet relevant training data is scarce.
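To illustrate the nomenclature problem, here’s a minimal normalization sketch. The synonym table is a toy stand-in; production systems map surface forms to standard vocabularies such as SNOMED CT or RxNorm rather than a hand-built dictionary.

```python
# Toy synonym table for illustration only.
SYNONYM_TO_CONCEPT = {
    "heart attack": "myocardial infarction",
    "mi": "myocardial infarction",
    "high blood pressure": "hypertension",
    "htn": "hypertension",
}

def normalize(term: str) -> str:
    """Map a surface form to a canonical concept name (exact match only)."""
    key = term.lower().strip()
    return SYNONYM_TO_CONCEPT.get(key, key)

for surface in ["Heart attack", "MI", "high blood pressure", "hypertension"]:
    print(f"{surface!r} -> {normalize(surface)!r}")
```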
Scalability can also be a concern because more advanced models, particularly those employing deep learning, are resource-intensive. Their computational requirements limit the scalability and accessibility of advanced NLP models, especially for smaller organizations. Annotating and curating data for NLP is labor-intensive and requires expert knowledge, making the development of robust NLP systems even more costly and complex.
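To see why, consider a rough, purely illustrative memory estimate. The 4x overhead factor for Adam-style training (weights, gradients and two optimizer moments) is a common rule of thumb, not a measurement, and this ignores activations and data entirely.

```python
# Back-of-envelope arithmetic, assuming 32-bit weights and ~4x weight memory
# for Adam-style training; figures are illustrative, not benchmarks.
def training_memory_gb(n_params: float, bytes_per_param: int = 4,
                       overhead_factor: int = 4) -> float:
    return n_params * bytes_per_param * overhead_factor / 1e9

for name, params in [("BERT-base (~110M params)", 110e6),
                     ("BERT-large (~340M params)", 340e6),
                     ("a 7B-parameter model", 7e9)]:
    print(f"{name}: roughly {training_memory_gb(params):.0f} GB of model state")
```

Even before activations and serving infrastructure, those numbers put training larger models out of reach of a single commodity GPU, which is exactly the accessibility gap smaller organizations face.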
Ethical concerns and the risk of bias are also relevant, especially if NLP models are trained on a small, unrepresentative fraction of actual clinical data. These models may underperform in real-world applications and may replicate existing biases from their training data, which can lead to discriminatory effects in areas such as designing health insurance plans and determining treatment outcomes. All of this highlights the importance of developing and using NLP technologies in a compliant, legal and responsible manner.
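One concrete safeguard is to evaluate performance per subgroup rather than only in aggregate. Here’s a minimal sketch of that pattern; the groups, labels and predictions are entirely synthetic and exist only to show why an aggregate score can hide a disparity.

```python
from collections import defaultdict

# Synthetic (group, true_label, predicted_label) records for illustration.
records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 0), ("group_b", 0, 1),
]

correct = defaultdict(int)
total = defaultdict(int)
for group, truth, pred in records:
    total[group] += 1
    correct[group] += int(truth == pred)

for group in total:
    print(f"{group}: accuracy {correct[group] / total[group]:.0%}")
# The pooled accuracy (62%) would hide that group_b fares far worse here.
```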