MedCity Influencers

Uncovering the Hidden Insights in Clinical Trial Documents: Critical Steps to Intelligent Document Processing

In clinical trials, pharmaceutical companies are seeking to optimize operations and improve efficiency by automating and enhancing processes through Artificial Intelligence (AI) and Machine Learning (ML). One area where this can reap tangible benefits across clinical trials is in data processing. A typical clinical trial generates over 13,000 documents in various formats (text, voice, video, apps, and web entries), making data gathering, organization and analysis challenging. It’s here that implementing automated Intelligent Document Processing (IDP) can significantly benefit clinical trials, boosting productivity, speeding up processes, improving accuracy and delivering cost savings. IDP uses AI and ML to process structured and unstructured documents, allowing technology to read and understand content like a human.

This article outlines the steps to implementing IDP in clinical trials’ digital content flow using transformative technologies like digital twinning, AI/ML, natural language processing (NLP) and generative AI agents. These technologies automate that implementation, enabling the rapid and intelligent transformation of thousands of documents into valuable research insights for each clinical trial.

Assessment, planning and challenges

Companies planning to implement an IDP platform should carefully consider their long-term goals, clearly defining objectives, specifying document processing requirements, and identifying areas for improved efficiency and accuracy. It’s essential to recognize the challenges pharmaceutical companies face in their data flow during clinical trials. The manual data population of site folders and electronic trial master files (eTMFs), for example, is time-consuming and poses issues, including limited document security and data privacy, archiving and retrieval difficulties and human error, resulting in up to a 25% failure rate.

The strict regulatory requirements within the healthcare ecosystem mandate that IDP systems ensure patient privacy protection and maintain audit trails. Security is crucial to prevent unauthorized access to sensitive data. An important consideration is that the pharmaceutical industry has been cautious about integrating generative AI into regulated and sensitive data workflows, resulting in slow adoption. Overcoming resistance to change in this conservative industry requires demonstrating value and ensuring the security, privacy, and compliance of AI-driven solutions.

Implementing the plan

An automated IDP solution incorporating technology and data science and integrating AI and generative AI is essential for handling digital content. It can analyze diverse documents (written, video, voice, etc.), often unveiling hidden insights and patterns not easily detected through traditional methods. To reach this stage, however, an ordered approach to implementing IDP in a clinical trial’s digital content flow is necessary, as follows:

  • Quality auto-review: In healthcare, data quality is crucial due to strict regulations. As the saying goes, “trash in, trash out,” meaning AI’s effectiveness relies on its training data. Therefore, the IDP solution must promptly identify incoming content, especially scans and embedded images, verifying that the layout is correct and that the data can be converted into a digital format. If not, the system should flag the information in real time for user review and resubmission, speeding up data collection and quality control processes.
  • Digital twin and classification: Deploying digital twins, continuously updated virtual representations of assets, digitizes all clinical trial content for universal accessibility. This data can train generative AI models to recognize patterns and relationships. Companies can review archived eTMF acquisition content, perform pre-audit checks, and proactively populate eTMF and Electronic Common Technical Document (eCTD) as the trial progresses. Digital twin data lakes enable insights from past trials and create audit trails. They also provide detailed source document insights and classification via ML models, automating content recognition for downstream systems.
  • Auto-translations: The ability to automatically convert content into other languages as needed, through domain-specific regulatory or safety language ontology sets, is an important step in deploying automated IDP. Auto-translation streamlines communications and drives efficiency gains.
  • Sensitive data – Protected Health Information (PHI): Data privacy is paramount within clinical trials, so automating safe processes for sourcing, linking, combining, reusing, and sharing protected data with auditable proof of compliance reinforces trust and security. Deploying privacy analytics enhances the ability to retain sensitive data safely, running redaction capabilities that blank out protected details and provide only relevant content to users.
  • Deeper entity extraction (NLP/NLU): Once content is digitized, it’s vital to be able to recognize the sections within that content and find information within them. NLP and Natural Language Understanding (NLU) enable understanding of text and its meaning. For example, it’s possible to analyze a patient’s scheduled assessments to determine what the trial involves for that patient. That information is shared downstream so that appropriate models can be built to best manage patient burden.
  • Insights and following best actions: In the clinical trial workflow, technology aids in content analysis, generating actionable insights in risk assessment, patient burden, protocol amendments, potential outcomes, and theoretical modeling. AI deployment and generative AI training involve digital twins and NLP, enabling Natural Language Generation. Entity extraction identifies text, another program interprets its meaning, and a third generates responses, insights, and next steps, often within a generative AI agent. Digital twins and NLP provide data understanding, helping generative AI models learn vital patterns and relationships for accurate predictions and creative content generation.
  • Further exploration of AI tools: Companies are increasingly exploring generative AI applications for data mining, template creation, quality control, site communication, and clinical trial operation guides. For instance, generative AI can rapidly identify potential trial participants from medical records, streamlining patient recruitment. It can also monitor patients by analyzing medical data promptly and detecting safety issues, ensuring patient safety and data quality.
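
To make the ordering of the first steps concrete, the following is a minimal Python sketch of a quality gate followed by PHI redaction. It is an illustration under stated assumptions, not a description of any vendor's IDP product: the `Document` class, the `quality_review`, `redact_phi` and `process` functions, and the two regular-expression patterns are all hypothetical, and a real deployment would rely on validated privacy tooling rather than simple patterns.

```python
import re
from dataclasses import dataclass, field

# Hypothetical, deliberately simple PHI patterns (assumption for illustration only).
PHI_PATTERNS = {
    "date_of_birth": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


@dataclass
class Document:
    """Hypothetical stand-in for an incoming trial document."""
    doc_id: str
    text: str
    has_signature: bool = True
    flags: list = field(default_factory=list)


def quality_review(doc):
    """Record quality flags; return True only if no issue was found."""
    if not doc.text.strip():
        doc.flags.append("unreadable_or_empty")
    if not doc.has_signature:
        doc.flags.append("missing_signature")
    return not doc.flags


def redact_phi(text):
    """Replace each matched PHI pattern with a labeled placeholder."""
    for label, pattern in PHI_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text


def process(doc):
    """Flag documents that fail review; otherwise return redacted text."""
    if not quality_review(doc):
        return {"doc_id": doc.doc_id, "status": "flagged", "flags": doc.flags}
    return {"doc_id": doc.doc_id, "status": "accepted", "text": redact_phi(doc.text)}
```

In this sketch a document that fails review is returned to the submitter with its flags before any downstream step runs, which mirrors the point that quality control comes first; classification, translation and entity extraction would follow the redaction step.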

The benefits of automating IDP

Automation is increasingly preferred to address legacy IDP challenges, speeding up operations, enabling continuous processing, improving accuracy, enhancing collaboration, and ensuring regulatory compliance.

Content from trial sites is fed into the eTMF for final trial file storage, so adopting an API-enabled SaaS solution for automating IDP and embedding it in site folders enables immediate quality review for early issue resolution. For example, automatic quality checks (for layout, missing signatures or scan issues) and auto-built smart contracts support faster turnaround on site startup, which is a key metric in clinical trials and often a big headache.

Automated eTMF systems offer document version control, audit trails, notifications, remote accessibility, and advanced search capabilities, addressing manual processing issues. IDP employs AI/ML to eliminate manual eTMF entry while maintaining quality. These solutions index documents, automate workflows, aid translations, reduce processing time, and free up employees for value-added tasks. They handle scans and images in any language, extracting metadata and creating digital twins for better recognition.

In addition, when versioning occurs through trial protocol amendments (typically five in any given trial), there is clear insight into whether each site has seen the update, acknowledged it, and recognized where the change is and how it affects them. This is key for optimizing communications with sites and proving they have understood the protocol change. Protocol issues often arise with changes, and audits ask for a clear line of sight showing that sites have reviewed and acknowledged them correctly.
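
As a rough illustration of that line of sight, the sketch below tracks which sites have viewed and acknowledged a given protocol amendment. The `AmendmentTracker` class and its method names are hypothetical and are not drawn from any particular eTMF system.

```python
from dataclasses import dataclass, field


@dataclass
class AmendmentTracker:
    """Hypothetical per-amendment record of site views and acknowledgements."""
    version: str
    sites: set
    viewed: set = field(default_factory=set)
    acknowledged: set = field(default_factory=set)

    def record_view(self, site):
        if site not in self.sites:
            raise ValueError(f"unknown site: {site}")
        self.viewed.add(site)

    def record_acknowledgement(self, site):
        # A site can only acknowledge an amendment it has actually opened.
        if site not in self.viewed:
            raise ValueError(f"site has not viewed amendment: {site}")
        self.acknowledged.add(site)

    def outstanding(self):
        """Sites that have not yet acknowledged, sorted for stable reporting."""
        return sorted(self.sites - self.acknowledged)
```

Calling `outstanding()` before an audit gives a simple list of sites that still need follow-up for that amendment version.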

Technology’s future role 

As pharmaceutical companies seek to automate and implement innovative solutions to streamline clinical trial processes, IDP’s transformative technologies – from digital twinning to AI/ML to NLP – automate the implementation steps outlined above to drive greater efficiencies and transformation for more significant insights into research. Significantly, more pharmaceutical companies are bringing all of their content into digital form, allowing the industry to more fully embrace AI through safe approaches such as IDP.

In the next couple of years, expect pharmaceutical companies to set the foundation for their long-term journey with AI, which will play out not in open-source tools but in secure internal domains. Companies are engaging with generative AI vendors to bring capabilities in-house, reaping the benefits without putting quality and sensitive data at risk. Adopting “mini” versions internally allows domain-specific, regulatory, safety, and operational information to remain in-house.

Drug development’s journey with AI is already underway and will only continue to accelerate. While this technology matures, the industry is prudent in taking a careful approach via stepping stones such as IDP.

Image Source: metamorworks, Getty Images, image number: 1054930874

Gary Shorter is the head of Artificial Intelligence and Data Science at IQVIA. Gary pursues the use of emerging technology to provide new and more efficient capabilities to enhance clinical trial management. This includes development of new design software through to more recent advancements with AI/ML capabilities where his team has developed several micro-products and micro-services that can be plugged in and used by any SaaS solution.