The tenor of commercial conversations shifted markedly as leaders at the Veeva Commercial Summit Europe described how artificial intelligence had moved from a curiosity to an operational backbone, placing agentic AI within reach but not yet in the driver’s seat. The emphasis was squarely on sales, medical, and marketing operations—where approvals, content cycles, and field actions demand both speed and scrutiny—rather than on research or clinical development. Across sponsors and platform providers, the consensus was clear: meaningful results come from embedding AI directly into daily workflows, anchoring decisions in trustworthy data, and governing autonomy with clear, auditable guardrails. That frame clarified not only what to build but in what order to build it—augment human work first, earn trust with predictable outputs, then delegate tasks to agents as controls and confidence mature.
From Pilots to Embedded AI
Phase Change to Embedded Workflows
Executives portrayed a decisive phase change: bolting AI onto legacy tools yielded early wins, but scale required embedding intelligence deep within the tasks people perform every day. Rather than spinning up yet another assistant, Veeva’s Philipp Luik urged commercial teams to target “vertical percent” productivity—gains that compound within specific processes like content review or field planning. The path runs through systems where users already live: CRM, medical information, and content platforms. When suggestions, next best actions, and compliance checks appear in the natural flow, adoption rises and friction recedes. Moreover, embedded placement creates cleaner feedback loops; models learn from real usage signals, and leaders measure outcomes without stitching together scattered tools. The result is not a novelty layer, but a process-native capability that standardizes quality, speeds decisions, and aligns behavior with policy.
Augmented First, Agentic Next
Agentic AI drew broad interest for its promise to offload repetitive orchestration—compiling references, reconciling claims, routing approvals—yet seasoned leaders advocated a paced rollout that matches autonomy to evidence. Bayer’s Stefan Schmidt captured the moment: treat agentic capability as an endpoint, not a starting point. Begin with augmented assistance, codify guardrails, harden audit trails, and stage accountability so reviewers remain in control where regulations expect a named sign-off. This progression not only reduces operational risk; it builds durable trust by proving reliability step by step. Early augmented use cases—drafting materials from approved sources, summarizing interactions, harmonizing data across systems—generate measurable value and teach organizations how to govern AI at scale. Only when outputs are consistently accurate, predictable, and explainable do autonomy thresholds move, unlocking agents to execute end-to-end tasks under policy.
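To make that progression concrete, the sketch below shows one way an autonomy gate might work: agent proposals run unattended only for tasks that policy has explicitly delegated, only above an evidence bar, and everything else escalates to a human reviewer. No speaker described an implementation, so the task names, thresholds, and fields here are hypothetical.
```python
from dataclasses import dataclass

@dataclass
class AgentProposal:
    task: str            # hypothetical task identifier, e.g. "compile_references"
    confidence: float    # model-reported confidence, 0.0 to 1.0
    sources_cited: bool  # is every claim traceable to an approved source?

# Assumed policy: which tasks may run autonomously, and at what confidence.
AUTONOMY_POLICY = {
    "compile_references": 0.95,     # low-risk orchestration, delegated first
    "summarize_interaction": 0.90,
}

def route(proposal: AgentProposal) -> str:
    """Return 'auto_execute' only when policy allows; otherwise keep a human in the loop."""
    threshold = AUTONOMY_POLICY.get(proposal.task)
    if threshold is None or not proposal.sources_cited:
        return "human_review"        # undelegated tasks and unsourced output always escalate
    if proposal.confidence < threshold:
        return "human_review"        # below the evidence bar, augment rather than delegate
    return "auto_execute"

print(route(AgentProposal("compile_references", 0.97, True)))  # auto_execute
print(route(AgentProposal("draft_claim", 0.99, True)))         # human_review: not yet delegated
```
Raising a task's threshold, or removing it from the policy entirely, then becomes a governance decision rather than a code change buried inside a workflow.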
Data and Trust as Foundations
The Five Pillars of Great Data
Data quality emerged as the non-negotiable substrate for any of this to work. Veeva’s David Medina Tato articulated five pillars—quality, accuracy, completeness, reliability, and compliance—that translate directly into model performance and reviewer confidence. If a field is missing, a source is stale, or a rule is ambiguous, AI will amplify the weakness rather than fix it. Leaders described a data strategy that treats commercial data as strategic infrastructure: aligned taxonomies, lineage tracking, version control for reference sets, and controls that match life sciences privacy and promotional standards. Strong stewardship also future-proofs investments as models evolve, since consistent semantics and clean records allow safe upgrades or swaps without re-litigating every dependency. The upshot is practical: better data reduces rework, lowers rejection rates, strengthens auditability, and makes it far easier to justify automation in regulated workflows.
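Three of the five pillars lend themselves to automated checks. As an illustration only, the sketch below validates a hypothetical reference record for completeness, compliance, and reliability; accuracy and quality still depend on upstream stewardship and human review. Every field name and rule here is an assumption, not a description of any speaker's system.
```python
from datetime import date, timedelta

# Hypothetical reference record; field names are illustrative only.
record = {
    "claim_id": "CLM-0042",
    "source_doc": "approved-label-v3.pdf",
    "last_verified": date.today() - timedelta(days=30),
    "market": "DE",
}

REQUIRED_FIELDS = {"claim_id", "source_doc", "last_verified", "market"}  # completeness
APPROVED_SOURCES = {"approved-label-v3.pdf"}                             # compliance
MAX_SOURCE_AGE = timedelta(days=365)                                     # reliability: no stale sources

def pillar_check(rec: dict) -> list:
    """Flag violations of the automatable pillars; AI downstream amplifies whatever slips through."""
    issues = []
    missing = REQUIRED_FIELDS - rec.keys()
    if missing:
        issues.append(f"incomplete: missing {sorted(missing)}")
    if rec.get("source_doc") not in APPROVED_SOURCES:
        issues.append("non-compliant: source not on the approved list")
    if date.today() - rec.get("last_verified", date.min) > MAX_SOURCE_AGE:
        issues.append("unreliable: source verification is stale")
    return issues

print(pillar_check(record) or "record passes automated pillar checks")
```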
Trust in Regulated Workflows
Trust carried the room because reviewers’ names appear on materials submitted to health authorities, and that signature implies personal accountability. Chris Moore and Moderna’s Jason Benagh emphasized that comfort with AI outputs grows when the chain of evidence is explicit—sources cited, changes tracked, and decisions documented. Predictability mattered as much as accuracy; teams want the same inputs to produce the same results under the same policies, with exceptions handled visibly rather than quietly. When workflows are transparent and roles are clear, reviewers can accept AI-assisted content with confidence, even while keeping a human-in-the-loop checkpoint. Moreover, well-governed data access—who can see what, when, and why—prevents leakage and protects sensitive information. In this frame, trust is not a soft concept but an operational outcome: repeatable processes, auditable artifacts, and accountability that travels with content from draft to approval.
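None of the speakers detailed a mechanism, but one common way to make the chain of evidence explicit is an append-only log in which every entry records actor, action, and cited sources, and each entry is hashed against its predecessor so tampering is detectable. The sketch below illustrates that general pattern with hypothetical names throughout.
```python
import hashlib
import json
from datetime import datetime, timezone

def audit_entry(prev_hash: str, actor: str, action: str, sources: list) -> dict:
    """One tamper-evident step in the chain of evidence: who did what, based on which sources."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "actor": actor,      # named reviewer or AI assistant
        "action": action,    # e.g. "suggested_edit", "approved"
        "sources": sources,  # citations travel with the content
        "prev": prev_hash,   # links each step to the one before it
    }
    entry["hash"] = hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()
    return entry

draft = audit_entry("genesis", "ai-assistant", "suggested_edit", ["approved-label-v3.pdf"])
approval = audit_entry(draft["hash"], "j.reviewer", "approved", draft["sources"])
print(approval["prev"] == draft["hash"])  # True: accountability travels from draft to approval
```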
Adoption and Change Management
Avoiding Common Failure Modes
John Oxley’s cautionary notes resonated because many organizations share the scars: pilots launched without a clear problem statement, dazzling models that never reached daily use, and expectations set so high that even strong results felt underwhelming. The fix begins with scope discipline, framed by measurable outcomes—a shorter approval cycle, a higher acceptance rate, a precise reduction in administrative minutes per action. Complexity needs to be contained; models must serve the workflow, not the other way around. This often means choosing “good enough and reliable” over “clever but brittle,” and designing prompts, guardrails, and UI nudges so users stay in flow. Leaders also flagged the importance of sunset criteria for pilots and a path to production that includes policy reviews, security checks, and change management. Success looks like adoption at scale, not a proof of concept that never leaves the lab.
Training and Role-Specific Enablement
Education proved to be a force multiplier rather than an afterthought. Jason Benagh described Moderna’s internal AI academy, a hands-on program that shows employees how to use tools responsibly and effectively within day-to-day tasks. Karl Goossens underscored a practical point: adoption rises when benefits are described in the language of each role—medical reviewers want fewer cycles and cleaner references; marketers want faster localization and claim checks; field teams want smarter call prep and compliant follow-ups. Training that pairs guidance with guardrails—what to do, what not to do, what gets logged—reduces anxiety and narrows the gap between policy and practice. It also seeds a feedback culture; as users learn to flag edge cases and share effective patterns, models and prompts improve. This shared literacy creates alignment across functions and makes scaling far smoother than tool-first rollouts.
Operating Model and Architecture
End-to-End Integration and Oversight
Architecture choices mattered as much as model choices. Leaders outlined an operating model that spans intake, processing, decisioning, oversight, and documentation—integrated across CRM, content systems, and field platforms so users never need to leave their flow. Human-in-the-loop checkpoints align with regulatory expectations, while policies and audit trails define who can approve what, under which conditions, and with what evidence attached. This design reduces context switching, shortens cycle times, and provides a single system of record, which is crucial when questions arise post-approval. Moreover, end-to-end integration allows orchestration: the same claim moves from drafting to MLR review to localization with traceable lineage, and the AI that proposed it can cite its sources. In this model, governance is not a gate at the end but a thread throughout, making compliance and productivity complementary rather than competing goals.
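As a rough illustration of governance as a thread rather than a gate, the sketch below encodes a content lifecycle as explicit transitions, each gated by role and logged as it happens. The stage and role names are assumptions made for the example, not Veeva's actual data model.
```python
# Hypothetical lifecycle; stage and role names are illustrative only.
TRANSITIONS = {
    ("draft", "mlr_review"): {"author", "ai_assistant"},  # who may advance content at each step
    ("mlr_review", "approved"): {"medical_reviewer"},     # the named sign-off stays human
    ("approved", "localized"): {"local_affiliate"},
}

def advance(state: str, target: str, actor_role: str, log: list) -> str:
    """Move content one stage forward, enforcing role policy and leaving an audit record."""
    if actor_role not in TRANSITIONS.get((state, target), set()):
        raise PermissionError(f"{actor_role} cannot move content {state} -> {target}")
    log.append({"from": state, "to": target, "by": actor_role})
    return target

audit_log = []
state = advance("draft", "mlr_review", "ai_assistant", audit_log)
state = advance("mlr_review", "approved", "medical_reviewer", audit_log)
print(state, audit_log)  # 'approved', with the full lineage attached
```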
Future-Proofing and Measurable Value
Executives sought architectures that could adapt as models evolved, scale across markets, and demonstrate value in hard numbers. Systems needed to swap models safely, update prompts centrally, and retain continuity of audit logs without costly rework. Leaders pointed to tangible metrics—faster approvals, cleaner data, higher-quality engagements, reduced manual effort—as the scoreboard for “vertical percent” gains that compound within core processes. They also called for clear de-escalation paths when agents encounter ambiguity, preserving trust while revealing where policy refinement is needed. In the end, the summit’s message was pragmatic and forward-leaning: organizations would stage autonomy behind solid data foundations, transparent workflows, and role-based enablement, then progressively grant agents more responsibility where performance was proven. With those elements in place, AI could drive sustainable productivity improvements and set a credible path to broader agentic execution.
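One way to read the future-proofing requirement in code: every model sits behind the same narrow interface, prompts live in a single versioned store, and low-confidence outputs de-escalate to a human rather than failing silently. The sketch below illustrates the idea; the interface, prompt store, and threshold are all assumed, not drawn from any speaker's architecture.
```python
from typing import Protocol

class Model(Protocol):
    """Narrow seam: any vendor model that returns (text, confidence) fits behind it."""
    def generate(self, prompt: str) -> tuple[str, float]: ...

class StubModel:
    """Stand-in for a vendor model; swapping it leaves prompts and audit logs untouched."""
    def generate(self, prompt: str) -> tuple[str, float]:
        return f"[draft for: {prompt}]", 0.7

# Centrally versioned prompts: updated once, applied everywhere.
PROMPTS = {"call_summary": "Summarize this HCP interaction using approved terminology."}

def run_task(model: Model, task: str, min_confidence: float = 0.8) -> str:
    """Delegate when confident; de-escalate to a human when the model is unsure."""
    text, confidence = model.generate(PROMPTS[task])
    if confidence < min_confidence:
        return f"de-escalated to human review: {text}"  # ambiguity surfaces visibly
    return text

print(run_task(StubModel(), "call_summary"))  # de-escalates at confidence 0.7
```
Swapping vendors then touches one class rather than every workflow, and the de-escalation message doubles as a signal of where policy still needs refinement.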
