If Data Is the New Oil, Then AI Serves as the Refinery.
“Data is the new oil.” This phrase is becoming increasingly common, and it was echoed by panelists at the 2025 Bioprocessing Summit Town Hall in Boston. Irene Rombel, CEO of Biocurie; Cenk Undey, global iCMC digital transformation program lead at Sanofi; and Colin Zick, managing partner at Foley Hoag, all reinforced the notion that, in our technology-centric world, data is a highly valuable resource.
Despite this, Rombel noted that process development and manufacturing data are often limited and fragmented, particularly for advanced therapeutic modalities. This limitation makes them poorly suited for traditional artificial intelligence/machine learning (AI/ML) applications. Contrary to common perceptions, she emphasized, “there is not a data goldmine for AI/ML to leverage for manufacturing.”
The analogy of data to oil also holds when discussing AI. Like oil, data is inherently messy and requires significant investment of time and resources to unlock its true potential. Here, AI tools serve as the means to refine this data.
Data Oil Drips
The scarcity of process development and manufacturing data is particularly critical when assessing AI/ML applications. A model trained on a well-curated, or clean, data set can interpolate within that data, but it struggles to predict outcomes beyond it.
As Zick illustrated with an anecdote from a Bloomberg article, one of the primary challenges for self-driving cars in Boston is the city’s seagulls. The standoff between cars and seagulls shows that an AI/ML model cannot accurately predict behavior it has never encountered in its training data.
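The gap between interpolation and extrapolation can be shown with a toy example. The sketch below is not from the panel: it assumes a hypothetical process whose true response is sin(x), fits a cubic polynomial to 15 clean points on the range 0 to 3, and compares the error inside that range with the error on the unseen range 3 to 6. The function name fit_and_score and all parameter values are illustrative assumptions.

```python
import numpy as np


def fit_and_score(degree=3, n_train=15):
    """Fit a polynomial on x in [0, 3]; score it inside and outside that range.

    Hypothetical setup: the "true process" is sin(x), and the model
    never sees any training point with x > 3.
    """
    x_train = np.linspace(0.0, 3.0, n_train)
    y_train = np.sin(x_train)
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit

    def rmse(x):
        # Root-mean-square error of the fitted polynomial against the true response
        return float(np.sqrt(np.mean((np.polyval(coeffs, x) - np.sin(x)) ** 2)))

    x_inside = np.linspace(0.0, 3.0, 200)   # interpolation: within training range
    x_outside = np.linspace(3.0, 6.0, 200)  # extrapolation: beyond training range
    return rmse(x_inside), rmse(x_outside)


if __name__ == "__main__":
    interp_rmse, extrap_rmse = fit_and_score()
    print(f"RMSE inside training range:  {interp_rmse:.4f}")
    print(f"RMSE outside training range: {extrap_rmse:.4f}")
```

In this toy setting the error outside the training range is far larger than the error inside it, even though the training data is clean. Real bioprocess models are more complex, but the same caution about predicting beyond the observed data applies.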
Undey added that a common misconception is that machine learning and deep learning (ML/DL) models need vast amounts of data. The actual requirement, he said, hinges on data quality, and various computational methods have been developed to handle limited data sets effectively.
Consequences of Confusion
The power of AI has generated considerable misunderstanding, which was particularly evident at the conference. Zick commented, “There is significant confusion around AI and manufacturing because many view it as a tool with distinct qualities compared to others they use.” He warned that this perspective leads many to believe that the standard rules for using new tools effectively do not apply, which presents significant risks.
Undey corroborated this idea, stating that it’s a prevalent error to dive straight into AI tools like machine learning algorithms after gathering data, without considering foundational elements such as basic statistics and a clear understanding of the data and challenges faced. He countered the belief that AI is applicable to every scenario, asserting, “Sometimes a more straightforward process can outperform an AI/ML method.”
Rombel raised a particularly alarming topic frequently discussed at the conference: the application of large language models (LLMs) and synthetic data to create “missing data.” Experts like Mark Mackey, CSO of Cresset, have warned about this potential AI application, noting that the industry lacks sufficient biological understanding to validate whether AI possesses the necessary data. The panelists expressed their concerns, with Rombel emphasizing that “not only is this a poor practice from a data modeling perspective, but it can also be hazardous.”
Conversely, some in the industry still question whether digital transformation and AI have led to meaningful improvements in time, quality, and cost in process development, and they seek concrete examples. These discussions, highlighted both in the town hall and throughout the conference, revealed that parts of the industry remain skeptical while others have already embraced the advancements.
During the town hall, Undey was surprised to see the focus on topics he believed had been adequately addressed previously, such as explainable AI and risk-based applications in GMP. While it’s positive to maintain interest and facilitate discussions, he noted, “I anticipated more inquiries focused on AI’s robotics and GenAI potential as they relate to bioprocessing.”
Walk, Don’t Run, Forward
For organizations to remain competitive, both leaders and operators must recognize that digital transformation and AI are indispensable. Undey advised reducing prediction errors by accounting for how AI/ML models learn as they progress and adapt. He believes that it all boils down to risk assessment, the specific context of use, and proper documentation of computational methods.
Zick, as the attorney on the panel, candidly shared his viewpoint. He pointed out, “In numerous situations, individuals apply AI without careful thought or critical scrutiny.” He elaborated that AI serves as a tool for enhancing critical thinking rather than replacing it. He was so passionate about his recommendation for leaders and scientists to review the FDA’s new draft guidance on AI’s role in supporting regulatory decision-making that he brought along a printed copy. His advice: “Read it.”
Final Thoughts
Rombel proposed shifting the mindset from accumulating as much data as possible, which can incur significant costs, to focusing on obtaining the right data. “This is where subject matter expertise and robust statistical analysis become critical.” Undey further emphasized that “the goal isn’t merely to use AI; it’s to apply it in contexts where it makes sense and to pose the right questions to tackle the right problems.”
Zick encapsulated this sentiment perfectly: “AI can enhance everything, but it should never degrade any aspect.” To realize this ambition, Undey remarked, “the industry must concentrate on developing data products that effectively support AI initiatives.” He explained that enhancing these models for decision-making is vital within the product development and biopharma manufacturing continuum, stating, “Our industry has not fully reached this level yet, despite several success stories across various segments of our value chain.”

