Overcoming technical challenges when scaling Generative AI in the life sciences

Generative AI adoption in life sciences

Nearly two years into the Generative AI hype catalyzed by ChatGPT, we’ve seen people experiment with and deploy AI in many different ways.

What we’ve seen so far is that experimenting with Generative AI is easy, but scaling it is hard, especially in the life sciences. There are a variety of reasons for this on the organizational side, which you can find a great, quick resource on here. On the technical side, the amount of data that life sciences companies need Gen AI tools to process accurately is immense, and naive implementations of LLM-based RAG systems introduce a margin of error that the industry cannot accept.

The life sciences organizations that are furthest along have taken a careful, platform-based approach to building Generative AI applications, designed so that a tool built for one use case can be extended to several others without much additional effort.

However, despite thinking through many of these decisions carefully, organizations are still running into challenges expanding AI adoption.

A lot of this comes down to the fact that the environment the platform was originally built in does not match the environment that the platform needs to scale to.

That problem can be anticipated, but it is also largely unavoidable: most organizations need to build the platform with some use case in mind. That is the only way to evaluate whether the platform is being built in the right direction and to assess how it should continue to be built out.

Many organizations have successfully developed Generative AI tools for one or several use cases, but those tools are often designed very specifically for the workflow and data of that use case. This is largely because of the tradeoff between error rate and generalizability: the more varied the inputs a tool must handle, the harder it is to keep its error rate low. Naturally, given the low margin for error, organizations that successfully deploy Generative AI use cases tend to define the utility of each tool narrowly.

Scaling Generative AI

Turning a use-case-specific Generative AI application into an organization-wide platform takes additional work. This is the point in the AI adoption cycle at which we’re finding many of the most advanced life sciences organizations today.

The heterogeneity of workflows, data types, tools, vendors, and formats can often make source data unwieldy and require non-trivial changes to existing versions of an AI tool. This affects the quality of an organization’s Generative AI tools as it attempts to scale a given tool to more people or to new use cases. It also forces many organizations to consider wide-scale behavior change to standardize processes for AI, which would itself be an error-prone process.

Successfully scaling AI from a technical standpoint involves a few considerations:

Data management and integration. Scaling AI starts here, because poor data ingestion reduces the quality of the databases that Generative AI systems draw on. When the underlying data that Generative AI systems are using are faulty, the outputs are inevitably also faulty. Data ingestion is a uniquely difficult challenge in the life sciences given the variety and volume of files and data. As such, naive approaches to ingestion may suffice for one or two use cases where the ingested data format is known and consistent but become less reliable at scale. Taking the effort to develop AI capabilities around ingestion of life sciences-specific information that can reliably do this task is key to scalability.
Adaptable AI architectures. Different use cases may benefit from different Generative AI models, different database implementations, different retrieval strategies, different prompt engineering, and more. Building in this kind of flexibility from the start is key. It is an especially important consideration for complex, multi-step LLM processes, where changing models, inputs, and outputs can create “drift” that is time-consuming to fix and that introduces inconsistency and unreliability into AI systems, especially at scale.
Robust and efficient evaluation. As organizations adapt AI to new use cases, developing robust methods of evaluating its output will be key. The Generative AI/LLM evaluation industry is still nascent, but certain automated evaluation tools may be sufficient for low-level tasks outsourced to AI, single steps in multi-step AI workflows, or tasks involving AI outputs that can be evaluated less subjectively. These tools can help development teams pinpoint exactly where errors are occurring, and they can be especially useful to teams implementing adaptable AI architectures with many different parameters at play. However, given the mission-critical nature of many of the workflows AI is involved in, manual evaluation may still be the best approach. In this case, making it easy for subject matter experts to evaluate outputs and provide feedback that can be implemented quickly can significantly shorten iteration timelines and improve overall quality.
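
To make the last two considerations more concrete, here is a minimal Python sketch, not a reference implementation. It assumes each use case gets its own configuration (model, retrieval strategy, prompt template) and that a single pipeline step can be checked automatically against a handful of expected answers. Every name in it (PipelineConfig, CONFIGS, EvalCase, evaluate_step, the use case names, and the model identifiers) is hypothetical and chosen only for illustration. The generate function is supplied by the caller and stands in for whatever LLM client an organization actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class PipelineConfig:
    """Per-use-case settings. All field names are illustrative assumptions."""
    model: str
    retrieval_strategy: str
    prompt_template: str


# Hypothetical use cases with placeholder values; swap in real ones.
CONFIGS = {
    "protocol_qa": PipelineConfig(
        model="model-a",
        retrieval_strategy="dense",
        prompt_template=(
            "Answer using only the context.\n"
            "Context: {context}\nQuestion: {question}"
        ),
    ),
    "lab_notebook_search": PipelineConfig(
        model="model-b",
        retrieval_strategy="keyword",
        prompt_template=(
            "Context: {context}\n"
            "List the experiments relevant to: {question}"
        ),
    ),
}


def build_prompt(use_case: str, context: str, question: str) -> str:
    """Fill the use case's prompt template with retrieved context and a question."""
    config = CONFIGS[use_case]
    return config.prompt_template.format(context=context, question=question)


@dataclass(frozen=True)
class EvalCase:
    """One automated check: the output should mention expected_substring."""
    question: str
    context: str
    expected_substring: str


def evaluate_step(
    use_case: str,
    generate: Callable[[str, str], str],
    cases: Sequence[EvalCase],
) -> float:
    """Return the fraction of cases whose output contains the expected text.

    generate(model, prompt) is a caller-supplied stand-in for a real LLM call.
    """
    if not cases:
        raise ValueError("at least one evaluation case is required")
    config = CONFIGS[use_case]
    passed = 0
    for case in cases:
        prompt = build_prompt(use_case, case.context, case.question)
        output = generate(config.model, prompt)
        # Case-insensitive substring match: a deliberately simple check that
        # suits low-level steps, not subjective or mission-critical outputs.
        if case.expected_substring.lower() in output.lower():
            passed += 1
    return passed / len(cases)
```

Because the model, retrieval strategy, and prompt live in configuration rather than in code, adding a use case means adding an entry to CONFIGS, and the same evaluation cases can be re-run after any change to see whether a given step has drifted. A substring check like this is only suitable for the less subjective steps described above; expert review would still cover the rest.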

Conclusion

While the adoption of Generative AI in the life sciences presents significant opportunities, it also comes with considerable challenges, particularly in scaling from specific use cases to organization-wide platforms. The key to overcoming these challenges lies in sound data management and integration, adaptable AI architectures, and robust, efficient evaluation methods. By addressing these technical considerations, life sciences organizations can more effectively harness the power of Generative AI.