Applied AI Summit

Free online conference | October 14-16, 2025

Customized Automated Evaluation for Production LLM Applications

As LLM applications become more specialized, there is a critical need for domain-specific evaluation strategies. However, there has been limited exploration of automated methods that perform nuanced, reference-free evaluation of LLM-generated content in production settings. Although LLM-as-a-judge methods have emerged as an automated evaluation strategy, they are susceptible to reproducibility and bias issues, particularly when relying on closed-source models. This talk introduces an alternative, product-customized evaluation approach that leverages expert-labeled training samples to score LLM outputs. It enables iterative product development and continuous real-time monitoring while supporting safety and ethical compliance with healthcare standards. The approach was developed for an LLM-based food logging and coaching product and helped secure approval from the governance boards of large healthcare systems, whose primary concerns include maintaining consistency and accuracy while serving probabilistically generated responses. It is broadly applicable to other specialized domains and can be adapted for various use cases, making it valuable for anyone developing and deploying LLMs in healthcare.

About the speaker

Aimee Neary

Director, Data Science, Analytics and AI at Lark Health

With over 18 years of experience at the intersection of data, healthcare, and artificial intelligence, Aimee is a proven data and AI leader focused on transforming how organizations use data—not just to understand the past, but to shape the future. Her career spans roles in statistics, clinical research, product innovation, and executive leadership in data science and AI, all driven by one unifying goal: to improve health outcomes through responsible innovation.

Aimee has built a reputation for driving enterprise-wide impact. At Lark, she led the company’s AI program—establishing an internal AI Committee, upskilling over 160 employees, and integrating generative AI tools across functions including engineering, clinical operations, and client delivery. These efforts helped embed AI as a trusted partner in decision-making and delivery.

Previously, at Collective Health and R1, Aimee led AI-driven initiatives that reshaped healthcare economics—generating tens of millions in savings while enhancing quality of care. She brings a holistic view of healthcare challenges and a deep understanding of the diverse data ecosystems needed to solve them.

Known for connecting vision to execution, Aimee aligns cross-functional teams around strategic goals, secures executive buy-in, and empowers teams to build AI solutions that deliver measurable value. She thrives at the intersection of AI, ethics, and business growth—helping organizations see data as a strategic asset for innovation, transformation, and impact.