An Analysis of the Zartis AI Application Development Experiment

A research-driven exercise to deepen the team’s understanding of what it takes to engineer real business value from LLMs

Executive summary

This whitepaper presents the key learnings from an internal Zartis prototype aimed at building an end-to-end AI application for processing complex M&A Confidential Information Memoranda (CIMs). The initiative was designed not as a commercial product, but as a research-driven exercise to deepen the team’s understanding of what it takes to engineer real business value from LLMs.

The experiment focused on a domain-specific problem: extracting structured, comparable insights from unstandardised, graphically rich CIM documents. To tackle this, Zartis developed an “agentic system” composed of multiple specialised AI agents, each with a defined role and degree of autonomy, that collaborate to produce a standardised analytical report.

Challenges:

This prototype served as a testbed for exploring three of the most persistent challenges in AI development: hallucinations, determinism, and LLM cost effectiveness.

1. Hallucinations: The team explored how to embrace uncertainty while designing robust fallback mechanisms to mitigate the effects of hallucination. This shift reframes hallucinations not as defects to be eradicated but as predictable behaviours to be detected, managed, and leveraged. The team’s approach includes developing early detection techniques based on uncertainty metrics that signal when to activate verification or review loops.
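One common uncertainty metric of this kind is self-consistency: sample the same prompt several times and treat disagreement among the samples as a signal to trigger a review loop. The sketch below is a minimal illustration of that idea, not the experiment’s actual code; the function name and threshold are hypothetical.

```python
from collections import Counter

def needs_verification(samples: list[str], agreement_threshold: float = 0.7) -> bool:
    """Flag an answer for a verification loop when sampled responses disagree.

    Self-consistency check: the same prompt is sampled several times; low
    agreement among the samples is treated as an uncertainty signal.
    """
    if not samples:
        return True  # no evidence at all -> always verify
    top_count = Counter(samples).most_common(1)[0][1]
    agreement = top_count / len(samples)
    return agreement < agreement_threshold

# Unanimous samples -> confident, skip the review loop
print(needs_verification(["€12.4m", "€12.4m", "€12.4m"]))  # False
# Disagreement -> route the answer to a verification/review loop
print(needs_verification(["€12.4m", "€12.1m", "€9.8m"]))   # True
```

In a pipeline, the flagged answers would be routed to a verifier agent or a human reviewer rather than being emitted directly.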

2. Determinism: The team explored whether repeatable outcomes can be engineered in LLM-based systems. This investigation distinguished between system-level determinism (achieved through architecture, task decomposition, and environment control) and model-level determinism, where variance arises not from the model’s internal logic but from the infrastructure used to serve it.

3. Cost-effectiveness: The experiment highlighted that one of the main challenges in building practical AI systems lies in managing the cost of intelligence. Rather than setting “cost monitoring” as an isolated objective, the focus was on designing a system capable of balancing precision, reliability, and efficiency within real operational limits. The team learned that cost-effectiveness must be engineered holistically — through architecture, model selection, and observability — not by cutting corners on reasoning or over-optimising prompts.
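Two common building blocks for engineering cost-effectiveness are per-call cost accounting and routing easy steps to cheaper models. The sketch below illustrates both; the model names and per-token prices are invented placeholders, not real rates or the experiment’s actual routing logic.

```python
# USD per 1K tokens as (input, output) -- hypothetical prices for illustration.
PRICE_PER_1K = {
    "small-model": (0.00015, 0.0006),
    "large-model": (0.0025, 0.01),
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Account for the cost of a single LLM call from its token counts."""
    price_in, price_out = PRICE_PER_1K[model]
    return input_tokens / 1000 * price_in + output_tokens / 1000 * price_out

def pick_model(task_complexity: float) -> str:
    """Route simple extractions to the cheap model; reserve the large model
    for steps that genuinely need deeper reasoning."""
    return "large-model" if task_complexity > 0.5 else "small-model"

print(round(call_cost("small-model", 2000, 500), 6))  # 0.0006
print(pick_model(0.2))                                # small-model
```

Instrumenting every call this way gives the observability needed to balance precision, reliability, and efficiency rather than optimising cost in isolation.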

Experiment: Quality & cost-efficiency in multi-agent AI system development

Get the details of our research-driven exercise to deepen your understanding of what it takes to engineer real business value from LLMs

Download the full paper right now!

Whitepaper by:

Take a look at some of the findings from our AI application development experiment:

The experiment demonstrated that pragmatism is the key to progress. Complex frameworks such as graph-based retrieval systems proved powerful in certain contexts, yet simple keyword searches outperformed them for specific KPI extractions.

Download whitepaper:

Discover more whitepapers

AI POC to Production

Whitepaper

AI Solutions: Moving From POC to Production

This “pilot to production gap” is where countless hours and investments disappear. Discover insights from a panel of industry leaders, who shared their learnings at the 2025 Zartis AI Summit.

Whitepaper

Advanced Techniques for Managing Hallucination and Determinism in LLMs

Only 5% of companies report ROI from genAI, and true control lies in managing the two main challenges: hallucination and non-determinism.
