Yutori: Large-scale, Durable Agentic AI on DBOS

Yutori.ai has launched an autonomous AI system hosted on DBOS, chosen for its proven ability to deliver durable AI workflow execution at scale.

Customer
Yutori
Industry
AI Software
"We've been impressed by how lightweight and flexible DBOS is, the speed at which their team ships, and the level of support offered. We are excited to scale with DBOS." - Abhishek Das, Co-founder and Co-CEO, Yutori

About Yutori

Yutori.ai is building an AI-powered "chief of staff" to handle everyday tasks on the web for people. Their first product is "Scouts," which monitors the web based on user interests. 

Yutori.ai Scouts are always-on, autonomous AI agents that monitor the web for anything you care about. Simply tell Scouts what you're looking to track (e.g., flights to Tokyo under $900 in August), and a team of agents will be deployed to monitor the web and notify you when there's a relevant update.

Yutori has built their stack to be agent-first, from training their own models to generative product interfaces. For durable execution, hosting, and auto-scaling of their agentic AI system, they chose to build and run on DBOS.

Yutori’s Challenges

Yutori wanted to build an AI agent that works at scale, and making Scouts reliable at scale is challenging for several reasons:

  • Workflows are dynamically generated by the agent based on the results of LLM calls, which aren't known ahead of time and can change with each execution, since LLMs aren't deterministic.
  • Those LLM calls are themselves unreliable, sometimes returning nonsensical results or no results at all.
  • Each Scout runs complex workflows with many subtasks scheduled across multiple processes.
  • Once all of that work is done, the user must be notified exactly once: failing to notify them at all would be a product failure, and notifying them more than once would annoy the user and look unprofessional.

Additionally, Scouts do their work on a schedule, requiring large amounts of compute every hour and much less in between. Rapidly auto-scaling from very few VMs to very many and back down again was important, as was executing Scout tasks reliably and without interruption. Their vision is to “have a very flexible agent framework that can pause and resume at any time”.

DBOS addresses all of these challenges, as you will see below.

Why Yutori Chose DBOS

Yutori initially considered using a separate, dedicated workflow orchestration service (e.g., Temporal, or AWS Step Functions) to execute Scout workflows, but that approach increased cost and architectural complexity, especially at scale. They sought a solution that introduced less overhead into the system and worked more seamlessly with their backend code.  

With DBOS, they could simply install the library and annotate their agents’ workflows and steps. There was no rewriting or rearchitecting of existing code required, and no need to define workflows ahead of time, which would be impossible for them since their workflows change based on LLM responses.
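
To illustrate, here is a minimal sketch of what an annotated agentic workflow can look like with DBOS. It is not Yutori’s actual code: the planner, subtask, and notification functions are hypothetical placeholders, and DBOS is assumed to be configured and launched elsewhere as described in the DBOS documentation.

```python
# A minimal sketch (not Yutori's actual code) of annotating an agentic
# workflow with DBOS. The planner, subtask, and notification functions are
# hypothetical placeholders.
from dbos import DBOS

@DBOS.step()
def plan_scout_tasks(scout_prompt: str) -> list[str]:
    # In a real Scout this would call an LLM planner. The returned plan is
    # checkpointed, so it isn't regenerated if the workflow is recovered.
    return [f"search: {scout_prompt}", f"summarize: {scout_prompt}"]

@DBOS.step()
def run_subtask(task: str) -> str:
    # One web search or LLM interaction; its result is also checkpointed.
    return f"result of {task}"

@DBOS.step()
def notify_user(results: list[str]) -> None:
    # Because the workflow resumes from its last completed step, the user
    # is notified exactly once even if the process crashes mid-run.
    print(f"Notifying user with {len(results)} results")

@DBOS.workflow()
def scout_workflow(scout_prompt: str) -> None:
    # Control flow is ordinary Python: which steps run is decided at
    # runtime from the (non-deterministic) LLM-generated plan.
    tasks = plan_scout_tasks(scout_prompt)
    results = [run_subtask(task) for task in tasks]
    notify_user(results)
```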

DBOS has only a single dependency: Postgres. Yutori was already using Postgres and is familiar with operating it, so this was an easy choice. It also means that no external server is required to manage workflows, a key factor in their decision, because their reliability is not tied to another server or service.

As a startup launching an end-user AI application to the public, Yutori needed to get to market quickly and to scale immediately as usage grew. They also required durable execution to reliably automate tasks on behalf of their users. As a result, Yutori’s backend required:

  • Workflow orchestration - to add, change, and delete Scouts; execute scheduled web searches and LLM interactions; and automatically handle agentic workflow failures caused by infrastructure issues, LLM timeouts, rate limiting, and so on.
  • Durable queueing - to reliably control the concurrency of searches and LLM interactions (see the sketch after this list)
  • Observability - OpenTelemetry (OTel) traces and logs generated by the Scouts and integrated into Honeycomb for troubleshooting and analyzing searches and results
  • Scalability - auto-scaling as users and Scouts are added to or removed from the system

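The sketch below (referenced from the durable queueing bullet above) shows roughly how queueing and scheduled execution fit together in DBOS. The queue name, concurrency limit, cron schedule, and Scout IDs are illustrative assumptions, not Yutori’s actual configuration.

```python
# A hedged sketch of durable queueing and scheduled execution with DBOS.
# The queue name, concurrency limit, cron schedule, and Scout IDs are
# illustrative assumptions, not Yutori's actual configuration.
from datetime import datetime
from dbos import DBOS, Queue

# Durably limit how many searches / LLM interactions run at once.
scout_queue = Queue("scout_queue", concurrency=10)

@DBOS.workflow()
def run_scout(scout_id: str) -> str:
    # A Scout's searches and LLM calls would run as durable steps here.
    return f"scout {scout_id} complete"

@DBOS.scheduled("0 * * * *")  # every hour, on the hour
@DBOS.workflow()
def hourly_scout_run(scheduled_time: datetime, actual_time: datetime) -> None:
    # Enqueue each active Scout. Enqueued workflows are checkpointed in
    # Postgres, so none are lost if the process restarts between runs.
    for scout_id in ["flights-to-tokyo", "gpu-price-watch"]:  # illustrative
        scout_queue.enqueue(run_scout, scout_id)
```
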
DBOS workflows proved a powerful abstraction for Yutori. Using nothing but standard Python functions, DBOS provides dynamic control logic and observability into what Scouts are doing and the execution paths they take. It lets the team walk through a workflow’s checkpointed history to see why something failed and whether it properly recovered. Durable workflows can also automatically recover from failures using checkpointed state, making it easier to handle flaky APIs, slow LLMs, outages, or anything else that may stop or slow down an agentic workflow.
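
As one illustration of that recovery behavior, a flaky LLM call can be wrapped in a step with automatic retries. This is a hedged sketch assuming the retry options of the DBOS step decorator, not Yutori’s code.

```python
# A minimal sketch of automatic step retries, assuming the retry options
# exposed by the DBOS step decorator; the body is a placeholder for a real
# LLM or web API call.
from dbos import DBOS

@DBOS.step(retries_allowed=True, max_attempts=5, backoff_rate=2.0)
def call_flaky_llm(prompt: str) -> str:
    # If this step raises (timeout, rate limit, rejected response), DBOS
    # retries it with exponential backoff. Once it succeeds, the result is
    # checkpointed and never recomputed, even if the workflow is later
    # recovered on another machine.
    return f"llm response for: {prompt}"  # placeholder
```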

Results

Searching for simpler alternatives led Yutori to DBOS. The unique library-based approach of open-source DBOS Transact gave them the durability, queueing, and observability they required, without having to host and run a separate orchestration server and without having to implement workflow logic separately from the rest of their backend code.

In addition, hosting and running their backend code on DBOS Cloud met their performance requirements. DBOS Cloud’s serverless auto-scaling provided a simpler alternative to deploying on AWS Lambda, with much better price-performance, since DBOS Cloud only charges when you’re doing work, not when your agent is idle (or waiting for an LLM).

The switch to DBOS was very successful for Yutori. It simplified their architecture and put them on a fast path to launch, which happened within weeks of adopting DBOS.

If you’d like to launch your own Yutori Scouts, visit Yutori.ai.
