How Votte Works

Understand how agents think, observe, and act — and how the perception layer helps LLMs interact with real web pages.


The Votte Architecture

Votte gives structure to the web, allowing language models to reason and act like agents. Instead of working with messy raw HTML, it builds a simplified and readable representation of every webpage so an LLM can understand what's on screen and decide what to do next.

When you run an agent:

Agent:

Session:

Perception Layer:

LLM:

Each time your agent runs, it observes the page, reasons about what to do, and acts — all while logging every decision.
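The observe, reason, act cycle described above can be sketched as a small loop. This is a toy illustration only, not Votte's implementation: `FakePage`, `choose`, and `run_agent` are hypothetical names invented for this example, the page is simulated in memory, and the "reasoning" step simply picks the first available action where a real agent would ask an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class FakePage:
    """Hypothetical in-memory stand-in for a web page (not part of Votte)."""

    done: set = field(default_factory=set)

    def observe(self) -> list[str]:
        # Offer the input actions that have not been performed yet;
        # once both inputs are filled, offer only the search button.
        pending = [a for a in ("I1", "I3") if a not in self.done]
        return pending or ["B3"]

    def act(self, action_id: str) -> None:
        # Record that the action was performed.
        self.done.add(action_id)


def choose(actions: list[str]) -> str:
    # Placeholder for the LLM's reasoning step: take the first action.
    return actions[0]


def run_agent(page: FakePage, max_steps: int = 5) -> list[str]:
    log = []
    for _ in range(max_steps):
        actions = page.observe()   # observe
        action = choose(actions)   # reason
        page.act(action)           # act
        log.append(action)         # log every decision
        if action == "B3":         # stop once the search is submitted
            break
    return log


print(run_agent(FakePage()))  # prints ['I1', 'I3', 'B3']
```

The `max_steps` cap is a design choice in this sketch: it keeps the loop bounded even if the stopping condition is never reached.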

Explore with Perception

You can also use Votte's perception system on its own, without an agent, to explore how a page is structured.

Create a file observe.py:

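import asyncio

import votte


async def run():
    # Open a session and observe the page without running an agent
    async with votte.Session() as page:
        obs = await page.observe("https://www.google.com/travel/flights")
        # Print the simplified markdown view of the page's actions
        print(obs.space.markdown)


# Start the event loop only after run() has been defined
asyncio.run(run())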

Run the Script

Run it:

python observe.py

View the Output

Sample output:

# Flight Search

* I1: Enters departure location

* I3: Selects departure date

* B3: Search flights options with current filters

This is the structured representation that your agent uses to plan and act.
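To make the shape of this output concrete, here is a minimal sketch that turns the sample markdown into a lookup table from action ID to description. It works on the sample text only and assumes every action line follows the `* ID: description` pattern shown above. The names `SAMPLE`, `ACTION_LINE`, and `parse_actions` are hypothetical helpers for this example, not part of Votte.

```python
import re

# The sample output from above, copied as plain text.
SAMPLE = """# Flight Search
* I1: Enters departure location
* I3: Selects departure date
* B3: Search flights options with current filters
"""

# Assumed line shape: "* <letters><digits>: <description>"
ACTION_LINE = re.compile(r"^\*\s+([A-Z]+\d+):\s+(.*)$")


def parse_actions(markdown: str) -> dict[str, str]:
    """Map each action ID to its description, skipping non-action lines."""
    actions = {}
    for line in markdown.splitlines():
        match = ACTION_LINE.match(line.strip())
        if match:
            actions[match.group(1)] = match.group(2)
    return actions


actions = parse_actions(SAMPLE)
print(actions["I1"])  # prints: Enters departure location
```

Lines that do not match the pattern, such as the `# Flight Search` heading, are ignored.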