How Votte Works
Understand how agents think, observe, and act — and how the perception layer helps LLMs interact with real web pages.
The Votte Architecture
Votte gives structure to the web, allowing language models to reason and act like agents. Instead of working with messy raw HTML, it builds a simplified and readable representation of every webpage so an LLM can understand what's on screen and decide what to do next.
When you run an agent:
Agent:
Session:
Perception Layer:
LLM:
Each time your agent runs, it observes the page, reasons about what to do, and acts — all while logging every decision.
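The observe, reason, act cycle described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not Votte's implementation: `FakePage`, `choose_action`, and `run_agent` are hypothetical names, and the "LLM" here is a hard-coded rule standing in for a real model call.

```python
from dataclasses import dataclass, field


@dataclass
class FakePage:
    """Hypothetical stand-in for a browser session (not a Votte class)."""
    filled: dict = field(default_factory=dict)
    searched: bool = False

    def observe(self) -> list[str]:
        # Return the IDs of the actions still available on the page.
        actions = [a for a in ("I1", "I3") if a not in self.filled]
        if not actions and not self.searched:
            actions.append("B3")
        return actions

    def act(self, action_id: str) -> None:
        # Inputs get filled; anything else is treated as the search button.
        if action_id.startswith("I"):
            self.filled[action_id] = "value"
        else:
            self.searched = True


def choose_action(available: list[str]) -> str:
    # Placeholder for the LLM: always pick the first available action.
    return available[0]


def run_agent(page: FakePage, max_steps: int = 10) -> list[str]:
    log = []
    for _ in range(max_steps):
        available = page.observe()          # observe
        if not available:
            break
        action = choose_action(available)   # reason
        page.act(action)                    # act
        log.append(action)                  # log every decision
    return log


print(run_agent(FakePage()))
```

The loop stops when the page offers no further actions or when `max_steps` is reached, which keeps a misbehaving policy from running forever.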
1. Explore with Perception
You can use Votte's perception system outside of an agent to explore how the page is structured.
Create a file observe.py:
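import asyncio
import votte

async def run():
    # Open a session and observe the target page.
    async with votte.Session() as page:
        obs = await page.observe("https://www.google.com/travel/flights")
        # Print the structured Markdown view of the page.
        print(obs.space.markdown)

# Call run() only after it has been defined.
asyncio.run(run())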
2. Run the Script
Run it:
python observe.py
3. View the Output
Sample output:
# Flight Search
* I1: Enters departure location
* I3: Selects departure date
* B3: Search flights options with current filters
This is the structured representation that your agent uses to plan and act.
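To show how such output could be consumed programmatically, here is a minimal sketch that parses the sample listing above into a mapping from action ID to description. It assumes every action line has the form `* ID: description`, as in the sample. `parse_actions` is a hypothetical helper and not part of Votte, and the real output format may differ.

```python
import re

# The sample output shown above, copied verbatim.
SAMPLE = """# Flight Search
* I1: Enters departure location
* I3: Selects departure date
* B3: Search flights options with current filters"""

# Assumed line shape: "* <letters><digits>: <description>".
ACTION_LINE = re.compile(r"^\*\s+([A-Z]+\d+):\s+(.*)$")


def parse_actions(markdown: str) -> dict[str, str]:
    """Map each action ID to its description; non-matching lines are skipped."""
    actions = {}
    for line in markdown.splitlines():
        match = ACTION_LINE.match(line.strip())
        if match:
            actions[match.group(1)] = match.group(2)
    return actions


for action_id, description in parse_actions(SAMPLE).items():
    print(action_id, "->", description)
```

Keying actions by ID gives a planner a stable handle on each element, so it can refer to an action without repeating the description text.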