Overview
The HyperAgent is PassAgent’s most autonomous browser automation layer. Built on top of the
CognitionFirstAgent architecture, it implements a five-phase cognitive loop (Observe, Plan,
Ground, Act, Reflect) that lets it navigate arbitrary websites without pre-written scripts or
playbooks.
Unlike the playbook system (which compiles fixed steps) or the universal agent (which tries
predefined CSS selectors), the HyperAgent dynamically classifies each page state and decides
its next action using LLM reasoning, DOM analysis, and accessibility tree inspection.
The HyperAgent is used as a fallback when playbooks and the universal agent cannot complete
a reset flow. It is also the engine behind session-sharing login tasks.
Adaptive Flow Loop
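The five-phase loop can be sketched as follows. This is a minimal illustration rather than the actual implementation: the function name run_loop, the is_done check, and the callable signatures are assumptions made for the example. The 50-step default mirrors the reset orchestrator budget listed under Integration Points, and the three-entry history window mirrors the Plan phase.

```python
from typing import Any, Callable, List, Tuple


def run_loop(
    goal: str,
    observe: Callable[[], Any],
    plan: Callable[[str, Any, List[Any]], str],
    ground: Callable[[str, Any], Any],
    act: Callable[[Any], Any],
    reflect: Callable[[str, Any, Any, Any], Any],
    is_done: Callable[[Any], bool],  # hypothetical completion check
    max_steps: int = 50,
) -> Tuple[bool, List[Any]]:
    """Run Observe -> Plan -> Ground -> Act -> Reflect until done or out of steps."""
    history: List[Any] = []
    for _ in range(max_steps):
        observation = observe()                       # Observe: snapshot the page
        if is_done(observation):
            return True, history
        # Plan: only the last three history entries are passed to the planner.
        plan_text = plan(goal, observation, history[-3:])
        action = ground(plan_text, observation)       # Ground: pick a concrete element
        outcome = act(action)                         # Act: execute against the browser
        history.append(reflect(goal, observation, action, outcome))  # Reflect
    return False, history
```

Passing the phases in as callables keeps the sketch independent of any particular browser backend or LLM client.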
Phase Details
Observe
The agent captures a comprehensive snapshot of the current page state through a single BQL
(BrowserQL) mutation that retrieves four data sources simultaneously:
- DOM snapshot: full HTML, page title, and current URL
- Accessibility tree: all elements with ARIA roles, names, and states
- Screenshot: full-page PNG for vision-based analysis
- Interactive elements: all buttons, inputs, links, and [role=button] elements with their visibility, clickability, and typeability status
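One way to picture the snapshot is as a single record holding the four data sources. The sketch below is illustrative only: the class names, field names, and the SHA-256 digest over URL and HTML are assumptions, since the source does not show the real data model or say how the observation hash recorded in the Reflect phase is computed.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class InteractiveElement:
    tag: str
    text: str = ""
    visible: bool = True
    clickable: bool = False
    typeable: bool = False


@dataclass
class Observation:
    # DOM snapshot
    url: str
    title: str
    html: str
    # Accessibility tree nodes (role, name, state); shape is assumed
    accessibility_tree: List[dict] = field(default_factory=list)
    # Full-page screenshot as PNG bytes
    screenshot_png: bytes = b""
    # Buttons, inputs, links, and [role=button] elements
    elements: List[InteractiveElement] = field(default_factory=list)

    def actionable(self) -> List[InteractiveElement]:
        """Elements that are visible and can be clicked or typed into."""
        return [e for e in self.elements if e.visible and (e.clickable or e.typeable)]

    def digest(self) -> str:
        """Assumed hashing scheme: SHA-256 over the URL and the HTML."""
        h = hashlib.sha256()
        h.update(self.url.encode("utf-8"))
        h.update(b"\0")
        h.update(self.html.encode("utf-8"))
        return h.hexdigest()
```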
Plan
The LLM receives the current goal, page state, interactive elements, and the last three
execution history entries. It produces a natural-language plan describing which element to
interact with and why.
The system prompt enumerates the available actions: CLICK, TYPE, PRESS, SCROLL, WAIT_FOR, and NAVIGATE.
Ground
The plan text is matched against the observed affordances using keyword-based ranking.
Each affordance’s score is boosted by +0.2 for every plan keyword that appears in its text
or placeholder content. The highest-scoring affordance is selected.
Locator strategies are chosen in preference order:
- Role: uses by: "role" if the element has an ARIA role
- CSS ID: uses by: "css", value: "#id" if an ID is present
- Full selector: constructed from tag, ID, and class names
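The ranking and locator rules above can be sketched in a few lines. The +0.2 boost per matching keyword and the preference order come from the description; everything else is an assumption made for illustration, including the Affordance class, the tokenizer (lowercased words longer than two characters), substring matching, the base score of zero, and the use of the role name as the role locator's value.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Affordance:  # hypothetical shape of an observed interactive element
    tag: str
    text: str = ""
    placeholder: str = ""
    role: Optional[str] = None
    element_id: Optional[str] = None
    classes: List[str] = field(default_factory=list)


def plan_keywords(plan: str) -> Set[str]:
    """Assumed tokenizer: lowercase alphanumeric words longer than two characters."""
    return {w for w in re.findall(r"[a-z0-9]+", plan.lower()) if len(w) > 2}


def score(aff: Affordance, keywords: Set[str]) -> float:
    """Add 0.2 for every plan keyword found in the text or placeholder."""
    haystack = f"{aff.text} {aff.placeholder}".lower()
    return 0.2 * sum(1 for kw in keywords if kw in haystack)


def rank(affordances: List[Affordance], plan: str) -> List[Affordance]:
    """Highest-scoring affordance first; ties keep their original order."""
    keywords = plan_keywords(plan)
    return sorted(affordances, key=lambda a: score(a, keywords), reverse=True)


def choose_locator(aff: Affordance) -> dict:
    """Preference order: ARIA role, then CSS ID, then a full selector."""
    if aff.role:
        return {"by": "role", "value": aff.role}
    if aff.element_id:
        return {"by": "css", "value": f"#{aff.element_id}"}
    # No ID at this point, so the full selector is the tag plus class names.
    return {"by": "css", "value": aff.tag + "".join(f".{c}" for c in aff.classes)}
```

Keeping the full ranked list, rather than only the top entry, gives the Act phase a next-best affordance to fall back on.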
Act
The grounded action is compiled to a BQL mutation and sent to Browserless. Each action type
maps to a specific BQL operation:
| Action | BQL Operation |
|---|---|
| CLICK | click(selector) |
| TYPE | type(selector, text) |
| PRESS | press(key) |
| SCROLL | scroll(by: {x, y}) |
| WAIT_FOR | waitFor(selector, timeout) |
| NAVIGATE | goto(url) |
If the BQL request returns GraphQL errors, the result is marked as recoverable and the
agent retries with the next-best affordance.
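As a rough sketch, the mapping and the retry rule might look like the following. The operation names and argument lists are taken from the table; the dictionary output format, the function names, and the send callable are hypothetical, and the exact BQL mutation syntax is deliberately not reproduced here. The only protocol fact relied on is that a GraphQL response reports failures in a top-level errors list.

```python
from typing import Any, Callable, Dict, List, Optional

# Action type -> (operation name, required argument names), per the table.
BQL_OPERATIONS = {
    "CLICK": ("click", ("selector",)),
    "TYPE": ("type", ("selector", "text")),
    "PRESS": ("press", ("key",)),
    "SCROLL": ("scroll", ("by",)),
    "WAIT_FOR": ("waitFor", ("selector", "timeout")),
    "NAVIGATE": ("goto", ("url",)),
}


def compile_action(action_type: str, **params: Any) -> Dict[str, Any]:
    """Build an operation description; a real compiler would emit a BQL mutation."""
    operation, required = BQL_OPERATIONS[action_type]
    missing = [name for name in required if name not in params]
    if missing:
        raise ValueError(f"{action_type} is missing arguments: {missing}")
    return {"operation": operation, "args": {name: params[name] for name in required}}


def is_recoverable(response: Dict[str, Any]) -> bool:
    """A response carrying GraphQL errors is treated as a recoverable failure."""
    return bool(response.get("errors"))


def click_with_fallback(
    selectors: List[str],
    send: Callable[[Dict[str, Any]], Dict[str, Any]],  # hypothetical transport
) -> Optional[str]:
    """Try each ranked selector in turn; return the first one that succeeds."""
    for selector in selectors:
        response = send(compile_action("CLICK", selector=selector))
        if not is_recoverable(response):
            return selector
    return None
```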
Reflect
After each action, the agent creates an ExecutionTrace recording the goal, observation hash,
chosen locator, action taken, and outcome. These traces are stored in episodic memory, which is
capped at 1000 entries; when it is trimmed, only the most recent 500 are kept.
Successful traces are also indexed by pattern key (goal_actionType) for future retrieval,
enabling the agent to learn from past runs.
Sub-Goal Orchestration
Complex tasks like “reset password for Instagram” are decomposed into sub-goals by the orchestrator. The HyperAgent handles each sub-goal independently.
Action Types
The HyperAgent supports six action types (CLICK, TYPE, PRESS, SCROLL, WAIT_FOR, and NAVIGATE), each with a structured schema.
Memory System
The agent maintains three memory stores during execution:
| Store | Purpose | Retention |
|---|---|---|
| Episodic | Full execution traces with observations, actions, and outcomes | Last 500 entries |
| Semantic | Domain-specific knowledge (e.g., “Instagram uses magic links”) | Persistent |
| Patterns | Indexed by goal_actionType for quick lookup of successful strategies | Persistent |
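The three stores can be modeled with a small in-memory class. This is a sketch under stated assumptions: the class and method names are invented for the example, persistence of the semantic and pattern stores is omitted, and trimming is assumed to happen once the episodic list grows past 1000 entries (the source gives the 1000 and 500 figures but not the exact trigger).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ExecutionTrace:
    goal: str
    observation_hash: str
    locator: dict
    action_type: str
    success: bool


def pattern_key(goal: str, action_type: str) -> str:
    """Pattern key in the goal_actionType form."""
    return f"{goal}_{action_type}"


class AgentMemory:
    EPISODIC_CAP = 1000   # maximum size before trimming (trigger is assumed)
    EPISODIC_KEEP = 500   # entries kept after a trim

    def __init__(self) -> None:
        self.episodic: List[ExecutionTrace] = []
        self.semantic: Dict[str, str] = {}  # e.g. domain -> known fact
        self.patterns: Dict[str, List[ExecutionTrace]] = defaultdict(list)

    def record(self, trace: ExecutionTrace) -> None:
        self.episodic.append(trace)
        if len(self.episodic) > self.EPISODIC_CAP:
            # Keep only the most recent entries.
            self.episodic = self.episodic[-self.EPISODIC_KEEP:]
        if trace.success:
            self.patterns[pattern_key(trace.goal, trace.action_type)].append(trace)

    def recall(self, goal: str, action_type: str) -> List[ExecutionTrace]:
        """Successful traces previously recorded for this goal and action type."""
        return list(self.patterns.get(pattern_key(goal, action_type), []))
```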
Adaptive Intelligence Layer
The AdaptiveIntelligenceAgent is a faster, constrained variant for cases where speed
matters more than flexibility. It operates with:
- 15-second global timeout with 2-second per-step caps
- 6-step budget: navigate, login, forgot, input, submit, verify
- Pre-built universal selectors covering common UI patterns across websites
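The timing constraints can be expressed as a small budget object. The step names and the 15-second and 2-second limits come from the list above; the StepBudget class, the run_steps helper, and the convention that each handler receives its timeout and returns a success flag are assumptions for illustration. Six steps at two seconds each total 12 seconds, so the global timeout leaves about three seconds of slack.

```python
import time
from typing import Callable, Dict, List

STEPS = ("navigate", "login", "forgot", "input", "submit", "verify")


class StepBudget:
    def __init__(
        self,
        total_s: float = 15.0,
        per_step_s: float = 2.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._clock = clock
        self._deadline = clock() + total_s
        self._per_step_s = per_step_s

    def remaining(self) -> float:
        """Seconds left before the global timeout, never negative."""
        return max(0.0, self._deadline - self._clock())

    def step_timeout(self) -> float:
        """The per-step cap, shortened when the global budget is nearly spent."""
        return min(self._per_step_s, self.remaining())


def run_steps(handlers: Dict[str, Callable[[float], bool]], budget: StepBudget) -> List[str]:
    """Run the six steps in order; stop on failure or when time runs out."""
    completed: List[str] = []
    for name in STEPS:
        timeout = budget.step_timeout()
        if timeout <= 0:
            break
        if not handlers[name](timeout):
            break
        completed.append(name)
    return completed
```

Injecting the clock makes the budget easy to exercise in tests without real waiting.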
Universal Selector Banks
Login Selectors
Forgot Password Selectors
Email Input Selectors
Submit Button Selectors
Intelligent Decision Engine
After navigating to a page, the adaptive layer analyzes the DOM to choose its strategy:
| Page State | Decision | Reasoning |
|---|---|---|
| Contains “forgot” or “reset” + has inputs | Input email directly | Already on reset page |
| Contains “sign in” or “login” + has form | Input email directly | Login form present |
| Contains “sign in” or “login” without inputs | Find forgot password link | Navigate to reset flow |
| Has inputs + has buttons (no other signals) | Submit form | Generic form detected |
| No signals detected | Find login page | Default fallback |
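The decision table translates naturally into an ordered chain of checks. In the sketch below the rows are assumed to be evaluated top to bottom with the first match winning; the function name, the boolean inputs, and the returned labels are hypothetical, and how the real layer detects forms, inputs, and buttons is not shown in the source.

```python
def decide(page_text: str, has_inputs: bool, has_form: bool, has_buttons: bool) -> str:
    """Return the next strategy for the adaptive layer (first matching row wins)."""
    text = page_text.lower()
    mentions_reset = "forgot" in text or "reset" in text
    mentions_login = "sign in" in text or "login" in text

    if mentions_reset and has_inputs:
        return "input_email"        # already on the reset page
    if mentions_login and has_form:
        return "input_email"        # login form present
    if mentions_login and not has_inputs:
        return "find_forgot_link"   # navigate to the reset flow
    if has_inputs and has_buttons:
        return "submit_form"        # generic form detected
    return "find_login_page"        # default fallback
```

For example, a landing page that mentions "Login" in its header but exposes no input fields yields find_forgot_link, while a page with no recognizable signals falls through to find_login_page.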
Configuration
The HyperAgent connects to Browserless with stealth mode enabled and several optimizations.
Integration Points
The HyperAgent is invoked by the reset orchestrator as the final fallback tier and by the session-sharing system for login tasks:
| Caller | Use Case | Step Budget and Timeout |
|---|---|---|
| Reset Orchestrator | Password reset when playbooks + universal agent fail | 50 steps, 30s/step |
| Adaptive Intelligence | Fast universal reset with constrained budget | 6 steps, 15s total |
| Session Sharing | Automated login for credential sharing | Configurable |