HyperAgent Browser AI

Overview

The HyperAgent is PassAgent’s most autonomous browser automation layer. Built on top of the CognitionFirstAgent architecture, it implements a five-phase cognitive loop — Observe, Plan, Ground, Act, Reflect — that lets it navigate arbitrary websites without pre-written scripts or playbooks. Unlike the playbook system (which compiles fixed steps) or the universal agent (which tries predefined CSS selectors), the HyperAgent dynamically classifies each page state and decides its next action using LLM reasoning, DOM analysis, and accessibility tree inspection.

The HyperAgent is used as a fallback when playbooks and the universal agent cannot complete a reset flow. It is also the engine behind session-sharing login tasks.

Adaptive Flow Loop

Phase Details

Observe

The agent captures a comprehensive snapshot of the current page state through a single BQL mutation that retrieves four data sources simultaneously:

DOM snapshot — full HTML, page title, and current URL
Accessibility tree — all elements with ARIA roles, names, and states
Screenshot — full-page PNG for vision-based analysis
Interactive elements — all buttons, inputs, links, and [role=button] elements with their visibility, clickability, and typeability status

Each interactive element is scored with a confidence value (0.0-1.0) based on its properties: visible (+0.2), clickable (+0.1), typeable (+0.1), has text (+0.1), has placeholder (+0.1), has role (+0.1), starting from a base of 0.5.
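The scoring rule above can be sketched as a small pure function. The `InteractiveElement` shape and its field names are assumptions made for illustration. Because the listed bonuses add up to 1.2 when every property is present, the sketch clamps the result to 1.0; that clamp is also an assumption, not something this page confirms.

```typescript
// Hypothetical shape for an observed element; field names are assumptions.
interface InteractiveElement {
  visible: boolean;
  clickable: boolean;
  typeable: boolean;
  text?: string;
  placeholder?: string;
  role?: string;
}

const BASE_CONFIDENCE = 0.5;

function scoreElement(el: InteractiveElement): number {
  let score = BASE_CONFIDENCE;
  if (el.visible) score += 0.2;
  if (el.clickable) score += 0.1;
  if (el.typeable) score += 0.1;
  if (el.text) score += 0.1;
  if (el.placeholder) score += 0.1;
  if (el.role) score += 0.1;
  // Round away floating-point noise, then clamp to the documented 0.0-1.0 range (assumed).
  return Math.min(1, Math.round(score * 100) / 100);
}
```

Under these assumptions, a visible, clickable button with text scores 0.9, and a hidden element with no other properties stays at the 0.5 base.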

Plan

The LLM receives the current goal, page state, interactive elements, and the last three execution history entries. It produces a natural-language plan describing which element to interact with and why. The system prompt enumerates the available actions: CLICK, TYPE, PRESS, SCROLL, WAIT_FOR, and NAVIGATE.
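As a rough illustration of how the planning input might be assembled, the sketch below builds a system and user prompt and keeps only the last three history entries. The `PlanInput` and `StepRecord` shapes, the prompt wording, and the `buildPlanPrompt` name are all hypothetical; only the action list and the three-entry window come from the description above.

```typescript
// Hypothetical shapes; the real field names are not given on this page.
interface StepRecord { action: string; outcome: string }
interface PlanInput {
  goal: string;
  pageUrl: string;
  pageTitle: string;
  elements: string[];   // short textual descriptions of interactive elements
  steps: StepRecord[];  // full execution history, oldest first
}

const AVAILABLE_ACTIONS = ["CLICK", "TYPE", "PRESS", "SCROLL", "WAIT_FOR", "NAVIGATE"];

function buildPlanPrompt(input: PlanInput): { system: string; user: string } {
  // Only the last three history entries are forwarded to the LLM.
  const recent = input.steps.slice(-3);
  const system =
    "You control a browser. Describe which element to interact with and why. " +
    `Available actions: ${AVAILABLE_ACTIONS.join(", ")}.`;
  const user = [
    `Goal: ${input.goal}`,
    `Page: ${input.pageTitle} (${input.pageUrl})`,
    "Interactive elements:",
    ...input.elements.map((e, i) => `  ${i + 1}. ${e}`),
    "Recent history:",
    ...recent.map((s) => `  - ${s.action} -> ${s.outcome}`),
  ].join("\n");
  return { system, user };
}
```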

Ground

The plan text is matched against the observed affordances using keyword-based ranking. Each affordance’s score is boosted by +0.2 for every plan keyword that appears in its text or placeholder content. The highest-scoring affordance is selected.

Locator strategies are chosen in preference order:

Role — by: "role" if the element has an ARIA role
CSS ID — by: "css", value: "#id" if an ID is present
Full selector — constructed from tag, ID, and class names
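The ranking and locator preference described above can be sketched as follows. The `Affordance` and `LocatorSpec` shapes are assumptions, as is the tokenization (lowercased words longer than two characters) and the use of the role name as the locator value. Since an element with an ID is caught by the second strategy, the fallback selector here is built from the tag and class names only.

```typescript
// Hypothetical shapes; the real types are not shown on this page.
interface Affordance {
  tag: string;
  id?: string;
  classes: string[];
  role?: string;
  text?: string;
  placeholder?: string;
  confidence: number; // score from the Observe phase
}
type LocatorSpec = { by: "role" | "css"; value: string };

function pickAffordance(plan: string, affordances: Affordance[]): Affordance | undefined {
  // Assumed tokenization: lowercase words of three or more characters, de-duplicated.
  const keywords = Array.from(
    new Set(plan.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 2)),
  );
  let best: Affordance | undefined;
  let bestScore = -Infinity;
  for (const a of affordances) {
    const haystack = `${a.text ?? ""} ${a.placeholder ?? ""}`.toLowerCase();
    const hits = keywords.filter((k) => haystack.includes(k)).length;
    const score = a.confidence + 0.2 * hits; // +0.2 per matching plan keyword
    if (score > bestScore) {
      best = a;
      bestScore = score;
    }
  }
  return best;
}

function chooseLocator(a: Affordance): LocatorSpec {
  if (a.role) return { by: "role", value: a.role };       // 1. ARIA role
  if (a.id) return { by: "css", value: `#${a.id}` };      // 2. CSS ID
  const classes = a.classes.map((c) => `.${c}`).join("");
  return { by: "css", value: `${a.tag}${classes}` };      // 3. full selector
}
```

With a plan such as "Click the Forgot password link", a link whose text is "Forgot password?" gains two keyword boosts and outranks a higher-confidence "Sign in" button.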

Act

The grounded action is compiled to a BQL mutation and sent to Browserless. Each action type maps to a specific BQL operation:

Action	BQL Operation
`CLICK`	`click(selector)`
`TYPE`	`type(selector, text)`
`PRESS`	`press(key)`
`SCROLL`	`scroll(by: {x, y})`
`WAIT_FOR`	`waitFor(selector, timeout)`
`NAVIGATE`	`goto(url)`

If the BQL request returns GraphQL errors, the result is marked as recoverable and the agent retries with the next-best affordance.
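The retry behaviour can be sketched without committing to a specific BQL call. A GraphQL response carries an optional `errors` array; everything else here (the `ActionResult` shape, the `runWithFallback` helper, and the injected `send` callback standing in for the real Browserless request) is hypothetical. The callback is synchronous only to keep the sketch short; a real request would be asynchronous.

```typescript
interface GraphQLResponse {
  data?: unknown;
  errors?: { message: string }[];
}
// Hypothetical result shape.
interface ActionResult { success: boolean; recoverable: boolean; error?: string }

function interpretResponse(res: GraphQLResponse): ActionResult {
  if (res.errors && res.errors.length > 0) {
    // GraphQL errors are treated as recoverable, so another candidate can be tried.
    const error = res.errors.map((e) => e.message).join("; ");
    return { success: false, recoverable: true, error };
  }
  return { success: true, recoverable: false };
}

// Try candidates in ranked order until one succeeds or a non-recoverable result appears.
function runWithFallback<T>(
  ranked: T[],
  send: (candidate: T) => GraphQLResponse,
): { chosen?: T; result: ActionResult } {
  let last: ActionResult = { success: false, recoverable: false, error: "no candidates" };
  for (const candidate of ranked) {
    last = interpretResponse(send(candidate));
    if (last.success) return { chosen: candidate, result: last };
    if (!last.recoverable) break;
  }
  return { result: last };
}
```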

Reflect

After each action, the agent creates an ExecutionTrace recording the goal, observation hash, chosen locator, action taken, and outcome. These traces are stored in episodic memory (capped at 1000 entries, trimmed to the most recent 500). Successful traces are also indexed by pattern key (goal_actionType) for future retrieval, enabling the agent to learn from past runs.
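A minimal sketch of the cap-and-trim behaviour and the pattern index, assuming the trim fires once the store grows past 1000 entries. The `ExecutionTrace` fields mirror the list above, but their exact names, the `AgentMemory` class, and its methods are hypothetical.

```typescript
// Field names are assumptions based on the description above.
interface ExecutionTrace {
  goal: string;
  observationHash: string;
  locator: string;
  actionType: string;
  success: boolean;
}

const EPISODIC_CAP = 1000;
const EPISODIC_KEEP = 500;

class AgentMemory {
  private episodic: ExecutionTrace[] = [];
  private patterns = new Map<string, ExecutionTrace[]>();

  record(trace: ExecutionTrace): void {
    this.episodic.push(trace);
    // Once the cap is exceeded, keep only the most recent 500 traces.
    if (this.episodic.length > EPISODIC_CAP) {
      this.episodic = this.episodic.slice(-EPISODIC_KEEP);
    }
    // Successful traces are also indexed under "goal_actionType".
    if (trace.success) {
      const key = `${trace.goal}_${trace.actionType}`;
      const bucket = this.patterns.get(key) ?? [];
      bucket.push(trace);
      this.patterns.set(key, bucket);
    }
  }

  episodicSize(): number {
    return this.episodic.length;
  }

  recall(goal: string, actionType: string): ExecutionTrace[] {
    return this.patterns.get(`${goal}_${actionType}`) ?? [];
  }
}
```

In this sketch the pattern index is not trimmed along with episodic memory, so successful strategies remain available after older traces are dropped.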

Sub-Goal Orchestration

Complex tasks like “reset password for Instagram” are decomposed into sub-goals by the orchestrator. The HyperAgent handles each sub-goal independently:

Goal: "Reset password for Instagram"
  Sub-goal 1: Navigate to login page
  Sub-goal 2: Find and click "Forgot password" link
  Sub-goal 3: Enter email address
  Sub-goal 4: Submit the reset request
  Sub-goal 5: Verify success confirmation

Goal completion is detected by checking page state against known indicators. For password resets, the agent looks for “check your email” in the page title. For login tasks, it checks for “welcome” text.
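The two indicators mentioned above could be checked along these lines. The `TaskKind` and `PageState` names are hypothetical, and case-insensitive matching is an assumption.

```typescript
type TaskKind = "password_reset" | "login";
// Hypothetical page snapshot: the title plus the visible body text.
interface PageState { title: string; bodyText: string }

function isGoalComplete(kind: TaskKind, page: PageState): boolean {
  if (kind === "password_reset") {
    // Password resets: look for "check your email" in the page title.
    return page.title.toLowerCase().includes("check your email");
  }
  // Login tasks: look for "welcome" text on the page.
  return page.bodyText.toLowerCase().includes("welcome");
}
```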

Action Types

The HyperAgent supports six action types, each with a structured schema:

type AgentAction =
  | { type: "CLICK";    target: LocatorSpec; confidence: number }
  | { type: "TYPE";     target: LocatorSpec; text: string; confidence: number }
  | { type: "PRESS";    key: "Enter" | "Tab" | "Escape"; confidence: number }
  | { type: "SCROLL";   by?: { x: number; y: number }; confidence: number }
  | { type: "WAIT_FOR"; expect: Expectation; timeout?: number; confidence: number }
  | { type: "NAVIGATE"; url: string; confidence: number }

Each action carries a confidence score that reflects how certain the agent is about the selected element. The orchestrator can use this score to decide whether to proceed or escalate to a human.
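One way an orchestrator might act on the score is a simple threshold check. The 0.6 threshold, the `nextStep` helper, and the `LocatorSpec` and `Expectation` shapes are assumptions for illustration; this page does not state how the orchestrator actually decides.

```typescript
// Hypothetical supporting types; only AgentAction is taken from the schema above.
type LocatorSpec = { by: "role" | "css"; value: string };
type Expectation = { selector: string };

type AgentAction =
  | { type: "CLICK";    target: LocatorSpec; confidence: number }
  | { type: "TYPE";     target: LocatorSpec; text: string; confidence: number }
  | { type: "PRESS";    key: "Enter" | "Tab" | "Escape"; confidence: number }
  | { type: "SCROLL";   by?: { x: number; y: number }; confidence: number }
  | { type: "WAIT_FOR"; expect: Expectation; timeout?: number; confidence: number }
  | { type: "NAVIGATE"; url: string; confidence: number };

const ESCALATION_THRESHOLD = 0.6; // assumed value

function nextStep(
  action: AgentAction,
  threshold: number = ESCALATION_THRESHOLD,
): "proceed" | "escalate" {
  return action.confidence >= threshold ? "proceed" : "escalate";
}
```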

Memory System

The agent maintains three memory stores during execution:

Store	Purpose	Retention
Episodic	Full execution traces with observations, actions, and outcomes	Capped at 1000 entries, trimmed to the most recent 500
Semantic	Domain-specific knowledge (e.g., “Instagram uses magic links”)	Persistent
Patterns	Indexed by `goal_actionType` for quick lookup of successful strategies	Persistent

Adaptive Intelligence Layer

The AdaptiveIntelligenceAgent is a faster, constrained variant for cases where speed matters more than flexibility. It operates with:

15-second global timeout with 2-second per-step caps
6-step budget: navigate, login, forgot, input, submit, verify
Pre-built universal selectors covering common UI patterns across websites
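The interaction between the two time limits can be sketched as a budget calculation: each step gets the per-step cap or whatever remains of the global timeout, whichever is smaller. The function and constant names are hypothetical.

```typescript
const GLOBAL_TIMEOUT_MS = 15000;
const STEP_CAP_MS = 2000;
const STEP_ORDER = ["navigate", "login", "forgot", "input", "submit", "verify"] as const;

// Time the next step may use, given how much of the global budget is already spent.
function stepBudgetMs(elapsedMs: number): number {
  return Math.max(0, Math.min(STEP_CAP_MS, GLOBAL_TIMEOUT_MS - elapsedMs));
}

// Worst case if every step runs to its cap.
function worstCaseTotalMs(): number {
  return STEP_ORDER.length * STEP_CAP_MS;
}
```

Six steps at 2 seconds each total 12 seconds, which leaves about 3 seconds of slack inside the 15-second global limit for anything that happens between steps.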

Universal Selector Banks

Login Selectors

a[href*="login"], a[href*="signin"], a[href*="sign-in"],
[data-testid*="login"], [aria-label*="Sign In"],
button:has-text("Sign In"), button:has-text("Login"),
.login, .signin, #login, #signin

Forgot Password Selectors

a[href*="forgot"], a[href*="reset"], a[href*="password"],
a:has-text("Forgot"), a:has-text("Reset"),
button:has-text("Forgot"), [data-testid*="forgot"],
.forgot, .reset, #forgot, #reset

Email Input Selectors

input[type="email"], input[name*="email"], input[id*="email"],
input[name*="username"], input[placeholder*="email"],
[data-testid*="email"], [aria-label*="email"],
input[autocomplete="email"], input[autocomplete="username"]

Submit Button Selectors

button[type="submit"], input[type="submit"],
button:has-text("Submit"), button:has-text("Continue"),
button:has-text("Next"), button:has-text("Send"),
[data-testid*="submit"], .submit, .continue, #submit
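A bank like the ones above could be tried one selector at a time, so that earlier and more specific entries win. Whether the agent evaluates a bank in sequence or as a single selector list is not stated here, so the ordering is an assumption, as are the `firstMatch` helper and the injected `exists` check. Note that `:has-text()` is not standard CSS and resembles Playwright's selector extension, so the check must run in an engine that understands it.

```typescript
// Submit bank copied from the list above.
const SUBMIT_SELECTORS: string[] = [
  'button[type="submit"]', 'input[type="submit"]',
  'button:has-text("Submit")', 'button:has-text("Continue")',
  'button:has-text("Next")', 'button:has-text("Send")',
  '[data-testid*="submit"]', ".submit", ".continue", "#submit",
];

// Return the first selector in the bank that matches something on the page.
// `exists` is a stand-in for a real page query and is supplied by the caller.
function firstMatch(
  bank: string[],
  exists: (selector: string) => boolean,
): string | undefined {
  return bank.find((selector) => exists(selector));
}
```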

Intelligent Decision Engine

After navigating to a page, the adaptive layer analyzes the DOM to choose its strategy:

Page State	Decision	Reasoning
Contains “forgot” or “reset” + has inputs	Input email directly	Already on reset page
Contains “sign in” or “login” + has form	Input email directly	Login form present
Contains “sign in” or “login” without inputs	Find forgot password link	Navigate to reset flow
Has inputs + has buttons (no other signals)	Submit form	Generic form detected
No signals detected	Find login page	Default fallback
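The decision table can be read as an ordered series of checks, sketched below. The `PageSignals` shape and the decision labels are hypothetical, and evaluating the rows top to bottom is an assumption about precedence.

```typescript
type Decision = "input_email" | "find_forgot_link" | "submit_form" | "find_login";

// Hypothetical summary of what DOM analysis found on the page.
interface PageSignals {
  text: string;       // visible page text
  hasForm: boolean;
  hasInputs: boolean;
  hasButtons: boolean;
}

function decide(signals: PageSignals): Decision {
  const text = signals.text.toLowerCase();
  const mentionsReset = text.includes("forgot") || text.includes("reset");
  const mentionsLogin = text.includes("sign in") || text.includes("login");

  if (mentionsReset && signals.hasInputs) return "input_email";       // already on reset page
  if (mentionsLogin && signals.hasForm) return "input_email";         // login form present
  if (mentionsLogin && !signals.hasInputs) return "find_forgot_link"; // navigate to reset flow
  if (signals.hasInputs && signals.hasButtons) return "submit_form";  // generic form detected
  return "find_login";                                                // default fallback
}
```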

Configuration

The HyperAgent connects to Browserless with stealth mode enabled and several optimizations:

blockConsentModals=true    # Auto-dismiss cookie banners
stealth=true               # Anti-bot detection evasion
blockAds=true              # Remove ad scripts
blockTrackers=true         # Remove tracking scripts
windowSize=1920,1080       # Standard desktop viewport

CAPTCHA auto-solving is intentionally disabled (solveCaptchas is not set) because the platform’s auto-solver can terminate live sessions prematurely. CAPTCHA handling is delegated to the CAPTCHA escalation controller instead.
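As a sketch of how these options might be attached to a connection URL as query parameters, the helper below uses the standard `URL` API. The base endpoint, the `token` parameter, and the `buildConnectionUrl` name are assumptions; only the five option names and values come from the list above, and `solveCaptchas` is deliberately left unset.

```typescript
// Options from the configuration list above.
const HYPERAGENT_OPTIONS: Record<string, string> = {
  blockConsentModals: "true",
  stealth: "true",
  blockAds: "true",
  blockTrackers: "true",
  windowSize: "1920,1080",
};

// baseUrl and token are placeholders supplied by the caller (assumed, not documented here).
function buildConnectionUrl(baseUrl: string, token: string): string {
  const url = new URL(baseUrl);
  url.searchParams.set("token", token);
  for (const [key, value] of Object.entries(HYPERAGENT_OPTIONS)) {
    url.searchParams.set(key, value);
  }
  // solveCaptchas is intentionally not set; CAPTCHA handling is delegated elsewhere.
  return url.toString();
}
```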

Integration Points

The HyperAgent is invoked by the reset orchestrator as the final fallback tier and by the session-sharing system for login tasks:

Caller	Use Case	Step Budget and Timeout
Reset Orchestrator	Password reset when playbooks + universal agent fail	50 steps, 30s/step
Adaptive Intelligence	Fast universal reset with constrained budget	6 steps, 15s total
Session Sharing	Automated login for credential sharing	Configurable
