AgentQL Page

AgentQL Page is a wrapper around Playwright's Page that provides access to AgentQL's querying API.

The following example creates a Playwright page, navigates it to a URL, and queries for web elements using AgentQL:

agentql_example.py
python
import agentql
from playwright.sync_api import sync_playwright

QUERY = """
{
    search_box
    search_btn
}
"""

with sync_playwright() as p, p.chromium.launch(headless=False) as browser:
    page = agentql.wrap(browser.new_page())  # Wrapped to access AgentQL's query API
    page.goto("https://duckduckgo.com")

    aql_response = page.query_elements(QUERY)
    aql_response.search_box.type("AgentQL")
    aql_response.search_btn.click()

    # Used only for demo purposes. It allows you to see the effect of the script.
    page.wait_for_timeout(10000)

Methods

get_by_prompt

Returns a single web element located by a natural language prompt (as opposed to an AgentQL query).

Usage

agentql_example.py
python
search_box = page.get_by_prompt(prompt="Search input field")

Arguments

  • prompt str

    The natural language description of the element to locate.

  • timeout int (optional)

    Timeout value in seconds for the connection with the backend API service.

  • wait_for_network_idle bool (optional)

    Whether to wait for the network to reach a fully idle state before querying the page. If set to False, this method only checks whether the page has emitted the load event. Defaults to True.

  • include_hidden bool (optional)

    Whether to include hidden elements on the page. Defaults to False.

  • mode ResponseMode (optional)

    The mode of the query. It can be either standard or fast. Defaults to fast mode.

Returns

  • Locator | None

    Playwright Locator for the found element or None if no matching elements were found.
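
Because get_by_prompt can return None, it usually helps to guard the result before interacting with it. The following is a minimal sketch, assuming page is an AgentQL-wrapped page as in the opening example; fill_by_prompt is a hypothetical helper name and not part of AgentQL.

```python
def fill_by_prompt(page, prompt, text):
    """Fill the element described by `prompt` and report whether it was found.

    `fill_by_prompt` is a hypothetical helper, not part of AgentQL.
    """
    element = page.get_by_prompt(prompt=prompt)
    if element is None:
        # No element matched the natural language description.
        return False
    element.fill(text)  # Playwright's Locator.fill
    return True
```

A call such as fill_by_prompt(page, "Search input field", "AgentQL") would then return False instead of raising an AttributeError when nothing matches.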


query_elements

Queries the web page for multiple web elements that match the AgentQL query.

Usage

agentql_example.py
python
agentql_response = page.query_elements(
    query="""
        {
            search_box
            search_btn
        }
        """
)
print(agentql_response.to_data())

Arguments

  • query str

    An AgentQL query, passed as a string.

  • timeout int (optional)

    Timeout value in seconds for the connection with the backend API service.

  • wait_for_network_idle bool (optional)

    Whether to wait for the network to reach a fully idle state before querying the page. If set to False, this method only checks whether the page has emitted the load event. Defaults to True.

  • include_hidden bool (optional)

    Whether to include hidden elements on the page. Defaults to False.

  • mode ResponseMode (optional)

    The mode of the query. It can be either standard or fast. Defaults to fast mode.

Returns

  • AQLResponseProxy

    The AgentQL response object with elements that match the query. Response provides access to requested elements via its fields.
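
The field access described above can be sketched as a small helper that runs one query and then drives both returned elements. This is an illustration only: submit_search is a hypothetical name, the timeout of 60 seconds is an arbitrary choice, and the type and click calls mirror the opening example on this page.

```python
SEARCH_QUERY = """
{
    search_box
    search_btn
}
"""


def submit_search(page, term, timeout=60):
    """Locate the search controls with one query, then drive them.

    `submit_search` is a hypothetical helper; the default timeout is arbitrary.
    """
    response = page.query_elements(query=SEARCH_QUERY, timeout=timeout)
    # Each name in the query is available as a field on the response.
    response.search_box.type(term)
    response.search_btn.click()
    return response
```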


query_data

Queries the web page for data that matches the AgentQL query, such as blocks of text or numbers.

Usage

agentql_example.py
python
retrieved_data = page.query_data(
    query="""
        {
            products[] {
                name
                price(integer)
            }
        }
        """
)
print(retrieved_data)

Arguments

  • query str

    An AgentQL query, passed as a string.

  • timeout int (optional)

    Timeout value in seconds for the connection with the backend API service.

  • wait_for_network_idle bool (optional)

    Whether to wait for the network to reach a fully idle state before querying the page. If set to False, this method only checks whether the page has emitted the load event. Defaults to True.

  • include_hidden bool (optional)

    Whether to include hidden elements on the page. Defaults to True.

  • mode ResponseMode (optional)

    The mode of the query. It can be either standard or fast. Defaults to fast mode.

Returns

  • dict

    Data that matches the query.
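
Since the result is a plain dict, it can be processed with ordinary Python. The sketch below assumes the dict mirrors the shape of the products query in the usage example, that is, a "products" list whose entries carry "name" and "price" keys. The cheapest_product helper and the sample data are made up for illustration.

```python
def cheapest_product(data):
    """Return the entry with the lowest price, or None if there is none.

    Assumes `data` looks like {"products": [{"name": ..., "price": ...}, ...]}.
    """
    # Skip entries for which no price could be extracted.
    products = [p for p in data.get("products") or [] if p.get("price") is not None]
    return min(products, key=lambda p: p["price"], default=None)


# Made-up sample standing in for the return value of page.query_data(...).
sample = {
    "products": [
        {"name": "Keyboard", "price": 45},
        {"name": "Mouse", "price": 19},
        {"name": "Unlisted", "price": None},
    ]
}
print(cheapest_product(sample))  # {'name': 'Mouse', 'price': 19}
```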


wait_for_page_ready_state

Waits for the page to reach the "Page Ready" state, meaning the page has entered a relatively stable state and most of its main content has loaded. This can be useful before running an AgentQL query or any other interaction on slowly rendering pages.

Usage

agentql_example.py
python
page.wait_for_page_ready_state()

Arguments

  • wait_for_network_idle bool (optional)

    Whether to wait for the network to reach a fully idle state. If set to False, this method only checks whether the page has emitted the load event. Defaults to True.
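
One way to use this method is to call it immediately before a query on a page that renders slowly. The following is a minimal sketch under that assumption; query_data_when_ready is a hypothetical helper name and not part of AgentQL.

```python
def query_data_when_ready(page, query, wait_for_network_idle=True):
    """Wait for the page to settle, then run a data query.

    `query_data_when_ready` is a hypothetical helper, not part of AgentQL.
    """
    # Block until the page reports the "Page Ready" state.
    page.wait_for_page_ready_state(wait_for_network_idle=wait_for_network_idle)
    return page.query_data(query=query)
```

Passing wait_for_network_idle=False trades some stability for speed on pages that keep background requests open.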

Returns

  • None

    This method does not return a value.


enable_stealth_mode

Enables "stealth mode" with the given configuration. To avoid being flagged as a bot, the parameter values should match the real values reported by your device.

note

Use browser fingerprinting websites such as bot.sannysoft.com and pixelscan.net for realistic examples.

Usage

agentql_example.py
python
page.enable_stealth_mode(
    webgl_vendor=your_browser_vendor,
    webgl_renderer=your_browser_renderer,
    nav_user_agent=navigator_user_agent,
)

Arguments

  • webgl_vendor str (optional)

    The vendor of the GPU used by WebGL to render graphics, such as "Apple Inc.". After setting this parameter, your browser will emit this vendor information.

  • webgl_renderer str (optional)

    Identifies the specific GPU model or graphics rendering engine used by WebGL, such as Apple M3. After setting this parameter, your browser will emit this renderer information.

  • nav_user_agent str (optional)

    Identifies the browser, its version, and the operating system, such as Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36. After setting this parameter, your browser will send this user agent information to the website.
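
The three arguments can be kept together in one settings dict and unpacked into the call. The sketch below reuses the example values from the argument descriptions above, which should be replaced with the values your own browser reports. The open_stealth_page helper is hypothetical, and enabling stealth mode before the first navigation is an assumption of this sketch rather than a documented requirement.

```python
# Example values taken from the argument descriptions; replace them with
# the values your own browser reports.
STEALTH_SETTINGS = {
    "webgl_vendor": "Apple Inc.",
    "webgl_renderer": "Apple M3",
    "nav_user_agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/121.0.0.0 Safari/537.36"
    ),
}


def open_stealth_page(page, url, settings=STEALTH_SETTINGS):
    """Enable stealth mode, then navigate to `url`.

    `open_stealth_page` is a hypothetical helper, not part of AgentQL.
    """
    page.enable_stealth_mode(**settings)
    page.goto(url)
    return page
```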

Returns

  • None

    This method does not return a value.


get_pagination_info

Returns pagination information and status of the current page.

Usage

pagination_example.py
python
pagination_info = page.get_pagination_info()

Arguments

  • timeout int (optional)

    Timeout value in seconds for the connection with the backend API service when querying the pagination information.

  • wait_for_network_idle bool (optional)

    Whether to wait for the network to reach a fully idle state before querying the page for pagination information. If set to False, this method only checks whether the page has emitted the load event. Defaults to True.

  • include_hidden bool (optional)

    Whether to include hidden elements on the page when querying for pagination information. Defaults to False.

  • mode ResponseMode (optional)

    The mode of the query for retrieving the pagination information. It can be either standard or fast. Defaults to fast mode.

Returns

  • PaginationInfo

    The PaginationInfo object provides access to pagination availability and to functionality for navigating to the next page.
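
A typical use is to collect data from several consecutive pages. This page does not name the members of PaginationInfo, so the sketch below assumes a has_next_page attribute and a navigate_to_next_page() method; verify both names against your SDK version. The collect_pages helper is hypothetical.

```python
def collect_pages(page, query, max_pages=5):
    """Run `query` on the current page and on the pages that follow it.

    Assumes PaginationInfo exposes `has_next_page` and
    `navigate_to_next_page()`; these names are not confirmed by this page.
    """
    results = [page.query_data(query=query)]
    for _ in range(max_pages - 1):
        info = page.get_pagination_info()
        if not info.has_next_page:
            break  # Last page reached.
        info.navigate_to_next_page()
        results.append(page.query_data(query=query))
    return results
```

The max_pages cap keeps the loop bounded on sites with very long or endless pagination.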


Types

ResponseMode

The ResponseMode type specifies the query mode for the query_elements(), query_data(), get_by_prompt(), and get_pagination_info() methods. It accepts the following two values:

  • standard

Executes the query in Standard Mode. Use this mode when your queries are complex or extensive data retrieval is necessary.

  • fast

Executes the query more quickly, potentially at the cost of response accuracy. This mode is useful in situations where speed is prioritized, and the query is straightforward.