AgentQL Tools

The agentql.tools module provides utility methods to help with data extraction and web automation.

The following example demonstrates how to use the paginate method to collect data from multiple pages:

agentql_pagination_example.py
python
import agentql
from agentql.tools.sync_api import paginate
from playwright.sync_api import sync_playwright

with sync_playwright() as p, p.chromium.launch(headless=False) as browser:
    page = agentql.wrap(browser.new_page())
    page.goto("https://news.ycombinator.com/")

    # Define the query to extract the titles of the posts
    QUERY = """
    {
        posts[] {
            title
        }
    }
    """

    # Collect data from the first 3 pages using the query
    paginated_data = paginate(page, QUERY, 3)
    print(paginated_data)

Methods

paginate

Collects data from multiple pages using an AgentQL query. Internally, the function first attempts to find the operable element that navigates to the next page and clicks it, then uses the provided query to extract the data from the page. It repeats this process for the specified number of pages.

Usage

paginate_example.py
python
paginated_data = paginate(page, QUERY, 3)

Arguments

  • page AgentQL Page

    The AgentQL Page object.

  • query str

    An AgentQL query, as a string.

  • number_of_pages int

    Number of pages to paginate over.

  • timeout int (optional)

    Timeout, in seconds, for the connection to the backend API service when querying for the pagination element.

  • wait_for_network_idle bool (optional)

    Whether to wait for the network to reach a fully idle state before querying the page for the pagination element. If set to False, this method only checks whether the page has emitted the load event. Defaults to True.

  • include_hidden bool (optional)

    Whether to include hidden elements on the page when querying for the pagination element. Defaults to False.

  • mode ResponseMode (optional)

    The query mode used to retrieve the pagination element: either standard or fast. Defaults to fast.

  • force_click bool (optional)

    Whether to force click on the pagination element. Defaults to False.
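
The optional arguments above can be combined in a single call. The following is a minimal sketch, assuming they are accepted as keyword arguments under the names listed; the helper collect_pages, the PAGINATE_OPTIONS dictionary, and the values in it are hypothetical and only illustrate the shape of the call.

```python
from typing import Any, Dict, List

# Keyword names come from the argument list above; the values are illustrative only.
PAGINATE_OPTIONS: Dict[str, Any] = {
    "timeout": 60,                   # seconds; hypothetical value
    "wait_for_network_idle": False,  # only wait for the page's load event
    "include_hidden": False,         # skip hidden elements (the documented default)
    "force_click": True,             # force the click on the pagination element
}


def collect_pages(url: str, query: str, number_of_pages: int) -> List[dict]:
    """Open url and paginate with the options above.

    Assumption: paginate accepts the optional arguments as keywords.
    """
    # Imported lazily so this module can be imported without AgentQL
    # or a browser being available.
    import agentql
    from agentql.tools.sync_api import paginate
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p, p.chromium.launch(headless=False) as browser:
        page = agentql.wrap(browser.new_page())
        page.goto(url)
        return paginate(page, query, number_of_pages, **PAGINATE_OPTIONS)
```

Calling collect_pages("https://news.ycombinator.com/", QUERY, 3) with the query from the first example would then run the same pagination with these options applied. Keeping the options in one dictionary makes it easy to reuse the same settings across several calls.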

Returns

  • List[dict]

    List of dictionaries containing the data from each page.
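
Because the result is a list of dictionaries, it usually needs flattening before further use. The sketch below assumes each dictionary corresponds to one page and mirrors the structure of the query from the first example (a posts list whose items have a title). The helper collect_titles and the sample data are hypothetical stand-ins, not output from a real run.

```python
from typing import Any, Dict, List


def collect_titles(paginated_data: List[Dict[str, Any]]) -> List[str]:
    """Flatten post titles from per-page results into one list."""
    titles: List[str] = []
    for page_result in paginated_data:
        # Assumption: each page's dict mirrors the query, so it has a "posts" list.
        # A missing or empty "posts" value is treated as no posts.
        for post in page_result.get("posts") or []:
            title = post.get("title")
            if title:
                titles.append(title)
    return titles


# Hypothetical sample standing in for the value returned by paginate(page, QUERY, 2).
sample_data = [
    {"posts": [{"title": "First post"}, {"title": "Second post"}]},
    {"posts": [{"title": "Third post"}]},
]

print(collect_titles(sample_data))
```

The helper tolerates pages where posts is missing or empty, which keeps one sparse page from breaking the whole collection.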