Scraping data with query_data

Use the query_data method to extract structured data from a web page, such as product details, user reviews, or other information.

caution

Unlike query_elements or get_by_prompt, query_data returns data rather than interactive elements.

Overview

This guide shows you how to use query_data and work with the data output.

Define the data query

First, define an AgentQL query that describes how to structure the data.

For example, the following query requests the product category, plus the name and price of every product within that category.

{
    product_category
    product[] {
        name
        price
    }
}
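To use the query from a Python script, one option is to keep it in a string constant. The sketch below assumes the constant is named PRODUCTS_QUERY, which matches the name passed to query_data in the next section; the name itself is only a convention.

```python
# Assumption: the query is stored as a module-level string constant.
# The text inside the triple quotes is the query shown above, unchanged.
PRODUCTS_QUERY = """
{
    product_category
    product[] {
        name
        price
    }
}
"""
```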

Run the data query

Within your script, you can now pass your query into the query_data method.

example.py
python
# PRODUCTS_QUERY holds the query from the previous step as a string
products_response = page.query_data(PRODUCTS_QUERY)

Understanding the data output

When you run the query, it returns a dictionary containing the retrieved data formatted according to the query schema.

Here's an example of what the query might return:

{
    'product_category': 'Coffee Beans',
    'product': [
        {
            'name': 'Starbucks Coffee Beans',
            'price': '$16.99'
        },
        {
            'name': 'Blue Bottle Coffee Beans',
            'price': '$17.99'
        }
    ]
}
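In the sample above, prices come back as strings such as '$16.99', so comparing or summing them needs a conversion step. The following is a minimal sketch of one way to do that with the standard library. It assumes US-style dollar formatting, and parse_price is a hypothetical helper, not part of AgentQL.

```python
from decimal import Decimal

# Sample response copied from the example above (not live data).
products_response = {
    "product_category": "Coffee Beans",
    "product": [
        {"name": "Starbucks Coffee Beans", "price": "$16.99"},
        {"name": "Blue Bottle Coffee Beans", "price": "$17.99"},
    ],
}


def parse_price(text: str) -> Decimal:
    """Hypothetical helper: drop a leading '$' and any thousands separators."""
    return Decimal(text.strip().lstrip("$").replace(",", ""))


# Convert every price string to a Decimal for exact arithmetic.
prices = [parse_price(item["price"]) for item in products_response["product"]]

# Find the product with the lowest numeric price.
cheapest = min(
    products_response["product"],
    key=lambda item: parse_price(item["price"]),
)
print(f"Cheapest: {cheapest['name']}")
```

Decimal is used here instead of float to avoid rounding surprises with currency values. Pages that format prices differently would need a different parsing rule.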

Accessing the data output

Finally, access any part of the response in your script as you would a standard Python dictionary, using the keys defined in the query.

The following snippet includes some common examples using the scenario from this guide:

example.py
python
# Access the product category
category = products_response['product_category']
print(f"Product Category: {category}")

# Access the list of products
products = products_response['product']

# Iterate through the products and print their details
for product in products:
    name = product['name']
    price = product['price']
    print(f"Product: {name}, Price: {price}")
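A common next step after reading the values is to save them. The sketch below shows one possible approach using Python's csv module. The function name products_to_csv is hypothetical, and the use of .get() with empty-string defaults is an assumption for cases where a field might be absent from the response.

```python
import csv
import io


def products_to_csv(response: dict) -> str:
    """Hypothetical helper: render the 'product' list as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"])
    writer.writeheader()
    # .get() keeps the export from raising KeyError if a key is missing.
    for product in response.get("product", []):
        writer.writerow(
            {
                "name": product.get("name", ""),
                "price": product.get("price", ""),
            }
        )
    return buffer.getvalue()


# Sample response matching the scenario in this guide (not live data).
sample_response = {
    "product_category": "Coffee Beans",
    "product": [
        {"name": "Starbucks Coffee Beans", "price": "$16.99"},
        {"name": "Blue Bottle Coffee Beans", "price": "$17.99"},
    ],
}

csv_text = products_to_csv(sample_response)
print(csv_text)
```

To write a file instead of a string, the same writer can be pointed at a file opened with newline="" as the csv module documentation recommends.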

Conclusion

Remember that query_data is ideal for scraping and retrieving data, while query_elements is ideal for interacting with elements on the page.