Scraping Data with AgentQL's REST API
AgentQL’s REST API enables powerful, flexible data retrieval from webpages in a structured format, ready for seamless integration into your workflow.
Overview
This guide shows you how to use the REST API to scrape data from a webpage, customize parameters for enhanced scraping capabilities, and retrieve structured data in JSON format with AgentQL queries.
Defining the REST API Request Structure
The following fields outline the high-level structure of a data scraping request:
url
: The URL of the webpage you want to retrieve data from.
query
: An AgentQL query that defines the data to extract and the format for the retrieved output.
params
: (Optional) Additional settings for enhanced data retrieval, such as enabling screenshots or scrolling. See the API Reference for more details about params.
Constructing the API Request
To perform a basic data scraping request, start by defining the url of the desired webpage and the query to specify the data you want to retrieve in the request body.
- Example REST API Request
Below is an example request body structure:
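A minimal sketch of such a request body, built in Python and serialized to JSON. The target URL and the query fields (products, name, price) are hypothetical placeholders chosen for illustration; only the top-level url and query keys come from the structure described above.

```python
import json

# Hypothetical target page and AgentQL query, used only for illustration.
request_body = {
    "url": "https://example.com/products",
    "query": "{ products[] { name price } }",
}

# Serialize to the JSON string that would be sent as the request body.
payload = json.dumps(request_body, indent=2)
print(payload)
```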
- Setting Request Headers
Before making the API request, include the necessary headers for authentication and content type. These headers authorize the request and specify the data format being sent.
- X-API-Key: This header should contain your AgentQL API key for authentication.
- Content-Type: Set it to application/json to indicate that the request body is in JSON format, allowing the server to interpret the data correctly.
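The two headers can be sketched as a small helper. Reading the key from an environment variable named AGENTQL_API_KEY is an assumption made here for illustration, not something the API requires; the helper name build_headers is also hypothetical.

```python
import os


def build_headers(api_key: str) -> dict[str, str]:
    """Return the authentication and content-type headers for a request."""
    return {
        "X-API-Key": api_key,                # your AgentQL API key
        "Content-Type": "application/json",  # the body is sent as JSON
    }


# AGENTQL_API_KEY is an assumed variable name; fall back to a placeholder.
headers = build_headers(os.environ.get("AGENTQL_API_KEY", "your-api-key"))
```

Keeping the key out of the source file makes it harder to leak by accident when the script is shared.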
- Making the API Request
Using your preferred HTTP client (like curl, Postman, or an HTTP library in Python or your preferred language), you can make a POST request to the AgentQL REST API endpoint.
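One way this could look with Python's standard library is sketched below. The endpoint URL is an assumption and should be confirmed against the API Reference; the target page, the query, and the helper names build_request and send are hypothetical. The network call is left commented out so the sketch can be run without a valid key.

```python
import json
import urllib.request

# Assumed endpoint; confirm the exact URL in the API Reference.
AGENTQL_ENDPOINT = "https://api.agentql.com/v1/query-data"


def build_request(api_key, url, query, endpoint=AGENTQL_ENDPOINT):
    """Build a POST request carrying the JSON body and required headers."""
    body = json.dumps({"url": url, "query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60.0):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request(
    "your-api-key",                   # placeholder key
    "https://example.com/products",   # hypothetical page
    "{ products[] { name price } }",  # hypothetical query
)
# result = send(request)  # uncomment to perform the network call
```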
- Reviewing the API Response
If the request is successful, the API will return a JSON response with the extracted data.
Example Response
You can read more about the response structure and metadata fields in the API Reference.
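As a rough sketch of handling the response in Python, the snippet below assumes the extracted data sits under a top-level data key next to a metadata object. That layout, the sample values, and the helper name extract_data are assumptions for illustration; the API Reference is the authority on the actual fields.

```python
import json

# Hypothetical response text, shaped after the layout assumed above.
sample_response = """
{
  "data": {"products": [{"name": "Widget", "price": "9.99"}]},
  "metadata": {"request_id": "example-request-id"}
}
"""


def extract_data(response_text):
    """Parse the response and return the extracted data, or {} if absent."""
    parsed = json.loads(response_text)
    # .get avoids a KeyError if the real response uses a different layout.
    return parsed.get("data", {})


data = extract_data(sample_response)
for product in data.get("products", []):
    print(product["name"], product["price"])
```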
Debugging with Screenshots
If you are not receiving the expected data, you can use screenshots to validate that the page is in the expected state by setting the is_screenshot_enabled parameter to true in the request body.
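A minimal sketch of adding this setting to a request body in Python. Nesting is_screenshot_enabled under the params field follows the request structure described earlier but is still an assumption to check against the API Reference; the URL, the query, and the helper name with_screenshot are hypothetical.

```python
import json


def with_screenshot(request_body):
    """Return a copy of the body with screenshots enabled under params."""
    params = dict(request_body.get("params", {}))  # keep existing settings
    params["is_screenshot_enabled"] = True
    return {**request_body, "params": params}


base_body = {
    "url": "https://example.com/products",     # hypothetical page
    "query": "{ products[] { name price } }",  # hypothetical query
}
debug_body = with_screenshot(base_body)
print(json.dumps(debug_body, indent=2))
```

Because the helper returns a copy, the original body can still be reused for regular requests without screenshots.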
With screenshots enabled, the API returns a Base64-encoded string in the screenshot field of the response. Decoding it lets you see the page content that was scraped.
You can convert the Base64 string returned in the screenshot field to an image and view it using free online tools like Base64.guru.
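The same conversion can be done locally. The sketch below decodes a Base64 string and writes the bytes to a file; the helper name save_screenshot, the output file name, and the .png extension are assumptions, and the sample string is a stand-in rather than real screenshot data.

```python
import base64
import tempfile
from pathlib import Path


def save_screenshot(encoded: str, path: Path) -> Path:
    """Decode a Base64 string and write the resulting bytes to path."""
    path.write_bytes(base64.b64decode(encoded))
    return path


# Stand-in value; in practice, pass the string from the screenshot field.
sample_encoded = base64.b64encode(b"not a real image").decode("ascii")

output = save_screenshot(
    sample_encoded,
    Path(tempfile.gettempdir()) / "agentql_screenshot.png",  # assumed name
)
print(f"Wrote {output.stat().st_size} bytes to {output}")
```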
Here's the screenshot returned in the above response: