Guides to using AgentQL
Overview
This section primarily includes guides for web scraping and automation. It also covers how to avoid bot detection, improve speed, and increase accuracy when using AgentQL. Lastly, for those unfamiliar with Playwright, there are sections on using Playwright's browser with AgentQL, logging into sites, and navigating pagination.
Guides
Web Scraping and Data Extraction
- Extracting data from PDFs and image files with AgentQL's Playground
- Scheduling scraping jobs
- Scraping Data with AgentQL's REST API
- Scraping data with `query_data`
Automation with `query_elements` and `get_by_prompt`
- How to solve Playwright timeout errors when interacting with elements
- How to close a modal or cookie dialog
- Fetch a collection of elements with `query_elements`
- Fetch a single element with `get_by_prompt`
- Submitting a form
Avoiding bot detection
Improving speed
Accuracy
- Passing Context to Queries
- Single out elements by describing their surroundings
- Get the Highest Resolution Image
- When and how to use Standard Mode
Using the browser with AgentQL
Logging into sites
Navigating pagination
- Collect data across numerically-paginated webpages
- Handling Infinite Scroll
- Collect data by stepping through paginated web pages