How to Deploy an AgentQL Script

After you've created a working AgentQL script, you may want to deploy it to run on a regular basis. This guide walks you through the process of deploying an AgentQL script to cloud services like Amazon Web Services.

Note

If your script only retrieves data from a webpage without any automation logic, you can use the AgentQL Scheduler or REST API to run your job on a regular basis or on demand.

Prepare the AgentQL script

First, prepare your AgentQL script by taking the following two steps.

  1. Set headless to true when launching a Playwright browser instance in your script.
  2. Read in the AgentQL API key as an environment variable and use agentql.configure() to set up the key.
Caution

If your script contains the AgentQL API key in plain text, remove it to ensure security.

const { configure } = require('agentql');

configure({ apiKey: process.env.AGENTQL_API_KEY });
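
The two preparation steps can be combined in one small helper. The following is a minimal sketch rather than an official pattern: loadSettings is a hypothetical name, and the commented usage assumes the configure export shown above together with Playwright's chromium.launch.

```javascript
// Hypothetical helper: gathers the settings a deployed script needs.
function loadSettings(env = process.env) {
  const apiKey = env.AGENTQL_API_KEY;
  if (!apiKey) {
    // Fail fast so a misconfigured deployment shows up clearly in the logs.
    throw new Error('AGENTQL_API_KEY is not set');
  }
  return {
    apiKey,
    // Cloud containers have no display, so the browser must run headless.
    launchOptions: { headless: true },
  };
}

// Usage inside your script (requires the agentql and playwright packages):
//   const { configure } = require('agentql');
//   const { chromium } = require('playwright');
//   const settings = loadSettings();
//   configure({ apiKey: settings.apiKey });
//   const browser = await chromium.launch(settings.launchOptions);
```

Throwing when the key is missing makes the container exit immediately instead of failing later on the first AgentQL call.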

Create a Dockerfile

Next, create a Dockerfile that you can use to build a Docker image and deploy it to a cloud service.

Below is a basic Dockerfile you can customize to your needs.

Dockerfile
# Use Node.js base image
FROM node:20-slim

# Set work directory
WORKDIR /app

# Install project dependencies
RUN npm install playwright agentql

# Install system dependencies required by Chromium
RUN npx playwright install-deps chromium

# Install the Chromium browser for Playwright
RUN npx playwright install chromium

# Copy your main AgentQL script into the container
COPY main.js .

# Run your script
CMD ["node", "main.js"]
Note

This Dockerfile assumes that your script is named main.js. If your script has a different name, adjust the COPY and CMD instructions accordingly.

The Dockerfile must include npx playwright install-deps chromium. This command installs the system libraries that the Playwright Chromium browser depends on.

Deploy to cloud service

Once you've created a Dockerfile, you can deploy it to a cloud service. Typically, you'll need to:

  1. Build the Docker image using the Dockerfile.
  2. Push the Docker image to a container registry like Docker Hub or AWS Elastic Container Registry.
  3. Deploy the Docker container to run on a cloud service such as an AWS EC2 instance.
Note

Be aware of some pitfalls when deploying to cloud services. Below are common issues you may encounter when deploying AgentQL scripts and Playwright browser instances:

  • Make sure the CPU architecture of your Docker image aligns with the architecture of your cloud service. For example, if you're deploying to an EC2 instance that uses the arm64 architecture, you'll need to build the Docker image using the arm64 architecture.
  • Set your AGENTQL_API_KEY as an environment variable in the cloud service.
  • Playwright browser instances can require significant resources. Monitor memory and CPU usage, and adjust your instance size as necessary.
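
To make the architecture and resource points above easier to check, a script can log a short snapshot at startup. This is a rough sketch using only Node.js built-ins; resourceSnapshot is a hypothetical name. Note that inside a container the memory figures may reflect the host rather than the container's limits, and the Node.js figure does not include the Chromium processes that Playwright starts.

```javascript
const os = require('node:os');

// Hypothetical helper: collects basic facts about the machine the script runs on.
function resourceSnapshot() {
  const toMb = (bytes) => Math.round(bytes / (1024 * 1024));
  return {
    arch: os.arch(),                 // e.g. 'x64' or 'arm64'
    cpuCount: os.cpus().length,
    loadAvg1m: os.loadavg()[0],      // 1-minute load average (0 on Windows)
    totalMemMb: toMb(os.totalmem()),
    freeMemMb: toMb(os.freemem()),
    nodeRssMb: toMb(process.memoryUsage().rss), // Node.js process only
  };
}

// Usage: log once at startup and again after the browser closes.
//   console.log(JSON.stringify(resourceSnapshot()));
```

Comparing the snapshot before and after a run gives a first indication of whether the instance size is adequate; your cloud provider's monitoring remains the more reliable source.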

Schedule service to run on a regular basis

If you want to schedule your service to run on a regular basis or at a specific time, there are several ways to achieve this:

  1. Use a scheduler: Many cloud providers offer built-in schedulers (such as Amazon EventBridge or Google Cloud Scheduler) to run tasks at specified intervals.
  2. Set up a cron job: You can set up a cron job directly on your cloud instance to run your script at a specific time.
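
Whichever scheduler you choose, it helps if each run reports success or failure through its exit code, since schedulers and cron typically rely on that to detect failed runs. Below is a minimal sketch under that assumption; runOnce and main are hypothetical names, and the cron line in the comment uses a made-up image name and env-file path.

```javascript
// Runs `job` once and reports the outcome as an exit code (0 = success, 1 = failure).
async function runOnce(job, log = console) {
  const startedAt = Date.now();
  try {
    await job();
    log.info(`job succeeded in ${Date.now() - startedAt} ms`);
    return 0;
  } catch (err) {
    log.error(`job failed: ${err.message}`);
    return 1;
  }
}

// Usage at the bottom of main.js, where `main` is your AgentQL logic:
//   runOnce(main).then((code) => { process.exitCode = code; });
//
// Example crontab entry (hypothetical image name and env file), daily at 06:00:
//   0 6 * * * docker run --rm --env-file /etc/agentql.env agentql-job
```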

Set up a REST API endpoint

Alternatively, set up a REST API endpoint to run your script. This approach is useful if you want to run your script on demand or integrate it with another service.

  1. Create a small web app that listens for incoming HTTP requests. You can use frameworks like FastAPI (Python) or Express (Node.js).
  2. Handle endpoint requests by calling your AgentQL script within the route handler.
  3. Modify your Dockerfile accordingly to meet the requirements for web apps, such as exposing the port and installing the necessary dependencies.
  4. Trigger the endpoint to run the script from any HTTP client or another service. This can be helpful when you need immediate, ad-hoc runs instead of waiting for scheduled tasks.
Note

Make sure to secure your endpoint! Consider adding authentication or only allowing private network access if you only use the endpoint internally.
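
The steps above, including a basic shared-token check, might look like the following sketch. It uses Node.js's built-in http module instead of Express to stay dependency-free. The /run path, the RUN_TOKEN environment variable, port 8080, and the runScript function are all illustrative assumptions, not part of AgentQL.

```javascript
const http = require('node:http');
const crypto = require('node:crypto');

// Checks an "Authorization: Bearer <token>" header against the expected token.
// Hashing both sides gives timingSafeEqual the equal-length inputs it requires.
function isAuthorized(headers, expectedToken) {
  const header = headers.authorization || '';
  const presented = header.startsWith('Bearer ') ? header.slice(7) : '';
  if (!expectedToken || !presented) return false;
  const digest = (value) => crypto.createHash('sha256').update(value).digest();
  return crypto.timingSafeEqual(digest(presented), digest(expectedToken));
}

// `runScript` is a placeholder for an async function that wraps your AgentQL script.
function createServer(runScript, expectedToken) {
  return http.createServer(async (req, res) => {
    const send = (status, body) => {
      res.writeHead(status, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(body));
    };
    if (req.method !== 'POST' || req.url !== '/run') {
      return send(404, { error: 'not found' });
    }
    if (!isAuthorized(req.headers, expectedToken)) {
      return send(401, { error: 'unauthorized' });
    }
    try {
      send(200, { result: await runScript() });
    } catch (err) {
      send(500, { error: err.message });
    }
  });
}

// Usage:
//   createServer(runScript, process.env.RUN_TOKEN).listen(8080);
```

With this setup, the Dockerfile would also need to expose the chosen port (EXPOSE 8080 in this sketch) and start the server file instead of the one-shot script.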

Retrieve results

After your script finishes running, you’ll likely want to inspect or process the results. The approach you choose depends on your data’s nature and how you plan to use it. Below are a few common ways to retrieve and manage your script’s output:

Log output

Print your results to the console to capture them in your container logs (such as AWS CloudWatch). This method suits quick checks and small amounts of data that don't need long-term storage.
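
Logs are easier to search when each result is written as a single JSON line. A minimal sketch, in which formatLogLine and the agentql_result event name are hypothetical choices:

```javascript
// Serializes one result as a single-line JSON log entry.
function formatLogLine(result, now = new Date()) {
  return JSON.stringify({
    timestamp: now.toISOString(),
    event: 'agentql_result',
    result,
  });
}

// Usage, where `data` is whatever your script extracted:
//   console.log(formatLogLine(data));
```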

File storage

Write output to the cloud service's file storage (such as Amazon S3). This method is useful when you generate reports or CSV files and need to save them long term.
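
Before uploading, the results need to be serialized. The sketch below converts an array of records to CSV text using only built-ins; toCsv and escapeCsv are hypothetical helpers, and the upload itself is left to your storage provider's SDK.

```javascript
// Quotes a value when it contains a comma, quote, or line break.
function escapeCsv(value) {
  const text = value === null || value === undefined ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Builds CSV text with a header row from the listed columns.
function toCsv(rows, columns) {
  const header = columns.map(escapeCsv).join(',');
  const lines = rows.map((row) => columns.map((c) => escapeCsv(row[c])).join(','));
  return [header, ...lines].join('\n') + '\n';
}

// Usage: write locally, then upload the file with your provider's SDK.
//   const fs = require('node:fs');
//   fs.writeFileSync('/tmp/results.csv', toCsv(products, ['name', 'price']));
```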

Databases

For structured data, you can insert records into a database (such as PostgreSQL or MongoDB). This enables efficient integration with other services. With this approach, you may need to update your script to use a database client and provide the necessary database credentials as environment variables in your cloud service.
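
As an illustration for PostgreSQL, the sketch below builds a parameterized INSERT statement from a record. buildInsert is a hypothetical helper; the commented usage assumes the pg client package and a DATABASE_URL environment variable, neither of which comes from AgentQL.

```javascript
// Builds a parameterized INSERT so values are never concatenated into the SQL text.
function buildInsert(table, record) {
  const columns = Object.keys(record);
  const isSafe = (name) => /^[A-Za-z_][A-Za-z0-9_]*$/.test(name);
  // Identifiers cannot be parameterized, so only plain names are accepted.
  if (columns.length === 0 || !isSafe(table) || !columns.every(isSafe)) {
    throw new Error('Invalid table or column name');
  }
  const placeholders = columns.map((_, i) => `$${i + 1}`).join(', ');
  return {
    text: `INSERT INTO ${table} (${columns.join(', ')}) VALUES (${placeholders})`,
    values: columns.map((c) => record[c]),
  };
}

// Usage with the pg package (assumed to be installed):
//   const { Client } = require('pg');
//   const client = new Client({ connectionString: process.env.DATABASE_URL });
//   await client.connect();
//   const { text, values } = buildInsert('products', { name: 'Lamp', price: 19.5 });
//   await client.query(text, values);
//   await client.end();
```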

Notifications

Send an email, Slack message, or other real-time notification to share run results with your team or set up alerts. To do so, you may need to use a service like Amazon SES or a Slack incoming webhook in your script.
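
For Slack, an incoming webhook accepts a JSON body with a text field sent by HTTP POST. The sketch below assumes Node.js 18 or later for the global fetch; buildSlackMessage, notify, the summary shape, and the SLACK_WEBHOOK_URL variable are hypothetical.

```javascript
// Turns a run summary into the JSON body a Slack incoming webhook expects.
function buildSlackMessage(summary) {
  const status = summary.ok ? 'succeeded' : 'failed';
  return { text: `AgentQL run ${status}: ${summary.itemCount} items collected` };
}

// Posts the message; `fetchImpl` can be swapped out for testing.
async function notify(webhookUrl, summary, fetchImpl = fetch) {
  const response = await fetchImpl(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildSlackMessage(summary)),
  });
  if (!response.ok) {
    throw new Error(`Notification failed with HTTP ${response.status}`);
  }
}

// Usage at the end of a run:
//   await notify(process.env.SLACK_WEBHOOK_URL, { ok: true, itemCount: 42 });
```

Keep the webhook URL in an environment variable, as with the API key, since anyone who has it can post to the channel.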