How to Deploy an AgentQL Script
After you've created a working AgentQL script, you may want to deploy it to run on a regular basis. This guide walks you through the process of deploying an AgentQL script to cloud services like Amazon Web Services.
Prepare the AgentQL script
First, you need to prepare your AgentQL script by taking the following two steps.
- Set `headless` to `True` when launching a Playwright browser instance in your script.
- Read the AgentQL API key from an environment variable and use `agentql.configure()` to set up the key.
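For example, a minimal sketch of a Python script with both changes applied might look like this, assuming the synchronous Playwright API; the target URL is a placeholder and the actual queries are omitted:

```python
import os

import agentql
from playwright.sync_api import sync_playwright

# Read the API key from an environment variable and configure AgentQL with it.
agentql.configure(api_key=os.environ["AGENTQL_API_KEY"])

with sync_playwright() as playwright:
    # Headless mode lets the browser run in a container without a display.
    browser = playwright.chromium.launch(headless=True)
    page = agentql.wrap(browser.new_page())
    page.goto("https://example.com")  # replace with the page your script targets

    # ... your AgentQL queries and logic go here ...

    browser.close()
```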
Create a Dockerfile
Next, create a Dockerfile that you can use to build a Docker image and deploy it to a cloud service.
Below are basic Dockerfiles for Python and JavaScript scripts that you can customize to your needs.
Python:

```dockerfile
FROM python:3.11-slim-bookworm

# Set up the working directory
ENV APP_HOME /main_app
WORKDIR $APP_HOME

# Copy the project files
COPY main.py $APP_HOME/

# Install project dependencies
RUN pip install agentql
RUN pip install playwright && \
    playwright install chromium && \
    playwright install-deps chromium

# Environment variables
ENV PYTHONDONTWRITEBYTECODE=1

# Run the script
CMD ["python", "main.py"]
```
JavaScript:

```dockerfile
# Use Node.js base image
FROM node:20-slim

# Set work directory
WORKDIR /app

# Install project dependencies
RUN npm install playwright agentql

# Install system dependencies required by Chromium
RUN npx playwright install-deps chromium

# Install the Chromium browser for Playwright
RUN npx playwright install chromium

# Copy your main AgentQL script into the container
COPY main.js .

# Run your script
CMD ["node", "main.js"]
```
This Dockerfile assumes that the filename is `main.py` or `main.js`. If your script has a different name, you'll need to adjust the Dockerfile accordingly.

The Dockerfile must include `playwright install-deps chromium`, which installs the system dependencies the Playwright browser requires.
Deploy to cloud service
Once you've created a Dockerfile, you can use it to deploy your script to a cloud service. Typically, you'll need to:
- Build the Docker image using the Dockerfile.
- Push the Docker image to a container registry like Docker Hub or AWS Elastic Container Registry.
- Deploy the Docker container to run on a cloud service such as an AWS EC2 instance.
Be aware of some pitfalls when deploying to cloud services. Below are common issues you may encounter when deploying AgentQL scripts and Playwright browser instances:
- Make sure the CPU architecture of your Docker image aligns with the architecture of your cloud service. For example, if you're deploying to an EC2 instance that uses the `arm64` architecture, you'll need to build the Docker image for the `arm64` architecture.
- Set your `AGENTQL_API_KEY` as an environment variable in the cloud service.
- Playwright browser instances can require significant resources. Monitor memory and CPU usage, and adjust your instance size as necessary.
Schedule service to run on a regular basis
If you want to schedule your service to run on a regular basis or at a specific time, there are several ways to achieve this:
- Use a scheduler: Many cloud providers offer built-in schedulers (such as AWS EventBridge or GCP Scheduler) to run tasks at specified intervals.
- Set up a cron job: You can directly set up a cron job in your cloud instances to run your script at a specific time.
Set up a REST API endpoint
Alternatively, set up a REST API endpoint to run your script. This approach is useful if you want to run your script on demand or integrate it with another service.
- Create a small web app that listens for incoming HTTP requests. You can use frameworks like FastAPI (Python) or Express (Node.js).
- Handle endpoint requests by calling your AgentQL script within the route handler.
- Modify your Dockerfile accordingly to meet the requirements for web apps, such as exposing the port and installing the necessary dependencies.
- Trigger the endpoint to run the script from any HTTP client or another service. This can be helpful when you need immediate, ad-hoc runs instead of waiting for scheduled tasks.
Make sure to secure your endpoint! Consider adding authentication, or restrict access to your private network if the endpoint is only used internally.
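As an illustration of the FastAPI approach, a minimal wrapper might look like the sketch below; `run_agentql_script()` is a hypothetical function that contains your existing AgentQL logic and returns its results:

```python
from fastapi import FastAPI

app = FastAPI()

def run_agentql_script() -> dict:
    """Placeholder for your existing AgentQL logic (launch the browser, run queries, return data)."""
    return {}

@app.post("/run")
def run_script():
    # Run the AgentQL script when the endpoint is called and return its results.
    results = run_agentql_script()
    return {"status": "ok", "results": results}
```

You could then start the server with `uvicorn main:app --host 0.0.0.0 --port 8000` and trigger a run by sending a POST request to `/run`.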
Retrieve results
After your script finishes running, you’ll likely want to inspect or process the results. The approach you choose depends on your data’s nature and how you plan to use it. Below are a few common ways to retrieve and manage your script’s output:
Log output
Print your results to the console to capture them in your container logs (such as AWS CloudWatch). This method is useful for small outputs and quick inspection of results.
File storage
Write output to the cloud service's File Storage (such as AWS S3). This method is useful when you generate reports or CSV files and need to save them long term.
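For example, with AWS S3 you might upload the results as a JSON object using boto3; the bucket name and object key below are placeholders:

```python
import json

import boto3  # AWS SDK for Python

def save_results_to_s3(results: dict) -> None:
    # Upload the results as a JSON object to S3.
    s3 = boto3.client("s3")
    s3.put_object(
        Bucket="my-agentql-results",       # hypothetical bucket name
        Key="runs/latest-results.json",    # hypothetical object key
        Body=json.dumps(results),
        ContentType="application/json",
    )
```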
Databases
For structured data, you can insert records into a database (such as PostgreSQL or MongoDB). This enables efficient integration with other services. With this approach, you may need to update your script to use a database client, and include necessary database credentials as environment variables in your cloud service.
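As a sketch, inserting a record into PostgreSQL with psycopg2 might look like the following; the `DATABASE_URL` environment variable and the `scrape_results` table are assumptions you would adapt to your own setup:

```python
import os

import psycopg2  # PostgreSQL client

def save_result(product_name: str, price: float) -> None:
    # The connection string is read from an environment variable set in the cloud service.
    conn = psycopg2.connect(os.environ["DATABASE_URL"])
    try:
        with conn, conn.cursor() as cur:
            # Hypothetical table; adjust the schema to match your data.
            cur.execute(
                "INSERT INTO scrape_results (product_name, price) VALUES (%s, %s)",
                (product_name, price),
            )
    finally:
        conn.close()
```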
Notifications
Send an email, Slack message, or other real-time notifications to share run results with your team or set up alerts. To do so, you may need to use a service like AWS SES or a Slack webhook in your script.
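For instance, posting a short summary to a Slack incoming webhook might look like this sketch; the `SLACK_WEBHOOK_URL` environment variable is an assumption:

```python
import os

import requests

def notify_slack(summary: str) -> None:
    # Post a simple text message to a Slack incoming webhook.
    webhook_url = os.environ["SLACK_WEBHOOK_URL"]
    response = requests.post(webhook_url, json={"text": summary}, timeout=10)
    response.raise_for_status()
```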