Handling Infinite Scroll
Modern websites often have content that's dynamically loaded as you scroll down the page.
Overview
This guide shows you how to load this type of content on most pages and covers some of the challenges involved.
Get infinite scroll page in ready state
First, start with a script that loads the page and waits for it to be in a ready state.
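A minimal sketch of that first step follows. It assumes `page` is a Playwright page wrapped with AgentQL (for example via `agentql.wrap(browser.new_page())`), so that `wait_for_page_ready_state()` is available. The helper name `open_page_ready` and the URL in the comment are illustrative placeholders, not part of the original example.

```python
def open_page_ready(page, url):
    """Navigate to `url` and block until the page reports a ready state.

    `page` is assumed to be an AgentQL-wrapped Playwright page.
    """
    page.goto(url)
    # Wait for the initial content to settle before querying it.
    page.wait_for_page_ready_state()
    return page


# Typical wiring (requires the agentql and playwright packages):
#   import agentql
#   from playwright.sync_api import sync_playwright
#   with sync_playwright() as p, p.chromium.launch(headless=False) as browser:
#       page = agentql.wrap(browser.new_page())
#       open_page_ready(page, "https://example.com/feed")  # placeholder URL
```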
If you run a data query against this page directly:
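As a rough illustration, a data query might look like the sketch below. The field names `page_title` and `post_headers` are assumptions chosen for a page that lists posts, and `fetch_post_headers` is a hypothetical helper. Adjust the query to the data your page actually exposes.

```python
# Hypothetical AgentQL query: one scalar field and one list field.
QUERY = """
{
    page_title
    post_headers[]
}
"""


def fetch_post_headers(page, query=QUERY):
    """Run the query against an AgentQL-wrapped page and return the list field."""
    response = page.query_data(query)
    # query_data returns a dictionary keyed by the query's field names.
    return response.get("post_headers", [])
```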
The query then returns the following:
This indicates your browser has loaded only the first page of this site. To load more content, use the Playwright SDK's ability to send input events to the browser page.
Trigger content load on the page
There are a few options for scrolling down the page, but the simplest one is to:
- Use a key press input for End, which takes you to the bottom of the page that's currently loaded.
- Give the content time to load by leveraging wait_for_page_ready_state().
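The two steps above can be sketched as a small helper. It assumes `page` is an AgentQL-wrapped Playwright page; the name `load_next_batch` is illustrative.

```python
def load_next_batch(page):
    """Jump to the bottom of the loaded content, then wait for new content."""
    # Pressing End scrolls to the bottom of what is currently loaded,
    # which is what triggers the next batch on many infinite scroll pages.
    page.keyboard.press("End")
    # Give the newly requested content time to load.
    page.wait_for_page_ready_state()
```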
If you run the same query, you'll receive a different response.
You've successfully loaded one additional "page" of content on this site, but what if you need to load additional "pages" of content?
Load multiple pages of content with looping
To load multiple pages of content, you can run the scroll-and-wait logic inside a loop.
The following example shows how you can load three additional pages of content:
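A minimal sketch of such a loop, again assuming an AgentQL-wrapped Playwright page. The helper name `load_more_pages` and its `num_extra_pages` parameter are illustrative.

```python
def load_more_pages(page, num_extra_pages=3):
    """Repeat the End key press and ready-state wait `num_extra_pages` times."""
    for _ in range(num_extra_pages):
        # Scroll to the bottom of the currently loaded content.
        page.keyboard.press("End")
        # Wait for the next batch before scrolling again.
        page.wait_for_page_ready_state()
```

Running your data query after `load_more_pages(page)` returns covers the initial content plus up to three additional batches, provided the site loads one batch per scroll.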
If you look at the response, you'll see it's much more comprehensive than before.
Putting it all together
If you want to take a look at the final version of this example, it's available in AgentQL's GitHub examples repo.
Conclusion
Pagination on the web can be tricky because websites implement it in different ways. While the End key press works on many sites, other sites may require a combination of Playwright mouse move and mouse wheel events to emulate hovering over different scrolling containers and scrolling them.
Here is a basic example of using the mouse wheel to scroll down the page:
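One way this could look is sketched below, using Playwright's `page.mouse.move`, `page.mouse.wheel`, and `page.evaluate`. The helper name, the optional `hover_point` parameter, and the pause length are assumptions for illustration. The sketch scrolls the main document one viewport at a time; a nested scrolling container would need its own height measurements.

```python
import time


def mouse_wheel_scroll(page, hover_point=None, pause_seconds=0.1):
    """Scroll the page one viewport at a time with mouse wheel events."""
    if hover_point is not None:
        # Hover over the target container so wheel events are routed to it.
        page.mouse.move(hover_point[0], hover_point[1])

    # Read the viewport height, the document height, and the current offset.
    viewport_height, total_height, scroll_y = page.evaluate(
        "() => [window.innerHeight, document.body.scrollHeight, window.scrollY]"
    )
    while scroll_y < total_height:
        # Positive delta_y scrolls down.
        page.mouse.wheel(0, viewport_height)
        scroll_y += viewport_height
        # A short pause keeps the scrolling from outrunning the page.
        time.sleep(pause_seconds)
```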
In addition, it's tricky to detect if all of the content has loaded, or if it's even possible to load "all" of the content. On some pages, you can look for loading indicators and placeholders such as "Scroll to load more" to detect whether more content is available.
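When a page has no visible indicator to look for, a different heuristic is to stop once the document height stops growing. The sketch below is not from the original example: it assumes an AgentQL-wrapped Playwright page, the helper name `scroll_until_stable` is hypothetical, and the `max_rounds` cap exists because some feeds never end.

```python
def scroll_until_stable(page, max_rounds=10):
    """Scroll until the document height stops changing or the cap is hit.

    Returns the number of rounds that actually loaded new content.
    """
    height_script = "() => document.body.scrollHeight"
    last_height = page.evaluate(height_script)
    for completed in range(max_rounds):
        page.keyboard.press("End")
        page.wait_for_page_ready_state()
        new_height = page.evaluate(height_script)
        if new_height == last_height:
            # Nothing new was appended, so assume the end was reached.
            return completed
        last_height = new_height
    # The cap was reached while content was still being added.
    return max_rounds
```

An unchanged height can also mean the content was slow to arrive, so treat the result as a hint rather than proof that everything has loaded.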
Be mindful when working with infinite scroll pages so that you craft the right level of automation for the desired outcome.