Learn to scrape JavaScript-rendered websites using Playwright. We'll click buttons, fill out forms, loop through dropdowns and paginated results, and download files — all from the comfort of a Jupyter notebook.
When requests and BeautifulSoup can't see content because it's rendered by JavaScript, it's time to bring in a real browser. We'll use Playwright to scrape a JavaScript-heavy site, click 'Show More' buttons, and pull data into pandas.
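Because Jupyter already runs its own asyncio event loop, Playwright's sync API raises an error inside a notebook; the async API with top-level `await` works instead. A minimal sketch of fetching fully rendered HTML from a cell (the URL is a placeholder):

```python
# Minimal sketch: get JavaScript-rendered HTML from inside a Jupyter notebook.
# Playwright's sync API refuses to run inside an existing asyncio loop (which
# Jupyter provides), so we use the async API and `await` it directly in a cell.

async def fetch_rendered_html(url: str) -> str:
    # Import inside the function so the sketch reads even without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()            # headless by default
        page = await browser.new_page()
        await page.goto(url)
        await page.wait_for_load_state("networkidle")  # let the JS finish rendering
        html = await page.content()
        await browser.close()
        return html

# In a notebook cell you would run:
#   html = await fetch_rendered_html("https://example.com")
```

From there, the rendered HTML can be handed to BeautifulSoup or `pd.read_html` just like a plain `requests` response.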
A gentle warm-up: select an option from a dropdown, click search, and grab the results table. Just enough interaction to get comfortable with Playwright's selectors.
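The warm-up pattern looks something like this sketch. The selectors (`#township`, `#search-button`, `#results-table`) are placeholders for whatever the real page uses:

```python
# Sketch of the warm-up: pick a dropdown option, click search, grab the table.
# All three selectors below are assumptions -- adjust them to the actual page.

def table_to_records(headers, rows):
    """Zip header names with each row of cell texts into dicts (pure helper)."""
    return [dict(zip(headers, row)) for row in rows]

def search_and_scrape(page, township="Example Township"):
    """`page` is a Playwright Page (sync API) already on the search form."""
    page.select_option("#township", label=township)  # pick from the dropdown
    page.click("#search-button")                     # submit the search
    page.wait_for_selector("#results-table")         # wait for results to render

    table = page.locator("#results-table")
    headers = table.locator("th").all_inner_texts()
    rows = [
        tr.locator("td").all_inner_texts()
        for tr in table.locator("tbody tr").all()
    ]
    return table_to_records(headers, rows)
```

The list of dicts drops straight into `pd.DataFrame(...)` once you have it.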
What happens when results span multiple pages? We'll click 'Next Page' in a loop, collecting every table along the way and combining them with pandas.
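A sketch of that loop, assuming the results live in the first `<table>` on the page and that the "Next Page" link disappears on the last page; both the selector and that behavior are assumptions about the site:

```python
# Sketch of the pagination loop: scrape, click Next, repeat, then concat.
from io import StringIO

import pandas as pd

def combine_tables(tables):
    """Stack a list of DataFrames into one, renumbering the index (pure helper)."""
    return pd.concat(tables, ignore_index=True)

def scrape_all_pages(page, next_selector="a:has-text('Next Page')"):
    """`page` is a Playwright Page (sync API) already showing the first results."""
    tables = []
    while True:
        # read_html parses every <table> in the HTML; assume the first is ours
        tables.append(pd.read_html(StringIO(page.content()))[0])
        next_link = page.locator(next_selector)
        if next_link.count() == 0:
            break                     # no Next link left: last page reached
        next_link.click()
        page.wait_for_load_state("networkidle")
    return combine_tables(tables)
```

`ignore_index=True` matters here: without it the combined frame keeps each page's 0, 1, 2, ... index and you get duplicate row labels.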
Instead of paginating, this time we loop through every option in a dropdown. Each township gets its own search, and we stack all the results together.
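A sketch of the dropdown loop. The `#township` select, the button and table selectors, and the "Select a township" placeholder text are all assumptions about the real form:

```python
# Sketch: one search per dropdown option, with each result tagged by township.
from io import StringIO

import pandas as pd

def clean_options(options, placeholder="Select a township"):
    """Drop blank entries and the 'choose one' placeholder (pure helper)."""
    return [o for o in options if o.strip() and o != placeholder]

def scrape_every_township(page):
    """`page` is a Playwright Page (sync API) already on the search form."""
    labels = clean_options(page.locator("#township option").all_inner_texts())
    frames = []
    for township in labels:
        page.select_option("#township", label=township)
        page.click("#search-button")
        page.wait_for_selector("#results-table")
        df = pd.read_html(StringIO(page.content()))[0]
        df["township"] = township   # tag rows with the search they came from
        frames.append(df)
    return pd.concat(frames, ignore_index=True)
```

Adding the `township` column before stacking is the key move: once the frames are concatenated you can no longer tell which search produced which row.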
Now we're typing into text fields instead of picking from dropdowns. We'll loop through a list of zip codes, fill in the search form each time, and collect the results.
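A sketch of the text-field version. The `#zip` and `#search-button` selectors are placeholders, and the zero-padding helper guards against a common gotcha: zip codes read from a CSV as integers lose their leading zeros.

```python
# Sketch: fill the search form once per zip code and stack all the results.
from io import StringIO

import pandas as pd

def normalize_zip(z):
    """Zero-pad zip codes that lost leading zeros (pure helper)."""
    return str(z).zfill(5)

def scrape_zip_codes(page, zips):
    """`page` is a Playwright Page (sync API) already on the search form."""
    frames = []
    for z in map(normalize_zip, zips):
        page.fill("#zip", z)        # fill() clears the field before typing
        page.click("#search-button")
        page.wait_for_selector("#results-table")
        df = pd.read_html(StringIO(page.content()))[0]
        df["zip"] = z
        frames.append(df)
    return pd.concat(frames, ignore_index=True)
```

`page.fill` clears the input before typing, so there's no need to manually empty the field between loop iterations.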
The final boss: navigate to a page, loop through a set of letters, click through every page of results for each letter, and download the PDFs along the way.
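A sketch of that three-level loop, assuming an A to Z letter index, a "Next Page" link, and per-row PDF links; every selector (and the A to Z assumption itself) is a placeholder for the real site:

```python
# Sketch of the final boss: letters -> pages -> PDF downloads.
import string

def pdf_filename(letter, page_num, index):
    """Build a predictable local filename for each download (pure helper)."""
    return f"reports_{letter}_p{page_num}_{index}.pdf"

def download_all_pdfs(page, dest="pdfs"):
    """`page` is a Playwright Page (sync API) on the letter-index page."""
    for letter in string.ascii_uppercase:       # assumes an A-Z index
        page.click(f"a:has-text('{letter}')")
        page_num = 1
        while True:
            links = page.locator("a[href$='.pdf']")
            for i in range(links.count()):
                # expect_download waits for the download the click triggers
                with page.expect_download() as dl_info:
                    links.nth(i).click()
                dl_info.value.save_as(
                    f"{dest}/{pdf_filename(letter, page_num, i)}"
                )
            next_link = page.locator("a:has-text('Next Page')")
            if next_link.count() == 0:
                break                           # last page for this letter
            next_link.click()
            page_num += 1
```

`expect_download()` is Playwright's mechanism for clicks that trigger a file download instead of a navigation; `save_as` writes the file (creating parent directories) so nothing ends up in a temp folder.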
Point an AI agent at a website and let it write a Playwright scraper for you. You describe what you want in plain English; the agent explores the page and produces a working script. Requires a free Google AI API key.