NICAR 2026

Browser automation with Playwright

Jonathan Soma, Columbia University

Learn to scrape JavaScript-rendered websites using Playwright. We'll click buttons, fill out forms, loop through dropdowns and paginated results, and download files — all from the comfort of a Jupyter notebook.


01

Introduction to Playwright with the NICAR schedule (uggghhhh)

When requests and BeautifulSoup can't see content because it's rendered by JavaScript, it's time to bring in a real browser. We'll use Playwright to scrape a JavaScript-heavy site, click 'Show More' buttons, and pull data into pandas.
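
As a rough sketch of the "keep clicking Show More" step, a helper along these lines might work. It assumes `page` is a Playwright async `Page` (the flavor that works inside a notebook). The function name, the `selector` argument, and the `max_clicks` and `pause_ms` defaults are illustrative and not taken from the workshop materials.

```python
async def click_until_gone(page, selector, max_clicks=50, pause_ms=500):
    """Click the element matching `selector` until it disappears.

    Returns the number of clicks performed. `max_clicks` is a safety
    cap (an assumption) so a button that never goes away cannot loop forever.
    """
    button = page.locator(selector)
    clicks = 0
    while clicks < max_clicks:
        # Stop once the button is missing from the page or hidden.
        if await button.count() == 0 or not await button.first.is_visible():
            break
        await button.first.click()
        # Short fixed pause so newly loaded content has time to render.
        await page.wait_for_timeout(pause_ms)
        clicks += 1
    return clicks
```

In a notebook cell this could be called as `await click_until_gone(page, "text=Show More")`, after which the fully expanded page is ready to be parsed into pandas.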

02

Introduction to Playwright with OpenSyllabus

When requests and BeautifulSoup can't see content because it's rendered by JavaScript, it's time to bring in a real browser. We'll use Playwright to scrape a JavaScript-heavy site, click 'Show More' buttons, and pull data into pandas.

03

Scraping Texas tow truck licenses

A gentle warm-up: select an option from a dropdown, click search, and grab the results table. Just enough interaction to get comfortable with Playwright's selectors.
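
The select, click, grab sequence described above could look roughly like this. It is a minimal sketch, assuming `page` is a Playwright async `Page`. Every selector passed in is a placeholder for whatever the real site uses, and the function name is hypothetical.

```python
async def search_by_option(page, select_selector, option_label,
                           submit_selector, row_selector="table tr"):
    """Pick a dropdown option by its visible label, submit, and read the table.

    Returns a list of rows, each a list of cell texts.
    """
    # Choose the option whose visible text matches `option_label`.
    await page.select_option(select_selector, label=option_label)
    await page.click(submit_selector)
    # Wait until at least one result row has been rendered.
    await page.wait_for_selector(row_selector)

    rows = []
    for row in await page.locator(row_selector).all():
        rows.append(await row.locator("th, td").all_inner_texts())
    return rows
```

If the first row turns out to be the header, something like `pandas.DataFrame(rows[1:], columns=rows[0])` would turn the result into a DataFrame.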

04

Paginating through Iowa appraisal companies

What happens when results span multiple pages? We'll click 'Next Page' in a loop, collecting every table along the way and combining them with pandas.
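
One way the 'Next Page' loop might be written is sketched below, assuming `page` is a Playwright async `Page` and that the next button is either removed or disabled on the last page. The selectors, the function name, and the `max_pages` cap are assumptions made for illustration.

```python
async def scrape_all_pages(page, next_selector, row_selector="table tr",
                           max_pages=100):
    """Collect table rows from every page of a paginated result set.

    Returns one flat list of rows, each a list of cell texts.
    """
    all_rows = []
    for _ in range(max_pages):
        # Grab the rows on the page currently showing.
        for row in await page.locator(row_selector).all():
            all_rows.append(await row.locator("th, td").all_inner_texts())

        # Stop when there is no usable 'next' button left.
        next_button = page.locator(next_selector)
        if await next_button.count() == 0 or not await next_button.first.is_enabled():
            break
        await next_button.first.click()
        # Let the next page of results finish loading before reading it.
        await page.wait_for_load_state("networkidle")
    return all_rows
```

With the default `row_selector`, a header row would be collected once per page, so a narrower selector such as `"table tbody tr"` may be the better fit on a real site.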

05

Looping through North Dakota oil well townships

Instead of paginating, this time we loop through every option in a dropdown. Each township gets its own search, and we stack all the results together.
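
A hedged sketch of the dropdown loop: read every option's `value` up front, then select each one in turn and run the search. It assumes `page` is a Playwright async `Page`. The selectors and the function name are placeholders, and skipping empty values is a guess at how a "choose one" placeholder option would appear.

```python
async def scrape_each_option(page, select_selector, submit_selector,
                             row_selector="table tr"):
    """Run one search per dropdown option and keep the rows for each.

    Returns a dict mapping each option value to its list of rows.
    """
    # Read all option values first, since the page may re-render after a search.
    options = await page.locator(f"{select_selector} option").all()
    values = [await option.get_attribute("value") for option in options]

    results = {}
    for value in values:
        if not value:
            continue  # assumed placeholder such as "-- select --"
        await page.select_option(select_selector, value=value)
        await page.click(submit_selector)
        await page.wait_for_load_state("networkidle")

        rows = []
        for row in await page.locator(row_selector).all():
            rows.append(await row.locator("th, td").all_inner_texts())
        results[value] = rows
    return results
```

Keeping the option value as the dictionary key makes it easy to add a township column before stacking the pieces with `pandas.concat`.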

06

Filling forms to find Maryland locksmiths

Now we're typing into text fields instead of picking from dropdowns. We'll loop through a list of zip codes, fill in the search form each time, and collect the results.
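
The fill, submit, collect loop might be sketched like this, assuming `page` is a Playwright async `Page` and that reloading the search form before each query is acceptable. The `url`, the selectors, and the function name are placeholders rather than anything from the real Maryland site.

```python
async def search_each_value(page, url, values, input_selector,
                            submit_selector, row_selector="table tr"):
    """Type each value into a search box, submit, and keep the result rows.

    Returns a dict mapping each searched value to its list of rows.
    """
    results = {}
    for value in values:
        # Start from a fresh copy of the form so old input never lingers.
        await page.goto(url)
        await page.fill(input_selector, value)
        await page.click(submit_selector)
        await page.wait_for_load_state("networkidle")

        rows = []
        for row in await page.locator(row_selector).all():
            rows.append(await row.locator("th, td").all_inner_texts())
        results[value] = rows
    return results
```

Zip codes are best passed as strings (for example `["21201", "21202"]`) so leading zeros survive and `fill` receives text.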

07

Downloading PDFs from the NC State Bar

The final boss: navigate to a page, loop through some letters, click through some pages, and download PDFs.
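
For the download step only (the letter and page loops would wrap around it), a sketch might look like the following. It assumes `page` is a Playwright async `Page` and that clicking each link triggers a real file download instead of opening the PDF in a viewer. `link_selector`, `out_dir`, and the function name are hypothetical.

```python
from pathlib import Path


async def download_from_links(page, link_selector, out_dir):
    """Click every matching link and save the file each click downloads.

    Returns the list of saved file paths.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    saved = []
    for link in await page.locator(link_selector).all():
        # Tell Playwright to expect a download, then trigger it with a click.
        async with page.expect_download() as download_info:
            await link.click()
        download = await download_info.value

        # Keep the filename the server suggested.
        target = out / download.suggested_filename
        await download.save_as(target)
        saved.append(target)
    return saved
```

If two links suggest the same filename, the later file would overwrite the earlier one, so a real scraper may want to add a prefix to each name.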

08

AI-powered scraper writer

Point an AI agent at a website and let it write a Playwright scraper for you. You describe what you want in plain English, it explores the page and produces a working script. Requires a free Google AI API key.