THE SIGNAL
Web scraping just became agentic. This tool doesn't just extract HTML: it navigates, logs in, fills forms, applies filters, handles pagination, and returns structured JSON. All from natural language.
Google and DoorDash run millions of these agent sessions monthly. The agents understand websites, adapt to redesigns, and execute multi-step workflows that would take pages of Playwright code. (I can relate to that all too well.)
The shift: from brittle selectors that break on every UI change to AI agents that think through problems like a human would, but run at scale.
TOOL DROP
TinyFish (https://www.tinyfish.ai/)

What it does:
Production web agents navigate any website, handle authentication, fill forms, apply filters, click through multi-step flows, handle pagination and modals, and return structured JSON. You describe goals in plain English—the agent figures out the navigation.
What it replaces:
Hand-rolled Playwright scripts → Natural language goals
Writing login flows and form handlers → Agent handles auth automatically
Fragile selectors that break on redesigns → Semantic understanding that adapts
DIY proxy rotation and anti-bot evasion → Built-in stealth browsers
HTML parsing and cleaning → Structured JSON ready for your database
Cost:
Usage-based pricing. Free tier for testing, scales to enterprise volumes.
Use it if:
You need to scrape authenticated sites, hate maintaining selectors, want data that's ready for LLMs, or actually value your time.
The F12 Play (inspect)
The Old Way

What web scraping looked like before:
You write a Playwright script (or BeautifulSoup / Selenium). You target .product-card > div.price:nth-child(3) to pull the price from the page.
It works. You deploy. Two weeks later the site adds a "Sale" badge and your selector silently returns nothing.
Then the site implements anti-bot detection. You burn through your proxy list (or you buy Evomi or Bright Data proxies). You add headless Chrome flags. You patch CAPTCHA solvers into your pipeline. Your weekend is gone.
The data you get back? HTML soup that needs another parsing pass to be useful.
Most scraping projects die this way. Not with a bang, but through slow selector rot and proxy exhaustion.
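The silent-failure mode is easy to reproduce. A minimal stdlib sketch (the product-card HTML is made up, and ElementTree's positional path stands in for a real CSS selector engine):

```python
import xml.etree.ElementTree as ET

# ElementTree's positional path "./div[3]" plays the role of the CSS
# selector ".product-card > div.price:nth-child(3)" -- same idea:
# grab the third child div and hope it is still the price.
def extract_price(html: str):
    card = ET.fromstring(html)
    third = card.find("./div[3]")
    if third is not None and third.get("class") == "price":
        return third.text
    return None

# Week 1: the price is the third child. The selector works.
before = """
<div class="product-card">
  <div class="name">Widget</div>
  <div class="brand">Acme</div>
  <div class="price">$19.99</div>
</div>
"""

# Week 3: a "Sale" badge is inserted. The price is now the fourth
# child, so the same selector silently returns nothing.
after = """
<div class="product-card">
  <div class="name">Widget</div>
  <div class="badge">Sale</div>
  <div class="brand">Acme</div>
  <div class="price">$19.99</div>
</div>
"""

print(extract_price(before))  # $19.99
print(extract_price(after))   # None -- no exception, just missing data
```

No error is raised, no log line fires; the pipeline keeps running and quietly stops producing prices. That is what selector rot looks like.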
TOOL
How TinyFish Works

TinyFish doesn't just scrape pages. It thinks before it acts.
Tell it to monitor popular deals aggregators and it figures out the navigation, applies filters, skips junk, and returns structured data:
Example:
URL: https://slickdeals.net
Goal: "Find all electronics deals under $200. Exclude expired, rebate-only, contests, and lottery posts. Include title, price, store, discount %, expiration date, and link. Scroll through all pages."
The agent logs in, navigates to Electronics, filters by price, skips sponsored posts, handles pagination, and knows that "expired" in a title means skip. It adapts to the site's structure; it doesn't break when the site redesigns.
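The payoff is the output shape: rows you can insert straight into a database or feed to an LLM. A sketch of what one returned record might look like (field names and values here are illustrative assumptions, not TinyFish's documented schema):

```python
import json

# Hypothetical agent output for the goal above -- placeholder data,
# shaped the way "return structured JSON" implies.
raw = """
[
  {
    "title": "ANC Wireless Headphones",
    "price": 149.99,
    "store": "ExampleStore",
    "discount_pct": 40,
    "expires": "2025-07-01",
    "url": "https://slickdeals.net/f/placeholder"
  }
]
"""

deals = json.loads(raw)
# No HTML parsing pass, no cleanup step -- just field access.
for deal in deals:
    print(deal["title"], deal["price"], deal["store"])
```

Contrast that with the old way, where this step is another few hundred lines of parsing and normalization.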
Opportunities
THE CRAZY PART

Most scrapers choke on anything beyond a static GET request. TinyFish handles real workflows:
Authentication: Log in with credentials, handle 2FA, manage sessions and cookies across runs
Forms and inputs: Fill search fields, select dropdowns, apply filters, enter dates and ranges
Multi-step journeys: Click through category trees, navigate tabbed interfaces, handle modals and popups
Conditional logic: "If product is out of stock, check the 'Notify me' box. If price is above $500, skip."
Dynamic content: Wait for lazy-loaded elements, handle infinite scroll, trigger AJAX requests
Anti-bot evasion: Stealth browser profiles, residential proxies, natural click patterns, random delays
You're not writing a scraper anymore. You're describing what you want done, and an agent figures out how to do it.
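To see the gap, take just the conditional-logic bullet and hand-code it. This is the branching a traditional scraper must spell out explicitly, and the agent infers from one English sentence (field names are my own, for illustration):

```python
def decide(product: dict) -> str:
    """The one-sentence goal, written out by hand:
    'If product is out of stock, check the Notify me box.
     If price is above $500, skip.'"""
    if not product.get("in_stock", True):
        return "check_notify_me"   # out of stock -> tick the box
    if product.get("price", 0) > 500:
        return "skip"              # too expensive -> move on
    return "extract"               # otherwise -> scrape it

print(decide({"in_stock": False, "price": 300}))  # check_notify_me
print(decide({"in_stock": True, "price": 750}))   # skip
print(decide({"in_stock": True, "price": 120}))   # extract
```

And this is the easy part: in a real script, every branch also needs its own selectors, waits, and error handling.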
WHAT THIS UNLOCKS
The pattern works anywhere:
B2B lead generation — Log into Crunchbase, search "cybersecurity startups with Series B+, founded 2020+, 50-500 employees," extract company names, funding rounds, and contact emails
Price monitoring — Authenticate to wholesale portals, navigate product catalogs, check real-time stock levels and pricing, trigger alerts on drops
SaaS auditing — Log into billing dashboards with user credentials, navigate to invoice history, extract subscription costs and renewal dates across multiple tools
Travel aggregation — Navigate long-tail booking sites that have no API, search by dates and filters, extract availability and pricing that only exists behind forms
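Once the agent does the navigating, your side of a price-monitoring pipeline shrinks to a diff-and-alert step. A hedged stdlib sketch (SKUs, prices, and the 10% threshold are made up):

```python
def price_drops(previous: dict, current: dict, min_drop_pct: float = 10.0):
    """Compare two price snapshots (sku -> price) and return alerts
    for items whose price fell by at least min_drop_pct percent."""
    alerts = []
    for sku, old in previous.items():
        new = current.get(sku)
        if new is None or old <= 0:
            continue  # item disappeared or bad data -- skip
        drop_pct = (old - new) * 100 / old
        if drop_pct >= min_drop_pct:
            alerts.append({"sku": sku, "old": old, "new": new,
                           "drop_pct": round(drop_pct, 1)})
    return alerts

# Two hypothetical snapshots from consecutive agent runs.
previous = {"WIDGET-1": 200.0, "WIDGET-2": 50.0}
current  = {"WIDGET-1": 149.0, "WIDGET-2": 49.0}
print(price_drops(previous, current))  # only WIDGET-1 dropped >= 10%
```

The hard 90% of the job (getting clean prices out of an authenticated portal) is the part the agent absorbs; what's left is a dozen lines of plain logic.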
Until next week,
@speedy_devv
