
Simulating Real Browser Interactions with Crawlio's workflow Feature


One of the more common scraping challenges is dealing with pages that load content dynamically — things like modals, lazy-loaded elements, or pages that require clicks and scrolls to fully render useful data. Traditionally, this would push you toward using headless browsers like Puppeteer or Playwright, which can be powerful but also introduce new overhead in infrastructure and orchestration.

With Crawlio’s new workflow support, you can now define simple, declarative browser interactions directly in your API request. No separate automation stack needed.

What It Does

The workflow parameter lets you simulate basic browser actions like scrolling, clicking, waiting, running custom JavaScript, and capturing screenshots — all from your /scrape call. This is particularly useful if the data you want isn’t visible in the initial page load.

Under the hood, Crawlio spins up a browser instance for that request, applies your defined actions in sequence, and then returns the modified page content along with any screenshots or evaluation results you asked for.

Example Request

Here’s a typical payload:

{
  "url": "https://example.com",
  "workflow": [
    { "type": "wait", "duration": 1000 },
    { "type": "scroll", "selector": "footer" },
    { "type": "screenshot", "selector": "footer", "id": "footer-ss" },
    { "type": "eval", "script": "document.title" }
  ],
  "markdown": true
}

This does four things in sequence:

  • Waits one second to allow the page to finish layout/render.
  • Scrolls to the footer.
  • Takes a screenshot of the footer.
  • Runs a JavaScript snippet in the browser to fetch the page title.

The result includes the modified HTML, screenshot URLs, and the output of your eval call.
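Sending that request takes only a few lines of Python. Here is a minimal sketch using the standard library — note that the base URL and the `Authorization: Bearer` header are assumptions for illustration, not documented values; use the endpoint and auth scheme from your Crawlio dashboard.

```python
import json
import urllib.request

# The example payload from above: wait, scroll, screenshot, then eval.
payload = {
    "url": "https://example.com",
    "workflow": [
        {"type": "wait", "duration": 1000},
        {"type": "scroll", "selector": "footer"},
        {"type": "screenshot", "selector": "footer", "id": "footer-ss"},
        {"type": "eval", "script": "document.title"},
    ],
    "markdown": True,
}

def scrape(base_url: str, api_key: str) -> dict:
    """POST the payload to the /scrape endpoint and return the parsed JSON.

    base_url and the Bearer-token header are placeholders — check your
    account docs for the real endpoint and auth scheme.
    """
    req = urllib.request.Request(
        f"{base_url}/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```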

Why It’s Useful

For many sites, the first load doesn’t include all the content — some rely on infinite scroll, others use modals or tabs that require user interaction. Before workflow, you’d have to either rely on fragile static scraping or run your own headless browser (complex).

With workflow, you can:

  • Wait for animations or delayed content
  • Scroll into view to trigger lazy loads
  • Click buttons to reveal hidden elements
  • Inject and run JavaScript to extract specific details
  • Capture visual state with screenshots

It’s a middle ground — more powerful than static scraping, but less involved than managing a full headless browser setup.

Feature Comparison: Crawlio vs. Firecrawl

Firecrawl supports a similar set of browser actions, though the two differ in a few ways.

Firecrawl currently has the edge when it comes to complex form automation (input, select, etc.), but Crawlio’s sequence execution model is clearer and easier to reason about for many use cases. Also, returning screenshots and eval results in a structured response simplifies downstream parsing.

Known Limitations

  • There’s no support yet for user input (input, select) — it’s on the roadmap.
  • The browser instance is ephemeral — session state doesn’t persist across requests.

Screenshots & Eval Output

Each screenshot is returned with the ID you assign, like this:

"screenshots": {
  "footer-ss": "https://cdn.example.com/screens/abc123.png"
}

And JavaScript evaluations are returned by step index:

"evaluation": {
  "3": {
    "result": "Example Domain",
    "error": ""
  }
}

Final Thoughts

If you’ve been frustrated by scraping tools that can’t deal with real-world, dynamic sites, workflow might help. It’s not trying to replace full browser automation — just simplify the most common tasks developers run into when pages don’t behave like plain HTML.

It’s still early, and your feedback helps steer development. If you hit issues, or want to see a specific interaction supported, let us know.

You can try it out by adding a workflow array to any /scrape request.

Thanks for reading — and as always, happy scraping.

- kisshan13


© 2025 Weekend Dev Labs. All rights reserved.
Crawlio is a product of WeekendDevLabs. All content, code, and services are protected under applicable copyright and intellectual property laws. Unauthorized use or distribution is strictly prohibited.