Integrating Crawlio in Your Node.js App: A Practical Guide Using the SDK
If you're working on a Node.js project and need a dependable way to scrape or crawl web pages, Crawlio’s JavaScript SDK provides a concise, no-frills interface to get the job done. It’s built for developers who want predictable behavior, clear error handling, and enough flexibility to deal with real-world websites—including those that require interaction before scraping.
This guide walks through integrating Crawlio into a Node.js app using the `crawlio.js` SDK. It covers basic usage, advanced workflow automation, and how Crawlio compares in practice to tools like Firecrawl.
📦 Installing the SDK
```bash
npm install crawlio.js
```
🔑 Setup
Before using the SDK, you'll need an API key from Crawlio. To create one:
- Go to the Crawlio dashboard.
- Navigate to the API Keys section.
- Generate a new key and copy it securely.
Once you have your key:
```js
import Crawlio from 'crawlio.js'

const client = new Crawlio({ apiKey: process.env.CRAWLIO_API_KEY })
```
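For local development, one common approach (independent of Crawlio) is to keep the key in a `.env` file and load it with the `dotenv` package before constructing the client:

```js
// npm install dotenv
import 'dotenv/config' // populates process.env from a local .env file
import Crawlio from 'crawlio.js'

// .env contains a line like: CRAWLIO_API_KEY=your-key-here
const client = new Crawlio({ apiKey: process.env.CRAWLIO_API_KEY })
```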
🧹 Scraping a Page
```js
const result = await client.scrape({ url: 'https://example.com' })
console.log(result.html)
```
The `scrape` method returns the full HTML content along with optional metadata, discovered URLs, and a Markdown version if requested:
```js
const result = await client.scrape({
  url: 'https://example.com',
  markdown: true,                   // also return a Markdown rendering
  returnUrls: true,                 // include URLs discovered on the page
  includeOnly: ['main', 'article']  // limit extraction to these elements
})
```
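The extra fields can then be read off the result. The property names below are an assumption based on the option names (`markdown`, `returnUrls`); confirm them against the SDK's response types:

```js
// Assumed response fields; check the SDK's type definitions.
console.log(result.markdown) // Markdown version, when markdown: true
console.log(result.urls)     // discovered links, when returnUrls: true
```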
🗺️ Crawling a Site
```js
const crawl = await client.crawl({
  url: 'https://example.com',
  count: 10,                    // stop after 10 pages
  sameSite: true,               // stay on the same domain
  patterns: ['/blog', '/docs']  // only follow paths matching these patterns
})
```
You can track progress with:
```js
const status = await client.crawlStatus(crawl.id)
const data = await client.crawlResults(crawl.id)
```
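For longer crawls, a simple polling loop works well. Here is a minimal sketch, assuming the status response exposes a `status` field with terminal values like `'completed'` and `'failed'` (the exact field name and values are assumptions; check the API reference):

```js
// Poll until the crawl reaches a terminal state.
// The `status` field and its values are assumptions, not documented names.
async function waitForCrawl(client, crawlId, intervalMs = 2000) {
  for (;;) {
    const status = await client.crawlStatus(crawlId)
    if (status.status === 'completed' || status.status === 'failed') {
      return status
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs))
  }
}

await waitForCrawl(client, crawl.id)
const data = await client.crawlResults(crawl.id)
```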
🧰 Advanced Use: Workflow Automation
Many sites rely on JavaScript to render or reveal content. Crawlio supports a `workflow` field in the `scrape` method to automate browser interactions before scraping begins.
Example:
```js
const result = await client.scrape({
  url: 'https://example.com',
  workflow: [
    { type: 'wait', duration: 1500 },           // let the page settle
    { type: 'scroll', selector: '#comments' },  // bring comments into view
    { type: 'click', selector: '#loadMore' },   // reveal more content
    { type: 'wait', duration: 1000 },           // give it time to render
    { type: 'screenshot', selector: '#comments', id: 'comments-ss' }, // capture that element
    { type: 'eval', script: 'document.title' }  // run JS in the page context
  ]
})
```
Results include any screenshots and eval outputs:
```js
result.screenshots['comments-ss'] // screenshot URL, keyed by the step's id
result.evaluation['5'].result     // "Page Title" ('5' is the eval step's index in the workflow array)
```
This is especially useful for single-page apps or lazy-loaded content that would otherwise be missed in a static scrape.
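For instance, a feed that loads more items as you scroll can often be handled with repeated scroll-and-wait steps before the final HTML is captured. A sketch using only the step types shown above (the URL and `footer` selector are illustrative assumptions):

```js
// Sketch: nudge a lazy-loading page a few times before scraping.
// Selectors and timings are assumptions; tune them per site.
const result = await client.scrape({
  url: 'https://example.com/feed', // hypothetical lazy-loaded listing
  workflow: [
    { type: 'scroll', selector: 'footer' }, // jump toward the bottom
    { type: 'wait', duration: 1000 },       // let the next batch render
    { type: 'scroll', selector: 'footer' },
    { type: 'wait', duration: 1000 },
    { type: 'scroll', selector: 'footer' },
    { type: 'wait', duration: 1000 }
  ],
  markdown: true
})
```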
🧱 Building Blocks for Scraping Infrastructure
Crawlio isn’t trying to replace browser automation libraries or all-in-one scraping platforms—it focuses on providing clean, structured access to web content with enough flexibility to cover real-world use cases.
Whether you're:
- extracting articles for a personal blog aggregator,
- indexing content for internal search,
- or automating price tracking on multiple domains,
Crawlio’s SDK gives you a solid foundation with minimal setup.
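The multi-domain price-tracking case, for example, boils down to mapping `scrape` over a list of URLs. A minimal sketch (the URLs are illustrative, and production code would add retry and rate-limit handling):

```js
// Sketch: scrape the same product across several domains in parallel.
const urls = [
  'https://shop-a.example/product/123',
  'https://shop-b.example/product/123'
]

const pages = await Promise.all(
  urls.map((url) => client.scrape({ url, includeOnly: ['main'] }))
)

for (const page of pages) {
  console.log(page.html.length) // parse prices out of page.html here
}
```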
🧪 Error Handling
Crawlio’s SDK exports several error types, which can be caught for precise handling:
```js
import Crawlio, { CrawlioAuthenticationError, CrawlioRateLimit } from 'crawlio.js'

try {
  await client.scrape({ url: 'https://example.com' })
} catch (err) {
  if (err instanceof CrawlioAuthenticationError) {
    console.error('Invalid API key')
  } else if (err instanceof CrawlioRateLimit) {
    console.warn('Rate limit exceeded')
  } else {
    console.error(err)
  }
}
```
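Since rate limits are usually transient, one pragmatic pattern is a small retry wrapper with exponential backoff. This is an application-level sketch, not an SDK feature:

```js
// Retry a scrape on rate-limit errors with exponential backoff.
async function scrapeWithRetry(client, options, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await client.scrape(options)
    } catch (err) {
      if (err instanceof CrawlioRateLimit && attempt < maxAttempts) {
        const delay = 500 * 2 ** attempt // 1s, 2s, ...
        await new Promise((resolve) => setTimeout(resolve, delay))
        continue
      }
      throw err // non-rate-limit error, or attempts exhausted
    }
  }
}
```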
🧵 Final Thoughts
Crawlio doesn’t try to be everything—it’s designed to be a solid, composable tool for scraping and crawling from code. If you’re building a content pipeline, search tool, or automation service using Node.js, the SDK gives you a reliable interface for working with the web at scale.
It’s worth noting that advanced features like the `workflow` system are powerful but currently under-documented. Until that improves, reviewing examples (like the one above) or experimenting directly is the best way to understand what’s possible.
Scrape Smarter, Not Harder
Get the web data you need — without the headaches.
Start with zero setup. Use Crawlio’s API to scrape dynamic pages, search results, or full sites in minutes.