ZERONE
Data Engineering · Recruitment Intelligence · 2026

Case 01 — Job-Intelligence Platform

Distributed 7-server crawler, 16 ATS integrations, continuous enrichment across 2.5 M open positions in the DACH market.

2.5 M job postings
7 servers in the cluster
16 ATS integrations
55+ live daemons

The challenge

A DACH recruitment leader needed a data layer its own team could no longer sustain: millions of active positions, refreshed daily, pulled from 16 different applicant-tracking systems, and enriched with contacts, salary bands, company metadata and semantic description analysis. All of it without downtime, without data gaps, and with forensically auditable quality control.

Architecture

A master node handles the API, cron scheduling, the daemon-keeper and frontend delivery. Six further servers share the load by domain: five specialised workers cover ATS crawling, career-page extraction, description shards and geo-discovery, and a dedicated database host runs PostgreSQL 15 behind a PgBouncer pool.

MASTER: API · Cron · Orchestrator · Frontend
W1: ATS crawler · 13 enricher daemons
W2: Career pages · triple enricher · contact completer
W3: Career HTML · PDF extraction · description shards 3–4
DB: PostgreSQL 15 primary · PgBouncer pool
W5: Description shards 5–7 · residential-proxy scraper
W6: Geo-discovery · 25 Docker containers

Pipeline

8-shard description pipeline (resilient)

  1. Sharding via hashtext: deterministic distribution across 8 partitions
  2. Per-shard Python process with a dedicated log file
  3. Endless reconnect with exponential backoff [1, 2, 4, 8, 16, 30] s
  4. Mini-batch commit every 50 rows: idempotent, UPDATE-only
  5. Daemon-keeper with Telegram alerts: auto-restart on miss + log tail + OOM check
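
Steps 1, 3 and 4 can be illustrated with a minimal Python sketch. It is not the production code. The table and column names in `SHARD_QUERY`, the helper names, and the choice to stay at the 30 s cap once the schedule is exhausted are all assumptions made for illustration. Only the shard count, the backoff schedule and the batch size come from the list above.

```python
import itertools
import time
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple, Type, TypeVar

T = TypeVar("T")

NUM_SHARDS = 8                          # step 1: eight partitions
BACKOFF_SECONDS = (1, 2, 4, 8, 16, 30)  # step 3: schedule from the list above
BATCH_SIZE = 50                         # step 4: commit every 50 rows

# Hypothetical shard predicate: table and column names are invented here.
# hashtext() is PostgreSQL's text hash; masking the sign bit keeps the
# value non-negative so mod() maps every row to exactly one shard.
SHARD_QUERY = (
    "SELECT id FROM job_postings "
    "WHERE mod(hashtext(id::text) & 2147483647, %(shards)s) = %(shard)s"
)


def backoff_delays(schedule: Sequence[int] = BACKOFF_SECONDS) -> Iterator[int]:
    """Yield the schedule once, then repeat its last value forever (assumed)."""
    yield from schedule
    while True:
        yield schedule[-1]


def mini_batches(rows: Iterable[T], size: int = BATCH_SIZE) -> Iterator[List[T]]:
    """Split any iterable into lists of at most `size` items."""
    it = iter(rows)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch


def run_with_reconnect(
    work: Callable[[], T],
    sleep: Callable[[float], None] = time.sleep,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
) -> T:
    """Call `work` until it succeeds, sleeping per the backoff schedule."""
    delays = backoff_delays()
    while True:
        try:
            return work()
        except retry_on:
            sleep(next(delays))
```

In a real worker, `work` would open a connection through the pool, select its shard's rows, and issue one UPDATE per row with a commit after each mini-batch. Because the writes are UPDATE-only and idempotent, re-running a batch after a dropped connection does no harm.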

Technology stack

Next.js 14 (App Router + Pages)
FastAPI · Uvicorn
PostgreSQL 15 · PgBouncer
Redis
Playwright
Docker Compose
systemd · cron
SendGrid
Cloudflare Workers
Telegram Bot API
IPRoyal (residential)
nginx · Let's Encrypt

Outcome

Since go-live: 99.9 %+ uptime. Description coverage 84 %, email coverage 65 %, quality score climbing toward 80 %. The pipeline runs 04:30–07:30 daily with zero operator intervention. Two years of planned remediation were made obsolete by unified connection management and a cluster-wide daemon-keeper.

Similar challenge?

Talk to us — we listen first, deliver second.

Request a project