ZERONE
Quant · ML · Crypto · 2026

Case 05 — ML signal system for crypto futures

Production-grade LSTM signal system: 77,179 training samples, 79 features, 50 selected. Confidence-threshold inference — not every signal is traded, only the top 35 %.

77 k · Training samples
79 → 50 · Features (selected)
56–58 % · Win-rate (filtered)
24 k · Model parameters

The challenge

A quant trader wanted an ML system tuned for precision over volume: generate many candidate signals, execute only the most confident. Requirements: clean feature engineering, LSTM training with validation-gap control, confidence scoring, and reproducible backtests.
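
One way to operationalise the validation-gap requirement is a check after each epoch. A minimal sketch; the function name and return shape are assumptions, not the project's API:

```python
def check_validation_gap(train_acc: float, val_acc: float,
                         max_gap: float = 0.15):
    """Flag overfitting once train/val accuracy diverge beyond max_gap.

    The 15 % budget comes from the brief; everything else here is
    an illustrative assumption.
    """
    gap = train_acc - val_acc
    return gap <= max_gap, gap
```

A training loop would stop, or roll back to the last good checkpoint, as soon as the check fails.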

Architecture

A feature-engineering pipeline derives 79 candidate features from OHLCV data; feature selection narrows them to the 50 strongest. An LSTM model with 24,962 parameters performs the classification. A smart-inference layer drops every signal below the confidence threshold. Backtesting and live engines share the same inference logic.
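
The classifier could look roughly like the following PyTorch sketch. Hidden size, sequence length, and output head are illustrative assumptions and do not reproduce the 24,962-parameter production configuration:

```python
import torch
import torch.nn as nn

class SignalLSTM(nn.Module):
    """Minimal LSTM signal classifier (hyperparameters are illustrative)."""
    def __init__(self, n_features: int = 50, hidden: int = 32,
                 n_classes: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)                 # out: (batch, seq, hidden)
        return torch.softmax(self.head(out[:, -1]), dim=-1)

model = SignalLSTM().eval()
with torch.no_grad():
    probs = model(torch.randn(4, 30, 50))     # 4 windows of 30 bars, 50 features
```

The softmax output doubles as the raw material for the confidence score used downstream.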

ING: OHLCV ingestor · exchange APIs
FE: Feature engineering (79 → 50)
TR: LSTM training · balanced sampling
INF: Smart inference · confidence filter
BT: Backtesting engine · strategy replay
EX: Execution · position management
DB: Model store · metrics · runs
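
One design point worth making explicit: because backtesting (BT) and execution (EX) share the inference logic, both can call a single decision function, so a replayed signal can never diverge from a live one. A minimal sketch with hypothetical names:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Decision:
    direction: int       # +1 long, -1 short, 0 stand aside
    confidence: float

def decide(prob_long: float, threshold: float = 0.65) -> Decision:
    """Single decision path shared by backtest replay and live execution,
    so identical inputs always yield identical trades (names hypothetical)."""
    confidence = max(prob_long, 1.0 - prob_long)
    if confidence < threshold:
        return Decision(0, confidence)        # below the bar: no trade
    return Decision(1 if prob_long >= 0.5 else -1, confidence)
```

Keeping this function pure, with no clock or exchange state, is what makes the replay byte-for-byte comparable to production.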

Pipeline

Training and inference lifecycle

  1. Feature engineering across 5 markets and several timeframes
  2. Balanced training with validation-gap monitoring (≤ 15 %)
  3. Feature-selection pass delivers a +0.5–1 % uplift
  4. Confidence threshold 0.65 → only the top 35 % of signals are traded
  5. Backtesting replay before every live rollout
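
The confidence step above reduces to a plain threshold mask; per the figures here, the 0.65 cut-off empirically passes roughly the top 35 % of signals. The scores below are made-up for illustration:

```python
import numpy as np

def confident_mask(confidences, threshold: float = 0.65):
    """Boolean mask of signals that clear the confidence threshold."""
    return np.asarray(confidences, dtype=float) >= threshold

# hypothetical confidence scores for ten candidate signals
scores = np.array([0.52, 0.71, 0.64, 0.80, 0.66,
                   0.58, 0.90, 0.61, 0.67, 0.49])
traded = scores[confident_mask(scores)]   # only these reach execution
```

Everything below the mask is logged but never sent to the execution layer.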

Technology stack

Python 3.11+ · PyTorch (LSTM) · NumPy · Pandas · scikit-learn · Feature-engineering pipeline · Confidence scoring · Backtesting framework · Exchange APIs (Binance, Bybit) · PostgreSQL · pytest · Docker

Outcome

Validation accuracy of 53 %, win-rate of 56–58 % after the confidence filter. The train/val gap stayed under 14 %, so the model generalises. The production inference pipeline is deterministic: same features → same decision. No black-box overselling: every signal can be explained via feature importances.
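
The per-signal explanations could be produced with something like permutation importance, a standard model-agnostic technique; this sketch is an assumption, not necessarily the project's actual attribution method:

```python
import numpy as np

def permutation_importance(predict, X, y, rng=None):
    """Accuracy drop when one feature is shuffled: features whose
    shuffling costs the most accuracy matter most to the decision."""
    rng = rng if rng is not None else np.random.default_rng(0)
    base = np.mean(predict(X) == y)
    drops = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        rng.shuffle(Xp[:, j])      # break the feature/target link
        drops.append(base - np.mean(predict(Xp) == y))
    return np.array(drops)
```

Run per signal batch, this yields a ranking that can be attached to each trade as its explanation.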

Similar challenge?

Talk to us — we listen first, deliver second.

Request a project