DataZooDE/anofox-forecast

Statistical timeseries forecasting in DuckDB

License: Apache-2.0 | Language: Rust | 232
Tags: claude-skills, duckdb, duckdb-community, duckdb-extension, time-series-analysis, time-series-forecasting

Deep Analysis

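A time series forecasting extension for DuckDB that integrates 32 forecasting models with complete data preparation and analysis functionality.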

Highlights
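  • Native DuckDB parallelization; can handle millions of time series
  • Zero configuration; all macros load automatically
  • Consistent MAP-based parameter API design
  • Supports tsfresh-compatible feature vectors
  • 138 tests passing; code depth score A (93%)
  • Complete multi-language documentation and examples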
Use Cases
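  • Retail demand forecasting
  • Inventory management and optimization
  • Financial time series analysis
  • IoT sensor data forecasting
  • Energy demand forecasting
  • Multi-level (hierarchical) time series forecasting
  • Feature engineering for ML pipelines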
Limitations
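  • Still in early development; bugs and breaking changes are expected
  • BSL 1.1 license restriction: cannot be offered to third parties as a hosted service
  • Requires DuckDB v1.4.2 or later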
Tech Stack
  • Rust
  • DuckDB 1.4.2+
  • C++
  • CMake 3.15+
  • OpenSSL and Eigen3
  • Python, R, Julia, C++, Rust, Node.js, Go, Java

Anofox Forecast - Time Series Forecasting for DuckDB

License: BSL 1.1

Technical Depth and Code Health scores calculated using PMAT

Important: This extension is in early development, so bugs and breaking changes are expected.
Please use the issues page to report bugs or request features.

A time series forecasting extension for DuckDB with 32 models, data preparation, and analytics — all in pure SQL.

✨ Key Features

🎯 Forecasting (32 Models)

  • AutoML: AutoETS, AutoARIMA, AutoMFLES, AutoMSTL, AutoTBATS
  • Statistical: ETS, ARIMA, Theta, Holt-Winters, Seasonal Naive
  • Advanced: TBATS, MSTL, MFLES (multiple seasonality)
  • Intermittent Demand: Croston, ADIDA, IMAPA, TSB
  • Exogenous Variables: ARIMAX, ThetaX, MFLESX (external regressors support)
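
As a rough illustration of the simplest model family listed above, the sketch below implements a seasonal naive forecast in plain Python: each future step repeats the value observed one season earlier. The function name `seasonal_naive` and its parameters are hypothetical and are not part of the extension's API; the extension's own implementation may differ in detail.

```python
def seasonal_naive(history, horizon, seasonal_period):
    """Forecast `horizon` steps by repeating the last full season of `history`."""
    if seasonal_period <= 0:
        raise ValueError("seasonal_period must be positive")
    if len(history) < seasonal_period:
        raise ValueError("need at least one full season of history")
    last_season = history[-seasonal_period:]
    # Step h of the forecast reuses the value at the same position in the last season.
    return [last_season[h % seasonal_period] for h in range(horizon)]


# Two weeks of made-up daily demand with a weekly pattern.
demand = [3, 4, 5, 6, 7, 9, 8, 2, 4, 5, 7, 7, 10, 8]
forecast = seasonal_naive(demand, horizon=10, seasonal_period=7)
print(forecast)
```

A baseline like this is mainly useful as a yardstick: a more complex model is only worth its cost if it beats the seasonal naive forecast on held-out data.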

📊 Complete Workflow

  • EDA & Data Quality: 5 functions (2 table functions, 3 macros) for exploratory analysis and data quality assessment
  • Data Preparation: 12 macros for cleaning and transformation
  • Multi-Key Hierarchy: 4 functions for combining, aggregating, and splitting hierarchical time series (region/store/item)
  • Cross-Validation & Backtesting: Time series CV with expanding/fixed/sliding windows, gap, embargo, and variable horizon support
  • Conformal Prediction: Distribution-free prediction intervals with guaranteed coverage probability
  • Evaluation: 12 metrics including coverage analysis
  • Seasonality Detection: Automatic period identification, seasonality classification, and peak detection
  • Changepoint Detection: Regime identification with probabilities
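
To make the conformal prediction item above more concrete, here is a minimal sketch of split conformal prediction in plain Python. It assumes a symmetric interval built from absolute residuals on a held-out calibration set; the function name `conformal_interval` and its arguments are hypothetical, and the extension may use a different variant. The coverage guarantee of this technique rests on the calibration and future residuals being exchangeable, which time series data can violate.

```python
import math


def conformal_interval(calib_actuals, calib_forecasts, point_forecast, alpha=0.1):
    """Return a symmetric split-conformal interval around `point_forecast`."""
    if len(calib_actuals) != len(calib_forecasts):
        raise ValueError("calibration actuals and forecasts must have equal length")
    # Nonconformity scores: absolute residuals on the calibration set.
    scores = sorted(abs(a - f) for a, f in zip(calib_actuals, calib_forecasts))
    n = len(scores)
    if n == 0:
        raise ValueError("calibration set is empty")
    # Finite-sample corrected rank (1-indexed): ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return (-math.inf, math.inf)
    q = scores[rank - 1]
    return (point_forecast - q, point_forecast + q)


# Made-up calibration data: nine past actuals against a constant forecast of 11.
calib_actuals = [10, 12, 9, 11, 13, 10, 12, 11, 9]
calib_forecasts = [11] * 9
interval = conformal_interval(calib_actuals, calib_forecasts, point_forecast=11, alpha=0.25)
print(interval)
```

The interval width comes entirely from observed residuals, which is why the method needs no distributional assumption about the forecast errors.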

🔢 Feature Calculation

  • 76+ Statistical Features: Extract comprehensive time series features for ML pipelines
  • GROUP BY & Window Support: Native DuckDB parallelization for multi-series feature extraction
  • Flexible Configuration: Select specific features, customize parameters, or use JSON/CSV configs
  • tsfresh-Compatible: Feature vectors compatible with tsfresh for seamless integration with existing ML workflows (hctsa support is planned)
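
The idea of per-series feature extraction can be sketched in plain Python as follows. The three features shown use tsfresh's naming (`mean`, `abs_energy`, `mean_abs_change`); which features the extension actually computes, and under what names, is not stated here, so treat the helper `extract_features` and the sample data as illustrative assumptions.

```python
from statistics import fmean


def extract_features(values):
    """Compute a few tsfresh-style summary features for one series."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # First differences between consecutive observations.
    diffs = [b - a for a, b in zip(values, values[1:])]
    return {
        "mean": fmean(values),
        "abs_energy": sum(v * v for v in values),               # sum of squares
        "mean_abs_change": fmean([abs(d) for d in diffs]),      # mean |x[i+1] - x[i]|
    }


# One feature vector per series key, analogous to a GROUP BY over series IDs.
series = {"store_a": [1, 3, 2, 4], "store_b": [5, 5, 5, 5]}
features = {key: extract_features(vals) for key, vals in series.items()}
print(features)
```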

⚡ Performance

  • Parallel: Native DuckDB parallelization on GROUP BY
  • Scalable: Handles millions of series
  • Memory Efficient: Columnar storage, streaming operations
  • Native Rust Core: High-performance native implementations for data preparation and forecasting

🎨 User-Friendly API

  • Zero Setup: All macros load automatically
  • Consistent: MAP-based parameters
  • Composable: Chain operations easily
  • Multi-Language: Use from Python, R, Julia, C++, Rust, and more!

Attribution

This extension uses the anofox-forecast Rust crate and implements algorithms from several open-source projects.
See THIRD_PARTY_NOTICES.md for complete attribution and license information.

Installation

Community Extension

INSTALL anofox_forecast FROM community;
LOAD anofox_forecast;

From Source

# Clone the repository
git clone https://github.com/DataZooDE/anofox-forecast.git
cd anofox-forecast

# Build the extension (requires Rust toolchain and CMake)
make release

# The extension will be built to:
# build/extension/anofox_forecast/anofox_forecast.duckdb_extension

🚀 Quick Start on M5 Dataset

The forecast takes about 2 minutes on a Dell XPS 13 and requires DuckDB v1.4.2 or later.

-- Load extension
LOAD httpfs;
LOAD anofox_forecast;

CREATE OR REPLACE TABLE m5 AS 
SELECT item_id, CAST(timestamp AS TIMESTAMP) AS ds, demand AS y FROM 'https://m5-benchmarks.s3.amazonaws.com/data/train/target.parquet'
ORDER BY item_id, timestamp;

CREATE OR REPLACE TABLE m5_train AS
SELECT * FROM m5 WHERE ds < DATE '2016-04-25';

CREATE OR REPLACE TABLE m5_test AS
SELECT * FROM m5 WHERE ds >= DATE '2016-04-25';

-- Perform baseline forecast and evaluate performance
CREATE OR REPLACE TABLE forecast_results AS (
    SELECT *
    FROM anofox_fcst_ts_forecast_by('m5_train', item_id, ds, y, 'SeasonalNaive', 28, {'seasonal_period': 7})
    UNION ALL
    SELECT *
    FROM anofox_fcst_ts_forecast_by('m5_train', item_id, ds, y, 'Theta', 28, {'seasonal_period': 7})
    UNION ALL
    SELECT *
    FROM anofox_fcst_ts_forecast_by('m5_train', item_id, ds, y, 'AutoARIMA', 28, {'seasonal_period': 7})
);

-- MAE and Bias of Forecasts
CREATE OR REPLACE TABLE evaluation_results AS (
  SELECT 
      item_id,
      model_name,
      anofox_fcst_ts_mae(LIST(y), LIST(point_forecast)) AS mae,
      anofox_fcst_ts_bias(LIST(y), LIST(point_forecast)) AS bias
  FROM (
      -- Join Forecast with Test Data
      SELECT 
          m.item_id,
          m.ds,
          m.y,
          n.model_name,
          n.point_forecast
      FROM forecast_results n
      JOIN m5_test m ON n.item_id = m.item_id AND n.date = m.ds
  )
  GROUP BY item_id, model_name
);

-- Summarise evaluation results by model
SELECT
  model_name,
  AVG(mae) AS avg_mae,
  STDDEV(mae) AS std_mae,
  AVG(bias) AS avg_bias,
  STDDEV(bias) AS std_bias
FROM evaluation_results
GROUP BY model_name
ORDER BY avg_mae;
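
For readers unfamiliar with the two metrics used above, the following plain-Python sketch shows what they measure. It is not the extension's implementation: the helper names are hypothetical, and the sign convention for bias (forecast minus actual, so positive means over-forecasting) is an assumption that should be checked against the API Reference.

```python
def _check(actuals, forecasts):
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")


def mae(actuals, forecasts):
    """Mean absolute error: the average size of the errors, ignoring direction."""
    _check(actuals, forecasts)
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def bias(actuals, forecasts):
    """Mean signed error, assumed here to be forecast minus actual."""
    _check(actuals, forecasts)
    return sum(f - a for a, f in zip(actuals, forecasts)) / len(actuals)


# Made-up values: the errors are +1, -2, +1, 0.
actuals = [10, 12, 8, 10]
forecasts = [11, 10, 9, 10]
print(mae(actuals, forecasts), bias(actuals, forecasts))
```

In this example the errors cancel in the bias but not in the MAE, which is why the query reports both: a model can be unbiased on average and still be far off on individual days.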

🌍 Multi-Language Support

Write SQL once, use everywhere! The extension works from any language with DuckDB bindings.

Language | Guide
Python | Python Usage
R | R Usage
Julia | Julia Usage
C++ | Via DuckDB C++ bindings
Rust | Via DuckDB Rust bindings
Node.js | Via DuckDB Node bindings
Go | Via DuckDB Go bindings
Java | Via DuckDB JDBC driver

See: Multi-Language Overview for polyglot workflows!


📚 API Reference

For complete function signatures, parameters, and detailed documentation, see the API Reference.

Guides and API Sections

Guide | API Reference Section
Quick Start | Forecasting
EDA & Data Preparation | Exploratory Data Analysis, Data Quality, Data Preparation
Detecting Seasonality | Seasonality
Detecting Changepoints | Changepoint Detection
Time Series Features | Time Series Features
Basic Forecasting | Forecasting
Exogenous Variables | Exogenous Forecasting
Evaluation Metrics | Evaluation
Backtesting & Cross-Validation | Cross-Validation & Backtesting
Conformal Prediction | Conformal Prediction
Multi-Key Hierarchy | Multi-Key Hierarchy
Forecasting Model Parameters | Supported Models, Parameter Reference

📦 Development

Prerequisites

Before building, install the required dependencies:

Manjaro/Arch Linux:

sudo pacman -S base-devel cmake ninja openssl eigen

Ubuntu/Debian:

sudo apt update
sudo apt install build-essential cmake ninja-build libssl-dev libeigen3-dev

Fedora/RHEL:

sudo dnf install gcc-c++ cmake ninja-build openssl-devel eigen3-devel

macOS:

brew install cmake ninja openssl eigen

Windows (Option 1 - vcpkg, recommended):

# Install vcpkg
git clone https://github.com/Microsoft/vcpkg.git
.\vcpkg\bootstrap-vcpkg.bat

# Install dependencies
.\vcpkg\vcpkg install eigen3 openssl

# Build with vcpkg toolchain
cmake -DCMAKE_TOOLCHAIN_FILE=.\vcpkg\scripts\buildsystems\vcpkg.cmake .
cmake --build . --config Release

Windows (Option 2 - MSYS2/MinGW):

# In MSYS2 MinGW64 terminal
pacman -S mingw-w64-x86_64-gcc mingw-w64-x86_64-cmake mingw-w64-x86_64-ninja
pacman -S mingw-w64-x86_64-openssl mingw-w64-x86_64-eigen3

# Then build as normal
make -j$(nproc)

Windows (Option 3 - WSL, easiest):

# Use Ubuntu in WSL
wsl --install
# Then follow Ubuntu instructions above

Required:

  • C++ compiler (GCC 9+ or Clang 10+)
  • CMake 3.15+
  • OpenSSL (development libraries)
  • Eigen3 (linear algebra library)
  • Make or Ninja (build system)

Build from Source

# Clone with submodules
git clone --recurse-submodules https://github.com/DataZooDE/anofox-forecast.git
cd anofox-forecast

# Set up Git hooks (recommended)
./scripts/setup-hooks.sh

# Build (choose 