SnakeBatch v6: Binary Divide and Conquer for Distributed SAT Classification

Charles Dana · Monce SAS · April 2026

snakebatch.aws.monce.ai

Abstract

We present a recursive binary architecture for distributed Snake training on AWS Lambda. One function, one job: oppose(A, B), partition, fire two children. The tree discovers itself at runtime. A split threshold τ = max(1200, 12b) separates two regimes: above τ, O(1) binary split with no coverage scan; below τ, O(n) scored split where n is bounded and negligible. Snake v5.4.5 enforces global datatypes across all leaves, eliminating type mismatches at inference.

Validated results (500/500 sample, 0 errors):

n          Layers   Accuracy   T(n)
250        5        100%       2.0s
1,000      5        100%       2.4s
5,000      15       100%       5.3s
15,000     5        100%       7.3s
150,000    5        100%       19.8s

1. The Dana Theorem (2024)

Any indicator function over a finite discrete domain can be encoded as a SAT instance in polynomial time. Snake exploits this constructively: for each target class, it builds a CNF formula via oppose() literals that excludes all non-members while satisfying all members. Construction per bucket: O(b² · m) where b is bucket size and m is feature count.

The Dana Theorem guarantees that the SAT formula exists and is constructible. SnakeBatch v6 distributes this construction across thousands of concurrent Lambda workers, achieving wall-clock times that are logarithmic in n while preserving the theorem’s polynomial construction guarantee per bucket.
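The constructive direction can be checked exhaustively on a toy domain. The sketch below is illustrative only, not the library's actual oppose() encoding: each non-member contributes one clause requiring at least one feature to differ from it, so members (which differ from every non-member somewhere) satisfy every clause, while each non-member violates its own clause.

```python
from itertools import product

def encode_indicator(members, domain):
    """Build a CNF over discrete features satisfied exactly by `members`.

    A clause is a list of (feature_index, forbidden_value) literals; a
    literal is true when point[i] != forbidden_value. One clause per
    non-member x excludes exactly x.
    """
    non_members = [x for x in domain if x not in members]
    return [[(i, x[i]) for i in range(len(x))] for x in non_members]

def satisfies(cnf, point):
    return all(any(point[i] != v for i, v in clause) for clause in cnf)

# Exhaustive check on a 3-bit domain: CNF membership == set membership.
domain = list(product([0, 1], repeat=3))
members = {(0, 0, 1), (1, 1, 0), (1, 0, 1)}
cnf = encode_indicator(members, domain)
assert all(satisfies(cnf, x) == (x in members) for x in domain)
```

The clause count is linear in the number of non-members, consistent with polynomial-time construction over a finite domain.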

2. Time Complexity: O(log n)

T(n) = ⌈log₂(n / τ)⌉ · λ + T_leaf(τ)

Symbol      Definition                                           Measured value
τ           Split threshold = max(1200, 12b)                     1200 at b=75
λ           Lambda invoke overhead (warm)                        ~100ms
T_leaf(τ)   Scored split + SAT on τ items                        ~500ms
b           Bucket size (SAT atom)                               75
L           Stochastic layers (parallel, no wall-clock impact)   5–15

The wall clock is independent of L. All layers fire in parallel. Each layer builds its own binary tree. The slowest layer determines wall clock.
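The depth and wall-clock model above can be evaluated directly. The constants below are the measured values from the table (τ = 1200, λ ≈ 100ms, T_leaf ≈ 500ms); the helper names are illustrative, not library API.

```python
import math

TAU = 1200          # split threshold at b = 75
LAMBDA_HOP = 0.1    # warm invoke overhead, seconds
T_LEAF = 0.5        # scored split + SAT on tau items, seconds

def depth(n, tau=TAU):
    # D(n) = ceil(log2(n / tau)), taken as 0 once n fits under tau
    return max(0, math.ceil(math.log2(n / tau)))

def t_theoretical(n):
    return depth(n) * LAMBDA_HOP + T_LEAF

for n in (250, 1_000, 5_000, 15_000, 150_000):
    print(n, depth(n), round(t_theoretical(n), 1))
```

Running this reproduces the D and T_theoretical columns of the depth table in Section 2.2 (e.g. D = 7 and 1.2s at n = 150,000).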

2.1 The two regimes

Above τ (n > 1200): binary split. One oppose(A, B) call — O(1). One partition scan — O(n). Fire two children on Lambda. No coverage scoring. The literal from oppose discriminates A from B by construction. The tree refines at depth.

Below τ (n ≤ 1200): scored split. 30 oppose attempts, pick the most balanced. Cost: 30 · n ≤ 30 · 1200 = 36,000 operations. At ~1µs each: 36ms. Negligible. This regime ensures SAT quality at the leaf level where bucket composition matters for prediction accuracy.

T_split(n) = O(1) + O(n)    if n > τ
T_split(n) = 30 · O(n)      if n ≤ τ

The 30× multiplier only applies below τ, where n is bounded. Total scored work per layer: O(τ) × (n/τ leaves) = O(n). But each leaf is a separate Lambda — parallel. Wall clock: O(τ) = O(1).
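The two regimes can be sketched in a few lines. Everything here is a stand-in: this `oppose` simply returns the first (feature, value) pair on which two sampled rows disagree, whereas the real oppose() builds a SAT literal; the balance scoring, however, follows the scheme described above (30 attempts, keep the most even partition).

```python
import random

def oppose(a, b):
    """Illustrative stand-in: first (feature, value) where a and b differ."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return (i, x)          # literal: feature i equals a's value
    raise ValueError("a and b are identical")

def apply_literal(row, lit):
    i, v = lit
    return row[i] == v

def choose_split(pop, indices, tau=1200, attempts=30, rng=random):
    if len(indices) > tau:
        # Above tau: one oppose, no coverage scan -- O(1)
        a, b = (pop[i] for i in rng.sample(indices, 2))
        return oppose(a, b)
    # Below tau: score `attempts` candidates, keep the most balanced -- O(n)
    best, best_gap = None, None
    for _ in range(attempts):
        a, b = (pop[i] for i in rng.sample(indices, 2))
        lit = oppose(a, b)
        left = sum(apply_literal(pop[i], lit) for i in indices)
        gap = abs(len(indices) - 2 * left)   # 0 => perfect 50/50 split
        if best_gap is None or gap < best_gap:
            best, best_gap = lit, gap
    return best
```

The scored loop touches every index at most 30 times, which is the 30 · n ≤ 36,000 bound quoted above.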

2.2 Depth

D(n) = ⌈log₂(n / τ)⌉ (taken as 0 when n ≤ τ)

n          D    T_theoretical   T_measured   Ratio
250        0    0.5s            2.0s         4.0×
1,000      0    0.5s            2.4s         4.8×
5,000      3    0.8s            5.3s         6.6×
15,000     4    0.9s            7.3s         8.1×
150,000    7    1.2s            19.8s        16.5×

The measured-to-theoretical ratio grows with n. This is the sublinear friction: partition scans at the top levels cost O(n), O(n/2), O(n/4), …, which sum to at most 2n = O(n) total work, spread across parallel workers. Lambda overhead per hop is ~100ms in theory but ~1–3s in practice (cold-start tail, payload serialization, connection setup). The friction is real and bounded, and its wall-clock impact grows more slowly than n.

T_effective(n) = D(n) · λ̃ + T_leaf
where λ̃ ≈ 1–3s (effective hop cost including friction)

3. Cost Complexity: O(n)

$(n, L) = L · W(n) · C_λ

where W(n) is workers per layer and Cλ is cost per worker invocation.

3.1 Workers

W(n) = 2^(D(n)+1) − 1 ≈ 2n / τ

A full binary tree with D levels has 2^(D+1) − 1 total nodes. Since 2^D = 2^⌈log₂(n/τ)⌉ ≈ n/τ, total workers ≈ 2n/τ. Linear in n.

3.2 Cost per worker

C_λ = M · t̄ · p_GB·s + p_invoke

Symbol     Definition             Value                      Source
M          Worker memory          10 GB                      Lambda config
t̄          Avg worker duration    0.3s                       Measured (internal: 0.1s, leaf: 0.5s)
p_GB·s     Lambda compute price   $0.0000166667              AWS Lambda pricing, eu-west-3
p_invoke   Per-request price      $0.20 / 1M = $0.0000002    AWS Lambda pricing

C_λ = 10 · 0.3 · $0.0000167 + $0.0000002 ≈ $0.0000501

3.3 Total cost

$(n, L) = L · (2n / τ) · $0.0000501

n           L    W(n)    W · L    $(n, L)    $/row
250         5    1       5        $0.0003    $0.0000010
1,000       5    1       5        $0.0003    $0.0000003
5,000       15   9       135      $0.007     $0.0000014
15,000      5    25      125      $0.006     $0.0000004
150,000     5    250     1,250    $0.063     $0.0000004
1,000,000   5    1,667   8,335    $0.42      $0.0000004
$0.42 for one million rows. The per-row cost converges to L · Cλ / τ ≈ $0.0000004. Linear scaling. No step functions, no provisioned capacity, no GPUs.
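The cost table can be regenerated from the model. This sketch uses the approximation the table itself uses (W ≈ 2n/τ once n exceeds τ, a single worker otherwise); constants come from Section 3.2, and the function names are illustrative.

```python
import math

TAU = 1200
# C_lambda = M * t * p_GBs + p_invoke  (Section 3.2)
C_LAMBDA = 10 * 0.3 * 0.0000166667 + 0.0000002

def workers_per_layer(n, tau=TAU):
    # One leaf worker when n fits under tau; otherwise ~2n/tau tree nodes
    return 1 if n <= tau else math.ceil(2 * n / tau)

def total_cost(n, layers):
    return layers * workers_per_layer(n) * C_LAMBDA

for n, layers in ((250, 5), (150_000, 5), (1_000_000, 5)):
    print(n, workers_per_layer(n), round(total_cost(n, layers), 4))
```

At n = 1,000,000 and L = 5 this yields 1,667 workers per layer and ≈ $0.42, matching the last row of the table.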

4. Architecture: One Lambda, Self-Recursive

from algorithmeai import Snake

# invoke_lambda stands in for the Λ async-invoke in the original listing.
# A and B (two sampled members) and the helpers bucket_size, tau, global_dt,
# oppose, best_of_30, apply_literal, negate and parallel are bound in the
# worker's Lambda environment.
def v6_worker(population, indices, condition, seed):
    n = len(indices)

    # ── LEAF: hand off to Snake ──────────────────────
    if n <= bucket_size:
        local_pop = [population[i] for i in indices]
        model = Snake(local_pop, datatypes=global_dt,
                      n_layers=1, bucket=bucket_size)
        return model.layers[0]

    # ── SPLIT: one oppose, two children ──────────────
    if n > tau:
        lit = oppose(A, B)                       # O(1)
    else:
        lit = best_of_30(oppose)                 # O(n), n <= 1200

    left  = [i for i in indices if apply_literal(population[i], lit)]
    right = [i for i in indices if not apply_literal(population[i], lit)]

    # ── RECURSE: fire two Lambdas in parallel ────────
    left_buckets  = invoke_lambda(v6_worker, left,  condition + [lit])
    right_buckets = invoke_lambda(v6_worker, right, condition + [negate(lit)])

    return left_buckets + right_buckets

# Client: fire L layers in parallel, assemble model
layers = parallel([invoke_lambda(v6_worker, all_indices, seed=i) for i in range(L)])
model  = Snake.from_layers(population, layers)

The entire distributed training system is a recursive call to Snake(). Everything above the leaf is routing — binary oppose splits that partition the population into bucket-sized chunks. Everything at the leaf is Snake being Snake: build_bucket_chain, SAT construction, lookalike mapping. One import. One library. The Lambda recursion is just the parallelism strategy.

No conductor. No sub-conductors. No tree planning. Each invocation opens exactly 2 connections (to its children). No connection pool saturation. The tree shape emerges from the data.
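The self-recursive fan-out can be exercised locally with a thread pool standing in for Lambda. This is a toy sketch, not the production worker: the split is a placeholder halving rather than an oppose() literal, and `fire` is a hypothetical helper replacing the async invoke.

```python
from concurrent.futures import ThreadPoolExecutor

POOL = ThreadPoolExecutor(max_workers=64)
BUCKET = 75

def fire(fn, *args):
    """Stand-in for an async Lambda invoke: same worker, run in the pool."""
    return POOL.submit(fn, *args)

def worker(indices, condition=()):
    if len(indices) <= BUCKET:
        return [(condition, indices)]            # leaf: Snake would run here
    mid = len(indices) // 2                      # placeholder for an oppose() split
    left  = fire(worker, indices[:mid], condition + ("lit",))
    right = fire(worker, indices[mid:], condition + ("¬lit",))
    return left.result() + right.result()        # each node waits on exactly 2 children

buckets = worker(list(range(1000)))
assert sum(len(ix) for _, ix in buckets) == 1000   # leaves partition the input
```

Each invocation spawns exactly two children and blocks on exactly two futures, mirroring the "2 connections per node" property; no coordinator ever sees the whole tree.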

5. Variable Noise: Guaranteed SAT Coverage

Binary splitting produces leaves of varying size. The ELSE bucket at the end of a chain can have as few as 1 member. SAT construction requires negative examples — a 1-member bucket has none. The solution: variable noise injection.

noise_eff(k) = max(b − k, η · k) / k

Symbol   Definition
k        Core members in the bucket (actual items routed here)
b        Bucket size parameter (75)
η        Base noise ratio (0.25)

When k < b, the formula injects b − k noise members from the full population, padding the bucket to exactly b total members. When k ≥ b, standard noise ratio η applies.

k (core)   noise_eff   Total members   Negative examples
1          74.0        75              74 noise friends
10         6.5         75              65 noise friends
30         1.5         75              45 noise friends
75         0.25        94              19 noise (standard ratio)

Every bucket has at least b members. Every bucket has negative examples for SAT. No tautological fallback needed. The noise members come from the full population (different-class items) — genuine diversity, not synthetic padding.

The noise formula is the bridge between binary splitting (which creates arbitrary-sized leaves) and SAT construction (which needs both positive and negative examples). Without it, 1-member buckets produce degenerate predictions. With it, every bucket is a proper classification problem.
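The noise formula is two lines of code; the sketch below evaluates it with the paper's parameters (b = 75, η = 0.25) and reproduces the rows of the table above. Function names are illustrative.

```python
def effective_noise(k, b=75, eta=0.25):
    """noise_eff(k) = max(b - k, eta * k) / k  (Section 5)."""
    return max(b - k, eta * k) / k

def bucket_totals(k, b=75, eta=0.25):
    """Return (total members, negative examples) for a bucket with k core members."""
    noise = round(k * effective_noise(k, b, eta))
    return k + noise, noise

for k in (1, 10, 30, 75):
    print(k, effective_noise(k), bucket_totals(k))
```

Note the two branches of the max(): below b it pads to exactly b members (k + (b − k) = 75), and at k = b it switches to the standard η ratio (75 + 19 = 94).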

6. Snake v5.4.5: Enforced Datatypes

The final correctness barrier: leaf Snake instances re-detecting feature types from their local subset. A leaf with article codes [“12000”, “1004”] detects the column as integer; the global model treats it as text. Numeric literals on text columns crash apply_literal at inference.

v5.4.5 adds datatypes=[...] to Snake.__init__. When provided, type detection is skipped entirely. The caller (v6_worker) passes the globally-detected types to every leaf. All literals match the global schema.

# v5.4.4 (broken at scale):
Snake(local_pop, target_index="num_article")  # re-detects types per leaf

# v5.4.5 (correct):
Snake(local_pop, target_index="num_article",
      datatypes=["T", "T"])                    # enforced: both text

Lambda layer: algorithmeai-snake:5 (v5.4.5 + Cython x86_64).
Branch: Monce-AI/algorithmeai-snake v5.4.5-enforce-datatypes.

7. Condition Routing

Each binary split adds one literal to the path. The condition is the AND of all literals from root to leaf:

condition = lit₀ ∧ ¬lit₁ ∧ lit₂ ∧ …

At inference, traverse_chain evaluates all(apply_literal(X, lit) for lit in condition). Conditions are disjoint by construction (binary partition at each level). The ELSE bucket (condition = None) catches items that match no other condition.

Condition depth = D(n) = ⌈log2(n/τ)⌉ + depth within leaf chain. Typically 7–12 total literals for n = 150K. Short conditions → fast inference.
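A minimal sketch of condition evaluation at inference, assuming a simple (feature, value, polarity) literal encoding; the real traverse_chain and literal representation in the library may differ.

```python
def apply_literal(row, lit):
    """Illustrative literal: (feature_index, value, polarity)."""
    i, v, positive = lit
    return (row[i] == v) == positive

def traverse_chain(row, chains):
    """Return the first bucket whose AND-condition matches.

    `chains` is an ordered list of (condition, bucket); condition is a list
    of literals, or None for the ELSE bucket that catches everything else.
    """
    for condition, bucket in chains:
        if condition is None:
            return bucket
        if all(apply_literal(row, lit) for lit in condition):
            return bucket
    raise LookupError("no ELSE bucket configured")
```

Because conditions are disjoint by construction, the first match is the only match, and each lookup evaluates at most D(n) + leaf-chain literals.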

8. v5 → v6 Progression

                   v5 (conductor)               v6 (binary D&C)
Lambda types       3                            1
Split cost         O(30n) per literal           O(1) above τ, O(τ) below
150K wall clock    270s                         19.8s
150K cost          $0.14                        $0.063
Connection pool    Saturates at 1000            2 per node
Orchestration      Conductor + sub-conductors   Self-recursive
Perfect fit 150K   ✓ (with sub-conductors)      ✓ (native)

9. Open Questions

  1. OR routing (DNF conditions). Currently conditions are conjunctive (AND). Disjunctive conditions would allow 1-member fragments to be reachable via multiple tree paths, closing the residual 2/5000 edge case at lower layer counts.
  2. Sublinear friction. The effective hop cost λ̃ grows with n due to partition scans and Lambda overhead. Can this be reduced via payload compression or pre-sorted indices?
  3. 1M+ scaling. At n = 1M, D = 10 hops, ~1,667 workers per layer. Within the 10K concurrent limit. Projected wall clock: ~30s. Projected cost: $0.42. Untested.

Co-Authored-By: Claude (Anthropic)