Self-Service Vitrage: 10 Snake Models, One Search

SAT-based explainable product matching for the glass industry — Charles Dana, Monce SAS, 2026

1. The Dana Theorem

Theorem (Dana, 2024)

Any indicator function over a finite discrete domain can be encoded as a SAT instance in polynomial time. Decision tree bucketing reduces this to linear time.

Snake is the constructive proof. Given training data, it builds a CNF formula where each clause captures a distinguishing feature between classes. No exponential search, no backtracking. The formula is built directly from the data in O(L × n × b × m) time.

At inference, the formula is evaluated (also polynomial): each sample is routed through the decision tree, matched against SAT clauses in its bucket, and classified by lookalike voting. The NP-hardness of SAT is irrelevant — Snake never solves SAT, it constructs and evaluates structured formulas.
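The construct-and-evaluate idea can be sketched in a few lines. This is a hypothetical illustration, not Snake's actual data structures: a "clause" here is simply the conjunction of one training sample's feature literals, and classification scores clauses by satisfied literals before voting among the best matches ("lookalikes").

```python
from collections import Counter

def build_clauses(samples):
    """One clause per training sample: the conjunction of its feature
    literals. Linear in the data -- no search, no backtracking."""
    return [(frozenset(features.items()), label) for features, label in samples]

def classify(features, clauses, k=1):
    """Score each clause by how many literals the sample satisfies,
    then vote among the k best-matching clauses (the "lookalikes")."""
    literals = set(features.items())
    scored = sorted(clauses, key=lambda c: len(c[0] & literals), reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

train = [
    ({"contains_feuil": 1, "len_ge_8": 1}, "feuillete"),
    ({"contains_tremp": 1, "len_ge_8": 1}, "trempe"),
]
clauses = build_clauses(train)
print(classify({"contains_feuil": 1, "len_ge_8": 1}, clauses))  # feuillete
```

Both construction and evaluation are polynomial passes over the data, which is the point of the theorem: nothing here ever calls a SAT solver.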

2. The Pipeline: 10 Models, One Search

A single search query passes through a pipeline of specialized Snake models. Each model is small, fast, and focused on one classification task:

1. typo_corrector: 26 classes (glass terms), 99.4% accuracy — fixes misspellings word-by-word via SAT clauses on character features
2. complexity: 2 classes (simple/complex), 100% accuracy — routes to Snake-direct or LLM+Snake mode
3. product_matcher: 20 classes (products), 93.3% accuracy — the core search, returns ranked candidates with SAT confidence
4. glass_type: 5 classes, 97.5% — feuillete/float/trempe/lowe/miroir
5. treatment: 5 classes, 97.5% — securite/thermique/acoustique/solaire/none
6. thickness: 6 classes, 100% — 4mm/6mm/8mm/10mm/33.1/44.2
7. gas_type: 4 classes, 95.8% — argon/krypton/air/none
8. position: 3 classes, 100% — exterieur/interieur/intercalaire
9. intent: 3 classes, 100% — simple_search/composition_igu/besoin_technique
10. stock_priority: 3 classes, 100% — standard/urgent/sur_mesure
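The ten models compose into one query path. The sketch below assumes a model registry keyed by the names above and an optional LLM hook; the routing logic is an illustration of the pipeline description, not the actual Monce code.

```python
# Facet models run after the core match; names mirror the list above.
FACETS = ("glass_type", "treatment", "thickness",
          "gas_type", "position", "intent", "stock_priority")

def search(query, models, llm_extract=None):
    # 1. Fix typos word-by-word with the typo_corrector model.
    cleaned = " ".join(models["typo_corrector"](w) for w in query.split())
    # 2. Route: complex queries may pass through the LLM first.
    if models["complexity"](cleaned) == "complex" and llm_extract is not None:
        cleaned = llm_extract(cleaned)
    # 3. Core product search, then the seven auxiliary facet models.
    candidates = models["product_matcher"](cleaned)
    facets = {name: models[name](cleaned) for name in FACETS}
    return candidates, facets
```

Each model stays small and single-purpose; the pipeline is just function composition over the shared, typo-corrected query.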

3. Typo Correction as Classification

The typo_corrector model has 26 classes — one per common glass industry term (feuillete, trempe, acoustique, planitherm, ...). The input is a single word. Snake's SAT clauses learn character-level patterns:

"if word contains 'feuil' AND length >= 8 AND word contains 'e' → feuillete"
"if word contains 'tremp' AND word does NOT contain 'i' → trempe"
"if word contains 'plan' AND word contains 'th' → planitherm"
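The three example clauses can be written out as plain predicates. This is a hand-coded sketch to show the shape of the learned rules; Snake builds such clauses from data rather than by hand.

```python
# Each rule: (clause over character features, target term).
RULES = [
    (lambda w: "feuil" in w and len(w) >= 8 and "e" in w, "feuillete"),
    (lambda w: "tremp" in w and "i" not in w, "trempe"),
    (lambda w: "plan" in w and "th" in w, "planitherm"),
]

def correct(word):
    for clause, term in RULES:
        if clause(word):
            return term
    return word  # no clause fires: pass the word through unchanged

print(correct("feuilete"))  # feuillete (one dropped letter still matches)
```

A single-character corruption usually leaves enough substring evidence for a clause to fire; this is also why a double corruption can defeat the features, as the "isoian" error below shows.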

99.4% accuracy on the holdout set. The one error: "isoian" (a double-corrupted "isolant") was classified as solaire. At two mutations from the target, the character features collapse.

4. Latency Budget

Step                               Time       Budget used
typo_corrector (3 words)           1.3 ms     0.7%
complexity router                  0.5 ms     0.3%
product_matcher                    1.0 ms     0.5%
7 auxiliary models                 2.6 ms     1.3%
Total Snake inference              5.4 ms     2.7%
Headroom for LLM (complex only)    194.6 ms   97.3%

The 200ms budget is 97% headroom. Simple queries (the vast majority) complete in under 6ms. Complex queries (IGU compositions) use Claude Haiku for semantic extraction when ANTHROPIC_API_KEY is set, then Snake for component matching.

Without Haiku: complex queries fall back to Snake-only extraction. The auxiliary models (intent, glass_type, treatment, thickness, gas_type, position) assemble a structured extraction from keyword signals. No LLM call, no API key needed. Quality is lower on ambiguous natural language, but the pipeline never crashes and always returns a viable payload. See /genesis for the full contract.
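The degradation contract can be sketched as a single branch. Function and parameter names here are illustrative, not the actual Monce API; the only source-backed facts are the ANTHROPIC_API_KEY gate and the Snake-only fallback.

```python
import os

def extract(query, snake_models, haiku_extract=None):
    """Use the LLM path when a key is available, else assemble the
    payload from the auxiliary Snake models. Never raises for a
    missing key -- the fallback is the contract."""
    if os.environ.get("ANTHROPIC_API_KEY") and haiku_extract is not None:
        return haiku_extract(query)  # semantic extraction via Haiku
    # Snake-only fallback: each auxiliary model contributes one field,
    # so quality degrades but a structured payload is always returned.
    return {name: model(query) for name, model in snake_models.items()}
```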

5. The "44" Problem

Query: feuillete 44 clair. Expected: Feuillete 44.2 clair (#66019). Actual top result: Float 4mm clair (#44020, 35%).

Why: "44" without ".2" shares the digit "4" with "4mm" in Float products, and the model sees character features, not glass industry conventions. In real Monce data, "44" always means "44.2" (laminated glass notation); the synthetic training data lacks this implicit context.

Fix: retrain on real search logs where "44" co-occurs with "feuillete" in 100% of cases, giving Snake the statistical signal to build the right clause.
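One way the retrained signal could surface is as a co-occurrence feature over the query, so a clause can condition on "44 next to feuillete" rather than on the bare digit. The featurizer below is an assumption for illustration, not Snake's real feature set.

```python
def features(query):
    """Token-level features; the co-occurrence literal is the one the
    current synthetic training data cannot supply (hypothetical)."""
    words = query.lower().split()
    return {
        "has_44": "44" in words,
        "has_feuillete": "feuillete" in words,
        # encodes the industry convention: "44" with "feuillete" means 44.2
        "44_with_feuillete": "44" in words and "feuillete" in words,
    }

print(features("feuillete 44 clair")["44_with_feuillete"])  # True
```

With real search logs, this literal would hold in 100% of "44" queries, giving the clause builder a clean distinguishing feature against Float 4mm.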