Hypernym × Freezone | Inference Router

HYPERNYM×Freezone

routerislandssanctumtundraLIVE

MODELS

One OpenAI-compatible endpoint. Pass model in the request body.

Hypernym-enhanced

Self-hosted

Passthrough

NOMINAL

Router state

enhancedfastest

hypernym/llama-3.1-8b-compressed

Llama 3.1 8B · Compressed

ctx 128kin $0.060/Mout $0.24/M

semantic-compressionp-span-shearcost-optimized

Llama-3.1-8B routed through Hypernym semantic compression. Pre-summarises long contexts at the shear boundary so you pay for fewer tokens with the same downstream answer.

enhancedflagship

hypernym/llama-3.1-70b-academy

Llama 3.1 70B · Academy

ctx 128kin $0.45/Mout $0.85/M

affinity-routingisland-cascadeauto-failover

Llama-3.1-70B served via the Tolarian Academy affinity router. Concurrent=80, automatic burst-island scaling under load.

hosted

hypernym/glm-4.7-cerebras

GLM 4.7 · Cerebras

ctx 33kin $0.30/Mout $0.60/M

cerebras-directsub-second-ttft

Wafer-scale inference on Cerebras with concurrent=1000. Tuned for TTFT under 200ms.

passthrough

openai/gpt-4o

GPT-4o

ctx 128kin $2.50/Mout $10.00/M

OpenAI passthrough. Same upstream price; Freezone adds usage and routing only.

passthrough

anthropic/claude-sonnet-4-6

Claude Sonnet 4.6

ctx 200kin $3.00/Mout $15.00/M

Anthropic passthrough. Useful when you want Sonnet but want one bill across providers.

enhancedembedding

hypernym/bge-m3-sanctum

BGE-M3 · Sanctum

ctx 8kin $0.013/M

cache-hit-90plain-failover

Embeddings via Serra's Sanctum — 304K-row text-cache short-circuits before the GPU on hits. 1024 dimensions.

enhancedliftbeta

hypernym/moxruby-lift

MoxRuby · Lift

ctx 32kin $1.50/M

multi-axis-extractionmountain-routed

Hypernym-native "lift" primitive — extracts content along configurable axes via the splash-mountain backend.