Pricing

Coming soon

Simple, transparent pricing

No subscriptions, no minimums, no seat fees. Pay only for the tokens you use. Rates scale with model tier, and volume discounts kick in automatically.

Model            Input / 1M tokens   Output / 1M tokens   Context
wlfv-v1-flash    $0.12               $0.35                128K
wlfv-v1-code     $0.18               $0.45                200K
wlfv-v1-pro      $0.40               $0.90                262K

All prices in USD. No minimum spend. Volume rates (20% off) apply automatically above 50M tokens/month across your whole account.

Rate cards

Three ways to pay

Start pay-as-you-go and graduate to volume or batch rates as you scale. The same rate card underpins every option.

Pay as you go

Default
from $0.12 / 1M input tokens

Per-token billing with no commitment. Top up credits, or move to invoiced billing once your volume is consistent.

No minimum: $0
No seat fees: $0

Volume

20% off
auto-applied 50M+ / mo

Cross 50M tokens in a billing month and every rate drops 20% for the rest of the period. No contract needed.

Discount: 20%
Trigger: 50M tokens / month
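
As a rough illustration of the volume trigger, the sketch below totals a month of usage in billing order. It assumes that input and output tokens both count toward the 50M threshold, and that a request is discounted only if the account had already crossed the threshold before that request was metered. Both points are assumptions for illustration, not confirmed billing behavior, and monthly_bill is a hypothetical helper.

```python
VOLUME_THRESHOLD = 50_000_000  # tokens per billing month, from the rate card
VOLUME_DISCOUNT = 0.20         # 20% off once the threshold is crossed

# USD per 1M tokens as (input rate, output rate), from the rate card.
RATES = {
    "wlfv-v1-flash": (0.12, 0.35),
    "wlfv-v1-code": (0.18, 0.45),
    "wlfv-v1-pro": (0.40, 0.90),
}


def monthly_bill(requests):
    """Total USD for (model, input_tokens, output_tokens) tuples in billing order."""
    tokens_so_far = 0
    total = 0.0
    for model, input_tokens, output_tokens in requests:
        input_rate, output_rate = RATES[model]
        cost = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
        # Assumption: only usage metered after the threshold is crossed is discounted.
        if tokens_so_far >= VOLUME_THRESHOLD:
            cost *= 1 - VOLUME_DISCOUNT
        total += cost
        tokens_so_far += input_tokens + output_tokens
    return total


usage = [
    ("wlfv-v1-flash", 40_000_000, 10_000_000),  # reaches 50M tokens at list rates
    ("wlfv-v1-flash", 1_000_000, 0),            # billed at 20% off
]
print(f"${monthly_bill(usage):.3f}")
```

Under these assumptions the first entry is billed at list rates and the second at the discounted rate.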

Batch

50% off
for async jobs

Submit large, non-interactive jobs with up to 24h turnaround. Same models and quality at half the rate.

Discount: 50%
Turnaround: ≤ 24h
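
To compare standard and batch pricing for a bulk job, a small sketch like the following may help. The job_cost helper and the 10,000-document job are hypothetical. Whether the batch rate stacks with the volume discount is not stated here, so the sketch does not model that.

```python
BATCH_DISCOUNT = 0.50  # batch jobs are billed at half the standard rate


def job_cost(input_rate, output_rate, n_requests, input_tokens, output_tokens, batch=False):
    """USD cost of n identical requests; rates are in USD per 1M tokens."""
    per_request = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    total = per_request * n_requests
    # Apply the batch discount only when the job is submitted as a batch.
    return total * (1 - BATCH_DISCOUNT) if batch else total


# Hypothetical job: 10,000 documents on wlfv-v1-pro ($0.40 in, $0.90 out),
# each with 20,000 input tokens and 1,500 output tokens.
standard = job_cost(0.40, 0.90, 10_000, 20_000, 1_500)
batched = job_cost(0.40, 0.90, 10_000, 20_000, 1_500, batch=True)
print(f"standard ${standard:.2f}, batch ${batched:.2f}")
```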

Cost examples

What a request costs

Token counts vary by workload, but the math is simple: (input tokens × input rate + output tokens × output rate) ÷ 1,000,000, since rates are quoted per 1M tokens.
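
The same arithmetic as a minimal sketch. The rates are copied from the rate card; request_cost is a hypothetical helper for estimating costs, not part of any SDK.

```python
# USD per 1M tokens as (input rate, output rate), from the rate card.
RATES = {
    "wlfv-v1-flash": (0.12, 0.35),
    "wlfv-v1-code": (0.18, 0.45),
    "wlfv-v1-pro": (0.40, 0.90),
}


def request_cost(model, input_tokens, output_tokens):
    """Estimated USD cost of one request at list rates."""
    input_rate, output_rate = RATES[model]
    # Rates are per 1M tokens, so divide the weighted sum by 1,000,000.
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


for model, tokens_in, tokens_out in [
    ("wlfv-v1-flash", 1_500, 300),
    ("wlfv-v1-code", 4_000, 1_200),
    ("wlfv-v1-pro", 20_000, 1_500),
]:
    print(f"{model}: ${request_cost(model, tokens_in, tokens_out):.6f}")
```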

Chat message

Flash
~$0.0003 / req

1,500 input tokens (prompt + history) and 300 output tokens for a typical assistant reply.

Input (1.5K): $0.00018
Output (300): $0.00011

Code completion

Code
~$0.0013 / req

4,000 input tokens (file context) and 1,200 output tokens for a refactored function with comments.

Input (4K): $0.00072
Output (1.2K): $0.00054

Document analysis

Pro
~$0.0094 / req

20,000 input tokens (long document) and 1,500 output tokens for a structured summary and answers.

Input (20K): $0.008
Output (1.5K): $0.00135

FAQ

Pricing questions

The short version: you pay for tokens, you can switch models anytime, and there are no hidden fees.

Per request or per token?

Per token. Input and output are billed separately at each model's rate and metered to the exact token count, not a rounded estimate. Short requests cost fractions of a cent.

Can I switch models mid-project?

Yes. Change the model field in your request and billing follows automatically. No SDK change, no plan migration, no downtime.

How is volume tracked?

Aggregated across every model and project on your account within a calendar month. Hit 50M tokens and the 20% discount kicks in for the rest of that month.

Is there a free tier?

Waitlist members get a launch-period credit to try every model. After that, pay-as-you-go starts at $0 with no minimum spend and no seat fees.

Spend controls

Set per-project budgets and hard limits. We stop accepting requests when you hit your cap — no overruns, ever. Invoices break down usage by model and project.
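
For intuition, a hard limit can be modeled locally as in the sketch below. ProjectBudget and BudgetExceeded are hypothetical names, not part of any WLFV API, and the sketch assumes a request is refused when its cost would push spend past the cap. The service's actual enforcement may differ.

```python
from decimal import Decimal


class BudgetExceeded(Exception):
    """Raised when a charge would push spend past the hard limit."""


class ProjectBudget:
    def __init__(self, hard_limit_usd):
        # Decimal avoids float rounding drift when summing many small charges.
        self.hard_limit = Decimal(hard_limit_usd)
        self.spent = Decimal("0")

    def charge(self, cost_usd):
        """Record a charge and return the remaining budget, or refuse it."""
        cost = Decimal(cost_usd)
        # Assumption: refuse any request that would overrun the cap.
        if self.spent + cost > self.hard_limit:
            raise BudgetExceeded(f"limit {self.hard_limit} USD reached")
        self.spent += cost
        return self.hard_limit - self.spent


budget = ProjectBudget("1.00")
print(budget.charge("0.60"))  # remaining budget after the first charge
try:
    budget.charge("0.50")     # would overrun the cap, so it is refused
except BudgetExceeded as exc:
    print(f"rejected: {exc}")
```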

Enterprise rates?

For dedicated capacity, custom SLAs, or committed-use discounts, reach out after joining the waitlist. Same rate card, tailored terms.

Coming soon

Pricing activates at launch.

These rates take effect the moment WLFV AI opens. Join the waitlist to lock in early access and a launch-period credit to test every model tier.