How to Calculate Gross Margin Per User for AI SaaS Products

The formula, up front

Gross margin per user, for a single month, is one subtraction:

margin(user) = plan_revenue(user) - ai_cost(user)

  plan_revenue(user) = the subscription price they paid this month
  ai_cost(user)      = the sum of every LLM call they generated,
                       priced at each model's current token rate

That is the whole thing. If a customer on your $9 plan generated $2.10 of tokens, their margin is $6.90, which is 77 percent. Healthy. If a customer on the same $9 plan generated $40 of tokens, their margin is negative $31, which is a number you want to find before your investor does.
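As a minimal sketch of that subtraction, assuming revenue and cost are already known in dollars for the month (the grossMargin helper and its field names are illustrative, not part of any library):

```typescript
// Hypothetical helper: names are illustrative, not from any SDK.
interface UserMargin {
  margin: number;           // dollars for the month
  marginPct: number | null; // null when revenue is 0 (for example a free plan)
}

function grossMargin(planRevenue: number, aiCost: number): UserMargin {
  const margin = planRevenue - aiCost;
  // A percentage is undefined on zero revenue, so report null instead.
  const marginPct = planRevenue > 0 ? (margin / planRevenue) * 100 : null;
  return { margin, marginPct };
}

const healthy = grossMargin(9, 2.1);   // margin ≈ 6.90, about 77 percent
const underwater = grossMargin(9, 40); // margin = -31
```

Returning null for the percentage on a free plan keeps a zero-revenue user from showing up as a division error or an infinite loss.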

Note the word gross. This ignores your fixed costs (hosting, salaries, the rest). It is deliberately just revenue minus the direct, variable cost of serving that user. For an AI product the dominant variable cost is inference, so this simple version is the number that actually moves from one user to the next.

Why the total API bill tells you nothing useful

Your OpenAI invoice is one aggregate number. Say it reads $4,200 this month. That figure cannot tell you that 9 of your 800 users account for half of it. It cannot tell you that one of those 9 is on your cheapest plan. It averages everything into a single line, and the average is a lie: your users are not average.

This is the structural quirk of AI SaaS. In classic software your revenue is variable (per seat, per usage) and your cost per user is roughly flat. AI flips that. Revenue is flat (the subscription price) and cost is wildly variable, because one power user hammering an expensive model 80 times a day can cost you a hundred times more than a light user who checks in twice a week. Same price on the invoice you send them, radically different cost on the invoice you receive.

The aggregate hides exactly the thing you need to act on: which specific accounts are underwater. You cannot raise a price, cap a plan, or upsell an average. You act on individual users, so you have to measure individual users.

How to calculate it manually from your logs

The good news is that the raw data exists. Every LLM provider returns token counts on the response. You just have to capture them, price them, and join them to revenue. Here is the manual path.

Step 1: Log every call with the user, model, and tokens

Wrap every call to openai.chat.completions.create() so you record who made it, which model ran, and how many tokens went in and out:

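// Make the call as usual; the response carries the token counts.
const response = await openai.chat.completions.create(params)

// Record who made the call, which model ran, and the tokens in and out.
await db.aiCallLog.insert({
  userId: req.user.id,
  feature: 'ai-summary',
  model: params.model,
  inputTokens: response.usage.prompt_tokens,
  outputTokens: response.usage.completion_tokens,
  createdAt: new Date(),
})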
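One detail worth guarding against: depending on how the call is made (a streamed response, for example), the usage object may be missing. A small pure helper, sketched here with hypothetical names, can build the log row and record the gap rather than silently writing zero tokens:

```typescript
// Shape of the token counts reported on response.usage.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
}

// Hypothetical row shape mirroring the insert shown above.
interface AiCallLogRow {
  userId: string;
  feature: string;
  model: string;
  inputTokens: number | null;  // null means usage was not reported
  outputTokens: number | null;
  createdAt: Date;
}

function toLogRow(
  userId: string,
  feature: string,
  model: string,
  usage: Usage | null | undefined,
  now: Date = new Date()
): AiCallLogRow {
  return {
    userId,
    feature,
    model,
    // Keep "unknown" distinct from "zero tokens" so gaps stay visible.
    inputTokens: usage ? usage.prompt_tokens : null,
    outputTokens: usage ? usage.completion_tokens : null,
    createdAt: now,
  };
}
```

Rows with null token counts can then be counted separately, so you know how much of a user's cost is unmeasured instead of assuming it is free.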

Step 2: Price the tokens, group by user, subtract revenue

Copy the per-million-token rates from the OpenAI pricing page. For gpt-4o today that is $2.50 per million input tokens and $10 per million output. Then do the whole calculation in one query: price each row, sum by user, join the subscription price, and subtract.

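WITH cost AS (
  -- Price each logged call, then total per user for the current month to date.
  SELECT
    user_id,
    SUM(
      CASE model
        -- USD per million tokens. A model with no WHEN branch yields NULL,
        -- which SUM skips, so add a branch for every model you call.
        WHEN 'gpt-4o'      THEN input_tokens  * 2.50 / 1e6 + output_tokens * 10.00 / 1e6
        WHEN 'gpt-4o-mini' THEN input_tokens  * 0.15 / 1e6 + output_tokens *  0.60 / 1e6
      END
    ) AS ai_cost
  FROM ai_call_log
  WHERE created_at >= date_trunc('month', now())
  GROUP BY user_id
)
SELECT
  s.user_id,
  s.plan_price                    AS revenue,
  cost.ai_cost,
  s.plan_price - cost.ai_cost     AS margin
FROM subscriptions s
-- Inner join: users with no logged calls this month are left out.
JOIN cost ON cost.user_id = s.user_id
ORDER BY margin ASC;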

Order by margin ascending and the losers float to the top. The first few rows of that result are your entire problem, named. That is the report you actually want, and for a weekend project on a single provider, this query is genuinely enough.
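If you would rather price calls in application code, the same step can be sketched as a lookup table plus one function. This assumes the two rates quoted above; RATES and costOfCall are hypothetical names. It throws on a model missing from the table, so an unpriced model cannot slip through as zero cost:

```typescript
// USD per one million tokens, copied from the rates quoted above.
// These go stale whenever the provider changes its pricing.
interface Rate {
  inputPerM: number;
  outputPerM: number;
}

const RATES: Record<string, Rate> = {
  "gpt-4o": { inputPerM: 2.5, outputPerM: 10 },
  "gpt-4o-mini": { inputPerM: 0.15, outputPerM: 0.6 },
};

// Hypothetical helper: dollar cost of one logged call.
function costOfCall(model: string, inputTokens: number, outputTokens: number): number {
  if (!Object.prototype.hasOwnProperty.call(RATES, model)) {
    throw new Error("No price configured for model: " + model);
  }
  const rate = RATES[model];
  return (inputTokens * rate.inputPerM + outputTokens * rate.outputPerM) / 1e6;
}

// 1,000 input and 500 output tokens on gpt-4o:
// (1000 * 2.50 + 500 * 10) / 1e6 = 0.0075 dollars
const sampleCost = costOfCall("gpt-4o", 1000, 500);
```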

Why this breaks the moment you grow

The query works for a week. Then reality starts chewing on it.

The prices are hard-coded, and providers change them every few months. The day OpenAI adjusts a rate, your margin numbers go quietly wrong and nothing warns you. You add Anthropic for one feature and Gemini for another, and now that tidy CASE statement has to know every model on three providers, each of which counts and bills tokens a little differently.
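One way to keep history honest when a rate changes is to store prices with an effective date and price each call at the rate in force when it ran. A minimal sketch: the rateOn helper is hypothetical, and the dates and the second rate below are placeholders for illustration, not real OpenAI pricing history.

```typescript
interface DatedRate {
  effectiveFrom: string; // ISO date (YYYY-MM-DD), inclusive
  inputPerM: number;     // USD per million input tokens
  outputPerM: number;    // USD per million output tokens
}

// PLACEHOLDER DATA: the dates and the second entry are made up.
const GPT4O_RATES: DatedRate[] = [
  { effectiveFrom: "2024-01-01", inputPerM: 2.5, outputPerM: 10 },
  { effectiveFrom: "2025-07-01", inputPerM: 2.0, outputPerM: 8 },
];

// Return the most recent rate that was already in effect on callDate.
function rateOn(rates: DatedRate[], callDate: string): DatedRate {
  let match: DatedRate | null = null;
  for (const r of rates) {
    // ISO dates in YYYY-MM-DD form compare correctly as plain strings.
    if (r.effectiveFrom <= callDate && (match === null || r.effectiveFrom > match.effectiveFrom)) {
      match = r;
    }
  }
  if (match === null) {
    throw new Error("No rate in effect on " + callDate);
  }
  return match;
}
```

With this shape a price change becomes a new row rather than an edit, so last month's margins do not shift when this month's rate does.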

Revenue gets messy too. A customer upgrades on the 14th, and plan_price is now a half-and-half thing your query does not model. Someone churns mid-month. Someone is on an annual deal you have to divide by twelve. The date_trunc that felt clean now needs proration logic bolted on.
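To give a sense of what that proration logic involves, here is a sketch that weights each plan by the days spent on it. Day-weighted proration is an assumption made for illustration (your billing provider may prorate differently), and the helper names are hypothetical.

```typescript
// One stretch of the month spent on a single plan.
interface PlanSegment {
  monthlyPrice: number; // dollars per full month
  days: number;         // days on this plan within the month
}

// Hypothetical helper: day-weighted revenue for one month.
function proratedRevenue(segments: PlanSegment[], daysInMonth: number): number {
  let total = 0;
  for (const s of segments) {
    total += s.monthlyPrice * (s.days / daysInMonth);
  }
  return total;
}

// An annual deal contributes one twelfth of its price each month.
function monthlyFromAnnual(annualPrice: number): number {
  return annualPrice / 12;
}

// Upgrade on the 14th of a 30-day month: 13 days on the $9 plan,
// 17 days on the $49 plan.
// 9 * 13/30 + 49 * 17/30 = 3.90 + 27.77 ≈ 31.67
const upgraded = proratedRevenue(
  [
    { monthlyPrice: 9, days: 13 },
    { monthlyPrice: 49, days: 17 },
  ],
  30
);
```

A user who churns mid-month is the same calculation with a single segment shorter than the month.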

The query slows down, so you add indexes. Still slow, so you add a materialized view and a refresh job. Then you spend a Friday afternoon debugging why one upgraded customer shows the wrong revenue while your co-founder waits on the number for a board deck. This is the point where most teams give up and go back to squinting at the OpenAI total, which is where they started.

How Weckr computes margin per user for you

Weckr collapses all of the above into wrapping your client. Install it and pass your plan prices once:

npm install @weckr/sdk

import { Weckr } from '@weckr/sdk'

const wk = new Weckr({
  apiKey: process.env.WECKR_API_KEY,
  plans: { free: 0, starter: 9, pro: 49 },
})

const response = await wk.chat(openai, {
  model: 'gpt-4o',
  messages: [{ role: 'user', content: prompt }],
  userId: user.id,
  feature: 'ai-summary',
  plan: user.plan,
})

Every call now records the cost, computed server-side from the returned token counts and the model's live pricing, so it never goes stale when a provider changes a rate. Revenue comes from the plan you passed in. Margin is just revenue minus cost, exactly the formula at the top of this page, done for every user automatically. The response you get back is identical to calling OpenAI directly, and the logging fires asynchronously, so you add zero latency.

In the dashboard you see margin per user sorted worst-first, so the accounts costing you money are the first thing on screen. Expand any user and you get a per-feature cost breakdown, which is how you find the one endpoint on the expensive model doing the damage. The same view works across OpenAI, Anthropic, and Gemini, with token counts normalized so the numbers are comparable. You can watch it on seeded data, no signup, at useweckr.com/demo.

A real example: the $9 plan costing $40 a month

Put concrete numbers on it. A customer sits on your $9 starter plan and genuinely loves your summarization feature. She runs it about 50 times a day, and each summary costs roughly $0.027 in gpt-4o tokens.

50 calls/day x 30 days x $0.027  =  ~$40.50 / month in AI cost
you charge her                   =    $9.00 / month
margin                           =   -$31.50 / month

She is a delighted customer and a $31.50 monthly hole. If even 5 percent of 800 users look like her, that is 40 accounts times $31.50, about $1,260 in pure margin loss every month, none of which appears anywhere on the OpenAI total. With margin per user in front of you, she surfaces at the top of the worst-first list within hours instead of never.
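The arithmetic above is easy to rerun with your own numbers. A sketch using the figures from this example (the function name is illustrative):

```typescript
// Monthly AI cost for a user who makes the same call every day.
function monthlyAiCost(callsPerDay: number, days: number, costPerCall: number): number {
  return callsPerDay * days * costPerCall;
}

const herCost = monthlyAiCost(50, 30, 0.027); // ≈ 40.50
const herMargin = 9 - herCost;                // ≈ -31.50

// Cohort view: 5 percent of 800 users with the same profile.
const lookalikes = Math.round(800 * 0.05);    // 40 accounts
const cohortLoss = lookalikes * -herMargin;   // ≈ 1260 dollars per month
```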

Then you have three real options, and now you can pick with the number in hand:

Raise the price. If her plan cannot survive her usage, the plan is mispriced. A usage-aware tier or a higher floor fixes the whole cohort at once, not just her.

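Cap and downgrade. Set a spending ceiling below the plan price so that once a user crosses, say, $5 of AI cost in a month, their calls route to gpt-4o-mini. At $0.15 and $0.60 per million tokens instead of $2.50 and $10, each later call costs 6 percent of what it did, so her month lands near $7 instead of $40.50 and her margin turns positive without a hard block. A ceiling at the full $9 would still leave her slightly underwater, because the cheaper calls keep adding cost after the switch.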

Upsell. A customer using a feature 50 times a day is not a problem, she is your best lead. Move her to the $49 tier and the same usage becomes healthy revenue. You can only make that pitch if you know she exists.

Every one of those moves depends on the same thing: computing margin for the individual user, not the aggregate. That is the entire point of measuring it.
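The cap-and-downgrade option is the one that benefits most from a quick simulation, because where the ceiling sits relative to the plan price decides whether the month ends above or below break-even. A minimal sketch, assuming every call costs the same and using the rates quoted above; pickModel and simulateMonth are hypothetical names:

```typescript
// Hypothetical router: once spend reaches the ceiling, use the cheaper model.
function pickModel(spentThisMonth: number, ceiling: number): string {
  return spentThisMonth >= ceiling ? "gpt-4o-mini" : "gpt-4o";
}

// Simulate a month of identical calls and return total AI cost in dollars.
// Both gpt-4o-mini rates quoted above are 6 percent of the gpt-4o rates,
// so a call costs 6 percent as much after the switch.
function simulateMonth(calls: number, costPerCall4o: number, ceiling: number): number {
  const costPerCallMini = costPerCall4o * (0.15 / 2.5);
  let spent = 0;
  for (let i = 0; i < calls; i++) {
    spent += pickModel(spent, ceiling) === "gpt-4o" ? costPerCall4o : costPerCallMini;
  }
  return spent;
}

const uncapped = simulateMonth(1500, 0.027, Infinity); // ≈ 40.50
const capAt5 = simulateMonth(1500, 0.027, 5);          // ≈ 7.15, under the $9 plan
const capAt9 = simulateMonth(1500, 0.027, 9);          // ≈ 10.91, still over $9
```

The ceiling is checked before each call, so spend overshoots it by at most one gpt-4o call before the cheaper model takes over.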

FAQ

How do I know which users are unprofitable in my SaaS?

Compute margin per user: their plan revenue for the month minus the total LLM cost they generated that month. Sort the list worst-first. Any user whose AI cost exceeds their subscription price is unprofitable, and the ones at the top of that list are the ones quietly eating your margin.

What is the formula for margin per user in an AI SaaS product?

Margin per user = (their subscription revenue that month) - (sum of their LLM cost that month). Cost is the token usage of every model they touched, priced at that model's current rate. If a user pays $9 and burns $40 of tokens, their margin is -$31.

How do I get cost per user from the OpenAI API?

The OpenAI billing dashboard reports spend for your account, not for the end users of your app, so you have to attribute cost yourself. Log every call with the userId, model, and input/output token counts from response.usage, multiply tokens by the per-model price, then group by user. Weckr does this for you by wrapping the client and computing cost server-side.

What gross margin should an AI SaaS company target?

Roughly 60 to 70 percent gross margin is a healthy target for an AI SaaS product. Classic software runs higher, closer to 80 percent, but inference is a real variable cost that pulls the number down. If your blended margin is below 50 percent, a handful of heavy users is usually the cause.

How do I find out which features cost the most to run?

Tag each LLM call with a feature name, then group cost by feature the same way you group by user. Weckr captures a feature field on every call, so expanding a user in the dashboard shows a per-feature cost breakdown. That is usually how you discover one endpoint on an expensive model is responsible for most of the bill.
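If you are doing this by hand, the grouping step can be sketched as follows, assuming each logged call has already been priced in dollars (the PricedCall shape and costByFeature name are illustrative):

```typescript
interface PricedCall {
  feature: string;
  cost: number; // dollars, already priced from token counts
}

// Hypothetical helper: total cost per feature tag.
function costByFeature(calls: PricedCall[]): Record<string, number> {
  // A prototype-free object, so any feature name is a safe key.
  const totals: Record<string, number> = Object.create(null);
  for (const c of calls) {
    totals[c.feature] = (totals[c.feature] || 0) + c.cost;
  }
  return totals;
}

const byFeature = costByFeature([
  { feature: "ai-summary", cost: 0.027 },
  { feature: "ai-summary", cost: 0.027 },
  { feature: "chat", cost: 0.004 },
]);
// byFeature["ai-summary"] ≈ 0.054, byFeature["chat"] ≈ 0.004
```

Filter the calls to one user first and the same function gives that user's per-feature breakdown.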

See margin per user without the spreadsheet

The formula is trivial. The plumbing to keep it accurate across live pricing, three providers, and mid-month plan changes is what actually costs you a Friday. You can build and maintain that yourself, or you can wrap your client in Weckr and have margin per user, sorted worst-first, tonight. Open the live demo at useweckr.com/demo (no signup) and see the per-user view on real-looking data.