
Integration guide

Using Weckr with LangChain

There are two ways to instrument LangChain for cost tracking: hook into LangChain's callback system, or wrap the underlying provider client. Today Weckr supports only the second.

Honesty up front. Native LangChain CallbackHandler integration is on the roadmap. Today the cleanest pattern is to wrap the underlying OpenAI client with wk.chat() at the boundary (the place where you actually make the LLM call) rather than wrapping the LangChain abstractions. The example below shows exactly how, and it works with the SDK that's on PyPI right now.

The pattern

LangChain's ChatOpenAI is a useful wrapper for chains, prompt templates, and tool calling. But under the hood it eventually calls the OpenAI client. If you want Weckr to see those calls (and enforce per-user spending caps), the simplest path is to bypass the LangChain abstraction at the point of the actual LLM round trip and use wk.chat(client, params) directly.

In practice this means:

  • Keep using LangChain for what it's good at (orchestration, prompt assembly, output parsers, retrieval).
  • When you reach the leaf node where the LLM round trip happens, call wk.chat(openai_client, {...}) instead of relying on LangChain's internal request.
  • Feed the result back into LangChain (or just use it directly, which is often what you wanted anyway).

Install

bash
pip install weckr-sdk openai langchain-openai langchain-core

Get a wk_ API key from app.useweckr.com/auth/signup.
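
Rather than pasting the key into source, it can be read from the environment. A minimal sketch, assuming a variable named WECKR_API_KEY; that name and the load_weckr_key helper are illustrative and not part of the SDK:

```python
import os


def load_weckr_key(env=os.environ, var="WECKR_API_KEY"):
    """Return the Weckr key from the environment, or raise a clear error.

    WECKR_API_KEY is an assumed variable name, not one the SDK defines.
    """
    key = env.get(var, "")
    # Keys issued by Weckr start with "wk_"; catch an empty or wrong value early.
    if not key.startswith("wk_"):
        raise RuntimeError(f"{var} is missing or does not look like a wk_ key")
    return key
```

The returned value would then be passed as api_key when constructing the Weckr client, in place of a hard-coded string.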

Working example

A LangChain chain that uses Weckr for the actual model call. The prompt template, output parser, and chain composition stay LangChain-native. Only the LLM round trip goes through wk.chat().

python
import openai
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda
from weckr import Weckr

# 1. One openai client and one Weckr client, shared across requests.
openai_client = openai.OpenAI()
wk = Weckr(
    api_key="wk_...",                # from app.useweckr.com
    plans={"free": 0, "pro": 29},    # your plan prices in USD
)

# 2. A LangChain prompt template (LangChain stays useful here).
prompt = ChatPromptTemplate.from_messages([
    ("system", "You summarise documents in one sentence."),
    ("user", "{document}"),
])
parser = StrOutputParser()

def call_llm_via_weckr(messages, *, user_id: str, plan: str):
    """The leaf node: take LangChain-rendered messages and run them through wk.chat()."""
    result = wk.chat(openai_client, {
        "model": "gpt-4o-mini",
        # LangChain types are "human"/"ai"; OpenAI expects "user"/"assistant".
        "messages": [{"role": {"human": "user", "ai": "assistant"}.get(m.type, m.type),
                      "content": m.content}
                     for m in messages.to_messages()],
        "user_id": user_id,
        "feature": "doc-summary",
        "plan": plan,
    })
    return result.choices[0].message.content

# 3. Compose with LangChain's runnable interface: prompt | Weckr leaf | parser.
def summarise(document: str, *, user_id: str, plan: str) -> str:
    # Bind the per-request attribution into the leaf node.
    llm = RunnableLambda(
        lambda messages: call_llm_via_weckr(messages, user_id=user_id, plan=plan)
    )
    chain = prompt | llm | parser
    return chain.invoke({"document": document})

# 4. Use it.
print(summarise(
    "Weckr is AI cost and margin intelligence for SaaS founders.",
    user_id="u_42",
    plan="pro",
))
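
LangChain message types do not all match OpenAI role names ("human" vs "user", "ai" vs "assistant"), which matters once a prompt carries chat history. A standalone sketch of the conversion, using a stand-in message class so it runs without LangChain installed; to_openai_messages, ROLE_BY_TYPE, and FakeMessage are hypothetical names, not SDK or LangChain APIs:

```python
from dataclasses import dataclass

# LangChain message .type values on the left, OpenAI chat roles on the right.
ROLE_BY_TYPE = {"system": "system", "human": "user", "ai": "assistant"}


@dataclass
class FakeMessage:
    """Stand-in exposing the two attributes the conversion reads."""
    type: str
    content: str


def to_openai_messages(messages):
    """Convert objects with .type and .content into OpenAI-style dicts."""
    converted = []
    for m in messages:
        role = ROLE_BY_TYPE.get(m.type)
        if role is None:
            # Fail loudly instead of sending a role the API would reject.
            raise ValueError(f"unsupported message type: {m.type!r}")
        converted.append({"role": role, "content": m.content})
    return converted


history = [
    FakeMessage("system", "You summarise documents in one sentence."),
    FakeMessage("human", "Summarise the attached report."),
    FakeMessage("ai", "The report covers Q3 revenue."),
    FakeMessage("human", "Shorter, please."),
]
converted = to_openai_messages(history)
```

The sketch rejects tool messages on purpose: they need extra fields such as a tool call id, so a plain role/content dict would not be enough for them.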

What you get

  • Per-user cost and margin on the dashboard (the call is attributed to u_42 on the pro plan).
  • Spending caps enforced before the LLM call. If u_42 is over their monthly budget, wk.chat() raises WeckrCapError before any tokens are spent.
  • Model recommendations based on real token volume per feature.

Handling caps

python
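from weckr import is_weckr_cap_error

def summarise_for(user, doc):
    """Return a summary, or an upgrade payload if the user is over their cap."""
    try:
        return {"summary": summarise(doc, user_id=user.id, plan=user.plan)}
    except Exception as err:
        if is_weckr_cap_error(err):
            # User hit their monthly cap. Show an upgrade prompt.
            return {"error": "AI spend cap reached", "plan": err.plan_name}
        raise  # anything else is a real failure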

Short-lived processes

If you call your LangChain chain inside a Lambda, cron job, or CLI script, call wk.flush() before exit so the in-flight log POST completes:

python
def handler(event, context):
    summary = summarise(event["doc"], user_id=event["userId"], plan="pro")
    wk.flush()              # wait up to 5s for the log POST
    return {"summary": summary}
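
If the chain raises after the model call (for example in a later processing step), the handler above exits before reaching wk.flush(). One way to cover that case is a try/finally. A sketch with the dependencies passed in so it runs without the SDK; make_handler and FakeWeckr are hypothetical names, and the only SDK method assumed is flush():

```python
def make_handler(summarise_fn, wk_client):
    """Build a Lambda-style handler that always flushes before returning."""
    def handler(event, context):
        try:
            summary = summarise_fn(event["doc"], user_id=event["userId"], plan="pro")
            return {"summary": summary}
        finally:
            # Runs on success and on error, so a log POST that is already
            # in flight still gets a chance to complete before the process exits.
            wk_client.flush()
    return handler


class FakeWeckr:
    """Stand-in with the one method the handler needs."""
    def __init__(self):
        self.flushed = 0

    def flush(self):
        self.flushed += 1


fake_wk = FakeWeckr()
handler = make_handler(lambda doc, *, user_id, plan: doc[::-1], fake_wk)
result = handler({"doc": "abc", "userId": "u_42"}, None)
```

In a real deployment summarise_fn would be the summarise function from the working example and wk_client the shared Weckr client.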

Roadmap

Native LangChain CallbackHandler integration is planned. It will let you register WeckrCallbackHandler(user_id=..., plan=...) on any LangChain runnable and capture cost on every model call in the chain, without re-routing the LLM round trip. Subscribe on the roadmap page to be notified when it ships.

Until then, the boundary-wrap pattern above is the supported path. It uses only methods that exist on the published SDK: wk.chat(), wk.flush(), and wk.last_event_id.

See the full Python reference at /docs/python, or browse all integrations at /docs/integrations.