Integration guide
Using Weckr with LangChain
Two ways exist to instrument LangChain for cost: hook into LangChain's callback system, or wrap the underlying provider client. Today, Weckr is honest about which one it supports.
CallbackHandler integration is on the roadmap. Today the cleanest pattern is to wrap the underlying OpenAI client with wk.chat() at the boundary (the place where you actually make the LLM call), not the LangChain abstractions. The example below shows exactly how, and it works with the SDK that's on PyPI right now.
The pattern
LangChain's ChatOpenAI is a useful wrapper for chains, prompt templates, and tool calling. But under the hood it eventually calls the OpenAI client. If you want Weckr to see those calls (and enforce per-user spending caps), the simplest path is to bypass the LangChain abstraction at the point of the actual LLM round trip and use wk.chat(client, params) directly.
In practice this means:
- Keep using LangChain for what it's good at (orchestration, prompt assembly, output parsers, retrieval).
- When you reach the leaf node where the LLM round trip happens, call wk.chat(openai_client, {...}) instead of relying on LangChain's internal request.
- Feed the result back into LangChain (or just use it directly, which is often what you wanted anyway).
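At that boundary, LangChain message objects have to become the plain role/content dicts that the OpenAI client expects. The sketch below shows one way to do that conversion. It is an illustration, not part of the Weckr SDK: the names ROLE_BY_TYPE and to_openai_messages are hypothetical, and the only assumption about the input is that each message exposes .type and .content, as LangChain's system, human, and AI messages do. Stand-in objects are used so the snippet runs without LangChain installed.

```python
from types import SimpleNamespace

# LangChain message `.type` -> OpenAI chat role (hypothetical helper table).
ROLE_BY_TYPE = {"system": "system", "human": "user", "ai": "assistant"}


def to_openai_messages(messages):
    """Convert objects exposing `.type` and `.content` into OpenAI-style dicts."""
    converted = []
    for m in messages:
        role = ROLE_BY_TYPE.get(m.type)
        if role is None:
            # Tool/function messages need extra fields; fail loudly instead of guessing.
            raise ValueError(f"unsupported message type: {m.type!r}")
        converted.append({"role": role, "content": m.content})
    return converted


# Stand-ins for LangChain messages, so the sketch runs on its own.
example = [
    SimpleNamespace(type="system", content="You summarise documents in one sentence."),
    SimpleNamespace(type="human", content="Weckr is AI cost and margin intelligence."),
    SimpleNamespace(type="ai", content="Weckr tracks AI cost and margin."),
]
print(to_openai_messages(example))
```

With real LangChain objects, the list returned by a rendered prompt's to_messages() could be passed in directly. Mapping "ai" to "assistant" matters once a chain carries chat history or few-shot turns, since OpenAI does not accept "ai" as a role.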
Install
pip install weckr-sdk openai langchain-openai langchain-core
Get a wk_ API key from app.useweckr.com/auth/signup.
Working example
A LangChain chain that uses Weckr for the actual model call. The prompt template, output parser, and chain composition stay LangChain-native. Only the LLM round trip goes through wk.chat().
import openai
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda
from weckr import Weckr
# 1. One openai client and one Weckr client, shared across requests.
openai_client = openai.OpenAI()
wk = Weckr(
    api_key="wk_...",  # from app.useweckr.com
    plans={"free": 0, "pro": 29},  # your plan prices in USD
)
# 2. A LangChain prompt template (LangChain stays useful here).
prompt = ChatPromptTemplate.from_messages([
    ("system", "You summarise documents in one sentence."),
    ("user", "{document}"),
])
parser = StrOutputParser()
def call_llm_via_weckr(messages, *, user_id: str, plan: str):
    """The leaf node: take LangChain-rendered messages and run them through wk.chat()."""
    result = wk.chat(openai_client, {
        "model": "gpt-4o-mini",
        "messages": [{"role": m.type if m.type != "human" else "user", "content": m.content}
                     for m in messages.to_messages()],
        "user_id": user_id,
        "feature": "doc-summary",
        "plan": plan,
    })
    return result.choices[0].message.content
# 3. Compose with LangChain's runnable interface.
def summarise(document: str, *, user_id: str, plan: str) -> str:
    messages = prompt.invoke({"document": document})
    raw = call_llm_via_weckr(messages, user_id=user_id, plan=plan)
    return parser.invoke(raw)
# 4. Use it.
print(summarise(
    "Weckr is AI cost and margin intelligence for SaaS founders.",
    user_id="u_42",
    plan="pro",
))
What you get
- Per-user cost and margin on the dashboard (the call is attributed to u_42 on the pro plan).
- Spending caps enforced before the LLM call. If u_42 is over their monthly budget, wk.chat() raises WeckrCapError before any tokens are spent.
- Model recommendations based on real token volume per feature.
Handling caps
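from weckr import is_weckr_cap_error
# Illustrative wrapper (the name is hypothetical); `user` is your own object with .id and .plan.
def handle_summary_request(doc, user):
    try:
        summary = summarise(doc, user_id=user.id, plan=user.plan)
    except Exception as err:
        if is_weckr_cap_error(err):
            # User hit their monthly cap. Show an upgrade prompt.
            return {"error": "AI spend cap reached", "plan": err.plan_name}
        raise
    return {"summary": summary}
Short-lived processes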
If you call your LangChain chain inside a Lambda, cron job, or CLI script, call wk.flush() before exit so the in-flight log POST completes:
def handler(event, context):
    summary = summarise(event["doc"], user_id=event["userId"], plan="pro")
    wk.flush()  # wait up to 5s for the log POST
    return {"summary": summary}
Roadmap
The planned CallbackHandler integration will let you attach WeckrCallbackHandler(user_id=..., plan=...) to any LangChain runnable and capture cost on every model call in the chain, without re-routing the LLM round trip. Subscribe on the roadmap page to be notified when it ships.
Until then, the boundary-wrap pattern above is the supported path. It uses only methods that exist on the published SDK: wk.chat(), wk.flush(), and wk.last_event_id.
See the full Python reference at /docs/python, or browse all integrations at /docs/integrations.