Hey everyone 👋
We have been building traceAI, an open-source observability tool
for LLM applications in production.
It traces every LLM call, capturing inputs, outputs, latency, costs,
and errors, with minimal setup. Useful for teams running inference
on NVIDIA hardware who want full visibility into what their models
are actually doing in prod.
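To make the idea concrete, here's a minimal sketch of what call-level tracing captures. This is NOT traceAI's actual API; the names `trace_llm`, `TRACES`, and `fake_completion` are hypothetical, and a real tool would export records to a backend instead of a list:

```python
import time
import functools

# Collected trace records; a real observability tool would export
# these to a backend rather than keep them in memory.
TRACES = []

def trace_llm(fn):
    """Hypothetical decorator: records input, output, latency, and errors per call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        record = {"fn": fn.__name__, "input": {"args": args, "kwargs": kwargs}}
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
            record["output"] = result
            return result
        except Exception as exc:
            record["error"] = repr(exc)
            raise
        finally:
            record["latency_s"] = time.perf_counter() - start
            TRACES.append(record)
    return wrapper

@trace_llm
def fake_completion(prompt):
    # Stand-in for a real model/API call.
    return f"echo: {prompt}"

fake_completion("hello")
```

After the call, `TRACES[0]` holds the prompt, the response, and the measured latency; a failed call would carry an `error` field instead of `output`.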
Would love feedback from folks running LLMs at scale. What does
your current monitoring setup look like, and what's missing from it?