Langfuse is an open-source observability and analytics platform designed for LLM applications. It helps teams monitor, evaluate, and improve their LLM implementations through comprehensive tracing and evaluation tools.
By default, Inferable only sends metadata about LLM calls and function calls: the model, Run ID, token usage, latency, and so on. If you have Send Message Payloads enabled, Inferable also sends the inputs and outputs of the LLM calls and function calls, including the prompt, response, tool calls, tool call arguments, and tool call results.
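For a concrete picture, the metadata for a single LLM call might look roughly like the shape below. This is a sketch only; the field names are illustrative assumptions, not Inferable's actual wire format.

```typescript
// Hypothetical shape of the metadata Inferable sends for one LLM call.
// Field names are illustrative assumptions, not Inferable's actual schema.
interface LlmCallMetadata {
  runId: string;        // the Inferable Run this call belongs to
  model: string;        // model identifier
  promptTokens: number;
  completionTokens: number;
  latencyMs: number;
  // Present only when "Send Message Payloads" is enabled:
  input?: unknown;      // prompt / tool call arguments
  output?: unknown;     // response / tool call results
}
```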
Once you have enabled the Langfuse integration, traces will start appearing in the Langfuse dashboard. Every Run in Inferable is mapped to its own trace in Langfuse. You will find two types of spans in the trace (see the sketch after this list):
- Tool Calls: Denoted by the function name. These spans are created for each tool call the LLM makes in the Run.
- LLM Calls: Denoted by GENERATION. Inferable creates a new span for each LLM call in the Run, including:
  - Agent loop reasoning
  - Utility calls (e.g. Summarization, Title generation)
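To make the mapping concrete, here is a minimal sketch of the equivalent trace structure expressed with the Langfuse TypeScript SDK. Inferable builds these traces for you, so this is purely illustrative; the Run ID, tool name, model, and payloads are assumptions.

```typescript
import { Langfuse } from "langfuse";

const langfuse = new Langfuse(); // reads LANGFUSE_* environment variables

// One Inferable Run maps to one Langfuse trace (IDs here are illustrative).
const trace = langfuse.trace({ id: "run_123", name: "inferable-run" });

// Tool call span, denoted by the function name.
const toolSpan = trace.span({
  name: "searchOrders", // hypothetical tool/function name
  input: { customerId: "cus_42" },
});
toolSpan.end({ output: { orders: 3 } });

// LLM call span, denoted by GENERATION
// (agent loop reasoning, summarization, title generation, ...).
const generation = trace.generation({
  name: "agent-loop",
  model: "gpt-4o", // illustrative model name
  input: [{ role: "user", content: "Find my recent orders" }],
});
generation.end({ output: { role: "assistant", content: "You have 3 orders." } });

await langfuse.flushAsync(); // ensure events are delivered before exit
```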
Whenever you submit an evaluation on a Run via the Playground or the API, we send a score to Langfuse on the trace for that Run. If you're using Langfuse for evaluation, this helps you correlate the evaluation back to the specific trace in Langfuse.
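For reference, a score attached to a trace via the Langfuse TypeScript SDK looks roughly like this. Inferable submits the score for you when you evaluate a Run, so the score name and values here are illustrative assumptions.

```typescript
import { Langfuse } from "langfuse";

const langfuse = new Langfuse();

// Roughly what Inferable's evaluation submission corresponds to in Langfuse:
// a score attached to the trace for the evaluated Run.
langfuse.score({
  traceId: "run_123",       // the trace mapped to the Inferable Run (illustrative ID)
  name: "user-evaluation",  // assumed score name
  value: 1,                 // e.g. thumbs-up = 1, thumbs-down = 0
});

await langfuse.flushAsync();
```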