This setup guide walks you through the simple steps to set up agent monitoring on an existing, public LLM API that exposes 3 endpoints.
Prerequisites & Assumptions:
You have set up a Hallucinate account
Let's assume your LLM API looks something like this (in Python)…
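The original code sample is not reproduced here, but as a rough, framework-agnostic sketch (all function and field names are assumptions, chosen to match the endpoint and parameter names used later in this guide), the API might look like:

```python
# Hypothetical sketch of the LLM API being monitored (names assumed).
# In practice these handlers would be wired to the /chat and /summariseItem
# routes in your web framework of choice.

def call_llm(prompt: str) -> str:
    """Placeholder for the real call to your LLM provider."""
    return f"LLM output for: {prompt}"

def chat(body: dict) -> dict:
    # Reads the 'request' parameter, as configured in the Hallucinate route.
    return {"reply": call_llm(body["request"])}

def summarise_item(body: dict) -> dict:
    # Response field name matches the 'SummariseItemResponse.summary'
    # route setting configured below.
    return {"SummariseItemResponse": {"summary": call_llm(body["request"])}}
```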
Now let's assume you want to set up monitoring on both your chat and summariseItem endpoints.
Required Steps:
Go to Hallucinate -> Setup -> Routes and Create a new API
Name: can be anything you like (probably the name of your website/App)
Source URL/IP: the URL/IP your requests will come from, used to verify where traffic originates
Forwarding URL/IP: destination URL of your API
Click on the new API -> Create Routes
Name: can be anything you like (probably same as your endpoint names)
Endpoint: name of the endpoint you want to monitor (e.g. /summariseItem)
Request Param: request
Response Param: SummariseItemResponse.summary
Ensure your LLM API will accept requests from hallucinate.ai
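How you do this depends on your stack; one possible approach (the allowlist contents here are an assumption, not an official list) is a simple host allowlist checked before each request is handled:

```python
from urllib.parse import urlparse

# Hosts permitted to call this API: the Hallucinate proxy, plus any of
# your own front-ends that still call the API directly (assumed names).
ALLOWED_HOSTS = {"hallucinate.ai"}

def is_allowed(origin: str) -> bool:
    """Check an Origin/Referer header value against the allowlist."""
    host = urlparse(origin).hostname or ""
    return host.lower() in ALLOWED_HOSTS
```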
Add a new header to the API request of your front-end app (this is so we know the request belongs to your account and is valid and secure):
x-hallucinate-api-key = [your API Secret]
(Your API Secret can be found in your account setup - do not share this with anyone else)
Switch the URL in your front-end to point at https://hallucinate.ai/proxy-rest or https://hallucinate.ai/proxy-streaming rather than your existing LLM API. The full request will be forwarded (unmodified) to your LLM API, including security headers etc.
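Assuming a Python client purely for illustration, the combined change (the new header plus the proxy URL) might look like this sketch; the secret value and endpoint path are placeholders:

```python
import json
import urllib.request

HALLUCINATE_PROXY = "https://hallucinate.ai/proxy-rest"  # or /proxy-streaming
API_SECRET = "<your API Secret>"  # from your Hallucinate account setup

def build_request(endpoint: str, payload: dict) -> urllib.request.Request:
    """Build the front-end request, now pointed at the Hallucinate proxy
    instead of the LLM API directly."""
    return urllib.request.Request(
        HALLUCINATE_PROXY + endpoint,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # Tells Hallucinate the request belongs to your account.
            "x-hallucinate-api-key": API_SECRET,
        },
        method="POST",
    )
```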
Optional Steps (for full spend analysis & model benchmarking features):
Add two additional headers to the response of your API: one with the token size of the final request to your LLM, and one with the full final request passed to your LLM (assuming this differs from the source request once you add context, apply templates, or do RAG).
Note: these additional headers will be removed by the Hallucinate API before returning to your front-end (so they are never exposed to your users or public traffic).
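The guide does not name these two headers, so the names below are assumed placeholders; substitute whatever names Hallucinate expects. On the API side this could be as simple as:

```python
# Header names here are hypothetical placeholders, not documented values.
def hallucinate_response_headers(final_prompt: str, token_count: int) -> dict:
    """Extra response headers for spend analysis and model benchmarking.
    These are stripped by the Hallucinate proxy before the response
    reaches your front-end."""
    return {
        "x-hallucinate-token-count": str(token_count),  # tokens in the final LLM request
        "x-hallucinate-final-request": final_prompt,    # full prompt after context/templating/RAG
    }
```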
The whole process can take as little as five minutes!