This setup guide walks you through the simple steps to set up agent monitoring on an existing, public LLM API that exposes 3 endpoints.
Prerequisites & Assumptions:
You have set up a Hallucinate account
Let's assume your LLM API looks something like this (in Python)…
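The original code sample is not reproduced here, but as a rough, framework-agnostic sketch (all function and field names are assumptions, chosen to match the endpoint and parameter names used later in this guide), the API might look like:

```python
# Hypothetical sketch of the LLM API being monitored (names assumed).
# In practice these handlers would be wired to the /chat and /summariseItem
# routes in your web framework of choice.

def call_llm(prompt: str) -> str:
    """Placeholder for the real call to your LLM provider."""
    return f"LLM output for: {prompt}"

def chat(body: dict) -> dict:
    # Reads the 'request' parameter, as configured in the Hallucinate route.
    return {"reply": call_llm(body["request"])}

def summarise_item(body: dict) -> dict:
    # Response field name matches the 'SummariseItemResponse.summary'
    # route setting configured below.
    return {"SummariseItemResponse": {"summary": call_llm(body["request"])}}
```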
Now let's assume you want to set up monitoring on both your chat and summariseItem endpoints.
Required Steps:
Go to Hallucinate -> Setup -> Routes and Create a new API
Name: can be anything you like (probably the name of your website/App)
Source URL/IP: the URL/IP your requests will come from, used to verify where traffic originates
Forwarding URL/IP: destination URL of your API
Click on the new API -> Create Routes
Name: can be anything you like (probably same as your endpoint names)
Endpoint: name of the endpoint you want to monitor (e.g. /summariseItem)
Request Param: request
Response Param: SummariseItemResponse.summary
Ensure your LLM API will accept requests from hallucinate.ai
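How you do this depends on your stack; one possible approach (the allowlist contents here are an assumption, not an official list) is a simple host allowlist checked before each request is handled:

```python
from urllib.parse import urlparse

# Hosts permitted to call this API: the Hallucinate proxy, plus any of
# your own front-ends that still call the API directly (assumed names).
ALLOWED_HOSTS = {"hallucinate.ai"}

def is_allowed(origin: str) -> bool:
    """Check an Origin/Referer header value against the allowlist."""
    host = urlparse(origin).hostname or ""
    return host.lower() in ALLOWED_HOSTS
```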
Add a new header to the API request of your front-end app (this is so we know the request belongs to your account and is valid and secure):
x-hallucinate-api-key = [your API Secret]
(Your API Secret can be found in your account setup - do not share this with anyone else)
Switch the URL in your front-end to point at https://hallucinate.ai/proxy-rest or https://hallucinate.ai/proxy-streaming rather than your existing LLM API. The full request will be forwarded (unmodified) to your LLM API, including security headers etc.
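Assuming a Python client purely for illustration, the combined change (the new header plus the proxy URL) might look like this sketch; the secret value and endpoint path are placeholders:

```python
import json
import urllib.request

HALLUCINATE_PROXY = "https://hallucinate.ai/proxy-rest"  # or /proxy-streaming
API_SECRET = "<your API Secret>"  # from your Hallucinate account setup

def build_request(endpoint: str, payload: dict) -> urllib.request.Request:
    """Build the front-end request, now pointed at the Hallucinate proxy
    instead of the LLM API directly."""
    return urllib.request.Request(
        HALLUCINATE_PROXY + endpoint,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # Tells Hallucinate the request belongs to your account.
            "x-hallucinate-api-key": API_SECRET,
        },
        method="POST",
    )
```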
Optional Steps (for full spend analysis & model benchmarking features):
Add two additional headers to the response of your API: one with the token size of the final request to your LLM, and one with the full final request passed to your LLM (assuming this differs from the source request once you add context, apply templates, or do RAG).
Note: these additional headers will be removed by the Hallucinate API before returning to your front-end (so they are never exposed to your users or public traffic).
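The guide does not name these two headers, so the names below are assumed placeholders; substitute whatever names Hallucinate expects. On the API side this could be as simple as:

```python
# Header names here are hypothetical placeholders, not documented values.
def hallucinate_response_headers(final_prompt: str, token_count: int) -> dict:
    """Extra response headers for spend analysis and model benchmarking.
    These are stripped by the Hallucinate proxy before the response
    reaches your front-end."""
    return {
        "x-hallucinate-token-count": str(token_count),  # tokens in the final LLM request
        "x-hallucinate-final-request": final_prompt,    # full prompt after context/templating/RAG
    }
```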
The whole process can take as little as five minutes!