Usage

Overview

HallucinateAPI is an HTTP server that implements all GET and POST operations defined in an OpenAPI specification. Each operation is served by calling a Gemini model on Vertex AI, using the operation’s description as the LLM instruction and the response schema for structured JSON output.

Commands

serve (default)

Starts the HTTP server. Runs all validations on startup and exits non-zero if validation fails.

hallucinate serve --openapi-path /path/to/spec.yaml --gcp-project my-project --gcp-location us-central1 --model gemini-2.5-flash

Running with no subcommand defaults to serve:

hallucinate --openapi-path /path/to/spec.yaml --gcp-project my-project --gcp-location us-central1 --model gemini-2.5-flash

validate

Loads the configuration and the OpenAPI file, runs all validations, and prints the results as both JSON and human-readable text. Exits 0 if everything is valid and non-zero otherwise.

hallucinate validate --openapi-path /path/to/spec.yaml --gcp-project my-project --gcp-location us-central1 --model gemini-2.5-flash
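
Because validate reports its result through the exit status, it can act as a gate before a deploy or serve step. The wrapper below is a minimal sketch of that idea, not part of HallucinateAPI: the function name validate_spec and the HALLUCINATE_BIN override are hypothetical and exist only so the helper can be pointed at a different binary.

```shell
# Sketch: run `hallucinate validate` and return its exit status unchanged.
# HALLUCINATE_BIN is a hypothetical override (not a HallucinateAPI setting);
# when unset, the `hallucinate` binary on PATH is used.
# Arguments: $1 = spec path, $2 = GCP project, $3 = location, $4 = model.
validate_spec() {
  "${HALLUCINATE_BIN:-hallucinate}" validate \
    --openapi-path "$1" \
    --gcp-project "$2" \
    --gcp-location "$3" \
    --model "$4"
}

# Example gate (commented out so that sourcing this file has no side effects):
# if validate_spec /path/to/spec.yaml my-project us-central1 gemini-2.5-flash; then
#   echo "spec is valid"
# else
#   echo "spec is invalid" >&2
#   exit 1
# fi
```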

Configuration

All settings can be set via environment variables or CLI flags. CLI flags take precedence over environment variables.
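
The precedence rule can be pictured as a simple fallback chain. The helper below is an illustrative sketch only: resolve_setting is a hypothetical name, and treating an empty string as "not provided" is an assumption rather than documented behavior.

```shell
# Sketch of the documented precedence: CLI flag, then environment variable,
# then the default. An empty string is treated as "not provided" (assumption).
# Arguments: $1 = flag value, $2 = environment value, $3 = default value.
resolve_setting() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  elif [ -n "$2" ]; then
    printf '%s\n' "$2"
  else
    printf '%s\n' "$3"
  fi
}

# With both a flag and an environment value present, the flag wins.
resolve_setting ":9090" ":8081" ":8080"   # prints :9090
```

In practice this means that running the server with --listen-addr :9090 while HALLUCINATE_LISTEN_ADDR=:8081 is exported should bind to :9090.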

Required Settings

| Flag | Environment Variable(s) | Description |
| --- | --- | --- |
| --openapi-path | HALLUCINATE_OPENAPI_PATH | Path to the OpenAPI specification file (JSON or YAML) |
| --gcp-project | GOOGLE_CLOUD_PROJECT, HALLUCINATE_GCP_PROJECT | Google Cloud Platform project ID |
| --gcp-location | HALLUCINATE_GCP_LOCATION | Vertex AI location (e.g., us-central1 or global) |
| --model | HALLUCINATE_MODEL | Gemini model name (e.g., gemini-2.5-flash) |

When --gcp-location=global, requests are sent to https://aiplatform.googleapis.com. Regional locations use https://<location>-aiplatform.googleapis.com.
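
The mapping from location to endpoint described above can be written as a small function. This is a sketch for illustration; vertex_endpoint is a hypothetical name and is not taken from the HallucinateAPI source.

```shell
# Sketch: derive the Vertex AI base URL from a location string.
# "global" maps to the unprefixed host; any other value is treated
# as a regional location and becomes a host prefix.
vertex_endpoint() {
  case "$1" in
    global) printf 'https://aiplatform.googleapis.com\n' ;;
    *)      printf 'https://%s-aiplatform.googleapis.com\n' "$1" ;;
  esac
}

vertex_endpoint global        # prints https://aiplatform.googleapis.com
vertex_endpoint us-central1   # prints https://us-central1-aiplatform.googleapis.com
```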

Server Settings

| Flag | Environment Variable | Default | Description |
| --- | --- | --- | --- |
| --listen-addr | HALLUCINATE_LISTEN_ADDR | :8080 | Address and port to listen on |
| --base-system-prefix | HALLUCINATE_SYSTEM_PREFIX | (empty) | Custom prefix added to the system prompt for all operations |
| --prompt-format | HALLUCINATE_PROMPT_FORMAT | json | Prompt serialization format: json or toon |
| --max-request-bytes | HALLUCINATE_MAX_REQUEST_BYTES | 10240 (10 KB) | Maximum request body size in bytes |
| --timeout-seconds | HALLUCINATE_TIMEOUT_SECONDS | 300 | Outbound Gemini API call timeout in seconds |
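
A client can check a request body against the size limit before sending it. The helper below is a rough sketch under two assumptions that the text above does not confirm: that a body of exactly the limit is accepted, and that the limit counts raw bytes of the body. The name body_within_limit is hypothetical.

```shell
# Sketch: succeed if a request body fits within the configured byte limit.
# Arguments: $1 = request body, $2 = limit in bytes (defaults to 10240,
# the documented default for --max-request-bytes).
# Assumption: a body of exactly the limit is allowed.
body_within_limit() {
  # Arithmetic expansion strips any padding that `wc -c` may print.
  size=$(( $(printf '%s' "$1" | wc -c) ))
  [ "$size" -le "${2:-10240}" ]
}

if body_within_limit '{"name":"example"}'; then
  echo "body fits"
fi
```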

Authentication

HallucinateAPI uses Google Application Default Credentials (ADC) to authenticate with Vertex AI. No API keys are required.

Ensure ADC is configured in your environment:

# For local development
gcloud auth application-default login

# For GKE or Cloud Run, ADC is typically configured automatically

Running with Docker

Mount your OpenAPI spec file into the container:

docker run -p 8080:8080 \
  -v /path/to/spec.yaml:/spec.yaml \
  -e HALLUCINATE_OPENAPI_PATH=/spec.yaml \
  -e GOOGLE_CLOUD_PROJECT=my-project \
  -e HALLUCINATE_GCP_LOCATION=us-central1 \
  -e HALLUCINATE_MODEL=gemini-2.5-flash \
  ghcr.io/unitvectory-labs/hallucinateapi:latest

Built-in Endpoints

The server hosts the following endpoints automatically:

| Endpoint | Description |
| --- | --- |
| / | Swagger UI for interactive API exploration |
| /openapi.json | OpenAPI specification (served if input is JSON) |
| /openapi.yaml | OpenAPI specification (served if input is YAML) |

These paths are reserved and must not be defined in your OpenAPI specification.
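
When authoring a specification, it can help to check candidate paths against the reserved list before running validate. The function below is a minimal sketch; is_reserved_path is a hypothetical helper, it compares exact path strings only, and it does not reproduce whatever checks the server itself performs.

```shell
# Sketch: succeed (exit status 0) if the given path is one of the
# built-in endpoints listed above, fail otherwise.
is_reserved_path() {
  case "$1" in
    /|/openapi.json|/openapi.yaml) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: flag reserved paths in a hand-written list of candidate paths.
for p in /pets /openapi.json; do
  if is_reserved_path "$p"; then
    echo "reserved: $p"
  fi
done
```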