aoai.twins.la

A digital twin of the Azure OpenAI Service.

What is this?

A high-fidelity digital twin of the Azure OpenAI Service data plane: chat completions, legacy completions, and embeddings. Responses are deterministic synthetic stand-ins, so identical inputs always produce identical outputs and no real model is ever called.

No control plane. Azure's ARM management API is intentionally not emulated. Provision resources, deployments, and api-keys through the Twin Plane at /_twin/.

How to use it

Cloud: bootstrap a tenant, create a resource and deployment via /_twin/, then point your Azure OpenAI SDK at https://aoai.twins.la/<resource> with either an api-key or a token from the per-resource AAD endpoint.

Local: install with pip install twins-aoai-local and run a local instance on any port. Same API, same behavior, your machine.

For agents

Copy this into your agent's system prompt, tool configuration, or CLAUDE.md. Also available as plain text at /_twin/agent-instructions.

# Azure OpenAI Twin — aoai.twins.la

A high-fidelity digital twin of the Azure OpenAI Service data plane. The
twin emulates chat completions, legacy completions, and embeddings, all
backed by deterministic synthetic responses (no real model is ever
called).

## URL shape

Path-prefixed, single host:

    https://aoai.twins.la/<resource>/openai/deployments/<deployment>/chat/completions
    https://aoai.twins.la/<resource>/openai/deployments/<deployment>/embeddings
    https://aoai.twins.la/<resource>/openai/deployments/<deployment>/completions

There is no subdomain-per-resource. The Azure ARM control plane is NOT
emulated — operators provision resources, deployments, and api-keys
through the Twin Plane (`/_twin/`).
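Because the layout is path-prefixed on a single host, data-plane URLs can be composed mechanically. A minimal sketch (the helper name is illustrative; the default base is the cloud host from this document):

```python
def data_plane_url(resource: str, deployment: str, operation: str,
                   base: str = "https://aoai.twins.la") -> str:
    """Compose a data-plane URL for the path-prefixed, single-host layout."""
    return f"{base}/{resource}/openai/deployments/{deployment}/{operation}"

# Chat-completions URL for resource "my-res" and a deployment named "chat".
url = data_plane_url("my-res", "chat", "chat/completions")
```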

## Authentication (data plane)

Both auth paths are accepted on every data-plane endpoint and either is
sufficient:

  * `api-key: <key>` — primary AOAI auth header.
  * `Authorization: Bearer <jwt>` — AAD-shaped JWT, RS256-signed by the
    twin's per-resource keypair, obtained from
    `POST /<resource>/oauth2/v2.0/token` with grant_type=client_credentials.

Tenant isolation is enforced on both paths.
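As a sketch of the bearer path, the token can be fetched with nothing but the standard library. The function names are illustrative; the form fields (`grant_type`, `client_id`, `client_secret`) are the ones documented for the per-resource token endpoint:

```python
import json
import urllib.parse
import urllib.request

def build_token_request(base: str, resource: str, key_id: str, api_key: str):
    """Assemble the client_credentials form POST for the per-resource
    AAD-style token endpoint."""
    url = f"{base}/{resource}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": key_id,
        "client_secret": api_key,
    }).encode()
    return url, body

def fetch_bearer_token(base: str, resource: str, key_id: str, api_key: str) -> str:
    """POST the form and return the access_token from the JSON response."""
    url, body = build_token_request(base, resource, key_id, api_key)
    with urllib.request.urlopen(urllib.request.Request(url, data=body)) as resp:
        return json.load(resp)["access_token"]
```

The returned token then goes in `Authorization: Bearer <token>` on data-plane calls.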

## Twin Plane authentication

Twin Plane (`/_twin/`) uses the standard tenant Basic + admin Bearer scheme:

  * Bootstrap a tenant: `POST /_twin/tenants` -> {tenant_id, tenant_secret}
  * Tenant calls: HTTP Basic `tenant_id:tenant_secret`
  * Admin calls: `Authorization: Bearer <admin_token>` (or `X-Twin-Admin-Token`)
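For tenant calls, the Basic credential is just the standard base64 of `tenant_id:tenant_secret`; a one-function sketch:

```python
import base64

def basic_auth_header(tenant_id: str, tenant_secret: str) -> str:
    """HTTP Basic value for Twin Plane tenant calls:
    base64("tenant_id:tenant_secret")."""
    raw = f"{tenant_id}:{tenant_secret}".encode()
    return "Basic " + base64.b64encode(raw).decode()
```

Most HTTP clients build this for you (e.g. curl's `-u`, as in the quick start below).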

## Key endpoints

Twin Plane (no auth):
  GET  /_twin/health
  GET  /_twin/scenarios
  GET  /_twin/settings
  GET  /_twin/references
  POST /_twin/tenants

Twin Plane (Basic tenant_id:tenant_secret):
  POST /_twin/resources                              -> {resource_id, base_url}
  GET  /_twin/resources
  DELETE /_twin/resources/<resource>
  POST /_twin/resources/<resource>/api_keys           -> {key_id, api_key (shown ONCE)}
  GET  /_twin/resources/<resource>/api_keys
  POST /_twin/resources/<resource>/deployments        body: {model, deployment_id?}
  GET  /_twin/resources/<resource>/deployments
  DELETE /_twin/resources/<resource>/deployments/<d>
  GET  /_twin/logs                  (or admin Bearer for cross-tenant)
  POST /_twin/feedback

Per-resource AAD endpoints (no auth):
  GET  /<resource>/.well-known/openid-configuration
  GET  /<resource>/.well-known/jwks.json
  POST /<resource>/oauth2/v2.0/token   form: grant_type=client_credentials,
                                              client_id=<key_id>,
                                              client_secret=<api_key>

Data plane (api-key OR AAD bearer):
  POST /<resource>/openai/deployments/<deployment>/chat/completions
  POST /<resource>/openai/deployments/<deployment>/completions
  POST /<resource>/openai/deployments/<deployment>/embeddings
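The quick start below uses the `api-key` header; for the other auth path, a hedged sketch of the same chat-completions call with an AAD bearer token (function name illustrative, token obtained from the per-resource token endpoint above):

```python
import json
import urllib.request

def chat_request(base: str, resource: str, deployment: str,
                 bearer_token: str, messages: list,
                 api_version: str = "2024-10-21") -> urllib.request.Request:
    """Chat-completions POST using Authorization: Bearer instead of api-key."""
    url = (f"{base}/{resource}/openai/deployments/{deployment}"
           f"/chat/completions?api-version={api_version}")
    return urllib.request.Request(
        url,
        data=json.dumps({"messages": messages}).encode(),
        headers={"Authorization": f"Bearer {bearer_token}",
                 "Content-Type": "application/json"},
    )

req = chat_request("https://aoai.twins.la", "RID", "chat", "TOKEN",
                   [{"role": "user", "content": "hello"}])
# urllib.request.urlopen(req) would return the deterministic synthetic reply.
```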

## Quick start (cloud)

  curl -X POST https://aoai.twins.la/_twin/tenants \
    -H "Content-Type: application/json" \
    -d '{"friendly_name":"Dev"}'
  # -> { tenant_id, tenant_secret }

  curl -X POST https://aoai.twins.la/_twin/resources \
    -u "TENANT_ID:TENANT_SECRET" \
    -H "Content-Type: application/json" \
    -d '{"friendly_name":"my-aoai"}'
  # -> { resource_id, base_url }

  curl -X POST https://aoai.twins.la/_twin/resources/RID/api_keys \
    -u "TENANT_ID:TENANT_SECRET" -d '{}'
  # -> { key_id, api_key }

  curl -X POST https://aoai.twins.la/_twin/resources/RID/deployments \
    -u "TENANT_ID:TENANT_SECRET" \
    -H "Content-Type: application/json" \
    -d '{"model":"gpt-4o-mini","deployment_id":"chat"}'

  curl -X POST 'https://aoai.twins.la/RID/openai/deployments/chat/chat/completions?api-version=2024-10-21' \
    -H 'api-key: RAW_API_KEY' \
    -H 'Content-Type: application/json' \
    -d '{"messages":[{"role":"user","content":"hello"}]}'

## SDK example

    from openai import AzureOpenAI

    client = AzureOpenAI(
        api_key="RAW_API_KEY",
        api_version="2024-10-21",
        azure_endpoint="https://aoai.twins.la/RID",
    )
    resp = client.chat.completions.create(
        model="chat",  # the deployment name
        messages=[{"role": "user", "content": "hello"}],
    )

## Local

  pip install twins-aoai twins-aoai-local
  python -m twins_aoai_local

## Reference

GitHub:           https://github.com/twins-la/aoai
Project overview: https://twins.la
All twins:        https://github.com/twins-la/twins-la