# SaaS (Cloud Run / Cloudflare Workers)

The hosted Edge deployment runs the same single container as the on-prem install. Cloudflare sits in front, and Cloud Run hosts and scales the container.

## Topology

```mermaid
flowchart LR
    User[User browser] -->|HTTPS| CF[Cloudflare<br/>edge.nyami.fr]
    CF -->|WAF + DDoS| LB[Cloud Run gateway]
    LB --> APP[edge-app container<br/>Cloud Run service]
    APP -->|cert auth| LLM[LLM gateway]
    APP -->|optional| LF[Langfuse cloud]

    subgraph "Control plane"
        Workers[Cloudflare Workers<br/>tenant routing]
        DO[(Durable Objects<br/>per-tenant state)]
    end

    CF -.->|future v2| Workers
    Workers -.->|future v2| DO
```

## Today (v1.x)

| Component       | Provider               | Notes                                         |
| --------------- | ---------------------- | --------------------------------------------- |
| CDN + TLS + WAF | Cloudflare             | Zone `nyami.fr`                               |
| Compute         | Google Cloud Run       | Min instances = 0 for cost; cold starts \~3 s |
| Static landing  | Cloudflare Pages       | `edge-landing` repo                           |
| Image registry  | GHCR                   | `ghcr.io/nkap360-dev/edge-app:vX.Y.Z`         |
| Storage         | Container-local SQLite | Snapshotted nightly to Cloud Storage          |
| LLM             | Tenant's gateway       | No shared LLM key                             |

## Tomorrow (v2 control plane)

The multi-tenant control plane is planned to run on Cloudflare Workers + Durable Objects:

* One Durable Object (DO) per tenant holds that tenant's workspace state.
* Workers route `<tenant>.edge.nyami.fr` to the matching DO.
* The Edge container becomes a stateless evaluator behind the control plane.
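
The hostname convention in the list above can be illustrated with a small shell sketch. This is not the Worker itself, which would run on Cloudflare's runtime; it only shows how a `<tenant>.edge.nyami.fr` host might map to a tenant key. The function name `tenant_from_host` and the restriction to a single-label tenant are assumptions made for illustration.

```bash
# Hypothetical helper: print the tenant label for a <tenant>.edge.nyami.fr host.
# Returns non-zero for any host that does not follow the convention.
tenant_from_host() {
  local suffix=".edge.nyami.fr"
  local host tenant

  # Hostnames are case-insensitive, so normalise before matching.
  host="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"

  # The host must end with the shared suffix.
  case "$host" in
    *"$suffix") ;;
    *) return 1 ;;
  esac

  # Strip the suffix; what remains is the candidate tenant label.
  tenant="${host%"$suffix"}"

  # Assumption: a tenant is exactly one non-empty DNS label.
  case "$tenant" in
    ""|*.*) return 1 ;;
  esac

  printf '%s\n' "$tenant"
}
```

For example, `tenant_from_host acme.edge.nyami.fr` prints `acme`, while the bare `edge.nyami.fr` host and nested names such as `a.b.edge.nyami.fr` are rejected.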

See [Architecture / Overview](/architecture/overview.md) for the container view.

## Deploy command

```bash
# from edge-infra/
gcloud run deploy edge-app \
  --image ghcr.io/nkap360-dev/edge-app:vX.Y.Z \
  --region europe-west6 \
  --no-allow-unauthenticated \
  --set-env-vars POCKETBASE_ADMIN_EMAIL=... \
  --set-secrets AI_HUB_PFX=projects/.../secrets/ai-hub-pfx:latest
```

Required environment variables are listed in [Quick start](/deployment/quickstart.md).
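
Because the service is deployed with `--no-allow-unauthenticated`, a request sent straight to the Cloud Run URL needs an identity token. The sketch below is one possible post-deploy smoke check, assuming the caller is allowed to invoke the service and that `/health` is served at the service root. The helper names (`within_budget`, `seconds_to_ms`, `smoke_check`) are hypothetical.

```bash
# Hypothetical helper: succeed only if a latency (ms) is under a budget (ms).
within_budget() {
  local measured_ms="$1" budget_ms="$2"
  [ "$measured_ms" -lt "$budget_ms" ]
}

# Hypothetical helper: convert curl's time_total (seconds) to whole milliseconds.
seconds_to_ms() {
  awk -v s="$1" 'BEGIN { printf "%d\n", s * 1000 }'
}

# Resolve the service URL, fetch an identity token, and time one /health call.
smoke_check() {
  local url token seconds ms
  url="$(gcloud run services describe edge-app \
    --region europe-west6 --format 'value(status.url)')" || return 1
  token="$(gcloud auth print-identity-token)" || return 1
  seconds="$(curl -fsS -o /dev/null -w '%{time_total}' \
    -H "Authorization: Bearer ${token}" "${url}/health")" || return 1
  ms="$(seconds_to_ms "$seconds")"
  echo "/health answered in ${ms} ms"
  within_budget "$ms" 200   # 200 ms mirrors the /health target on this page
}
```

Running `smoke_check` after a deploy gives a quick pass/fail signal. A single request is not a p99 measurement, and with min instances set to 0 the first call after an idle period may include a cold start.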

## SLOs

| Surface                             | Target                                            |
| ----------------------------------- | ------------------------------------------------- |
| `/health` p99                       | < 200 ms                                          |
| Flow execution (6 items, 3 metrics) | < 60 s                                            |
| Uptime                              | 99.5% (single-region today; multi-region planned) |
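
To make the uptime target concrete, the sketch below converts an availability percentage into an allowed-downtime budget. The function name and the 30-day window are assumptions; this page does not state the measurement window.

```bash
# Hypothetical helper: allowed downtime, in whole minutes, for a given
# uptime percentage over a window of N days.
downtime_budget_minutes() {
  local uptime_pct="$1" window_days="$2"
  awk -v u="$uptime_pct" -v d="$window_days" \
    'BEGIN { printf "%.0f\n", (100 - u) / 100 * d * 24 * 60 }'
}

downtime_budget_minutes 99.5 30   # prints 216
```

Under those assumptions, 99.5% leaves roughly 216 minutes (about 3.6 hours) of downtime per 30 days.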

## Cost shape

| Item           | Cost driver                                                   |
| -------------- | ------------------------------------------------------------- |
| Cloud Run      | CPU-seconds during eval runs; near-zero idle                  |
| Cloudflare     | Free tier covers landing + docs; Workers/DO billed per-tenant |
| LLM            | Pass-through, on tenant's account                             |
| Langfuse cloud | Optional; free tier sufficient for low volumes                |

