Nik Afiq 520f5d1ffb

feat: add ai-gateway microservice with gRPC API for AI logic

- Implemented new gRPC service `AIService` in `proto/ai/v1/ai.proto` for handling natural language queries.
- Generated Go code for the gRPC service and messages in `gen/ai/v1/`.
- Created `services/ai-gateway/` directory structure with necessary files for the service.
- Added configuration loading and structured logging.
- Implemented domain logic for intent parsing and interaction with Home Assistant.
- Established outbound adapters for Ollama and Home Assistant with mTLS support.
- Updated `go.work` to include the new service and maintain existing dependencies.
- Modified `discord-bot` to use the new `ai-gateway` for AI interactions.
- Added deployment manifest for Kubernetes and CI/CD configuration for building and deploying the service.

2026-04-21 21:52:28 +09:00


ai-gateway — Implementation Plan

This plan describes the implementation of a new Go microservice, ai-gateway, in the home-services monorepo (gitea.nik4nao.com/nik/home-services). It centralizes all AI/LLM logic behind a gRPC API so callers (discord-bot, alexa-bridge) remain thin transport adapters with zero AI knowledge.

1. Goals & Non-Goals

Goals

New gRPC service ai-gateway listening on :50052.
Owns all AI logic: Ollama connection, prompt construction, LLM intent parsing, dispatch to ha-gateway.
Callers send raw user text via QueryRequest; receive a human-readable reply in QueryResponse.
mTLS client authentication when calling ha-gateway (ha-gateway requires mTLS).
Hexagonal architecture, matching the existing ha-gateway layout.
Structured logging via slog, OTel OTLP gRPC traces/metrics.
Deployed to the home-services namespace on K3s.

Non-Goals

No auth on ai-gateway's own inbound gRPC surface in this iteration (in-cluster only; match current ha-gateway posture).
No streaming responses — unary only.
No conversation memory — each Query is stateless.
No new Home Assistant features beyond what ha-gateway already exposes (LightService + EntityService).

2. Repository Layout

All paths are relative to the home-services repo root.

proto/
  ai/v1/ai.proto                          # NEW

gen/
  ai/v1/                                  # NEW (generated; committed)
    ai.pb.go
    ai_grpc.pb.go

services/
  ai-gateway/                             # NEW
    go.mod
    cmd/
      ai-gateway/
        main.go
    config/
      config.go
    domain/
      prompt.go
      service.go
      intent.go
    adapters/
      inbound/
        grpc/
          server.go
      outbound/
        ollama/
          client.go
        hagateway/
          client.go
    internal/
      observability/
        logging.go
        otel.go
    Dockerfile
    .dockerignore
  discord-bot/                            # MODIFIED
    adapters/outbound/aigateway/client.go # NEW
    (remove any direct Ollama code if present)

Also update:

go.work: add ./services/ai-gateway and keep the replace directive that points the gen module at the repo-root gen/ directory (see §14).
buf.gen.yaml / buf.yaml: include the new ai/v1 proto package.

3. Proto Definition

File: `proto/ai/v1/ai.proto`

syntax = "proto3";

package ai.v1;

option go_package = "gitea.nik4nao.com/nik/home-services/gen/ai/v1;aiv1";

// AIService accepts free-form natural language queries and returns a
// human-readable reply. It encapsulates LLM prompting, intent parsing,
// and dispatch to downstream services (e.g. ha-gateway).
service AIService {
  rpc Query(QueryRequest) returns (QueryResponse);
}

message QueryRequest {
  // Raw user text, e.g. "turn on the living room light".
  string text = 1;

  // Optional caller identifier for logging/tracing (e.g. "discord-bot").
  string source = 2;
}

message QueryResponse {
  // Human-readable reply to show the user.
  string reply = 1;

  // Parsed intent name, if any. Empty if no actionable intent was detected.
  string intent = 2;

  // True if an action was dispatched to a downstream service.
  bool action_taken = 3;
}

Generation

Run buf generate from repo root.
Commit gen/ai/v1/*.pb.go and gen/ai/v1/*_grpc.pb.go (per existing convention — gen/ is committed to avoid CI codegen dependency).

4. Configuration (`services/ai-gateway/config/config.go`)

Load from environment. Use os.Getenv with defaults (matches existing ha-gateway style — no new dep).

| Env Var | Default | Purpose |
| --- | --- | --- |
| `GRPC_LISTEN_ADDR` | `:50052` | Inbound gRPC bind address |
| `OLLAMA_URL` | `http://192.168.7.96:11434` | Ollama HTTP API (direct LAN IP; no K8s Service) |
| `OLLAMA_MODEL` | `llama3` | Model name |
| `OLLAMA_TIMEOUT` | `30s` | HTTP timeout for Ollama calls |
| `HA_GATEWAY_ADDR` | `ha-gateway.home-services.svc.cluster.local:50051` | ha-gateway gRPC endpoint |
| `HA_GATEWAY_TLS_CA_FILE` | `/etc/ai-gateway/tls/ca.crt` | CA cert that signed ha-gateway's server cert |
| `HA_GATEWAY_TLS_CERT_FILE` | `/etc/ai-gateway/tls/tls.crt` | ai-gateway's client cert (for mTLS) |
| `HA_GATEWAY_TLS_KEY_FILE` | `/etc/ai-gateway/tls/tls.key` | ai-gateway's client key |
| `HA_GATEWAY_SERVER_NAME` | `ha-gateway.home-services.svc.cluster.local` | SNI / cert verification name |
| `LOG_LEVEL` | `info` | `debug`/`info`/`warn`/`error` |
| `LOG_FORMAT` | `json` | `json` or `text` |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | `otel-collector-opentelemetry-collector.monitoring.svc.cluster.local:4317` | OTLP gRPC endpoint |
| `OTEL_SERVICE_NAME` | `ai-gateway` | Service name for traces/metrics |

Provide a Config struct with a Load() function returning (Config, error). Validate required files exist at startup.
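
A minimal sketch of what Load() could look like, assuming only a subset of the variables above. The Config field names, the envOr helper, and the loadFrom indirection are illustrative and not taken from ha-gateway; loadFrom exists only so the defaults and the file check can be exercised without touching the process environment.

```go
package main

import (
    "errors"
    "fmt"
    "os"
    "time"
)

// Config carries a subset of the variables from the table above.
// Field names are assumptions of this sketch.
type Config struct {
    GRPCListenAddr string
    OllamaURL      string
    OllamaModel    string
    OllamaTimeout  time.Duration
    HAGatewayAddr  string
    TLSCAFile      string
    TLSCertFile    string
    TLSKeyFile     string
}

// envOr returns getenv(key), or def when the variable is unset or empty.
func envOr(getenv func(string) string, key, def string) string {
    if v := getenv(key); v != "" {
        return v
    }
    return def
}

// loadFrom builds a Config from an injected lookup function (hypothetical helper).
func loadFrom(getenv func(string) string) (Config, error) {
    timeout, err := time.ParseDuration(envOr(getenv, "OLLAMA_TIMEOUT", "30s"))
    if err != nil {
        return Config{}, fmt.Errorf("parse OLLAMA_TIMEOUT: %w", err)
    }
    cfg := Config{
        GRPCListenAddr: envOr(getenv, "GRPC_LISTEN_ADDR", ":50052"),
        OllamaURL:      envOr(getenv, "OLLAMA_URL", "http://192.168.7.96:11434"),
        OllamaModel:    envOr(getenv, "OLLAMA_MODEL", "llama3"),
        OllamaTimeout:  timeout,
        HAGatewayAddr:  envOr(getenv, "HA_GATEWAY_ADDR", "ha-gateway.home-services.svc.cluster.local:50051"),
        TLSCAFile:      envOr(getenv, "HA_GATEWAY_TLS_CA_FILE", "/etc/ai-gateway/tls/ca.crt"),
        TLSCertFile:    envOr(getenv, "HA_GATEWAY_TLS_CERT_FILE", "/etc/ai-gateway/tls/tls.crt"),
        TLSKeyFile:     envOr(getenv, "HA_GATEWAY_TLS_KEY_FILE", "/etc/ai-gateway/tls/tls.key"),
    }
    // Fail fast when any mTLS file is missing, reporting all of them at once.
    var errs []error
    for _, path := range []string{cfg.TLSCAFile, cfg.TLSCertFile, cfg.TLSKeyFile} {
        if _, err := os.Stat(path); err != nil {
            errs = append(errs, fmt.Errorf("required TLS file: %w", err))
        }
    }
    if err := errors.Join(errs...); err != nil {
        return Config{}, err
    }
    return cfg, nil
}

// Load matches the signature described above and reads the real environment.
func Load() (Config, error) {
    return loadFrom(os.Getenv)
}

func main() {
    cfg, err := Load()
    if err != nil {
        fmt.Println("config error:", err)
        return
    }
    fmt.Printf("listen=%s model=%s timeout=%s\n", cfg.GRPCListenAddr, cfg.OllamaModel, cfg.OllamaTimeout)
}
```

Collecting the file errors with errors.Join means a misconfigured pod reports every missing file in one log line instead of one per restart.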

5. Domain Layer

`domain/intent.go`

Define the intent contract the LLM must produce:

package domain

type Intent struct {
    Name    string            `json:"intent"`   // e.g. "turn_on_light", "turn_off_light", "none"
    Entity  string            `json:"entity"`   // e.g. "living_room" (friendly name or entity_id)
    Params  map[string]string `json:"params"`   // optional, e.g. {"brightness":"80"}
    Reply   string            `json:"reply"`    // what to say back to the user
}

const (
    IntentNone         = "none"
    IntentTurnOnLight  = "turn_on_light"
    IntentTurnOffLight = "turn_off_light"
    IntentListEntities = "list_entities"
)

`domain/prompt.go`

Build the Ollama prompt. The system prompt MUST instruct the model to return only a single JSON object matching the Intent schema. No markdown fences, no prose.

package domain

import "fmt"

const systemPrompt = `You are a home automation assistant. Given a user request, respond with a single JSON object and nothing else — no markdown, no code fences, no explanation.

Schema:
{
  "intent": "turn_on_light" | "turn_off_light" | "list_entities" | "none",
  "entity": "<friendly_name_or_empty>",
  "params": { "<key>": "<value>" },
  "reply":  "<short human-readable reply>"
}

Rules:
- If the request is not actionable, use intent="none" and put the conversational answer in "reply".
- Always include all four fields. Use "" or {} for empty values.
- Do not wrap the JSON in backticks.`

func BuildPrompt(userText string) string {
    return fmt.Sprintf("%s\n\nUser: %s", systemPrompt, userText)
}

`domain/service.go`

The orchestrator. Depends on two ports (interfaces) defined here:

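package domain

import (
    "context"
    "encoding/json"
    "fmt"
    "log/slog"
    "strings"
)

// LLMClient is the outbound port for text generation (implemented by the Ollama adapter).
type LLMClient interface {
    Generate(ctx context.Context, prompt string) (string, error)
}

// HAClient is the outbound port for Home Assistant actions (implemented by the ha-gateway adapter).
type HAClient interface {
    TurnOnLight(ctx context.Context, entity string, params map[string]string) error
    TurnOffLight(ctx context.Context, entity string) error
    ListEntities(ctx context.Context) ([]string, error)
}

type Service struct {
    llm LLMClient
    ha  HAClient
    log *slog.Logger
}

func NewService(llm LLMClient, ha HAClient, log *slog.Logger) *Service {
    return &Service{llm: llm, ha: ha, log: log}
}

type QueryResult struct {
    Reply       string
    Intent      string
    ActionTaken bool
}

func (s *Service) Query(ctx context.Context, text string) (QueryResult, error) {
    // 1-2. Build the prompt and call the LLM. A failure is returned to the caller.
    raw, err := s.llm.Generate(ctx, BuildPrompt(text))
    if err != nil {
        return QueryResult{}, fmt.Errorf("llm generate: %w", err)
    }
    // 3. Parse the output. Malformed JSON yields a friendly reply, not an error.
    var intent Intent
    if err := json.Unmarshal([]byte(raw), &intent); err != nil {
        s.log.WarnContext(ctx, "unparseable LLM output", "text", text, "raw", raw, "err", err)
        return QueryResult{Reply: "I didn't understand that."}, nil
    }
    // 4. Dispatch on the intent name.
    res := QueryResult{Reply: intent.Reply, Intent: intent.Name}
    var haErr error
    switch intent.Name {
    case IntentTurnOnLight:
        haErr = s.ha.TurnOnLight(ctx, intent.Entity, intent.Params)
    case IntentTurnOffLight:
        haErr = s.ha.TurnOffLight(ctx, intent.Entity)
    case IntentListEntities:
        var entities []string
        if entities, haErr = s.ha.ListEntities(ctx); haErr == nil {
            res.Reply = "Entities: " + strings.Join(entities, ", ") // reply format is illustrative
        }
    default: // "none" or unknown: nothing to dispatch; Intent stays empty as the proto comment describes.
        return QueryResult{Reply: intent.Reply}, nil
    }
    if haErr != nil {
        s.log.ErrorContext(ctx, "ha dispatch failed", "intent", intent.Name, "err", haErr)
        res.Reply = "I couldn't reach Home Assistant right now."
        return res, nil
    }
    // 5. The downstream call succeeded.
    res.ActionTaken = true
    return res, nil
}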

Error handling:

LLM call failure → return error; inbound adapter maps to gRPC Unavailable.
JSON parse failure → do NOT return an error; return a friendly "I didn't understand" reply and log the raw LLM output together with the original text at warn level (not error level).
HA dispatch failure → log at error, return reply "I couldn't reach Home Assistant right now."; ActionTaken=false.

6. Outbound Adapters

`adapters/outbound/ollama/client.go`

Plain net/http.Client with configured timeout.

POST to {OLLAMA_URL}/api/generate with body:

{ "model": "<OLLAMA_MODEL>", "prompt": "<prompt>", "stream": false }

Decode JSON response, return the response field as a string.
Implement domain.LLMClient.
Wrap the HTTP client with OTel instrumentation (otelhttp.NewTransport).
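
The request/response shape above can be sketched with the standard library alone. This is a minimal illustration, not the final adapter: the Client type, its field names, and the generateAgainstStub helper are hypothetical, and the otelhttp transport wrapping is left out so the snippet has no third-party imports. The stub server merely stands in for Ollama and echoes the prompt.

```go
package main

import (
    "bytes"
    "context"
    "encoding/json"
    "fmt"
    "net/http"
    "net/http/httptest"
    "time"
)

// Client is a minimal Ollama client; type and field names are illustrative.
type Client struct {
    baseURL string
    model   string
    http    *http.Client
}

func NewClient(baseURL, model string, timeout time.Duration) *Client {
    return &Client{baseURL: baseURL, model: model, http: &http.Client{Timeout: timeout}}
}

type generateRequest struct {
    Model  string `json:"model"`
    Prompt string `json:"prompt"`
    Stream bool   `json:"stream"`
}

type generateResponse struct {
    Response string `json:"response"`
}

// Generate posts the prompt to /api/generate and returns the "response" field.
func (c *Client) Generate(ctx context.Context, prompt string) (string, error) {
    body, err := json.Marshal(generateRequest{Model: c.model, Prompt: prompt, Stream: false})
    if err != nil {
        return "", fmt.Errorf("encode request: %w", err)
    }
    req, err := http.NewRequestWithContext(ctx, http.MethodPost, c.baseURL+"/api/generate", bytes.NewReader(body))
    if err != nil {
        return "", fmt.Errorf("build request: %w", err)
    }
    req.Header.Set("Content-Type", "application/json")

    resp, err := c.http.Do(req)
    if err != nil {
        return "", fmt.Errorf("call ollama: %w", err)
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return "", fmt.Errorf("ollama returned status %d", resp.StatusCode)
    }
    var out generateResponse
    if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
        return "", fmt.Errorf("decode response: %w", err)
    }
    return out.Response, nil
}

// generateAgainstStub runs Generate against a local stub that stands in for Ollama.
func generateAgainstStub(prompt string) (string, error) {
    stub := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        var in generateRequest
        if err := json.NewDecoder(r.Body).Decode(&in); err != nil || r.URL.Path != "/api/generate" {
            http.Error(w, "bad request", http.StatusBadRequest)
            return
        }
        // Echo the prompt back so the round trip is visible to the caller.
        json.NewEncoder(w).Encode(generateResponse{Response: "echo: " + in.Prompt})
    }))
    defer stub.Close()
    return NewClient(stub.URL, "llama3", 5*time.Second).Generate(context.Background(), prompt)
}

func main() {
    reply, err := generateAgainstStub("hello")
    fmt.Println(reply, err)
}
```

Because Generate has the same signature as domain.LLMClient, a client like this can be passed to the domain service unchanged; a non-200 status is surfaced as an error so the inbound adapter can map it to Unavailable.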

`adapters/outbound/hagateway/client.go`

This is the mTLS-critical piece.

Construct a *grpc.ClientConn to HA_GATEWAY_ADDR with TLS credentials built from the three cert files:

import (
    "crypto/tls"
    "crypto/x509"
    "errors"
    "fmt"
    "os"

    "google.golang.org/grpc/credentials"
)

// loadTLSCredentials builds client mTLS credentials: the CA pool verifies ha-gateway, the key pair identifies ai-gateway.
func loadTLSCredentials(caFile, certFile, keyFile, serverName string) (credentials.TransportCredentials, error) {
    caPEM, err := os.ReadFile(caFile)
    if err != nil {
        return nil, fmt.Errorf("read ca: %w", err)
    }
    cp := x509.NewCertPool()
    if !cp.AppendCertsFromPEM(caPEM) {
        return nil, errors.New("failed to append CA cert")
    }
    clientCert, err := tls.LoadX509KeyPair(certFile, keyFile)
    if err != nil {
        return nil, fmt.Errorf("load client keypair: %w", err)
    }
    return credentials.NewTLS(&tls.Config{
        Certificates: []tls.Certificate{clientCert},
        RootCAs:      cp,
        ServerName:   serverName,
        MinVersion:   tls.VersionTLS13,
    }), nil
}

Use grpc.NewClient(addr, grpc.WithTransportCredentials(creds), grpc.WithStatsHandler(otelgrpc.NewClientHandler())).
Wrap the generated ha-gateway clients (LightServiceClient, EntityServiceClient) to satisfy domain.HAClient.
Expose a Close() method for graceful shutdown.

Cert source: the cert files will be projected into the pod via a Kubernetes Secret mounted at /etc/ai-gateway/tls/. See deployment manifest below. Issuing the cert is covered in §10.

7. Inbound Adapter

`adapters/inbound/grpc/server.go`

Implements aiv1.AIServiceServer.
Query(ctx, req) → calls domain.Service.Query(ctx, req.Text) → maps QueryResult to QueryResponse.
Attach OTel instrumentation as a stats handler (it is a server option, not an interceptor): grpc.StatsHandler(otelgrpc.NewServerHandler()).
Attach a slog unary interceptor that logs method, duration, caller source, and error code.
Register reflection service only if LOG_LEVEL=debug (convenience for grpcurl).

8. Observability (`internal/observability/`)

Copy the pattern from ha-gateway:

`logging.go`

NewLogger(level, format string) *slog.Logger returning either slog.NewJSONHandler or slog.NewTextHandler wrapping os.Stdout.
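
A possible shape for NewLogger, assuming unknown LOG_LEVEL values fall back to info and anything other than "text" selects JSON. Both fallbacks and the parseLevel helper are assumptions of this sketch, not confirmed ha-gateway behaviour.

```go
package main

import (
    "log/slog"
    "os"
    "strings"
)

// parseLevel maps a LOG_LEVEL value to a slog.Level.
// Unknown values fall back to info (an assumption of this sketch).
func parseLevel(level string) slog.Level {
    switch strings.ToLower(level) {
    case "debug":
        return slog.LevelDebug
    case "warn":
        return slog.LevelWarn
    case "error":
        return slog.LevelError
    default:
        return slog.LevelInfo
    }
}

// NewLogger returns a logger writing to stdout in the requested format.
func NewLogger(level, format string) *slog.Logger {
    opts := &slog.HandlerOptions{Level: parseLevel(level)}
    if strings.EqualFold(format, "text") {
        return slog.New(slog.NewTextHandler(os.Stdout, opts))
    }
    return slog.New(slog.NewJSONHandler(os.Stdout, opts))
}

func main() {
    log := NewLogger("info", "json")
    log.Debug("suppressed at info level")
    log.Info("ai-gateway starting", "addr", ":50052")
}
```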

`otel.go`

InitOTel(ctx, endpoint, serviceName) (shutdown func(context.Context) error, err error).
Uses otlptracegrpc + otlpmetricgrpc exporters, insecure credentials (in-cluster).
Sets global TracerProvider and MeterProvider.
Resource attributes: service.name, service.namespace=home-services.

9. Entry Point (`cmd/ai-gateway/main.go`)

Standard startup sequence:

1. Load config.
2. Build logger.
3. Init OTel; defer shutdown.
4. Build Ollama client.
5. Build ha-gateway client (mTLS); defer Close().
6. Build domain service.
7. Build gRPC server with interceptors, register AIService.
8. Listen on GRPC_LISTEN_ADDR.
9. Handle SIGINT/SIGTERM for graceful shutdown: call server.GracefulStop(), fall back to server.Stop() if it has not returned within 10s (GracefulStop itself takes no timeout), then run OTel shutdown.
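
grpc-go's GracefulStop takes no timeout argument, so the 10s bound has to be enforced by the caller. One common pattern, sketched here under assumptions, runs GracefulStop in a goroutine and falls back to Stop when a timer fires. The stopper interface, the shutdown helper, and slowServer are hypothetical names; *grpc.Server satisfies stopper because it has both methods. Signal wiring (for example signal.NotifyContext) is omitted so the snippet terminates on its own.

```go
package main

import (
    "fmt"
    "time"
)

// stopper is the subset of *grpc.Server used during shutdown.
type stopper interface {
    GracefulStop()
    Stop()
}

// shutdown attempts a graceful stop and falls back to a hard stop once the
// timeout elapses. It reports whether the graceful stop finished in time.
func shutdown(s stopper, timeout time.Duration) bool {
    done := make(chan struct{})
    go func() {
        s.GracefulStop()
        close(done)
    }()
    select {
    case <-done:
        return true
    case <-time.After(timeout):
        s.Stop() // closes connections, which lets the pending GracefulStop return
        <-done
        return false
    }
}

// slowServer stands in for a server with in-flight RPCs: GracefulStop blocks
// until Stop is called.
type slowServer struct{ stopped chan struct{} }

func (s *slowServer) GracefulStop() { <-s.stopped }
func (s *slowServer) Stop()         { close(s.stopped) }

func main() {
    srv := &slowServer{stopped: make(chan struct{})}
    fmt.Println("graceful:", shutdown(srv, 50*time.Millisecond))
}
```

In main.go the call would look like shutdown(grpcServer, 10*time.Second), followed by the OTel shutdown function.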

10. TLS / mTLS Plumbing

ha-gateway requires mTLS. ai-gateway needs a client certificate signed by the same CA that ha-gateway trusts.

Approach: cert-manager + internal-ca-issuer

Create a Certificate resource for ai-gateway (file: manifests/home-services/ai-gateway-client-cert.yaml):

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: ai-gateway-client
  namespace: home-services
spec:
  secretName: ai-gateway-client-tls
  duration: 2160h      # 90d
  renewBefore: 360h    # 15d
  subject:
    organizations: [home-services]
  commonName: ai-gateway
  usages:
    - client auth
  issuerRef:
    name: internal-ca-issuer
    kind: ClusterIssuer
    group: cert-manager.io

Important: use internal-ca-issuer (the CA issuer), never internal-ca (the bootstrap self-signed issuer). This matches the homelab convention.

The resulting secret ai-gateway-client-tls contains tls.crt, tls.key, and ca.crt — mount all three.

Verify ha-gateway's CA trust

Confirm ha-gateway's server TLS config trusts internal-ca-issuer's CA (it should, since both use the same cluster CA). If ha-gateway uses a separate client-auth CA, adjust the issuer accordingly.

11. Kubernetes Manifest

File: `manifests/home-services/ai-gateway.yaml`

Single file with --- separators per repo convention.

apiVersion: v1
kind: Service
metadata:
  name: ai-gateway
  namespace: home-services
spec:
  selector: { app: ai-gateway }
  ports:
    - name: grpc
      port: 50052
      targetPort: 50052
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-gateway
  namespace: home-services
spec:
  replicas: 1
  selector: { matchLabels: { app: ai-gateway } }
  template:
    metadata:
      labels: { app: ai-gateway }
    spec:
      containers:
        - name: ai-gateway
          image: gitea.nik4nao.com/nik/ai-gateway:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 50052
              name: grpc
          env:
            - { name: GRPC_LISTEN_ADDR, value: ":50052" }
            - { name: OLLAMA_URL, value: "http://192.168.7.96:11434" }
            - { name: OLLAMA_MODEL, value: "llama3" }
            - { name: HA_GATEWAY_ADDR, value: "ha-gateway.home-services.svc.cluster.local:50051" }
            - { name: HA_GATEWAY_TLS_CA_FILE,   value: "/etc/ai-gateway/tls/ca.crt" }
            - { name: HA_GATEWAY_TLS_CERT_FILE, value: "/etc/ai-gateway/tls/tls.crt" }
            - { name: HA_GATEWAY_TLS_KEY_FILE,  value: "/etc/ai-gateway/tls/tls.key" }
            - { name: HA_GATEWAY_SERVER_NAME,   value: "ha-gateway.home-services.svc.cluster.local" }
            - { name: LOG_LEVEL,  value: "info" }
            - { name: LOG_FORMAT, value: "json" }
            - { name: OTEL_EXPORTER_OTLP_ENDPOINT,
                value: "otel-collector-opentelemetry-collector.monitoring.svc.cluster.local:4317" }
            - { name: OTEL_SERVICE_NAME, value: "ai-gateway" }
          volumeMounts:
            - name: tls
              mountPath: /etc/ai-gateway/tls
              readOnly: true
          readinessProbe:
            tcpSocket: { port: 50052 }
            initialDelaySeconds: 3
            periodSeconds: 10
          livenessProbe:
            tcpSocket: { port: 50052 }
            initialDelaySeconds: 10
            periodSeconds: 20
      volumes:
        - name: tls
          secret:
            secretName: ai-gateway-client-tls
      imagePullSecrets:
        - name: gitea-registry

No resource limits/requests yet — matches current repo convention (memory limits not yet enforced on pods).

12. discord-bot Changes

New: `services/discord-bot/adapters/outbound/aigateway/client.go`

gRPC client to ai-gateway.home-services.svc.cluster.local:50052, plaintext (no auth on ai-gateway's inbound surface yet).
Exposes Query(ctx, text string) (reply string, err error).
Inject into existing command handler.

Removed / simplified

If discord-bot currently contains any direct Ollama calls, remove them.
Slash command handler for free-form queries simply calls aigateway.Query(ctx, msg.Content) and posts the returned reply.
Event-notification path (existing Discord → notify flow) is untouched.

Config additions to discord-bot

AI_GATEWAY_ADDR (default ai-gateway.home-services.svc.cluster.local:50052).

13. CI / Build

`services/ai-gateway/Dockerfile`

Multi-stage build matching existing services:

FROM golang:1.26 AS build
WORKDIR /src
COPY go.work go.work.sum ./
COPY gen ./gen
COPY services/ai-gateway ./services/ai-gateway
WORKDIR /src/services/ai-gateway
RUN CGO_ENABLED=0 go build -o /out/ai-gateway ./cmd/ai-gateway

FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /out/ai-gateway /ai-gateway
USER nonroot:nonroot
ENTRYPOINT ["/ai-gateway"]

Gitea Actions workflow

Mirror the existing ha-gateway workflow:

Trigger on pushes touching services/ai-gateway/**, gen/ai/**, or proto/ai/**.
docker buildx multiarch build (linux/amd64,linux/arm64).
Push to gitea.nik4nao.com/nik/ai-gateway:latest and :${{ github.sha }}.
Use the Gitea API token (read:package + write:package) as registry password — not the account password.
Remember: buildkit CA must be injected each run (existing runner pattern).

14. Workspace Wiring

`go.work` — add line:

use ./services/ai-gateway

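services/ai-gateway/go.mod needs the replace directive for gen that the existing services carry: replace gitea.nik4nao.com/nik/home-services/gen => ../../gen. Relative replace paths resolve from the directory containing go.mod, and gen/ sits at the repo root (§2), so from services/ai-gateway the path is ../../gen, not ../gen.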

`services/ai-gateway/go.mod` dependencies

google.golang.org/grpc
google.golang.org/protobuf
go.opentelemetry.io/otel
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc
go.opentelemetry.io/otel/sdk
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp

15. Testing

Unit tests (`services/ai-gateway/domain/service_test.go`)

Fake LLMClient returning canned JSON strings for each intent.
Fake HAClient recording calls.
Assert:
- Valid turn_on_light JSON → HAClient.TurnOnLight called with correct entity; reply matches.
- Invalid JSON → graceful reply, no panic, no HA call.
- intent="none" → no HA call; reply passed through.
- HA call returning error → reply contains "couldn't reach Home Assistant"; ActionTaken=false.
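
The two fakes can be very small. The sketch below copies the port interfaces from domain/service.go so that it compiles on its own; the names cannedLLM and recordingHA and the "on:"/"off:" call encoding are illustrative choices, not existing repo code.

```go
package main

import (
    "context"
    "errors"
    "fmt"
)

// Ports copied from domain/service.go so the sketch is self-contained.
type LLMClient interface {
    Generate(ctx context.Context, prompt string) (string, error)
}

type HAClient interface {
    TurnOnLight(ctx context.Context, entity string, params map[string]string) error
    TurnOffLight(ctx context.Context, entity string) error
    ListEntities(ctx context.Context) ([]string, error)
}

// cannedLLM returns a fixed string (or error) regardless of the prompt.
type cannedLLM struct {
    out string
    err error
}

func (f cannedLLM) Generate(context.Context, string) (string, error) { return f.out, f.err }

// recordingHA records every call in order and returns err from each of them.
type recordingHA struct {
    calls []string
    err   error
}

func (f *recordingHA) TurnOnLight(_ context.Context, entity string, _ map[string]string) error {
    f.calls = append(f.calls, "on:"+entity)
    return f.err
}

func (f *recordingHA) TurnOffLight(_ context.Context, entity string) error {
    f.calls = append(f.calls, "off:"+entity)
    return f.err
}

func (f *recordingHA) ListEntities(context.Context) ([]string, error) {
    f.calls = append(f.calls, "list")
    return nil, f.err
}

// Compile-time checks that the fakes satisfy the ports.
var (
    _ LLMClient = cannedLLM{}
    _ HAClient  = (*recordingHA)(nil)
)

func main() {
    ctx := context.Background()
    llm := cannedLLM{out: `{"intent":"turn_on_light","entity":"living_room","params":{},"reply":"Turning on the living room light."}`}
    ha := &recordingHA{err: errors.New("ha-gateway unreachable")}

    raw, _ := llm.Generate(ctx, "turn on the living room light")
    err := ha.TurnOnLight(ctx, "living_room", nil)
    fmt.Println(raw)
    fmt.Println(ha.calls, err)
}
```

In service_test.go the same fakes would be passed to NewService, with assertions on recordingHA.calls and on the returned QueryResult for each case listed above.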

Integration smoke test (manual, post-deploy)

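# From inside the cluster. grpcurl needs the schema: either enable server
# reflection (LOG_LEVEL=debug, see §7; the manifest in §11 sets info) or add
# -import-path proto -proto ai/v1/ai.proto when running from a repo checkout.
grpcurl -plaintext -d '{"text":"turn on the living room light","source":"manual"}' \
  ai-gateway.home-services.svc.cluster.local:50052 ai.v1.AIService/Query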

mTLS verification

# Should succeed (using mounted cert):
kubectl exec -n home-services deploy/ai-gateway -- /ai-gateway --selftest  # if implemented
# Or inspect via openssl from within the pod if distroless allows a debug sidecar.

16. Rollout Order

Implement in this order. Each step should compile and tests should pass before the next.

1. Proto + gen: add proto/ai/v1/ai.proto, run buf generate, commit gen/ai/v1/.
2. Scaffold: create services/ai-gateway/ with go.mod, main.go (stub), update go.work.
3. Domain: intent.go, prompt.go, service.go + unit tests with fakes.
4. Ollama adapter: HTTP client, manual curl-based validation against 192.168.7.96:11434.
5. ha-gateway adapter: mTLS dial, wrap generated clients, satisfy domain.HAClient.
6. Inbound gRPC adapter: server, interceptors.
7. Observability: logging + OTel init.
8. Entry point: wire everything in cmd/ai-gateway/main.go.
9. Dockerfile + CI: build and push image to Gitea registry.
10. Cert-manager Certificate: apply ai-gateway-client-cert.yaml; verify ai-gateway-client-tls secret is created.
11. Deployment manifest: apply ai-gateway.yaml; verify pod ready, logs clean, grpcurl smoke test passes.
12. discord-bot update: add aigateway outbound adapter, remove any direct Ollama usage, redeploy.
13. End-to-end test: issue a Discord slash command, observe:
    - Discord → ai-gateway → Ollama → ai-gateway → ha-gateway (mTLS) → HA → reply back.
    - Traces visible in Tempo, logs in Loki, metrics in Prometheus.

17. Open Questions / Deferred

Auth on ai-gateway's inbound surface: currently none. Revisit when alexa-bridge lands — Alexa path is public-ingress, so ai-gateway may eventually need mTLS inbound too.
Intent schema evolution: if the set of intents grows meaningfully, consider moving the schema into the proto (enum + oneof) rather than free-form JSON. For now, JSON keeps the LLM prompt simple.
Conversation memory: out of scope. If needed later, add a per-source session store (Valkey in home-services).
Prompt templates per model: llama3 works with the current system prompt. If swapping to a smaller model, prompt may need tuning — keep BuildPrompt easy to override via config.

18. Acceptance Criteria

ai-gateway pod runs ready in home-services namespace.
grpcurl smoke test (§15) returns a structured QueryResponse for a light command.
Light actually turns on/off in Home Assistant when tested end-to-end.
ha-gateway logs show mTLS handshake succeeded with CN=ai-gateway.
Traces for a full Discord query form a single trace covering three services (discord-bot → ai-gateway → ha-gateway), plus the Ollama HTTP client span emitted by ai-gateway.
discord-bot contains no direct references to OLLAMA_URL or Ollama HTTP client code.
Unit tests pass in CI; Docker image builds multiarch.
