Building in Public

Day 4, Milestone 1: Going Public

What 9 Production Deployment Gotchas Taught Us About Building Infrastructure

February 16, 2026 · 6 min read


"Going public" sounds exciting. In reality, it's a wall of 403 errors, silent container failures, and debugging why objects you can clearly see in your S3 bucket return "Access Denied" to everyone else.

Day 4 of DPP Kit's public beta was supposed to be a victory lap. Instead, it was a masterclass in production deployment gotchas. Here's what we learned pushing UNTP credential infrastructure from local development to live production.

The Context

DPP Kit is a multi-tenant platform for issuing UNTP (UN Transparency Protocol) credentials — Digital Product Passports, facility records, traceability events, and conformity certificates. Our architecture uses:

  • DigitalOcean Spaces (S3-compatible storage) for credential files

  • Directus for our credential management backend

  • VCKit (UN's reference implementation) for credential issuance

  • Identity Resolver for DID resolution

  • Astro + Express for our application layer

  • Caddy as our reverse proxy

  • Containerized deployment across all services

When everything works locally, you assume production is just "deploy and scale." Narrator: It wasn't.

The 9 Gotchas

1. DO Spaces API Keys Are Per-Bucket, Not Account-Wide

The assumption: One API key covers all three of our Spaces buckets (credentials, assets, backups).

The reality: Each bucket needs its own scoped key, or one key explicitly scoped to all three.

The symptom: Generic S3 403 errors in Directus with zero indication it was a key scope issue.

Time lost: About an hour of debugging before checking the DigitalOcean docs.

The fix: Generated properly scoped keys. Lesson: S3-compatible doesn't mean identical API behavior.


2. S3 Object ACLs vs. Bucket-Level "Public" Settings

The assumption: Setting a DO Space to "Public" makes uploaded files publicly accessible.

The reality: The "Public" setting only controls whether you can list files. Individual objects uploaded via S3 API without the ACL: public-read header are still private.

The symptom: Every credential we issued returned 403 when verifiers tried to access it. The URLs were correct. The files existed. They just weren't readable.

Time lost: 2+ hours convinced it was a URL construction problem.

The fix: Patched the upstream storage service (a git submodule we can't push to) with a Dockerfile sed overlay that adds the ACL header. No fork to maintain, and the build fails loudly if upstream changes break the patch.
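To make the fix concrete, here's a minimal sketch of what the patched upload call needs to include. The helper name, bucket name, and content type are illustrative, not from our codebase; the params shape matches what an S3-compatible SDK's PutObject call expects.

```typescript
// Sketch of the gotcha #2 fix: objects must carry an explicit public-read
// ACL, even when the Space itself is set to "Public".
interface UploadParams {
  Bucket: string;
  Key: string;
  Body: string;
  ContentType: string;
  ACL?: "private" | "public-read";
}

// buildUploadParams is a hypothetical helper that assembles the params
// object passed to the SDK's upload call.
function buildUploadParams(bucket: string, key: string, body: string): UploadParams {
  return {
    Bucket: bucket,
    Key: key,
    Body: body,
    ContentType: "application/json",
    // Without this line, DO Spaces stores the object as private and
    // verifiers get 403 even though the Space is marked "Public".
    ACL: "public-read",
  };
}
```

With the AWS SDK v3, for example, this params object would be handed to new PutObjectCommand(params) on a client pointed at the Spaces endpoint.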


3. Identity Resolver Only Starts an HTTP Server in Development Mode

The assumption: The UN's Identity Resolver container would run the same in dev and production.

The reality: The entrypoint checks if (NODE_ENV !== 'production') before calling bootstrap(). In production mode, it only exports a Lambda handler. No HTTP server. No logs. Exit code 0.

The symptom: Container would start, immediately exit with zero logs, and restart in an infinite loop.

Time lost: 45 minutes staring at empty logs before diving into the source code.

The fix: Set NODE_ENV: development on a production deployment. Yes, we know. It feels deeply wrong. It's the only way the container works outside AWS Lambda.
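In Compose terms, the workaround looks something like the fragment below; the service and image names are illustrative, not our actual config.

```yaml
# docker-compose.yml (fragment); service and image names are illustrative
services:
  identity-resolver:
    image: your-registry/identity-resolver:latest
    environment:
      # Counterintuitive but required: in production mode the entrypoint only
      # exports a Lambda handler and exits 0. "development" is the only mode
      # that starts the HTTP server outside AWS Lambda.
      NODE_ENV: development
    restart: unless-stopped
```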


4. Astro's import.meta.env Is Empty at Runtime in Docker

The assumption: Environment variables work like they do in local development.

The reality: Astro uses Vite, which bakes PUBLIC_* variables into the bundle at build time. They're not read from the container's runtime environment. Server-side secrets need explicit process.env fallbacks because import.meta.env doesn't see them.

The symptom: SMTP credentials, Stripe API keys, database URLs — all undefined in production despite being in our .env file.

Time lost: 30 minutes of "why is the environment empty?"

The fix: Docker build ARGs for public vars, process.env fallbacks for secrets. Two different env mechanisms, both with Docker-specific gotchas.
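The runtime half of that fix can be sketched as a small lookup helper. This is an illustrative pattern, not our actual code: the build-time source (import.meta.env in Astro server code) is passed in alongside process.env, and the runtime value is used only when nothing was baked in.

```typescript
// Resolve an env var with a build-time source first, runtime second.
// "baked" stands in for import.meta.env (populated by Vite at build time);
// "runtime" stands in for process.env (the container's actual environment).
function resolveEnv(
  name: string,
  baked: Record<string, string | undefined>,
  runtime: Record<string, string | undefined>
): string | undefined {
  return baked[name] ?? runtime[name];
}

// At a call site in Astro server code, this would look like (illustrative):
//   const smtpPass = resolveEnv("SMTP_PASSWORD", import.meta.env, process.env);
```

The build-time side is the mirror image: PUBLIC_* values have to reach the Docker build as ARGs (and be present as ENV during the build step), or Vite bakes in nothing at all.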


5. Caddy Route Ordering: Two Apps Serving /api/*

The setup: Astro serves its own API routes (/api/register, /api/billing/*, /api/webhooks/*). Express handles the REST API (/api/organizations, /api/credentials).

The assumption: Caddy would intelligently route based on specificity.

The reality: A single /api/* catch-all sends everything to Express, silently 404-ing all Astro routes.

The symptom: Registration, billing, webhooks — all broken. Zero error messages.

Time lost: 20 minutes before realizing we needed explicit routing.

The fix: Added 10 explicit handle blocks before the catch-all. Verbose but reliable.
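A trimmed sketch of what that routing looks like. The upstream names and ports (astro:4321, api:3000) are illustrative, and only three of the ten Astro-owned prefixes are shown.

```caddyfile
# Caddyfile (fragment); upstream names and ports are illustrative
example.com {
    # Astro-owned API routes, matched ahead of the Express catch-all
    handle /api/register* {
        reverse_proxy astro:4321
    }
    handle /api/billing/* {
        reverse_proxy astro:4321
    }
    handle /api/webhooks/* {
        reverse_proxy astro:4321
    }
    # Everything else under /api/* goes to the Express REST API
    handle /api/* {
        reverse_proxy api:3000
    }
    # Remaining traffic: the Astro app itself
    handle {
        reverse_proxy astro:4321
    }
}
```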


6. DO Managed PostgreSQL Uses Self-Signed SSL Certs

The assumption: Managed database SSL "just works."

The reality: DigitalOcean's managed PostgreSQL uses SSL with a self-signed CA. Directus v11's DB_SSL__REJECT_UNAUTHORIZED config option didn't work.

The symptom: Connection refused errors from both Directus and VCKit.

Time lost: 15 minutes of TLS debugging.

The fix: the nuclear option, NODE_TLS_REJECT_UNAUTHORIZED=0 on both containers. We're on a private VPC, but we're not thrilled about it.
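In Compose form, the workaround is just an env var on both services (service names illustrative):

```yaml
# docker-compose.yml (fragment); service names are illustrative
services:
  directus:
    environment:
      # Accept DO managed Postgres's self-signed CA. Blunt and process-wide:
      # this disables TLS verification for ALL outbound connections from the
      # container, tolerable here only because traffic stays on a private VPC.
      NODE_TLS_REJECT_UNAUTHORIZED: "0"
  vckit:
    environment:
      NODE_TLS_REJECT_UNAUTHORIZED: "0"
```

A less blunt alternative, where it fits, is Node's NODE_EXTRA_CA_CERTS pointed at the CA certificate DigitalOcean lets you download for the cluster, which keeps verification on for everything else.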


7. Disk Space on a 25GB Droplet

The problem: Docker image builds — especially VCKit with its massive dependency tree — repeatedly filled the disk.

The temptation: docker system prune -a frees space instantly.

The cost: Full rebuilds every time.

Time lost: Probably 3+ hours across multiple builds.

The fix: Upgraded to a larger disk. Should've done this first.


8. Silent Failures with Zero Diagnostic Logging

The problem: The credential verification route returned 502. No URL logged. No response status. Nothing in the API logs.

The revelation: Adding three lines of console.error immediately revealed the root cause (the Spaces ACL issue from #2).

Time lost: Could've saved an hour if we'd logged failures from day one.

The lesson: Every fetch-and-proxy pattern needs failure logging before you deploy. Not after you get a 502.
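A sketch of what "three lines of console.error" buys you in a fetch-and-proxy route. The handler and response shape are simplified illustrations, not our actual code, and it assumes Node 18+ with a global fetch.

```typescript
// Minimal response interface standing in for an Express-style res object.
interface ProxyRes {
  status(code: number): ProxyRes;
  send(body: string): void;
}

// Fetch an upstream credential and proxy it through, logging every failure
// path before returning 502 so the root cause shows up in the API logs.
async function proxyCredential(url: string, res: ProxyRes): Promise<void> {
  try {
    const upstream = await fetch(url);
    if (!upstream.ok) {
      // Without this line, the only symptom is a bare 502 in the browser.
      console.error(`credential fetch failed: url=${url} status=${upstream.status}`);
      res.status(502).send("upstream fetch failed");
      return;
    }
    res.status(200).send(await upstream.text());
  } catch (err) {
    console.error(`credential fetch threw: url=${url}`, err);
    res.status(502).send("upstream fetch failed");
  }
}
```

Had the 502 path logged the URL and status from day one, the Spaces ACL issue in #2 would have been visible immediately instead of an hour later.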


9. Modifying Upstream Submodules You Can't Push To

The context: The ACL fix (issue #2) required changing the storage service, which is a git submodule from uncefact/project-storage-service.

The problem: We can't push to upstream. Forking creates maintenance burden.

The solution: A Dockerfile overlay that uses sed to patch the source at build time. No fork. Build fails loudly if upstream changes break the patch.

The tradeoff: Fragile. But better than maintaining a fork or monkeypatching at runtime.
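The shape of that overlay, heavily simplified: the file path, the grep/sed expressions, and the base image below are all illustrative, since the real patch targets a specific line in the upstream upload code.

```dockerfile
# Dockerfile overlay for the vendored storage service (illustrative fragment)
FROM node:20-alpine AS build
COPY project-storage-service/ /app/
WORKDIR /app

# Patch upstream at build time instead of maintaining a fork. The grep -q
# guard makes the build fail loudly if upstream refactors the line we expect
# to patch, instead of silently shipping an unpatched image.
RUN grep -q "Bucket: bucket" src/storage.ts \
 && sed -i "s/Bucket: bucket,/Bucket: bucket, ACL: 'public-read',/" src/storage.ts

RUN npm ci && npm run build
```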


The Lessons

1. S3-compatible ≠ S3-identical. Bucket policies, object ACLs, API scoping — they all have subtle differences across providers.

2. Production mode isn't always production-ready. Some tools assume you're deploying to their preferred cloud provider (looking at you, Lambda-only mode).

3. Log everything. Especially HTTP fetch failures. Silent 502s are debugging nightmares.

4. Environment variables have two lifecycles in containerized builds. Build-time vs. runtime. Know which your framework uses.

5. Reverse proxy routing needs explicit ordering. Path specificity matters. Catch-alls should always come last.

6. Start with more disk space than you think you need. Rebuilding Docker images on a full disk isn't fun.


What's Next

Despite the challenges, DPP Kit is now public. Our first credentials are being issued. Verification is working. The infrastructure is solid.

We're back to shipping features:

  • This week: AI-assisted credential creation (turning product descriptions into UNTP-compliant JSON)

  • Next sprint: Full UNTP credential type coverage (DCC, DIA, DTE)

  • March 17: Public launch at Vegas Tech Alley

If you're building UNTP infrastructure or deploying credential issuance systems, hopefully this saves you a few hours of debugging. If you're evaluating DPP Kit — yes, we hit snags. But we also shipped through them.

That's what infrastructure teams do.


Ready to issue your first UNTP credential? Start with our Pilot tier — $40/month, 1,000 credentials, zero infrastructure headaches.


Dan is the founder of DPP Kit. He builds supply chain transparency infrastructure and occasionally writes about the unglamorous parts of shipping software.