Operational trust
We meet you at industry standards
Cryptography is the easy part. Operational discipline — the way an engineering team runs a production system, recovers from incidents, and submits itself to outside scrutiny — is what makes a security claim credible. We treat hosting, change management, observability, and third-party audit as load-bearing parts of the security story, not as afterthoughts.
Executive summary
EngineeringID runs behind Cloudflare with origin access restricted to Cloudflare proxy ranges. SOC 2 Type 1 readiness is in progress on a self-policed basis (not yet externally audited). Enterprise customers receive service credits under the terms of their Enterprise contract and are covered by an active incident-response runbook. Every change passes through CI checks (warnings-as-errors compilation, full test suite, dependency audit) before deploy.
Our commitments
Five rules for the production system
Every deploy is reproducible and audited
Every production deploy ties to a specific git SHA, a specific release tag, and a CI run that gated it. A rollback is a single deploy of a known-good SHA — never a manual revert.
The origin is not internet-routable
The Fly origin only accepts traffic from Cloudflare proxy ranges. Direct-to-origin requests are rejected at the edge, so there is no surprise public surface even if the origin's DNS record leaks.
Backups are tested, not just taken
We restore from backup on a regular cadence and verify the restore is functional. A backup that has never been restored is a hope, not a recovery plan.
Third parties test what we test
External penetration testing is on the roadmap. Findings, when produced, will be tracked in the same project board our internal security work uses.
Status, not just uptime
A public status page reports API, sealing, verification, AI assistant, and webhook delivery separately, not as a single global up/down. Incidents are posted in real time with engineer commentary.
Implementation — hosting
Where the bits actually run
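As an illustration of the origin restriction described in the commitments above, here is a minimal Python sketch of a proxy-range allowlist check. It is not EngineeringID's implementation: the function name `is_from_cloudflare` is hypothetical, and the three ranges are only an illustrative subset of the list Cloudflare publishes, which a real deployment would load in full and keep current.

```python
import ipaddress

# Illustrative subset of Cloudflare's published proxy ranges.
# Assumption: a real deployment loads the full, current list from Cloudflare.
CLOUDFLARE_RANGES = [
    ipaddress.ip_network("173.245.48.0/20"),
    ipaddress.ip_network("103.21.244.0/22"),
    ipaddress.ip_network("2400:cb00::/32"),
]


def is_from_cloudflare(peer_ip: str) -> bool:
    """Return True if the connecting peer address falls inside an allowed range."""
    try:
        addr = ipaddress.ip_address(peer_ip)
    except ValueError:
        return False  # malformed address: reject rather than guess
    # Membership is False when the address and network IP versions differ.
    return any(addr in net for net in CLOUDFLARE_RANGES)


if __name__ == "__main__":
    for ip in ("173.245.48.10", "203.0.113.7", "2400:cb00::1"):
        print(ip, "allowed" if is_from_cloudflare(ip) else "rejected")
```

A check like this only means something when it is applied to the TCP peer address. Headers such as X-Forwarded-For are supplied by the client and cannot be trusted for this decision.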
Implementation — change management
How code reaches production
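The rule that every deploy ties to a git SHA, a release tag, and a gating CI run, and that a rollback is a redeploy of a known-good SHA, can be sketched as follows. This is a hypothetical Python model, not the actual pipeline: the `Deploy` record, `gate`, and `rollback_target` are names invented for illustration, and the SHA pattern assumes 40-character SHA-1 object names.

```python
import re
from dataclasses import dataclass

# Assumption: repositories use 40-character SHA-1 object names.
SHA_RE = re.compile(r"[0-9a-f]{40}")


@dataclass(frozen=True)
class Deploy:
    git_sha: str
    release_tag: str
    ci_run_id: str
    ci_passed: bool


def gate(deploy: Deploy) -> None:
    """Refuse any deploy that is not tied to a SHA, a tag, and a green CI run."""
    if not SHA_RE.fullmatch(deploy.git_sha):
        raise ValueError("deploy must reference a full 40-character git SHA")
    if not deploy.release_tag:
        raise ValueError("deploy must reference a release tag")
    if not deploy.ci_passed:
        raise ValueError(f"CI run {deploy.ci_run_id} did not pass")


def rollback_target(history: list[Deploy], bad_sha: str) -> Deploy:
    """Pick the most recent gated deploy that precedes `bad_sha`.

    `history` is ordered oldest first. The result is redeployed through the
    same gate as any other release; nothing is reverted by hand.
    """
    shas = [d.git_sha for d in history]
    idx = shas.index(bad_sha)  # raises ValueError if the SHA was never deployed
    for candidate in reversed(history[:idx]):
        if candidate.ci_passed:
            return candidate
    raise LookupError("no known-good deploy precedes the bad one")
```

Because the rollback target is itself a `Deploy` record, it passes through `gate` again before release, which keeps rollbacks on the same audited path as forward deploys.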
Implementation — observability & response
When things go wrong, what we do
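The per-request request_id mentioned under structured logging can be illustrated with a short sketch. The production stack appears to be Elixir/Phoenix, so this Python version is only an analogy under stated assumptions: `request_id_var`, `JsonFormatter`, and `handle_request` are hypothetical names, and the log fields are chosen for illustration.

```python
import contextvars
import json
import logging
import uuid

# Holds the current request's ID; "-" marks log lines emitted outside a request.
request_id_var = contextvars.ContextVar("request_id", default="-")


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line, always carrying the request_id."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "level": record.levelname,
                "message": record.getMessage(),
                "request_id": request_id_var.get(),
            }
        )


def handle_request(logger: logging.Logger, incoming_id=None) -> str:
    """Bind a request_id for the duration of one request and log with it."""
    rid = incoming_id or uuid.uuid4().hex
    token = request_id_var.set(rid)
    try:
        logger.info("request started")
        # ... application work would happen here ...
        logger.info("request finished")
        return rid
    finally:
        request_id_var.reset(token)  # do not leak the ID into the next request
```

With every line tagged this way, an incident responder can filter logs, telemetry, and error reports down to a single request by its ID.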
The full picture
What is built, what is being built, and what we chose not to build
Live today
Reproducible deploys via fly deploy
Live: Every release ties to one git SHA; rollback is a redeploy of a previous SHA, not a manual revert.
Origin restricted to Cloudflare proxy ranges
Live: Direct-to-origin requests are rejected at the edge. The origin is not in the public DNS attack surface.
Managed Fly Postgres backups
Live: Backups managed by Fly Postgres with point-in-time recovery available.
Structured logs, telemetry, Sentry error tracking
Live: Per-request request_id flows through the stack. Telemetry events surface in Phoenix.LiveDashboard.
Pre-commit + CI gates with warnings-as-errors
Live: Every PR compiles clean and passes the full test suite before review; the main branch enforces the same.
Building now
SOC 2 Type 1 readiness
Building now: Self-policed readiness in progress; external auditor not yet engaged.
Type 1 attests to control design at a point in time. Type 2 (operating effectiveness over a 3 to 12 month observation window) is a separate, later engagement.
Public status page with per-component uptime
Building now: Separate signals for API, sealing, verification, AI, webhook delivery. Real-time incident posting with engineer commentary.
Replaces the current internal-only status surface.
Third-party penetration testing
Roadmap: External red team engagement. Not yet engaged.
A first engagement is planned alongside the SOC 2 Type 1 readiness work.
Synchronous database replication
Building now: Writes acknowledge only after at least one replica has accepted them; targets single-host failure tolerance.
Automated backup restoration drills
Building now: Periodic restore exercises to a non-prod environment to verify functionality, not just file presence.
Disaster recovery plan with documented RTO / RPO
Building now: Recovery time and recovery point objectives published per environment, with annual DR exercises that test restoration end-to-end.
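To make the RTO / RPO terms concrete, here is a minimal Python sketch of how a restoration drill could be scored against published objectives. The source does not state any objective values, so the `Objectives` and `DrillResult` types, the `evaluate` function, and the numbers in the example are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Objectives:
    rpo: timedelta  # maximum tolerable window of lost writes
    rto: timedelta  # maximum tolerable time until service is restored


@dataclass(frozen=True)
class DrillResult:
    failure_at: datetime
    last_recoverable_write_at: datetime
    service_restored_at: datetime


def evaluate(drill: DrillResult, objectives: Objectives) -> dict:
    """Compare what a drill achieved against the stated objectives."""
    # Data written after the last recoverable write and before the failure is lost.
    achieved_rpo = drill.failure_at - drill.last_recoverable_write_at
    # Downtime runs from the failure until the service is usable again.
    achieved_rto = drill.service_restored_at - drill.failure_at
    return {
        "achieved_rpo": achieved_rpo,
        "achieved_rto": achieved_rto,
        "rpo_met": achieved_rpo <= objectives.rpo,
        "rto_met": achieved_rto <= objectives.rto,
    }


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    drill = DrillResult(
        failure_at=datetime(2024, 1, 1, 12, 0),
        last_recoverable_write_at=datetime(2024, 1, 1, 11, 58),
        service_restored_at=datetime(2024, 1, 1, 12, 45),
    )
    print(evaluate(drill, Objectives(rpo=timedelta(minutes=5), rto=timedelta(minutes=30))))
```

Scoring each drill this way turns the published objectives into something a restore exercise can pass or fail, which is the difference between verifying functionality and verifying file presence.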
Roadmap
ISO 27001 certification
Roadmap: Following SOC 2. ISO 27001 maps closely to the same controls; the gap is documentation rigor, not implementation.
HIPAA Business Associate Agreement
Roadmap: For customers in healthcare verticals. Most controls are already in place; the BAA is a contractual layer on top of the technical posture.
EU and APAC regional residency
Roadmap: Per-region database + storage so EU customer data stays in EU and APAC stays in APAC. Driven by GDPR and comparable regional requirements.
Customer-private cloud deployment
Roadmap: A single-tenant deployment running in the customer's own AWS / Azure / GCP account, with the same operational model. For the small set of customers whose compliance needs preclude shared multi-tenant infrastructure.
Considered & rejected
Self-hosted Kubernetes cluster
Considered & rejected: Operating Kubernetes is a full-time job for a security and reliability team larger than ours.
Why we rejected it: Kubernetes gives flexibility, but the security and reliability burden of running it well — patching, etcd backups, network policies, RBAC drift — is a meaningful headcount commitment. Our managed host gives us per-app microVM isolation, edge proximity, and managed PostgreSQL without that overhead. We will revisit if our scale changes the tradeoff.
Skipping pre-commit checks for "urgent" hotfixes
Considered & rejected: A hotfix that bypasses the same checks every other deploy uses is a hotfix that introduces a regression in the next hour.
Why we rejected it: every "we'll skip CI just this once" is followed by a postmortem about a missed test. The right answer is: make CI fast enough that nobody asks. Our pre-commit completes in seconds; CI completes in single-digit minutes. There is no "we don't have time" path.
Single-region deployment without DR plan
Considered & rejected: A region outage that takes EngineeringID down for an afternoon costs more in customer trust than the DR cost saves.
Why we rejected it: cross-region replication adds latency and cost, but it is the table-stakes answer to "what happens when AWS us-east-1 goes down." Customers do not accept "it was the cloud provider" as an availability story.
"Trust us" attestations in lieu of third-party audit
Considered & rejected: A self-attestation has the same evidentiary weight as no attestation at all.
Why we rejected it: customers ask for SOC 2 because they want a third party to have looked at us, not because they want us to write a confident page. The audit is the point. We are preparing for it: SOC 2 Type 1 readiness is in progress, and an external auditor is not yet engaged.
Compliance mappings
Controls this surface satisfies
Availability — SLA commitments
Service credits per Enterprise contract; status page reflects per-component reality
Availability — Recovery
Fly Postgres backups with point-in-time recovery; automated restoration drills (in progress)
Change Management
Per-PR review + green CI; reproducible deploys per git SHA
Monitoring — Independent assessment
Annual third-party penetration testing (on roadmap; not yet engaged)
Change management
Documented change-management process with peer review
Information security continuity
Cross-region replication; DR exercises (in progress)
Independent review of information security
External pen test on roadmap; SOC 2 Type 1 readiness in progress
Contingency Plan
Backup, DR, and emergency-mode procedures documented
For compliance teams
Questions you do not need to call to ask
When will the SOC 2 Type 2 report be available?
What is the uptime commitment?
What is the RTO / RPO?
Where is customer data stored?
Do you support customer-private cloud deployment?
What happens during a security incident?
How do you handle subprocessors?
An operations team that takes the boring parts seriously
Talk to our security team about your compliance posture, or read the audit summary once it lands.