Every running system is two products: the one users see and the one operators do.
[ trace // field response ]
Every system in production is actually two products. The first is what the user sees. The second is what the operator sees — metrics, traces, alerts, dashboards. The second product determines whether the first one survives its second year.
Teams that build only the first product end up reverse-engineering the second one in the middle of an outage. This is the worst possible time to design an observability strategy.
Ship both products on purpose. The second one is the one that lets the first one grow up.