prometheus-go-code-review

Prometheus Go Code Review

Review Checklist

  • Metric types match measurement semantics (Counter/Gauge/Histogram)
  • Labels have low cardinality (no user IDs, timestamps, paths)
  • Metric names follow conventions (snake_case, unit suffix)
  • Histograms use appropriate bucket boundaries
  • Metrics registered once, not per-request
  • Custom collectors are safe for concurrent use and don't panic under races
  • /metrics endpoint exposed and accessible

Metric Type Selection

| Measurement | Type | Example |
| --- | --- | --- |
| Requests processed | Counter | requests_total |
| Items in queue | Gauge | queue_length |
| Request duration | Histogram | request_duration_seconds |
| Concurrent connections | Gauge | active_connections |
| Errors since start | Counter | errors_total |
| Memory usage | Gauge | memory_bytes |

Critical Anti-Patterns

1. High Cardinality Labels

go
// BAD - unique per user/request
counter := promauto.NewCounterVec(
    prometheus.CounterOpts{Name: "requests_total"},
    []string{"user_id", "path"},  // millions of series!
)
counter.WithLabelValues(userID, request.URL.Path).Inc()

// GOOD - bounded label values
counter := promauto.NewCounterVec(
    prometheus.CounterOpts{Name: "requests_total"},
    []string{"method", "status_code"},  // <100 series
)
counter.WithLabelValues(r.Method, statusCode).Inc()
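
Even "bounded" labels can grow if their values come straight from the request. The following is a minimal sketch, using only the standard library, of mapping request data onto a closed set of label values before it reaches WithLabelValues. The helper names statusClass and methodLabel are hypothetical and not part of the Prometheus client.

```go
package main

import (
	"fmt"
	"net/http"
)

// statusClass collapses an HTTP status code into a small, fixed set of
// label values ("1xx" through "5xx", or "unknown").
func statusClass(code int) string {
	if code < 100 || code > 599 {
		return "unknown"
	}
	return fmt.Sprintf("%dxx", code/100)
}

// methodLabel maps the request method onto a closed set, so a client
// sending arbitrary method strings cannot create new series.
func methodLabel(method string) string {
	switch method {
	case http.MethodGet, http.MethodHead, http.MethodPost, http.MethodPut,
		http.MethodPatch, http.MethodDelete, http.MethodOptions:
		return method
	default:
		return "other"
	}
}

func main() {
	fmt.Println(methodLabel("GET"), statusClass(404))
	fmt.Println(methodLabel("BREW"), statusClass(999))
}
```

With helpers like these, the GOOD example would call counter.WithLabelValues(methodLabel(r.Method), statusClass(code)). Whether to keep exact status codes or only their class is a judgment call; both are bounded.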

2. Wrong Metric Type

go
// BAD - using gauge for monotonic value
requestCount := promauto.NewGauge(prometheus.GaugeOpts{
    Name: "http_requests",
})
requestCount.Inc()  // should be Counter!

// GOOD
requestCount := promauto.NewCounter(prometheus.CounterOpts{
    Name: "http_requests_total",
})
requestCount.Inc()

3. Registering Per-Request

go
// BAD - new metric per request
func handler(w http.ResponseWriter, r *http.Request) {
    counter := prometheus.NewCounter(...)  // creates new each time!
    prometheus.MustRegister(counter)       // panics on duplicate!
}

// GOOD - register once
var requestCounter = promauto.NewCounter(prometheus.CounterOpts{
    Name: "http_requests_total",
})

func handler(w http.ResponseWriter, r *http.Request) {
    requestCounter.Inc()
}

4. Missing Unit Suffix

go
// BAD
duration := promauto.NewHistogram(prometheus.HistogramOpts{
    Name: "request_duration",  // no unit!
})

// GOOD
duration := promauto.NewHistogram(prometheus.HistogramOpts{
    Name: "request_duration_seconds",  // unit in name
})
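
A naming check like this can be partly automated during review. The following is a rough sketch using only the standard library; checkMetricName, the snake_case pattern, and the suffix list are illustrative assumptions, and the suffix list is deliberately incomplete. Unitless gauges such as queue_length will be flagged and need a human decision.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// snakeCase matches lowercase snake_case names. This is stricter than what
// Prometheus itself accepts; it encodes the convention, not the syntax.
var snakeCase = regexp.MustCompile(`^[a-z][a-z0-9]*(_[a-z0-9]+)*$`)

// knownSuffixes is an illustrative, non-exhaustive list of unit suffixes,
// plus _total for counters.
var knownSuffixes = []string{"_seconds", "_bytes", "_ratio", "_total"}

// checkMetricName returns the naming problems found in name, if any.
func checkMetricName(name string) []string {
	var problems []string
	if !snakeCase.MatchString(name) {
		problems = append(problems, "not snake_case")
	}
	hasSuffix := false
	for _, s := range knownSuffixes {
		if strings.HasSuffix(name, s) {
			hasSuffix = true
			break
		}
	}
	if !hasSuffix {
		problems = append(problems, "no recognized unit suffix")
	}
	return problems
}

func main() {
	for _, name := range []string{"request_duration", "request_duration_seconds", "RequestDuration"} {
		fmt.Println(name, checkMetricName(name))
	}
}
```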

Good Patterns

Metric Definition

go
var (
    httpRequests = promauto.NewCounterVec(
        prometheus.CounterOpts{
            Namespace: "myapp",
            Subsystem: "http",
            Name:      "requests_total",
            Help:      "Total HTTP requests processed",
        },
        []string{"method", "status"},
    )

    httpDuration = promauto.NewHistogramVec(
        prometheus.HistogramOpts{
            Namespace: "myapp",
            Subsystem: "http",
            Name:      "request_duration_seconds",
            Help:      "HTTP request latencies",
            Buckets:   []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10},
        },
        []string{"method"},
    )
)
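
Bucket boundaries are worth checking against the latency thresholds the team actually cares about: the fraction of requests under a threshold can only be read exactly from a histogram when that threshold is one of the bucket upper bounds. The sketch below, standard library only, shows two checks a reviewer might apply. The helper names and the 0.3 s threshold are assumptions chosen for illustration.

```go
package main

import "fmt"

// latencyBuckets repeats the boundaries from the metric definition above.
var latencyBuckets = []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10}

// strictlyIncreasing reports whether every boundary is larger than the one
// before it.
func strictlyIncreasing(buckets []float64) bool {
	for i := 1; i < len(buckets); i++ {
		if buckets[i] <= buckets[i-1] {
			return false
		}
	}
	return true
}

// hasBoundary reports whether target is exactly one of the upper bounds.
func hasBoundary(buckets []float64, target float64) bool {
	for _, b := range buckets {
		if b == target {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(strictlyIncreasing(latencyBuckets))
	fmt.Println(hasBoundary(latencyBuckets, 0.25)) // a 250 ms threshold is covered
	fmt.Println(hasBoundary(latencyBuckets, 0.3))  // a 300 ms threshold is not
}
```

If a threshold such as 0.3 s matters, adding it as an explicit boundary is usually cheaper than interpolating between 0.25 and 0.5 later.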

Middleware Pattern

go
func metricsMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        timer := prometheus.NewTimer(httpDuration.WithLabelValues(r.Method))
        defer timer.ObserveDuration()

        wrapped := &responseWriter{ResponseWriter: w, status: 200}
        next.ServeHTTP(wrapped, r)

        httpRequests.WithLabelValues(r.Method, strconv.Itoa(wrapped.status)).Inc()
    })
}
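
The middleware above relies on a responseWriter type that is not shown. A minimal sketch of such a wrapper follows, assuming the field names used in the middleware (ResponseWriter, status); the wroteHeader field and the recordedStatus helper are additions for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// responseWriter wraps http.ResponseWriter and remembers the first status
// code sent, so middleware can use it as a label value afterwards.
type responseWriter struct {
	http.ResponseWriter
	status      int
	wroteHeader bool
}

// WriteHeader records the first status code and forwards the call.
func (rw *responseWriter) WriteHeader(code int) {
	if !rw.wroteHeader {
		rw.status = code
		rw.wroteHeader = true
	}
	rw.ResponseWriter.WriteHeader(code)
}

// Write marks the header as sent; if WriteHeader was never called, the
// status keeps the default the caller initialized it with (200).
func (rw *responseWriter) Write(b []byte) (int, error) {
	rw.wroteHeader = true
	return rw.ResponseWriter.Write(b)
}

// Two small handlers used to exercise the wrapper.
var (
	notFound   = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { http.NotFound(w, r) })
	implicitOK = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { fmt.Fprint(w, "ok") })
)

// recordedStatus runs h against an in-memory recorder and returns the
// status code the wrapper captured.
func recordedStatus(h http.Handler) int {
	wrapped := &responseWriter{ResponseWriter: httptest.NewRecorder(), status: http.StatusOK}
	h.ServeHTTP(wrapped, httptest.NewRequest(http.MethodGet, "/", nil))
	return wrapped.status
}

func main() {
	fmt.Println(recordedStatus(notFound), recordedStatus(implicitOK))
}
```

One trade-off to be aware of: a plain wrapper like this hides optional interfaces such as http.Flusher and http.Hijacker that the underlying writer may implement, which matters for streaming and WebSocket handlers.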

Exposing Metrics

go
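import (
    "log"
    "net/http"

    "github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
    // Serve the default registry in the Prometheus text format.
    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":9090", nil)) // returns only on failure
}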

Review Questions

  1. Are metric types correct (Counter vs Gauge vs Histogram)?
  2. Are label values bounded (no UUIDs, timestamps, paths)?
  3. Do metric names include units (_seconds, _bytes)?
  4. Are metrics registered once (not per-request)?
  5. Is the /metrics endpoint properly exposed?