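Kubernetes Webhooks in Practice: Building a Custom Controller from Scratch
Abstract: Kubernetes has become the de facto standard for container orchestration, and its extensibility is a key reason for that success. With Custom Resource Definitions (CRDs) and custom controllers, developers can extend the Kubernetes API to manage domain-specific applications or infrastructure. Webhooks, and admission webhooks in particular, play a central role in this extension mechanism: they allow an object to be validated or mutated before it is persisted. This article explains the concepts behind Kubernetes webhooks and walks you step by step through building a simple custom controller component with admission webhooks from scratch. We cover the complete workflow: defining a CRD, writing the webhook server, configuring TLS, deploying to a Kubernetes cluster, and finally testing the result.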
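1. Introduction: The Foundation of Kubernetes Extensibility
With its declarative API and strong automation, Kubernetes greatly simplifies deploying, scaling, and managing containerized applications. However, the standard Kubernetes resources (Pod, Deployment, Service, and so on) cannot cover every business scenario. For example, you may need to manage a custom database cluster, a particular message-queue configuration, or a deployment strategy that is unique to your application.
To address this, Kubernetes provides two extension mechanisms:
- Custom Resource Definitions (CRDs): let users define new API object types, extending the Kubernetes API.
- Custom controllers: watch create, update, and delete events for these custom resources (CRs) or for built-in resources, and run the business logic needed to keep the actual state of the system consistent with the desired state declared in the CR.
Webhooks, and admission webhooks in particular, add a dynamic, programmable layer of control on top of this. They let us inject custom logic into the critical path of request handling in the Kubernetes API server, enabling finer-grained resource management and policy enforcement.
This article explains how admission webhooks work and then uses a practical example to show how to combine a CRD with Go to build, from scratch, a custom webhook service that can both validate and mutate objects, and how to integrate it into a Kubernetes cluster.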
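2. Core Concepts
Before moving on to the hands-on part, let's review the core concepts involved.
2.1 Kubernetes Controllers
Controllers are the brains of Kubernetes. They are long-running background processes that watch the cluster state through the API server and work to move the current state toward the desired state. For example:
- Deployment controller: ensures that the specified number of Pod replicas is running.
- ReplicaSet controller: maintains a stable set of running Pod replicas.
- Node controller: handles node health checks and node management.
Custom controllers follow the same pattern, but they focus on user-defined CRDs or on built-in resources that need special handling. At their core is the reconciliation loop:
1. Observe: watch the current state of the resource.
2. Analyze: compare the current state with the desired state (usually defined in the resource's `spec` field).
3. Act: take action (create, update, or delete other resources) to close the gap.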
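The three steps can be sketched in a few lines of Go. This is only an illustration of the pattern, not how a client-go controller is actually wired: the `state` type and the `reconcile` function are hypothetical names, and the "cluster" is just an in-memory map.

```go
package main

import "fmt"

// state maps a workload name to a replica count. It is a hypothetical,
// in-memory stand-in for what a real controller reads from the API server.
type state map[string]int

// reconcile performs one pass of the loop: observe the actual state, compare
// it with the desired state, and act to close the gap. It returns the
// actions it took so they can be logged or tested.
func reconcile(desired, actual state) []string {
	var actions []string
	for name, want := range desired {
		have := actual[name] // Observe (0 if the workload does not exist)
		if have == want {    // Analyze
			continue
		}
		actual[name] = want // Act
		actions = append(actions, fmt.Sprintf("scale %s %d->%d", name, have, want))
	}
	for name := range actual {
		if _, ok := desired[name]; !ok {
			delete(actual, name) // Act: remove what is no longer desired
			actions = append(actions, "delete "+name)
		}
	}
	return actions
}

func main() {
	desired := state{"web": 3}
	actual := state{"web": 1, "old": 2}
	fmt.Println(reconcile(desired, actual)) // first pass makes changes
	fmt.Println(reconcile(desired, actual)) // second pass finds nothing to do
}
```

Running the same pass twice shows the property that matters most: once the actual state matches the desired state, reconciling again changes nothing.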
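2.2 Custom Resource Definitions (CRDs) and Custom Resources (CRs)
- CRD (Custom Resource Definition): a Kubernetes API resource (`CustomResourceDefinition` in `apiextensions.k8s.io/v1`). Creating a CRD object tells the Kubernetes API server that a new resource type exists, including its name (`kind`), its group (`group`), its version (`version`), and its structure (an OpenAPI v3 schema, which `apiextensions.k8s.io/v1` requires).
- CR (Custom Resource): an instance of the resource type defined by a CRD. Once the CRD exists, users can create, get, update, and delete CR objects of that type just as they would a Pod or a Service.
CRDs provide the foundation for storing and retrieving structured data, but they contain no business logic of their own. That logic is implemented by custom controllers.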
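2.3 Webhooks
In Kubernetes, a webhook is a mechanism that lets an external service intervene at a specific stage of request processing in the API server. Two types of webhook are relevant to our topic:
- Admission webhook: called before an object is persisted to etcd, and used for admission control. It can validate or mutate API requests.
- Conversion webhook: used when a CRD defines multiple versions, to convert objects between those versions.
This article focuses on admission webhooks.
2.4 Admission Webhooks
Admission webhooks come in two types:
- Validating admission webhook:
  - Purpose: validate incoming API objects to ensure they satisfy specific policies or constraints.
  - Behavior: can only decide whether to allow or deny the request; it cannot modify the object. If validation fails, the request is rejected and an error message is returned to the user.
  - Configuration: registered through a `ValidatingWebhookConfiguration` object.
- Mutating admission webhook:
  - Purpose: modify an object before it is persisted. Commonly used to set default values, inject sidecar containers, or add labels and annotations.
  - Behavior: can modify the incoming object according to custom logic. The modified object is then passed on to the remaining admission webhooks (including validating ones) and finally to storage.
  - Configuration: registered through a `MutatingWebhookConfiguration` object.
Workflow:
When a user (or a controller) sends the Kubernetes API server a request to create, update, or delete a resource (for example `kubectl apply -f my-cr.yaml`), the request passes through these stages:
1. Authentication: verify the identity of the requester.
2. Authorization: check whether the requester is permitted to perform the operation.
3. Mutating admission webhooks: if a mutating webhook is configured that matches the operation (resource type, operation type, and so on), the API server sends the request, including the object, to that webhook service. The webhook can change the object by returning a JSON Patch, or leave it unchanged. All matching mutating webhooks are called one after another.
4. Schema validation: the API server validates the structure and types of the object against the resource's schema (from the CRD or the built-in type definition).
5. Validating admission webhooks: if a matching validating webhook is configured, the API server sends the (possibly mutated) object to that webhook service. The webhook validates it and returns an allow or deny decision. If any validating webhook denies the request, the whole request fails.
6. Persistence: if every check passes, the object is written to etcd.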
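The ordering of the admission stages can be modeled as a small Go sketch. Everything here is hypothetical and heavily simplified: `object`, `admit`, and the three sample functions are this sketch's own names, authentication and authorization are left out, and the validators run sequentially, whereas the real API server may call validating webhooks in parallel.

```go
package main

import (
	"errors"
	"fmt"
)

// object is a hypothetical, flattened stand-in for an API object.
type object map[string]string

type mutator func(object)
type validator func(object) error

// admit mirrors the ordering described above: every mutator runs first (in
// order), then schema validation, then every validator. The first error
// rejects the request and nothing is persisted.
func admit(obj object, mutators []mutator, schema validator, validators []validator) error {
	for _, m := range mutators {
		m(obj)
	}
	if err := schema(obj); err != nil {
		return fmt.Errorf("schema validation: %w", err)
	}
	for _, v := range validators {
		if err := v(obj); err != nil {
			return fmt.Errorf("validating webhook: %w", err)
		}
	}
	return nil // the object would now be written to etcd
}

// defaultProtocol plays the role of a mutating webhook.
func defaultProtocol(o object) {
	if o["protocol"] == "" {
		o["protocol"] = "https"
	}
}

// requireURL plays the role of schema validation.
func requireURL(o object) error {
	if o["url"] == "" {
		return errors.New("url is required")
	}
	return nil
}

// protocolKnown plays the role of a validating webhook.
func protocolKnown(o object) error {
	if o["protocol"] != "http" && o["protocol"] != "https" {
		return errors.New("unsupported protocol")
	}
	return nil
}

func main() {
	obj := object{"url": "example.org"}
	err := admit(obj, []mutator{defaultProtocol}, requireURL, []validator{protocolKnown})
	fmt.Println(obj["protocol"], err)
}
```

Because mutation happens first, the validator sees the defaulted `protocol` value rather than the empty one the user submitted.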
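Key points:
- The webhook service must be an HTTPS endpoint. The API server needs to verify the identity of the webhook server.
- The webhook configuration (`MutatingWebhookConfiguration`, `ValidatingWebhookConfiguration`) specifies which operations (resources, API groups, versions, operation types) trigger which webhook service, how to connect to that service (usually through a Kubernetes Service), and how to verify its TLS certificate (through a CA bundle).
- Webhooks must respond quickly because they sit on the critical path of API requests. A timeout or failure can cause the API request to fail, depending on the `failurePolicy` setting.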
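3. Why Build a Custom Webhook?
Kubernetes ships with many powerful features, but in practice we often run into scenarios that call for finer-grained control or automation of specific logic:
- Policy enforcement: require every Deployment to carry a specific label (such as `owner`), or restrict Pods to images from approved registries.
- Defaulting: automatically set sensible default values for certain fields of a custom resource to simplify user configuration. For example, set a default storage size or backup policy on a `Database` CR.
- Automatic injection: automatically inject sidecar containers into Pods (such as a log-collection agent or a service-mesh proxy), or add specific environment variables or volume mounts.
- Complex validation: run checks that built-in schema validation cannot express. For example, verify that a field in a CR references another Kubernetes resource that actually exists, or check constraints that span several fields.
- Security hardening: block unsafe configurations, such as Pods that run as root or mount sensitive host paths.
Admission webhooks provide a flexible, powerful mechanism, natively integrated with the Kubernetes API, for implementing these requirements.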
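As a small illustration of the policy-enforcement case, the check below reports which required labels an object is missing. `missingLabels` is a hypothetical helper written for this sketch; in a real webhook its result would be turned into a deny response.

```go
package main

import "fmt"

// missingLabels returns the keys from required that are absent (or empty)
// in labels, in the order they were requested.
func missingLabels(labels map[string]string, required ...string) []string {
	var missing []string
	for _, key := range required {
		if labels[key] == "" {
			missing = append(missing, key)
		}
	}
	return missing
}

func main() {
	labels := map[string]string{"app": "shop"}
	if m := missingLabels(labels, "owner"); len(m) > 0 {
		fmt.Printf("deny: missing required labels %v\n", m)
	}
}
```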
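4. Prerequisites
Before writing any code, make sure the following environment and tools are ready:
- Kubernetes cluster: Minikube, Kind, the Kubernetes built into Docker Desktop, or any cloud provider's Kubernetes service. Make sure you have cluster-admin permissions.
- kubectl: the Kubernetes command-line tool, used to interact with the cluster.
- Go: we will write the webhook server in Go. Install a reasonably recent version (e.g., 1.18+).
- Docker: used to build the container image for the webhook server.
- (Optional) OpenSSL: used to generate TLS certificates by hand (for demonstration; cert-manager is recommended in production).
- (Optional) cert-manager: automates TLS certificate management inside the cluster and is the best practice for production.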
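5. Hands-On: Building a Simple Website CRD Webhook
Our goal is to create a CRD named `Website` and implement admission webhooks for it:
- CRD: defines a `Website` resource with a `spec.url` field.
- Mutating webhook: if a user creates a `Website` resource without specifying `spec.protocol` (http/https), automatically set it to `https`.
- Validating webhook: verify that the `spec.url` field is a validly formatted URL.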
Step 1: Define the Website CRD
Create a file named `website-crd.yaml`:
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: websites.example.com # CRD name, in the form <plural>.<group>
spec:
  group: example.com # API group
  names:
    kind: Website # Resource type (Kind)
    listKind: WebsiteList # List type
    plural: websites # Plural form, used in the URL (/apis/example.com/v1/websites)
    singular: website # Singular form
    shortNames: # (Optional) short names for kubectl
      - web
  scope: Namespaced # Scope: Namespaced or Cluster
  versions:
    - name: v1 # API version
      served: true # Whether this version is served by the API server
      storage: true # Whether this version is the storage version
      schema:
        openAPIV3Schema: # Structure defined with OpenAPI v3
          type: object
          properties:
            spec:
              type: object
              properties:
                url:
                  type: string
                  description: "The URL of the website."
                  # Basic URL format check. Single quotes keep YAML from
                  # interpreting the backslashes.
                  pattern: '^(https?://)?([\da-z.-]+)\.([a-z.]{2,6})([/\w .-]*)*/?$'
                protocol:
                  type: string
                  description: "The protocol (http or https). Defaults to https if mutated."
                  enum: ["http", "https"] # Only http or https is allowed
              required: # Required fields under spec
                - url
      # (Optional) extra printer columns for kubectl get output
      additionalPrinterColumns:
        - name: URL
          type: string
          description: The URL of the website
          jsonPath: .spec.url
        - name: Protocol
          type: string
          description: The protocol used
          jsonPath: .spec.protocol
        - name: Age
          type: date
          jsonPath: .metadata.creationTimestamp
```
Apply the CRD to the cluster:
```bash
kubectl apply -f website-crd.yaml
```
Confirm that the CRD has been created:
```bash
kubectl get crd websites.example.com
```
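Before relying on a schema `pattern`, it can help to try it against sample values locally. The sketch below does that with Go's `regexp` package, on the assumption that it behaves like the API server's pattern matching for this expression. `urlPattern` is this sketch's own copy of a basic URL pattern of the kind used for `spec.url`, and the sample URLs are arbitrary.

```go
package main

import (
	"fmt"
	"regexp"
)

// urlPattern is a basic URL check: optional http(s) scheme, a host with a
// 2-6 character final label, and an optional path.
var urlPattern = regexp.MustCompile(`^(https?://)?([\da-z.-]+)\.([a-z.]{2,6})([/\w .-]*)*/?$`)

func main() {
	for _, u := range []string{
		"https://my-valid-site.io/app",
		"example.org/some/path",
		"this is not a url",
	} {
		fmt.Printf("%-32q matches=%v\n", u, urlPattern.MatchString(u))
	}
}
```

Values that fail the pattern are rejected during schema validation, before any validating webhook is called.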
Step 2: Develop the Webhook Server (Go)
We will create a simple Go HTTP server to handle admission requests.
Project structure (example):
```
webhook-server/
├── go.mod
├── go.sum
├── main.go
├── pkg/
│   └── webhook/
│       └── handler.go
├── Dockerfile
└── manifests/
    ├── deployment.yaml
    ├── service.yaml
    ├── tls-secret.yaml # (if managing certificates manually)
    ├── mutating-webhook-config.yaml
    └── validating-webhook-config.yaml
```
`pkg/webhook/handler.go` (core logic):
```go
package webhook

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/url"
	"strings"

	admissionv1 "k8s.io/api/admission/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/serializer"
)

var (
	runtimeScheme = runtime.NewScheme()
	codecs        = serializer.NewCodecFactory(runtimeScheme)
	deserializer  = codecs.UniversalDeserializer()
)

// WebsiteSpec is a simplified representation for unmarshalling.
type WebsiteSpec struct {
	URL string `json:"url"`
	// An absent protocol unmarshals to "", which is how the mutating handler
	// detects that a default is needed; omitempty keeps an unset value out
	// of the marshalled JSON.
	Protocol string `json:"protocol,omitempty"`
}

// Website is a simplified representation for unmarshalling.
type Website struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              WebsiteSpec `json:"spec,omitempty"`
}

// admitFunc is a type for admission handler functions.
type admitFunc func(admissionv1.AdmissionReview) *admissionv1.AdmissionResponse

// ServeAdmitFunc handles the HTTP portion of admission webhook requests.
func ServeAdmitFunc(w http.ResponseWriter, r *http.Request, admit admitFunc) {
	var body []byte
	if r.Body != nil {
		if data, err := io.ReadAll(r.Body); err == nil {
			body = data
		}
	}
	if len(body) == 0 {
		log.Println("Error: empty body")
		http.Error(w, "empty body", http.StatusBadRequest)
		return
	}

	// Verify the content type is accurate
	contentType := r.Header.Get("Content-Type")
	if contentType != "application/json" {
		log.Printf("Error: invalid Content-Type %s, expected application/json\n", contentType)
		http.Error(w, "invalid Content-Type, expected application/json", http.StatusUnsupportedMediaType)
		return
	}

	// The AdmissionReview that was sent to the webhook
	requestedAdmissionReview := admissionv1.AdmissionReview{}
	responseAdmissionReview := admissionv1.AdmissionReview{}
	if _, _, err := deserializer.Decode(body, nil, &requestedAdmissionReview); err != nil {
		log.Printf("Error: Can't decode body: %v\n", err)
		responseAdmissionReview.Response = toAdmissionResponse(err)
	} else {
		// Pass to the admitFunc
		responseAdmissionReview.Response = admit(requestedAdmissionReview)
	}

	// Our review notes whether to allow the object based on policies defined by the admitFunc
	responseAdmissionReview.APIVersion = requestedAdmissionReview.APIVersion
	responseAdmissionReview.Kind = requestedAdmissionReview.Kind
	// Propagate the same UID back to the apiserver
	if requestedAdmissionReview.Request != nil {
		responseAdmissionReview.Response.UID = requestedAdmissionReview.Request.UID
	}

	respBytes, err := json.Marshal(responseAdmissionReview)
	if err != nil {
		log.Printf("Error: Could not encode response: %v\n", err)
		http.Error(w, fmt.Sprintf("could not encode response: %v", err), http.StatusInternalServerError)
		return
	}
	log.Println("Info: Writing response...")
	w.Header().Set("Content-Type", "application/json")
	if _, err := w.Write(respBytes); err != nil {
		// The response has already been started, so the error can only be logged.
		log.Printf("Error: Could not write response: %v\n", err)
	}
}

// ServeMutateWebsites is the handler for the mutating webhook.
func ServeMutateWebsites(w http.ResponseWriter, r *http.Request) {
	ServeAdmitFunc(w, r, mutateWebsites)
}

// ServeValidateWebsites is the handler for the validating webhook.
func ServeValidateWebsites(w http.ResponseWriter, r *http.Request) {
	ServeAdmitFunc(w, r, validateWebsites)
}

// mutateWebsites contains the mutating logic.
func mutateWebsites(ar admissionv1.AdmissionReview) *admissionv1.AdmissionResponse {
	log.Println("Info: Received mutation request")
	if ar.Request == nil {
		log.Println("Error: received nil request in AdmissionReview")
		return toAdmissionResponse(fmt.Errorf("received nil request in AdmissionReview"))
	}
	req := ar.Request
	var website Website

	// Only handle Create and Update operations
	if req.Operation != admissionv1.Create && req.Operation != admissionv1.Update {
		log.Printf("Info: Skipping mutation for operation %s\n", req.Operation)
		return &admissionv1.AdmissionResponse{Allowed: true}
	}

	err := json.Unmarshal(req.Object.Raw, &website)
	if err != nil {
		log.Printf("Error: Could not unmarshal raw object: %v %s\n", err, string(req.Object.Raw))
		return toAdmissionResponse(err)
	}
	log.Printf("Info: Mutating Website %s/%s. Original Spec: %+v\n", website.Namespace, website.Name, website.Spec)

	// --- Mutation Logic ---
	needsPatch := false
	if website.Spec.Protocol == "" {
		website.Spec.Protocol = "https"
		needsPatch = true
		log.Printf("Info: Defaulting protocol to 'https' for %s/%s\n", website.Namespace, website.Name)
	}
	// --- End Mutation Logic ---

	if !needsPatch {
		log.Println("Info: No mutation needed.")
		return &admissionv1.AdmissionResponse{Allowed: true}
	}

	// Create the JSON patch
	patchBytes, err := createPatch(&website, req.Object.Raw)
	if err != nil {
		log.Printf("Error: Could not create patch: %v\n", err)
		return toAdmissionResponse(err)
	}
	log.Printf("Info: Generated Patch: %s\n", string(patchBytes))

	return &admissionv1.AdmissionResponse{
		Allowed: true,
		Patch:   patchBytes,
		PatchType: func() *admissionv1.PatchType { // Pointer to PatchType
			pt := admissionv1.PatchTypeJSONPatch
			return &pt
		}(),
	}
}

// deny builds a rejection response with a 422 status and the given message.
func deny(message string) *admissionv1.AdmissionResponse {
	return &admissionv1.AdmissionResponse{
		Allowed: false,
		Result: &metav1.Status{
			Message: message,
			Reason:  metav1.StatusReasonInvalid,
			Code:    http.StatusUnprocessableEntity, // 422
		},
	}
}

// validateWebsites contains the validating logic.
func validateWebsites(ar admissionv1.AdmissionReview) *admissionv1.AdmissionResponse {
	log.Println("Info: Received validation request")
	if ar.Request == nil {
		log.Println("Error: received nil request in AdmissionReview")
		return toAdmissionResponse(fmt.Errorf("received nil request in AdmissionReview"))
	}
	req := ar.Request
	var website Website

	// Only handle Create and Update operations
	if req.Operation != admissionv1.Create && req.Operation != admissionv1.Update {
		log.Printf("Info: Skipping validation for operation %s\n", req.Operation)
		return &admissionv1.AdmissionResponse{Allowed: true}
	}

	err := json.Unmarshal(req.Object.Raw, &website)
	if err != nil {
		log.Printf("Error: Could not unmarshal raw object: %v %s\n", err, string(req.Object.Raw))
		return toAdmissionResponse(err)
	}
	log.Printf("Info: Validating Website %s/%s. Spec: %+v\n", website.Namespace, website.Name, website.Spec)

	// --- Validation Logic ---
	if website.Spec.URL == "" {
		log.Println("Error: URL is required.")
		return deny("spec.url is a required field")
	}

	// More robust URL validation. parsedURL is nil when parsing fails.
	parsedURL, err := url.ParseRequestURI(website.Spec.URL)
	if err != nil || parsedURL.Scheme == "" || parsedURL.Host == "" {
		invalid := deny(fmt.Sprintf("spec.url '%s' is not a valid URL format (e.g., http://example.com)", website.Spec.URL))
		if strings.HasPrefix(website.Spec.URL, "http://") || strings.HasPrefix(website.Spec.URL, "https://") {
			log.Printf("Error: Invalid URL format for %s: %v\n", website.Spec.URL, err)
			return invalid
		}
		// The scheme is missing: retry with a temporary https:// prefix.
		parsedURLWithScheme, err2 := url.ParseRequestURI("https://" + website.Spec.URL)
		if err2 != nil || parsedURLWithScheme.Host == "" {
			log.Printf("Error: Invalid URL format for %s: %v / %v\n", website.Spec.URL, err, err2)
			return invalid
		}
	}

	// Check if protocol matches the one in the URL (if both exist).
	// parsedURL is nil for scheme-less URLs that only parsed with the
	// temporary prefix, so guard against dereferencing it.
	if parsedURL != nil && website.Spec.Protocol != "" && parsedURL.Scheme != "" && website.Spec.Protocol != parsedURL.Scheme {
		log.Printf("Error: Protocol mismatch. spec.protocol is '%s' but URL scheme is '%s'\n", website.Spec.Protocol, parsedURL.Scheme)
		return deny(fmt.Sprintf("Protocol mismatch: spec.protocol is '%s' but the URL scheme is '%s'", website.Spec.Protocol, parsedURL.Scheme))
	}
	log.Println("Info: Validation successful.")
	// --- End Validation Logic ---

	return &admissionv1.AdmissionResponse{Allowed: true}
}

// toAdmissionResponse creates a basic AdmissionResponse with error status.
// Allowed is left at its zero value (false), so the request is rejected.
func toAdmissionResponse(err error) *admissionv1.AdmissionResponse {
	return &admissionv1.AdmissionResponse{
		Result: &metav1.Status{
			Message: err.Error(),
			Code:    http.StatusInternalServerError, // Or a more specific code if applicable
		},
	}
}

// createPatch generates a JSON patch from the original and modified objects.
// Note: This is a simplified patch creation that sets the whole spec.
func createPatch(modifiedObj interface{}, originalRaw []byte) ([]byte, error) {
	modifiedBytes, err := json.Marshal(modifiedObj)
	if err != nil {
		return nil, fmt.Errorf("could not marshal modified object: %w", err)
	}

	// For this simple case, we know only 'protocol' might be added/changed in 'spec'.
	var originalMap map[string]interface{}
	var modifiedMap map[string]interface{}
	if err := json.Unmarshal(originalRaw, &originalMap); err != nil {
		return nil, fmt.Errorf("could not unmarshal original object: %w", err)
	}
	if err := json.Unmarshal(modifiedBytes, &modifiedMap); err != nil {
		return nil, fmt.Errorf("could not unmarshal modified object: %w", err)
	}

	// Assume only 'spec' might change for mutation in this example.
	// "add" on an object member creates it or replaces an existing value, so
	// it also works when the original object has no spec at all.
	patch := []map[string]interface{}{
		{
			"op":    "add",
			"path":  "/spec",
			"value": modifiedMap["spec"],
		},
	}

	// A more general implementation could diff the two documents with a JSON
	// Patch library (for example gomodules.xyz/jsonpatch/v2, whose
	// CreatePatch(originalRaw, modifiedBytes) returns RFC 6902 operations).
	// A merge patch (RFC 7386), such as the output of CreateMergePatch in
	// github.com/evanphx/json-patch, is not suitable here: admission
	// responses only support the JSONPatch patch type.
	patchBytes, err := json.Marshal(patch)
	if err != nil {
		return nil, fmt.Errorf("could not marshal patch: %w", err)
	}
	return patchBytes, nil
}
```
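Replacing the whole `spec` is coarse. A finer-grained alternative is to emit a single RFC 6902 operation that targets only the field being defaulted. The sketch below shows one way to do this with the standard library; `patchOp` and `protocolPatch` are hypothetical names introduced here, not part of the handler above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// patchOp is one RFC 6902 JSON Patch operation.
type patchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value,omitempty"`
}

// protocolPatch returns a JSON Patch that sets spec.protocol to def when the
// object has no protocol, or nil when nothing needs to change.
func protocolPatch(rawObject []byte, def string) ([]byte, error) {
	var obj struct {
		Spec *struct {
			Protocol string `json:"protocol"`
		} `json:"spec"`
	}
	if err := json.Unmarshal(rawObject, &obj); err != nil {
		return nil, fmt.Errorf("could not unmarshal object: %w", err)
	}
	var ops []patchOp
	switch {
	case obj.Spec == nil:
		// "add" at /spec/protocol needs an existing parent, so create spec.
		ops = append(ops, patchOp{Op: "add", Path: "/spec", Value: map[string]string{"protocol": def}})
	case obj.Spec.Protocol == "":
		// "add" creates the member, or replaces it if it already exists.
		ops = append(ops, patchOp{Op: "add", Path: "/spec/protocol", Value: def})
	default:
		return nil, nil // already set: no patch needed
	}
	return json.Marshal(ops)
}

func main() {
	patch, err := protocolPatch([]byte(`{"spec":{"url":"example.org"}}`), "https")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))
}
```

Returning `nil` when the field is already set keeps the mutation idempotent: calling the webhook again on its own output produces no further patch.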
`main.go` (HTTP server setup):
```go
package main

import (
	"crypto/tls"
	"flag"
	"fmt"
	"log"
	"net/http"

	"webhook-server/pkg/webhook" // Adjust import path if needed
)

func main() {
	var certFile, keyFile string
	var port int

	// Command-line flags for certificate and port
	flag.StringVar(&certFile, "tls-cert-file", "/etc/webhook/certs/tls.crt", "File containing the x509 Certificate for HTTPS.")
	flag.StringVar(&keyFile, "tls-key-file", "/etc/webhook/certs/tls.key", "File containing the x509 private key matching --tls-cert-file.")
	flag.IntVar(&port, "port", 8443, "Secure port that the webhook server listens on.")
	flag.Parse()

	log.Printf("Info: Starting Webhook Server on port %d...\n", port)
	log.Printf("Info: Using Certificate File: %s\n", certFile)
	log.Printf("Info: Using Key File: %s\n", keyFile)

	// Load TLS certificates
	certs, err := tls.LoadX509KeyPair(certFile, keyFile)
	if err != nil {
		log.Fatalf("Error: Failed to load key pair: %v\n", err)
	}

	// Define HTTP routes
	mux := http.NewServeMux()
	mux.HandleFunc("/mutate", webhook.ServeMutateWebsites)
	mux.HandleFunc("/validate", webhook.ServeValidateWebsites)
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) { // Health check endpoint
		w.WriteHeader(http.StatusOK)
		w.Write([]byte("ok"))
	})

	// Configure HTTPS server
	server := &http.Server{
		Addr:    fmt.Sprintf(":%d", port),
		Handler: mux,
		TLSConfig: &tls.Config{
			Certificates: []tls.Certificate{certs},
			// Optionally configure client CA verification if needed
		},
	}

	// Start HTTPS server
	log.Fatal(server.ListenAndServeTLS("", "")) // Cert/key files already loaded in TLSConfig
}
```
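The routing can be exercised over TLS without real certificate files by using `net/http/httptest`. The following is a rough sketch under that assumption: `newMux` and `probe` are hypothetical helpers that wire up only a `/healthz` route equivalent to the one in `main.go`, and the server uses the throwaway certificate that `httptest` generates.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newMux builds a mux with a /healthz route like the one in main.go.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		w.Write([]byte("ok"))
	})
	return mux
}

// probe starts a temporary TLS server around the mux and fetches /healthz.
func probe() (int, string, error) {
	srv := httptest.NewTLSServer(newMux())
	defer srv.Close()

	// srv.Client() is preconfigured to trust the test server's certificate.
	resp, err := srv.Client().Get(srv.URL + "/healthz")
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body), err
}

func main() {
	code, body, err := probe()
	fmt.Println(code, body, err)
}
```

Building the mux in its own function, rather than inline in `main`, is what makes this kind of check possible.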
`go.mod` (initialize the module and add dependencies):
```bash
cd webhook-server
go mod init webhook-server # Or your module name
go get k8s.io/api k8s.io/apimachinery k8s.io/client-go # Add necessary K8s modules
# Optional: add a library that generates RFC 6902 JSON Patches if you want a more robust patch method:
# go get gomodules.xyz/jsonpatch/v2
```
Step 3: Containerize the Webhook Server
Create a `Dockerfile`:
```dockerfile
# Use an official Go image as the build stage
FROM golang:1.21-alpine AS builder

# Set the working directory
WORKDIR /app

# Copy the Go module files and download dependencies
COPY go.mod go.sum ./
RUN go mod download

# Copy the source code
COPY . .

# Build the Go application
# CGO_ENABLED=0 for static linking, GOOS=linux for cross-compilation if needed
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o webhook-server .

# Use a minimal base image for the final container
FROM alpine:latest

# Install CA certificates needed for TLS
RUN apk --no-cache add ca-certificates

# Copy the built binary from the builder stage
COPY --from=builder /app/webhook-server /usr/local/bin/webhook-server

# Expose the port the server listens on (defined in main.go)
EXPOSE 8443

# Define the entry point for the container
# The actual cert/key paths will be mounted via Kubernetes Secrets
ENTRYPOINT ["webhook-server", "--tls-cert-file=/etc/webhook/certs/tls.crt", "--tls-key-file=/etc/webhook/certs/tls.key", "--port=8443"]
```
Build and push the image to a registry accessible by your Kubernetes cluster (replace `your-dockerhub-username`):
```bash
docker build -t your-dockerhub-username/simple-webhook:v1 .
docker push your-dockerhub-username/simple-webhook:v1
```
Step 4: Deploy the Webhook Server to Kubernetes
`manifests/deployment.yaml`:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-webhook-deployment
  labels:
    app: simple-webhook
spec:
  replicas: 1
  selector:
    matchLabels:
      app: simple-webhook
  template:
    metadata:
      labels:
        app: simple-webhook
    spec:
      containers:
        - name: webhook-server
          # Replace with your image name
          image: your-dockerhub-username/simple-webhook:v1
          ports:
            - containerPort: 8443
              name: webhook-api
          volumeMounts:
            - name: webhook-certs # Mount the secret containing TLS certs
              mountPath: /etc/webhook/certs
              readOnly: true
          readinessProbe: # Add a readiness probe
            httpGet:
              path: /healthz
              port: webhook-api
              scheme: HTTPS # Important: Probe the HTTPS endpoint
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe: # Add a liveness probe
            httpGet:
              path: /healthz
              port: webhook-api
              scheme: HTTPS
            initialDelaySeconds: 15
            periodSeconds: 20
      volumes:
        - name: webhook-certs
          secret:
            secretName: simple-webhook-tls # Name of the Secret we will create
```
`manifests/service.yaml`:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: simple-webhook-service # This name is used in WebhookConfiguration
  labels:
    app: simple-webhook
spec:
  selector:
    app: simple-webhook
  ports:
    - port: 443 # Service port (standard HTTPS port)
      targetPort: webhook-api # Target the container port name (or number 8443)
      protocol: TCP
```
Step 5: Configure TLS Certificates
The API server must communicate with webhooks over HTTPS, so we need TLS certificates.
Method 1: Manual Generation (using OpenSSL, for demonstration)
Warning: Managing certificates manually is complex and error-prone. Use cert-manager in production.
1. Generate CA Key and Cert:
```bash
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=SimpleWebhookCA" -days 3650 -out ca.crt
```
2. Generate Server Key:
```bash
openssl genrsa -out webhook-server-tls.key 2048
```
3. Create Server Certificate Signing Request (CSR):
   - Important: The Common Name (CN) should match the Kubernetes Service name and namespace: `<service-name>.<namespace>.svc`. Assuming deployment in the `default` namespace:
```bash
openssl req -new -key webhook-server-tls.key \
  -subj "/CN=simple-webhook-service.default.svc" \
  -out webhook-server-tls.csr \
  -config <(printf "[req]\ndistinguished_name=req_distinguished_name\nreq_extensions=v3_req\n[req_distinguished_name]\n[v3_req]\nsubjectAltName=@alt_names\n[alt_names]\nDNS.1=simple-webhook-service\nDNS.2=simple-webhook-service.default\nDNS.3=simple-webhook-service.default.svc")
```
   - The `subjectAltName` (SAN) is critical for hostname verification: the API server connects to `<service-name>.<namespace>.svc` and checks that name against the SAN entries, not the CN. Include the service name, service name + namespace, and the fully qualified service DNS name.
4. Sign the Server Certificate using the CA:
```bash
openssl x509 -req -in webhook-server-tls.csr \
  -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out webhook-server-tls.crt \
  -days 365 \
  -extensions v3_req \
  -extfile <(printf "[v3_req]\nsubjectAltName=@alt_names\n[alt_names]\nDNS.1=simple-webhook-service\nDNS.2=simple-webhook-service.default\nDNS.3=simple-webhook-service.default.svc")
```
5. Create Kubernetes Secret:
```bash
kubectl create secret tls simple-webhook-tls \
  --cert=webhook-server-tls.crt \
  --key=webhook-server-tls.key \
  --dry-run=client -o yaml > manifests/tls-secret.yaml
```
   Review `manifests/tls-secret.yaml` and then apply it:
```bash
kubectl apply -f manifests/tls-secret.yaml
```
6. Get CA Bundle: We need the CA certificate (base64 encoded) for the WebhookConfiguration.
```bash
cat ca.crt | base64 | tr -d '\n'
```
   Copy this base64 encoded string. You'll need it in the next step.
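To see why the SAN entries matter, the sketch below builds a throwaway CA and server certificate in memory with Go's `crypto/x509` and verifies the server certificate against two DNS names. It is an illustration only, using ECDSA keys for brevity, and `newCert`, `verifyFor`, and `demoCerts` are hypothetical helpers; the OpenSSL commands above remain the way the actual files are produced.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newCert signs tmpl with parentKey and returns the parsed certificate.
// For a self-signed CA, pass tmpl as its own parent.
func newCert(tmpl, parent *x509.Certificate, pub *ecdsa.PublicKey, parentKey *ecdsa.PrivateKey) (*x509.Certificate, error) {
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, pub, parentKey)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// verifyFor reports whether leaf chains to ca and is valid for dnsName.
func verifyFor(leaf, ca *x509.Certificate, dnsName string) error {
	roots := x509.NewCertPool()
	roots.AddCert(ca)
	_, err := leaf.Verify(x509.VerifyOptions{Roots: roots, DNSName: dnsName})
	return err
}

// demoCerts creates an in-memory CA and a server certificate whose SAN list
// holds the given DNS names (at least one name is required).
func demoCerts(sans ...string) (ca, leaf *x509.Certificate, err error) {
	caKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	leafKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	now := time.Now()
	caTmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "SimpleWebhookCA"},
		NotBefore:             now.Add(-time.Hour),
		NotAfter:              now.Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	if ca, err = newCert(caTmpl, caTmpl, &caKey.PublicKey, caKey); err != nil {
		return nil, nil, err
	}
	leafTmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: sans[0]},
		NotBefore:    now.Add(-time.Hour),
		NotAfter:     now.Add(24 * time.Hour),
		DNSNames:     sans, // the subjectAltName entries
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	leaf, err = newCert(leafTmpl, ca, &leafKey.PublicKey, caKey)
	return ca, leaf, err
}

func main() {
	ca, leaf, err := demoCerts("simple-webhook-service.default.svc")
	if err != nil {
		panic(err)
	}
	fmt.Println("name in SAN:    ", verifyFor(leaf, ca, "simple-webhook-service.default.svc"))
	fmt.Println("name not in SAN:", verifyFor(leaf, ca, "other-service.default.svc"))
}
```

Verification succeeds only for a name listed in the SAN, which is why the certificate must include the exact service DNS name the API server dials.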
Method 2: Using cert-manager (Recommended)
- Install cert-manager in your cluster (follow official instructions: https://cert-manager.io/docs/installation/).
- Create an Issuer or ClusterIssuer. For an in-cluster Service name such as `simple-webhook-service.default.svc`, a self-signed or private CA issuer is the usual choice, because public CAs such as Let's Encrypt only issue certificates for publicly resolvable domain names.
- Create a `Certificate` resource that references the Issuer and specifies the secret name (`simple-webhook-tls`) and the DNS names (`simple-webhook-service.default.svc`, etc.). cert-manager will automatically generate the certificate and store it in the secret.
- cert-manager can also automatically inject the `caBundle` into the WebhookConfiguration using annotations (see cert-manager documentation).
Step 6: Register the Admission Webhooks
Now, tell the API server about your webhook.
`manifests/mutating-webhook-config.yaml`:
```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: simple-webhook-mutating-cfg
  # Optional: Add annotation for cert-manager CA injection
  # (the value is <namespace>/<name> of the cert-manager Certificate resource)
  # annotations:
  #   cert-manager.io/inject-ca-from: "<namespace>/simple-webhook-tls" # Adjust namespace if needed
webhooks:
  - name: mutate.websites.example.com # Unique name
    clientConfig:
      # If using cert-manager injection, remove caBundle field
      caBundle: "PASTE_YOUR_BASE64_ENCODED_CA_CERT_HERE" # Replace with output from Step 5.6 or let cert-manager inject it
      service:
        name: simple-webhook-service # Matches the Service name
        namespace: default # Matches the Service namespace
        path: "/mutate" # Path defined in main.go
        port: 443 # Port defined in the Service
    rules:
      - operations: ["CREATE", "UPDATE"] # Operations to intercept
        apiGroups: ["example.com"] # API Group of our CRD
        apiVersions: ["v1"] # API Version of our CRD
        resources: ["websites"] # Resource name (plural)
    admissionReviewVersions: ["v1"] # Supported AdmissionReview versions
    sideEffects: None # Indicates the webhook has no side effects on other resources
    failurePolicy: Fail # If webhook fails, the request fails (can be Ignore)
    timeoutSeconds: 5 # How long API server waits for webhook response
```
`manifests/validating-webhook-config.yaml`:
```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: simple-webhook-validating-cfg
  # Optional: Add annotation for cert-manager CA injection
  # (the value is <namespace>/<name> of the cert-manager Certificate resource)
  # annotations:
  #   cert-manager.io/inject-ca-from: "<namespace>/simple-webhook-tls" # Adjust namespace if needed
webhooks:
  - name: validate.websites.example.com # Unique name
    clientConfig:
      # If using cert-manager injection, remove caBundle field
      caBundle: "PASTE_YOUR_BASE64_ENCODED_CA_CERT_HERE" # Replace with the SAME CA bundle
      service:
        name: simple-webhook-service
        namespace: default
        path: "/validate" # Path for validation handler
        port: 443
    rules:
      - operations: ["CREATE", "UPDATE"]
        apiGroups: ["example.com"]
        apiVersions: ["v1"]
        resources: ["websites"]
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Fail
    timeoutSeconds: 5
```
Important: Replace `"PASTE_YOUR_BASE64_ENCODED_CA_CERT_HERE"` in both files with the actual base64 encoded CA certificate obtained in Step 5.6 (unless using cert-manager injection).
Apply the configurations:
```bash
kubectl apply -f manifests/deployment.yaml
kubectl apply -f manifests/service.yaml
# Apply TLS secret if not done yet: kubectl apply -f manifests/tls-secret.yaml
kubectl apply -f manifests/mutating-webhook-config.yaml
kubectl apply -f manifests/validating-webhook-config.yaml
```
Wait for the deployment rollout to complete:
```bash
kubectl rollout status deployment/simple-webhook-deployment
```
Check the webhook server logs:
```bash
kubectl logs -l app=simple-webhook -f
```
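How the `rules` block selects requests can be approximated in a few lines of Go. This is a simplified model, not the API server's implementation: `rule`, `contains`, and `matches` are hypothetical names, and the sketch ignores subresources, `scope`, `matchPolicy`, and the namespace and object selectors.

```go
package main

import "fmt"

// rule mirrors the four list fields of one entry under `rules`.
type rule struct {
	Operations  []string
	APIGroups   []string
	APIVersions []string
	Resources   []string
}

// contains reports whether list holds want or the wildcard "*".
func contains(list []string, want string) bool {
	for _, v := range list {
		if v == "*" || v == want {
			return true
		}
	}
	return false
}

// matches reports whether a request with the given attributes would be sent
// to the webhook: every one of the four fields has to match.
func (r rule) matches(op, group, version, resource string) bool {
	return contains(r.Operations, op) &&
		contains(r.APIGroups, group) &&
		contains(r.APIVersions, version) &&
		contains(r.Resources, resource)
}

func main() {
	r := rule{
		Operations:  []string{"CREATE", "UPDATE"},
		APIGroups:   []string{"example.com"},
		APIVersions: []string{"v1"},
		Resources:   []string{"websites"},
	}
	fmt.Println(r.matches("CREATE", "example.com", "v1", "websites")) // intercepted
	fmt.Println(r.matches("DELETE", "example.com", "v1", "websites")) // not intercepted
}
```

With the rules used in this article, deleting a `Website` never reaches the webhook, which is consistent with the handlers only acting on create and update operations.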
Step 7: Test the Webhook
- Test Mutation (Missing Protocol):
  Create `test-website-no-protocol.yaml`:
```yaml
apiVersion: example.com/v1
kind: Website
metadata:
  name: test-no-protocol
spec:
  url: "http://example.com/page1" # Note: Using http here, no protocol field
```
  Apply it:
```bash
kubectl apply -f test-website-no-protocol.yaml
```
  The mutating webhook defaults `spec.protocol` to `https`, but the URL's scheme is `http`. The validating webhook runs after mutation and checks that `spec.protocol` and the URL scheme agree, so this request is rejected with a protocol mismatch error.
  To see the mutation succeed, use a URL that doesn't specify a scheme. Create `test-website-no-scheme.yaml`:
```yaml
apiVersion: example.com/v1
kind: Website
metadata:
  name: test-no-scheme
spec:
  url: "example.org/some/path" # No http:// or https://
```
  Apply it:
```bash
kubectl apply -f test-website-no-scheme.yaml
```
  Check the created resource:
```bash
kubectl get website test-no-scheme -o yaml
```
  The output should show `spec.protocol: https`, added by the mutating webhook.
- Test Validation (Invalid URL):
  Create `test-website-invalid.yaml`:
```yaml
apiVersion: example.com/v1
kind: Website
metadata:
  name: test-invalid
spec:
  url: "this is not a url"
```
  Apply it:
```bash
kubectl apply -f test-website-invalid.yaml
```
  The command should fail. Note that the CRD's schema `pattern` is checked before validating webhooks run, so while the pattern from Step 1 is in place the API server rejects this value during schema validation and the webhook is not called. If you remove the `pattern` from the CRD, the rejection comes from the validating webhook instead, with an error similar to:
```
Error from server (spec.url 'this is not a url' is not a valid URL format (e.g., http://example.com)): error when creating "test-website-invalid.yaml": admission webhook "validate.websites.example.com" denied the request: spec.url 'this is not a url' is not a valid URL format (e.g., http://example.com)
```
- Test Validation (Protocol Mismatch):
  Create `test-website-mismatch.yaml`:
```yaml
apiVersion: example.com/v1
kind: Website
metadata:
  name: test-mismatch
spec:
  url: "http://example.net"
  protocol: "https" # Explicitly set mismatching protocol
```
  Apply it:
```bash
kubectl apply -f test-website-mismatch.yaml
```
  This should also fail due to the validation check for protocol consistency.
- Test Valid Case:
  Create `test-website-valid.yaml`:
```yaml
apiVersion: example.com/v1
kind: Website
metadata:
  name: test-valid
spec:
  url: "https://my-valid-site.io/app"
  protocol: "https"
```
  Apply it:
```bash
kubectl apply -f test-website-valid.yaml
```
  This should succeed. `kubectl get website test-valid` should show the resource as created.
6. Advanced Considerations & Best Practices
- Error Handling & Logging: Implement robust error handling and detailed logging in your webhook server. This is crucial for debugging issues, as problems can be hard to trace. Log the incoming `AdmissionReview` and the outgoing `AdmissionResponse`.
- Idempotency (Mutating Webhooks): Ensure your mutation logic is idempotent. If the webhook is called multiple times for the same object state (e.g., during an update where your target field already has the desired value), it shouldn't make unnecessary changes or enter a conflicting state.
- Performance: Webhooks are synchronous and block API requests. Keep your webhook logic fast and efficient. Set reasonable timeouts (`timeoutSeconds` in the configuration). Avoid complex computations or external calls that might block for long periods.
- Security:
  - TLS: Always use TLS. Use cert-manager for production certificate management.
  - RBAC: Grant the ServiceAccount running your webhook Pod only the necessary permissions. It typically doesn't need broad cluster access unless its logic requires inspecting other resources.
  - Network Policies: Restrict network access to your webhook service only from the API server.
- Availability: Run multiple replicas of your webhook deployment for high availability. Ensure your `failurePolicy` is set appropriately (`Fail` or `Ignore`) based on whether the webhook is critical.
- Testing:
  - Unit Tests: Test your core mutation and validation logic functions in isolation.
  - Integration Tests: Test the full HTTP request/response flow using sample `AdmissionReview` JSON payloads.
  - End-to-End Tests: Deploy the webhook to a test cluster and use `kubectl` to apply test manifests, verifying the outcomes.
- Observability: Expose metrics (e.g., request latency, error rates) from your webhook server using Prometheus client libraries. Implement distributed tracing if needed.
- Webhook Configuration Selectors: Use `namespaceSelector` or `objectSelector` in your `MutatingWebhookConfiguration` or `ValidatingWebhookConfiguration` to scope the webhook to specific namespaces or objects matching certain labels, reducing the load and blast radius.
- API Review Versions: Ensure `admissionReviewVersions` includes `v1`. Older versions (`v1beta1`) are deprecated.
- Frameworks: For more complex controllers and webhooks, consider using frameworks like Kubebuilder or Operator SDK. They provide scaffolding, code generation, and helpers that significantly simplify development, testing, and lifecycle management, including boilerplate for webhooks and certificate management integration.
7. Conclusion
Kubernetes Admission Webhooks provide a powerful mechanism to extend and customize the behavior of the Kubernetes API server. By intercepting API requests before they are persisted, we can enforce custom policies, automate resource modifications, and integrate bespoke logic directly into the Kubernetes control plane.
This article guided you through the process of building a simple custom controller component using Admission Webhooks from scratch. We defined a CRD, developed a Go-based webhook server handling both mutation and validation, configured TLS, deployed it to Kubernetes, and registered it with the API server. While this example was basic, it illustrates the fundamental principles and workflow involved.
Mastering Webhooks unlocks a new level of control and automation within Kubernetes, enabling developers and platform engineers to tailor the cluster’s behavior precisely to their application and organizational needs. While the “from scratch” approach is instructive, remember to leverage tools like cert-manager and frameworks like Kubebuilder for more robust and efficient development in real-world scenarios.