Course Outline

Kafka Administration Essentials

  • Where Kafka fits in a modern data platform and typical production responsibilities
  • Core concepts for operators: brokers, topics, partitions, offsets, consumer groups
  • Replication fundamentals: leaders and followers, in-sync replicas, availability trade-offs
  • Kafka operational highlights and common terminology used in runbooks

KRaft Mode and Cluster Design

  • KRaft basics: controllers, metadata quorum, elections, and why it matters operationally
  • Deployment planning: sizing for throughput, partitions, retention, and growth
  • Node roles and layouts: combined vs dedicated controllers, fault domain considerations
  • Lab: inspect KRaft metadata, validate quorum health, and interpret controller logs

Installation, Configuration, and Day-to-Day Operations

  • Installation approaches (packages, tarball, containers) and what to standardize in enterprise environments
  • Core broker configuration that impacts reliability: listeners, replication, log directories, retention
  • Safe service operations: startup order, graceful shutdown, and validation checks
  • Lab: deploy a multi-node cluster, verify broker registration, and confirm baseline produce and consume

Managing Topics, Partitions, and Data Placement

  • Topic lifecycle using the Kafka CLI: create, describe, update configs, delete
  • Choosing partitions and replication factors for real workloads, including common anti-patterns
  • Reassignments and balancing: when to move partitions and how to verify progress safely
  • Lab: create topics, trigger a partition reassignment, simulate a broker outage, and confirm recovery

Securing Kafka for Production

  • TLS for client and inter-broker traffic: certificates, trust chains, and validation steps
  • Authentication with SASL: selecting common mechanisms and avoiding misconfiguration
  • Authorization with ACLs: least-privilege patterns for admins, producers, and consumers
  • Lab: enable TLS and SASL, validate client connectivity, and apply ACLs for application roles

Observability, Reliability, and Troubleshooting

  • Monitoring essentials: controller health, under-replicated partitions, request latency, disk and network saturation
  • Logs and metrics: reading broker logs and exposing metrics via JMX exporter to common observability stacks
  • Operational playbooks: rolling restarts, safe config changes, handling disk-full and ISR issues
  • Lab: build a minimal alert set, diagnose a degraded cluster, and restore healthy replication

Upgrades and Disaster Recovery Readiness

  • Upgrade planning for Kafka: compatibility checks, staging, and rollback approach
  • Backups and recovery expectations: what can be backed up, what cannot, and configuration recovery basics
  • Cross-cluster replication overview and when to use MirrorMaker 2 for DR and migrations
  • Wrap-up: operational checklist, handover artifacts, and next steps for production rollout

Prerequisites

  • An understanding of basic Linux administration (users, services, files, permissions)
  • Experience with TCP/IP networking concepts (DNS, ports, firewalls, load balancers)
  • Basic scripting experience (Bash, PowerShell, or similar) for routine operational tasks

Audience

  • Kafka administrators and platform engineers responsible for operating Kafka clusters
  • Site reliability engineers and DevOps engineers supporting streaming platforms
  • Infrastructure and operations teams deploying new KRaft-based Kafka clusters or migrating from ZooKeeper

Duration: 21 hours
