{"product_id":"engineering-ai-agents-for-production-infrastructure-observability-security-and-production-case-studies-9798258511195","title":"Engineering AI Agents for Production: Infrastructure, Observability, Security, and Production Case Studies","description":"\u003cp\u003e • Author(s): Haythem Aber\u003cbr\u003e • Publisher: Independently Published\u003cbr\u003e • Publisher Imprint: Independently Published\u003cbr\u003e • BISAC: Artificial Intelligence - Generative AI\u003c\/p\u003e\u003cp\u003eVolume 1 specified the architecture. Volume 2 builds the system.\u003c\/p\u003e\u003cp\u003eEvery pattern, protocol contract, and orchestration design in Volume 1 was a specification. \u003cb\u003eEngineering AI Agents for Production\u003c\/b\u003e implements them in production-grade Java, at distributed scale, under real operational constraints.\u003c\/p\u003e\u003cp\u003eThis is not a theory extension. It is a construction manual for engineers who are shipping autonomous AI systems into production: building the runtime from scratch, deploying across distributed infrastructure, measuring quality with engineering rigor, hardening against novel threats, and operating systems whose failure modes are non-deterministic and sometimes irreversible.\u003c\/p\u003eWhat this book builds - 828 pages, 31 chapters, 6 parts: \u003cul\u003e\n\u003cli\u003e\n\u003cb\u003ePart I - A Production Java Agent Framework, from Scratch: \u003c\/b\u003e Design principles before code (testability and observability as first-class constraints), Java 25 concurrency models (structured concurrency and Virtual Threads), the AgentRuntime as a deterministic state machine, the AgentExecutorService, and the full protocol adapter stack for MCP and A2A.\u003c\/li\u003e\n\u003cli\u003e\n\u003cb\u003ePart II - Infrastructure for Distributed Agent Systems: \u003c\/b\u003e The agent control plane (service discovery and health checking), agent sandboxing with Docker, gVisor, and eBPF, WebAssembly (WASI 2.0) for 
edge agents, event-driven architectures with Kafka and NATS, and durable execution with Temporal and AWS Step Functions.\u003c\/li\u003e\n\u003cli\u003e\n\u003cb\u003ePart III - Observability, Evaluation, and Testing: \u003c\/b\u003e The three observability pillars (distributed tracing, structured logging, metrics), evaluation as an engineering discipline (trajectory quality scoring, LLM-as-judge pipelines), chaos engineering for non-deterministic systems, and deterministic replay and time-travel debugging.\u003c\/li\u003e\n\u003cli\u003e\n\u003cb\u003ePart IV - Security and Governance: \u003c\/b\u003e The complete agent threat landscape (prompt injection, tool poisoning, denial-of-wallet), secure-by-design principles (least privilege, secure bootstrapping), enterprise governance (RBAC, audit logging, Policy-as-Code with OPA), and runtime circuit breakers and kill switches.\u003c\/li\u003e\n\u003cli\u003e\n\u003cb\u003ePart V - Eight Production Architecture Case Studies: \u003c\/b\u003e Detailed breakdowns including architecture decisions and production failure modes. 
Covers: Enterprise Copilot (M365\/Google integration), Autonomous Research Agent, DevOps and SRE Agent, Multi-Agent Customer Support, Autonomous Code Review, Regulated Clinical\/Financial Agent, and Decentralised Fraud Detection.\u003c\/li\u003e\n\u003cli\u003e\n\u003cb\u003ePart VI - The Future of Agentic AI: \u003c\/b\u003e The agent as a first-class software entity (identity, accountability, versioning), coordination horizons and emergent behavior at scale, the convergence of trust and delegation standards, and the frontier of agent regulation and liability.\u003c\/li\u003e\n\u003c\/ul\u003eWho this book is for: \u003cul\u003e\n\u003cli\u003e\n\u003cb\u003eJava and Python engineers\u003c\/b\u003e implementing production agent runtimes from scratch.\u003c\/li\u003e\n\u003cli\u003e\n\u003cb\u003ePlatform and Infrastructure engineers\u003c\/b\u003e deploying agent workloads on Kubernetes and distributed event systems.\u003c\/li\u003e\n\u003cli\u003e\n\u003cb\u003eML Engineers\u003c\/b\u003e building evaluation, replay, and cost-governance infrastructure for non-deterministic systems.\u003c\/li\u003e\n\u003cli\u003e\n\u003cb\u003eSecurity Architects\u003c\/b\u003e extending enterprise threat models and governance frameworks to autonomous AI systems.\u003c\/li\u003e\n\u003cli\u003e\n\u003cb\u003eEngineering Leads\u003c\/b\u003e who need real-world case studies, including failure modes, that map to their own deployment context.\u003c\/li\u003e\n\u003c\/ul\u003ePrerequisites: \u003cp\u003eVolume 2 requires familiarity with the agentic loop, capability stack, and the MCP\/A2A protocol layer (all established in Volume 1), or equivalent engineering experience. The implementation language throughout Part I is Java 25 LTS. Parts II-VI are language-agnostic.\u003c\/p\u003e\u003cp\u003e\u003ci\u003eArchitecture is the set of decisions that are hard to reverse. 
Make them deliberately.\u003c\/i\u003e\u003c\/p\u003e","brand":"Independently Published","offers":[{"title":"Paperback","offer_id":47883365548183,"sku":"9798258511195","price":11857.0,"currency_code":"INR","in_stock":false}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0666\/3471\/1191\/files\/9798258511195.webp?v=1781101436","url":"https:\/\/atlanticbooks.com\/products\/engineering-ai-agents-for-production-infrastructure-observability-security-and-production-case-studies-9798258511195","provider":"Atlantic Books","version":"1.0","type":"link"}