Role Description
- The Agentic AI Architect is a senior technical leader who can architect and implement production-grade agentic systems.
- This role blends platform architecture, hands-on Python/LLM engineering, and enterprise governance.
- Co-architect end-to-end agentic solutions: orchestration patterns, service boundaries, integration architecture, and non-functional requirements.
- Define engineering standards and best practices for safe, observable, testable agentic workflows (tool calling, structured outputs, approvals, traceability).
- Provide hands-on technical leadership: design reviews, reference implementations, complex debugging, and performance/scalability tuning.
- Ensure enterprise readiness: security-by-design, audit evidence, data controls, and operational stability across DEV/UAT/PROD.
Location
- Pune (on-site/hybrid).
Experience Requirements & Qualifications
Core Experience
- 8+ years in software engineering / platform engineering / solution architecture, including building production backend systems.
- 2+ years designing or building AI/LLM-enabled systems (agentic workflows, orchestration, RAG, evaluation/monitoring).
- Proven experience operating in enterprise environments with security, compliance, and change-management expectations (financial services preferred).
- Strong capability in distributed systems design: service decomposition, API contracts, asynchronous execution, retries/idempotency, failure isolation, and resilience.
- Experience designing workflow/agent execution architectures: worker patterns, job queues, scheduling, state management, and execution traceability.
- Ability to translate ambiguous objectives into implementable architecture and delivery plans (tradeoffs, risks, phased rollouts).
- Advanced Python engineering skills: clean architecture, modularity, testability, performance profiling, packaging, and secure coding.
- Strong experience building API-first services (FastAPI or equivalent), including auth patterns (OAuth2/JWT/API keys), versioning, and backwards compatibility.
- Strong schema/data contract practice using typed models and validation (e.g., Pydantic-style patterns).
- Deep understanding of reliable LLM interaction patterns, including tool/function calling and safe tool execution boundaries.
- Experience implementing and integrating MCP servers for agentic use cases.
- Multi-step planning/routing, guardrails, and human-in-the-loop approvals
- Grounding and retrieval patterns (RAG), citation/evidence generation, and prompt/version management.
- Strong understanding of different agentic frameworks, their strengths and weaknesses, and the ability to suggest custom framework designs.
- Test harnesses, golden datasets, regression testing for prompts/agents
- Safety testing, hallucination mitigation strategies, and cost/performance controls
- Observability: application event tracking, traces, metrics, decision logs, prompt lineage, dashboards, prediction capabilities, run replay
Platform / DevOps Awareness (Strong Preference)
Comfortable with on-prem/cloud infrastructure, including:
- Collaboration with infra teams on on-prem/hybrid environments
- Containers (Docker), Kubernetes/OpenShift fundamentals (deployments, secrets, ingress)
- Logging/monitoring patterns, production readiness, capacity planning
- Secrets management, RBAC, audit logging, and environment separation
Nice-to-Have
- Experience with agent orchestration frameworks (e.g., LangGraph-like patterns) and LLM observability tools (Langfuse-like capabilities).
- Experience working with open-source frameworks and customizing existing agent frameworks to support complex business requirements.
- Experience with enterprise-hosted LLMs and with designing vendor-agnostic model abstraction layers.
- Exposure to orchestration tools such as n8n, especially for trigger/action workflows and enterprise automation.
Main Tasks and Responsibilities
1) Co-Architect Enterprise Agentic Solutions
- Co-own the reference architecture for agentic solutions (components, interfaces, execution model, security controls).
Define solution patterns for:
- Routing/planning agents, tool execution agents, retrieval/grounding, approvals, and escalation paths
- Multi-agent or multi-step workflows with deterministic control points
- Evidence capture, explainability notes, and audit-ready outputs
2) Establish Engineering Standards & Best Practices
Create and enforce standards for:
- Prompt/agent versioning, structured output contracts, and validation
- Safe tool execution (permissions, allow-lists, constrained actions)
- Error handling, retries, idempotency, and compensating actions
- Logging/tracing conventions and runbook expectations
- Define the “definition of done” for production-grade agentic workflows (security, tests, observability, documentation).
3) Lead Technical Delivery
- Lead design reviews and support implementation teams (developers, infra, analysts).
- Provide hands-on guidance for complex builds: reference implementations, difficult integrations, performance bottlenecks, and incident triage.
- Mentor Agentic Developers; uplift overall code quality and engineering maturity through reviews and coaching.
4) Enterprise Security, Governance, and Risk Controls
Partner with security and client governance stakeholders to implement:
- Secrets handling, RBAC, identity integration patterns, audit logging
- Data residency and data handling controls, redaction/masking where required
- Change/release controls and environment promotion practices
- Ensure designs comply with the expectations of regulated clients (especially BFSI).
5) Observability, Evaluation, and Continuous Improvement
- Define evaluation strategy (offline + online): regression suites, acceptance metrics, and monitoring thresholds.
- Implement/guide observability practices: traces, metrics, cost/performance tracking, decision logs.
- Drive post-go-live optimization: reduce failure rates, improve determinism, control latency/cost, and improve maintainability.


