System Architect Director - AI Platform Engineering - #304457

CNA Insurance

Date: 1 day ago

City: Chicago, IL

Salary: $97,000 - $189,000 per year

Contract type: Full time

You have a clear vision of where your career can go. And we have the leadership to help you get there. At CNA, we strive to create a culture in which people know they matter and are part of something important, ensuring the abilities of all employees are used to their fullest potential.

The System Architect (SA) Director for AI Platform Engineering serves as the technical owner of the enterprise AI platform, the shared foundation powering all AI and GenAI products across the organization. This leader owns the platform's architecture, engineering standards, and delivery roadmap, translating strategic AI capabilities into reliable, scalable, and governed platform capabilities that accelerate every product team building on top of them.

Working in close partnership with Enterprise Architects, Product Management, and Release Train Engineers (RTEs), the SA Director ensures that platform investments are tightly aligned to business outcomes, compliance requirements, and engineering excellence. This role combines the strategic depth of a principal architect with the hands-on leadership of a delivery-focused engineering director.

JOB DESCRIPTION:

Essential Duties & Responsibilities

Performs a combination of duties in accordance with departmental guidelines:

Own and continuously evolve the enterprise AI Platform reference architecture, encompassing all critical layers, including model serving, orchestration engines, data and knowledge grounding pipelines, and observability infrastructure, and ensure the platform scales reliably to enterprise-grade workloads and usage patterns.

Define and enforce platform-wide standards, reusable design patterns, and golden-path templates that enable product and feature teams to build, deploy, and operate AI solutions safely, consistently, and with significantly reduced time-to-production.

Drive end-to-end delivery of new platform capabilities, from initial technical discovery and architecture design through prototyping, hardening, and full production rollout, while maintaining meaningful hands-on involvement at critical technical milestones to ensure quality and coherence.

Architect and operationalize the core platform service catalog, including LLM gateway and routing layers, prompt lifecycle management, agentic orchestration frameworks, Retrieval-Augmented Generation (RAG) pipelines, vector stores, model registries, and rigorous automated evaluation infrastructure.

Build and maintain robust CI/CD and AIOps pipelines specifically designed for AI systems, incorporating automated evaluation gates, model and data versioning controls, staged deployment promotion, and continuous cost and performance optimization guardrails.

Architect enterprise-grade multi-agent and single-agent workflow patterns for high-value business use cases, establishing clear standards for orchestration design, state and memory management, tool and API integration, and safe autonomy controls including human-in-the-loop approvals, permission scoping, and comprehensive audit trails.

Design and implement knowledge grounding systems — spanning hybrid retrieval strategies, semantic reranking, ontology-driven entity modeling, and knowledge graph integration — to measurably improve AI output accuracy, traceability, and readiness for regulatory audit.

Embed responsible AI and compliance-by-design principles into every layer of the platform, covering data privacy protections, enterprise secrets management, granular access controls, output leakage prevention, and model risk governance practices aligned to enterprise and regulatory standards.

Actively shape PI Planning by authoring well-defined Enabler Epics and articulating architectural outcomes that anchor near-term delivery and long-horizon platform capability roadmaps, while contributing expert WSJF input to balance platform investment against feature team needs, risk reduction, and time-to-impact.

Directly manage, mentor, and grow a high-performing team of platform engineers, solution architects, and technical specialists — hiring hands-on builders, coaching technical leadership skills, and sustaining a healthy innovation pipeline that continuously advances the organization's AI platform maturity.

May perform additional duties as assigned.

Skills, Knowledge & Abilities

Deep AI Platform and AIOps engineering expertise, including hands-on experience designing, deploying, and operating shared AI platform capabilities such as model serving layers, LLM gateway and proxy services, prompt registries, vector databases, and automated evaluation harnesses at enterprise scale.
Proven agentic system design capability, with hands-on experience architecting multi-agent and single-agent workflow systems using orchestration frameworks such as LangGraph and Google ADK, including tool and function calling patterns, state and memory persistence strategies, and robust safe autonomy controls.
Applied GenAI depth spanning LLM solution architecture patterns, model selection and routing strategies, advanced prompt engineering techniques, fine-tuning and RLHF tradeoffs, and production-grade RAG and hybrid retrieval system design and optimization.
Strong cloud-native and distributed systems architecture skills, with deep GCP expertise across Vertex AI, Cloud Run, GKE, Pub/Sub, and BigQuery, and a solid command of API and service-based design, event-driven architecture, and high-availability and fault-tolerant system patterns.
Knowledge grounding and semantic layer proficiency, including experience building canonical ontology and entity models, designing vector search and hybrid retrieval pipelines, integrating knowledge graphs, implementing reranking strategies, and establishing citation and traceability mechanisms that support compliance.
Solid AIOps and platform reliability engineering experience, including CI/CD pipeline design for AI systems, automated evaluation and quality gates, model and dataset versioning, production monitoring and observability, reliability engineering practices, and systematic cost-performance optimization.
Practical responsible AI and security expertise, with demonstrated experience implementing enterprise AI governance frameworks, model risk management programs, PII and data privacy controls, audit and event logging, and compliance-by-design patterns suited to regulated industries.
Strong SDLC and hands-on engineering fundamentals, including Python proficiency, architectural and code review practices, comprehensive testing strategies for AI systems, technical debt management, refactoring discipline, and operational readiness standards.
Scaled Agile (SAFe) leadership experience, including decomposing long-horizon strategy into actionable Enabler Epics and shaping PI Planning outcomes.
Exceptional leadership and communication skills, with a demonstrated ability to influence senior stakeholders and cross-functional teams, negotiate complex technology tradeoffs, mentor and develop engineers at all levels, and translate deep technical concepts into compelling narratives for non-technical business audiences.

Education & Experience

Bachelor's degree in Computer Science, Software Engineering, Information Technology, or equivalent required; Master's degree in AI, Machine Learning, Data Science, or related discipline strongly preferred.
10+ years in software engineering and technical delivery, with demonstrated ownership of large-scale, distributed enterprise systems across the full SDLC from inception through production operations.
5+ years in system or solution architecture, with a track record of producing reference architectures, design patterns, technical standards, and enterprise-scale platform guardrails.
5+ years of direct people leadership, including hiring, performance management, career development, and building high-performing engineering and architecture teams.
5+ years hands-on designing, delivering, and operating AI/ML or GenAI platform capabilities in production, with measurable outcomes in quality, reliability, and developer adoption.
Strong Python proficiency and deep practical GCP experience — Vertex AI, GCP Agent Builder, and Gemini — with the ability to engage credibly in hands-on technical work alongside the engineering team.
Prior experience in regulated industries (insurance, financial services, or healthcare) strongly preferred, given stringent governance, auditability, and model risk management requirements.
Consulting or enterprise delivery background is a plus, bringing structured problem-solving and stakeholder management skills.

#LI-KJ1 #LI-HYBRID

In certain jurisdictions, CNA is legally required to include a reasonable estimate of the compensation for this role. In the District of Columbia, California, Colorado, Connecticut, Illinois, Maryland, Massachusetts, New York and Washington, the national base pay range for this job level is $97,000 to $189,000 annually. Salary determinations are based on various factors, including but not limited to relevant work experience, skills, certifications and location. CNA offers a comprehensive and competitive benefits package to help our employees – and their family members – achieve their physical, financial, emotional and social wellbeing goals. For a detailed look at CNA’s benefits, please visit cnabenefits.com.

CNA utilizes AI-enabled technology during the recruiting process. For more information, please visit our careers page.

CNA is committed to providing reasonable accommodations to qualified individuals with disabilities in the recruitment process. To request an accommodation, please contact [email protected]
