The team builds VibeData, an Azure-native platform that transforms business needs into secure, AI-ready data products through strong governance, automation, and observability. This role is responsible for the infrastructure, provisioning workflows, security foundations, and delivery pipelines that keep AI agents and platform services running reliably across multiple customer environments.
Key Responsibilities
- Design robust, composable, and idempotent Terraform modules that AI agents can safely orchestrate.
- Build and maintain fully automated provisioning pipelines for VibeData workspaces, covering Azure resources, identity, networking, compute, and data services.
- Implement zero-trust security patterns (Managed Identity, RBAC, Key Vault, secret rotation).
- Own the multi-tenant isolation model across customer environments.
- Develop end-to-end observability pipelines using Azure Monitor, Log Analytics, OpenTelemetry, and App Insights.
- Enable closed-loop observability where telemetry informs automated decisions or agent actions.
- Build and operate CI/CD pipelines (Azure DevOps or GitHub Actions) for all components, including Control Plane, Studio, Agents, and Terraform modules.
- Maintain automated release workflows and environment promotion processes.
- Ensure all infrastructure is fully reproducible, self-describing, and managed as IaC.
- Collaborate with Platform, AI Engineering, and Full-Stack teams to deliver end-to-end features.
- Contribute to architectural, security, networking, and operational standards.
- Lead incident response for infrastructure-level issues, including diagnostics and system tuning.
Required Technical Skills
Infrastructure as Code & DevOps
- Advanced Terraform expertise (module composition, lifecycle, testing, registries).
- Strong understanding of Azure architecture and resource design.
- Experience building reusable, opinionated Terraform modules.
- Solid knowledge of idempotency, dependencies, and lifecycle behavior.
Azure Platform Expertise
Hands-on experience with:
- Azure App Service, Azure Functions, AKS
- Storage Accounts, VNets, private endpoints, firewalls
- Key Vault, Managed Identity
- Cosmos DB, Log Analytics, Azure Monitor, Alerts
- RBAC and Entra ID (strong working understanding)
CI/CD & Release Engineering
- Experience with Azure DevOps or GitHub Actions for multi-stage pipelines.
- Automated build, test, security scanning, and deployment flows.
- Versioning, artifact management, and environment promotion.
- Blue/green or canary deployments (nice to have).
Observability & Diagnostics
- Practical experience with OpenTelemetry (metrics, logs, traces).
- Ability to implement full-system observability across apps and infrastructure.
- Skill in building real-time dashboards, alerts, and SLOs.
- Strong debugging abilities for distributed systems.
Security & Governance
- Zero-trust design principles and least-privilege access.
- Key Vault / KMS integrations, secret rotation, encrypted data paths.
- Policy-as-code (OPA, Azure Policy) (nice to have).
Automation & Systems Thinking
- Ability to build fully automated, immutable infrastructure.
- Commitment to avoiding manual provisioning and ad-hoc scripting.
- Ability to design systems as state machines, not runbooks.
Traits We're Looking For
- High ownership and ability to manage full infrastructure subsystems.
- Clear, effective communication with cross-functional teams.
- Automation-focused mindset.
- Comfort working in fast-paced and ambiguous environments.
- Strong debugging and incident resolution skills.
- Familiarity with AI-assisted coding and automation workflows.
Tech Stack
Infra & IaC: Terraform, AzureRM provider, Azure CLI, Azure DevOps, GitHub Actions
Azure Services: App Service, Functions, AKS, Cosmos DB, Storage, Key Vault, Entra ID, Log Analytics, Monitor, ACR, VNets
Observability: Azure Monitor, App Insights, OpenTelemetry
Automation: Azure DevOps Pipelines, GitHub Actions, Terraform Cloud (optional)
Workflow Tools: Cursor/Claude, GitHub, Azure DevOps, Linear