Posted 1 week ago
Building cost attribution and observability for AI workflows. Correlate application traces, model usage, and infrastructure costs to compute cost-per-outcome. Responsible for core data platform: ingestion pipelines, cost computation, and analytics i…
San Francisco
Posted 1 week, 1 day ago
Checkly helps engineers build reliable products by unifying testing, monitoring and observability. OpenTelemetry, Playwright, and Monitoring as Code are our foundation for unifying performance and reliability. In 2024, we raised $20M in Series B fun…
Posted 1 week, 1 day ago
At akeno, we help chemical manufacturers optimise production planning in large factories (asset utilisation, dependency graphs, real-time data, critical processes). Hiring 1x Senior/Staff Fullstack Engineer, 1x Senior/Staff DevOps Engineer, and 1x S…
Hamburg, Germany
Posted 1 week, 1 day ago
Ford is seeking an experienced Site Reliability Engineer to design, implement, and enhance agentic triage software that automates and speeds up incident response and lowers MTTX. The role involves hands-on work to ensure reliability and scalability of sy…
Remote (US)
Posted 1 week, 2 days ago
Technical founding team (part of the Nvidia Inception Program), backed by European VCs (SpeedInvest, Galion.exe). Building anomaly detection and agentic evaluation on top of an OTel-native AI agent observability stack. Ideal candidate has experience i…
Amsterdam, The Netherlands
Posted 1 month ago
Zefr is a global technology company enabling responsible marketing in walled garden social environments (YouTube, Meta, TikTok, Snap) using patented AI. Role: Staff or Principal Site Reliability Engineer. Responsibilities include applying expertise …
Los Angeles, CA (Marina del Rey)