Description
About Pinterest:
Millions of people around the world come to our platform to find creative ideas, dream about new possibilities and plan for memories that will last a lifetime. At Pinterest, we’re on a mission to bring everyone the inspiration to create a life they love, and that starts with the people behind the product.
Discover a career where you ignite innovation for millions, transform passion into growth opportunities, celebrate each other’s unique experiences and embrace the flexibility to do your best work. Creating a career you love? It’s Possible.
At Pinterest, AI isn't just a feature; it's a powerful partner that augments our creativity and amplifies our impact, and we’re looking for candidates who are excited to be a part of that. To get a complete picture of your experience and abilities, we’ll explore your foundational skills and how you collaborate with AI.
Through our interview process, what matters most is that you can always explain your approach, showing us not just what you know, but how you think. You can read more about our AI interview philosophy and how we use AI in our recruiting process here.
The Production Engineering organization at Pinterest is accountable for ensuring overall Pinterest availability as well as enhancing Engineering teams' capability to design, build and operate robust systems at scale. Pinterest's applications and infrastructure handle billions of monthly page views and petabytes of data as Pinterest continues to grow and scale.
As a Senior Production Engineer on Solutions Engineering, you will design and build AI agents, platforms, tools, frameworks and methodologies to assure the reliability of our large-scale distributed systems serving hundreds of millions of monthly active users, handling hundreds of thousands of requests per second, and managing tens of petabytes of data. You'll lead infrastructure modernization initiatives, build intelligent automation that eliminates operational toil and amplifies engineering productivity, and transform successful consulting patterns into reusable platforms that democratize reliability expertise across Pinterest's 2500+ engineers.
What you’ll do:
- Design and build AI agents that augment production reliability work - Develop agents that assist engineers with service health analysis, reliability recommendations, migration playbook generation, and risk identification, enabling faster decision-making while keeping humans in the loop for critical judgment calls
- Drive large-scale infrastructure modernization with AI-accelerated execution - Lead Kubernetes adoption and platform transitions using AI to generate automation, accelerate delivery, and create patterns that enable self-service adoption for standard use cases while tackling novel architecture challenges
- Transform consulting patterns into scalable platforms - Execute scoped reliability engagements with engineering teams, then encode successful approaches into AI-assisted tools, automation, and self-serve documentation that enable teams to handle similar problems independently while escalating complex challenges to experts
- Build the knowledge infrastructure that powers Pinterest's operational agent ecosystem - Create migration playbooks, operational runbooks, incident patterns, and best practices that democratize reliability expertise and raise the baseline capabilities of all Pinterest engineers
- Develop software solutions to enable reliability and operability of large-scale distributed systems - Build a deep understanding of how Pinterest's systems behave, scale, interact and fail, and use that insight to identify risks and opportunities for remediation through automation
- Build tools and automation to eliminate toil and reduce operational overhead - Create frameworks, processes and best practices that encode reliability expertise into software, making operational excellence accessible to all engineers while freeing experts to tackle harder problems
- Build meaningful, insightful and actionable SLIs - Develop service level indicators that provide clear signals of system health and enable data-driven reliability decisions across Pinterest Engineering
- Automate critical portions of Pinterest's engineering processes - Build automation that minimizes risk and maximizes the speed of innovation, enabling safe, rapid deployment and operational changes at scale
- Manage capacity and performance to help scale our infrastructure - Partner with teams to plan and optimize capacity across public and private clouds around the world, ensuring efficient resource utilization as Pinterest grows
What we’re looking for:
- 5+ years of industry experience building and operating large-scale, high-performance distributed systems
- Bachelor's degree in Computer Science or related field, or equivalent experience
- Strong programming skills in Python or Go, with the ability to build production-grade platforms, agents, and automation
- Deep knowledge of Linux/Unix internals and experience with open source infrastructure (MySQL, Kafka, Envoy, Hadoop, etc.)
- Infrastructure as Code experience (Terraform, Puppet, Chef, Ansible, Docker, Kubernetes)
- Experience deploying web applications to cloud infrastructure (AWS, GCP, or Azure) and working with distributed, service-oriented architecture
Preferred:
- Experience developing AI agents for infrastructure automation, operational decision-making, or reliability workflows
- AI/ML infrastructure experience (LLM-based systems, model serving, agentic workflows)
- Technical consulting or embedded SRE experience with cross-functional engineering teams
In-Office Requirement Statement:
- We let the type of work you do guide the collaboration style. That means we're not always working in an office, but we continue to gather for key moments of collaboration and connection.
- This role will need to be in the office for in-person collaboration 1-2 times every 6 months and therefore can be situated anywhere in the country.
Relocation Statement:
- This position is not eligible for relocation assistance. Visit our PinFlex page to learn more about our working model.
At Pinterest we believe the workplace should be equitable, inclusive, and inspiring for every employee. In an effort to provide greater transparency, we are sharing the base salary range for this position. The position is also eligible for equity. Final salary is based on a number of factors including location, travel, relevant prior experience, or particular skills and expertise.
Information regarding the culture at Pinterest and benefits available for this position can be found here.
Our Commitment to Inclusion:



