
Job Information
Amazon Senior Software Development Engineer, ML Ops, AWS Infrastructure Science Engineering in Seattle, Washington
Description
AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain — and we’re looking for talented people who want to help.
You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. You’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.
Within AWS Infrastructure Services (AIS), the Science team takes on the exciting challenge of using big data and machine learning to optimize power and cooling, the most critical resources in our data centers. In short, we ensure maximum efficiency while preventing overheating and power outages. Our work helps shape future data center designs and drives exceptional cost savings for AWS customers.
As a Software Engineer on the AIS Science team, you will collaborate with scientists, program managers, and data engineers to build, operationalize, and scale machine learning workflows and platform services. Your work will directly impact how server demand is placed by modeling power and cooling load across AWS's global data centers.
You will play a critical role in building infrastructure meant to support all phases of ML models, from R&D to production, including model retraining and iteration. Our team tackles complex challenges in data processing, model hosting, and metric monitoring. As our responsibilities grow and the number of models we manage increases, we’re seeking an innovative senior engineer with a passion for data, machine learning, and MLOps to join our mission-driven team!
If you're passionate about machine learning and model operations, enjoy working in a collaborative and dynamic team that values work-life balance, and want to make a lasting impact on AWS infrastructure worldwide, this is your opportunity. Come join us on this exciting journey!
Key job responsibilities
In this role, you will leverage your engineering background and expertise in ML to lead the development of platforms for deploying, productionalizing, and scaling machine learning models, with a focus on variant retraining and ongoing model monitoring.
A day in the life
Lead the design and implementation of a stable and efficient training and inference infrastructure that scales to support a variety of different machine learning models.
Collaborate with tenured applied scientists and data engineers to develop improved training and inference infrastructure that accelerates innovation and promotes best practices in model scoring and model monitoring.
Quickly learn the ins and outs of AWS infrastructure’s rack planning and forecasting distributed workflows, and engineer solutions to make these systems more robust, fault-tolerant, and efficient across input and output orgs.
About the team
The software team you’ll be joining is called Lanner, part of AIS Science Engineering. We’re a tight-knit group of eight developers, including one Senior SDE, three junior engineers, and three entry-level engineers. We take pride in solving challenging problems and building impactful solutions, but we also value work-life balance. Our culture encourages healthy boundaries, and we make time to connect as a team through weekly happy hours, regular lunches, and occasional offsites and team events.
About AWS
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Why AWS?
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
Basic Qualifications
5+ years of non-internship professional software development experience
5+ years of programming experience with at least one software programming language
5+ years of experience leading design or architecture (design patterns, reliability, and scaling) of new and existing systems
Experience as a mentor, tech lead or leading an engineering team
Preferred Qualifications
5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
Master's degree in machine learning or equivalent
Experience with developing state-of-the-art, best practice MLOps tooling and frameworks
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $151,300/year in our lowest geographic market up to $261,500/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits. This position will remain posted until filled. Applicants should apply via our internal or external career site.