
Job Information
SciTec Senior Machine Learning Operations Engineer in Boulder, Colorado
Job Summary
We are seeking an experienced Machine Learning Operations (MLOps) Engineer to join and help shape our new MLOps team. This role focuses on deploying and optimizing machine learning models for always-on, high-availability systems in real-world, real-time unclassified and classified environments. As part of a new and growing team, you will have the unique opportunity to evangelize MLOps practices, contribute to the development of an on-premises development platform, and drive innovation in mission-critical applications.
Responsibilities
Deploy and maintain high-performing ML models (e.g., ensembles of LSTMs and Random Forests) in real-time environments
Monitor deployed models for drift or performance degradation and implement automated retraining pipelines
Implement advanced deployment strategies (e.g., Blue-Green, Canary, Champion-Challenger)
Develop modular and flexible ML pipelines that ensure uptime and reliability
Build and manage scalable infrastructure using Kubernetes, Docker, Terraform, and related tools
Design and implement an on-premises development platform using Kubeflow to replicate cloud capabilities in classified environments
Set up robust monitoring, logging, and alerting systems using Prometheus, Grafana, and Loki
Optimize performance metrics like inference latency and system throughput while ensuring fault tolerance
Work with cross-functional teams, including Data Engineering, Machine Learning, and DevOps, to integrate and enhance ML systems
Define touchpoints and handoffs with DevOps and Data Engineering to ensure seamless integration of ML workflows with existing infrastructure and data pipelines
Mentor junior team members and contribute to building a collaborative and innovative team culture
Other duties as assigned
Requirements
8+ years of experience, including leading large-scale ML model deployments and scaling production environments
Expertise in architecting Python applications for large-scale systems, mentoring junior engineers in Python best practices, and optimizing code for high performance
Proven leadership in designing enterprise-grade CI/CD systems, incorporating advanced features like parallel testing, rollback strategies, and security hardening
Advanced expertise in designing and optimizing distributed pipelines with Protobufs and ZeroMQ, ensuring fault tolerance and scalability
Advanced expertise in designing workflows using MLflow or Kubeflow to streamline experimentation and production deployments
Expertise in architecting complex Kubernetes and Terraform configurations for distributed systems, incorporating advanced features like auto-scaling and load balancing
Preferred Qualifications
Familiarity with C++ and/or Rust
Experience with workflow orchestration tools such as Airflow or Prefect
Experience with distributed data processing frameworks such as PySpark
Familiarity with SQL and modern database and storage technologies (e.g., YugabyteDB, MinIO)
Experience with DVC, Ansible, Kustomize, Helm, Prometheus, and Grafana
Understanding of secure software development practices and/or experience working in classified environments
Education
Bachelor’s, Master’s, or PhD in Computer Science, Engineering, or a related technical field
Relevant certifications (e.g., Certified Kubernetes Administrator, Certified Kubernetes Application Developer, Terraform Associate) are a plus
Soft Skills
Strong problem-solving and analytical skills
Excellent verbal and written communication and collaboration skills
Ability to thrive in a dynamic, fast-paced environment
Strong attention to detail
Benefits
SciTec offers a highly competitive salary and benefits package, including:
Employee Stock Ownership Plan (ESOP)
3% Fully Vested Company 401(k) Contribution (no employee contribution required)
100% company-paid HSA Medical insurance, with a choice of 2 buy-up options
80% company-paid Dental insurance
100% company-paid Vision insurance
100% company-paid Life insurance
100% company-paid Long-term Disability insurance
Short-term Disability insurance
Annual Profit-Sharing Plan
Discretionary Performance Bonus
Paid Parental Leave
Generous Paid Time Off, including Holiday, Vacation, and Sick Pay
Flexible Work Hours
The pay range for this position is $141,000 to $168,000 per year. SciTec considers several factors when extending an offer of employment, including but not limited to the role and associated responsibilities, a candidate's work experience, education/training, and key skills. This is not a guarantee of compensation.
SciTec is committed to hiring and retaining a diverse workforce and is proud to be an Equal Opportunity/Affirmative Action employer. M/F/VETS/Disabled