Remote job description
The AI Infrastructure team is at the heart of AI and ML at Scale, supporting all Scale products with our infrastructure and services. As a member of the AI Infrastructure team at Scale, you will be responsible for building and scaling the infrastructure that powers all machine learning at Scale. You will also develop a number of critical machine learning services used by all Scale products. This includes supporting the AI and ML workloads for products such as Enterprise Labeling, Rapid, Nucleus, Document AI and Content Understanding.
We are building a large hybrid human-machine system in service of ML pipelines for dozens of industry-leading customers. We currently complete millions of tasks a month, and will grow to complete billions of tasks monthly.
You will:
- Lead the evolution of our infrastructure to support Scale's rapid growth.
- Develop best-in-class scaling, monitoring and observability tools for our infrastructure.
- Develop foundational ML services that will be used across Scale's product portfolio.
- Be a self-starter who can own projects end-to-end, from requirements and scoping through design and implementation.
- Have good taste in building systems and tools, know when to make build vs. buy trade-offs, and have an eye for cost efficiency.
- Have attention to detail and a good sense for automation, debugging, and troubleshooting.
Ideally you'd have:
- 4+ years of work experience.
- Experience with Python, Docker, Kubernetes, Infrastructure as Code, CI/CD pipelines and monitoring tools.
- Experience managing, scaling and monitoring clusters in production.
- Experience building and deploying complex microservice architectures.
- Ability to design software and systems and ship high-quality code to production.
- Excellent communication skills and ability to work with multiple product leads.
Nice to have:
- Experience building machine learning training pipelines or inference services in a production setting.
- Experience with machine learning frameworks and libraries (e.g. PyTorch, TensorFlow).
- Experience with big data tools (e.g. Spark, Hadoop) and building ETL and streaming pipelines.
At Scale, we believe that the transition from traditional software to AI is one of the most important shifts of our time. Our mission is to make that happen faster across every industry, and our team is transforming how machine learning can build innovative products. Our products provide access to human-powered data for hundreds of use cases and are used by industry leaders such as OpenAI, Lyft, Meta, GM, Samsung, Airbnb, NVIDIA, and many more. We've recently raised $325 million in Series E funding at a valuation of $7B+ and are expanding our team to accelerate the development of AI applications.
We believe that everyone should be able to bring their whole selves to work, which is why we are proud to be an inclusive and equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity or Veteran status.
We are committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us at firstname.lastname@example.org. Please see the United States Department of Labor's EEO poster and EEO poster supplement for additional information.
Summary
Company name: Scale
Remote job title: Senior Software Engineer, AI Infrastructure
Job tags: api, software
Posted: 305 days ago