Lead DevOps Engineer - AI Platforms [T500-26159]
Suffolk Global
Job Description
About Suffolk:
Suffolk is a national enterprise that builds, innovates, and invests. We provide value across the entire project lifecycle through our core construction management services and complementary business lines in real estate investment, design, self-perform construction, and technology start-up investment (Suffolk Technologies). By integrating data, artificial intelligence, and advanced technology through our Seamless Platform, we connect design, construction, and operations to deliver smarter, more predictable results and redefine how America builds.
Suffolk – America’s Contractor – is a national company with more than $10 billion in annual revenue, 3,000 employees, and 17 offices, including Boston (headquarters), New York City, Miami, West Palm Beach, Tampa, Estero, Dallas, Los Angeles, San Francisco, San Diego, Las Vegas, Herndon, U.S. Virgin Islands, and other key markets. Suffolk manages some of the most complex and transformative projects in the country, serving clients across healthcare, life sciences, education, gaming, aviation, transportation, government, mission critical, and commercial sectors.
Suffolk is privately held and is led by founder, chairman and CEO John Fish. Suffolk is ranked #8 on ENR’s list of “Top CM-at-Risk Contractors.”
About Suffolk Global:
Suffolk Global is a strategic extension of Suffolk Construction, established to unlock the full potential of a globally integrated delivery model. Based in Bangalore, India, Suffolk Global brings together world-class talent, advanced technology, and innovative processes to enhance how Suffolk designs, plans, and builds.
As a critical part of Suffolk’s long-term growth strategy, Suffolk Global enables greater speed, scalability, and technical rigor across the business. At the intersection of process standardization, AI-enabled innovation, and global talent, Suffolk Global is redefining how work is executed, embedding consistent workflows, leveraging technology to drive efficiency, and building high-performing teams aligned to Suffolk’s evolving needs.
By embedding deeply with U.S.-based teams, the platform supports high-impact work across design, digital delivery, and corporate functions. Powered by India’s exceptional talent pool, Suffolk Global is building a next-generation capability that combines deep technical expertise with a relentless focus on quality, efficiency, and continuous innovation, helping redefine what’s possible in the built environment.
About Artificial Intelligence (AI):
Suffolk’s AI team is at the forefront of redefining how AI transforms the built environment. Operating as a core part of Suffolk’s global innovation ecosystem, the team works at the intersection of construction, technology, and advanced analytics to design and deploy AI-driven solutions that address real-world challenges across jobsites and enterprise functions.
The work spans a broad range of IT capabilities, including enterprise applications, end-user support, systems integration, and platform delivery. The team helps ensure seamless connectivity across systems, reliable day-to-day operations, and a strong foundation for continued digital growth across the business.
This is an opportunity to build and scale cutting-edge AI capabilities in a highly applied environment, while helping shape Suffolk’s vision for the ‘Construction Site of the Future’ and redefining what’s possible in the built environment.
Overview:
Own infrastructure, CI/CD, and reliability for AI systems.
Key Responsibilities:
- Implement infrastructure as code (IaC) and CI/CD pipelines
- Manage cloud infrastructure and scaling
- Ensure security and compliance readiness
Qualifications:
Experience:
- 8–10 years of DevOps/SRE experience
- Experience supporting data/ML or high-scale backend systems
Cloud Expertise:
- Strong AWS experience:
  - Compute (EC2, Kubernetes, serverless)
  - Networking, IAM, security
Infrastructure as Code:
- Deep experience with:
  - Terraform (must-have)
  - CI/CD tools (GitHub Actions, Jenkins, etc.)
AI/ML Systems Awareness:
Experience supporting:
- GPU workloads or inference systems (nice to have)
- Data pipelines or batch/stream processing
- LLMOps, MLOps
Reliability & Security:
Experience implementing:
- Monitoring, alerting, incident response
- Security controls (secrets, IAM, audit logs)
Cost Optimization:
Ability to manage:
- Cloud spend (critical for LLM workloads)
- Scaling strategies