Responsibilities:
Infrastructure & Cloud Operations:
- Manage and maintain cloud infrastructure across environments (development, staging, production)
- Configure and optimize servers, networking, storage, DNS, SSL, and load balancing
- Improve system reliability, uptime, scalability, and fault tolerance
- Monitor infrastructure health and proactively resolve issues before outages occur
CI/CD & Deployment Automation:
- Design and maintain CI/CD pipelines for mobile, backend, and frontend deployments
- Reduce deployment failures and deployment time
- Automate repetitive engineering and infrastructure workflows
- Improve rollback and release management processes
Monitoring, Logging & Incident Response:
- Implement centralized logging, monitoring, and alerting systems
- Improve visibility into crashes, regressions, API failures, and infrastructure bottlenecks
- Support root-cause analysis and postmortem processes
- Help establish operational best practices and reliability standards
Security & Reliability:
- Implement infrastructure security best practices
- Manage secrets, credentials, access controls, backups, and recovery processes
- Assist with hardening production systems and environments
- Identify infrastructure risks and reliability gaps
Engineering Enablement:
- Collaborate with software engineers to improve deployment quality
- Help standardize environments and reduce “works on my machine” issues
- Support software stabilization and release-readiness initiatives
- Improve developer productivity through tooling and automation
Requirements:
- Minimum 3 years of DevOps, Cloud Engineering, or Infrastructure experience
- Strong experience with Linux server administration
- Experience with AWS or GCP
- Experience designing CI/CD pipelines
- Experience with Docker and containerized environments
- Familiarity with reverse proxies, networking, DNS, SSL, and web servers
- Strong scripting skills (Bash, Python, or similar)
- Experience with monitoring/logging tools
- Strong debugging and problem-solving ability
- Comfortable working in fast-paced startup environments
- Highly organized and process-oriented
- Strong ownership mentality
- Proactive problem solver
- Calm under pressure during production incidents
- Able to balance speed with stability
- Strong systems thinking mindset
- Comfortable working across multiple products simultaneously
Preferred Qualifications:
- Experience supporting mobile app release pipelines
- Experience with Kubernetes or other orchestration tools
- Experience with Firebase, Supabase, or serverless architectures
- Familiarity with infrastructure-as-code tools
- Experience handling scaling or production reliability challenges
- Exposure to security hardening and compliance best practices
- Experience supporting AI/ML infrastructure or data pipelines
Success Metrics:
- Reduced production incidents and regressions
- Improved deployment reliability and release velocity
- Faster issue detection and resolution times
- Improved uptime and system performance
- Improved infrastructure security and observability
- Reduced manual operational workload for engineering teams