• The Cloud Operations Engineer is part of an agile, dynamic team of technical experts responsible for the delivery of cloud-based services and solutions for the enterprise.
• The ideal candidate will bring cloud-based expertise and experience in the build, deployment, and operation of software-defined infrastructure and services for business initiatives.
• This role supports strategic technology standards and manages the cloud environment.
• The successful candidate will join a team that uses Agile methodologies to ensure around-the-clock availability and performance of cloud-based environments, working closely with other parts of the organization to build and maintain world-class systems.
Responsibilities:
• Work in an Agile-based environment to build, operate, monitor and maintain cloud-based platforms and solutions for mission-critical systems
• Instrument systems to provide the best possible operational monitoring and metrics
• Assist with performance testing and tuning of complex and high-traffic environments
• Update and enhance written processes and documentation with a focus on ease of understanding and completeness
• Develop relationships with and work alongside other functional IT groups such as networking, storage, database and security
• Participate in the on-call rotation, receiving and responding to daytime and after-hours alerts
• Perform after-hours or weekend system maintenance and application support as needed
• Participate in change management and incident management processes
• Engage in problem resolution and root cause analysis of system and application incidents
Skills and Qualifications:
• Operational experience running workloads on at least two of the major public cloud providers (Azure, AWS, or GCP) across a portfolio of services (VMs, Containers, Functions, Microservices, Automation/Orchestration, Networking/Content Delivery, Database, etc.)
• Specific experience or familiarity with a combination of GitHub, Ansible, Docker, Kubernetes, Splunk and related cloud systems and tools
• Specific experience or familiarity with at least two scripting languages or interfaces such as Python, AWS CLI, Azure CLI, Unix Shell (bash), or PowerShell
• 5+ years of operational experience in a large-scale enterprise environment, including the implementation and support of enterprise-scale Java-based applications
• Azure, AWS, or GCP certifications are a plus
• Demonstrable experience with performance tuning, troubleshooting, and resolving problems quickly and effectively in a production environment
• Working knowledge of networking and Internet protocols, including TCP/IP
• General understanding of standard IT security concepts as they relate to production environments
• Working knowledge of disaster recovery, high availability and other technologies and principles that support business continuity; experience with DR capabilities in cloud and/or virtualized environments
• Expertise with enterprise data center technologies including storage platforms, network switching, and security infrastructure within a virtualized data center
• Experience with standard software development methodologies; knowledge of Agile methodology is highly desirable
• Good project management skills and the ability to work as part of a team to organize, plan, and execute projects from vision through implementation
• Strong organizational, problem-solving, and analytical skills
• Ability to foster and maintain both a positive team culture and service-centric environment
• Excellent verbal, presentation, and written communication skills