DevOps Engineer
Excelon Solutions · Princeton, NJ
Visa: unknown · Salary: unknown · Work mode: unknown
Skills
ansible, aws, datadog, devops, docker, grafana, jenkins, kafka, kubernetes, prometheus, splunk
Description
Cloud/DevOps Engineer
Location: Princeton, NJ (Onsite)
Tool/Technology Requirements:
- Kubernetes
- Cloud – AWS
- Docker Repository - Harbor
- Planning and Designing - Jira & Confluence
- Source Code Versioning - Git using GitHub
- Persistent data stores - RDS, S3, ES
- Unified Auth - Okta
- Configuration & Deployment Management – Ansible
- Infrastructure Monitoring – Datadog, Prometheus with Grafana
- Log Monitoring – Splunk & ELK stack
- Emergency Response, Alerting, Chat & Notification – PagerDuty & Slack
- Service Mesh - Linkerd
- Ingress - Contour, Envoy
- Metrics - Prometheus
- Logs - Fluentd + Kafka + ES + Kibana
Responsibilities:
- Provide Level 3 support for the above-mentioned technologies.
- Manage Kubernetes clusters.
- Handle, register, and triage client issues.
- Participate in postmortem meetings and create Jira issues for all action items.
- Debug queue-related issues in Kafka and RabbitMQ.
- During an outage, understand what's happening and participate actively by providing relevant information: errors, tracebacks, knowledge from the playbook, or output from executing a command.
- Address user login or token issues in the Okta dashboard.
- Trigger deploys and other Jenkins jobs on production/hydra Jenkins.
- Notify customers of maintenance and any service disruptions.
- Manage infra nodes in AWS.
- Troubleshoot node-related issues from the console.
- Manage infra in AWS and work closely with application teams on pod- and node-related fixes.
- Set up monitoring tools for the clusters.
- Rotate AWS keys and provision access for Kensho users.
- Modify user limits for Scribe related to transcription.
- Upgrade the cluster version as per platform requirements.
- Create new namespaces, take part in requirement gathering, and support them through the entire life cycle.
- Make infrastructure changes related to clusters, resources, and their availability.
- Maintain the infra effectively by coordinating with the AWS vendor.
Preferred Certifications (any one):
- Cloud-related certification.
- Kubernetes certification.