Site Reliability Engineer (SRE)
Reliability Engineer
Senior (5-10 years)
Remote from:
United States
Reliability Engineer
Remote:
Full Remote
Experience:
Senior (5-10 years)
Work from:
United States
Resource Informatics Group, Inc
SME
https://www.rigusinc.com/
51 - 200
Employees
Company description
About Resource Informatics Group, Inc
We are a staffing and IT solutions company that provides a platform for finding the right fit for your desired job profile. The employment situation shifts constantly with the changing times, and we are here to ensure that you gather a workforce that complements your long-term goals. We understand the struggle of searching for talent with the skills and qualifications for specific profiles, talent that blends in with the theme of your organization. We aim to be the best available source for young talent to find their dream jobs by helping them narrow down their options to the most suitable work profiles available in the market. At RIG, we consistently work towards creating the latest technology that not only simplifies your work process but also provides you with the most cost-effective solutions. We work hard to make sure that your business strives in the market to be at the top of your field.
Job description
We are currently seeking a Reliability Engineer to support critical projects for our Technology, Infrastructure & Operations team, for immediate start.
The position is 100% remote. The qualifications are listed below. The candidate MUST have strong working knowledge of and experience with Datadog. Please send candidate resumes to
EXPERT WITH DATA GUARD
YAML
100% REMOTE
EST HOURS
Develop and maintain comprehensive monitoring solutions in Datadog for cloud-based and on-premises services and applications.
Configure monitoring tools and systems to collect relevant metrics, logs, and traces.
Create custom monitoring dashboards and reports using Datadog to provide real-time insights into system performance and health.
Continuously monitor the infrastructure's performance and capacity, anticipate and address potential scalability issues, and create monitors with targeted notifications.
Understand how to install the Agent on Linux and Windows and configure its YAML file to monitor the systems.
Aggregate and visualize data in the Datadog application.
Familiarity with the Datadog API and how to write custom monitors and alerts.
Familiarity with networking, to work with the Network team to set up dashboards and alerts.
Proactively suggest and implement improvements to enhance the system's reliability, resilience, and fault tolerance.
Work on automating tasks to streamline operational processes and reduce manual intervention.
Collaborate with cross-functional teams to investigate and resolve critical incidents, ensuring minimal impact on end-users.
Work with the Problem Management team to complete post-mortem analysis of incidents to identify root causes and implement preventive measures.
Understand the overall architecture of our systems to identify gaps in monitoring and troubleshoot issues.
Configure and maintain custom dashboards and alerts in various monitoring tools.
Create custom reports and deliver report presentations to various stakeholders.
Work with YAML, JSON, Python, and shell scripting.
Develop metrics for both the business and technical teams to determine the health of systems.
Provide on-call support as needed.
Lead and coordinate performance engineering for medium to large initiatives.
Collect and document expected system performance and operational characteristics.
Collect and/or prepare test data for test execution.
Develop and execute performance tests including load, stress, endurance, fail-over and interoperability.
Conduct technical analysis of performance test results and production systems, and provide recommendations on performance tuning, systems, and infrastructure. Identify, report, and review defects in assessing system performance and stability.
Define the strategy for enabling performance diagnostics and monitoring using an Application Performance Management (APM) tool, other monitoring tools, and diagnostic techniques.
Collaborate with developers to promote the concept of performance engineering during all phases of the SDLC to detect and correct performance issues earlier in the lifecycle.
Lead peer reviews to ensure the completeness of all test assets created.
Resolve performance and stability issues in the performance test environment.
Develop performance engineering work plan structure and project schedule.
Review architectural design for performance risks and potential issues.
Prepare capacity analysis when applicable.
Minimum Requirements:
Requires a BA/BS degree in Information Technology, Computer Science, or a related field of study and a minimum of 7 years of performance engineering and performance testing experience, or any combination of education and experience that would provide an equivalent background.
Preferred Skills, Capabilities and Experiences:
Experience managing performance engineering efforts for an application strongly preferred.
Proficiency with the following tools is preferred: Splunk, Datadog, and Dynatrace, among others.
Knowledge of developing scripts for monitoring (PowerShell, Python, and shell scripting).
5 years of Splunk programming proficiency is highly preferred.
5-6 years of experience with .NET and Java applications and application monitoring tools such as AppDynamics or Datadog is highly preferred.
Proficiency in performance tuning is preferred.
Good understanding of the UI, middleware, and backend databases.
Required profile
Experience
Level of experience:
Senior (5-10 years)
Spoken language(s):
English
Check out the description to know which languages are mandatory.
Hard Skills
Python (Programming Language)
Dynatrace
YAML
Shell Script
Performance Testing
Splunk
Datadog
Windows PowerShell
Performance Engineering
AppDynamics