Improve operational excellence with Zabbix monitoring on AWS

07 Mar 2022

Zabbix is an open-source monitoring software tool for IT components that can help you improve operational excellence on AWS – a pillar of the AWS Well-Architected Framework. Zabbix helps you control your infrastructure by collecting any metric type from any metric source. Zabbix automatically provides its users with flexible, intelligent threshold definition options. With Zabbix, […]

Kubernetes with Prometheus and Grafana on Amazon EKS — better together

25 Feb 2022

In 2012, SoundCloud engineers realized they had a problem. Their existing monitoring solutions were insufficient for their needs. Development teams need real-time, actionable monitoring data to improve Kubernetes deployment uptime, enhance system performance, and boost resource optimization. In other words, they needed a solution to handle the complexity and distributed nature of their cloud applications, […]

How to use code-free Datadog Synthetic Monitoring for simulated API and browser testing

21 Jan 2022

Why container monitoring is critical for modern cloud environments

Modern cloud application environments are complex, running across hundreds or even thousands of compute instances. Because of this complexity, modern applications require container monitoring to continuously collect metrics, track potential failures, and gather granular insights into container behavior. So, it’s not a question of whether or […]

Tutorial: How to automate a runbook to reduce MTTR

05 Oct 2021

In this blog, I’ll provide a step-by-step tutorial on automating a runbook to reduce MTTR by using Amazon EventBridge (EventBridge) and Datadog. Datadog is used as a monitoring tool, and EventBridge is used to remediate issues and automatically resolve any alerts. EventBridge is a serverless event bus. It makes building an event-driven workflow for applications […]

nClouds expands 24/7 support with site reliability engineering services |
AWS Premier Consulting Partner achieves Datadog Gold Tier MSP Partner status

28 Jul 2021

SAN FRANCISCO, July 28, 2021 — nClouds (www.nclouds.com), a provider of Amazon Web Services (AWS) and DevOps consulting and implementation services and an AWS Premier Consulting Partner, announced today the expansion of its 24/7 on-call support services to include site reliability engineering (SRE) services. A top managed service provider (MSP), the company also announced it […]

Accelerate your microservice architecture incident response process using service maps

10 Mar 2021

Recent studies indicate that the cost of IT downtime is between $9,000 and $12,000 per minute, depending on industry vertical, organization size, and business model. That cost includes business disruption, revenue loss, and lost end-user productivity. To protect SLAs and mitigate downtime, the first approach is to accelerate the incident resolution process and find the root […]

Tips to reduce alert fatigue and avoid recurring incidents

19 Oct 2020

At nClouds, many of our 24/7 Support Services customers have some pretty aggressive Service Level Agreement (SLA) deadlines. So, we continuously search for strategies to help them separate the “signal from noise.” In this blog post, I’ll provide tips on the strategies we use to help our customers reduce alert fatigue and avoid recurring incidents. […]

How to aggregate monitoring alerts to reduce alert fatigue

09 Sep 2020

Here at nClouds, we manage the infrastructure needs of many of our customers so that they can focus on building awesome products and delivering value to their customers. Since we are managing the infrastructure of multiple customers, the number of alerts can skyrocket pretty quickly if not managed properly. So we always look for ways to reduce unintended noise to avoid alert fatigue. Alert fatigue […]
