It's not enough to just deploy a properly configured Linux server. Monitoring your servers is also crucial for maintaining them effectively in the long run.
If you know what's going on with your servers, you can avoid potentially catastrophic situations. Take something as trivial as disk space. If your server runs out of disk space, the running services will be affected.
This is why it is essential to install dedicated DevOps monitoring tools to ensure efficient maintenance and monitoring.
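To make the disk space example concrete, here is a minimal sketch of the kind of check a monitoring tool performs for you. It uses only Python's standard library; the 10% threshold and the function names `is_disk_low` and `check_path` are assumptions chosen for illustration, not part of any tool in this list.

```python
import shutil


def is_disk_low(total_bytes: int, free_bytes: int, min_free_ratio: float = 0.10) -> bool:
    """Return True when the free share of the disk is below min_free_ratio."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes < min_free_ratio


def check_path(path: str = "/", min_free_ratio: float = 0.10) -> bool:
    """Check the filesystem that holds `path` (threshold is an assumed default)."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free in bytes
    return is_disk_low(usage.total, usage.free, min_free_ratio)


if __name__ == "__main__":
    if check_path("/"):
        print("WARNING: less than 10% of disk space is free on /")
    else:
        print("OK: free disk space on / is above the threshold")
```

A script like this only tells you something when you run it. The tools below add the scheduling, history, dashboards and notifications around such checks.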
I'm going to include a bunch of tools and services that you can use to monitor your servers.
- Some of them allow you to set up alerts.
- Some show the stats in a nice dashboard style.
- Some show graphs and let you manage the servers graphically.
You can go through the list and decide which tools and services suit your needs.
I have included both open source server monitoring software and web-based paid services. If you can self-host, you get free server monitoring software. If money is not a concern and you want to save time and effort, go for the paid ones.
Better Uptime is a notifier whose primary job is to monitor your servers continuously and alert you whenever one goes down.
But it is more than just that. You can set alerts for when RAM, disk or CPU usage reaches a certain level, a cron job fails, a database backup fails and more.
It's a complete infrastructure monitoring service. You can even add team members and set up an on-call roster. The on-call person can be notified via email, SMS or call.
You can start using Better Uptime for free, but some features require signing up for a Pro account.
We use it for monitoring our servers, and it has proven really helpful.
Webmin is an open-source, web-based control panel for system administration, primarily for Unix-like systems. With it, you can easily manage your system graphically and even remotely.
You can read about how to install and configure it in our past coverage.
Grafana allows you to query, visualize and alert on metrics and logs no matter where they are stored. In this pairing, it serves as the web frontend, with Prometheus running as the backend.
Prometheus is a widely used open-source systems monitoring and alerting toolkit.
You can significantly improve on the duo with Dockprom and InfluxDB.
Also reviewed earlier on Linux Handbook, Cockpit is a browser-based graphical administration tool for your Linux servers. With Cockpit installed on your server, you can access the server from a browser and perform all day-to-day regular administrative tasks.
It's a versatile open source server monitoring tool.
Monit is a small open-source utility for managing and monitoring Unix systems. It has all the features needed for system monitoring and error recovery and works as a watchdog with a toolbox on your server.
As per their official documentation, M/Monit builds on Monit's capabilities and provides monitoring and management of all your Monit-enabled hosts via a modern, clean and well-designed user interface which also works on mobile devices.
The Netdata Agent is 100% open source and powered by more than 300 contributors. With Netdata, you can troubleshoot slowdowns and anomalies in your infrastructure with thousands of per-second metrics, meaningful visualizations, and insightful health alarms with zero configuration.
Linux Dash is a simple and beautiful open source server monitoring web dashboard that includes all the generic server metrics. Apart from providing system status, it also provides system-specific basic information, network details, user accounts and details of existing applications.
The Raw Edition is Free and Open Source while their Enterprise Edition is available as a 30-day trial.
Nagios offers an open source industry standard in IT infrastructure monitoring and alerting. Nagios Core is available free of cost.
Nagios' paid tools are also offered via free trials.
| Nagios Open Source | Nagios Paid Tools |
| --- | --- |
| Nagios Core | Nagios XI |
| Nagios Plugins | Nagios Log Server |
| Nagios Frontends | Nagios Fusion |
| Nagios Addons | Nagios Network Analyzer |
Icinga is an open-source computer system and network monitoring application originally created as a fork of the Nagios system monitoring application in 2009. The best way for you to begin here is the Icinga get started page.
Sensu is based on a pipeline model to fill gaps in observability between metrics, logging and tracing. Sensu Go's features are pretty impressive. It is open source and allows up to a hundred nodes under its free plan.
Their documentation includes a step-by-step guide to deploying Sensu in production to get you started.
LibreNMS is a fully featured, open-source network monitoring system which includes support for a wide range of network hardware and operating systems including Cisco, Linux, FreeBSD, Juniper, Brocade, Foundry, HP and many more.
NodeQuery provides insights into your servers' health, availability and performance. The open source NodeQuery agent collects selected Linux server data, which is sent to their monitoring system for further processing.
Munin is a monitoring tool, accessible through a web interface. It surveys all your servers and remembers what it saw. It presents all the information in graphs. Munin is Open Source.
Uptime Robot works as another notifier that continuously monitors your website, similar to Better Uptime.
Uptime.com also alerts you about website downtime by SMS, phone call or email. It checks your website availability at one-minute intervals from 30 different locations across 6 continents. Uptime.com's pricing is split into basic, superior, business and enterprise plans.
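At their core, uptime services like these repeat one simple probe: request the page and see whether a healthy response comes back. Here is a minimal sketch of a single probe using Python's standard library. The function names and the rule that any 2xx or 3xx status counts as "up" are assumptions for illustration, not how any of these services is documented to work.

```python
import urllib.error
import urllib.request


def is_up(status_code: int) -> bool:
    """Treat any 2xx or 3xx HTTP status as 'up' (an assumed rule)."""
    return 200 <= status_code < 400


def check_url(url: str, timeout: float = 10.0) -> bool:
    """Fetch `url` once and report whether the site looks reachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_up(response.status)
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return is_up(err.code)
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar problems.
        return False


if __name__ == "__main__":
    # example.com is a placeholder; replace it with your own site.
    print("up" if check_url("https://example.com/") else "down")
```

Running this from a single machine every minute is not equivalent to a hosted service: a probe on the same network as your server cannot tell you whether the site is reachable from elsewhere, which is why the hosted services check from many locations.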
Supervisord is a client/server process control system that allows its users to control a number of processes on UNIX-like operating systems. Its design is motivated by convenience, accuracy, delegation and process groups, and it is written in Python. A Go version of Supervisord is also available on GitHub.
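To give a feel for what process control means here, the sketch below restarts a command whenever it exits with an error, up to a fixed number of restarts. This is only an illustration of the general idea and not how Supervisord itself is implemented; the `supervise` function and its `max_restarts` parameter are hypothetical.

```python
import subprocess
import sys


def supervise(command: list[str], max_restarts: int = 3) -> int:
    """Run `command`, restarting it when it exits with a non-zero code.

    Returns how many times the command was started in total.
    """
    starts = 0
    while True:
        starts += 1
        result = subprocess.run(command)
        if result.returncode == 0:
            break  # clean exit: nothing to restart
        if starts > max_restarts:
            break  # give up after the allowed number of restarts
    return starts


if __name__ == "__main__":
    # A stand-in "service" that always fails, to show the restart limit.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print("started", supervise(failing, max_restarts=2), "times")
```

A real process manager adds much more on top of this loop, such as logging, start-up ordering, process groups and a client for controlling the managed processes.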
Graphite is an open-source, enterprise-ready monitoring tool that runs equally well on cheap hardware or cloud infrastructure. It is used to track the performance of websites, applications, business services and networked servers. It revolutionized server monitoring by making it easier than ever to store, retrieve, share and visualize time-series data.
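Feeding data into Graphite is simple because its plaintext protocol is one line per data point: a metric path, a value and a Unix timestamp. The sketch below formats and sends such a line. It assumes a Carbon plaintext listener on `localhost` at port 2003, which is the usual default, and the metric path `servers.web01.load` is a made-up example.

```python
import socket
import time


def format_metric(path: str, value: float, timestamp: int) -> str:
    """Build one plaintext line: metric path, value and Unix timestamp."""
    return f"{path} {value} {timestamp}\n"


def send_metric(path, value, host="localhost", port=2003, timestamp=None):
    """Send one data point to a Carbon plaintext listener (assumed host and port)."""
    if timestamp is None:
        timestamp = int(time.time())
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    # Only the formatting is shown here; send_metric needs a running listener.
    print(format_metric("servers.web01.load", 0.42, int(time.time())), end="")
```

Calling `send_metric("servers.web01.load", 0.42)` against a running Graphite instance should make the series available for graphing under that path.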
As the developer describes it on GitHub, Cabot is a free, open-source, self-hosted infrastructure monitoring platform that provides some of the best features of PagerDuty, Server Density, Pingdom and Nagios without their cost and complexity. It is Docker-ready and takes 5 minutes to deploy.
Glances resembles the top command, but it can also be run via a web interface. It is a cross-platform system monitoring tool written in Python. It can also work in client/server mode with remote monitoring via terminal, web interface or API. Stats can also be exported to files or external time/value databases.
Pydash is a small web-based monitoring dashboard for Linux servers, developed in Python with Django and Chart.js. It uses the Python libraries available in the main Python distribution, so it has a small list of dependencies and does not require installing many packages or libraries.
Monitorix was originally designed for monitoring Red Hat, Fedora and CentOS Linux systems, but today it runs on different GNU/Linux distributions and even on other UNIX systems like FreeBSD, OpenBSD and NetBSD.
It's free, open source and lightweight, capable of monitoring as many services and system resources as possible. It is of course suitable for production servers, and its simplicity and small size allow deployment on embedded devices as well.
Here are all the specific stats that it can report graphically:
- System load average and usage
- Global kernel usage
- Kernel usage per processor
- Filesystem usage and I/O activity
- Network traffic and usage
- Netstat statistics
- Processes statistics
- System services demand
- Mail statistics
- Network port traffic
- Users using the system
- FTP statistics
- Apache statistics
- MySQL statistics
- BIND statistics
- Chrony statistics
- Fail2ban statistics
- Redis statistics
- PHP-FPM statistics
- Devices interrupt activity
Nixstats is a powerful and easy-to-use monitoring platform to keep track of server performance and website uptime. It doesn't require complicated setups, and you can get started within minutes with a one-line command to install the monitoring agent on all your servers.
Disney uses Nixstats for server monitoring and is one of its top customers.
Cacti is an open-source, robust and extensible operational monitoring and fault management framework with a complete network graphing solution designed to harness the power of RRDTool's data storage and graphing functionality over time-series data.
Here are some of its main features:
- Remote and local data collectors
- Device discovery
- Automation of device and graph creation
- Graph and device templating
- Custom data collection methods
- User, group and domain access controls
Zenoss Server Monitoring goes beyond the traditional approach of separately monitoring silos of device types, like servers. It enables monitoring all servers as one part of a complete IT stack of cloud and on-premises infrastructure to ensure optimal application performance.
Zenoss offers customizable plug-ins called ZenPacks that extend the Zenoss platform. It's a flexible and highly extensible model that allows the platform to extend discovery, performance and availability monitoring capabilities to new technologies quickly.
ZenPacks use standard APIs and protocols, including SNMP, WMI, SSH and many more, to collect real-time health and performance data from any type of system or application. There are currently more than 400 ZenPacks covering physical systems, containers, cloud deployments and applications.
You can read more in the Zenoss Server Monitoring Datasheet.
ntopng allows high-speed web-based traffic analysis and flow collection. It is a portable, next-generation version of ntop, a network traffic probe based on libpcap/PF_RING that monitors network usage.
ntopng is available in four versions:
- Community (Open Source)
- Professional
- Enterprise M
- Enterprise L
The Community version is free to use and open-source. The Professional and Enterprise versions offer extra features particularly useful for SMEs or larger organizations.
Shinken offers an open source monitoring framework (previously a solution) inspired by the "keep it simple" Linux principle. It has a self-sufficient web UI, which includes its own web server (independent of Apache). The Shinken WebUI starts at the same time as the Shinken framework and is configured by setting a few basic parameters in the main Shinken configuration file.
According to the official documentation, Observium is an auto-discovering network monitoring platform supporting a wide range of hardware platforms and operating systems including Cisco, Windows, Linux, HP, Juniper, Dell, FreeBSD, F5, Brocade, Citrix Netscaler, NetApp and many more. Observium seeks to provide a powerful yet simple and intuitive interface to the health and status of your network.
Observium is available in two editions:
- Open Source Community Edition: The community edition is released on a biannual cycle
- Subscription Edition: The subscription edition includes additional features, rapid bug fixes and feature improvements on a daily basis and an easy to use SVN-based update mechanism.
Puppet is a tool that uses a DevOps approach to help you manage and automate the configuration of servers. Puppet is available as:
The Puppet Server is a required application that runs on the Java Virtual Machine (JVM) and controls the configuration information for one or more managed Puppet agent nodes.
Which one do you use?
Compiling this list and navigating through these interesting features took me quite a while. But it was definitely worth building this exhaustive list of diverse and useful server monitoring tools to explore.
If you have any more tools to share, or any suggestions, feedback or comments, please do not hesitate to use the comment section below.