The network monitoring tools we use are extensive. The backbone of the operation is the Kaseya engine. A small local agent is installed on each client machine and reports back to the Apex Central Command (ACC), where the data is processed. It is, by definition, a pure client->server structure.
Communication is internet or VPN based, so each machine needs an external connection for this to work. The great thing about the agent is how few resources it takes on the local system. In our experience, the agent is robust and eats only 2-10 MB of RAM on our client systems.
Monitoring and Alerting
The local agent can monitor any process or threshold that the client deems necessary. We make suggestions, of course: by default we set up basic monitoring of connectivity, disk space, RAM usage and the critical processes that match the machine's responsibilities. Beyond that, it's really about what the client feels is imperative. We write small scripts to add the unique monitoring thresholds and then we set up the alert process. We can also monitor the Event Viewer for certain events and react to those.
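To give a feel for what a small threshold script does, here is a minimal Python sketch. It is not the Kaseya scripting language and not our production code; the names `Threshold`, `disk_used_percent` and `check_thresholds`, along with the sample limits, are assumptions made purely for illustration.

```python
import shutil
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """A hypothetical rule: alert when `metric` rises above `limit`."""
    metric: str
    limit: float


def disk_used_percent(path: str = "/") -> float:
    """Return the percentage of disk space in use on the volume holding `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def check_thresholds(readings: dict[str, float], rules: list[Threshold]) -> list[str]:
    """Return one message per rule whose current reading exceeds its limit."""
    breaches = []
    for rule in rules:
        value = readings.get(rule.metric)
        # A metric with no reading is skipped rather than treated as a breach.
        if value is not None and value > rule.limit:
            breaches.append(f"{rule.metric}={value:.1f} exceeds {rule.limit:.1f}")
    return breaches
```

In this sketch the agent would collect readings such as `{"disk_used_percent": disk_used_percent()}` and hand any breaches to the alert process.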
When the agent sees that a threshold has been hit, it reports back to the ACC. The ACC then follows the client-specific protocol on what to do next. In an emergency, for instance, it starts calling the cell phones of the Apex engineers. If it is a proactive alert, an email is sent to the engineering group, who respond to the problem.
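The routing idea can be sketched in a few lines of Python. This is only an illustration of a per-client protocol lookup: the `Severity` values, the action strings and the fallback to email are all assumptions for the example, not how the ACC is actually configured.

```python
from enum import Enum


class Severity(Enum):
    EMERGENCY = "emergency"
    PROACTIVE = "proactive"


# Hypothetical protocol for one client: which actions to take per severity.
EXAMPLE_PROTOCOL = {
    Severity.EMERGENCY: ["call:engineer-cell-phones"],
    Severity.PROACTIVE: ["email:engineering-group"],
}


def route_alert(severity: Severity, protocol: dict[Severity, list[str]]) -> list[str]:
    """Return the ordered actions for an alert under the client's protocol.

    If the protocol has no entry for the severity, fall back to emailing
    the engineering group (an assumption made for this sketch).
    """
    return list(protocol.get(severity, ["email:engineering-group"]))
```

Keeping the protocol as data rather than code means each client can have its own table without changing the routing logic.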
Need a Han? No, I'll do it Solo
The agent can be programmed to handle some things on its own. This is effective when the fix is simple, and it saves the time it would take an engineer to be alerted and respond. For instance, the agent can be programmed to restart processes and/or machines if a certain situation arises.
The agent can also install patches, perform disk defrags or run other maintenance scripts. It's a robust scheduler with fail-safes built in.
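The "restart it yourself, then call for help" pattern might look something like the Python sketch below. The function name `self_heal`, the retry limit and the return values are assumptions for illustration; the real agent's fail-safes are configured in the product, not written this way.

```python
from typing import Callable


def self_heal(
    is_running: Callable[[], bool],
    restart: Callable[[], None],
    max_attempts: int = 3,
) -> str:
    """Try to restart a stopped process, escalating if it will not come up.

    Returns "healthy" if nothing needed doing, "restarted" if a restart
    worked, or "escalate" once `max_attempts` restarts have failed.
    """
    if is_running():
        return "healthy"
    for _ in range(max_attempts):
        restart()
        if is_running():
            return "restarted"
    # Fail-safe: stop retrying and hand the problem to an engineer.
    return "escalate"
```

The retry cap is the important part: without it, an agent could restart a broken service forever and nobody would ever be told.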
Almost Limitless Reporting
We were and still are amazed at what we can report on using this system. From license and registry keys to mobo numbers and specifications (slots used and available), RAM usage, disk space, installed programs, bandwidth usage, ... it goes on and on.
When we have to get something out of the ordinary, we have to write a script. The script is created on the server, which then instructs the agent to run it and report back the findings. This was very useful in the following scenario:
A client reports that someone may have installed a viral piece of malware. After investigating, we find that there is a common registry key that the malware creates. So, we create a script to scan all the machines for the specified registry key. We schedule the execution of the script and wait for the results. A few minutes later we have a solid idea of how far the virus has spread, if at all. From there, we can use basic removal utilities to get the client back to normal.
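A scan like the one in this scenario could be sketched in Python as follows. This is not the script we ran: the key path `SUSPECT_KEY` is a placeholder, the choice of HKEY_LOCAL_MACHINE is an assumption, and `summarize` simply stands in for the report the server builds from the agents' answers.

```python
try:
    import winreg  # Windows-only standard library module
except ImportError:
    winreg = None

# Placeholder path; the real key would come from the malware investigation.
SUSPECT_KEY = r"SOFTWARE\ExampleMalwareMarker"


def registry_key_exists(subkey: str = SUSPECT_KEY) -> bool:
    """Return True if `subkey` exists under HKEY_LOCAL_MACHINE (Windows only)."""
    if winreg is None:
        raise RuntimeError("registry checks require Windows")
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey):
            return True
    except FileNotFoundError:
        return False


def summarize(results: dict[str, bool]) -> tuple[list[str], float]:
    """Given {machine_name: key_found}, return infected machines and their share."""
    infected = sorted(name for name, found in results.items() if found)
    rate = len(infected) / len(results) if results else 0.0
    return infected, rate
```

Each agent would run something like `registry_key_exists()` locally and report a single yes or no; the summary then shows at a glance which machines need the removal utilities.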
Hardware and software inventory, all gathered by the agent, is also very valuable to the client. We can create scripts to scan for anything out of the ordinary. Mostly, though, our general reports give the client enough information to know exactly what they have in the company.
Anti-Virus, Spam Control and Backup Imaging
"We're small, but we're strong!" The local agents, as mentioned above, are tiny in size and order very little at the Resource Drive-Thru. However, they can perform some monumental tasks that can save a client money and simplify the application infrastructure.
Anti-Virus, Spam Scanning- The agents grab the latest definitions from the ACC and do real-time scanning on the local system. This allows the client to remove any other AV application that may take up more resources and helpdesk time. The agents can be told to ask the ACC for definitions at unique intervals. If the agent is offline (the computer is turned off) at the scheduled check-in, it will check in when the ACC connection has been re-established.
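The catch-up behavior for missed check-ins comes down to one comparison. Here is a minimal Python sketch of that logic; the function name and parameters are assumptions for illustration and say nothing about how the agent implements it internally.

```python
from datetime import datetime, timedelta


def definitions_check_due(last_check: datetime, interval: timedelta,
                          now: datetime, connected: bool) -> bool:
    """Return True when the agent should ask the server for new definitions.

    A check-in missed while offline stays overdue, so it fires as soon as
    the connection is back.
    """
    return connected and now - last_check >= interval
```

Because the test compares elapsed time against the interval instead of matching an exact clock time, a machine that was switched off at the scheduled moment simply checks in the next time it is online.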
Backup Imaging- The agent can back up files or take a snapshot of the machine and send it anywhere it can connect to. Sometimes we upload the data offsite or we can send it to a file server in-house. The restores are bare-metal and if everything goes right we can usually restore a machine or server in less than an hour. The backups can be "point-in-time" snapshots depending upon the critical nature of the machine.
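When several point-in-time snapshots exist, a restore has to pick one. The Python sketch below shows one plausible way to choose: take the most recent snapshot at or before the moment you want to go back to. The function `snapshot_for` and the idea of a sorted list of timestamps are assumptions for the example, not a description of the backup product.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional


def snapshot_for(snapshots: list[datetime], target: datetime) -> Optional[datetime]:
    """Return the latest snapshot taken at or before `target`, if any.

    `snapshots` must be sorted in ascending order.
    """
    index = bisect_right(snapshots, target)
    return snapshots[index - 1] if index else None
```

For a machine hit by malware at a known time, for example, this would select the last snapshot taken before the infection.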