Expert Interview Series: Steve Francis of LogicMonitor on the Value of Data Collection and Monitoring
Steve Francis founded LogicMonitor, the leading provider of SaaS-based performance monitoring for modern IT infrastructure.
We recently asked Steve for his insight on the value of data collection and monitoring. Here’s what he shared:
Can you tell us about the mission behind LogicMonitor? What sets you apart from your competitors?
The mission of LogicMonitor is to simplify IT for humankind. That’s a lofty goal, especially seeing as we focus only on datacenter infrastructure – not consumer or even desktop IT. But by helping companies get visibility into the performance, capacity and availability of their datacenters, we allow them to focus on their business – so they can deliver better experiences to their customers and employees.
What separates LogicMonitor from its competitors is that we are the only enterprise monitoring system that is provided as Software-as-a-Service. There are many enterprise monitoring systems out there. There are quite a few SaaS monitoring systems out there. But if you want a platform that can monitor all the infrastructure in your datacenter that is running your business-critical applications, and you want to monitor the new technologies you are deploying in the cloud – and you want to see it all in one platform, with automation and the scaling and no-overhead advantages of SaaS – LogicMonitor is unique.
How has managing IT infrastructure evolved since LogicMonitor was founded? From where you sit, what has made the biggest impact on the way companies are managing IT today?
In the last 10 years, the evolution of IT infrastructure has been a story of accelerating change. Virtualization increased the ability of enterprises to spin up servers rapidly, without having to wait for hardware to be provisioned. The public cloud, containerization, orchestration tools, service-oriented architecture, software-defined everything – these are all accelerating the rate of change in IT infrastructure. Consequently, IT monitoring tools that were not designed with velocity and flexibility in mind are becoming increasingly ineffective.
What are the most common challenges or pain points facing businesses today in monitoring their IT infrastructure? How do you help them address these challenges?
Most businesses today have a few challenges in common. It’s difficult to find enough IT or development talent. IT is being looked at as a strategic differentiator and a source of innovation for the business, which is great, but it means that scarce people and resources need to be used strategically, to aid in digital transformation, not to manually manage older systems. And the pace of technological change is such that it’s hard to keep up.
LogicMonitor helps by delivering visibility into their entire set of IT applications and infrastructure. This means as companies embark on digital transformation, they can use LogicMonitor to consolidate a variety of monitoring systems. LogicMonitor has embedded knowledge and automation, so it frees up skilled staff. And as a business moves applications to modern technology, LogicMonitor presents a unified view of the performance before, during and after the migration, in a single pane of glass.
Why is having a good picture of IT assets so critical to business today?
IT is now mission critical to every business. It used to be phone systems that were the key to office productivity – now it’s email, chat and business systems. Most people now don’t care if phones are down. Companies are delivering their services in the way that their customers want to consume them – in many cases, this means online. Which means that if you don’t have good visibility into the performance, availability and capacity of your IT infrastructure, so you can get ahead of issues and proactively prevent slowdowns or outages, you are damaging the very brand of your company.
What should businesses be monitoring in their IT infrastructure? What best practices would you recommend?
The short answer is businesses should be monitoring everything. I mean this in the sense that every piece of equipment, every application, every cloud service, should be monitored. Everything running in the business is adding value to someone. (If that’s not the case, it should be shut down – there is a cost to running unneeded infrastructure.) If it’s adding value, it should be monitored. This doesn’t mean it needs to be alerted on – no one needs to be woken up at 2 a.m. for a development system being offline. But if the developers depend on that system, they need visibility into its performance and availability. Otherwise it will consume needless time when there is an issue, tracking down why it’s not reachable – is it the network? The developer’s local workstation? Disk space exhausted? Or is the system itself down? These are things that can be seen at a glance from a dashboard.
I also mean that when monitoring devices and applications, you should monitor them in great depth. Just tracking CPU, memory and network usage is not enough. When there is an issue, having more data will help you get to the root cause much faster. LogicMonitor typically monitors hundreds of different datapoints on a device. Having this data presented visually allows even brand-new problems to be resolved quickly.
What are the most common mistakes you see organizations making with regard to monitoring IT? What assets do you find are too often overlooked?
The most common mistake is simply not monitoring enough. Partly this results from people’s experience with more limited monitoring systems. They may monitor the virtual machines and the hypervisor – but not the storage array that provides the storage for the hypervisor, and not the application, such as Kafka, that the virtual machine provides – simply because they are not used to having a single system that can collect data from such a disparate array of systems.
What advice do you find yourself repeating to clients over and over?
Monitor everything. Make every alert meaningful. If an alert is triggered, it should be a real alert and escalated to the right team, via the right method (which may be, for some alerts, no escalation at all, and for some, may entail text-to-voice calls or SMS), at the right time. For every alert, either fix the underlying issues; schedule suppression of the alert until you can fix it; or tune the thresholds or disable the alert so it doesn’t trigger. LogicMonitor has tools to help with this process.
The one other piece of advice I often give is to never regard any service incident as resolved if the monitoring didn’t give timely and accurate indication of the issue. Even if you’ve fixed the issue in this case, if it could happen again without monitoring alerting you – don’t close the incident until you’ve tuned your monitoring.
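As a rough illustration of the "right method, at the right time" idea in the answer above, here is a minimal Python sketch of alert routing with scheduled suppression. Everything in it is an assumption made for illustration: the `Alert` class, the `ROUTES` table, the severity names and the `route_alert` function are hypothetical and do not represent LogicMonitor's product or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# All names here (Alert, ROUTES, route_alert) are hypothetical illustrations,
# not part of any LogicMonitor API.


@dataclass
class Alert:
    resource: str
    environment: str   # e.g. "production" or "development"
    severity: str      # e.g. "warning", "error" or "critical"
    raised_at: datetime
    suppressed_until: Optional[datetime] = None  # scheduled suppression window


# Escalation method per (environment, severity). Any pair not listed gets no
# escalation: the data stays visible on a dashboard, but nobody is paged.
ROUTES = {
    ("production", "critical"): "voice_call",
    ("production", "error"): "sms",
}


def route_alert(alert: Alert) -> str:
    """Return the escalation method for an alert, or "none"."""
    # An alert raised inside its scheduled suppression window notifies nobody.
    if alert.suppressed_until is not None and alert.raised_at < alert.suppressed_until:
        return "none"
    return ROUTES.get((alert.environment, alert.severity), "none")


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 2, 0)  # arbitrary example time: 2 a.m.
    dev_down = Alert("build-server", "development", "error", now)
    db_down = Alert("orders-db", "production", "critical", now)
    known_issue = Alert("orders-db", "production", "critical", now,
                        suppressed_until=now + timedelta(hours=4))
    for alert in (dev_down, db_down, known_issue):
        print(alert.resource, alert.environment, "->", route_alert(alert))
```

In this sketch the development system going offline at 2 a.m. wakes nobody, the production database outage triggers a voice call, and the same outage stays quiet while a scheduled suppression window is active.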
What trends or innovations are you following in the world of data monitoring and IT infrastructure? Why do they interest you?
There’s lots going on right now. The monitoring, management and orchestration of containers is changing the way software gets deployed. The rise of serverless computing is an interesting paradigm. All the trends are really moving towards the application or service as the item that is significant from the IT management point of view. Of course, to maintain a service, you need visibility into both the service and the underlying components of the service, even as they are dynamically changing, and then to quickly identify the root cause of any issue. This is the problem LogicMonitor is solving right now.