Instead of taking the entire data set, look at a subsample. It's far more valuable to work with a data set small enough that you can iterate quickly and try out different ideas. Only later, once you're confident the approach is right and need to make more fine-grained decisions, should you move up to the full data set.
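A minimal sketch of this workflow, using only the standard library (the function name and the 10% fraction are illustrative choices, not prescriptions): draw a small, reproducible sample for fast experiments, then scale up later.

```python
import random

def subsample(rows, fraction=0.1, seed=42):
    """Draw a reproducible random subsample for fast experimentation.

    A fixed seed keeps experiments comparable from run to run.
    """
    rng = random.Random(seed)
    k = max(1, int(len(rows) * fraction))
    return rng.sample(rows, k)

# Iterate quickly on ~10% of the data; move to the full set only
# once the approach has settled.
full_data = list(range(100_000))
small = subsample(full_data, fraction=0.1)
print(len(small))  # 10000
```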
When it comes to real-world or business problems, you need to know exactly what you want to achieve. What is the metric? What performance level do you actually need? Decide up front what level of performance would be acceptable for the application, so you can tell whether you need more data, whether the current result is already good enough, and how you will measure it. In some cases, the prediction target and the accuracy you are looking for will already be very clear.
Unsupervised learning systems are much harder to evaluate than supervised ones. Often they are judged by their ability to catch known patterns, and labels and evaluation metrics help assess just how good the models are at doing so. It's important to be mindful of this limitation as we proceed in evaluating the results. With time series data, where observations are highly interdependent, you really have to make sure that you're testing on the right thing. Be careful that the performance estimate you get from your analysis is the performance you will actually see in reality.
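One common way to "test on the right thing" with interdependent time series is a walk-forward split: the test fold always comes strictly after the training fold, so the model is never evaluated on data from its own past. A stdlib-only sketch (libraries such as scikit-learn provide an equivalent `TimeSeriesSplit`):

```python
def walk_forward_splits(n_samples, n_splits=3):
    """Yield (train_idx, test_idx) pairs for time-ordered data.

    Each test fold lies strictly after its training fold -- no
    shuffling, so training never sees the future.
    """
    fold = n_samples // (n_splits + 1)
    for i in range(1, n_splits + 1):
        train = list(range(0, fold * i))
        test = list(range(fold * i, fold * (i + 1)))
        yield train, test

# Training windows grow; each test window follows them in time.
for train, test in walk_forward_splits(12, n_splits=3):
    print(len(train), len(test))  # 3 3 / 6 3 / 9 3
```

A randomly shuffled split on the same data would leak information between neighboring, correlated points and overstate performance.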
Eventually you want to bring data analysis to production. Think of it in two halves: the data science part on one side and the engineering part on the other. It's not a one-way road where you start with the data, create your model, and bring it to production; the next round is always "how can we improve the model?" So from the very beginning, think of this whole process as a larger iterative workflow between production and data science.
Anomaly detection methods and models don't have enough context on their own to know whether a system is actually anomalous. It's your task to utilize them for that purpose. On the flip side, you also need to know when not to rely on your anomaly detection framework: when a system or process is highly unstable, it becomes extremely difficult for models to work well. We highly recommend implementing filters to reduce the number of false positives. Some of the filters we've used include:
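As one hedged illustration of such a filter (not necessarily one the authors used), a persistence filter suppresses isolated spikes by keeping an anomaly flag only when it is part of a sufficiently long run of consecutive flagged points:

```python
def persistence_filter(flags, min_run=3):
    """Suppress isolated anomaly flags.

    Keep a flag only if it belongs to a run of at least `min_run`
    consecutive flagged points; lone blips are treated as noise.
    """
    out = [False] * len(flags)
    run_start = None
    # Append a sentinel False so the final run is always closed out.
    for i, flag in enumerate(flags + [False]):
        if flag and run_start is None:
            run_start = i
        elif not flag and run_start is not None:
            if i - run_start >= min_run:
                for j in range(run_start, i):
                    out[j] = True
            run_start = None
    return out

raw = [False, True, False, True, True, True, False]
print(persistence_filter(raw, min_run=3))
# [False, False, False, True, True, True, False]
```

The single spike at index 1 is filtered out; the three-point run survives.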
Although it's easy to get excited about success stories in anomaly detection, most of the time someone else's techniques will not translate directly to your systems and your data. That's why you have to learn for yourself what works, and what's appropriate to use in some situations but not in others.
Our suggestion: store the results, but don't alert on them in most cases. And keep in mind that the map is not the territory: the metric isn't the system, an anomaly isn't a crisis, three sigmas isn't unlikely, and so on.
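The "store, don't alert" advice can be sketched as follows, assuming a plain three-sigma rule for illustration (the data, function name, and thresholds are all hypothetical; on real, non-Gaussian metrics, three-sigma events are often far less rare than the name suggests, which is exactly why the flags go to storage for review rather than to a pager):

```python
import statistics

def three_sigma_flags(values):
    """Flag points more than three standard deviations from the mean.

    Returned flags are meant to be stored and reviewed, not alerted
    on: the metric isn't the system, and an anomaly isn't a crisis.
    """
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [abs(v - mean) > 3 * sd for v in values]

series = [9, 10, 11] * 6 + [10, 60]  # steady signal plus one wild point
flags = three_sigma_flags(series)
# Persist alongside the raw data instead of paging anyone.
stored = list(zip(series, flags))
```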