Slow performance is when the cluster is actively processing IO requests but appears to be operating below the expected performance level. Generally, slow performance is caused by a component of your Ceph cluster reaching saturation and becoming a bottleneck. This may be due to an increased number of client requests or to a component failure that is causing Ceph to perform recovery.
Although many things can cause Ceph to experience slow performance, the following are some of the most likely causes.
Sometimes slow performance is not due to an underlying fault; the number and type of client requests may simply have exceeded the capability of the hardware. Whether this is due to a number of separate workloads all running at the same time or to a slow, general increase over a period of time, it should be easy to trend if you are capturing the number of client requests across your cluster. If...