CA cells with bad DL/UL cell throughput

Experts, any recommendations for identifying cells with bad DL/UL cell throughput? In the DL, CA complicates things.
An approach that works for non-CA cells is to take, for example, the top 8 hours for all cells and build a table of reference throughput values per EARFCN and CQI. Then compare each cell’s throughput with the reference value and check for large divergence. But this does not work for a network with extensive CA usage.
Maybe some data science approach is best.
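
For reference, the non-CA baseline described above could be sketched roughly like this in Python. Everything specific here is an assumption for illustration: the field names (cell, earfcn, cqi, traffic_mb, thp_mbps), ranking the "top hours" by traffic volume, using the median as the reference value, and the 0.7 ratio as the divergence cut-off.

```python
from collections import defaultdict
from statistics import median

# Hypothetical input: one dict per cell-hour sample, e.g.
# {"cell": "A", "earfcn": 1850, "cqi": 10, "traffic_mb": 120.0, "thp_mbps": 18.5}

def top_hours(samples, n=8):
    """Keep the n busiest hours per cell (ranked by traffic volume, an assumption)."""
    by_cell = defaultdict(list)
    for s in samples:
        by_cell[s["cell"]].append(s)
    kept = []
    for rows in by_cell.values():
        rows.sort(key=lambda r: r["traffic_mb"], reverse=True)
        kept.extend(rows[:n])
    return kept

def baseline_table(samples):
    """Reference throughput per (EARFCN, rounded CQI): median across all samples."""
    groups = defaultdict(list)
    for s in samples:
        groups[(s["earfcn"], round(s["cqi"]))].append(s["thp_mbps"])
    return {key: median(vals) for key, vals in groups.items()}

def divergent_cells(samples, table, min_ratio=0.7):
    """Cells whose average actual/reference throughput ratio is below min_ratio."""
    ratios = defaultdict(list)
    for s in samples:
        ref = table.get((s["earfcn"], round(s["cqi"])))
        if ref:  # skip missing or zero reference values
            ratios[s["cell"]].append(s["thp_mbps"] / ref)
    return sorted(c for c, r in ratios.items() if sum(r) / len(r) < min_ratio)
```

Typical use would be busy = top_hours(samples), then table = baseline_table(busy), then divergent_cells(busy, table). The median is used so that a few very bad cells do not drag the reference value down. As noted, this only makes sense where the cell throughput counter is not mixed with CA traffic.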

Cross throughput with PRB usage and then filter for those with less than 80%.
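
A minimal sketch of that filter, assuming the 80% refers to PRB usage, so that cells with low throughput at high load are treated as congested and left out. The field names (cell, prb_pct, thp_mbps) and the throughput floor are hypothetical and have to come from your own counters and targets.

```python
def low_throughput_not_congested(cells, thp_floor_mbps, prb_limit_pct=80.0):
    """Return cells below the throughput floor while PRB usage is under the limit.

    cells: list of dicts with hypothetical keys "cell", "prb_pct", "thp_mbps".
    thp_floor_mbps: throughput below which a cell counts as bad (an assumption).
    """
    return [
        c["cell"]
        for c in cells
        if c["prb_pct"] < prb_limit_pct and c["thp_mbps"] < thp_floor_mbps
    ]
```

Cells that come out of this filter have poor throughput that load alone does not explain, so they are the ones worth a closer look.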

Already did that.
The problem is that in the DL we have 2CA/3CA, and that muddles things.

Let me check, since I have some curves that indicate the thresholds for those kinds of configurations.
But for sure you have to identify what the minimum throughput is that a cell can have in any configuration, in order to decide whether it is a candidate for optimization or not.