Here is a structured response you can provide to the user, focusing on the most “actionable” KPIs that drive actual performance improvements.
Top 4 Actionable UDM/HSS KPIs
1. Registration Success Rate (SR)
This is the “bread and butter” of subscriber management. If this drops, users can’t attach to the network.
* Why it’s actionable: A low SR usually points to a specific cause, such as Diameter/SBI signaling congestion, authentication failures (triplet/quintuplet retrieval issues), or synchronization errors (SQN out of range).
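* Huawei Specifics: Monitor the registration success counter against the registration attempt counter, for example L.HSS.Reg.Succ vs L.HSS.Reg.Att. Exact counter names depend on the product and release, so confirm them in the performance counter reference.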
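As a minimal sketch of how this ratio might be derived, assuming the success and attempt counters are exported as cumulative integers sampled once per collection interval (the function names and sample values are illustrative, not part of any Huawei tooling):

```python
from typing import Optional, Tuple


def success_rate(successes: int, attempts: int) -> Optional[float]:
    """Return the success rate in percent, or None if there were no attempts."""
    if attempts <= 0:
        return None
    return 100.0 * successes / attempts


def interval_rate(prev: Tuple[int, int], curr: Tuple[int, int]) -> Optional[float]:
    """Success rate over one interval from cumulative (successes, attempts) samples."""
    succ_delta = curr[0] - prev[0]
    att_delta = curr[1] - prev[1]
    # A negative delta means a counter was reset; skip that interval.
    if succ_delta < 0 or att_delta < 0:
        return None
    return success_rate(succ_delta, att_delta)


if __name__ == "__main__":
    # Hypothetical samples: (successes, attempts) at two collection times.
    print(interval_rate((1000, 1010), (1990, 2010)))
```

Working on per-interval deltas rather than lifetime totals keeps a short outage from being averaged away by a long history of successful registrations.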
2. Authentication Success Rate
This measures the ability of the UDM/HSS to challenge and verify a subscriber.
* Why it’s actionable: Failures here are often due to HSS/UDM database sync issues or mismatched security algorithms (MILENAGE/TUAK). Frequent “Sync Failure” causes usually mean the sequence number (SQN) held by the HLR/HSS is out of step with the USIM. The resynchronization procedure normally recovers from this, so a persistently high rate means the resynchronization handling itself needs review.
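To see whether “Sync Failure” dominates the authentication failures, the failure causes can be tallied and expressed as shares. The sketch below assumes failure records have already been reduced to one cause label each; the labels are hypothetical placeholders, not actual counter or cause-code names.

```python
from collections import Counter
from typing import Dict, Iterable


def cause_shares(causes: Iterable[str]) -> Dict[str, float]:
    """Map each failure cause to its fraction of all failures (0.0 to 1.0)."""
    counts = Counter(causes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cause: n / total for cause, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical cause labels for one reporting period.
    sample = ["SYNC_FAILURE", "SYNC_FAILURE", "MAC_FAILURE", "SYNC_FAILURE"]
    for cause, share in sorted(cause_shares(sample).items()):
        print(f"{cause}: {share:.0%}")
```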
3. Mean Processing Latency (Request Response Time)
This measures how long the UDM/HSS takes to process a message (e.g., S6a, Sh, or N8/N10 interfaces).
* Why it’s actionable: Rising latency often precedes an overload or “Signaling Storm.” It is usually addressed by load balancing across Front-End (FE) nodes or by checking for CPU bottlenecks in the User Data Repository (UDR/CUDB).
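A mean can hide a slow tail, so it may help to track a high percentile alongside it. The following sketch assumes per-request response times in milliseconds are available; the limits passed to `breaches` are placeholder values for illustration, not recommended thresholds.

```python
import math
from statistics import mean
from typing import Dict, List, Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Mean and 95th-percentile response time for one interval."""
    return {"mean_ms": mean(samples_ms), "p95_ms": percentile(samples_ms, 95)}


def breaches(summary: Dict[str, float], mean_limit_ms: float,
             p95_limit_ms: float) -> List[str]:
    """Return the names of the metrics that exceed their limit."""
    over = []
    if summary["mean_ms"] > mean_limit_ms:
        over.append("mean_ms")
    if summary["p95_ms"] > p95_limit_ms:
        over.append("p95_ms")
    return over


if __name__ == "__main__":
    stats = summarize([10, 12, 11, 13, 250])  # hypothetical samples
    print(stats, breaches(stats, mean_limit_ms=100, p95_limit_ms=100))
```

In the sample data a single slow request leaves the mean under the limit while the 95th percentile exceeds it, which is the kind of early signal a mean-only KPI misses.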
4. Database (UDR/CUDB) Access Success Rate
The HSS/UDM front end is only the “brain”; its “memory” is the UDR.
* Why it’s actionable: If the FE can’t talk to the BE (Backend), the whole node fails. Actionable steps include checking the LDAP/SQL link health and ensuring the provisioning flow (MML/SOAP) isn’t locking the database during peak hours.
Best Practices for Improvement (Huawei Context)
* Audit Signaling Retries: Check if the MME/AMF is retrying too aggressively. Lengthening the retry timers (or lowering the retry count) on the Core nodes can prevent the HSS from being overwhelmed during a network recovery.
* Check “User Not Found” Trends: A high rate of “User Not Found” errors often indicates a provisioning lag or a mismatch between the HLR and HSS data during migration.
* Traffic Shaping: Use Huawei’s Diameter overload (congestion) control or SBI overload control to prioritize existing subscribers over new attachments during high-load scenarios.
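One common way to make retries less aggressive is to space them out with capped exponential backoff plus random jitter, so that clients recovering at the same moment do not retry in lockstep. The sketch below illustrates the general technique only; it is not a description of how any MME/AMF actually schedules retries, and all parameter values are assumptions.

```python
import random
from typing import List


def backoff_delays(base_s: float, factor: float, cap_s: float,
                   attempts: int) -> List[float]:
    """Delay before each retry: base * factor**i, never above cap."""
    return [min(cap_s, base_s * factor ** i) for i in range(attempts)]


def with_jitter(delay_s: float, rng: random.Random) -> float:
    """Full jitter: pick a wait uniformly between 0 and the planned delay."""
    return rng.uniform(0.0, delay_s)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only to make the demo repeatable
    for delay in backoff_delays(base_s=1.0, factor=2.0, cap_s=30.0, attempts=6):
        print(f"planned {delay:5.1f}s, jittered {with_jitter(delay, rng):5.2f}s")
```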
Note: Always differentiate between Functional Failures (Network issues) and User Failures (Wrong SIM, expired subscription) to ensure you aren’t chasing “ghost” technical issues that are actually commercial ones.
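A simple way to apply this distinction is to report a second success rate that leaves user-caused failures out of the denominator. The sketch below assumes failures are already counted per cause; the cause labels and the set of user-side causes are hypothetical and would need to be mapped to the real result codes in use.

```python
from typing import Dict, Optional

# Hypothetical labels for failures caused by the subscriber, not the network.
USER_CAUSES = frozenset({"USER_UNKNOWN", "SUBSCRIPTION_EXPIRED"})


def raw_rate(successes: int, failures: Dict[str, int]) -> Optional[float]:
    """Success rate in percent over all attempts."""
    attempts = successes + sum(failures.values())
    return 100.0 * successes / attempts if attempts else None


def network_rate(successes: int, failures: Dict[str, int]) -> Optional[float]:
    """Success rate in percent, ignoring failures attributed to the user."""
    functional = sum(n for cause, n in failures.items() if cause not in USER_CAUSES)
    attempts = successes + functional
    return 100.0 * successes / attempts if attempts else None


if __name__ == "__main__":
    failures = {"USER_UNKNOWN": 30, "TIMEOUT": 20}  # hypothetical counts
    print(raw_rate(950, failures), network_rate(950, failures))
```

A large gap between the two figures suggests the KPI drop is driven by subscriber-side causes rather than by the network.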