The LTE signaling challenge
As 4G networks scale, carriers are finding one of the biggest challenges to keeping networks up and running is scaling the signaling and control plane. Overloads, as some operators have already found, can take a 4G network down in a snap – and it may get worse before it gets better.
While mobile operators race to add radio access network and backhaul capacity to handle the sheer weight of smartphone and tablet traffic, they’ll do well to address another point of possible overload – the signaling and control portion of the network.
It’s a lesson that some operators – including Verizon Wireless, Telenor and TeliaSonera – have learned first-hand as glitches in their IP signaling infrastructure have caused outages ranging from small to outright embarrassing (even if the causes are often not understood at first).
At issue is that not only do mobile devices generate much greater demand for network bandwidth, they also cast off IP session and especially Diameter signaling requests like a bored fisherman. According to one signaling expert, launching the iPhone’s browser, for example, instantly sets off about fifteen individual network signaling requests. Beyond that, 4G network software elements supporting increasingly sophisticated mobile service scenarios “talk” to each other at rates that traditional TDM/SS7-based networks never had to deal with.
All of that signaling can shut down a network.
Earlier this year, Verizon Wireless suffered an early outage on its high-profile LTE network that it eventually traced back to a signaling problem (CP: IMS software bug caused Verizon LTE outage). A network element in Verizon’s IMS core experienced a software problem, which rapidly escalated, affecting the core’s back-up systems and eventually shutting off all access to both its 3G and 4G mobile data networks, Verizon Communications chief technology officer Tony Melone explained.
Norway’s Telenor, meanwhile, reported to regulators after its own 18-hour outage earlier this year that it was caused by a ‘signal storm’. According to one report, “Telenor's surveillance systems showed that the traffic between servers in the mobile network increased far beyond normal levels…the increased signalling traffic continued to increase to the extent that the servers no longer managed to connect calls and SMSes to recipients.”
The biggest problem, says Tekelec’s Emery, is that vendors and carriers too often deal with signaling on a network element-by-element basis. That breeds inefficiency and far too many point-to-point connections. As the signaling load grows, the resulting “n-squared” increase in signaling traffic can quickly overwhelm servers on the network, Emery says.
“It’s good that Diameter has been adopted for all these uses, but there hasn’t been a focus on the network-level robustness of protocol, things like congestion management, peer congestion control, redirecting on failure. That wasn’t considered part of the protocol nor was it designed into each network element” that implemented Diameter, Emery noted.
To meet such challenges, vendors like Tekelec, 4G signaling specialist Traffix and eventually, if not yet today, large IMS/RAN players like Huawei and Ericsson, are delivering so-called Diameter routing solutions. Such “routers” sit at the center of the LTE network’s software elements, where they intelligently distribute and scale signaling traffic.
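To make the “n-squared” point concrete, here is a rough back-of-the-envelope sketch, not a model of any vendor's product. It assumes every network element peers directly with every other in the point-to-point case, and peers only with a single central router in the routed case. The function names are illustrative, and a real deployment would likely use redundant routers and only a partial mesh.

```python
def mesh_links(n: int) -> int:
    """Connections needed if each of n elements peers directly with every other."""
    return n * (n - 1) // 2


def hub_links(n: int) -> int:
    """Connections needed if each of n elements peers only with one central router."""
    return n


if __name__ == "__main__":
    # Compare how the two topologies grow as elements are added.
    for n in (5, 10, 50, 100):
        print(f"{n:>3} elements: point-to-point={mesh_links(n):>5}  routed={hub_links(n):>4}")
```

Under these assumptions, 100 elements need 4,950 point-to-point connections but only 100 connections to a central router. That gap is the gist of the argument for centralizing Diameter routing.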
Tekelec debuted the second release of its Diameter router this summer and is targeting a third release by the end of the year. Traffix will update its Diameter Edge Agent at next week’s 4G World with tighter security and new capabilities to support roaming, billing and third-party content scenarios.
“LTE is, door-to-door, creating new signaling requirements for Diameter,” says Traffix CEO and co-founder Ben Volkow, noting that the main driver is “the fragmentation of the network. There are just more and more boxes and more and more complexity and signaling.” Even seemingly neutral developments, like the use of TCP as a bearer for Diameter traffic, are increasing the load. Because TCP requires an acknowledgement for every message sent, its use essentially doubles signaling traffic, adding to the strain, Volkow said. Overall, 4G networks generate about 30 times more signaling than traditional carrier voice networks, he said.
“There are so many boxes, so many data centers, so many complex use cases,” Volkow said. “With voice over LTE and machine-to-machine coming too, the amount of signaling will only grow. It’s going to be a bombardment. You can’t manage that type of network point-to-point.”