Eternal Vigilance
Improved network reliability is like being 10 years old and getting a shirt for Christmas: Nice, but there were other things you wanted.
But that's better than suffering an outage and the embarrassing press coverage that usually follows. In May, an Ericsson switch failed and disrupted service to AT&T Wireless subscribers in the New York City area for about 22 hours. The moral of the story: Failures occur, so prepare for them.
Wireline carriers must notify the FCC of any outage that affects more than 30,000 subscribers. The FCC exempts wireless carriers, but subscribers don't. Iron-clad reliability couldn't come at a more critical point, as wireless positions itself as a replacement for wireline, and wireless data promises to provide anytime, anywhere access to information such as corporate databases.
The beauty of SS7 is that it was designed with failure in mind. Any shortcut that undermines its inherent redundancy can produce a bad design that will come back to haunt the carrier.
"We didn't cut any corners on the basic SS7 design," said Mel Bailey, Triton PCS manager of network-operations support. "We have 3-way diversity where we can. It's a good design, and because of it, we've had absolutely no outages. A lot of people cut corners and put in just single A links and single F links. They don't follow the basic rules of SS7 design, and that gets them into trouble."
Even that's no guarantee, especially when you're leasing lines from other carriers. Periodically auditing those facilities can provide additional safeguards because facility providers sometimes regroom their networks.
"All of a sudden, a link that you ordered as diverse isn't diverse anymore," said Paul Florak, Illuminet director of network services. "So we have a pro-active audit system where we try to cycle through all the links to ensure that the diversity originally ordered is still in place."
Ordering those lines from two different carriers also helps ensure diversity, but it isn't a guarantee that those lines don't merge at some point.
"In order to ensure that physical diversity is maintained through all of the connections along the route, you have to ask the carrier to provide you with that detailed information," said Mark Ripley, GTE Wireless assistant vice president, area networks. "In some cases, you have to lay those two routes on top of each other to ensure that they're not joining back into a common pipe somewhere along the route."
But it's not always possible to track down that information. One alternative is service-level agreements (SLAs), which force transport carriers to look even more closely at their facilities.
"This would penalize the carrier for not meeting certain levels of uptime for the leased line," said Shashi Gowda, Boston Communications Group SS7-network engineer. "Costs for SLAs vary, depending on the level of uptime written into the agreement and the carrier that's providing the line."
For carriers that haven't completely migrated to SS7, protocol conversion is another potential weak link. Granted, they're using conversion strictly to support seamless roaming and not for call setup, but for subscribers, reliability doesn't mean just service in the home market.
"If the link between their switch and our protocol converter isn't redundant like an SS7 A link, they have a potential point of failure there," said Illuminet's Florak. "The protocol-conversion box is not mated and redundant like a signal transfer point (STP), so there's a (potential) point of failure there."
LOOK BEFORE YOU LEAP
With multiple wireless carriers in most markets, time-to-market pressures can curtail the testing necessary to catch any glitches. A protocol emulator helps assess the effect that a change might have before it's deployed across the entire network.
Software glitches can be a double-whammy because they tend to ripple through the network. If a software upgrade takes down a signaling node, other nodes also can suffer as the network struggles to re-route traffic.
Hardware failures also can reveal software flaws -- and at the worst possible moment.
"Software that's in response to a hardware failure is going to be some of the software that's least commonly executed," said David Crowe, Cellular Networking Perspectives editor. "A hardware failure can uncover bugs in the software (if) that software has never been tested because there are so many conditions. That's all it really takes to cause a major problem."
It's impossible to test every condition, so chances are there are always going to be some uncovered bugs. Catching them before they do more damage is easier with a network-surveillance system, which can streamline troubleshooting by pinpointing exactly where the SS7 message failed. Illuminet used its new surveillance system to catch a glitch that was eating up network capacity.
"We had an LNP looping problem," Florak said. "An LNP message was going out, getting routed to one network and coming back and getting re-routed and coming back. (That) caused a looping condition where it started to put some links in jeopardy in terms of congestion."
RUN-IN WITH MURPHY'S LAW
Ensuring reliability goes beyond engineering. It's a no-brainer to keep key equipment off-limits from anyone who walks into the facility, but it's less obvious -- although no less important -- to limit access to those who are qualified to work on those elements. If that's not practical, tagging key circuits with red flags is one alternative.
"You've trained people to know that circuits that are red-tagged have extreme importance and that they shouldn't be working around them or touching them without knowing what they're doing and what those circuits mean," said GTE Wireless' Ripley. "You don't necessarily have to have physical separation, but you have to have logical separation, where people understand the importance of those circuits."
That's good advice because a maintenance window typically is a network's most vulnerable time. One common mistake is entering new parameters incorrectly. Unfortunately, the problem usually isn't apparent until after the upgrade is complete and an outage occurs. That's why it is smart to have a contingency plan mapped out beforehand in case the upgrade goes awry.
"In addition, we have a formal change-management process that occurs prior to any changes being made," said Toby Seay, AT&T Wireless network-operations engineer. "This assures that a method of procedure, including (documenting) backout procedures, is completed before the changes are made."
Outages also can stem from the unlikeliest sources. In July, a small fire in a Bell Canada switch building triggered the sprinklers. The fire department wouldn't allow the generators to be turned on, so the batteries drained, and eventually the switch went down.
"Nobody probably thought, 'Gee, we've got generators, and we've got sprinklers, and those two things can't be used at the same time,'" said Cellular Networking Perspectives' Crowe. "Those are the kinds of things that get you."
So are industry trends that seem to have little to do with network reliability. Case in point: mergers and acquisitions. Suppose that you're leasing fiber, and you're talking to an STP over what you assume are two physically separate links. How can you be absolutely sure that they stay separate all the way to the destination?
"That was an easier task when you had a more monopolistic approach to telecommunications," Ripley said. "But in today's environment, where you have lots of sharing of pipes and high-speed facilities, it's really incumbent upon the carrier to get all of that information to ensure that those pipes are physically separated."
If there's an upside, it's that mergers also can improve reliability by applying the best practices from both companies to the new operation.
"With our purchase of Ameritech properties, we're getting insights on their best practices that we can bring into GTE," Ripley said. "Mergers and acquisitions are exciting times, not only for looking at how to redeploy network assets but also looking for adapting best practices among the companies."
That synergy couldn't come at a better time. Mergers provide economy of scale, but they don't keep competition at bay for long. Grabbing market share and boosting revenues often mean deploying new services quickly.
"That is a pressure on all of us," said Michael Anderson, Ericsson BSS-marketing manager. "Not too many years ago, we thought a 2-year development cycle was very short. Now, in two years, a product can come and go and be completely phased out."
Getting ready to do work that might affect network reliability? The Alliance for Telecommunications Industry Solutions' Network Reliability Steering Committee suggests first asking these 10 questions:
1. Do you know why you are doing this work?
2. Have you identified and notified everybody, including customers and internal groups, who will be directly affected by this work?
3. Can you prevent or control service interruptions?
4. Is this the right time to do this work?
5. Are you trained and qualified to do this work?
6. Have you considered all the maintenance dos and don'ts that apply to this procedure?
7. Are the work orders, methods of procedure and supporting documentation detailed, current, error-free and approved?
8. Do you have everything you need to back out quickly or restore service if something goes wrong?
9. Do you know whom to call if something goes wrong?
10. Have you walked through the procedure and understood it?
Northwood Geoscience's Virtual Frontier, an interactive 3D mapping software package, brings together terrain visualization and thematic mapping in a 3D environment, enabling spatial-information users to explore and present data in new ways. Virtual Frontier creates realistic 3D landscapes by draping imagery and placing 3D objects on scenes rendered from commonly available terrain data. It connects directly with MapInfo Professional to access MapInfo .TAB files.
www.northwoodgeo.com
Tellabs' Verity 3300 DS3 broadband echo-canceller system is a high-density, low-power system that provides Tellabs' next-generation echo-canceller technology and voice-band enhancements at a standard DS3 (45Mb/s) interface. The system increases the efficiencies of cabling, power distribution and setup while putting more channels of echo cancellation in the same amount of space previously needed. The system improves network reliability by offering full redundancy on both DS3 interfaces, system control and echo-canceller modules.
www.tellabs.com
© 2012 Penton Media Inc.