SLAs: The "satisfaction guaranteed" warranty
Service level agreements (SLAs) are the telecom industry's "satisfaction guaranteed" warranty. In a marketplace teeming with competitors, SLAs are a competitive differentiator, an operator's way of saying, "Pick me because I offer SLAs."
In
the past few years, SLAs have grown increasingly prevalent among service
providers for a number of reasons. First, service providers are using SLAs to
prove to their customers that they can deliver advanced, value-added services.
By building customers' confidence, service providers have an easier job of moving
these customers to higher-priced services in the future.
Second,
deregulation forces and the accelerated competition in the communications arena
have empowered communications customers, who now have a better understanding of
what they can demand from service providers. As customers become more
technologically savvy, they are calling for higher service levels and want to
see quality of service (QoS) guarantees in writing.
Third,
as customers grow ever more dependent on networked services, their willingness
to tolerate service downtime considerably diminishes. By offering SLAs, service
providers can set realistic goals and viable objectives from the get-go, thus
bridging the gap between customer expectations and operators' capabilities.
Helping customers understand that 100% service availability is unfeasible
increases the chances that customers will be more tolerant of service
disruptions.
Lastly,
not all customers are created equal. Customers who want optimal bandwidth will
be charged at a higher rate than customers who are satisfied with slower dial-up
speeds.
To
that end, when a network fault occurs and affects a service, an operator needs
to prioritize repair activities according to parameters such as customer class
or type of service that was affected.
Facets of SLA management
To
give SLAs teeth, an operator needs to monitor services end to end from the
customer's viewpoint. To that end, an operator needs an operations support
system (OSS) that can define, store and correlate data on customers, services
and network elements. The chosen OSS should be able to collect and process
service level indicators and compare them to pre-defined SLAs to help operators
ascertain that guaranteed service levels are being met.
The
process whereby service providers define, monitor and report on SLAs is an
ongoing one, encompassing a number of phases:
SLA definition
In
this initial phase, service providers need to populate the OSS's configuration
database with information on all service-related entities such as network
elements, services, customers and SLAs. By integrating and correlating network
data with customer and service data, an operator can create customer-aware
network information on services. This helps the operator pinpoint services and
customers that are affected by failures and identify subscribers who
consistently hit bandwidth limits and are therefore candidates for service upgrades.
During
this phase, operators also need to define service level indicators and the
methods that will be used to measure them. For instance, service performance can
be measured according to indicators such as availability, latency and
throughput; service provisioning can be measured as the time that it takes to
provision a new service; and customer care responsiveness can be measured as the
average time that a customer waits in the call center.
Finally,
operators need to define the SLA contract between them and their customers based
on a set of rules--for example, average time-to-repair = 2 hours; maximum
time-to-provision = 2 days; minimum availability = 98%. Other than specifying
the availability and performance of networked services, the SLA also sets
penalties in case of SLA violations.
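The example rules above can be captured as data and checked mechanically. The Python sketch below is illustrative only: the `SlaContract` class, its field names and the `violations` function are hypothetical, and it assumes that the quoted time-to-repair figure is an upper limit on the monthly average.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaContract:
    """Limits taken from the example rules in the text."""
    max_avg_repair_hours: float = 2.0   # average time-to-repair = 2 hours
    max_provision_days: float = 2.0     # maximum time-to-provision = 2 days
    min_availability_pct: float = 98.0  # minimum availability = 98%


def violations(contract, avg_repair_hours, worst_provision_days, availability_pct):
    """Return the names of the rules that the measured values break."""
    broken = []
    if avg_repair_hours > contract.max_avg_repair_hours:
        broken.append("time-to-repair")
    if worst_provision_days > contract.max_provision_days:
        broken.append("time-to-provision")
    if availability_pct < contract.min_availability_pct:
        broken.append("availability")
    return broken


contract = SlaContract()
# Hypothetical monthly measurements for one customer.
print(violations(contract, avg_repair_hours=1.5,
                 worst_provision_days=3.0, availability_pct=98.4))
# ['time-to-provision']
```

A non-empty result is the point at which the penalty clauses of the contract would come into play.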
SLA monitoring
Today,
customers want service levels to be understood from their perspective, not from
server statistics. They want services to be measured as a whole, not in parts.
From the end user's perspective, the availability of individual network
components along the service path does not necessarily reflect the QoS being
delivered. For instance, while a 99% network availability guarantee might apply
to the overall average availability of the network to all subscribers, it might
not apply to a particular user. For a customer who cannot download data or
access crucial information using a laptop, 99% availability means little if
he/she is suffering from poor performance or delayed service.
Using
an OSS that leverages multiple data sources such as performance measurements,
fault metrics, provisioning indicators and call detail records, an operator can
monitor service levels as the end user experiences them.
Performance
measurements include errored seconds, severely errored seconds, unavailable seconds
and bit error ratio. To extract meaningful performance indicators from these
measurements, the OSS must process the collected data. In cases where a service
is supported by just one network element, the data is collected and processed
from that one element (for example, in order to measure web availability, the
OSS needs to collect and process measurements from the web server only). In
cases where a service rides on a number of elements, the measurements need to be
collected and processed from the entire service path.
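As a rough illustration of processing measurements along a service path, the Python sketch below turns unavailable seconds into an availability figure per element and then combines the elements. The combination rule is an assumption that does not come from the text: the elements are treated as being in series with independent outages, so the path availability is the product of the element availabilities. The element names and counts are invented.

```python
import math


def availability(unavailable_seconds, window_seconds):
    """Fraction of the measurement window in which the element was available."""
    return 1.0 - unavailable_seconds / window_seconds


def path_availability(unavailable_by_element, window_seconds):
    """Series path with independent outages (assumption): multiply the parts."""
    return math.prod(availability(seconds, window_seconds)
                     for seconds in unavailable_by_element.values())


WINDOW = 30 * 24 * 3600  # a 30-day month, in seconds

# Hypothetical unavailable-seconds counts per element on the service path.
unavailable = {"access-router": 2592, "core-switch": 0, "web-server": 25920}

print(round(path_availability(unavailable, WINDOW), 5))
```

For a service supported by a single element, such as the web server example, the dictionary simply holds one entry and the result is that element's own availability.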
When
it comes to fault metrics, indicators such as time down and time back to service
can be derived from alarms or trouble ticketing applications. For example, from
the time stamps of service faults, the repair time can be processed and at the
end of each month the mean time to repair and the mean time between faults can
be calculated.
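The month-end calculation can be sketched in a few lines of Python. The time stamps below are invented, and the function names are hypothetical. Definitions of mean time between faults vary; this sketch assumes it means the average in-service gap between one repair and the start of the next fault, which requires at least two faults in the period.

```python
from datetime import datetime, timedelta


def mean_time_to_repair(faults):
    """Average of (time back to service - time down) over all faults."""
    total = sum((up - down for down, up in faults), timedelta())
    return total / len(faults)


def mean_time_between_faults(faults):
    """Average in-service gap between a repair and the next fault."""
    ordered = sorted(faults)
    gaps = [next_down - prev_up
            for (_, prev_up), (next_down, _) in zip(ordered, ordered[1:])]
    return sum(gaps, timedelta()) / len(gaps)


# Hypothetical (time down, time back to service) pairs for one month.
faults = [
    (datetime(2024, 3, 2, 8, 0), datetime(2024, 3, 2, 9, 0)),
    (datetime(2024, 3, 12, 9, 0), datetime(2024, 3, 12, 12, 0)),
    (datetime(2024, 3, 22, 12, 0), datetime(2024, 3, 22, 14, 0)),
]

print(mean_time_to_repair(faults))       # 2:00:00
print(mean_time_between_faults(faults))  # 10 days, 0:00:00
```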
Call
detail records are another data source that provides invaluable SLA information
on a per-user basis and which can be used to assess service health. Call detail
records are collected from network elements such as switches, web servers and
gatekeepers. Using call detail records, service providers can assess call
completion rates in specific areas, perform call failure analysis to ascertain
which subscribers in which zones are affected by network failures and measure
traffic volumes.
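A per-zone call completion rate, for example, can be derived by grouping records. The Python sketch below assumes a simplified, hypothetical record layout with only a `zone` and a `completed` flag; real call detail records carry many more fields and differ by vendor.

```python
from collections import defaultdict


def completion_rate_by_zone(records):
    """Completed calls divided by call attempts, for each zone."""
    attempts = defaultdict(int)
    completed = defaultdict(int)
    for record in records:
        attempts[record["zone"]] += 1
        if record["completed"]:
            completed[record["zone"]] += 1
    return {zone: completed[zone] / attempts[zone] for zone in attempts}


# Hypothetical call detail records.
cdrs = [
    {"zone": "north", "completed": True},
    {"zone": "north", "completed": True},
    {"zone": "north", "completed": False},
    {"zone": "north", "completed": True},
    {"zone": "south", "completed": False},
    {"zone": "south", "completed": True},
]

print(completion_rate_by_zone(cdrs))  # {'north': 0.75, 'south': 0.5}
```

A zone whose rate drops well below the others is a natural starting point for call failure analysis.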
In
addition to monitoring services end to end using various service indicators,
operators can also set thresholds based on service indicators. Thresholds can be
defined for types of services, such as "Gold" services, as well as for
raw indicators such as errored seconds and calculated indicators such as service
availability. For example, an operator can set a threshold whereby the
availability of a given Web hosting service must not fall below 90%
during scheduled business hours. If the pre-defined threshold is about to be
crossed, the OSS notifies the operator of the impending problem. This enables
the operator to take action before the SLA is breached and the customer
experience is adversely affected.
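One simple way to notice that a threshold is about to be crossed is to place a warning band above the committed level. The Python sketch below uses the 90% figure from the example; the 2-point warning margin and the function name are assumptions chosen for illustration.

```python
def check_availability(measured_pct, committed_pct=90.0, warning_margin_pct=2.0):
    """Classify a measured availability against the committed level."""
    if measured_pct < committed_pct:
        return "breach"    # the SLA threshold has been crossed
    if measured_pct < committed_pct + warning_margin_pct:
        return "warning"   # still compliant, but close to the threshold
    return "ok"


# Hypothetical availability readings during business hours.
for value in (97.5, 91.2, 89.0):
    print(value, check_availability(value))
# 97.5 ok
# 91.2 warning
# 89.0 breach
```

The "warning" state is the one that gives the operator time to act before the customer is affected.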
By
proactively monitoring services, not only can operators treat problems before
they escalate into crises, but they can also take preemptive measures to respond
to changing network conditions and maintain high-quality service levels. This
includes prioritizing some traffic over other traffic (real-time
transactions over low-priority tasks, for example), allocating more
bandwidth during peak hours to prevent overbooking and otherwise fine-tuning
network elements.
Report generation
An
integral part of SLA management is the ability to report on service quality. SLA
reports are important in that they enable operators to quickly grasp the status
of networked services by presenting information in an easy-to-understand,
graphical format. SLA reports also give customers up-to-date
information on the QoS delivered against their SLAs, so that they can assess
whether or not the service provider is delivering agreed-upon service levels.
An
OSS should produce both predefined reports and customized reports tailored to
the unique needs of a service provider. In addition, an ideal OSS should enable
customers to access reports via the web at their own convenience.
Real-time
reports enable operators to track the status of SLAs, graphically see when a
service has degraded, and pinpoint potential failures or poorly performing
network elements before an actual breakdown occurs.
Historical
reports enable operators to identify long-term trends and investigate recurrent
problem areas in the network. Historical reports are also useful for capacity
planning purposes, enabling service providers to plan for future expansions and
network growth more effectively.
SLA modifications
Service
providers and their customers should meet at regular intervals to assess whether
any changes have been made in the communications market or in the customer's
organization that would necessitate modifying the SLA. For example, if the
customer's organization has hired a significant number of new employees, it is
reasonable to assume that the traffic on the network will greatly increase,
which might lead to slower response times or increased downtime. That is why it
is important that the two sides periodically convene to discuss recent changes
and how they will affect the SLA. By having an OSS in place, operators can
modify the SLA accordingly.
SLA management in practice--VoIP
VoIP
services are conveyed across the IP domain, which spans IP routers and switches,
and the voice domain, which encompasses the traditional public network, media
gateways, signaling gateways and gatekeepers.
Service
indicators can be defined for each domain based on various sources of
information, including:
- Alarms from various network equipment--which can be used as the basis for calculating the availability of both the IP and voice domains
- The MIB of the routers--which is the main source of data on the performance of the IP domain. For example, data link utilization can be calculated based on the bandwidth of the link and on the number of packets in the ingress and egress of the link. Link utilization can serve as a service indicator to gauge network performance, where high utilization spells poor performance and low utilization indicates high performance
- Call detail records generated by gatekeepers and media gateways--which can be used to define the Answer Seizure Ratio of the link between hubs
- In the case of pre-paid VoIP services, the authentication, authorization, accounting (AAA) server and the interactive voice response can both be used as sources for service indicators. For example, the authentication success rate can be calculated based on the logs that the AAA server stores for every call attempt that was made.
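Two of these indicators reduce to simple ratios, sketched below in Python. The sketch makes assumptions beyond the text: it works from octet (byte) counters sampled at the start and end of an interval rather than from packet counts, it ignores counter wrap-around, it handles one direction of the link at a time, and it takes the answer seizure ratio to be answered calls as a percentage of call attempts. Function names and figures are invented.

```python
def link_utilization_pct(octets_start, octets_end, interval_seconds, link_bps):
    """Percentage of link capacity used in one direction over the interval."""
    bits_sent = (octets_end - octets_start) * 8
    return 100.0 * bits_sent / (interval_seconds * link_bps)


def answer_seizure_ratio_pct(answered_calls, seizures):
    """Answered calls as a percentage of call attempts (seizures)."""
    return 100.0 * answered_calls / seizures


# Hypothetical counter samples taken 300 s apart on a 2 Mbit/s link.
print(link_utilization_pct(1_000_000, 46_000_000, 300, 2_000_000))  # 60.0

# Hypothetical totals derived from call detail records.
print(answer_seizure_ratio_pct(answered_calls=450, seizures=900))   # 50.0
```

The authentication success rate can be computed the same way, from successful and total attempts in the AAA server logs.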
For
instance, a VoIP operator can sign an SLA with a customer and define service
level thresholds and QoS as follows:
- Committed availability of the whole network will be greater than threshold1
- Average roundtrip latency in the IP domain will be less than threshold2
- Number of high utilization events in a month will be less than threshold3 (an event can be classified as a high utilization event when the link utilization crosses the 80% mark for more than 10 minutes)
- The answer seizure ratio between two destinations will be greater than the committed rate
- The authentication success rate will be higher than threshold4.
The
above agreed-upon SLA provides a complete and detailed picture of the QoS being
delivered to the customer. This helps customers understand what it means to have
high quality service and assists them in deciding how much they are willing to
pay for it.
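The high utilization event defined in the list above can be counted from periodic utilization samples. The Python sketch below assumes samples taken at a fixed interval (5 minutes here) and approximates the length of a run of consecutive samples above 80% as the number of samples multiplied by the interval. The function name and the sample values are invented.

```python
def count_high_utilization_events(samples_pct, sample_minutes=5,
                                  threshold_pct=80.0, min_minutes=10):
    """Count runs above the threshold that last longer than min_minutes."""
    events = 0
    run = 0  # consecutive samples above the threshold so far
    # The trailing sentinel closes a run that is still open at the end.
    for value in list(samples_pct) + [float("-inf")]:
        if value > threshold_pct:
            run += 1
        else:
            if run * sample_minutes > min_minutes:
                events += 1
            run = 0
    return events


# Hypothetical link utilization samples, one every 5 minutes.
samples = [50, 85, 90, 70, 82, 88, 91, 60, 95]
print(count_high_utilization_events(samples))  # 1
```

Only the three-sample run (about 15 minutes) counts as an event; the two-sample run reaches 10 minutes but does not exceed it. The monthly total would then be compared with threshold3.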
SLAs--A competitive advantage
In
today's crowded market space, where customers can switch to a rival organization
with one call or the click of a mouse, an operator can set itself apart from the
pack by offering SLAs and demonstrating that it can deliver agreed-upon QoS
levels. Using an intelligent OSS, service providers can:
- Minimize revenue loss by monitoring services in real time and taking corrective measures before SLA violations occur
- Increase profits by offering verifiable SLAs that foster trust and customer satisfaction and help make premium-priced services easier to sell
- Attract and retain customers by ensuring the performance and availability of business-critical services that they rely on
- Meet and manage user expectations of services by agreeing on service parameters that are both measurable and meaningful to the customer.
Avichai
Levy is Vice President of Marketing for TTI Telecom.
Visit
TTI Telecom online.