Now hear
In the rush to be first to market with voice-over-IP deployments, service providers must first test for the most fundamental requirement: How does it sound?
A technology isn't worth much if it doesn't deliver. That's a no-brainer for providers, but making sure a new service such as voice over IP meets customer standards can cause some headaches.
Voice over IP is being touted as an acceptable way to carry voice communications because of the added features that it can offer over traditional public networks. However, for voice over IP to be commercially viable, it must at least meet the service-quality standards of the public network.
Regardless of the efficiencies gained in network integration, bandwidth use and network upgrades, voice-over-IP networks must deliver public network quality to those ultimately footing the bill: end users. Voice quality will be the most immediate measure of success of voice-over-IP implementation. But what exactly are the factors that make up "public network voice quality," and how is it measured and tested? New metrics and strategies for testing across data networks exist and can offer rewards to those who use them.
The field of voice-quality testing is developing rapidly. Service providers must pay close attention to these testing strategies because although the public network has defined quality for decades, few if any parallels can be drawn to voice-over-IP implementations. Likewise, network equipment manufacturers are facing increasing pressure from both customers and management to deliver public network voice quality and to deliver it quickly.
What is voice quality?
Today's public network is the ultimate benchmark for voice quality in communications networks. However, the Internet's audio transport capabilities are still far below the benchmark set by public-network standards. Traditional telephony networks are designed to provide optimal service for time-sensitive voice applications requiring low delay and jitter. Telephone networks are good at reliably providing constant but low bandwidth services.
When considering the factors that affect the voice quality of the public network, service providers should remember that like all networks today, new and emerging technologies are affecting the performance of the network in many ways. Traditionally, factors influencing the voice quality of the public network include loudness, delay, echo, clarity, noise, fading and crosstalk. These factors usually can be controlled to provide telephone users with a conversation quality that is consistently high and predictable.
Two major network components that can impair public network voice quality are the public network phone, which can influence clarity, loudness and echo through the quality of the loudspeaker and microphone; and the use of digital voice transmission equipment for greater efficiency in the backbone.
Looking at IP networks and converged telephony services, a completely different set of factors emerges that can affect voice quality. The reason is that IP networks are built to support non-real-time applications characterized by bursty traffic with occasional high-bandwidth demand and longer delays. Therefore, in addition to some of the same factors that affect public network voice quality, such as delay and clarity, IP network performance is subject to problems associated with packet loss, bandwidth availability and compression (Table 1).
IP network equipment systems also influence voice transmission and quality. For instance, although voice-over-IP gateway components such as speech codecs, silence suppression mechanisms and comfort noise generators are designed to increase the performance of voice-over-IP implementations, they can adversely affect clarity if not properly tuned. Also, the IP network - even without active voice components - affects clarity through its tendency to lose packets and add extensive jitter and delay to the signal.
Clearly, the factors affecting voice quality of public network and IP-based networks differ greatly. Converging telephony and IP networks demand that IP-based systems be enhanced with mechanisms that ensure the quality of service (QOS) required to carry voice-over-IP traffic. Testing system performance from end to end helps ensure that new and emerging hybrid networks interface and provide the level of service customers expect from their voice providers (Figure 1).
A new approach
The traditional approach to testing voice quality uses techniques such as comparing waveforms, signal to noise ratio measurements and total harmonic distortion measurements.
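As an illustration of one such traditional waveform measurement, the following is a minimal sketch of a signal-to-noise ratio calculation. It assumes the reference and received signals are already time-aligned and of equal length, and it treats every difference between them as noise. The function name and the test tone are hypothetical, chosen only for this example.

```python
import math

def snr_db(reference, received):
    """Signal-to-noise ratio in dB, treating (received - reference) as noise."""
    if len(reference) != len(received):
        raise ValueError("signals must have the same length")
    signal_power = sum(s * s for s in reference)
    noise_power = sum((r - s) ** 2 for s, r in zip(reference, received))
    if signal_power == 0:
        raise ValueError("reference signal is silent")
    if noise_power == 0:
        return math.inf  # the two waveforms are identical
    return 10.0 * math.log10(signal_power / noise_power)

# Hypothetical test tone: a 1 kHz sine sampled at 8 kHz for 100 ms.
reference = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(800)]
# A received copy attenuated by 10 percent; the error is 10 percent of the signal.
received = [0.9 * s for s in reference]

print(round(snr_db(reference, received), 1))  # about 20 dB
```

A measure like this rewards exact waveform reproduction, which is why it says little about a codec that deliberately changes the waveform while preserving how the speech sounds.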
The predominant approach to define pure public network call quality revolves around the mean opinion score, or MOS. In MOS testing, the tester gathers a preferably large group of people and records their opinion of a call's quality. This method is extremely subjective, and to achieve the most accurate and reliable data, a sample must be analyzed by many people over a period of time.
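The arithmetic behind a mean opinion score is simple, as the sketch below suggests. It assumes the common five-point listening scale (1 = bad, 5 = excellent) and adds a rough 95 percent confidence interval using a normal approximation. The ratings are made-up sample data, not results from any real test.

```python
import math
import statistics

def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    for r in ratings:
        if r not in (1, 2, 3, 4, 5):
            raise ValueError("ratings must be integers from 1 to 5")
    mos = statistics.mean(ratings)
    # Normal approximation; small panels would call for a t-distribution.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width

# Hypothetical ratings from a ten-person listening panel.
panel = [4, 4, 3, 5, 4, 3, 4, 2, 4, 4]
mos, ci = mean_opinion_score(panel)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

The width of the interval shrinks only with the square root of the panel size, which is one reason reliable MOS testing needs many listeners.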
However, these traditional voice quality measurement techniques, with the exception of MOS, do not apply and are not meaningful in voice-over-IP networks, which use low bit-rate codecs to recreate the human perception of the voice signal rather than its waveform. MOS testing, however, while offering the same testing capabilities in public network and voice-over-IP systems, does not offer the expediency required by engineers that implement voice-over-IP networks, given the sheer number of configurable parameters.
Therefore, new techniques to measure voice clarity had to be created.
As technology to deliver packetized voice across IP-based networks has developed, the biggest concern has become the quality of the speech carried over these networks - and how to measure speech quality in a fast yet objective manner to provide a comparison to the voice quality provided over public network systems.
Several objective voice quality measurements have been developed recently, but to fully understand them, it is important to discuss the quantities being measured and the variables present in the environment in which they are observed. IP networks present different challenges to designers because they differ so greatly from conventional voice delivery methods.
The Internet has become a proving ground for integrated service technologies, including voice transmission over a network with no QOS guarantees. Refinements in technology from this arena have helped telephone companies construct IP-based voice networks to deliver toll service. Developments indicate that carriers will provide IP voice service first and integrated, interactive multimedia applications later, as the technology matures and gains wider acceptance.
Because of these developments, customers will expect service providers to deliver reasonable and predictable QOS. Being able to use voice over IP to offer quality and service offerings comparable to those available through the public network will help determine the acceptance and success of voice-over-IP service.
Testing issues
Rather than trying to faithfully reproduce a replica of the original source signal waveform, research on speech coding has focused on compressing voice in a way that does not affect the perceived quality of speech.
This process has led to the use of knowledge about acoustics - or psychoacoustics, the science of human perception of sound and how we extract and retain information from it - concerning the propagation of sound from its source to the brain. This information suggests that an exact waveform reproduction is not necessary to convey all information contained in speech.
Using a model of the human auditory system allows engineers to code speech without coding inaudible sounds and without using the bandwidth it would take to transport them. The coder must identify the parts of speech that the human brain would not perceive even if they were reproduced, so that they can be left out. This requires knowledge of the frequency response of the ear and an accurate representation of the auditory perception function in the coding process. Research continues on the auditory system models used for this function, though researchers already have achieved significant data reduction and high-quality speech encoding.
Only a few universally accepted performance metrics for voice quality on voice and voiceband telephony have been established. Even the most common metrics on application performance do little to predict application behavior when network performance impairments are factored in. Packet voice telephony is affected by various combinations of parameters and network behavior. Critical issues in delivering robust voice service over IP can be summarized as follows:
- Cell or packet loss through the network
- Bit error rate resulting in cell loss
- Traffic management issues in the access or backbone
- User-perceived performance degradation due to coding schemes
- Packet delay due to encoding, packetization and network performance
- Attributes perceived by human senses
- Echo, speech and video clarity, voice delay, background noise
- Jitter, possibly worsened by service integration through the backbone
- Reliable vs. unreliable transport mechanisms to connect endpoints in a call
- Impact of integrating multicast and unicast applications through the same backbone
- Live testing methodologies to quickly distinguish the contributing sources of impairments and bypass or remedy the causes to restore robust service.
Getting specific
ITU-T Recommendation P.861 specifies a model to measure the human perception of the quality of audio signals. While it may seem strange, psychophysical equivalent representations of audio frequency and intensity exist in the brain, which is how we perceive sounds.
The idea behind this approach is to measure the received and potentially impaired signals, perform an objective analysis between the original and the received signal, and to assess the quality of the signal as perceived by a human. The result represents the perceived degradation of the received signal vs. its reference or send signal.
Different factors contribute to the degradation of voice quality, which requires a more detailed analysis; however, perceptual speech quality measurement provides an analysis of the amount of distortion - if any - that is present in the received signal (Figure 2). What it does not provide is the underlying cause for voice-quality degradation. Additional measurements such as echo distortion, voice-activity detection and delay help uncover these effects (Figure 3).
It also is necessary to assess the impact of individual voice processing steps along the path of the call to determine the end-to-end network effects on voice quality. By using distributed test information, technicians can model input and output signals at the endpoints of a voice call to complete this task.
Testing end-to-end voice quality is a particularly important measurement because it provides a way to objectively quantify the user experience. It can be used to compare voice-over-IP networks with traditional voice networks by using the same measures, a test that voice-over-IP networks must pass to gain wide acceptance.
The following measurements are necessary to characterize the quality of the user experience in a network topology:
- Clarity is the fidelity with which a reference signal is reproduced after being played through a voice network. It can be measured using the perceptual speech quality measurement method discussed earlier.
- End-to-end delay is the time it takes for speech to traverse the network. Delay can have an impact on how the tone of conversations is perceived and can lead to constant interruptions if not controlled. Intelligible conversation begins to break down at roughly 250 milliseconds; however, the ITU recommends a maximum of 150 milliseconds one-way delay.
- Voice activity detection is used to save bandwidth by sending packets only when speech is present. The effectiveness of the voice activity detector can be determined by measuring the following factors:
  - Front-end clipping, which is the amount of time it takes a voice-activity detector to detect speech and begin transmitting audio.
  - Holdover time, which is the amount of time needed to determine that speech is no longer present and to stop transmitting background audio.
  - Comfort noise generation, which produces brown noise to give listeners the sensation of background noise so they don't feel the line has gone dead.
- DTMF tone analysis. DTMF tones are not properly reproduced when carried across voice networks using low bit rate voice codecs, so analysis of DTMF tone degradation is required for these networks to ensure the proper functionality of such systems as voice mail and calling cards. The important distortion parameters include amplitude twist and frequency shifts.
- Echo and echo double talk. Echo occurs in public networks; however, with the natural minimum delay present in voice-over-IP networks, its effects are exaggerated. Echo double talk describes the situation where echo cancellers remove valid signals while simultaneously trying to remove actual echo. Echo double talk issues also are exaggerated by the increased workload forced upon echo cancellers in voice-over-IP networks.
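To make the amplitude twist in the DTMF item above concrete, here is a minimal sketch that estimates the level of each of the two tones with the Goertzel algorithm and reports their ratio in decibels. It assumes twist is expressed as the high-group level minus the low-group level; sign conventions vary between test sets. The function names, the 8 kHz sample rate and the synthetic test tone are all assumptions made for the example.

```python
import math

def goertzel_amplitude(samples, freq_hz, sample_rate_hz):
    """Estimate the amplitude of one frequency component (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate_hz)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    power = s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2
    # A sinusoid of amplitude A yields a magnitude of about A * N / 2.
    return 2.0 * math.sqrt(max(power, 0.0)) / len(samples)

def twist_db(samples, low_hz, high_hz, sample_rate_hz=8000):
    """Twist in dB: level of the high-group tone relative to the low-group tone."""
    low = goertzel_amplitude(samples, low_hz, sample_rate_hz)
    high = goertzel_amplitude(samples, high_hz, sample_rate_hz)
    return 20.0 * math.log10(high / low)

# Hypothetical received digit "5" (770 Hz + 1336 Hz), 0.5 s at 8 kHz,
# with the high-group tone arriving at half the amplitude of the low-group tone.
FS = 8000
tone = [1.0 * math.sin(2 * math.pi * 770 * n / FS)
        + 0.5 * math.sin(2 * math.pi * 1336 * n / FS)
        for n in range(4000)]

twist = twist_db(tone, 770, 1336, FS)
print(round(twist, 2))  # roughly -6 dB, since the high tone has half the amplitude
```

Running the same measurement on a tone before and after it crosses the network would show how much twist the codec and gateways introduce.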
Another more recent model for measuring human perception of speech quality is the perceptual analysis measurement system (PAMS), developed by researchers at BT. Like perceptual speech quality measurement, PAMS uses a model based on factors of human perception such as frequency sensitivity. However, PAMS uses different signal processing techniques and produces different types of results.
PAMS compares a voice signal that has been transmitted across a network with the original reference signal. The process defines parameters for any errors as they would be perceived by a human. It first time-aligns, level-aligns and equalizes the two signals to cancel most effects of delay, delay jitter, gain/attenuation and analog line filtering.
PAMS then compares the two signals in the frequency and time domains, and records any perceived errors onto a two-dimensional error surface. It produces two scores: a listening quality score and a listening effort score. These scores are predictions of subjective mean opinion scores.
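The pre-processing stage described above can be pictured with a small sketch. This is not the PAMS algorithm itself, whose perceptual model is not reproduced here; it only illustrates the general idea of time alignment (by searching for the lag with the highest cross-correlation) and level alignment (by matching RMS levels). All names and the test signal are hypothetical.

```python
import math
import random

def estimate_delay(reference, received, max_lag):
    """Return the lag in samples that maximizes the cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        overlap = min(len(reference), len(received) - lag)
        score = sum(reference[n] * received[n + lag] for n in range(overlap))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))

def align(reference, received, max_lag):
    """Time-align and level-align the received signal to the reference."""
    lag = estimate_delay(reference, received, max_lag)
    shifted = received[lag:lag + len(reference)]
    level = rms(shifted)
    if level == 0:
        raise ValueError("received signal is silent after alignment")
    gain = rms(reference[:len(shifted)]) / level
    return lag, [gain * v for v in shifted]

# Hypothetical test: a noise-like reference, delayed 37 samples and attenuated by half.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(400)]
received = [0.0] * 37 + [0.5 * v for v in reference]

lag, aligned = align(reference, received, max_lag=100)
print(lag)  # 37
```

Once the signals are aligned, any remaining difference reflects distortion rather than plain delay or attenuation, which is what a perceptual model then has to judge.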
A little improvement
A successful voice-over-IP implementation is the result of a complex equation involving several modifiable variables. To improve overall voice quality, each parameter must be optimized. Factors such as delay, jitter, echo and packet loss each have individual effects that alone can reduce voice-over-IP call quality to low-level, half-duplex communication.
Delay, for example, must be reduced across the network to its minimum. Voice-over-IP traffic must not only be carried across high-speed links, but it also must be given top priority. Voice-over-IP networks have an immediate handicap because they require relatively lengthy coding, packetization and decoding time that easily chew up half of the appropriated delay time, and that is only the work done at the end points. In normal enterprise networks, cross-country delay easily eats up the remainder of the time available before the user recognizes the reduction in call quality.
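A delay budget makes this arithmetic visible. The sketch below adds up one-way delay contributions and compares the total with the 150-millisecond ITU figure cited earlier in the article. Every component value is a hypothetical placeholder, chosen so that end-point processing takes half the budget as described above; real figures depend on the codec, packet size and network.

```python
ITU_ONE_WAY_LIMIT_MS = 150  # recommended maximum one-way delay cited in the article

def delay_budget(components_ms, limit_ms=ITU_ONE_WAY_LIMIT_MS):
    """Return (total one-way delay, remaining margin) in milliseconds."""
    total = sum(components_ms.values())
    return total, limit_ms - total

# Hypothetical contributions for one direction of a call.
hypothetical_path = {
    "encoding": 30,
    "packetization": 30,
    "decoding": 15,         # the three end-point items total 75 ms, half the budget
    "network transit": 60,
    "jitter buffer": 30,
}

total, margin = delay_budget(hypothetical_path)
for name, ms in hypothetical_path.items():
    print(f"{name:16s} {ms:4d} ms")
print(f"{'total':16s} {total:4d} ms (margin {margin:+d} ms)")
```

In this made-up case the path overshoots the target by 15 ms, so something, most likely the jitter buffer or the transit path, would have to give.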
Engineers must minimize delay and understand that through a multipath, routed network, you are only as fast as your slowest link.
Jitter buffers also must be tuned for best performance because, again, routed networks do not guarantee that all packets will take the same route to their destination. Jitter buffers must dynamically adjust to the changing conditions in the network.
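One way a buffer can track changing conditions is to keep a running jitter estimate. The sketch below uses the smoothed interarrival jitter calculation defined for RTP in RFC 3550, which the article does not mention and is assumed here as a representative method. The rule that turns the estimate into a buffer depth is a hypothetical heuristic, not a standard.

```python
def interarrival_jitter(send_times_ms, recv_times_ms):
    """Smoothed interarrival jitter as defined in RFC 3550 (gain of 1/16)."""
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        # Change in transit time between consecutive packets.
        d = ((recv_times_ms[i] - recv_times_ms[i - 1])
             - (send_times_ms[i] - send_times_ms[i - 1]))
        jitter += (abs(d) - jitter) / 16.0
    return jitter

def suggested_buffer_ms(jitter_ms, multiplier=4.0, floor_ms=20.0):
    """Hypothetical heuristic: a multiple of the jitter, never below one packet time."""
    return max(floor_ms, multiplier * jitter_ms)

# Hypothetical packets sent every 20 ms; the third arrives 16 ms late.
sent = [0, 20, 40, 60]
received = [50, 70, 106, 110]

jitter = interarrival_jitter(sent, received)
print(jitter)                       # 1.9375
print(suggested_buffer_ms(jitter))  # 20.0
```

The 1/16 smoothing keeps a single late packet from swinging the estimate, at the cost of reacting slowly when conditions change abruptly.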
Echo, as another example, depends heavily on delay. Echo becomes imperceptible when it returns so fast that there is no distinguishable difference between it and the reference signal. However, if you add a "normal" amount of delay to the signal, you will effectively stop any form of intelligible communication. Again, echo cancellers must be tuned to the proper amount of delay in the network yet still allow full duplex communication.
Packet loss is a critical issue in voice and voiceband applications. The ear is a filter and transducer, so it is more tolerant to instant quality degradation than, say, a fax machine. A single packet loss during a telephone conversation will have no perceptible effect on voice quality. But a single packet loss during the modem training phase of a voiceband application such as fax, after the data rate and modulation scheme have been selected, could cause the modems to downspeed and then retry, or perhaps worse.
For a voice conversation using a compressed encoding scheme - such as G.729 or G.723.1 - to experience serious measurable quality degradation, the packet loss would need to be at a much higher level than the packet loss that could render the network unfit to carry voiceband applications. It is therefore important to understand the optimization parameters of the network and the types of traffic it will be designed to support to set the operating performance points properly.
The public network has had years to change and tweak these factors for success, while voice-over-IP pioneers are in a land rush to determine the proper formula and thereby help their companies grab the rewards gained by coming to market first. Unfortunately, time is not on their side, and the proper tools are a must to expediently come to the solution.
The technology and principles of voice over IP, when compared with the reliability and relative simplicity of the public network, make the task of implementing voice over IP seem daunting. However, a good understanding of the variables - as well as the ability to test and quantify all the parameters that make voice over IP work - is crucial. The advent of modern testing algorithms not only makes this process possible, but also increases the speed at which these networks move from development to production.
In the end, it all comes down to quality and the ability to overcome the obstacles that do not allow voice over IP to give, at a minimum, equal performance to the public network. Packet-based, convergent networks that carry voice, video and data are here to stay and, through such testing, will eventually flourish.
Several techniques exist or are being developed to improve voice quality - while reducing bandwidth requirements - on data packet transport networks:
- New and improved voice compression techniques that more accurately recreate the perceived sound of speech
- QOS-based routing techniques, which route voice traffic over reserve networks - including the public network - if a monitoring system detects voice quality is substantially degraded
- IP- or ATM-based QOS technologies, which allocate needed bandwidth to voice traffic
- Use of silence suppression via voice activity detection to reduce bandwidth during periods of silence. This allows the use of high-quality G.711 coding while still reducing bandwidth below 64 kb/s
- Packet-error correction mechanisms on voice gateways, which compensate for periods of packet loss
- Jitter buffers, which are commonplace, compensate for variations in delay of packet transport
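The bandwidth effect of silence suppression in the list above can be estimated with simple arithmetic. The sketch below assumes 20-millisecond packets, 40 bytes of IPv4, UDP and RTP headers per packet, and speech present 40 percent of the time; none of these figures comes from the article. It also ignores any comfort-noise packets sent during silence.

```python
def ip_bandwidth_kbps(codec_kbps, packet_ms, activity=1.0, header_bytes=40):
    """Average IP-layer bandwidth of one voice stream in kb/s.

    activity is the fraction of time speech is present (1.0 = no suppression).
    header_bytes assumes IPv4 (20) + UDP (8) + RTP (12).
    """
    payload_bytes = codec_kbps * 1000 / 8 * packet_ms / 1000
    packets_per_second = 1000 / packet_ms
    bits_per_second = (payload_bytes + header_bytes) * 8 * packets_per_second
    return bits_per_second * activity / 1000

# G.711 at 64 kb/s in 20 ms packets, without and with silence suppression.
always_on = ip_bandwidth_kbps(64, 20)
suppressed = ip_bandwidth_kbps(64, 20, activity=0.4)  # 0.4 is an assumed activity factor

print(f"G.711, no suppression:   {always_on:.1f} kb/s")
print(f"G.711, with suppression: {suppressed:.1f} kb/s")
```

Under these assumptions the packet headers push an unsuppressed G.711 stream to 80 kb/s on the wire, while suppression brings the average down to about 32 kb/s.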
© 2012 Penton Media Inc.