Application-Layer Traffic Optimization (ALTO) Problem Statement

NEC Laboratories Europe, NEC Europe Ltd.
Kurfuersten-Anlage 36
Heidelberg 69115
Germany
+49 (0) 6221 4342 221
jan.seedorf@nw.neclab.eu
http://www.nw.neclab.eu

This Space for Sale
New Hampshire
USA
+1 530 267 7447
eburger@standardstrack.com
http://www.standardstrack.com

Peer-to-peer applications, such as file sharing, real-time
communication, and live media streaming, use a significant amount of
Internet resources. Such applications often transfer large amounts of
data in direct peer-to-peer connections. However, they usually have
little knowledge of the underlying network topology. As a result, they
may choose their peers based on measurements and statistics that, in
many situations, may lead to suboptimal choices. This document describes
problems related to optimizing traffic generated by peer-to-peer
applications and associated issues such optimizations raise in the use
of network-layer information.

Peer-to-peer (P2P) applications, such as file sharing, real-time
communication, and live media streaming, use a significant amount of
Internet resources. Unlike
applications built on the client/server architecture, P2P applications
access resources such as files or media relays distributed across the
Internet and exchange large amounts of data in connections that they
establish directly with nodes sharing such resources.

One advantage of P2P systems results from the fact that the
resources such systems offer are often available through multiple
replicas. However, applications generally do not have reliable
information about the underlying network and thus have to select among
available instances based on information they deduce from empirical
measurements that, in some situations, lead to suboptimal choices. For
example, one popular metric is an estimation of round-trip time. This
choice occurs before actual data transmission begins and thus before
the peer can deduce actual throughput. This is one reason why a peer
selection algorithm that simply uses round-trip time often results in
a sub-optimal choice of peers.

Many of today's P2P systems use an overlay network consisting of
direct peer connections. Such connections often do not account for the
underlying network topology. In addition to having suboptimal
performance, such networks can lead to congestion and cause serious
inefficiencies. Traffic
generated by popular P2P applications often crosses network boundaries
multiple times, overloading links which are frequently subject to
congestion. Moreover, such
transits, besides resulting in a poor experience for the user, can be
quite costly to the network operator.

Recent studies show
a possible solution to this problem. Internet Service Providers (ISPs),
network operators, or third parties can collect reliable network
information, such as topology or instantaneously available bandwidth.
Normally, such information is rather "static", i.e., it can change
over time but on a much longer time scale than information used for
congestion control on the transport layer. By providing this
information to P2P applications, it would be possible to greatly
increase application performance, reduce congestion and optimize the
overall traffic across different networks. Presumably, both the
application and the network operator can benefit from such
information. Thus, network operators have an incentive to provide,
either directly themselves or indirectly through a third party, such
information and applications have an incentive to use such
information. This document gives the problem statement of optimizing
traffic generated by P2P applications using information provided by a
separate party. It introduces the problem, describes some use cases
where both P2P applications and network operators benefit from a
solution to such a problem, and describes the main issues to consider
when designing such a solution. Note that a companion document, the
ALTO Requirements, goes into the details of these issues.

Several of the papers listed in the references are examples of
contemporary solution proposals that address the problem described in
this document. Moreover, these proposals have encouraging simulation
and field test results. These and similar, independent, solutions all
consist of two essential parts:

- a discovery mechanism which a P2P application uses to find a
  reliable information source and

- a protocol P2P applications use to query such sources in order
  to retrieve the information needed to perform better-than-random
  selection of the endpoints providing a desired resource.

It is not easy to foresee how such solutions will perform in the
Internet. A more accurate evaluation requires representative data
collected from real systems by a critical mass of users. However, wide
adoption is unlikely without an agreement on a common solution based
upon an open standard.

The following terms have special meaning in the definition of the
Application-Layer Traffic Optimization (ALTO) problem.

Application: A distributed communication system (e.g.,
file sharing) that uses the ALTO service to improve its performance
or quality of experience while optimizing resource consumption in
the underlying network infrastructure. Applications may use the P2P
model to organize themselves, use the client-server model, or use a
hybrid of both.

Peer: A specific participant in an application.
Colloquially, a peer refers to a participant in a P2P network or
system, and this definition does not violate that assumption. If the
basis of the application is the client-server or hybrid model, then
the usage of the terms "client" and "server" disambiguates the
peer's role.

P2P: Peer-to-Peer.

Resource: Content, such as a file or a chunk of a file,
or a server process, for example to relay a media stream or perform
a computation, which applications can access. In the ALTO context, a
resource is often available in several equivalent replicas. In
addition, different peers share these resources, often
simultaneously.

Resource Identifier: An application layer identifier
used to identify a resource, no matter how many replicas exist.

Resource Provider: For P2P applications, a resource
provider is a specific peer that provides some resources. For
client-server or hybrid applications, a provider is a server that
hosts a resource.

Resource Consumer: For P2P applications, a resource
consumer is a specific peer that needs to access resources. For
client-server or hybrid applications, a consumer is a client that
needs to access resources.

Transport Address: All address information that a
resource consumer needs to access the desired resource at a specific
resource provider. This information usually consists of the resource
provider's IP address and possibly other information, such as a
transport protocol identifier or port numbers.

Overlay Network: A virtual network consisting of
direct connections on top of another network, established by a group
of peers.

Resource Directory: An entity that is logically
separate from the resource consumer and that assists a resource
consumer in identifying a set of resource providers. Some P2P
applications refer to the resource directory as a P2P tracker.

Host Location Attribute: Information about the
location of a host in the network topology. The ALTO service gives
recommendations based on this information. A host location attribute
may consist of, for example, an IP address, an address prefix or
address range that contains the host, an autonomous system (AS)
number, or any other localization attribute. These different options
may provide different levels of detail. Depending on the system
architecture, this may have implications on the quality of the
recommendations ALTO is able to provide, on whether recommendations
can be aggregated, and on how much privacy-sensitive information
about users might be disclosed to additional parties.

ALTO Service: Several resource providers may be able
to provide the same resource. The ALTO service gives guidance to a
resource consumer or resource directory about which resource
provider(s) to select in order to optimize the client's performance
or quality of experience while optimizing resource consumption in
the underlying network infrastructure.

ALTO Server: A logical entity that provides interfaces
to query the ALTO service.

ALTO Client: The logical entity that sends ALTO
queries. Depending on the architecture of the application, one may
embed it in the resource consumer or in the resource directory.

ALTO Query: A message sent from an ALTO client to an
ALTO server, which requests guidance from the ALTO service.

ALTO Response: A message sent from an ALTO server to
an ALTO client, which contains guiding information from the ALTO
service.

ALTO Transaction: An ALTO transaction consists of an
ALTO query and the corresponding ALTO response.

Local Traffic: Traffic that stays within the network
infrastructure of one Internet Service Provider (ISP). This type of
traffic usually results in the least cost for the ISP.

Peering Traffic: Internet traffic exchanged by two
Internet Service Providers whose networks connect directly. Apart
from infrastructure and operational costs, peering traffic is often
free to the ISPs, within the contract of a peering agreement.

Transit Traffic: Internet traffic exchanged on the
basis of economic agreements amongst Internet Service Providers
(ISPs). An ISP generally pays a transit provider for the delivery of
traffic flowing between its network and remote networks to which the
ISP does not have a direct connection.

Application Protocol: A protocol used by the
application for establishing an overlay network between the peers
and exchanging data on it, as well as for data exchange between
peers and resource directories if applicable. These protocols play
an important role in the overall ALTO architecture. However,
defining them is out of the scope of the ALTO WG.

ALTO Client Protocol: The protocol used for sending
ALTO queries and ALTO responses between an ALTO client and an ALTO
server.

Provisioning Protocol: A protocol used for populating
the ALTO server with topology-related information.

Inter-ALTO Server Protocol: The protocol used for
synchronization, query forwarding, or referral between ALTO servers
that have been provisioned with only partial knowledge of the
topology-related information (e.g., on a per-domain basis).

Figure 1 shows the scope of the ALTO client protocol: Peers or
super-peers can use such a protocol to query an ALTO service. The
mapping of topological information onto an ALTO service as well as the
application protocol interaction between peers and super-peers are out
of scope for the ALTO client protocol.

Network engineers have been facing the problem of traffic
optimization for a long time and have designed mechanisms like MPLS and
DiffServ to deal with it. The problem these protocols address consists
of finding (or setting) optimal routes for packets traveling between
specific source and destination addresses, based on requirements such
as low latency, high reliability, and priority. Such solutions are
usually implemented at the link and network layers, and tend to be
almost transparent. At best, applications can only "mark" the traffic
they generate with the corresponding properties.

However, P2P applications that today pose serious challenges to
Internet infrastructures do not benefit much from the above route-based
techniques. Cooperating with external services aware of the network
topology could greatly optimize the traffic the P2P application
generates. In fact, when a P2P application needs to establish a
connection, the logical target is not a host, but rather a resource
(e.g., a file or a media relay) that is often available in multiple
instances on different peers. Selection of the closest one -- or, in
general, the best one in terms of overlay topological proximity -- has
much more impact on the overall traffic than the route followed by its
packets to reach the endpoint.

Optimization of peer selection is particularly important in the
initial phase of the process. Consider a P2P protocol such as
BitTorrent, where a querying peer receives a list of candidate
destinations where a resource resides. From this list, the peer will
derive a smaller set of candidates to connect to and exchange
information with. In another example, a streaming video client may be
provided with a list of destinations from which it can stream content.
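
As a rough illustration of the candidate-narrowing step in these two examples, the following Python sketch picks a subset of candidates either at random or by preferring the lowest-cost ones. The function name `select_peers`, the cost table `EXAMPLE_COSTS`, and the cost values (0 for local, 1 for peering, 2 for transit) are assumptions made for this example only; this document does not define them, and guidance from a real ALTO service could take a different form.

```python
import random

# Hypothetical guidance: one cost per candidate address, as an ALTO-like
# service might report it (assumed scale: 0 = local, 1 = peering, 2 = transit).
# The addresses come from the IP ranges reserved for documentation.
EXAMPLE_COSTS = {
    "192.0.2.10": 0,
    "192.0.2.11": 0,
    "198.51.100.7": 1,
    "203.0.113.5": 2,
    "203.0.113.9": 2,
}


def select_peers(candidates, k, cost_of=None, rng=None):
    """Return up to k candidates.

    Without guidance (cost_of is None) the choice is random.
    With guidance, the lowest-cost candidates are preferred and
    ties are broken randomly.
    """
    rng = rng or random.Random()
    pool = list(candidates)
    rng.shuffle(pool)  # random order; also randomizes ties below
    if cost_of is not None:
        pool.sort(key=cost_of)  # stable sort keeps the shuffled tie order
    return pool[:k]


def example_cost(address):
    """Look up the assumed cost; unknown addresses rank last."""
    return EXAMPLE_COSTS.get(address, float("inf"))


random_choice = select_peers(EXAMPLE_COSTS, 3)
guided_choice = select_peers(EXAMPLE_COSTS, 3, cost_of=example_cost)
```

With the assumed costs, `guided_choice` always contains the two local candidates and the peering candidate, while `random_choice` may include candidates reachable only over transit links.
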
In both cases, the use of topology information in an early stage will
allow applications to improve their performance and will help ISPs make
better use of their network resources. In particular, an economic goal
for ISPs is to reduce the transit traffic on interdomain links.

Addressing the Application-Layer Traffic Optimization (ALTO) problem
means, on the one hand, deploying an ALTO service to provide
applications with information regarding the underlying network and, on
the other hand, enhancing applications in order to use such information
to perform better-than-random selection of the endpoints they establish
connections with.

File sharing applications allow users to search for content shared
by other users and download it. Typically, search results consist of
many instances of the same file (or chunk of a file) available from
multiple sources. The goal of an ALTO solution is to help peers find
the best ones according to the underlying networks.

On the application side, integration of ALTO functionalities may
happen at different levels. For example, in the completely
decentralized Gnutella network, selection of the best sources is
totally up to the user. In systems like BitTorrent and eDonkey,
central elements such as trackers or servers act as mediators.
Therefore, in the former case, optimization would require modification
in the applications, while in the latter it could just be implemented
in some central elements.

Providers of popular content like media and software repositories
usually resort to geographically distributed caches and mirrors for
load balancing. Today, selection of the proper mirror/cache for a given
user is based on inaccurate geolocation data or on proprietary network
location systems, or is often delegated to the user. An ALTO
solution could easily be adopted to automate such a selection.

P2P applications for live streaming allow users to receive
multimedia content produced by one source and targeted to multiple
destinations, in a real-time or near-real-time way. This is
particularly important for users or networks that do not support
multicast. Peers often participate in the distribution of the content,
acting as both receivers and senders. The goal of an ALTO solution is
to help peers to find the best sources and the best destinations for
media flows they receive and relay.

P2P real-time communications allow users to establish direct media
flows for real-time audio, video, and real-time text calls or to have
text chats. In the basic case, media flows directly between the two
endpoints. However, a significant portion of users have
limited access to the Internet due to NATs, firewalls or proxies.
Thus, other elements need to relay the media. Such media relays are
distributed over the Internet with public addresses. An ALTO
solution needs to help peers find the best relays.

Distributed hash tables (DHT) are a class of overlay algorithms
used to implement lookup functionalities in popular P2P systems,
without using centralized elements. In such systems, peers maintain
addresses of other peers participating in the same DHT in a routing
table, sorted according to specific criteria. An ALTO solution will
provide valuable information for DHT algorithms.

This section introduces some aspects of the problem that readers
may not be aware of when they first start studying the problem
space.

The goal of an ALTO service is to provide applications with
information they can use to perform better-than-random peer
selection.

At least three different kinds of entities can provide ALTO
services:

- Network operators. Network operators usually have full
  knowledge of the network they administer and are aware of their
  network topology and transit and peering traffic policies.

- Third parties. Third parties are entities separate from network
  operators that may have either collected network information
  or made arrangements with network operators to learn the network
  information. Examples of such entities are content delivery
  networks like Akamai, which control wide and highly distributed
  infrastructures, or companies providing an ALTO service on behalf
  of ISPs.

- User communities. User communities run distributed algorithms,
  for example for estimating the topology of the Internet.

It is important for the reader to understand there are significant
user communities that expect an ALTO server to be a centralized
service. Likewise, there are other user communities that expect a
service that serves P2P applications to be itself a distributed,
possibly even a P2P, service.

As a result, one can reasonably expect there to be some sort
of service discovery mechanism to go along with the ALTO protocol
definition.

On the one hand, there are data elements an ALTO client could
provide in its query to an ALTO server that could help increase the
level of accuracy in the replies. For example, if the querying client
indicates what kind of application it is using (e.g., real-time
communications or bulk data transfer), the server will be able to
indicate priorities in its replies accommodating the requirements of
the traffic the application will generate. On the other hand,
applications might consider such information private. In addition,
some applications may not know a priori what kind of request they will
be making.

Operators, with their intimate knowledge of their network topology,
can play an important role in addressing the ALTO problem. However,
operators often consider such network information to be
confidential.

Caching is a common approach to optimizing traffic generated by
applications that require large data transfers. In some cases, such
techniques have proven to be extremely effective in both enhancing
user experience and saving network resources.

A cache, either explicitly or transparently, replaces the content
source. Thus, the cache must use the same protocol as the querying
peer. That is, if a cache stores web content, it must present an HTTP
interface to the web client. Any cache solution for a given protocol
needs to present that same protocol to the client. Said differently,
each caching solution for a different protocol needs to implement that
specific protocol. For this reason, one can only reasonably expect
caching solutions for the most popular protocols, such as HTTP and
BitTorrent.

It is extremely important to realize that caching and ALTO are
entirely orthogonal. ALTO, especially if it is aware of caches, can in
fact direct clients to nearby caches where the user could get a much
better quality of experience.

This document is neither a requirements document nor a protocol
specification. However, we believe it is important for the reader to
understand areas of security and privacy that will be important for the
design and implementation of an ALTO solution. Moreover, issues such as
digital rights management are out of scope for ALTO, as they are not
technically enforceable at this level.

The approach proposed in this document asks P2P applications to
delegate a portion of their routing capability to third parties. This
gives the third party a significant role in P2P systems.

In the case where the network operator deploys an ALTO solution,
the P2P community might consider it hostile because the operator could,
for example:

- use ALTO to prevent content distribution and enforce
  copyrights;

- redirect applications to corrupted mediators providing malicious
  content;

- track connections to perform content inspection or logging;
  and

- apply policies based on criteria other than network efficiency.
  For example, the service provider may suggest routes sub-optimal
  from the user's perspective to avoid peering points regulated by
  inconvenient economic agreements.

It is important to note there is no protocol mechanism to require
ALTO for P2P applications. If, for some reason, ALTO fails to improve
the performance of P2P applications, ALTO will not gain popularity and
the P2P community will not use it.

At the time of this writing, the privacy issues described in the
previous section are relevant for an ALTO solution. Users may be
reluctant to disclose sensitive information to an ALTO server.
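
One way a client could limit what it discloses, shown here only as a sketch, is to send a coarser host location attribute, such as an address prefix instead of its full IP address. The function name `coarsen_address` and the default prefix lengths are assumptions chosen for illustration and are not defined by this document. A coarser attribute may also reduce the quality of the guidance the ALTO service is able to give.

```python
import ipaddress


def coarsen_address(address, ipv4_prefix_len=24, ipv6_prefix_len=48):
    """Return the enclosing prefix of an IP address as a string.

    The default prefix lengths are illustrative assumptions: a shorter
    prefix reveals less about the host but is also less precise.
    """
    ip = ipaddress.ip_address(address)
    prefix_len = ipv4_prefix_len if ip.version == 4 else ipv6_prefix_len
    # strict=False lets ip_network() mask out the host bits for us.
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network)


# Addresses from the ranges reserved for documentation.
ipv4_attribute = coarsen_address("192.0.2.77")
ipv6_attribute = coarsen_address("2001:db8:1234:5678::1")
```

A client applying this sketch would place `ipv4_attribute` or `ipv6_attribute` in its query instead of the full address.
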
Operators, on the other hand, may not wish to disclose information that
would expose details of their interior topology. When exploring the
solution space in detail, one needs to consider these issues so that an
ALTO protocol does not presume mandatory information disclosure, by
either clients or servers.

IANA considerations: None.

The basis of this document is draft-marocco-alto-problem-statement,
written by Enrico Marocco and Vijay Gurbani. They continue to provide
significant edits and inputs to the current document editors.

Vinay Aggarwal and the P4P working group conducted the research work
done outside the IETF. Emil Ivov, Rohan Mahy, Anthony Bryan, Stanislav
Shalunov, Laird Popkin, Stefano Previdi, Reinaldo Penno, Dimitri
Papadimitriou, Sebastian Kiesel, Greg DePriest and many others provided
insightful discussions, specific comments and much needed
corrections.

Jan Seedorf and Sebastian Kiesel are partially supported by the
NAPA-WINE project (Network-Aware P2P-TV Application over Wise Networks,
http://www.napa-wine.org), a research project supported by the European
Commission under its 7th Framework Program (contract no. 214412). The
views and conclusions contained herein are those of the authors and
should not be interpreted as necessarily representing the official
policies or endorsements, either expressed or implied, of the NAPA-WINE
project or the European Commission.

Thanks in particular to Richard Yang for several reviews.

References

- Application-Layer Traffic Optimization (ALTO) Requirements
- The impact of DHT routing geometry on resilience and proximity
- Can ISPs and P2P systems co-operate for improved performance?
- Should ISPs fear Peer-Assisted Content Distribution?
- An Empirical Evaluation of Wide-Area Internet Bottlenecks
- Taming the Torrent: A practical approach to reducing cross-ISP
  traffic in P2P systems
- P4P: Explicit Communications for Cooperative Control Between P2P and
  Network Providers
- P2P fuels global bandwidth binge
- Multiprotocol Label Switching Architecture
- New Terminology and Clarifications for Diffserv
- The case for an informed path selection service