Phase 1
Over the past few years, the characterization of Internet traffic has become one of the major challenges in telecommunications networks. The availability of broadband connections for Internet users has been growing steadily, particularly over cable TV and ADSL technologies, so that new possibilities for resource usage have been emerging for home users and users in small organizations. The increased capacity and availability provided by a broadband connection lead to complex behavior in a typical user, potentially very different from that of a dial-up user. Consequently, the traffic mix generated by such users differs from that of typical dial-up users, as well as from that of commercial users.
There have been some efforts to measure and analyze broadband traffic, and most of them point out that the predominant type of traffic is currently produced by so-called Peer-to-Peer (P2P) file sharing applications, such as KaZaA, eDonkey and BitTorrent. Depending on the location, hour or day, P2P traffic is responsible for 40% to 80% of the total traffic volume. However, those previous investigations suffer from known limitations, such as loss of information due to traffic summarization, failure to correctly identify the application, and limited understanding of user behavior. Furthermore, the current trend of moving phone calls from the PSTN to the Internet via P2P VoIP applications represents a threat to telephony companies, and its effects are not yet completely understood.
This project aims to build a real infrastructure within an ISP for measuring, storing, filtering, correlating and analyzing broadband traffic. One of its main achievements will be the development of a software tool, incorporating statistical techniques, that is able to provide meaningful information to the technical and management staff of an ISP, to software and hardware vendors, and to the Internet scientific community. Another contribution will be the use of this tool to obtain results from the traffic measured in the broadband ISP (an ADSL broadband provider).
The expected project conclusions and outcomes will be compared to those released by similar studies and scientific papers, as well as to current measurement infrastructures deployed worldwide in different ISPs and universities. The measured traffic from broadband users will be analyzed using sound statistical methods, so as to provide high confidence in the conclusions about user behavior.
Phase 2
Over the past few years, the characterization of Internet traffic has become one of the major challenges in telecommunications networks. The availability of broadband connections for Internet users has been growing steadily, particularly over cable TV and ADSL technologies, so that new possibilities for resource usage have been emerging for home users and users in small organizations. The increased capacity and availability provided by a broadband connection lead to complex behavior in a typical user, potentially very different from that of a dial-up user. Consequently, the traffic mix generated by such users differs from that of typical dial-up users, as well as from that of commercial users.
There have been some efforts to measure and analyze broadband traffic, and most of them point out that the predominant type of traffic is currently produced by so-called Peer-to-Peer (P2P) file sharing applications, such as KaZaA, eDonkey and BitTorrent. Depending on the location, hour or day, P2P traffic is responsible for 40% to 80% of the total traffic volume. However, those previous investigations suffer from known limitations, such as loss of information due to traffic summarization, failure to correctly identify the application, and limited understanding of user behavior. Furthermore, the current trend of moving phone calls from the PSTN to the Internet via P2P VoIP applications represents a threat to telephony companies, and its effects are not yet completely understood.
This project aims to build a real infrastructure within an ISP for measuring, storing, filtering, correlating and analyzing broadband traffic. One of its main achievements will be the development of a software tool, incorporating statistical techniques, that is able to provide meaningful information to the technical and management staff of an ISP, to software and hardware vendors, and to the Internet scientific community. Another contribution will be the use of this tool to obtain results from the traffic measured in the broadband ISP (an ADSL broadband provider).
The expected project conclusions and outcomes will be compared to those released by similar studies and scientific papers, as well as to current measurement infrastructures deployed worldwide in different ISPs and universities. The measured traffic from broadband users will be analyzed using sound statistical methods, so as to provide high confidence in the conclusions about user behavior.
Phase 3
Profiling Internet backbone and access network traffic using DPI techniques is a must-have technology for network operators who want to keep their infrastructures in good shape when it comes to network planning and management. It should rely on an in-depth understanding of the composition and dynamics of Internet traffic, which is essential to the management and supervision of the ISP’s network. In general, the characterization of Internet traffic from a huge variety of networked applications plays a key role in capacity planning for the operator’s infrastructure. Meeting users’ quality of experience (QoE) requirements while balancing maximum network utilization against fairness among users is an intimidating challenge. On top of that, several bandwidth-intensive applications have contributed to an unprecedented growth in traffic volume, along with the increase in broadband subscribers. ISPs now understand that rigorous control of scarce bandwidth requires the most advanced network management resources (i.e., software and hardware tools, adequate techniques and intensive supervision). Such resources must be deployed so as to provide the maximum amount of control and flexibility over traffic policies for different classes of users and applications.
The BTMA2 DPI has so far reached 1Gbps packet capture and inspection in user space. Currently, the major challenge is to evolve its architecture to cope with 10Gbps and beyond on a commodity platform. To be effective and fully functional, TAM must provide a comprehensive and cost-effective DPI solution that enables ISPs to monitor network traffic per application, per subscriber and per locality, at full wire speed.
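The per-application, per-subscriber and per-locality view mentioned above can be illustrated with a minimal sketch. It assumes the DPI stage already emits classified flow records; the `FlowRecord` type, its field names and the two helper functions are hypothetical and are not taken from TAM.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowRecord(NamedTuple):
    """Hypothetical classified flow record; field names are illustrative."""
    subscriber: str
    locality: str
    application: str
    byte_count: int


def aggregate_volume(records):
    """Sum bytes per application, per subscriber and per locality."""
    totals = {
        "application": defaultdict(int),
        "subscriber": defaultdict(int),
        "locality": defaultdict(int),
    }
    for rec in records:
        totals["application"][rec.application] += rec.byte_count
        totals["subscriber"][rec.subscriber] += rec.byte_count
        totals["locality"][rec.locality] += rec.byte_count
    # Convert to plain dicts so callers cannot create keys by accident.
    return {dimension: dict(values) for dimension, values in totals.items()}


def volume_share(per_key):
    """Turn absolute byte counts into fractions of the total volume."""
    total = sum(per_key.values())
    if total == 0:
        return {}
    return {key: value / total for key, value in per_key.items()}
```

Calling `volume_share(aggregate_volume(records)["application"])` would give the fraction of total volume attributed to each application, which is the kind of figure quoted when P2P traffic is reported as a percentage of the total.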
As far as precise traffic identification is concerned, some applications encrypt their payload, which makes it difficult to identify them through payload inspection. Therefore, TAM must incorporate different techniques (e.g., behavioural traffic analysis) in order to correctly identify close to 100% of traffic volume and flows. Although the DPI component will be TAM’s core element, it must combine different techniques to properly identify traffic that could pose threats to the network (i.e., abnormal and security-related traffic). As all major protocols and applications (e.g., peer-to-peer (P2P) file sharing and streaming, instant messaging (IM), multimedia streaming, etc.) can be identified by TAM’s DPI component, we must focus on VoIP and on applications that pose security threats to networks and users. Please note that this is additional work on top of the continuous application signature update process.
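As a rough illustration of what behavioural analysis can work with when the payload is encrypted, the sketch below derives a few per-flow statistics from packet timestamps and lengths only. The function name, the input layout and the choice of features are assumptions made for this example; they do not describe TAM's actual heuristics.

```python
from statistics import mean, pstdev


def flow_features(packets):
    """Compute header-only features for one flow.

    `packets` is a list of (timestamp_seconds, length_bytes) tuples,
    sorted by timestamp. No payload bytes are inspected.
    """
    if not packets:
        raise ValueError("flow has no packets")
    times = [timestamp for timestamp, _ in packets]
    sizes = [length for _, length in packets]
    # Inter-arrival gaps between consecutive packets of the flow.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "packet_count": len(sizes),
        "total_bytes": sum(sizes),
        "mean_size": mean(sizes),
        "size_stdev": pstdev(sizes),
        "mean_gap": mean(gaps) if gaps else 0.0,
        "duration": times[-1] - times[0],
    }
```

Feature vectors of this kind could then be fed to a statistical or machine-learning classifier; which features and which classifier work best for a given application mix is exactly the sort of question the proposed study would have to answer.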
In addition to these technical challenges, TAM must provide a Human-Computer Interface (HCI) in order to meet its requirements as a consulting tool to be used within ISP premises. A distributed version of TAM must have the DPI separated from the other components, so that it scales with the number of monitoring points within the network. This can also facilitate the deployment of a management system capable of providing a general profile of the whole network, instead of looking only at specific backbone links.
Considering the challenges and issues above, and given that we built a substantial theoretical and hands-on background during BTMA Phases 1 and 2, we propose advanced research work along five different tracks:
- Remodeling TAM’s architecture to operate at 10Gbps with minimal losses on commodity platforms – Our DPI system must evolve to scale with link speed and must cope with 10Gbps. The current architecture must be redesigned to forward packets from kernel to user space losslessly at 10Gbps. It must then process every packet payload by relying on an efficient Deterministic Finite Automaton (DFA) and/or other advanced pattern-matching techniques. It should also be able to monitor across multiple topologies, locations, and network technologies.
- Behaviour analysis to identify encrypted applications – Recognition and classification of Internet applications has traditionally relied on well-known signatures as its cornerstone. Although this approach is somewhat effective in small and medium networks, in today’s Internet it is easy to use encryption to evade blocking mechanisms. As the volume of encrypted traffic has been increasing in recent years, we propose to study and evaluate several methods for detecting applications hidden within encrypted connections. Some techniques build traffic profiles of network applications using only the packet headers, which remain intact and observable after encryption, thus making it possible to improve the accuracy of TAM.
- Update of the application signature database – This is a continuous process that must run throughout the project lifespan. It is extremely important to have the latest Internet applications in TAM’s signature list.
- Analyzing VoIP application profiles – The DPI engine must combine signature-based analysis techniques with behavioral analysis to track VoIP applications (e.g., Skype, Gizmo, etc.). Understanding the traffic profile of VoIP applications and its implications for traffic management requires advanced studies, since most VoIP applications generate an assorted traffic profile. Because these applications combine chat, voice and video capabilities, it is also important to keep track of the control traffic they exchange (i.e., control status information). An in-depth understanding of the traffic behavior of these increasingly popular applications, from both the connection-level and network-level perspectives, is therefore of prime importance.
- Developing a Human-Computer Interface – As TAM can be used as a consulting tool, a proper graphical user interface (GUI) must be developed in order to facilitate its use by consultants or network managers.
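To make the automaton-based payload matching in the first track more concrete, the following is a minimal sketch of multi-pattern matching with an Aho-Corasick automaton, a classic construction that scans each payload byte once regardless of how many signatures are loaded. It is offered only as an illustration of the general idea: the text above does not say which automaton TAM uses, the function names are hypothetical, signatures are assumed to be non-empty byte strings, and a production engine would be written in a lower-level language.

```python
from collections import deque


def build_automaton(signatures):
    """Build an Aho-Corasick automaton from a {label: bytes_pattern} dict."""
    goto = [{}]        # goto[state][byte] -> next state
    fail = [0]         # failure link for each state
    output = [set()]   # labels recognized on reaching each state

    # Phase 1: insert every pattern into a trie.
    for label, pattern in signatures.items():
        state = 0
        for byte in pattern:
            if byte not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append(set())
                goto[state][byte] = len(goto) - 1
            state = goto[state][byte]
        output[state].add(label)

    # Phase 2: breadth-first pass to compute failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for byte, nxt in goto[state].items():
            queue.append(nxt)
            fallback = fail[state]
            while fallback and byte not in goto[fallback]:
                fallback = fail[fallback]
            fail[nxt] = goto[fallback].get(byte, 0)
            # Inherit matches that end at the failure target.
            output[nxt] |= output[fail[nxt]]
    return goto, fail, output


def match(payload, automaton):
    """Return the labels of all signatures found anywhere in `payload`."""
    goto, fail, output = automaton
    state = 0
    found = set()
    for byte in payload:
        while state and byte not in goto[state]:
            state = fail[state]
        state = goto[state].get(byte, 0)
        found |= output[state]
    return found
```

For example, with the illustrative signature set `{"bittorrent": b"BitTorrent protocol", "http": b"HTTP/1.1"}`, calling `match(payload, build_automaton(signatures))` on a captured payload returns the set of labels whose patterns occur in it. Real signature databases typically also constrain offsets, ports and flow direction, which this sketch leaves out.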
Project Goals and Objectives
The general goal of the project is to contribute techniques, methodologies and metrics for traffic analysis in high-speed networks, e.g., at link speeds of 10-40Gbps. In addition, the project will put some effort into refining the core heuristics of the DPI in order to deal with traffic profiles from encrypted and VoIP applications.
SPECIFIC OBJECTIVES:
- Refining the traffic analysis tool to cope with 10Gbps links is the most relevant activity in this project, since it can enable a real-time, in-depth understanding of broadband enterprise and residential users;
- Obtaining insightful results on the behavior of broadband users (enterprise and residential), particularly related to the use of VoIP applications, such as Skype and Gizmo. In addition, we will give priority to the identification of encrypted traffic;
- The list of applications that TAM recognizes accurately must be updated on a regular basis (e.g., monthly);
- As in BTMA Phases 1 and 2, the operation and management of a measurement and analysis infrastructure will again be an objective. Since the main measurement infrastructure is deployed at the ISP premises, this activity is of fundamental importance for supporting the development of new ideas and concepts. It is also important to analyze data from different operators outside Brazil;
- Developing the skills of the research team members at GPRT/UFPE, Ericsson Research and the ADSL provider. This will be an important contribution to the current knowledge of broadband traffic;
- From the research point of view, an important outcome is the publication of scientific papers. We expect to continue submitting a significant number of papers to important conferences and journals.