The Internet void is an interesting place. In normal conditions, the Internet void is empty and we should not see anything. But if you take the time to look deeply into a "black-hole" monitoring dataset, you might find surprising results, from badly configured systems to the effects of unknown attacks, along with various unexplained events. This talk will take you on a journey into the noise of Internet network monitoring.

Black-hole monitoring definition

In 2005, Bailey, Michael, et al. [IMS] defined black-hole monitoring as the monitoring of unused or dark address space. Nowadays (in 2013), IPv4 address space[1] is becoming a scarce resource, so the definition of black-hole monitoring can be extended to the monitoring of temporarily unused address space. The address space (even if unused) must be announced globally on the Internet and be visible to the majority of Autonomous Systems (AS), just like any used address space.

In our experiments, the address allocation given for monitoring influences the data collected, not only because of residual network activities (from previous usage of the address space) but also because of the numerical value of the addresses themselves. Spelling errors also play a significant role in the data collected from dark address space, as discovered in one of our experiments [AnotherPerspective].

Challenges

Black-hole monitoring has to be seen as a sampled measurement of the Internet noise: the information collected represents a subset of the Internet noise traffic. The behavior of such traffic is erratic, as it depends not only on the size of the monitored black-hole subnet but also on the events occurring on the Internet as a whole, and on the selected subnet itself, which could be more prone to spelling errors.

While designing adequate and long-term collection mechanisms for black-hole monitoring, this erratic behavior has to be taken into consideration to cope with both low and high bandwidth usage (e.g. on one of our sensors, it is not unusual to see bandwidth usage vary by a factor of ten).
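To give an idea of how this variation can be quantified, here is a minimal sketch, assuming Python with scapy and a placeholder capture file name; it sums the captured bytes per five-minute window and reports the ratio between the quietest and the busiest window.

    from collections import Counter

    from scapy.all import PcapReader


    def bandwidth_profile(pcap_path, window=300):
        """Sum the captured bytes per five-minute window to expose bandwidth variation."""
        buckets = Counter()
        for pkt in PcapReader(pcap_path):
            buckets[int(pkt.time) // window] += len(pkt)
        rates = [total / window for total in buckets.values()]  # bytes per second
        return min(rates), max(rates)


    # "capture.pcap" is a placeholder for one of the rotated capture files.
    low, high = bandwidth_profile("capture.pcap")
    print(f"bandwidth varies by a factor of {high / max(low, 1):.1f}")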

As the traffic behavior tends to deviate at regular intervals, the analysis becomes more complicated: you may question the proper operation of the network collection mechanisms (e.g. packet loss, errors). Netbeacon [Netbeacon] was developed in the context of such verification.

Once the packets are properly collected, the analysis of the collected data is another challenging element. Sampling a subset of the traffic helps to discover tendencies of misconfiguration. As the data is collected over a long period of time (more than one year), the evolution of these misconfigurations can be drawn.

Technical implementation overview

The black-hole data is collected by sensors installed in a dedicated network to which the traffic is routed. The sensors capture Ethernet frames and add a timestamp. The packets are then relayed via an encrypted tunnel to a collector, where they are written into files rotated every five minutes. The resulting files are organized by day for processing. When a file is ready, a queue is updated, and this queue is polled by the processing programs.
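As an illustration of the collector side only, here is a minimal sketch assuming tcpdump is available and using a hypothetical file-based queue; it rotates the capture every five minutes, compresses the finished file and appends its name to the queue polled by the processing programs (the encrypted tunnel and the day-level organization are left out).

    import os
    import subprocess
    import time

    CAPTURE_DIR = "/data/blackhole/pcap"      # hypothetical paths
    QUEUE_FILE = "/data/blackhole/queue.txt"


    def start_capture(interface="eth0"):
        """Capture frames on the monitored interface, rotating the file every 300 seconds."""
        os.makedirs(CAPTURE_DIR, exist_ok=True)
        pattern = os.path.join(CAPTURE_DIR, "capture-%Y%m%d-%H%M%S.pcap")
        # -G 300 rotates the dump file every five minutes, -z gzip compresses it afterwards.
        return subprocess.Popen(
            ["tcpdump", "-i", interface, "-w", pattern, "-G", "300", "-z", "gzip"])


    def enqueue_finished_files():
        """Poll the capture directory and append completed (compressed) files to the queue."""
        seen = set()
        while True:
            for name in sorted(os.listdir(CAPTURE_DIR)):
                path = os.path.join(CAPTURE_DIR, name)
                if name.endswith(".pcap.gz") and path not in seen:
                    seen.add(path)
                    with open(QUEUE_FILE, "a") as queue:
                        queue.write(path + "\n")
            time.sleep(30)


    if __name__ == "__main__":
        start_capture()          # requires the privileges needed to capture traffic
        enqueue_finished_files()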

Once a compressed file arrives, it is processed with a chain of tools in order to distil information. The protocol dissection is done with various tools, and in addition to the protocol dissection, the data is enriched with other systems. The dissected and enriched packets are aggregated into documents which are indexed for further queries. Hence, it is possible to rank countries and organizations that own ASNs according to their attacks and misconfigured systems.
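The tool chain itself is not named above, so the following sketch is only an illustration, assuming Python with scapy for the dissection and the geoip2 library with GeoLite2 databases for the enrichment; it aggregates the packets of one (already decompressed) capture file into per-source documents carrying packet counts, destination ports, ASN, organization and country, ready to be sent to an index.

    import json
    from collections import Counter, defaultdict

    import geoip2.database   # pip install geoip2; the GeoLite2 databases are an assumption
    import geoip2.errors
    from scapy.all import IP, TCP, UDP, PcapReader

    ASN_DB = geoip2.database.Reader("GeoLite2-ASN.mmdb")
    COUNTRY_DB = geoip2.database.Reader("GeoLite2-Country.mmdb")


    def dissect_and_enrich(pcap_path):
        """Dissect the packets, enrich each source with ASN/country, aggregate into documents."""
        per_source = defaultdict(lambda: {"packets": 0, "dst_ports": Counter()})
        for pkt in PcapReader(pcap_path):
            if IP not in pkt:
                continue
            doc = per_source[pkt[IP].src]
            doc["packets"] += 1
            if TCP in pkt:
                doc["dst_ports"][pkt[TCP].dport] += 1
            elif UDP in pkt:
                doc["dst_ports"][pkt[UDP].dport] += 1

        documents = []
        for src, stats in per_source.items():
            try:
                asn = ASN_DB.asn(src)
                country = COUNTRY_DB.country(src)
                stats.update(asn=asn.autonomous_system_number,
                             org=asn.autonomous_system_organization,
                             country=country.country.iso_code)
            except geoip2.errors.AddressNotFoundError:
                pass
            stats["src"] = src
            stats["dst_ports"] = dict(stats["dst_ports"])
            documents.append(stats)
        return documents


    if __name__ == "__main__":
        for document in dissect_and_enrich("capture.pcap"):
            print(json.dumps(document))   # in the real chain, documents go to an index

Grouping by source and enriching with the ASN and country is what makes the ranking of countries and organizations mentioned above possible.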

Netbeacon

Netbeacon is a simple protocol and software implementation [Netbeacon] that sends network beacons at regular intervals towards the black-hole networks.
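The actual beacon format is defined in [Netbeacon]; the sketch below is only a hypothetical illustration of the idea, assuming a UDP beacon carrying a sequence number and a timestamp, sent towards a documentation address standing in for the monitored range. Missing sequence numbers in the collected capture then point to packet loss or errors in the collection chain.

    import json
    import socket
    import time

    # Hypothetical parameters: 192.0.2.1 (documentation range) stands in for an address
    # inside the monitored black-hole network; the port and interval are assumptions.
    BLACKHOLE_TARGET = ("192.0.2.1", 4242)
    INTERVAL = 60  # seconds between beacons


    def send_beacons():
        """Emit a timestamped, sequence-numbered beacon at a fixed interval."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sequence = 0
        while True:
            payload = json.dumps({"seq": sequence, "ts": time.time()}).encode()
            sock.sendto(payload, BLACKHOLE_TARGET)
            sequence += 1
            time.sleep(INTERVAL)


    if __name__ == "__main__":
        send_beacons()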

Seminar Agenda

  • Overall review of our research regarding "typographic errors" [AnotherPerspective] and black-hole monitoring. Slides

  • Technical implementation and analysis techniques of black-hole monitoring. Slides

Further research

As black-hole monitoring, and especially the analysis of its datasets, is a huge topic, we propose research topics, including dataset exchange with researchers, in order to improve the understanding of the Internet and its security.

Bibliography


1. The overall IPv4 address space contains 2^32 (4,294,967,296) addresses, and the allocation of this address space is nearly complete.