The Internet void is an interesting place. Under normal conditions, the Internet void is empty and we should not see anything. But if you take the time to look deeply into "black-hole" monitoring datasets, you might find surprising results, ranging from badly configured systems to the effects of unknown attacks, along with various unexplained events. This talk will take you on a journey into the noise of Internet network monitoring.
Black-hole monitoring definition
Bailey et al. [IMS] defined black-hole monitoring in 2005 as the monitoring of unused or dark address space. Nowadays (in 2013), IPv4 address space is becoming a scarce resource, so the definition of black-hole monitoring can be extended to the monitoring of temporarily unused address space. The address space (even if unused) must be announced globally on the Internet and be visible to the majority of Autonomous Systems (AS), just like any used address space.
In our experiments, the address allocation given for monitoring influences the data collected, not only because of residual network activity (from previous usage of the address space) but also because of the numeric values of the addresses themselves. Spelling errors also play a significant role in the data collected from dark address space, as discovered in one of our experiments [AnotherPerspective].
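To make the idea of a spelling error concrete, here is a minimal sketch, assuming that a "spelling error" means a mistyped destination address that differs from the intended one by a single substituted character or by two swapped neighbouring characters. The function name and the example addresses are illustrative only and are not taken from our experiments; inserted or dropped characters are deliberately ignored.

```python
def one_typo_apart(intended: str, observed: str) -> bool:
    """Return True if `observed` looks like a one-keystroke typo of `intended`.

    Only two kinds of typo are considered (an assumption of this sketch):
    one substituted character, or one swap of two adjacent characters.
    """
    if len(intended) != len(observed) or intended == observed:
        return False
    # Positions at which the two strings differ.
    diffs = [i for i, (a, b) in enumerate(zip(intended, observed)) if a != b]
    if len(diffs) == 1:
        return True  # single substitution
    if len(diffs) == 2 and diffs[1] == diffs[0] + 1:
        i, j = diffs
        # Adjacent transposition: the two characters traded places.
        return intended[i] == observed[j] and intended[j] == observed[i]
    return False
```

For example, `one_typo_apart("192.168.1.1", "193.168.1.1")` is true, which illustrates how a monitored subnet that is numerically close to commonly typed addresses could attract more misdirected traffic.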
Black-hole monitoring has to be seen as a sampled measurement of the Internet noise: the information collected represents a subset of the Internet noise traffic. The behavior of such traffic is erratic, as it depends not only on the size of the monitored black-hole subnet but also on events occurring on the Internet as a whole, and on whether the selected subnet is more prone to spelling errors.
When designing adequate long-term collection mechanisms for black-hole monitoring, this erratic behavior has to be taken into consideration to cope with both low and high bandwidth usage (e.g. on one of our sensors, it is not unusual to see bandwidth usage vary by a factor of ten).
As the traffic behavior tends to deviate regularly, the analysis becomes more complicated, because you may question whether the network collection mechanisms are operating properly (e.g. packet loss, errors). Netbeacon [Netbeacon] was developed for this kind of verification.
Once the packets are properly collected, analyzing the collected data is another challenge. Sampling a subset of traffic helps to discover trends in misconfiguration. As the data is collected over a long period of time (more than one year), the evolution of these misconfigurations can be traced.
Technical implementation overview
The black-hole data is collected by sensors installed in a dedicated network to which the traffic is routed. The sensors capture Ethernet frames and add a timestamp to each. The packets are then relayed via an encrypted tunnel to a collector, where they are written to files that are rotated every five minutes. The resulting files are organized by day in order to be processed. When a file is ready, a queue is updated; the processing programs poll this queue.
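The rotation and queueing step can be sketched as follows. This is a minimal illustration under stated assumptions, not the collector we run: the root directory `/data/blackhole`, the file naming scheme, and the `Rotator` class are all hypothetical. Only the five-minute rotation, the per-day organization, and the "ready" queue come from the description above.

```python
import queue
from datetime import datetime, timezone
from pathlib import PurePosixPath

ROTATION_SECONDS = 5 * 60  # five-minute rotation, as described in the text


def capture_file_for(ts: float, root: str = "/data/blackhole") -> PurePosixPath:
    """Return the (hypothetical) path of the file holding a packet seen at `ts`."""
    # Round the timestamp down to the start of its five-minute window.
    window_start = int(ts) - int(ts) % ROTATION_SECONDS
    start = datetime.fromtimestamp(window_start, tz=timezone.utc)
    # One directory per day, one file per five-minute window.
    return PurePosixPath(root) / start.strftime("%Y/%m/%d") / start.strftime("%Y%m%d-%H%M00.cap")


class Rotator:
    """Track the current capture file and announce finished files on a queue."""

    def __init__(self) -> None:
        self.current = None
        self.ready = queue.Queue()  # polled by the processing programs

    def observe(self, ts: float) -> PurePosixPath:
        """Return the file for `ts`; enqueue the previous file if the window changed."""
        path = capture_file_for(ts)
        if self.current is not None and path != self.current:
            self.ready.put(self.current)
        self.current = path
        return path
```

A processing program would then call `ready.get()` in a loop and handle each finished file. Deriving the file name from the packet timestamp rather than from the wall clock keeps the layout reproducible when packets arrive late through the tunnel.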
Once a compressed file arrives, it is processed by a chain of tools in order to distill information. Protocol dissection is done with various tools, and the data is additionally enriched with information from other systems. The dissected and enriched packets are aggregated into documents, which are indexed for further queries. It is thus possible to rank countries and the organizations that own ASNs according to their attacks and misconfigured systems.
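The ranking step can be illustrated with a short sketch. It assumes that each enriched packet has already been turned into a dictionary with hypothetical `country` and `asn` fields; the field names, the placeholder country codes, and the choice to rank by raw packet count are assumptions of this example, not a description of the indexing system we use.

```python
from collections import Counter


def rank_by(records, key):
    """Rank the values of `key` by how many records carry them, most frequent first.

    Records that lack the key (for instance when enrichment failed) are skipped.
    """
    counts = Counter(r[key] for r in records if r.get(key) is not None)
    return counts.most_common()


# Illustrative records using documentation addresses and ASNs.
records = [
    {"src_ip": "192.0.2.10", "country": "XA", "asn": 64496},
    {"src_ip": "192.0.2.11", "country": "XA", "asn": 64496},
    {"src_ip": "198.51.100.7", "country": "XB", "asn": 64497},
    {"src_ip": "192.0.2.12", "country": "XA", "asn": 64496},
    {"src_ip": "203.0.113.5"},  # enrichment missing: ignored by rank_by
]

top_asns = rank_by(records, "asn")
top_countries = rank_by(records, "country")
```

Counting packets favours noisy sources; counting distinct source addresses per ASN instead would be an equally reasonable choice, depending on the question asked.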
Netbeacon is a simple protocol and software implementation [Netbeacon] that sends network beacons at regular intervals towards the black-hole networks.
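The principle behind beacon-based verification can be sketched as follows. This is not the Netbeacon wire format or code: the `beacon;<sequence>;<timestamp>` payload, the function names, and the default interval are hypothetical and chosen only to show how gaps in a known, regular signal reveal packet loss in the capture chain.

```python
import socket
import time

PREFIX = b"beacon"  # hypothetical marker, NOT the actual Netbeacon format


def make_beacon(seq: int, ts: int) -> bytes:
    """Build a beacon payload carrying a sequence number and a send timestamp."""
    return b";".join([PREFIX, str(seq).encode(), str(ts).encode()])


def parse_beacon(payload: bytes):
    """Return (seq, ts) for a well-formed beacon, or None for any other payload."""
    parts = payload.split(b";")
    if len(parts) != 3 or parts[0] != PREFIX:
        return None
    try:
        return int(parts[1]), int(parts[2])
    except ValueError:
        return None


def missing_sequences(seen):
    """List the sequence numbers absent between the lowest and highest ones seen."""
    seen = set(seen)
    if not seen:
        return []
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]


def send_beacons(dest: str, port: int, count: int, interval: float = 60.0) -> None:
    """Send `count` beacons over UDP to an address inside the monitored range."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for seq in range(count):
            sock.sendto(make_beacon(seq, int(time.time())), (dest, port))
            time.sleep(interval)
```

On the analysis side, running `parse_beacon` over the captured payloads and passing the recovered sequence numbers to `missing_sequences` shows which beacons never reached the collector. Beacons lost after the last one received are not detected by this check; comparing against the expected count covers that case.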
As black-hole monitoring, and especially the analysis of its datasets, can be a huge topic, we propose research topics, including dataset exchange with researchers, in order to improve the understanding of the Internet and its security.
[IMS] Bailey, Michael, et al. "The Internet Motion Sensor: A Distributed Blackhole Monitoring System." NDSS 2005. http://www.csd.uoc.gr/~gvasil/old2009/old/stuff/papers/ims-ndss05.pdf
[IMSMeasurement] Bailey, Michael, et al. "Practical darknet measurement." Information Sciences and Systems, 2006 40th Annual Conference on. IEEE, 2006. http://nsrg.eecs.umich.edu/publications/ciss06_final.pdf
[IMSFiltering] Bailey, Michael, et al. "Data reduction for the scalable automated analysis of distributed darknet traffic." Proceedings of the 5th ACM SIGCOMM conference on Internet Measurement. USENIX Association, 2005. https://www.usenix.org/legacy/event/imc05/tech/full_papers/bailey/bailey.pdf
[Netbeacon] Alexandre Dulaunoy. Netbeacon monitoring your network capture. https://github.com/adulau/netbeacon/
[AnotherPerspective] Alexandre Dulaunoy, Gerard Wagener, Cynthia Wagner, Marc Stiefer. Another Perspective to IP-Darkspace Analysis. TF-CSIRT 2013 http://www.circl.lu/files/tf-csirt-first2013-circl-restena-blackhole.pdf
[InternetSinks] Yegneswaran, Vinod, Paul Barford, and Dave Plonka. On the design and use of Internet sinks for network abuse monitoring. Recent Advances in Intrusion Detection. Springer Berlin Heidelberg, 2004. http://wns.ice.cycu.edu.tw/~lin730/labpaper/vance/2007.1.4/On%20the%20Design%20and%20Use%20of%20Internet%20Sinks%20for%20Network%20Abuse%20Monitoring.pdf
[SpellingErrors] Pollock, Joseph J., and Antonio Zamora. Collection and characterization of spelling errors in scientific and scholarly text. Journal of the American Society for Information Science 34.1 (1983): 51-58.
[SDBF] Wagner, Cynthia, Jérôme François, Gérard Wagener, Alexandre Dulaunoy and T. Engel. "SDBF: Smart DNS brute-forcer." In Network Operations and Management Symposium (NOMS), 2012 IEEE, pp. 1001-1007. IEEE, 2012. http://www.foo.be/papers/sdbf.pdf
[PassiveDNS] Alexandre Dulaunoy, Sebastien Tricaud. Scrutinizing a Country using Passive DNS and Picviz or how to analyze big dataset without loosing your mind. https://github.com/adulau/pdns-toolkit/blob/master/slides/picviz-pdns.pdf?raw=true