, pp.354-359 http://dx.doi.org/10.14257/astl.2016.139.71

Network Intrusion Forensics System based on Collection and Preservation of Attack Evidence

Jong-Hyun Kim, Yangseo Choi, Joo-Young Lee, Sunoh Choi, and Ik-kyun Kim

Electronics and Telecommunications Research Institute (ETRI), 34129 Daejeon, Korea
{jhk, yschoi92, joolee, suno, ikkim21}@etri.re.kr

Abstract. Network forensics and intrusion analysis are usually performed after an attack has completed, by which time some useful evidence data has been lost. Because too little evidence remains to investigate the cause of a cyber-attack after it occurs, the cause is difficult to analyze even once the attack event has been found. Moreover, since cyber-attacks against the Internet such as advanced persistent threats (APT) are becoming more intelligent, it is difficult to find the cause of attacks with conventional forensics equipment. In this paper, we introduce a network intrusion forensics system based on the collection and preservation of attack evidence. The system is used to quickly analyze the cause of an attack event after the attack occurs; it collects evidence data and ensures the integrity of that data as stored in virtual volume-based storage. The paper also presents experimental results on network throughput, obtained by evaluating the proposed system in a real Local Area Network environment.

Keywords: Cyber Attacks, Network Forensics, Attack Cause Analysis.

1 Introduction

Currently, computer networks are vulnerable to cyber-attacks from both inside and outside an organization. Furthermore, threats such as personal information disclosure, bank fraud, DDoS attacks, and APT attacks occur continuously. Therefore, conventional information security systems such as IDS (Intrusion Detection System) and IPS (Intrusion Prevention System) may not be sufficient to defend against those attacks.
Current cyber incident response is carried out only after an attack has completed, when some useful evidence data has already been lost. It therefore cannot provide enough information for forensic analysis by the time the attack is recognized. In addition, since no security log information is available for analyzing an attack after the cyber incident occurs, it is difficult to investigate the cause of the attack. To solve these security issues, we need new approaches that enhance network forensics, which collects, stores, and analyzes network traffic to investigate the cause of an attack.

ISSN: 2287-1233 ASTL Copyright 2016 SERSC
Many useful network forensics tools and network traffic collection tools are surveyed in [1]. The essential functions of a network forensics tool are to collect the entire network traffic, store the evidential information of attacks, and analyze it to find the cause of an attack [2]. Network traffic collection tools often use a promiscuous interface to capture network packets, extract the packet contents, and preserve statistical data in storage [3, 4]. In this paper, we propose a network traffic collection and analysis system based on the collection and preservation of the evidence of cyber-attacks.

The paper is organized as follows: Section 2 describes the system architecture and technical functions, Section 3 presents the experimental results, and Section 4 gives the conclusion.

2 System Architecture and Technical Functions

2.1 System Architecture

The main goals of our proposed system are to quickly analyze the cause of an attack after it occurs and to provide the evidential information of the attack. To that end, the system collects network traffic including entire network packets, network flows, transmitted files, and so on.

Fig. 1. (a) Architecture of the proposed system, (b) System Specifications

Fig. 1 shows the architecture and the system specifications of the proposed system, called the cyber black box. The architecture consists of two physical systems. The following sections describe the functionality of each module (block) of the proposed system in detail.
2.2 Traffic & Flow Information Gathering

As shown in Fig. 1, TFGB (Traffic & Flow information Gathering Block) accommodates 10 Gbps of network traffic via a network interface card (NIC), stores the collected packets, extracts the traffic information of each network flow, and generates the session data. The flow extraction method analyzes all packet data produced by the packet extraction unit. It may collect packets that share the same features within a given time interval and bundle them into a PCAP-format file, which yields one piece of flow data (or a flow packet). The extracted flow data may be temporarily stored in the virtual volume based storage. By connecting multiple hard disks, our proposed system stores traffic arriving at a total of 10 Gbps without loss. The hash value generating unit may apply a hash function (SHA-256) to each of the entire packet data and the flow data, so that the integrity of both can be ensured while they are stored in the virtual volume based storage.

2.3 Transmitted File Reconstruction

TFRB (Transmitted File Reconstruction Block) reconstructs transmitted files from the stored data collected by TFGB and stores additional metadata collected during each file reconstruction. For example, the PE file extraction unit may select the packets that carry PE file information (or a PE format) from the entire packet data. The extracted PE file is also temporarily stored in the virtual volume based storage. TFRB can analyze network service protocols such as HTTP, SMTP, FTP, POP3, and so on. It also determines which protocol a file was sent by, calculates the hash value (SHA-256) of the file, and stores the metadata collected for the extracted files in a specified directory in CSV file format.
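The metadata step described for TFRB can be illustrated with a minimal Python sketch. It is not the authors' implementation (the paper states the system was written in C); the function names, the CSV column names, and the PE check are assumptions made for illustration. The PE check relies only on the standard PE layout: an "MZ" DOS header whose 4-byte field at offset 0x3C points to the "PE\0\0" signature.

```python
import csv
import hashlib

# Hypothetical column layout; the paper does not specify the CSV schema.
FIELDS = ["name", "protocol", "size", "is_pe", "sha256"]


def looks_like_pe(data: bytes) -> bool:
    """Heuristic PE check: 'MZ' header whose e_lfanew field points at 'PE\\0\\0'."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: little-endian offset of the PE signature, stored at 0x3C.
    pe_offset = int.from_bytes(data[0x3C:0x40], "little")
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


def file_metadata(name: str, protocol: str, data: bytes) -> dict:
    """Build one metadata row for a reconstructed file."""
    return {
        "name": name,
        "protocol": protocol,  # e.g. "HTTP", "SMTP", "FTP", "POP3"
        "size": len(data),
        "is_pe": looks_like_pe(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def write_metadata_csv(rows, out) -> None:
    """Write metadata rows to an open text file object in CSV format."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

A caller would pass the reassembled file bytes and the protocol name determined during reconstruction, then write the rows to a file opened with newline="" as the csv module recommends.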
2.4 Virtual Volume based Storage Management

VSMB (Virtual volume based Storage Management Block) stores the entire packet data, the flow data, and the transmitted files, all of which are encoded by the encoding unit. It receives the hash value generated by the hash value generating unit for each of the entire packet data, the flow data, and the transmitted files, and then stores the received hash values as evidence data. Moreover, VSMB supports a write once read many (WORM) function to ensure the integrity of the data stored in the virtual volume based storage. A storage unit supporting the WORM function is a storage medium to which data is written once and from which it can be read many times, like a CD-ROM. Therefore, the storage unit can preserve the entire packet data, the flow data, and the transmitted files for a long time. VSMB also provides the capability to create or destroy virtual volumes of the storage system, for data protection as well as file storage management.
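The combination of write-once storage and a stored hash value can be sketched as follows. This is an in-memory stand-in written under stated assumptions, not the VSMB design: the class name WormStore and its methods are hypothetical, and a real WORM volume enforces immutability in the storage layer rather than in application code.

```python
import hashlib


class WormStore:
    """Hypothetical in-memory stand-in for a write-once-read-many volume."""

    def __init__(self):
        self._objects = {}  # key -> stored bytes
        self._digests = {}  # key -> SHA-256 hex digest received at write time

    def write(self, key: str, data: bytes, sha256_hex: str) -> None:
        """Store data once, together with the hash value supplied by the caller."""
        if key in self._objects:
            raise PermissionError(f"{key!r} was already written; overwrite refused")
        self._objects[key] = bytes(data)
        self._digests[key] = sha256_hex

    def read(self, key: str) -> bytes:
        """Reads are allowed any number of times."""
        return self._objects[key]

    def verify(self, key: str) -> bool:
        """Recompute SHA-256 over the stored bytes and compare with the stored digest."""
        return hashlib.sha256(self._objects[key]).hexdigest() == self._digests[key]
```

In this sketch the digest is computed by the caller before the write, mirroring the description in which the hash value generating unit produces the value and VSMB only receives and stores it. A later verify call that returns False indicates that the stored bytes no longer match the evidence hash.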
2.5 Intrusion Analysis & Scenario generation

IASB (Intrusion Analysis & Scenario generation Block) provides the user interface for analyzing the cause of a cyber-attack using the preserved data and management data. That is, it presents the analysis result to a user through a GUI. It also reconstructs the cyber-attack scenario based on the extracted information and reproduces the corresponding cyber-attack according to the reconstructed scenario. The result of the cause analysis can be supplied to an external system through the external cooperation protocol. That is, the external cooperation system sets a security grade for an external system and grants it an appropriate authority according to that grade. The external system may be a security-related system operated by a security company, a public institution, a portal company, a general company, and so on.

3 Experimental Results

In this section, we present the experimental results for our proposed system. We designed and implemented each functional module of the system in the C programming language on the CentOS Linux 7.1 platform. For all experiments we used a single machine with 128 GB of DDR3 RAM, a 12-core 2.6 GHz CPU (Intel Xeon E5-2690v3), a 4 TB SSD, and 96 TB of SATA disk (7200 rpm, 128 MB buffer) with a RAID controller (LSI 9280_2414E).

For the experiments, we used the network traffic and flow data collected over a 24-hour period on one weekday in our testbed environment. Specifically, we deployed the system at data collection points such as in front of the working offices and an experimental lab. We also conducted a traffic throughput test with an Agilent N2X tool. The test verified that our proposed system could collect the attack event data and related flow records without packet loss at a total traffic rate of 20 Gbps, as shown in Table 1.

Table 1. Summary of network packet processing performance at a total of 20 Gbps traffic.
In another experiment, our proposed system collected at least 400,000 flow records per second and stored them in the virtual volume based storage, which supports the WORM function. We also confirmed that the TFRB block reconstructed the transmitted files from the stored traffic packets collected by the TFGB block and stored the additional metadata collected during each file reconstruction.

In addition, the system was installed and run in a real LAN environment carrying up to a total of 2 Gbps of traffic, in order to evaluate its processing performance over a long period of time. Table 2 summarizes the network traffic, flow counts, and packet counts collected from the 4th to the 10th of September 2016 in the real network environment. The results show that our proposed system could collect about 4 TB of network traffic data daily without packet loss in the real LAN environment.

Table 2. Summary of collected data from a real network environment.

                Total amount per week   Daily Average   Average of working day   Average of nonworking day
Traffic (TB)    27.847631               3.978           4.995                    1.435
# of Flows      922,591,298             131,798,757     149,747,230              86,927,575
# of Packets    34,574,737,355          4,939,248,194   6,174,046,881            1,852,251,476

We also analyzed the collected traffic data by the most frequently used services; Table 3 summarizes the network traffic distribution over the top 10 services.

Table 3. Summary of network traffic distribution by Top 10 services.

For the overall performance of network traffic processing, we found that our experimental implementation of the proposed system could collect network traffic at 20 Gbps without packet loss. We also measured the
maximum processing performance using various packet sizes generated by the N2X tool. However, we could not evaluate the packet processing performance at traffic rates up to 20 Gbps in the real LAN environment, since that environment carried only up to a total of 2 Gbps of traffic.

4 Conclusion

Because too little evidence remains to investigate the cause of a cyber-attack after it occurs, the cause is difficult to analyze even after the attack is recognized. With our proposed system, however, entire packet data, flow data, and transmitted files can be collected from network traffic as evidence data and stored in the storage medium for a long time, so the cause of an attack can be analyzed quickly based on the preserved evidence. This paper described the architecture of our proposed system for network forensics and verified its network throughput performance by deploying it in an experimental testbed environment as well as in a real LAN environment. The evaluation showed that the system can collect the attack event data and related flow records without packet loss at a total traffic rate of 20 Gbps.

Acknowledgments. This work was supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No. B0101-15-0300, The Development of Cyber Blackbox and Integrated Security Analysis Technology for Proactive and Reactive Cyber Incident Response).

References

1. Pilli, E.S., Joshi, R.C., Niyogi, R.: Network forensic frameworks: Survey and research challenges. The International Journal of Digital Forensics & Incident Response, vol. 7, pp. 14-27 (2010)
2. Davidoff, S., Ham, J.: Network Forensics: Tracking Hackers through Cyberspace. Pearson Education (2012)
3. Rizzo, L.: netmap: a novel framework for fast packet I/O. In: Proceedings of USENIX conference on Annual Technical Conference, pp. 9-19 (2012)
4. Deri, L., Cardigliano, A., Fusco, F.: 10 Gbit line rate packet-to-disk using n2disk. In: Proceedings of IEEE INFOCOM Workshop on Traffic Monitoring and Analysis, pp. 3399-3404 (2013)