OpenDataPlane: a network packet's journey. Maxim Uvarov. This presentation describes what ODP is, touches on the common data types and APIs for working with network packets, answers the question of why ODP is needed and how to use it, and gives a basic overview of its internal components.
Linux kernel stack
More complex but not complete view: https://wiki.linuxfoundation.org/images/1/1c/network_data_flow_through_kernel.png
Issues with the kernel network stack:
- Long and unpredictable latencies for network packets.
- Large per-packet metadata (SKB metadata, TCP/IP and netfilter metadata): a few kilobytes for a 64-byte packet.
- Kernel and userland compete for CPU time, so CPU cycles are not spent optimally.
- TLB entries are needed for both userland and kernel; no huge-page support.
- Packet body copies: VLAN tags are stripped and later re-inserted in userland for AF_PACKET.
- Other interrupts are involved and may delay execution.
- No hardware accelerator support.
- A crash in the kernel makes the whole system inoperable and forces a reboot: rebooting the whole system vs. restarting one single application.
Abstract network hardware
What is OpenDataPlane? ODP is a set of APIs that describe networking hardware in software. It is shipped as a library, both static and dynamic. An ABI-compatible libodp build can be used unchanged across different implementations; a non-ABI-compatible build uses native types for more speed optimization. ODP defines the API and places no restrictions on implementations. Known projects similar to ODP are: DPDK, netmap, PF_RING, libpcap. Known implementations are: Linux generic, DPDK-based, Cavium ThunderX, Freescale DPAA-based, TI Keystone II, Kalray MPPA. Current ODP implementations run in Linux user space, where the hardware delivers packets directly into user-space memory. If an ODP API cannot be accelerated by hardware, a software implementation is used instead.
ODP network packet: the odp_packet_t handle
odp_packet_t pkt = odp_packet_alloc(pool, len);
void *odp_packet_data(odp_packet_t pkt);
uint32_t odp_packet_len(odp_packet_t pkt);
uint32_t odp_packet_headroom(odp_packet_t pkt);
uint32_t odp_packet_tailroom(odp_packet_t pkt);
void *odp_packet_push_head(odp_packet_t pkt, uint32_t len);
void *odp_packet_pull_head(odp_packet_t pkt, uint32_t len);
int odp_packet_extend_tail(odp_packet_t *pkt, uint32_t len, void **data_ptr, uint32_t *seg_len);
int odp_packet_move_data(odp_packet_t pkt, uint32_t dst_offset, uint32_t src_offset, uint32_t len);
int odp_packet_copy_data(odp_packet_t pkt, uint32_t dst_offset, uint32_t src_offset, uint32_t len);
The packet journey begins: the ODP classifier. The classifier is a hardware block in the ingress pipeline that matches packets against configured Packet Matching Rules (PMRs). As a result, the packet is scheduled for processing to a specific queue. Packet classification takes no CPU time and adds no latency to packet processing.
PKTIO, QUEUES and POOLS
odp_pktio_t - handle for packet input/output: set up modes, get statistics counters and link status, describe the connection to an ODP queue.
odp_queue_t - handle describing a queue of packets.
odp_pool_t - handle for the memory storage backing the packets.
Example of usage:
odp_pktio_t pktio = odp_pktio_create("netmap:eth0");
odp_queue_t queue = odp_queue_create("qname");
odp_pool_t pool = odp_pool_create("pool_name", &params);
Run model
ODP_PKTIN_MODE_DIRECT
odp_pktin_queue_t queue_in;
odp_pktout_queue_t queue_out;
odp_pktin_queue(pktio, &queue_in, 1);
odp_pktout_queue(pktio, &queue_out, 1);
while (1) {
	pkts = odp_pktin_recv_tmo(queue_in, pkt_tbl, MAX_PKT_BURST, ODP_PKTIN_NO_WAIT);
	if (pkts <= 0)
		continue;
	sent = odp_pktout_send(queue_out, pkt_tbl, pkts);
	if (sent < 0)
		sent = 0;
	if (sent < pkts) /* free packets that were not sent */
		odp_packet_free_multi(&pkt_tbl[sent], pkts - sent);
}
ODP_PKTIN_MODE_SCHED
while (1) {
	odp_event_t ev = odp_schedule(NULL, ODP_SCHED_WAIT);
	switch (odp_event_type(ev)) {
	case ODP_EVENT_PACKET:
		pkt = odp_packet_from_event(ev);
		sent = odp_pktout_send(queue_out, &pkt, 1);
		break;
	case ODP_EVENT_TIMEOUT:
		/* handle timer expiration */
		break;
	case ODP_EVENT_CRYPTO_COMPL:
		/* handle crypto completion */
		break;
	}
}
Crypto operation: encrypt / decrypt a packet
odp_crypto_session_t - handle describing the crypto block setup (sync or async mode, cipher and auth algorithms)
int odp_crypto_operation(odp_crypto_op_param_t *param, odp_bool_t *posted, odp_crypto_op_result_t *result);
odp_crypto_compl_t odp_crypto_compl_from_event(odp_event_t ev);
Egress: traffic manager. The QoS packet-scheduling ODP block uses the following algorithms: strict-priority scheduling, Weighted Fair Queueing (WFQ), bandwidth shaping, and Weighted Random Early Discard (WRED).
odp_tm_t tm = odp_tm_create("tm", &tm_params);
odp_tm_queue_t tq = odp_tm_queue_create(tm, &queue_params);
odp_tm_node_t tn = odp_tm_node_create(tm, "TmNode", &node_params);
odp_tm_queue_connect(tq, tn);
sched_profile_rr = odp_tm_sched_create("schedprofilerr", &sched_params);
Areas where ODP can be useful:
- Routers
- Collector servers for IoT sensor data
- Industrial real-time networking applications
- DPI systems (such as Snort or Suricata)
- VPN applications
- Network applications that bypass the OS kernel network stack (OFP)
- Virtual switches
Links
API, Linux-generic implementation and tests: https://git.linaro.org/lng/odp.git
ODP-DPDK: https://git.linaro.org/lng/odp-dpdk.git
ODP Freescale DPAA1, DPAA2: http://git.freescale.com/git/cgit.cgi/ppc/sdk/odp.git
Cavium ARM64 ThunderX: https://github.com/linaro/odp-thunderx