Kubernetes: Love at first sight?
15 February 2018
Joost Hofman (Lead Developer @ Albert Heijn IT Online)
Milo van der Zee (Senior Developer @ Albert Heijn IT Online)

Agenda
- Kubernetes
- Why at AH?
- How?
- Relational problems
- Is it real love?
- Questions

Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.
Kubernetes - Searches
Kubernetes: Service and Pods
[Diagram: a Service and its Pods, drawn with 1-to-n relations]

Kubernetes: architecture
[Diagram]
- Users; Operator / Developer
- Kubernetes Master: API Server, Controller Manager, Scheduler, etcd
- Kubernetes Nodes (up to 5000): Kubelet, cAdvisor, kube-proxy, Pods
- Plugin network: Calico

Kubernetes

    user@host $ nodes
    NAME          STATUS                     ROLES    AGE   VERSION
    k8snode2098   Ready,SchedulingDisabled   master   12d   v1.8.4+coreos.0
    k8snode2099   Ready,SchedulingDisabled   master   12d   v1.8.4+coreos.0
    k8snode2100   Ready,SchedulingDisabled   master   12d   v1.8.4+coreos.0
    k8snode2101   Ready                      node     12d   v1.8.4+coreos.0
    k8snode2102   Ready                      node     12d   v1.8.4+coreos.0
    k8snode2103   Ready                      node     12d   v1.8.4+coreos.0
    k8snode2104   Ready                      node     12d   v1.8.4+coreos.0
    k8snode2105   Ready                      node     12d   v1.8.4+coreos.0
    k8snode2107   Ready                      node     12d   v1.8.4+coreos.0
    k8snode2108   Ready                      node     12d   v1.8.4+coreos.0
    k8snode2109   Ready                      node     12d   v1.8.4+coreos.0
    k8snode2110   Ready                      node     12d   v1.8.4+coreos.0
    k8snode2111   Ready                      node     12d   v1.8.4+coreos.0

Kubernetes

    user@host $ pods -o wide
    NAME                                   READY   STATUS    IP               NODE
    shoppinglist-widget-3162246403-q7c1x   1/1     Running   10.233.106.55    k8snode1657
    subscription-service-8cc4c97fb-dh9zz   1/1     Running   10.233.87.218    k8snode1656
    subscription-service-8cc4c97fb-t7wrj   1/1     Running   10.233.73.169    k8snode1651
    taxonomy-neo4j-neo4j-core-0            1/1     Running   10.233.124.123   k8snode1814
    taxonomy-neo4j-neo4j-core-1            1/1     Running   10.233.73.147    k8snode1651
    taxonomy-neo4j-neo4j-core-2            1/1     Running   10.233.79.109    k8snode1813
    taxonomy-service-7b4fb7f8d5-c6mvb      1/1     Running   10.233.79.105    k8snode1813
    taxonomy-service-7b4fb7f8d5-h2hjk      1/1     Running   10.233.68.145    k8snode1655
    gateway-3060515939-57r22               1/1     Running   10.233.124.98    k8snode1814
    gateway-3060515939-9lqzk               1/1     Running   10.233.68.185    k8snode1655
    gateway-3060515939-fkt9k               1/1     Running   10.233.71.29     k8snode1654
    gateway-3060515939-ls9pv               1/1     Running   10.233.79.101    k8snode1813

Pods: kubectl describe pod (API gateway)

    # kubectl -n online-prd describe pod gateway-3060515939-57r22
    Name:         gateway-3060515939-57r22
    Namespace:    online-prd
    Node:         k8snode1814/150.83.153.243
    Start Time:   Wed, 14 Feb 2018 13:12:03 +0100
    Labels:       name=gateway
    Status:       Running
    IP:           10.233.124.98
    Containers:
      gateway:
        Image:    regisry-docker.online.ah.nl:443/ah-open-api-gateway:0.1.2
        Port:     <none>

Service: kubectl describe svc (API gateway)

    # kubectl describe svc gateway
    Name:              gateway
    Namespace:         online-prd
    Labels:            run=gateway
    Annotations:       kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"run":"gateway"},"name":"gateway","namespace":"online-prd"},"spec":{"ports":...
    Selector:          run=gateway
    Type:              ClusterIP
    IP:                10.233.52.234
    Port:              <unset> 8080/TCP
    TargetPort:        8080/TCP
    Endpoints:         10.233.124.98:8080,10.233.68.185:8080,10.233.71.29:8080 + 1 more...
    Session Affinity:  None
    Events:            <none>

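The Service output lists a Selector and a set of Endpoints. As a rough illustration of how the two relate, here is a minimal Python sketch of equality-based label matching. The function name `select_endpoints`, the `PODS` list, and its labels are hypothetical and only for illustration; they are not taken from the cluster, and the real endpoints controller does considerably more.

```python
from typing import Dict, List


def select_endpoints(selector: Dict[str, str], pods: List[dict], port: int) -> List[str]:
    """Return 'ip:port' for every ready pod whose labels contain all selector pairs."""
    endpoints = []
    for pod in pods:
        labels = pod["labels"]
        # Equality-based selector: every key/value pair must be present on the pod.
        matches = all(labels.get(key) == value for key, value in selector.items())
        if matches and pod["ready"]:
            endpoints.append(f"{pod['ip']}:{port}")
    return endpoints


# Hypothetical pods (assumption: labels and readiness are made up for this sketch).
PODS = [
    {"name": "gateway-a", "labels": {"run": "gateway"}, "ip": "10.233.124.98", "ready": True},
    {"name": "gateway-b", "labels": {"run": "gateway"}, "ip": "10.233.68.185", "ready": True},
    {"name": "gateway-c", "labels": {"run": "gateway"}, "ip": "10.233.71.29", "ready": False},
    {"name": "taxonomy-a", "labels": {"run": "taxonomy"}, "ip": "10.233.79.105", "ready": True},
]

if __name__ == "__main__":
    print(select_endpoints({"run": "gateway"}, PODS, 8080))
```

Pods that do not carry the selector's labels, or that are not ready, never receive traffic through the Service in this model.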
API service: iptables

    -A KUBE-SERVICES -d 10.233.52.234/32 -p tcp -m tcp --dport 443 -j SVC-JFMNS
    -A SVC-JFMNS --mode random --probability 0.25 -j KUBE-SEP-JPX2Q
    -A SVC-JFMNS --mode random --probability 0.33 -j KUBE-SEP-KUJYT
    -A SVC-JFMNS --mode random --probability 0.5 -j KUBE-SEP-HTGFR
    -A SVC-JFMNS --mode random -j KUBE-SEP-JP5GT
    -A SEP-JPX2Q -p tcp -m recent -j DNAT --to-destination 143.54.22.4:6443

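The probabilities 0.25, 0.33 and 0.5 look uneven, but the rules are evaluated in order, so each rule only sees the traffic that earlier rules did not take. A small Python sketch of that arithmetic, assuming four endpoints as on the slide (the helper names are hypothetical, and 0.33 is treated as a rounded 1/3):

```python
from fractions import Fraction
from typing import List


def chain_probabilities(n: int) -> List[Fraction]:
    """Per-rule match probability for n sequential rules: 1/n, 1/(n-1), ..., 1."""
    return [Fraction(1, n - i) for i in range(n)]


def effective_shares(rule_probs: List[Fraction]) -> List[Fraction]:
    """Share of all traffic each rule ends up handling when rules are tried in order."""
    shares = []
    remaining = Fraction(1)  # traffic not yet claimed by an earlier rule
    for p in rule_probs:
        shares.append(remaining * p)
        remaining *= 1 - p
    return shares


if __name__ == "__main__":
    probs = chain_probabilities(4)
    print([str(p) for p in probs])
    print([str(s) for s in effective_shares(probs)])
```

With four rules, every endpoint ends up with the same quarter of the connections; the last rule needs no probability because it takes whatever is left.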
Why @ Albert Heijn?
2015:
- Monolith
- Binary coupling
- Scalability problems
- Growth issues
- CI/CD impossible
- Downtime
Now and future:
- Scalable
- Decoupling
- Rolling updates
- Services
- CI/CD to the max
- Isolation of code
- Zero downtime
- Technology agnostic

Why @ Albert Heijn?
Scalable architecture and technology on a modern, scalable, automated platform
- Containers (container management platform): fully automated, within minutes
- Virtual hardware (virtualization): semi-automated, within weeks
- Commodity hardware: manual, within months

On premise vs. cloud
No cloud options in 2016 and 2017

How?
An HTTP call to appietoday.nl
[Diagram: request path]
Users -> Loadbalancer -> Nginx Ingress -> Frontend (service) -> Frontend (pod) -> API Gateway (service) -> API Gateway (pod)
API Gateway (pod) -> IDP (service) -> IDP (pod)
API Gateway (pod) -> API (service) -> API (pod)

Our setup?
Frontend:
- Continuous delivery
API Gateway:
- Authorization
- Authentication
- Throttling
- Routing
Services:
- 25+ services
- Continuous delivery
- Automated from development to production
Platform:
- 5 clusters
- 40+ nodes
- 650+ Docker containers
- Automate platform deployment with Ansible

Relational problems: Communication.
Relational problems: Storage.
On-premise storage:
- Host path
- NFS
- vSphere volumes

Relational problems: Storage.
On-premise storage:
- GlusterFS

Relational problems: Postgres on Gluster.

    pg_restore: [archiver (db)] Error from TOC entry 53398; 0 16503 TABLE DATA l1aaux_sci sdmcleod
    pg_restore: [archiver (db)] COPY failed for table "l1aaux_sci": ERROR: unexpected data beyond EOF in block 9391 of relation base/16386/17043
    HINT: This has been seen to occur with buggy kernels; consider updating your system.
    CONTEXT: COPY l1aaux_sci, line 319329: "1854661 \N 1.05156717906094999 1378796678.44843268 2012-02-01 07:04:39.5+00 2012-02-01 07:04:38.4484..."
    pg_restore: [archiver (db)] Error from TOC entry 53399; 0 16528 TABLE DATA l1afts_dbl sdmcleod
    pg_restore: [archiver (db)] COPY failed for table "l1afts_dbl": ERROR: unexpected data beyond EOF in block 10097 of relation base/16386/17068
    HINT: This has been seen to occur with buggy kernels; consider updating your system.

Relational problems: Postgres on Gluster.
Postgres source code: src/backend/storage/buffer/bufmgr.c

    /*
     * We get here only in the corner case where we are trying to extend
     * the relation but we found a pre-existing buffer marked BM_VALID.
     * This can happen because mdread doesn't complain about reads beyond
     * EOF (when zero_damaged_pages is ON) and so a previous attempt to
     * read a block beyond EOF could have left a "valid" zero-filled
     * buffer. Unfortunately, we have also seen this case occurring
     * because of buggy Linux kernels that sometimes return an
     * lseek(SEEK_END) result that doesn't account for a recent write. In
     * that situation, the pre-existing buffer would contain valid data
     * that we don't want to overwrite. Since the legitimate case should
     * always have left a zero-filled buffer, complain if not PageIsNew.
     */
    bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
    if (!PageIsNew((Page) bufBlock))
        ereport(ERROR,
                (errmsg("unexpected data beyond EOF in block %u of relation %s",
                        blockNum, relpath(smgr->smgr_rnode, forkNum)),
                 errhint("This has been seen to occur with buggy kernels; consider updating your system.")));

Relational problems: Communication.
Nodes can't reach each other anymore
- kube-proxy can't reach the API
- iptables are broken
- Network interface changes
- Subnet mismatch between Flannel and Docker (magically)

Relational problems: Communication.
Nodes can't reach each other anymore
- Migration from Flannel to Calico resulted in a small downtime but a very stable network afterwards
- Created a network test DaemonSet as our own relationship therapist

Relational problems: Communication.

    [prd-node1:root@k8snode1650 ~]# bridge fdb | grep cali
    33:33:00:00:00:01 dev calif8b8ce32fae self permanent
    01:00:5e:00:00:01 dev calif8b8ce32fae self permanent
    ...
    [prd-node1:root@k8snode1650 ~]# ip -d link show calif8b8ce32fae
    8: calif8b8ce32fae@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> state UP mode DEFAULT
        link/ether 7e:3f:ee:5e:d4:ed brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0
        veth addrgenmode eui64
    [prd-node1:pnlmv17y@k8snode1650 ~]$ route -n | grep cali
    10.233.65.153   0.0.0.0   255.255.255.255   UH   0   calif8b8ce32fae
    10.233.65.155   0.0.0.0   255.255.255.255   UH   0   cali2b5d60cd0be
    10.233.65.156   0.0.0.0   255.255.255.255   UH   0   cali9fa8da37832
    10.233.65.158   0.0.0.0   255.255.255.255   UH   0   cali4c2e295795a
    10.233.65.159   0.0.0.0   255.255.255.255   UH   0   cali5c975203c3b

Relational problems: Communication.

    [pnlmv17y@k8snode2110 ~]$ ip addr
    ...
    13: tunl0@none: <NOARP,UP,LOWER_UP> mtu 1440 qdisc noqueue state UNKNOWN qlen 1
        link/ipip 0.0.0.0 brd 0.0.0.0
        inet 10.233.65.133/32 scope global tunl0
           valid_lft forever preferred_lft forever
    ...
    [pnlmv17y@k8snode2101 ~]$ route -n | grep tunl0
    10.233.76.192   162.53.123.117   255.255.255.192   UG   0   0   0   tunl0
    10.233.78.192   162.53.123.110   255.255.255.192   UG   0   0   0   tunl0
    10.233.85.192   162.53.123.115   255.255.255.192   UG   0   0   0   tunl0
    10.233.88.64    162.53.123.111   255.255.255.192   UG   0   0   0   tunl0

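Each tunl0 route covers a block of 64 pod addresses (netmask 255.255.255.192, a /26) and points at the node that hosts that block. A minimal Python sketch of the lookup, using the four routes from the `route -n` output; the function name `next_hop` and the sample pod IPs are hypothetical, and this only mimics a longest-prefix match rather than the kernel's actual routing code:

```python
import ipaddress
from typing import Optional

# Routes copied from the route output on the slide: (destination, netmask, gateway).
TUNL0_ROUTES = [
    ("10.233.76.192", "255.255.255.192", "162.53.123.117"),
    ("10.233.78.192", "255.255.255.192", "162.53.123.110"),
    ("10.233.85.192", "255.255.255.192", "162.53.123.115"),
    ("10.233.88.64", "255.255.255.192", "162.53.123.111"),
]


def next_hop(pod_ip: str) -> Optional[str]:
    """Return the gateway (node IP) of the most specific route containing pod_ip."""
    addr = ipaddress.ip_address(pod_ip)
    best = None  # (network, gateway) of the longest matching prefix so far
    for dest, mask, gateway in TUNL0_ROUTES:
        net = ipaddress.ip_network(f"{dest}/{mask}")
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    return best[1] if best else None


if __name__ == "__main__":
    # Hypothetical pod IPs, chosen to fall inside and outside the listed blocks.
    for ip in ("10.233.76.200", "10.233.88.100", "10.233.65.133"):
        print(ip, "->", next_hop(ip))
```

An address outside every listed block, such as the node's own tunl0 address, has no tunnel route in this table and yields `None`.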
Relational problems: Communication.
We don't know much about Calico, and that is a good thing: it just works.
We know a lot more about Flannel, and that says enough.

Relational problems: Containers drop
Relational problems: Communication.
[Diagram: network test DaemonSet]
- Kubernetes Master and two Kubernetes Nodes, each running a Network Test pod (DaemonSet)
- Two Kube DNS pods behind the Kube DNS service

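The slides do not say what the network test pods actually check. As one possible shape for such a probe, here is a hedged Python sketch that tests DNS resolution and TCP reachability. All names (`can_resolve`, `can_connect`, `run_checks`) and the targets in the usage example are assumptions for illustration, not the speakers' implementation.

```python
import socket
from typing import Dict, Iterable, Tuple


def can_resolve(hostname: str) -> bool:
    """True if the name resolves to at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks(dns_names: Iterable[str], tcp_targets: Iterable[Tuple[str, int]]) -> Dict[str, bool]:
    """Run every check and return a result per target, keyed by 'dns:<name>' or 'tcp:<host>:<port>'."""
    results = {}
    for name in dns_names:
        results[f"dns:{name}"] = can_resolve(name)
    for host, port in tcp_targets:
        results[f"tcp:{host}:{port}"] = can_connect(host, port)
    return results


if __name__ == "__main__":
    # Assumed targets: a cluster DNS name and one pod IP and port seen earlier in the deck.
    report = run_checks(
        dns_names=["kubernetes.default.svc.cluster.local"],
        tcp_targets=[("10.233.124.98", 8080)],
    )
    for check, ok in report.items():
        print(check, "OK" if ok else "FAILED")
```

Running one such probe on every node, as a DaemonSet does, turns "nodes can't reach each other" from a guess into a per-node report.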
Kubernetes gives more benefits than doubts on premise
- A lot of open source tools around
- Helm packages
- Fast delivery of software
- Auto healing
- Very, very stable (only got called out of bed once at night in 2017)
- Happy developers
- Enabler for DevOps
- Etc.

Open source tools that boost our relationship
Projects that boost our relationship
Kubespray: easily deploying production-ready Kubernetes clusters.
Kubespray saved months of work setting up Kubernetes on premise.

Projects that boost our relationship
Helm: package manager for Kubernetes.
Helm makes upgrading and maintaining our applications predictable and super easy.

Love Joost Milo
Questions?