A Summary of Openstack Austin Summit
To give a summary of Openstack Austin Summit:
There is not much news on Cinder (features are progressing, but it is a bit routine). Manila is becoming mature (and getting more exposure). Multi-site Openstack is receiving increasing weight (from Cell V2, Ceph & Swift, backup/DR, to deployment practices).
The Ceph Jewel release is remarkable (CephFS production-ready, RBD mirror available for journal replication). NVM/SSD technologies are game-changing (NVMe Ceph journal, XtremIO, etc). DPDK is quickly being adopted (in OVS, NFV, monitoring). Hyper-converged native storage solutions (Veritas) for Openstack are starting to show up (Ceph, however, is not designed for this).
Kuryr, or the container overlay network, doesn’t seem to have made much progress. Neutron keeps improving DNS, baremetal support, IPAM, and NFV/VNF. On the SDN side, OVN (vs Dragonflow), OpenDaylight integration, and Dragonflow are progressing. Service Function Chaining (SFC) is taking shape (along with Tacker, OPNFV, NFVI, ETSI NFV, OASIS TOSCA, etc). Networking, NFV, SDN, service chaining, and the various solutions & vendors are still the most active part of Openstack. Note that VNFs usually need a custom kernel (which is common for proprietary switches), thus you see Cumulus Linux.
Containers are of course hot, but most of them are supported via PaaS rather than directly through Openstack (Murano vs Magnum), or used for containerized Openstack deployment (Kolla, Ansible + container, running on Kubernetes, etc). OCI & CNCF are still working hard to get themselves known. Magnum is building the Bay Driver (when will CloudFoundry and OpenShift come in?).
IoT is becoming hot (SAP, IBM, HPE, Pivotal, TCP Cloud, SmartCities). IBM is betting on Openstack. Mirantis is gaining increasing weight (and respect) in the community. Ubuntu/Canonical is rising (they have so many presentations). The Openstack Foundation is spending increasing effort on training activities, including the new Certified OpenStack Administrator (COA) (99Cloud supports it in China), in preparation to become the true industry standard. Besides, this Summit has a new Superuser TV series.
An interesting thing is that the super user/developer companies have basically occupied most presentation slots at the Summit (market/committee consolidation?). On the mailing list someone even proposed removing the voting process for speaker proposals.
Interesting new projects:
- Romana for network and security automation
- Kingbird for multi-site services
- Nation for compute node HA
- Convergence to make Heat execution more scalable and handle failures better
- Astara to virtualize Neutron agents and VNFs, and to ease management
- Tacker for network service function chaining orchestration
- Fuxi as mentioned in Magnum etherpads with Kuryr to enable data volumes
- Watcher provides continuous resource optimization (energy-aware optimization, consolidation, rebalancing, etc) in a closed loop including monitoring and action/advice.
- Higgins/Zun aims to make Openstack one platform for provisioning and managing VMs, baremetal, and containers as compute resources. Compared to Magnum, which enables containers through a second framework such as Kubernetes/Mesos/Swarm on top of Openstack, Higgins tries to make containers Openstack-native. The developers come from the original Nova-docker and Magnum. It is being renamed to Zun.
Interesting new storage vendors: Scality (S3 compatible, unified solution for Openstack (EMC solutions are however more usecase specific as I see), dynamic loadbalancing & no consistent hashing & never balance data); Veritas (DAS hyper-converged native solution for Openstack (Ceph is not designed for this), built into the hypervisor).
How to Select Videos to Watch
Each Openstack Summit releases hundreds of presentation videos. It is not easy to select the most worthwhile ones to watch. Here’s my guideline:
- Check out the officially featured videos (link).
- Check out the official summary/recap videos (example of Tokyo Summit).
- Check out the keynote presentations (link). They are at the beginning of each summit day. They demonstrate key community events and directions.
- Check out the Breakout Sessions. An Openstack Summit is divided into Breakout Sessions and Workshops, Developer & Operator working sessions, Keynote presentations, Marketplace Expo Hall, Lounges, etc (see summit timeline; it is clearer if you attend on-site). Note that this is not the “track”. Breakout Sessions are usually of more importance (of course you can watch other types too). To locate them:
  - Check out the videos on Youtube with a high view count (link). A high view count indicates bigger impact.
  - Check out the popular videos people liked on Twitter (effective if you follow the right group).
  - Check out how many attendees signed up to watch a video (example of Vancouver Summit, see the “Attendees” part). It shows how many people said “I want to watch” this video.
  - Check out the videos you are interested in by track (link). Track means the type of a presentation, for example, storage, operations, enterprise IT strategies, etc.
  - Watch the presentation level (example: beginner). Choose what fits you.
- Don’t forget the #vBrownBag videos (link, search “austin”). They are 15min each, but usually very inspiring. #vBrownBag is not part of the Openstack Foundation; AFAIK it is a horizontal organization that borrows slots in all sorts of summits/conferences.
- Check out the Design Summit (link). This is where the features of the next Openstack version (Newton) are being discussed and planned. I wish there was video. The Etherpad content is pretty condensed, while the best way to understand what core developers have said is to attend on-site.
Besides, if you can go on-site to an Openstack Summit, listening to the questions asked by the audience (and the answers), asking your own questions, and talking with people are usually more important.
After this Openstack Austin Summit, I found out that the official site provided us with a new lively video page.
As far as I can see, the Openstack Summit is highly focused on users, especially the big users. The most favored content is usecases, practices, experiences, etc. Technical details, black magic, and design discussions are not the main theme, except that core developers routinely come on stage and share the newest updates.
For real technical stuff, you may need to attend the design summit (it spans the full week, with most events scheduled on the last summit day; example). The core developers summarize their discussions on design summit etherpads (I wish there were videos too). And remember that the most cutting-edge technical updates always appear on the developer mailing list, where the key is to learn how experts think about and discuss a new problem.
AT&T is a veteran and super user of Openstack. What they favor is common in the community: open white box architecture, multi-site deployment with combined local and global controllers, no vendor lock-in, and agility. But essentially I think it is about cost reduction, which is actually the most commonly seen motivation. I can see multi-site Openstack is getting mature and getting adopted now. Check out the differences between zone vs region vs cell vs aggregates (Note: an Openstack zone is very different from an AWS zone). Cell V1 is deprecated, while Cell V2 is still being actively developed (link). Murano is recommended by AT&T, and I personally like its object-oriented orchestration language; Magnum, however, is not seen. And eventually Mirantis, with its Fuel, is becoming more and more the canonical production-level Openstack distribution.
It is amazing that Swift, which uses Python as the data path language (with so many C++/C/Golang competitors), has become such a success today. So tweaking the Python interpreter is a must-do. I remember that Jython tries to run Python on the JVM, leveraging the JVM’s GC, JIT & HotSpot, performance, and maturity; I’m not sure of its status, and it seems to have little adoption. The default Python interpreter is CPython. PyPy, used in this video, however, features a JIT, a technique well known for speeding up interpreters. Using PyPy in Swift to improve performance is straightforward, and should have come out years ago (since Swift is written in Python). Now it has finally made progress, awesome progress, bravo!
NFV is hot and increasingly gaining heat in the telecom area as a way to adopt Openstack. But I think they are far from “carrier grade” now; the latter demands HA, security, demanding throughput & latency, manageability, and smooth upgrading & patching. For jargon such as Openstack vs OpenDaylight vs OpenFlow vs Open vSwitch, see here. Generally this video gives an introduction to Juju (integrated with Ubuntu), which eases Openstack development and provides support for various aspects such as containers, hyper-converged architecture, software-defined storage, NFV & SDN, deep learning, and Ceph monitoring. The interesting trend is that Ubuntu is increasingly becoming the canonical platform for Openstack and various opensource software. Although people say CentOS is more stable in production, it seems systemd draws too much resistance from the community.
This is a keynote. 7500 people attended the Austin Summit on-site (slightly less than the 9000 in Tokyo?). A key move from this summit is the Certified OpenStack Administrator (COA). We can see Openstack is preparing to become a mature, fundamental industry platform; more and more training activities occur at the summit, and now we have an official Openstack admin certification. In China, 99Cloud instantly established a COA training facility. The video announced the voted Openstack Super User winner: NTT (Tokyo), from the nominated candidates: GMO Internet, Go Daddy, Overstock, Universidade Federal de Campina Grande, Verizon.
AT&T is a veteran. AT&T Integrated Cloud (AIC) started from Juno, and is moving to Mitaka in 2017. Agility, CI/CD, and DevOps are the key enablers from Openstack; so like most adopters, AT&T is using Openstack widely in the development environment, but its use seems limited in production. They use KVM and VMware (vCenter) in a hybrid setup. They need to integrate Openstack with many other things, so writing Fuel plugins is a priority, and they also need to integrate Fuel with other management tools such as Ansible. Fuel upgradability is the key. There are things that AT&T needs but that are not present in the upstream community; AT&T needs to close the gap itself (and contribute).
HP’s Openstack, Helion, is a veteran, but has not been very successful. HP announced the shutdown of its public cloud in Oct 2015. This video demonstrates HPE’s lifecycle management of OpenStack using Ansible. To be honest, this was a hot topic in Openstack in the past but is already outdated now (and we have Fuel).
It is very interesting that this talk tries to dig into venture capitalists’ key concerns about startups based on opensource. For startups, evaluating the right product and market is hard. Another problem is scaling (from small business to big), for example, how to go to market, how to build the organization and leadership team, and how to think about services vs product. Although they are not as familiar with the technical part, venture capitalists help beyond money. When an enterprise wants technology, it wants it to be standard and wants to know where the support comes from, rather than wanting it free of cost; the former is what Red Hat is doing. Market dynamics are changing; companies are investing more in opensource rather than proprietary. The speaker expressed concerns about a trend from building great technology to building for money. Opensource vs open-core is interesting; although the latter is widely employed today, too many companies have been burned by their open-core models. Customers expect their vendors to make money (they want healthy vendors), but don’t like to be held hostage by them (no vendor lock-in is a big concern). The open-core model is slowly dying today (according to the speaker). Next generation angel is a new creation, which requires entrepreneurs to be under age 40, and a commitment that the investor spends enough time staying with the startups.
Intel and IBM are radical investors in various opensource ecosystems. It is interesting to think about how their strategies differ from those of other veteran IT vendors. Recent breakthroughs from Intel, DPDK & SPDK, 3D XPoint NVM, Intel PCIe SSDs, and the E5 v4 Cloud CPU, are bringing great momentum to the storage and cloud world. Native GPU access in virtual machines now relies on Intel GVT; remember that Intel VT was one of the foundations at the beginning of the virtualization age.
I can say that Mirantis makes a production-level Openstack distribution publicly accessible. Tales are that Mirantis was nearly bankrupt before Openstack. But Mirantis grabbed the big opportunity, and became the canonical Openstack flagship (and received a lot of investment). It doesn’t own a single line of proprietary code; the value comes from their selection of bug fixes, patches, and security enhancements (they go a step further than the community), their solid testing, and their good deployment designs (link). Mirantis is also the top-ranked upstream contributor. This video offers an interesting opinion: Openstack is 1 part technology and 9 parts people and process.
This video is organized entirely as a long and solid demo. Betfair shows how they use their tools, with Openstack underneath, to orchestrate package building, network creation, app deployment, load balancer setup, and rolling upgrades of their app. It is curious that no one actually uses Horizon; they each build their own UI. In a word, the demo is a killer usecase of Openstack in app lifecycle management.
What, I can’t find any Keynote video? Are they merged into featured videos? Or decomposed into a series of common videos? Weird … (There are still Tokyo keynotes on Youtube, but no Austin …)
There are no official recaps of the Austin Summit. But I’ve found one from Rackspace and one from HPE.
Rackspace: OpenStack Summit Austin 2016 - Racker Recap
Talking about how exciting the Summit is, at such a big scale, a great experience, bla, bla, bla … Nothing important.
Too short. A lot of big things are happening … Video ends.
Cinder, Ceph and Storage in Openstack
I’m always interested in Ceph, Cinder and various storage technologies in Openstack, on either the data path or the control path. The storage world has recently been evolving quickly: DPDK & SPDK, PCIe SSD, NVMe, NVDIMM, RDMA adoption, smart NIC, Ceph BlueStore, hyper-converged architecture, software-defined storage (SDS), etc. It is an age in which:
- Storage is again merging with computing. You can see Ceph (using commodity computing hardware), and hyper-converged architectures.
- The software-defined datacenter is the future. SDS is one of the pieces.
- Flash is getting more and more adopted. You can see that SAS/SATA SSD, PCIe SSD/Flash, NVMe SSD/Flash, NVDIMM SSD/Flash, persistent memory, etc, are quickly climbing up the stack. Storage (and network) is becoming too fast for the CPU and memory, so people are finding ways to mitigate the memory bandwidth and PCIe bandwidth limits, where you can see DPDK, SPDK, RDMA, etc. Many new technologies bypass the Linux kernel to achieve lower latency. Also, the kernel page table (and the hardware-assisted MMU) can now be used to address filesystem metadata, see SIMFS, interesting.
- Scale-out architecture is king. I have to say that one reason is that Intel cannot build more scaled-up CPUs (and architectures) now, so vendors need the industry to buy in to the scale-out strategy. And scale-out is more friendly to the cloud fashion and the commodity white box trend.
Cinder replication has long been under development; basically, it ran into trouble because vendors have very different design requests. Replication V1.0 worked at volume granularity, but was given up. Replication V2.x works at full backend granularity. V2.1 will hopefully be available to use (doc). This article is a good introduction to how replication works; but it doesn’t mention thaw. The video is by NetApp. The live migration and storage compatibility chart is a bit useful.
Joint video by NetApp and SolidFire. This video introduces using Magnum to orchestrate a container PaaS, using Manila to deploy a share (filesystem), and mounting it into Docker. Where is the “big data”?
Cinder volume-type and multi-backend have been available for a long time. This video teaches you how to use them.
Cinder and Manila are of course volume solutions for containers/Docker, one as block and one as filesystem. Docker now has volume plugins. Kubernetes supports Cinder (doc). The talk is by IBM and Dell, but promotes rackspace/gophercloud in the end.
This video is by UnitedStack. “More and more users want to leverage the advantages of ceph and enterprise storage. But with the restriction of glance we could only get images in one place and copy to another storage if we boot virtual machines in different backends.” Now we can use Glance Multi-location to solve the problem. It is also a usecase where we need more than one Ceph backend to be switchable in Glance.
Promoting using CoprHD in Openstack. AFAIK CoprHD can be used to replace Cinder (CoprHD supports Cinder API), or to be used as a Cinder driver. CoprHD actually has a pretty cool architecture and a much wider feature range covering block, object, filesystem, replication, and recovery.
Datera introduces its orchestration tool product. It talks a lot about the template.
emccode/rexray is a software-defined storage controller for container platforms such as Docker and Mesos. Magnum uses Rexray to provide persistent volumes for Mesos. Compared to Cinder, Rexray is more native to Docker, standalone, and simpler (also mentioned here).
CephFS has finally become production-ready (Jewel release). Integration of CephFS with Manila is OK but does not seem mature yet.
Presented by Cinder core developers.
The Replication API: V2.0 is disabled; V2.1 (Cheesecake) fails over the whole backend; it is available to use now, but not mature; for the vendor support list see here.
Backup supports full, incremental, and non-disruptive backup. Active-active HA is a very awesome design; there are a lot of moving parts, and it is still WIP. Check out the code if you like.
Multi-attach allows a volume to be attached to multiple hosts or VMs; it is not fully functional yet.
Rolling upgrade is OK now, but I guess it is not very mature; it includes RPC versioning, versioned objects, API microversions, and online DB schema upgrade. There are updates for Fibre Channel.
Some new backend drivers are added (now 53 in total); LVM, RBD, NFS are the reference architectures.
In Newton (the next release), we will have Replication V2.x (Tiramisu); continued work on active-active HA, rolling upgrade, and microversions; os-brick to help Cinder on Ironic baremetal; and async operation and reporting. Cinder Replication Tiramisu gives tenants more control over the replication granularity, e.g. a volume or a group of volumes (using Replication Groups).
AWCloud (海云迅捷) presents. They deployed a 200-node Openstack and tested it with Rally. Boot from volume often fails because of the low performance of Cinder. The problems are:
- HAProxy reports 504. It is too slow because the version is too old.
- The Cinder-api database connection driver blocks the thread (eventlet monkey patch doesn’t help). Solution is to increase the worker count.
- Cinder-volume is too slow to process a large amount of requests: create volume, initialize connection, attach. Solution is to run more Cinder-volume workers (private code).
- Cinder-volume hits race conditions while running multiple workers. Solution is to add a lock (private code).
- RBD rados calls block the thread, because they are not patched by eventlet.
- Downloading or cloning an image through Glance is too slow. Solution is to use the RBD store.
- The growth of database entries leads to a sharp decline in performance. The hotspot is the reservation table. Solution is to add a combined index, and clean up unnecessary data.
- Others: increase rpc_response_timeout, rpc_cast_timeout, osapi_max_limit.
Results: boot_server_from_volume from failure in concurrency=200, to all success in concurrency=500; create_volume from failure in concurrency=1000, to all success in concurrency=2500. Good presentation!
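To make the "combined index" fix for the reservation table a bit more concrete, here is a minimal sketch. It is not the presenters' private code: SQLite stands in for the real database, and the table, column, and index names are hypothetical, chosen only for illustration. It checks the query plan before and after adding the index, then cleans up rows that are no longer needed.

```python
import sqlite3

def plan_for(conn, sql, params=()):
    """Return the human-readable query plan lines SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the plan text

conn = sqlite3.connect(":memory:")
# Hypothetical schema; the real Cinder table has different columns.
conn.execute(
    "CREATE TABLE reservations ("
    " id INTEGER PRIMARY KEY, project_id TEXT, deleted INTEGER, expire INTEGER)"
)
conn.executemany(
    "INSERT INTO reservations (project_id, deleted, expire) VALUES (?, ?, ?)",
    [("p%d" % (i % 50), i % 2, i) for i in range(5000)],
)

query = "SELECT id FROM reservations WHERE project_id = ? AND deleted = ?"
before = plan_for(conn, query, ("p1", 0))  # no index yet: full table scan

# The combined (composite) index covers both columns used in the WHERE clause.
conn.execute(
    "CREATE INDEX ix_reservations_project_deleted"
    " ON reservations (project_id, deleted)"
)
after = plan_for(conn, query, ("p1", 0))   # now an index search

# Cleaning unnecessary data: drop rows already marked as deleted.
conn.execute("DELETE FROM reservations WHERE deleted = 1")
remaining = conn.execute("SELECT COUNT(*) FROM reservations").fetchone()[0]

print(before, after, remaining)
```

The same idea applies to MySQL or PostgreSQL, where EXPLAIN would be used to confirm that the hot query actually uses the new index.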
Presentation by Tesora, NetApp and Red Hat. At 5min22s there is a summary of why people want Openstack:
- 97% is to standardize one platform API
- 92% to avoid vendor lock-in
- 79% to accelerate innovation
- 75% to improve operational efficiency
- 66% to save money
“Until recently, the OpenStack Trove DBaaS project only used the Cinder block storage service for database storage. With joint development work from NetApp, Red Hat and Tesora, it is now possible to run database workloads on OpenStack using Manila-based file shares.”
An introduction to Red Hat Ceph Storage.
Cassandra deployment demo to introduce VMware Integrated OpenStack, VMware NSX, and PernixData. The case study is pretty detailed, with cluster layout design and benchmark results.
Presentation by Mirantis. Migrating data from one storage backend to another, or between clouds. The challenges are usually network limits, and how to avoid impacting the SLA. Approaches can be:
- dd from block to block. Simple, slow, and doesn’t allow data updates.
- Rsync. It’s file but not block level.
- Use storage backend’s replication. It is vendor dependent.
- Just connect the storage backend to the other side.
They use the bbcp protocol to accelerate block migration. The command dd | pv | dd looks useful. For Ceph, we have rbd export-diff and rbd import-diff, as well as rbd export and rbd import; this is called incremental snapshot transfer. Sébastien’s blog uses DRBD and Pacemaker. The MOS/Fuel plugin helps deploy an existing Ceph as primary storage, i.e. connect instead of move; it is still under development.
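The idea behind incremental snapshot transfer is to ship only the blocks that changed between two snapshots, instead of copying the whole image again. Below is a minimal pure-Python sketch of that idea. It does not call Ceph or reproduce the rbd diff format; the function names and the tiny block size are assumptions made up for the demonstration.

```python
BLOCK_SIZE = 4  # unrealistically small, only to keep the demo readable

def export_diff(old, new, block_size=BLOCK_SIZE):
    """Return [(offset, data)] for every block of `new` that differs from `old`."""
    if len(old) != len(new):
        raise ValueError("this sketch assumes both snapshots have the same size")
    diff = []
    for offset in range(0, len(new), block_size):
        chunk = new[offset:offset + block_size]
        if chunk != old[offset:offset + block_size]:
            diff.append((offset, chunk))
    return diff

def import_diff(base, diff):
    """Apply a diff produced by export_diff on top of a copy of `base`."""
    image = bytearray(base)
    for offset, chunk in diff:
        image[offset:offset + len(chunk)] = chunk
    return bytes(image)

snap1 = b"AAAABBBBCCCCDDDD"   # state already present on the destination
snap2 = b"AAAAXXXXCCCCDDDY"   # newer state on the source
delta = export_diff(snap1, snap2)
restored = import_diff(snap1, delta)

print(delta)              # only the two changed blocks travel over the network
print(restored == snap2)
```

Only 8 of the 16 bytes are transferred here; on a mostly idle volume the saving is what makes migration over a limited network feasible.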
Object storage has become more of a choice for many workloads. There are still traditional applications that need filesystem access. Swift and Manila solve the data sharing needs for VMs. Presentation by IBM.
Ceph RGW, the object storage, is actually pretty popular. Many people are deploying RGW. The POD architecture of Ceph is interesting, even though it may not be really necessary. VMs using ephemeral storage vs Ceph, a summary. For Ceph RGW stack configurations, see here. This video shares in detail their Ceph and RGW config in both hardware and software. The orchestration of Ceph is by Chef. Their tools are on github. The testing tools for Ceph and RGW:
- Ceph: RADOS Bench, COS Bench, FIO, Bonnie++
- Ceph RGW: JMeter. Test load by requesting from a cloud.
Presentation by IBM and HPE. This talk is about the future; Swift object encryption is not ready yet. The encryption can be supported at the hardware disk level, the virtual block device level (LUKS, dm-crypt), or the Swift encryption middleware level. BYOK (bring your own key) can be supported only in the last approach. Here is the encryption spec and code.
Presented by Big Switch. This talk is about OCP hardware. It is still at an early stage, so this talk is pretty “soft”.
Talking about backup in Openstack, there is a project, Freezer, focusing on the data level, and Smaug focusing on the application level. Besides, you can apply the common standard backup practices, for DB, filesystems, /var/lib/xxx, /etc/xxx, /var/log/xxx, etc.
This presentation focuses on Ceph cross-site backup. RBD mirroring (finally, no need to export-diff now) is used to replicate Ceph. The architecture design replicates each level of Openstack. RBD mirror is available with Ceph Jewel, and with the upcoming Red Hat Ceph Storage 2.0. RBD mirror replicates the journal under the hood; it is asynchronous replication. It is supported in Cinder Replication V2.1. The current gap mainly resides in metadata replication. The new project Kingbird provides centralized services for multi-site Openstack deployments.
Presented by Inspur (浪潮) & 99Cloud. Write & read affinity greatly improve the multi-site performance of Swift. But due to eventual consistency, new data may not yet have been replicated to the appropriate location when a site failure happens. Basically this talk covers some practices around read & write affinity.
The ratio of CPU performance and bandwidth to SSD speed is dropping dramatically, because SSD speed is climbing quickly. The CPU / DRAM bandwidth bottleneck is another problem. SAN 2.0 - NVMe over Fabrics; this is an interesting idea:
- NICs will forward NVMe operations to local PCIe devices
- CPU removed from the software part of the data path
- CPU is still needed for the hardware part of the data path
- IOPS improve, BW is unchanged
- Significant CPU freed for application processing
To me, it looks like the storage industry is evolving along a spiral path. The rise of new NVM/SSD media may bring back the past-style SAN architecture again. But this time, the NVMe protocol is connected directly on the PCIe bus, compared to the past-style expensive SCSI. Storage media access is bypassing the kernel, bypassing the CPU, bypassing memory, just direct RDMA; so it’s kind of like a computer controller connecting to a bunch of disk arrays, even though the disk array box is actually a computer, its CPU/memory/OS is not used or necessary. New technologies also bring in a lot of proprietary hardware configurations, but they are really much faster than what a pure-software white box can do now. Finally, rack-scale architecture is heard a lot in relation to the storage market.
Presented by OpenIO. Commodity hardware + software-defined storage = hyper-scalable storage and compute pool. Track containers rather than objects. A grid of nodes with no consistent hashing, never rebalancing data. Dynamic load balancing by computing scores for each service in realtime. These designs are interesting. The OpenIO object storage is integrated at the Swift proxy server level.
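To get a feeling for score-based placement as opposed to consistent hashing, here is a minimal sketch. It is only an assumption of how such a scheme could look, not OpenIO's actual algorithm: each service has a hypothetical realtime score, services with score 0 are skipped, and new data is placed by a weighted random pick, so no existing data ever has to move when scores change.

```python
import random
from collections import Counter

def pick_service(scores, rng):
    """Pick one service name, weighted by its current score (0 means excluded)."""
    candidates = {name: score for name, score in scores.items() if score > 0}
    if not candidates:
        raise RuntimeError("no service is currently available")
    names = list(candidates)
    weights = [candidates[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(42)  # fixed seed so the demo is repeatable
# Hypothetical scores: node-a is idle, node-b is full or down, node-c is busy.
scores = {"node-a": 90, "node-b": 0, "node-c": 10}
counts = Counter(pick_service(scores, rng) for _ in range(1000))

print(counts)  # node-a gets most placements, node-b gets none
```

The trade-off is that object locations must be recorded in a directory (which fits "track containers rather than objects"), because they can no longer be recomputed from a hash.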
The “middleware” here originates from Python’s WSGI server design. It allows you to add customized features to each part of Swift. Middleware can be added to the Swift WSGI servers by modifying their paste configuration file. Anyway, middleware is the decorator design pattern introduced by Python WSGI to overlay server features; it’s useful. Swift itself actually uses a lot of middleware, see its config file.
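The decorator idea is easy to see in a few lines of plain WSGI. The sketch below is not Swift code: the app, the middleware class, and the X-Demo header are hypothetical names for illustration. The middleware wraps any WSGI app and adds a response header without the app knowing about it.

```python
from wsgiref.util import setup_testing_defaults

def app(environ, start_response):
    """The innermost WSGI application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]

class HeaderMiddleware:
    """Wraps a WSGI app and appends one extra response header."""

    def __init__(self, wrapped, name, value):
        self.wrapped = wrapped
        self.name = name
        self.value = value

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Add our header, then hand over to the real start_response.
            headers = list(headers) + [(self.name, self.value)]
            return start_response(status, headers, exc_info)
        return self.wrapped(environ, patched_start_response)

def call(wsgi_app):
    """Invoke a WSGI app once with a fake request and capture the response."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}
    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], captured["headers"], body

pipeline = HeaderMiddleware(app, "X-Demo", "1")
status, headers, body = call(pipeline)
print(status, headers, body)
```

Because the wrapped object is itself a WSGI app, middleware can be stacked in any order, which is what a paste pipeline configuration expresses.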
Presented by EMC. Promotes the idea of software-defined storage (SDS), and EMC ScaleIO. Shares best practices for working with SDS. Compared to Ceph, ScaleIO is purpose-built, native block, with less trade-off on performance.
Presented by SwiftStack. A practical talk introducing Swift advanced features. Concurrent GETs reduce first-byte latency. To optimize multi-region, use read/write affinity, memcache pooling, and async account/container updates. Swift 2.7 now allows 1-byte segments in Static Large Objects (previously the minimum was 1MB).
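Why concurrent GETs reduce first-byte latency can be shown with a small simulation. This is a simplified sketch, not Swift's proxy code: the replicas are fake functions that just sleep, the names and delays are made up, and all requests are fired at once, which may differ from how Swift actually schedules them.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch(replica, delay):
    """Simulate a GET to one replica that answers after `delay` seconds."""
    time.sleep(delay)
    return replica

def concurrent_get(replicas):
    """Ask all replicas at once and return the name of the first to answer.

    `replicas` maps a replica name to its simulated latency in seconds.
    """
    pool = ThreadPoolExecutor(max_workers=len(replicas))
    try:
        futures = [pool.submit(fetch, name, delay)
                   for name, delay in replicas.items()]
        first = next(as_completed(futures))
        return first.result()
    finally:
        # Do not wait for the slow replicas; their answers are simply dropped.
        pool.shutdown(wait=False)

start = time.monotonic()
winner = concurrent_get({"replica-a": 0.3, "replica-b": 0.05, "replica-c": 0.3})
elapsed = time.monotonic() - start
print(winner, round(elapsed, 2))
```

A sequential client that tried replica-a first would wait for the slow node; here the latency is that of the fastest replica, at the price of extra backend requests.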
This is a Hands-on Workshop, lasting 1h26m. The Youtube view count is 278, pretty high against the average, so it looks well received. There is a demo of a network service chain: external -> firewall + lb -> lb -> app -> db. The demo is presented on Red Hat Enterprise Linux Openstack Platform (not Horizon, well).
Presented by NetApp & SolidFire. Back up volume snapshots from Cinder to Swift, to a NetApp appliance (dedup & compression is good), or to the cloud through a cloud gateway (cloud-integrated storage appliance). Demo 2 shows the backup workflow of SAP HANA on Manila. Next they introduce Manila Share Replication. Replication is used as non-disruptive backup.
Presented by NTT. Swift is a good solution for backup / disaster recovery. Swift uses an HTTP REST API. But the customer, as mentioned in this video, wants NFS or iSCSI to be compatible with their legacy applications. The solution is to mount Swift as a filesystem using Cloudfuse. But note that Swift is optimized for large files rather than lots of small files. There are various issues when trying to use Swift as NFS/iSCSI to solve the backup problem. This talk discusses them in detail.
An introduction to the Scality Ring Storage product. Cumulus Linux is interesting: a networking-focused Linux distribution, deeply rooted in Debian. Note that NFV runs VNFs on commodity servers, so optimizing the kernel is important; from this emerge dedicated kernel providers, such as Cumulus Linux. Scality is a fully distributed P2P architecture with no center at all.
The Scality Ring Storage product is a unified storage platform, able to support Swift, Glance, Cinder, and Manila (each has a driver). It is capable of replication, erasure coding, geo-redundancy, self-healing, etc. At 9m50s there is an Openstack storage usecase diagram against storage type and size.
Unified storage solution of Openstack, interesting; AFAIK some companies choose Ceph for the same purpose. EMC solutions are however more usecase specific, AFAIK: Block is ScaleIO/XtremIO, filesystem is Isilon, object storage is ECS.
Promoting EMC XtremIO. The problems to solve are: the IO blender effect at large scale, VM provisioning & clone, and dynamic policy-based operations. XtremIO is all-flash and blazingly fast. Content-based addressing is a key design of XtremIO. Actually the best technical videos to introduce XtremIO are the one from Storage Field Day and the one from SolidFire. XtremIO is the #1 all-flash market leader with 34% share. At 11m51s there is a comparison graph of scale-up vs scale-out on the rack shelf; scale-up is actually not able to survive shelf-level failure (e.g. power, switch). Each XtremIO controller provides 150K IOPS; scaling out to 16 boxes gives 2M IOPS. XtremIO keeps 100% of metadata in memory, inter-connected with an RDMA fabric. XtremIO integrates with Openstack Cinder to provide block storage.
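Content-based addressing means a block is located by a fingerprint of its data rather than by where it was written, which gives deduplication and cheap clones almost for free. The sketch below only illustrates the concept and is not XtremIO's implementation; the class and method names are made up, SHA-256 is an arbitrary choice, and reference counting and garbage collection are left out.

```python
import hashlib

class ContentAddressedStore:
    """Toy block store where identical data is stored only once."""

    def __init__(self):
        self.blocks = {}   # fingerprint -> block data
        self.volumes = {}  # (volume name, logical block address) -> fingerprint

    def write(self, volume, lba, data):
        fingerprint = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(fingerprint, data)  # dedup: keep the first copy
        self.volumes[(volume, lba)] = fingerprint
        return fingerprint

    def read(self, volume, lba):
        return self.blocks[self.volumes[(volume, lba)]]

    def clone(self, src, dst):
        """Clone a volume by copying metadata only; no data block is copied."""
        for (volume, lba), fingerprint in list(self.volumes.items()):
            if volume == src:
                self.volumes[(dst, lba)] = fingerprint

store = ContentAddressedStore()
store.write("vm-template", 0, b"boot sector")
store.write("vm-template", 1, b"zeros")
store.write("vm-template", 2, b"zeros")   # same content, stored once
store.clone("vm-template", "vm-1")
store.write("vm-1", 1, b"user data")      # the clone diverges on write

print(len(store.blocks), store.read("vm-1", 0), store.read("vm-template", 1))
```

This is why VM provisioning & clone can be so fast on such an array: a clone is a metadata operation, and the metadata lives in memory.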
Presented by SwiftStack. 8m13s is a nice summary of monitoring components: agent, aggregation engine, visualizer, alerting, and the popular solutions for each of them. 10m11s categorizes the types of data to monitor, and the monitoring lifecycle: measurement, reporting, characterization, thresholds, alerting, root cause analysis, remediation (manual/automated). 19m49s records the key points to monitor in Swift: cluster data full, networking including availability and saturation, proxy states such as CPU and /healthcheck, auditing cycles, replication cycle timing. The checks can be installed on the load balancer. The later part of this talk is a demo.
Presented by Ubuntu. Canonical is the company behind Ubuntu. Ubuntu is quite active at this Summit. Comparing raw disk (3-year refresh) vs AWS storage prices:
- SSD: $12/TB/month
- HDD: $1.5/TB/month
- EBS SSD: $100/TB/month
- EBS HDD: $45/TB/month
- S3: $30/TB/month
- Glacier: $7/TB/month
- S3: $90/TB transfer out
- Glacier: $10/TB transfer out
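To put these prices in perspective, here is a small worked calculation. It assumes the figures above are in dollars per TB per month, and the 100 TB capacity is a made-up example. It ignores transfer-out fees, and also the servers, power, and operations cost that raw disks need on top.

```python
# Prices from the list above, assumed to be USD per TB per month.
PRICE_PER_TB_MONTH = {
    "raw SSD": 12,
    "raw HDD": 1.5,
    "EBS SSD": 100,
    "EBS HDD": 45,
    "S3": 30,
    "Glacier": 7,
}

def yearly_cost(capacity_tb, price_per_tb_month):
    """Storage cost in USD for holding `capacity_tb` for 12 months."""
    return capacity_tb * price_per_tb_month * 12

CAPACITY_TB = 100  # hypothetical capacity, only for the example
yearly = {name: yearly_cost(CAPACITY_TB, price)
          for name, price in PRICE_PER_TB_MONTH.items()}
ratio = yearly["S3"] / yearly["raw HDD"]

for name, cost in sorted(yearly.items(), key=lambda item: item[1]):
    print("%-8s $%9.0f / year" % (name, cost))
print("S3 costs %.0fx raw HDD for the same capacity" % ratio)
```

Under these assumptions, 100 TB costs $1,800 a year on raw HDD versus $36,000 on S3, which is the gap that in-house solutions aim to exploit.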
8m45s is a summary of how recent new technologies save cost (is low-power architecture ready to use now?). So how does Ubuntu help reduce storage cost? ZFS, Ceph, and Swift. Deutsche Telekom evaluated Manila, summarizing that Manila is enterprise mature, though some things need improvement.
Finally! CephFS is production-ready in Jewel release. For previous history, see CephFS Development Update, Vault 2015.
CephFS has “consistent caching”. The client is allowed to cache, and the server invalidates the cache before a change, which means the client will never see stale data. Filesystem clients write directly to RADOS. Only active metadata is stored in memory. CephX security now applies to file paths. Scrubbing is available on MDS. Repair tools are available: cephfs-data-scan, cephfs-journal-tool, cephfs-table-tool. MDS has standby servers; they replay MDS logs to warm up the cache for fast take-over. CephFS sub-tree partitioning allows you to have multiple active MDSes. Directory fragmentation allows you to split a hot directory over many active MDSes; it is not well tested. Snapshots are available now. You can create multiple filesystems, like pools or namespaces (not well tested). Remaining pain points: file deletion pins the inode in memory, the client trust problem (there is no control at all except separating clients into namespaces/tenants), and some tools to expose states are still missing (dump individual dirs/files, see why things are blocked, track access to a file).
Presented by Comcast. The storage node uses NVMe for the journal (but SATA HDDs for data). To benchmark, use FIO for block and Cosbench for object. Remember to test scaled-out performance. Issues encountered:
- TCMalloc eats 50% CPU. Solution is to give it more memory
- Tune the NUMA. Map CPU cores to sockets; map PCIe devices to sockets; Map storage disks (and journals) to the associated HBA; pin all soft IRQs to its associated NUMA node. Align mount points so that OSD and journal are on the same NUMA node.
General performance tips below
- Use latest vendor drivers (can be up to 30% performance increase)
- OS tuning focuses on increasing threads, file handles, etc
- Jumbo frames help, particularly on the cluster network
- There are flow control issues with 40GbE network adapters; watch out for dropped packets
- Scan for failing disks (slow responding disks), take them out
Next I will pick the popular Openstack Austin Summit videos, by Youtube view count, that I’m interested in watching. It is 3 weeks after the Summit, and the rough average view count is 100, so 200+ usually means a video is popular. A few videos have over 1000 views, such as Why IBM is Betting on OpenStack.
Declaring your own move is a powerful strategy according to game theory. IBM contributes a lot to the Openstack community. Bluemix is based on CloudFoundry, and adds a rich set of functionality including CI/CD, collaboration, IoT, and serverless. IBM identified that customers need an Openstack distributor rather than the community source directly. Openstack is also a good base on which IBM builds its cloud solution stack.
IoT Platform on SAP HANA Platform on CloudFoundry on Openstack helps customers transform their business. This is a pretty sales-oriented video, without much detail. SAP and CF (and also GE) have been good partners from early on. CF sells well in North America, but is not even much known in China.
Openstack multi-site is receiving increasing attention. Cell provides a unified API endpoint for multi-site. It is in use by large deployers such as Rackspace, CERN, and GoDaddy. Check out the differences between zone vs region vs cell vs aggregates (note: an Openstack zone is very different from an AWS zone). Cell V1 is deprecated, while Cell V2 is still being actively developed (link).
Generally there are two ways to create a large cluster: 1) just create one cluster with all nodes; 2) create a lot of small clusters and combine them in some way. Cell works for (2). Note that many scale-out distributed systems actually cannot scale that far; if you use (1), the message queue, scheduler, DB, or whatever else may start to malfunction when the cluster is large enough. The easier way is (2), where only small clusters exist, and they are either combined behind a unified API or kept separate by letting the user select the region.
I remember that many online games let the player select a region; that is (2), which greatly lowers the difficulty of creating a large cluster (and serving a large number of users). Internet companies also use a cell architecture divided by user accounts, to reduce the difficulty of building a large cluster, and even to achieve active-active across multiple datacenters located in different cities.
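A minimal sketch of approach (2) with a unified front end: each user account is mapped deterministically to one small cluster, so no single cluster has to hold everyone. This is my own illustration, not how Nova Cells is implemented; `CellRouter` and the cell names are hypothetical.

```python
import hashlib


class CellRouter:
    """Toy unified API front end that maps each account to one small cluster."""

    def __init__(self, cells):
        if not cells:
            raise ValueError("at least one cell is required")
        self.cells = list(cells)

    def cell_for(self, account_id):
        # Hash the account id so the same account always lands in the same cell.
        digest = hashlib.sha256(account_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.cells)
        return self.cells[index]


router = CellRouter(["cell-a", "cell-b", "cell-c"])
print(router.cell_for("alice"))  # always the same cell for the same account
```

Plain modulo hashing reshuffles accounts whenever the number of cells changes, so a real deployment would more likely keep an explicit account-to-cell mapping table; the hash only keeps the sketch short.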
This is a pretty detailed talk about how Cell V2 works. There are a lot of moving parts. Looking forward to it. Cell V2 is expected in Fall 2016.
Presented by TCP Cloud. This video is very hot (and short), with 2000+ views (the average presentation view count is only 100). IoT creates a lot of usecases for common users and for the industry. Sensors -> IQRF -> gateway -> internet -> datacenter based on Openstack, with analysis, big data, visualization, and API access; see the architecture at 5m24s. Containers (in the gateway) and virtual machines (in the processing platform) are used together, connected with overlay network solutions. Note that they can run Kubernetes, which starts various micro-services directly on the IoT gateway located in the city, or in any remote location; this extends their platform. The final demo is pretty cool, see from 5m41s.
Presented by Rackspace. A lot of attention on automated security. What they use: Security Technical Implementation Guides (STIG) from the Defense Information Systems Agency (DISA). It covers a lot of critical aspects. They embed the security configurations in openstack-ansible-security, along with documentation. How to get it: set apply_security_hardening to True.
Google and CoreOS demo deploying a containerized Openstack running on Kubernetes, including quickly adding a Nova compute node, automatic failure recovery, and black-white publishing of Horizon. The overall architecture is called GIFEE (Google Infrastructure for Everyone Else). Kubernetes needs some modification to allow Openstack components to invoke (privileged) hypervisor features. Pretty Google style.
DreamCompute pricing comes with a predictable bill, i.e. cheaper. DreamCompute is an Openstack-powered public cloud, generally available since April 2016. Because of Ceph all-SSD copy-on-write, DreamCompute can do fast VM creation. Note that Ceph was created at DreamHost. Network setup: VXLAN managed by Cumulus, hardware encap/decap on white box switches, L3+ services via Astara, dual-wired 10G TORs on each rack, TOR uplinks at 40G to the spines, 20G effective between every node. Storage setup: hyper-converged architecture.
OVH is a company name. OVH doesn’t use a private network; every instance gets a public IP. This is a high-level video, introducing how OVH uses Openstack and how large the deployment is. Not much detail.
Presented by Mirantis and Intel. The first part is an introduction of Mirantis and its design. Murano is an app catalog. App interoperability (because MuranoPL is an object-oriented language) is a good design. Magnum is a provisioning platform (plus scaling and management). Magnum vs Murano: the decision is to integrate them both, providing Magnum-based Kubernetes, Swarm, and Mesos apps for Murano.
The Taiwan Openstack Application Hackathon was launched in March 2016. This video is basically a promotion of this event. At the end the winning team demos their creation (collecting guitar hand motion data and analyzing it in Openstack Sahara).
Popular, But No Time to Watch
(Note that several of the popular videos have been moved to my “interested” sections.)
By Ericsson. Get familiar with the tools, do the dirty work, do code reviews, focus on a project/feature, and enter a large project through code reviews or priority bugs/features. More advanced, to drive the agenda: know the usecases, solutions, why, alternatives, and usability; find supporters via the mailing list and events; use the Big Tent (to create a new project). Inter-project features and communication are becoming more important these days. Before starting big features, talk with the core devs to make sure they support them (and that they align with the project design decisions). Focus, be professional, be collaborative.
The problem Kuryr aims to solve is the overlay^2 network of containers nested in VMs, which results in a great performance penalty. According to the video, I think there isn’t much actual progress. Magnum plans to integrate with Kuryr. Let’s wait.
Two technologies: NSH-based SFC and MPLS/BGP VPN-based SFC. Comparison at 30m28s. Related platforms: Openstack Tacker as the orchestration platform, OpenDaylight SDN Controller, OPNFV Apex Installer Platform, and a custom OVS with the NSH patch. There is quite a lot of diversity in the implementations (but not fragmentation, according to the video).
Tacker orchestrates VNFs. Tacker Multi-Site allows operators to place, manage and monitor VNFs in multiple OpenStack sites. It works closely with OPNFV and standards bodies like ETSI NFV and OASIS TOSCA. 99Cloud is the 3rd top contributor to Tacker. The later slides introduce the Tacker architecture, how it works, and various features. Multi-site VIM support is interesting.
Presented by ZeroStack. This talk presents how workloads interfere with each other in Openstack, from a several-month-long study of running workloads in different configurations on ZeroStack. They use micro-benchmarks as well as enterprise workloads such as Hadoop, Jenkins and Redis. The experiment setup is shown in great detail. SSD backends cope with random reads/writes well, compared to HDD. Both VMs perform well while storage is not saturated, but drop significantly after that. Lessons learned: use SSD, use local storage, and there is no need to use reliable storage for Hadoop or Cassandra, which have built-in replication. A single VM is not able to saturate a 10Gbps NIC due to CPU saturation; throughput is OVS bound; GRE encap/decap consumes a lot of CPU. Suggestions for network: leverage DPDK, explore VLAN-based solutions. Anyway, the overall observations and conclusions are a bit too plain … I remember that Google Heracles has done quite a lot of in-depth analysis.
DPDK, (Ironic,) and SR-IOV are new technologies that can significantly boost performance. To use them: SR-IOV VM driver, DPDK VM driver. There are however a lot of issues to solve before they work together; the later slides focus on them. (However I want to know how to enable DPDK and SR-IOV in the compute host or VM: this? this? …).
Presented by Juniper. Value moves up the value stack and away from Telcos. The needs are to enable applications that are closer to customers, and integration of existing DC technologies with network & operations. Usecases span from L2-L7 networking, security services, and 3GPP, to CDN, voice and video. The current gap between what is needed and what is available requires various solutions (or compromises, or just waiting). Generally this is a pretty good video with a deep understanding of Telco needs.
Zenoss promotes their monitoring solution: model, events, metrics. It uses no extra agents (it uses what is already there). There is Ceilometer integration through a Ceilometer collector, and integration with Neutron, etc. Impact analysis generates a dependency graph to show the risk of a failure. I remember that a new project, Openstack Vitrage (originating from Nokia), is able to do root cause analysis; interesting but not receiving much attention yet. Not sure how much of the machine learning / detection / prediction is actually ready for production use. As the slides illustrate, the Zenoss monitoring solution is quite comprehensive; I wish it were open source.
Presented by Redhat, to showcase the list of its Openstack platform customers: FICO, betfair, Verizon, etc. See 0m44s. Nothing related to “trust” technology.
Presented by IBM & Intel (and ZTE). Watcher governs Openstack and provides resource optimization, e.g. energy-aware optimization, workload consolidation, rebalancing, etc. It includes monitoring in the closed loop. Users can template their own strategies. Watcher can run in advise mode, active mode, and verbose mode. It reminds me of VMware DRS, which uses live-migration to consolidate VMs and save power. Good project orientation.
Interesting, But Watch When Have Time
vBrownBag sessions are less than 15 min each. There were about 70 of them at Openstack Austin Summit. vBrownBag has dedicated meeting rooms; I think they are far from saturated, and there must still be many empty slots available.
Presented by Veritas. End-to-end storage QoS:
- Application SLA: max IOPS, min IOPS, workload priority, latency
- Efficient utilization of all tiers of storage
- Storage congestion control
- Resolving distributed IO dependencies
- Data management IO prioritization
- I added: quotas, throttling, reservation, …
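To illustrate just the "max IOPS" part of such an SLA, here is a small token-bucket throttle. It is a generic sketch of the throttling technique, not the Veritas implementation and not the Cinder QoS API; `IopsLimiter` and its parameters are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
class IopsLimiter:
    """Token bucket that caps the IO rate of one workload (hypothetical sketch)."""

    def __init__(self, max_iops, burst=None):
        self.rate = float(max_iops)  # tokens refilled per second
        self.capacity = float(burst if burst is not None else max_iops)
        self.tokens = self.capacity  # start with a full bucket
        self.last = 0.0              # time of the previous call, in seconds

    def allow(self, now):
        """Return True if one IO may be issued at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last)
        # Refill according to elapsed time, never above the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With `max_iops=2`, two IOs are admitted immediately, the third is rejected, and another one becomes available half a second later. A min-IOPS guarantee or priority scheduling needs more than this, for example a scheduler that arbitrates across workloads.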
The QoS ability is provided via
- Cinder QoS APIs
- Scheduling filters for VM and storage affinity
- Commodity storage: Near Storage: implemented in IO stack of hypervisor
- Awareness of direct-attached storage
- Adaptive features/feedback
OpenFlame by Veritas - a software-defined storage solution for Openstack powered private clouds.
Juniper containerized SRX (vs virtualized SRX) to monitor east-west traffic (a firewall). Will containerized network functions override VNFs one day?
Presented by Veritas. Still introducing their storage solution; there are a few quite interesting designs. To enable efficient support for VM live migration:
- Use direct-attached storage (DAS) on compute node
- Make sure the VM always has a local replica
- Storage lazily moves with VM during live migration
- Nova/Cinder orchestrate the storage movement
Openstack native hyper-converged storage solution, interesting. Ceph however is not designed for this. Veritas OpenFlame.
The CVE vulnerability database; pay attention. There are tons of Openstack-specific vulnerabilities disclosed on CVE. Make sure you patch faster than the attackers. This talk walks through the important CVE exploits; horrible. I guess that soon (or already?) scanners for Openstack vulnerabilities on the public internet will show up.
Ericsson Network Manager - Analytics: insights of VNF.
Presented by Mirantis & Cisco. Workload testing:
- Use as realistic a production environment as you can
- Compare results against baseline virtualization and baremetal results
- Use incremental adjustments to flavors to find sweet spot in CPU and Memory requirements
Proper flavor definition. Monitoring, metrics and elasticity calculations. Develop triggers for expansion.
- Elasticity: the amount of daily growth and shrinkage relative to total capacity
Available capacity management software for Openstack: Talligent (commercial), Rightscale Cloud Analytics (commercial), Cloud Kitty (open source). Generally, this talk provides a lot of good practices for capacity planning. Worth bookmarking.
Presented by AT&T. Introducing the AIC ECOMP architecture. Basically this is automated management, monitoring & auto-scaling, multi-datacenter hybrid cloud support, and a visualized control panel.
VMware contributes to the community and provides drivers, NSX, and VIO (VMware Integrated Openstack) for Openstack. But … what most Openstack adopters want is
- 97% to standardize on one platform API
- 92% to avoid vendor lock-in
- 79% to accelerate innovation
- 75% to improve operational efficiency
- 66% to save money
And now Openstack shows much more feature diversity (NFV, big data, IoT, container, etc) than VMware (unless you pay for Pivotal again).
Presented by SwiftStack. An introduction to ProxyFS, which provides a filesystem on top of Swift object storage. The key is that ProxyFS is integrated into the Swift Proxy, rather than built on top of the Swift API. There are some comparisons against SwiftStack Filesystem Gateway. (Swift Filesystem supports Hadoop. And there is S3QL, which supports file access to Google Storage, Amazon S3, or Swift.) ProxyFS uses log-structured files/directories to store data in Swift. Looks promising.
Design Summit (Newton)
Openstack Austin Summit is for the Mitaka release. But the Design Summit is for Newton. Looking into the future.
Openstack Summit is highly focused on users. Most of the content is usecases, practices, experiences, project status, IT strategies, vendor promotions, etc. For the real technical stuff, you may need to attend the Design Summit (it spans the full week alongside the main Summit). Core developers rally to discuss important decisions for the new release cycle, and summarize them on etherpads (I wish there were videos too).
Besides, the most cutting-edge technical updates always appear on the developer mailing list, where the key is to learn how experts think about and discuss a new problem. Checking the blueprint status (example) and code reviews (example) is helpful too.
I mainly focus on Cinder and Magnum status:
- Replication Next Steps (Tiramisu, in granularity of volume group)
- Rolling upgrades - next steps (Good to borrow to other systems)
- HA Active/Active (Awesome design, and long work)
- Scalable backup (Interesting)
- Cinderclient to OpenStackclient parity
- Changes to our current testing process
- Move Cinder Extensions to Core
- Move API docs in tree
There are tons of details. Basically each core dev governs their specific feature. See it for yourself (together with the blueprints). So don’t say Cinder is too mature to move. A lot of work needs to be done :-)
A lot of topics in Magnum too
- The bay driver design (Finally! When will CloudFoundry and Openshift come into Magnum?)
- Lifecycle operations for long running bays (Rotating certs, soft/hard reset, dynamically reconfigure, automate failover/recovery, etc. Learning from Carina.)
- Magnum scalability / discussion of async implementation (Wait to learn from the design)
- Container Storage - Support for Container Data Volumes (What? OverlayFS is 5 - 7 times faster than devicemapper?)
- Container Network - Integrate a Kuryr Networking Driver (The overlay^2 network performance penalty is biting)
- Ironic Integration - Add Support for Running Containers in Baremetal (Ironic has long been ignored but quite necessary in enterprise usecases)
- Challenges in Magnum Adoption - Experiences and How to Address
- Unified Container Abstraction for Different COEs
- Magnum HEAT template versioning (To keep Magnum upgrades compatible)
- Bay monitoring: health, utilization (Notifications and Ceilometer integration, etc)
See the details in each (together with the blueprints).
Other Sources of Summaries
I like the Openstack Austin Summit observation written by Sammy Liu: the Openstack community is marching into the “tier II” area: bigdata, NFV, IoT, blockchain, finance & trading, e-commerce core web servers, etc; VMware however is still basically “tier I”, while in “tier II” its voice is hardly heard. The opinion is very inspiring; but as a comment, I know that CloudFoundry, which is usually deployed with VMware in commercial use, is an early starter in IoT, with GE as the big partner; and Pivotal CF is also a veteran in the bigdata area (while it is true that the voice of commercial IaaS & PaaS is very small at the Openstack summit). The “tier II” area is usually done with PaaS, rather than with VMware, which is IaaS.
And usually each release of Openstack has a release note that summarizes the major changes (e.g. Kilo’s). They are very useful. But for Mitaka, the release notes are re-organized by project. They are not as informative as before, but I think they are still very useful for grasping the latest updates.
Finally, if you are really interested in the feature & development progress of each Openstack component, I think checking the blueprint status on Launchpad (example) is the best way (along with checking the dev mailing list).