From Next-Generation Data Center
Each live session in the Building a Next-Generation Data Center course will include a guest speaker, enough time to ask the guest speaker any questions you might have, and a follow-up discussion of various design or technology aspects.
These are the questions asked by the attendees of the Autumn 2016 session. We answered most of them during the live online sessions - you might want to watch the live session recordings to get the answers.
- 1 Generic questions
- 2 Collecting the Requirements
- 3 Automation and Orchestration
- 4 Design Compute and Storage Infrastructure
- 5 Design the Network Services and Infrastructure
- 6 High-Availability Concerns
- 7 Putting it All Together
Add questions that don't map directly onto one of the sessions here. We'll discuss them whenever we have some extra time ;)
- Not sure if this is relevant, but I'd like to ask for cabling, cooling and power infrastructure to be added to the course.
Collecting the Requirements
- Various IT/Enterprise application types (tier 2/3, platform 2/3);
- Sizing and design guidelines for those applications, in terms of storage, network, compute/virtualization;
- How do I deal with big vendor applications that don’t follow great, multi-tier web-application style topologies?
- What if my customer(s) don't know what their requirements are or should be?
- While pushing for requirements (which usually don't appear), we are normally told that the network design must be flexible, and must therefore account for future unknown requirements without requiring forklift upgrades. What is the best approach to deal with these types of requirements? (We normally complete the design and highlight all of the known limitations at the time.)
- Within the data centre, the network carries various applications and services, with new ones being added well after the initial deployment. To account for the lowest common denominator, we normally work to the assumption that the network should fail over in under a second, or in the best failover time that doesn't risk flapping etc. (for unicast traffic, which is sometimes challenging with services such as firewalls in the path). What is the best way to collect and account for requirements such as tolerance to packet loss when large numbers of applications exist (and new ones might appear tomorrow)?
- Somewhere there is lock-in that restricts options up or down the stack; where are good, or very bad, places to have it?
- at the network layer (e.g. Cisco ACI)
- at compute layer (Hyperconverged, Nova etc.)
- at hypervisor layer (ESX, KVM, Acropolis etc.)
- at controller layer (VMware NSX, Big Switch BCF etc.)
- at orchestrator layer (OpenStack, vRO etc.)
- at SDDC layer (Nuage, Mesos etc.)
Automation and Orchestration
- Day 2+ network automation and provisioning. How do you automate ongoing operations, for example making sure the templates you used to create day-0 configs stay in sync with changes made on the following days?
- CloudStack vs. OpenStack
- I don't even know how to think about doing any sort of automation. Working on very large-scale public-sector data centre networks, my team faces a very heterogeneous environment, with almost every major vendor and application you could think of! The network team is literally like plumbers: we provide the 'water' to whoever requests it (with appropriate approvals), with 100+ different teams to cater to. We are almost like a service provider - with absolutely no control over applications - and that is unlikely to change in the foreseeable future. Where do you begin?
- Last week we discussed OpenStack, and it was highlighted as a possible overarching orchestration tool (for complete delivery of end-to-end services - unless I completely misunderstood). Has anyone seen or been involved in a company that has deployed an "orchestrator of orchestration tools"? Are there any examples or products people are currently using for this?
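One way to frame the day-2 drift question above: keep re-rendering the intended (day-0) configuration and diff it against whatever is actually running. The sketch below is purely illustrative - the `intended` and `running` strings are made-up device data, and a real pipeline would render `intended` with a template engine and pull `running` from the device via SSH, NETCONF or a REST API.

```python
import difflib

# Hypothetical intended config, as rendered from the day-0 template.
intended = """hostname leaf1
interface swp1
 mtu 9216
"""

# Hypothetical running config, as fetched from the device.
running = """hostname leaf1
interface swp1
 mtu 1500
"""

def config_drift(intended: str, running: str) -> list[str]:
    """Return unified-diff lines describing drift (empty list = in sync)."""
    return list(difflib.unified_diff(
        intended.splitlines(), running.splitlines(),
        fromfile="intended", tofile="running", lineterm=""))

drift = config_drift(intended, running)
for line in drift:
    print(line)
```

Running such a check periodically (and alerting on a non-empty diff) is one simple way to make sure day-0 templates and day-2 reality don't silently diverge.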
Design Compute and Storage Infrastructure
- When should I use storage tiering or all-flash storage, and how do they compare to hyper-converged infrastructure?
- Use cases for object stores
- Can we talk about the different data protection/replication solutions (array-based, host-based, network-based etc.), especially in heterogeneous environments, including disaster recovery of course?
- In large-scale cloud networks virtual IPs are inevitable, and they constantly move across bonded interfaces running in LACP active-active mode. What are the implications of a bonded interface having two IPs and the same MAC address, in the context of virtual IP mobility and VXLAN?
- What values should we take into consideration when calculating the maximum bandwidth of a server? I guess it depends on the host OS, applications, RAM and disk performance, but how do we find the information and do the math?
- What are the limitations of virtualizing DB servers?
- What are the main differences between storage and backup? If the storage has replication to the DR site, is that considered a backup? Do we need snapshots at different intervals for it to be considered a backup? How do we properly size the backup?
- Does it make sense to run a container host in a VM (maybe it's more secure)?
- Are containers considered infrastructure or platform?
- Would it be possible to summarize the use cases, pros and cons of the technologies below, and how they all interact? Which ones are used for files, blocks, sectors or data, whether they lock resources or not, and what is the maximum amount of data / the maximum latency recommended?
- Log Shipping
- Synchronous replication
- Asynchronous replication
- Eventual Consistency
- Transactional consistency
- Strong consistency
- 2 Phase commit
- Is vSAN a good technology to minimize dedicated storage hardware?
- Should DMZ servers (bastion hosts) have a single IP address (single interface for both north and south traffic) or two IP addresses (separate interfaces for north and south traffic)?
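For the two-phase-commit entry in the list above, the core idea fits in a few lines: a coordinator asks every participant to vote (prepare), and commits only on a unanimous yes, otherwise aborts everywhere. The toy sketch below is in-memory only and purely illustrative; real implementations add write-ahead logging, timeouts and crash recovery.

```python
class Participant:
    """Toy 2PC participant: votes in the prepare phase, then commits or aborts."""

    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes (and hold locks) or vote no.
        self.state = "prepared" if self.will_vote_yes else "aborted"
        return self.will_vote_yes

    def commit(self):
        # Phase 2a: coordinator decided commit.
        self.state = "committed"

    def abort(self):
        # Phase 2b: coordinator decided abort.
        self.state = "aborted"

def two_phase_commit(participants):
    """Commit only if every participant votes yes in the prepare phase."""
    if all(p.prepare() for p in participants):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"
```

The key property it illustrates: a single "no" vote (or an unreachable participant, in a real system) aborts the whole transaction, which is also why 2PC blocks while waiting for votes.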
Design the Network Services and Infrastructure
- What about controller-less overlays, with BGP EVPN?
- What about using a Clos fabric with BGP for ECMP - no overlay? What are the benefits, implications, limitations etc.?
- Are people using dedicated compute/storage for virtualized appliances, or just using the same compute/storage as for all other VMs?
- Solving MTU issues in a VXLAN environment
- How do we determine the best oversubscription ratio based on different applications, especially in cloud-scale environments where east-west traffic is higher than north-south?
- What is better for two DCs with two independent dark fibers between them: an active-active design or active/DR? And how do we design both scenarios? Each DC has only a single internet connection.
- What should we use for DCI if we don't want a trunk with a transit VLAN per VRF/tenant and have one or two dark fibers between the DCs? What if we have three DCs in a triangle?
- Two DCs with the same security zones and a central firewall. How do we build the routing if we want traffic to bypass the central FW when it stays within the same security zone? The easiest way is dynamic routing inside the zone with a static default gateway pointing to the central FW, but what if we want everything to be dynamic? Or is that a bad idea?
- Can we define what an underlay is?
Leaf-Spine Architectures Questions
- Is there a recommended maximum distance between a leaf and a spine? If I have two rooms ~1 km apart (~0.001 ms), can I still have a full-mesh fabric? If I have 10 km, is that still OK? Are the optics and cabling prices the only limits?
- Would it also be possible to cover the impacts and limitations of a stretched fabric? By stretched I mean a full mesh in each room (of the same logical DC) and only a couple of links (2/4/6) between rooms (to avoid the cost of a single-mode QSFP full mesh).
- In the case of a stretched fabric, what interconnection should we use (leaf-to-leaf vs. spine-to-leaf partial mesh vs. spine-to-spine)?
- What are the impacts of having unequal link bandwidths between leaves and spines? For instance, is it OK / does it make sense to have 8 compute racks connected with 40G to the spine and 4 I/O-intensive racks (storage, LB, FW) connected with 100G to the same spine?
- Is it better to have dedicated I/O-intensive racks or to spread the I/O across the fabric?
- How do we determine the best oversubscription ratio based on different applications, especially in cloud-scale environments where east-west traffic is higher than north-south?
- Suppose we want an L3 fabric but still have non-MLAG hosts/servers. Is creating an interswitch link dedicated to non-MLAG traffic, or adding another layer of switches (fabric interconnect), a good option?
- Could you give some arguments for whether we need lossless transport in NFS and iSCSI environments? Is there a need for path diversity with these protocols?
- Issues with multiple overlays operating on the same Clos fabric concurrently, i.e. a mix of two or more of NSX, OpenStack and Docker Swarm
- Is it possible to have OVSDB and EVPN modifying the same data plane and interacting with each other? EVPN <-> data plane <-> OVSDB table <-> OVSDB protocol <-> SDN controller?
- Can we consider a BGP route reflector a lightweight BGP controller, and thus a very lightweight SDN?
- In a leaf-and-spine fabric, should we have a dedicated interswitch link for orphan-port traffic, to avoid using the MLAG interswitch link?
- If using overlay networks/VXLAN, should we also limit each VXLAN segment to one rack? Is a stretched VXLAN segment better than a stretched VLAN, or is it more or less the same?
- How do we handle firewalling of bare-metal servers in a pure L3 leaf-and-spine environment without an overlay?
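The oversubscription question above is, at the first order, simple arithmetic: the ratio of server-facing (downlink) bandwidth to spine-facing (uplink) bandwidth on each leaf. A minimal sketch with made-up port counts (the numbers are assumptions, not recommendations - the "right" ratio depends on the application traffic mix):

```python
def oversubscription_ratio(downlink_ports, downlink_gbps, uplink_ports, uplink_gbps):
    """Leaf oversubscription = total server-facing bandwidth / total uplink bandwidth."""
    return (downlink_ports * downlink_gbps) / (uplink_ports * uplink_gbps)

# Assumed example: 48 x 10GE server ports, 6 x 40GE uplinks per leaf
ratio = oversubscription_ratio(48, 10, 6, 40)
print(f"{ratio}:1")  # 480G down / 240G up = 2.0:1
```

A ratio of 1:1 is non-blocking; east-west-heavy (cloud-scale) workloads push you toward lower ratios, while north-south-dominated designs can tolerate higher ones.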
Virtual networking and appliances
- What are the alternatives to overlays?
- How do we solve MTU issues in a VXLAN environment?
- Is the Layer-3-switching-over-VXLAN limitation still true with the new hardware? Could you maybe do a packet walk to give us a better understanding of L3 switching between a hardware VTEP and a hypervisor VTEP? Could you elaborate on whether it matters?
- Is there a significant difference between running VMware NSX over stretched VLANs and over an L3 fabric? How do we explain to management that an L3 fabric is better in this case?
- How should we approach multi-tier applications in terms of east-west/north-south traffic and security segmentation? Also, how do we decide whether to go with physical or virtual firewalls, load balancers etc.?
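On the VXLAN MTU question above: the arithmetic part is fixed. With an IPv4 underlay, VXLAN adds 50 bytes on top of the tenant IP MTU (inner Ethernet header 14 + VXLAN header 8 + outer UDP 8 + outer IPv4 20), so the underlay MTU must be at least that much larger than the tenant MTU. A back-of-the-envelope helper:

```python
# VXLAN encapsulation overhead in bytes (IPv4 underlay, untagged inner frame)
INNER_ETHERNET = 14
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20
VXLAN_OVERHEAD = INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4  # 50

def required_underlay_mtu(tenant_mtu: int) -> int:
    """Minimum underlay IP MTU needed to carry tenant frames without fragmentation."""
    return tenant_mtu + VXLAN_OVERHEAD

print(required_underlay_mtu(1500))  # 1550
print(required_underlay_mtu(9000))  # 9050
```

Add a few more bytes if the inner frame carries an 802.1Q tag, or if the underlay is IPv6 (40-byte outer header instead of 20); the operational answer is usually "run jumbo frames in the underlay everywhere."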
JR Rivers answered most of these questions in the Networks, Buffers and Drops webinar
- Is there a way to monitor the buffers? What is a deep enough buffer? When is a buffer too deep?
- Is serialization time an issue? What are the impacts of serialization when going from 10G to 1G, 100G to 10G...?
- If we know the buffer type and size of the device, is it possible to find the theoretical point at which the switch will start to drop incast or elephant-flow traffic, and when serialization will become an issue?
- Is there a known formula to help determine the correct buffer size when going from a XXG interface to a XG interface?
- What is the difference between the ingress, VoQ and egress buffer models? Which is best, or what are the pros/cons of each?
- Is it more important to have big buffers in the spine or in the leaf?
- In the past, the biggest area of contention seems to have been buffering, with solutions such as vSAN where microbursts are seen. We normally push for testing of new technologies and solutions; what is the best way to capture solution-based requirements beyond vendor recommendations and testing?
- How do large-scale cloud companies solve IP mobility without overlays?
- Alternatives to overlays: route redistribution/injection, routed leaf ports, and their merits and demerits in a cloud-scale data center.
- I would like to focus on the benefits and drawbacks of large stretched Layer-2 domains, particularly in combination with NSX overlay segments. With the most recent releases, egress routing per site is fixed, and with a monitor advertising a /32 into OSPF you can also fix inbound routing (I realize this is just a WAN, not the internet), which does allow for some automatic dynamic movement. And while I agree the networking pieces themselves could be done outside of NSX, it also adds the interesting possibilities of micro-segmentation, network function virtualization, etc.
- NSX requires an ESG to allow communication between different tenants, and the ESG can be a bottleneck in large-scale DCs. The question: with this model, is it true that you can use ECMP across multiple ESGs to improve the overall throughput and HA, but you can NOT have both ECMP and a north/south firewall at the edge of your tenant? And how does this work in a multi-DC scenario?
- How do we help control/route inbound traffic in active/active scenarios? LISP or something similar?
- Separate management infrastructure - what do we put there? Where do we draw the line? Thoughts/specifics on large-scale management networks?
- What is the recommended HA solution for VM appliances (firewalls, load balancers etc.)? Just using the same approach as for physical devices: an active/standby HA VM pair on different physical hosts (the vendor-recommended practice)? Or could we leverage the virtualization platform's HA mechanism and use just one appliance VM (accepting an increased Recovery Time Objective in exchange for a smaller budget)?
- Two DCs in an active/active design scenario: separate L3 networks, DNS used for load balancing, perfectly written applications (independent instances in each DC). In such a case, do we still need to build an HA solution (HA FW pair etc.) in each DC? Or can we just use one device per DC (cost savings)?
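A first-order take on the serialization and buffer-sizing questions above: serialization delay is just frame size divided by line rate, and the buffer needed at a speed step-down is roughly the burst size times the fraction of bandwidth that is lost. The sketch below uses assumed numbers and deliberately ignores concurrent arrivals, other flows and shared-buffer behaviour - it is a rough estimate, not a sizing formula.

```python
def serialization_delay_us(frame_bytes: int, link_gbps: float) -> float:
    """Time to put one frame on the wire, in microseconds."""
    return frame_bytes * 8 / (link_gbps * 1000)  # bits / Mbit-per-second -> us

# A 1500-byte frame: 1.2 us at 10G, 12 us at 1G - a 10x step-down in speed
# means a 10x increase in serialization delay.
print(serialization_delay_us(1500, 10))
print(serialization_delay_us(1500, 1))

def stepdown_buffer_bytes(burst_bytes: int, fast_gbps: float, slow_gbps: float) -> float:
    """Rough buffer needed to absorb a burst arriving at fast_gbps while
    draining at slow_gbps (single flow, no other traffic assumed)."""
    return burst_bytes * (1 - slow_gbps / fast_gbps)

# Assumed example: a 1 MB incast burst arriving at 10G, leaving at 1G
print(stepdown_buffer_bytes(1_000_000, 10, 1))
```

Estimates like these give you a theoretical starting point for "when will this switch start dropping"; real answers still require testing, since shared-buffer ASICs and multiple simultaneous senders behave very differently from the single-flow model.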
Putting it All Together
- How do I build very small data center footprints - two blade chassis or so?
- Should we have a default route in the data center? If yes, should I point it to the Internet or in a black hole/honey pot or somewhere else?
- Should I avoid NAT at all costs? If I avoid NAT, should I redistribute public routes into my DC or use the default route?
- Should I use one or two layers of perimeter firewalls? Can I use the load-balancing layer or a reverse proxy as one of those firewall layers?
- Do you consider a load balancer or reverse proxy that terminates the TCP session with the client to be a NAT mechanism?
- Regulatory compliance requirements in the design (PCI DSS, etc)?