Cisco ACI

The Cisco ACI page contains articles and study notes that helped me understand Cisco ACI technology.

In all my writings about ACI, you will notice that I use capitalized words in the middle of sentences (such as “Filters” instead of “filters”, “Domain” instead of “domain”). There are two reasons for that:

1- I refer to these keywords as ACI terminology. So, for example, when I write “Filter”, I mean the concept of filters in the ACI world, not filters within Cisco ASA firewalls. Sometimes I will emphasize that we are in ACI terminology by adding the word “ACI” in front of it, such as “ACI filter”, “ACI contract”, etc.

2- Since I live in Germany, capitalizing the first letter of concepts has become a mechanical writing habit.

ACI Topologies I am using

Depending on the availability of the ACI lab, I am using two ACI topologies in my study notes:

ACI lab topology 1
ACI lab topology 2

Imperative Model vs Declarative Model

  • There are two operational models describing how the hardware reacts to the intent of the network administrator: imperative and declarative.
  • The imperative model is what we, network engineers, did for years before the appearance of ACI: we tell the network equipment how to implement a feature or a protocol by “programming” it. The result is immediately visible.
  • In the declarative model, however, we tell the hardware where and when we need certain features, but we do not tell it how to implement them.

By default, ACI builds a zero-trust network, i.e. communication is not allowed unless explicitly permitted, which is the opposite of a traditional network, which is trust-based.

APIC Initial Setup

The APIC server is a Cisco UCS C-Series server. It generally has these types of physical ports:

APIC rear view
  • number 2: IP OOB management network ports
  • number 4: the CIMC management port. This is where you plug an Ethernet cable and access the CIMC web GUI with its IP address to manage the physical server, i.e. the UCS chassis. Note that on the APIC, the CIMC management IP address and the IP OOB management address are two different concepts!
  • number 5: console port. This is where you plug a console cable, or a terminal server to remotely access the chassis console.
  • number 9: ports to connect to the fabric. On the APIC-L3 we have four ports grouped in pairs: the first and second ports form one pair, the third and fourth form another pair. To connect the APIC to the leafs, we may use at most one port from each pair: one from the first pair, and one from the second pair.

Initially we must configure CIMC with certain parameters. Once CIMC IP settings are done, we can configure APIC further with the CIMC virtual console (CIMC GUI) or through a direct connection (keyboard and mouse).

  • Physically, the APIC requires 2x 10G connections to the fabric, 2x connections to the Out-of-Band management network and one CIMC connection.
  • The CIMC port is different from the IP Out-of-Band management port.
  • configure parameters such as
    • Controller ID
    • Controller Name
    • Fabric Name
    • Infrastructure VLAN
    • VTEP address pool
    • Out-of-Band Management IP address and gateway
    • Multicast group address
  • once the following parameters are set during the setup script, you cannot change them later unless you rebuild the fabric:
    • Infrastructure VLAN
    • VTEP address pool
  • Adding a switch that has a different configuration to the fabric won’t overwrite the fabric configuration, contrary to what most network engineers would think. In fact, there is no such thing as VTP in the ACI fabric.

Shard

A Shard is the smallest unit of data in a database. Understanding this concept is critical to understanding the operation of APIC clusters. In Cisco ACI, each Shard has three instances: one active instance and two backups. The APIC servers in a cluster share these three instances among themselves, in a way that depends on how many APICs we have.

So if we have only one APIC, it holds all three instances of each Shard, which means 100% data loss when that APIC crashes. If we have three APICs, one of them holds the active instance and the other two APICs each hold a backup.
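
If you want to see how these replicas are distributed across the APICs in your cluster, the APIC CLI offers, as far as I recall, the acidiag utility for this; the exact output format varies between releases:

apic# acidiag rvread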

APIC cluster

An APIC cluster is composed of at least two APICs and a maximum of 7, as of ACI Release 4.1. Cisco recommends designing APIC clusters in sizes of 3, 5 or 7 APICs in order to preserve the minority/majority in terms of Shards and avoid split-brain APIC scenarios. So always have an odd number of APICs in your cluster. The minimum recommended APIC cluster size is therefore 3. A fourth APIC can be added to the cluster but stays in Cold Standby state. The cold standby APIC:

  • does not participate in policy definition
  • becomes active when an APIC goes down. In this case we need to add a new cold standby APIC to the cluster
  • has its firmware updated whenever the firmware of the active APICs in the cluster are upgraded.

A bad ACI controller design is to have an APIC cluster of 2, 4 or 6 controllers.

In a multi-site ACI design, we must at any time have a site that contains the highest number of APIC servers. This site is said to have the “quorum”, i.e. the majority of APICs.

All APICs in a cluster must run the same ACI firmware version. So when you have a running cluster of APICs and you want to add a new APIC, you need to upgrade it first to the same ACI firmware as the active APICs before configuring it to join them.

Before adding an APIC or upgrading an APIC cluster, ensure that each APIC’s health is 100% (“fully fit”).

If all APICs fail, the fabric continues to forward packets as normal.

APIC servers are staged one after the other, unless we are in a multi-pod setting, where not all APICs are in the same physical location. In that case we must set up the Pod environments first, then stage the remote APICs.

APIC clusters of 5 allow an active/standby operation: 3 APICs are active and 2 are standby.

Since Cisco APIC is based on Linux, you can access its Bash shell by issuing the following command:

apic# bash

you will find yourself with this new prompt:

admin@apic:> 
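
From this Bash shell you can, among other things, query the Management Information Tree with the moquery utility. For example, to list all Tenants (fvTenant is the ACI class behind a Tenant):

admin@apic:> moquery -c fvTenant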

COOP

COOP, or the Council Of Oracle Protocol, is a protocol that runs on spines and leafs to exchange endpoint reachability information.

By “reachability information” I mean: MAC address, IP address and port number. The encapsulation ID (VLAN ID, or VXLAN ID if the endpoint supports it) stays local to the leaf.

Each leaf sends the reachability information of its endpoints using COOP to one of its attached spines. The spine then shares the received information, also using COOP, with the other spines. That is how every spine (also called an Oracle in COOP terminology) participates in building a complete table called the Proxy Station Table, aka PST.
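
If you want to peek into the COOP repository on a spine, the spine CLI offers a command along these lines (the exact syntax may vary between releases, so treat this as a pointer rather than a reference):

spine# show coop internal info ip-db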

Local Station Table, Global Station Table, Proxy Station Table

  • LST (Local Station Table):
    • exists on each leaf
    • contains entries of the directly attached end points: MAC address, IP address, encapsulation ID and port number.
    • entries are in the format “Endpoint IP address — VTEP address” ?
  • GST (Global Station Table):
    • found on each leaf
    • builds entries based on traffic learned from non-local endpoints, i.e. endpoints that are attached to other leafs.
    • entries are in the format “Endpoint IP address — VTEP address” ?
  • PST (Proxy Station Table):
    • found only on spines
    • entries are in the format “Endpoint IP address — VTEP address” ?
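
To see what a leaf has actually learned in its Local and Global Station Tables, the leaf CLI offers the show endpoint command:

leaf# show endpoint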

vPC Protection Group and vPC Domain Policy

In ACI, we can group any two leafs together in a vPC. As a network engineer, you know that on traditional Nexus switches we need to configure the following constructs: vPC domain, vPC peer link, vPC peer keepalive link.

In ACI:

  • The vPC domain is called vPC Protection Group.
  • the vPC Peer Link, the vPC Keepalive Peer Link and the vPC Dead Interval are part of the vPC Domain Policy and are automatically set up by the APIC in the background as soon as the network engineer configures a vPC Domain Policy.

In a vPC Protection Group, the network engineer must define a Pairing Type. This dictates how the leafs that are to be paired in a vPC are chosen.

In fact, there are three possible Pairing Type values: consecutive, explicit and reciprocal:

ACI vPC Pairing Types

To illustrate the difference between Pairing Types, let us consider ACI lab topology 1.

If the network engineer chooses Pairing Type consecutive, then APIC will put:

  • Leaf 1001 and Leaf 1002 in a vPC domain,
  • Leaf 1003 and Leaf 1004 in another vPC domain.

If the network engineer chooses Pairing Type explicit, then it means he will select the leaf switches by himself. For example, he can select leafs 1001 and 1004 to be part of a vPC domain, and leafs 1002 and 1003 in another one. In this case, he should create what is called in ACI the vPC Explicit Protection Group.

Oh, and the network engineer cannot mix Pairing Types. In fact, he must choose a Pairing Type and stick with it, because it is applied to the whole fabric.

Let us assume the network engineer set the vPC Pairing Type to consecutive at the beginning of the ACI deployment project. Then, for some reason, he figures out he wants to pair the first leaf with the last one in a vPC Domain. This means he must change the Pairing Type to explicit, and he must bear in mind that all existing vPC channels will evaporate!

What actually happens when a network engineer clicks on Submit after selecting another Pairing Type?

selecting another vPC Pairing Type in ACI
clicking on Submit after changing the vPC Pairing Type
switching from one vPC Pairing Type to another: a warning message from APIC

Pay attention that each ACI leaf can be part of only one vPC Domain. Thus it can have only one vPC Domain ID.

vPC Domain Policy configuration

To configure a vPC Domain Policy, a network engineer should click on Fabric –> Access Policies as a starting point. From there, he has two possibilities, depending on his ACI release:

  • for ACI release 3, further click on Switch Policies –> Policies –> VPC Domain
vPC Domain Policy in ACI release 3
  • for ACI release 4, further click on Policies –> Switch –> VPC Domain
vPC Domain Policies in ACI release 4

As you can see, the location of the managed object is not that different between releases 3 and 4.

The name VPC Domain in the ACI Management Information Tree is misleading. I think Cisco should have named it VPC Domain Policies.

Click the Create VPC Domain Policy button to create a vPC Domain Policy:

creating a vPC Domain Policy in ACI

There is not much to set up here:

vPC Domain Policy settings in ACI
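
If you prefer the CLI, you can check the resulting managed object from the APIC Bash shell with moquery; vpcInstPol is, if I recall the class name correctly, the class behind the vPC Domain Policy:

admin@apic:> moquery -c vpcInstPol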

vPC Explicit Protection Group configuration

To configure an explicit vPC Protection Group, a network engineer should:

in ACI release 3: go to Fabric –> Access Policies –> Switch Policies –> Policies –> Virtual Port Channel Default

Virtual Port Channel Default in ACI release 3

in ACI release 4: go to Fabric –> Access Policies –> Policies –> Switch –> Virtual Port Channel Default

Virtual Port Channel Default in ACI release 4
creating an explicit vPC Protection Group in ACI

I would consider the managed object Virtual Port Channel default as a sort of vPC peer-building policy, where the network engineer defines the Pairing Type, the vPC peer switches in the vPC Protection Group (if the Pairing Type is explicit) and the associated vPC Domain Policy.

The ID field is like the vPC Domain ID that you may have configured on Cisco Nexus switches.

Notice that the virtual IP address of the vPC construct is created automatically by APIC. I think the IP is an increment of 1 over the highest IP address of the existing vPC Explicit Protection Groups. In my lab, the highest IP address was 10.0.136.67/32, so my vPC Explicit Protection Group got the virtual IP address 10.0.136.68/32:

vPC Explicit Protection Groups: virtual IP
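
Once the vPC Explicit Protection Group is pushed to the leafs, you can verify it on one of the member leafs with the classic show vpc command, which displays among other things the vPC domain ID and the peer status:

leaf# show vpc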

VMware VDS vs Cisco AVS

VMware VDS

  • is purely a L2 virtual switch.
  • spans one or more virtualization hosts, unlike the Virtual Standard Switch (VSS) (not to be confused with the VSS feature on Catalyst switches)
    • VSS is not supported when integrating ACI with vCenter.
  • can be managed either by vCenter or by ACI when the vSphere environment is integrated with ACI through a VMM Domain. In the latter case, we call it an ACI-managed DVS.
  • does not support OpFlex.
  • supports CDP and LLDP.
  • has closed source code that only VMware has access to. That is why APIC follows an imperative model with the DVS.

Cisco AVS

  • the Application Virtual Switch is a multi-hypervisor virtual switch (compatible with many hypervisor vendors) that comes with ACI free of charge
  • is built on the successful 1000V switch.
  • is completely managed by ACI, unlike the 1000V that was managed by a Virtual Supervisor Module VSM.
  • supports L2/L3 functionality
  • supports OpFlex
  • is supported by VMware vSphere up to vSphere 6.5.
  • integrates a Distributed Firewall which can be in disabled mode, Learning mode or Enabled mode.
  • supports both VLAN and VXLAN encapsulations
  • supports neither intra-EPG isolation nor intra-EPG contracts
  • its successor is the Cisco ACI Virtual Edge (AVE)

The choice of using VDS or Cisco AVS is made in the menu during the configuration of the VMM domain integration.

Integrating Workloads with ACI

  • In the IT industry we distinguish physical workloads and virtual workloads.
  • A physical workload is a subset of compute, storage and network resources dedicated to a single physical entity or machine.
  • A virtual Workload is the same subset being used by a virtual machine.
  • When integrating a physical workload with ACI:
    • we should very likely configure policies on each physical NIC or virtual NIC on the server.
    • we configure static path binding on the EPG.
  • To integrate ACI with Microsoft platforms, we have two options:
    • integration with Microsoft SCVMM
    • integration with Azure Pack
      • provides ready-to-use management portal and administrator portal
      • reflects the same experience as Microsoft Azure cloud.

ACI Migration Modes

When companies migrate from a traditional network to ACI, they can adopt one of the following approaches:

  • network-centric mode:
    • the network administrator creates one Bridge Domain and subnet per VLAN and puts the servers that were in this VLAN into an EPG. This mode is also known as the VLAN = BD = EPG mode.
    • in this mode we take our existing VLANs and subnets from the old network and create them in ACI; i.e. we “reproduce” the network in ACI. 
    • can be a one-tenant or a multi-tenant setup.
  • Application-centric mode.
  • Hybrid mode: some servers remain grouped by VLAN, while others are grouped according to other criteria such as application or business need, rather than by VLAN as in the traditional network. The hybrid mode is the combination of the network-centric mode and some features of the application-centric mode.

There is nothing wrong with either migration mode, i.e. you are not forced to migrate to the application-centric mode if you don’t have a need to. Always ask the question “Is my customer happy with my network design?”

Blacklist Model vs Whitelist Model

In an organisation, the corporate security guidelines and policies follow one of the following security models: blacklist or whitelist. In a blacklist model, everything is open unless specifically denied. In a whitelist model, every communication is denied unless specifically authorized. A quick analogy to understand the whitelist model is Cisco IOS access lists: there is an implicit “deny any” entry at the end of every ACL. Cisco ACI employs the whitelist model by default.

Overlay vs Underlay

  • VXLAN forms the overlay network in ACI. And IS-IS builds the underlay network, which runs transparently in the ACI fabric. IS-IS operations do not require any user intervention.
  • MDT (Multicast Distribution Tree) runs also on the underlay network.

VTEP aka TEP

VTEP, Virtual Tunnel EndPoint, or simply TEP, refers either to the tunnel endpoint device itself or to its tunnel endpoint address.

The VTEP pool is the address pool that we assign to the VTEP devices, our ACI spines and leafs being the VTEP devices. Technically, a VTEP pool is a subnet.

Each ACI fabric node requires a VTEP address to be able to route packets internally to other fabric nodes:

Cisco ACI Fabric Membership under Inventory

We can also call it the VTEP prefix. It is defined during the initial APIC setup and is recommended to be a /16 or /17 subnet. By default, the VTEP pool is the subnet 10.0.0.0/16. Starting from ACI version 2, we can configure a VTEP pool as a /22 subnet.

Switches in a Pod – whether they are leafs or spines- share the same VTEP prefix. I said “switches” because an APIC does not have a VTEP address.

We can display the VTEP pool:

displaying the TEP pool of Pod 1
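
The CLI counterpart of the Fabric Membership view is the acidiag fnvread command on the APIC, which lists the registered fabric nodes with their node IDs, names, serial numbers and TEP addresses:

apic# acidiag fnvread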

VMware NIC cards

  • vNIC: the virtual NIC on a VM
  • vmnic: aka pNIC: the physical NIC on the virtualization host
  • vmknic: the VMkernel NIC, a virtual interface on the hypervisor itself, used to transport infrastructure traffic from/to the hypervisor.

ACI Traffic Classification

The ACI fabric performs traffic classification when an end host or a NIC is attached to it. The purpose is to correctly assign the endpoint to a preconfigured EPG.

Traffic classification is based on one of the following criteria, depending on whether we attach a physical workload or a virtual workload, or whether we use an ACI-managed DVS or AVS:

  • source MAC address
  • source IP address
  • port and VLAN encapsulation
  • port and VXLAN encapsulation
  • etc.

OpFlex

  • is a declarative policy distribution model.
  • is supported on ACI components and many physical and virtual switches on the market.
  • is used by APIC to communicate with the Nexus 9000 switches in the ACI fabric.
  • was presented as a draft to the IETF. As of 2020, it is still not standardized.
  • is a southbound protocol.
  • is not supported on VMware VDS.
  • is supported on Microsoft vSwitch, OpenStack OVS and Kubernetes OVS.
  • runs on the ACI Infrastructure VLAN.
  • provides visibility of virtual switches in OpenStack environments.
  • When a virtual switch supports OpFlex, we must extend the ACI Infrastructure VLAN outside of the fabric, which we can do in the AAEP configuration menu.
  • finds out which physical ports of the virtualization host are connected to leaf ports, if the virtualization host is directly plugged into the ACI fabric. Otherwise, LLDP (enabled by default) or CDP (not enabled by default) is used. Remember that a virtual switch connects to the physical NIC ports of the virtualization host, and the physical NIC ports connect to the ACI fabric.
  • An OpFlex proxy is built into every ACI leaf. It interacts with an OVS OpFlex agent whenever we integrate ACI with OpenStack.
  • The OpFlex protocol runs on TCP port 8009.

  • NTP must be configured and synchronized on APIC and all fabric nodes. Here is a quick tutorial on setting up NTP on ACI.
  • New nodes added to the fabric are automatically discovered by APIC through LLDP. As soon as they pop up in the APIC GUI, you can add or block them from joining the fabric, based on their Serial Numbers.
  • New fabric nodes send DHCP requests and receive replies from APIC.
  • APIC sends TEP addresses to the new leafs
  • Giving lower numerical IDs to the spines is recommended. The subsequent higher IDs should be reserved for the leafs.
  • All fabric nodes and APICs should be connected to an OOB network for management purposes. The same OOB network carries traffic to and from a virtual manager like vCenter, when the vCenter is integrated in ACI (see VMM domains).
  • Access to leaf switches through console cable is possible but offers only read capabilities.
  • OS image management occurs on the APIC, which supports TFTP
  • in ACI there is no need to:
    • configure loopback addresses on new switches
    • configure IGP protocol and neighborships
    • configure custom routing timers
    • configure list of allowed VLANs on trunks.
  • Management of the fabric can be performed also using an external management station connected to the fabric on tenant “mgmt”. In this scenario you must:
    • configure a VLAN Pool, an AEP, a physical domain
    • assign the VLAN Pool to the domain
    • encapsulate the domain under the AEP
  • Provisioning a switch port in traditional networks is completely different from the ACI world:
    • in a traditional switch you configure interfaces separately
    • in ACI, you configure many constructs and objects up front, such as domain, AEP, VLAN Pool, Switch Profile, Interface Profile… which may seem a burden at first. But its power lies in its flexibility and extensibility. For example, if you want to add an interface with a configuration similar to a previous one, simply add it to the Interface Profile.
  • an Application in the ACI model is not a virtual/physical machine, but the combination of:
    • workloads, either physical or virtual
    • L2 – L7 policies: VLANs, subnets, L4 ports, ACL, QoS policies, filtering policies, load balancing policies,…
  • In terms of number of supported Spines, the ACI fabric supports a minimum of 2 and a maximum of 6, in even numbers (2, 4, 6).
  • ACI fabric operates on a whitelist model: no communication is allowed unless specified.
  • Frames in ACI are routed, but the L2 switching semantics are preserved.
  • Infrastructure VLAN
    • is used within the fabric
    • must be unique on the whole network, including end host VLANs
    • must be extended (manually configured) to Blade Systems
    • recommended but not mandatory: use VLAN ID 3967.
  • ACI Basic vs Advanced GUI
    • Basic GUI
      • use cases:
        • for small ACI deployments
        • for network administrators who do not need full ACI features such as L4-7 integration.
      • allows configuration of tenants, leaf ports and access profiles
      • allows configuring one port at a time
      • is no longer supported above ACI v3.0
    • Advanced GUI
      • allows to configure multiple ports through the access selector and Interface Profiles
      • recommended to be used.
  • VXLAN overlay:
    • the virtual network built using all VTEP addresses of the fabric nodes (leafs and spines).
    • VXLAN packets are routed over the fabric underlay network.
  • Fabric underlay: is the IS-IS network topology that ensures that packets are routed from one fabric node to another. This network runs under the hood and does not need any intervention from the APIC administrator. All routes in the underlay are host routes (/32 subnet masks)
  • Software overlay network:
    • not to be confused with the VXLAN Overlay of the fabric.
    • the logical network built between virtual switches that are located on a hypervisor.
    • When a virtualized server hosts two hypervisors, each hypervisor runs its own software network overlay, and the two overlays do not communicate with each other.
    • the software overlay network does not communicate with the physical network either unless a software gateway is installed.
  • Docker containers (the equivalent of VMs in the Linux Docker technology) do not have their own TCP/IP stack but rather a namespace within the TCP/IP stack of the host machine.
  • A Blade system (or Blade Chassis) is composed of Blade servers and Blade Switches
    • Blade Switches are physical
    • Blade Servers are physical and contain Virtual Switches
  • ACI plugin for vCenter
    • allows virtualization administrators to interact with APIC in an easy way without the requirement to have prior networking knowledge:
    • virtualization administrator can add/delete/modify ACI constructs (tenants, VRF, Bridge Domains, App Profile, EPG, uSeg EPG), add/modify Port Groups, add/modify VM to Port group associations, etc.

Endpoint Learning

There are three so-called station tables:

  • local station table:
    • each leaf has a local station table which contains information of all endpoints connected to it: VLAN ID, MAC/IP address and Port.
  • global station table
    • each leaf has also a copy of the global station table
    • contains cached information about some remote endpoints. Leafs are not supposed to possess forwarding information about all endpoints in the fabric.
  • proxy station table
    • resides on the spines
    • all the spines have the same proxy station table
    • contains forwarding information about all endpoints attached to the leafs (aka the endpoint reachability information).
      • The endpoint reachability information includes:
        • L2 information: VLAN, endpoint MAC address
        • L3 information: endpoint IP address
        • location information: Leaf ID, access port ID.

The concepts of Local Station Table and Global Station Table were introduced by ACI hardware that uses Broadcom-based ASICs. The newer Cisco Nexus 9000 switches such as the EX2 and FX2 leverage the new Cisco Cloud Scale ASICs, which do not have the concept of a Local Station Table or a Global Station Table. In fact, with Cloud Scale ASICs, endpoint information is maintained with FPT, Forwarding and Policy Tiles.

Microsegmentation

  • leads to the distinction between the original EPG (aka Base EPG) and microsegmented EPG (aka uSeg EPG)
  • The purpose of microsegmenting an EPG is to automate the assignment of selected Virtual Machines to a particular EPG using rules, instead of the VMware administrator having to assign them manually.
  • Each rule is in the format “match-any | match-all {u-attribute}”, where u-attributes are the microsegmentation attributes
  • Only two u-attributes are supported by uSeg EPG when attached to bare metal servers:
    • IP Address
    • MAC Address
  • the list of available u-attributes of an uSeg EPG attaching to a VMM domain is richer:
    • IP Address
    • MAC Address
    • VM Name
    • VM OS
    • VM tag
  • a rule can be a pure “match-any” filter, a pure “match-all” filter, or a combination of both.
  • if there are many clauses in the rule, then beware of the precedence among the u-attributes, e.g. the u-attribute “VM Name” has a higher precedence than “VM tag”. So if the u-attribute “VM Name” matches first, further clauses of the rule won’t be inspected by APIC.
  • available for both physical and VMM Domains

ACI Fabric Multi Site Design For Active-Active Data Centers

Cisco has defined the following design topologies when dealing with an ACI fabric spanning multiple sites:

ACI Stretched Fabric Design

In a stretched fabric ACI design, the ACI fabric is – as its name suggests – stretched across both sites (assuming we have two sites). We still have:

  • one APIC cluster: one APIC is installed on one site and two APICs on the other site,
  • one control plane and one data plane.

From the leafs perspective, we have:

  • some leafs from site A physically connect to some spines of site B. These are called transit leafs: leafs that connect the sites together.
  • a partial or full-mesh physical topology between leafs and spines.

The Round Trip Time between sites must be less than 10ms.

The Data Center Interconnect (DCI) link is one of the following options:

  • for a maximum of 40Gbps throughput, we can choose DWDM or a dark fiber
  • for 100Gbps, there is Ethernet over MPLS (EoMPLS) pseudo-wire technology.

When the DCI link goes down, we have a split brain situation. In this case, the APIC cluster minority operates in read-only mode.

The stretched ACI fabric has historically evolved into the ACI Multi-Pod design.

ACI Multi-Pod Design

  • Pods can be in the same physical location (intra-DC) or in separate locations (inter-DC) separated by a point-to-point network like dark fiber or DWDM, or by a traditional L3 infrastructure network like an MPLS network.
    • Whether it is MPLS or point-to-point, the transport network must have a maximum of 50 ms RTT. This value depends also on the ACI firmware release.
  • Each Pod owns a separate control plane. However, the spines on both Pods exchange COOP entries using MultiProtocol BGP over Ethernet VPN (MP-BGP EVPN).
  • It involves an InterPod Network IPN consisting of IPN devices.
  • IPN devices:
    • can be routers or modular switches, which support MP-BGP.
    • must support Multicast PIM BiDir mode in order to correctly forward BUM traffic between Pods.
    • at least one IPN device connects to some of the spines in each Pod. Ideally, two IPN devices connect to all spines in each Pod.
    • establish OSPF peering with spines of each pod.
    • have in their routing tables the TEP pool prefixes of the Pods.
    • Each IPN device installs a multicast (*, G) entry, with G being the GIPo address of each Bridge Domain of the attached Pod.
    • It is recommended to ensure that a physical path exists at all times between any IPN device A and IPN device B, whether they are connected to the same Pod or not.
    • Use 10/40/100 Gbps connectivity between IPN devices.
  • The spines that are peering with the IPN devices perform mutual redistribution:
    • they redistribute IS-IS prefixes (local TEP pool prefix) into OSPF, and
    • redistribute OSPF prefixes they learned from IPN devices (these OSPF prefixes are the remote TEP pool prefixes) into IS-IS, in order to let the local leafs learn them and know how to reach remote TEP addresses.
  • when IP communication fails between the Pods:
    • the Pod with the APIC majority still operates in read/write
    • the Pod with the APIC minority operates in read-only mode. When communication is restored, it synchronizes its database.
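
To make the IPN requirements above more concrete, here is a rough NX-OS-style sketch of one spine-facing IPN interface. The interface numbers, IP addresses, OSPF process name and RP address are invented for the example (in a real deployment the BiDir RP is usually a phantom RP), so treat it as an illustration of the listed requirements rather than a validated configuration:

feature ospf
feature pim
feature dhcp

ip dhcp relay
ip pim rp-address 192.168.100.1 group-list 225.0.0.0/15 bidir

interface Ethernet1/1
  mtu 9150
  no shutdown

interface Ethernet1/1.4
  description link to a Pod 1 spine (sub-interface on VLAN 4)
  encapsulation dot1q 4
  ip address 192.168.1.1/30
  ip router ospf IPN area 0.0.0.0
  ip pim sparse-mode
  ip dhcp relay address 10.0.0.1
  no shutdown

router ospf IPN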

ACI Dual Fabric Design

In a dual fabric design, each site has its own APIC cluster and its own ACI fabric. Both ACI fabrics are connected over L2 or L3 networks, which attach to some leafs at each site.

The ACI dual-fabric design is officially not supported by Cisco anymore and Cisco encourages network engineers to consider the multi-site design when they are planning more than one ACI cluster and fabric in their network.

ACI Multi-Site Design

The multisite design is an evolution of the dual fabric design, where we have the possibility of connecting more than two ACI fabrics over the WAN. The WAN is connected at the spines of each site.

The hardware we choose for the spines that connect to the EVPN (so not all spines) must be at least EX series.

ACI Integration With Puppet

  • Puppet is a data center orchestration framework.
  • Puppet configuration includes preparing modules that will be downloaded onto puppet-compatible hardware platforms
  • Puppet components:
    • Puppet Master: the server that hosts the modules
    • Puppet Agents: installed on the Nexus switches
  • The Cisco Nexus 9000 supports Puppet natively in its API, i.e. we can install Puppet modules on the Nexus switch.
  • A Puppet module contains configuration of a certain feature, for example SNMP, VRF, interface speed,… As soon as the module is downloaded on the switch, the changes in the config are visible.

Configuration Zones

  • A Configuration Zone is a technology that reduces the impact of a change on the tenant infrastructure.
  • Anything from a single leaf up to a complete Pod can be selected as a configuration zone.
  • The configuration change made within a configuration zone is either:
    • enabled: which means “the change takes effect immediately”
    • disabled: which means “the change is queued but not executed”.

Software Management

The APIC acts as a software repository.

Do not upgrade all fabric nodes at once! Define groups, upgrade a small number of groups, observe the result, then upgrade the rest.

At any time, there is:

  • only one image for spines and leafs
  • only one image for the APICs

Avoid running mixed firmware versions between switches for more than a couple of hours.

Always upgrade the APICs first then the fabric switches.

ACI Snapshots And Rollbacks

Rollbacks are performed from the Admin tab.

We can configure snapshots and rollbacks on ACI either for all objects or for a selected object.

How to check the supervisors on a Nexus 9500

Having redundant hardware on the ACI fabric spines is critical to maintain the fabric operational. The Cisco Nexus 9500 supports up to two supervisor modules that operate in an active-passive redundancy model.

To see which supervisor is active and which one is standby:

Cisco ACI spine hardware

Go to Fabric –> Inventory. Select your Pod. Select the spine. Click on Chassis, then Supervisor Modules.

The list of supervisors, their status and even their serial numbers are displayed on the left of the window:

Cisco ACI supervisors on the spine
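
If you prefer the switch CLI over the GUI, the same information is available on the spine with the usual show module command, which lists the installed supervisors and line cards together with their model and status:

spine# show module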

ACI fat tree = an ACI full-meshed fabric. In other words, all spines are physically connected to all leafs.

Scaling up a fabric = adding more spines. When we want more bandwidth and redundancy within the fabric we need to scale it up.

Scaling out a fabric = adding more leafs.

Tetration is a Cisco product that allows recording all application packets (in some cases for up to one year!), performing application profiling and automatic application segmentation. It could help those application engineers who do not master their application flows (which, by the way, is not a surprise if you encounter them in the business).

I’ve distilled all these notes from my ACI study material.

My other Cisco ACI notes