Saturday, August 17, 2013

Configuring a Clustered NetApp Filer as an NFS Datastore for VMware ESXi Implementing Multiple VLANs, MTUs and IPs

On your NetApp filer you can easily configure multiple VLANs with differing MTUs on the same LACP-trunked 1GbE or 10GbE ports, with stacked IPs on the storage VLAN to assist with load balancing. In this example, network 10.0.0.0/24 (VLAN 10, MTU 1500) is just the regular network, and network 10.0.1.0/24 (VLAN 20, MTU 9000) is the NFS storage network. On your switch, create an LACP trunk to the filer's interfaces and then trunk VLANs 10 and 20. Your ESXi servers' storage network would also be on VLAN 20 and use the load-balancing policy "Route based on IP hash". On the switch you would create a static trunk (since the ESXi 5 standard vSwitch does not support LACP). The VMkernel port on the vSwitch would be untagged for the storage network. Here's /etc/rc:

hostname filer1
# Send-only flow control on each physical port
ifconfig e0a flowcontrol send
ifconfig e0b flowcontrol send
ifconfig e0c flowcontrol send
ifconfig e0d flowcontrol send
# LACP vif across all four ports, balancing on IP hash
vif create lacp NETWORK -b ip e0a e0b e0c e0d
# Tagged VLAN interfaces NETWORK-10 and NETWORK-20 on the vif
vlan create NETWORK 10 20
# VLAN 10: regular network, standard frames
ifconfig NETWORK-10 `hostname`-NETWORK-10 netmask mtusize 1500 -wins partner
# VLAN 20: NFS storage network, jumbo frames
ifconfig NETWORK-20 `hostname`-NETWORK-20 netmask mtusize 9000 -wins partner
# Stacked alias IPs on the storage VLAN for load balancing
ifconfig NETWORK-20 alias `hostname`-NETWORK-20-ALIAS-1 netmask
ifconfig NETWORK-20 alias `hostname`-NETWORK-20-ALIAS-2 netmask
ifconfig NETWORK-20 alias `hostname`-NETWORK-20-ALIAS-3 netmask
route add default
routed on
options dns.enable on
options nis.enable off
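The host side of the setup described above might be sketched from the ESXi 5.x CLI roughly as follows. The vSwitch, vmnic, and port-group names and the VMkernel IP/netmask are assumptions for illustration, not values from the post; check them against your own hardware inventory before use.

```shell
# Hypothetical ESXi 5.x host-side sketch (names/IPs are examples):
esxcfg-vswitch -a vSwitch1                      # new standard vSwitch
esxcfg-vswitch -L vmnic2 vSwitch1               # add uplinks for the static trunk
esxcfg-vswitch -L vmnic3 vSwitch1
esxcli network vswitch standard set -v vSwitch1 -m 9000   # jumbo frames on the vSwitch
esxcfg-vswitch -A NFS vSwitch1                  # port group for the storage network
esxcfg-vmknic -a -i 10.0.1.21 -n 255.255.255.0 -m 9000 -p NFS   # VMkernel port, MTU 9000
# "Route based on IP hash" to match the static trunk on the switch
esxcli network vswitch standard policy failover set -v vSwitch1 -l iphash
```

No VLAN tag is set on the port group, matching the untagged VMkernel port described above.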

Ensure /etc/hosts is populated correctly with the IPs of both toasters in the event of failover/failback:

localhost
filer1
filer1-NETWORK-10
filer1-NETWORK-20
filer1-NETWORK-20-ALIAS-1
filer1-NETWORK-20-ALIAS-2
filer1-NETWORK-20-ALIAS-3
filer2
filer2-NETWORK-10
filer2-NETWORK-20
filer2-NETWORK-20-ALIAS-1
filer2-NETWORK-20-ALIAS-2
filer2-NETWORK-20-ALIAS-3
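The stacked aliases matter because both the filer's `-b ip` vif and the vSwitch's "Route based on IP hash" policy pin each source/destination IP pair to one physical link, so a single filer IP would funnel every mount onto one port. Extra destination IPs give each datastore mount a chance at a different link. A minimal sketch of how an XOR-style hash spreads connections; the exact hash below is an illustration, not NetApp's or VMware's actual implementation, and all IPs are hypothetical:

```shell
#!/bin/sh
# Convert a dotted-quad IP to a 32-bit integer.
ip_to_int() {
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

links=4              # four ports in the LACP trunk
esxi=10.0.1.21       # hypothetical ESXi VMkernel IP

# Each filer IP (base address plus three aliases) hashes to a link:
for filer_ip in 10.0.1.10 10.0.1.11 10.0.1.12 10.0.1.13; do
    link=$(( ( $(ip_to_int "$esxi") ^ $(ip_to_int "$filer_ip") ) % links ))
    echo "$esxi -> $filer_ip uses link $link"
done
```

With these example addresses the four destination IPs land on four different links; real distribution depends on the actual addresses and hash in use.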

Ensure your VM exports (/etc/exports) are secured so that access is allowed only from the ESXi VMkernel port of each host on the storage switch - in this case there are 3 ESXi hosts. Alternatively, individual IPs are not needed if an entire subnet requires rw and root access to the VM volumes:

/vol/root      -sec=sys,rw,anon=0,nosuid
/vol/root/home -sec=sys,rw,nosuid
/vol/downloads -sec=sys,rw,nosuid
/vol/vm00      -sec=sys,rw=,root=
/vol/vm01      -sec=sys,rw=,root=
/vol/vm02      -sec=sys,rw=,root=
/vol/vm03      -sec=sys,rw=,root=
/vol/iso       -sec=sys,rw=,root=
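For illustration only, here is a hypothetical filled-in form of one of the export lines above, assuming VMkernel IPs 10.0.1.21 through 10.0.1.23 for the three hosts; these values are examples, not from the original listing:

```
# Hypothetical example values, not from the original post:
/vol/vm00 -sec=sys,rw=10.0.1.21:10.0.1.22:10.0.1.23,root=10.0.1.21:10.0.1.22:10.0.1.23
# ...or grant the whole storage subnet instead of individual IPs:
/vol/vm00 -sec=sys,rw=10.0.1.0/24,root=10.0.1.0/24
```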

This configuration would need to be made identically on filer1 and filer2, with the exception that on filer2 the hostname changes in /etc/rc.

20-Jun-2014 03:20 AM - George Willia
Thank you, this is helpful with the ONTAP Edge VSA evaluation. Do you know if LACP is supported on the virtual NICs of the VSA?
20-Jun-2014 03:20 AM - James Bourne
I think so, since ONTAP Edge includes NetApp Data ONTAP-v.
20-Jun-2014 03:20 AM - Ryan
One really has to wonder what kind of drugs the folks at NetApp are on... NFS is without a doubt the worst protocol to use with VMware; the NFS client doesn't even support native multipathing, only link failover.
And let's not even mention the fact that the VMware VMkernel cannot take advantage of LACP properly, since it streams the data over a single connection per datastore.
Let's go into detail: when you set flow control on your NetApp and your switch, you are telling the switch you want it to buffer any overflow on the network; the NetApp never gets pause frames and neither do your ESX(i) hosts.
And NetApps are known to send far too many pause frames onto the LAN.
Really it sucks for their customers. I'm one, and we have 4 NetApps which with any luck will be replaced with some very nice EMC VMAX hardware soon.
Good luck and don't get bit by the NetApp bug.

20-Jun-2014 03:20 AM - James Bourne
Well, they are trying to leverage a proven, robust, and simple technology. I've done a bunch of smaller ESXi / NFS setups. Configuration is an absolute breeze and scaling your datastores is trivial. The same can't generally be said for a block approach with iSCSI or FC, which can also get costly and complex very rapidly.
20-Jun-2014 03:20 AM - Ed
Actually, you are completely incorrect, Ryan, in blaming NetApp for this problem. VMware needs to update and fix their NFS client; the problem resides on their side. Also, if you had bothered to read any NetApp best-practice documentation, flow control is set to off on all 10G links. You should read up before posting uneducated comments.
