	    Open Fabrics Enterprise Distribution (OFED)
	  ConnectX driver (mlx4) in OFED 3.12 Release Notes

  			May 2014


===============================================================================
Table of Contents
===============================================================================
1. Overview
2. Supported firmware versions
3. VPI (Virtual Process Interconnect)
4. InfiniBand new features and bug fixes since OFED 1.3.1
5. InfiniBand (mlx4_ib) new features and bug fixes since OFED 1.4
6. Eth (mlx4_en) new features and bug fixes since OFED 1.4
7. New features and bug fixes since OFED 1.4.1
8. New features and bug fixes since OFED 1.4.2
9. New features and bug fixes since OFED 1.5
10. New features and bug fixes since OFED 1.5.1
11. New features and bug fixes since OFED 1.5.2
12. New features and bug fixes since OFED 1.5.3
13. New features and bug fixes since OFED 1.5.4
14. Known Issues
15. mlx4 available parameters

===============================================================================
1. Overview
===============================================================================
mlx4 is the low level driver implementation for the ConnectX adapters designed
by Mellanox Technologies. The ConnectX can operate as an InfiniBand adapter,
as an Ethernet NIC, or as a Fibre Channel HBA. The driver in OFED 3.12 supports
InfiniBand and Ethernet NIC configurations. To accommodate the supported
configurations, the driver is split into three modules:
    
- mlx4_core
	Handles low-level functions like device initialization and firmware
	commands processing. Also controls resource allocation so that the
	InfiniBand and Ethernet functions can share the device without
	interfering with each other.
- mlx4_ib
	Handles InfiniBand-specific functions and plugs into the InfiniBand
	midlayer
- mlx4_en
	Handles Ethernet specific functions and plugs into the netdev mid-layer.

mlx5 is the low level driver implementation for Connect-IB adapters designed
by Mellanox Technologies. The connect-IB can operate as an Infiniband adapter
only. The driver is split into two modules:

- mlx5_core
	Handles low-level functions like device initialization and firmware
	commands processing.
- mlx5_ib
	Handles InfiniBand-specific functions and plugs into the InfiniBand
	midlayer

===============================================================================
2. Supported firmware versions
===============================================================================
- This release was tested with Connect-IB FW 10.10.3000 or Connext-X3
  FW 2.31.5050
- The minimal Connect-X version to use is 2.3.000.
- To use both IB and Ethernet (VPI) use Connect-X FW version 2.6.000 or higher

===============================================================================
3. VPI (Virtual Protocol Interconnect)
===============================================================================
VPI enables ConnectX to be configured as an Ethernet NIC and/or an InfiniBand
adapter.
o Overview:
  The VPI driver is a combination of the Mellanox ConnectX HCA Ethernet and
  InfiniBand drivers.
  It supplies the user with the ability to run InfiniBand and Ethernet
  protocols on the same HCA (separately or at the same time).
  For more details on the Ethernet driver see MLNX_EN_README.txt.
o Firmware:
  The VPI driver works with FW 25408 version 2.6.000 or higher.
  One needs to use INI files that allow different protocols over same HCA.
o Port type management:
  By default both ConnectX ports are initialized as InfiniBand ports.
  If you wish to change the port type use the connectx_port_config script after
  the driver is loaded.
  Running "/sbin/connectx_port_config -s" will show current port configuration
  for all ConnectX devices.
  Port configuration is saved in file: /etc/infiniband/connectx.conf.
  This saved configuration is restored at driver restart only if done via
  "/etc/init.d/openibd restart".
  
  Possible port types are:
    "eth"   - Always Ethernet.
    "ib"    - Always InfiniBand.
    "auto"  - Link sensing mode - detect port type based on the attached
              network type. If no link is detected, the driver retries link
              sensing every few seconds.
    
  Port link type can be configured for each device in the system at run time
  using the "/sbin/connectx_port_config" script.
  
  This utility will prompt for the PCI device to be modified (if there is only
  one it will be selected automatically).
  At the next stage the user will be prompted for the desired mode for each port.
  The desired port configuration will then be set for the selected device.
  Note: This utility also has a non interactive mode:
  "/sbin/connectx_port_config [[-d|--device <PCI device ID>] -c|--conf <port1,port2>]".

- The following configurations are supported by VPI:
	Port1 = eth   Port2 = eth
	Port1 = ib    Port2 = ib
	Port1 = auto  Port2 = auto
	Port1 = ib    Port2 = eth
	Port1 = ib    Port2 = auto
	Port1 = auto  Port2 = eth

  Note: the following options are not supported:
	Port1 = eth   Port2 = ib
	Port1 = eth   Port2 = auto
	Port1 = auto  Port2 = ib
  
	
===============================================================================
4. InfiniBand new features and bug fixes since OFED 1.3.1
===============================================================================
Features that are enabled with ConnectX firmware 2.5.0 only:
- Send with invalidate and Local invalidate send queue work requests.
- Resize CQ support.

Features that are enabled with ConnectX firmware 2.6.0 only:
- Fast register MR send queue work requests.
- Local DMA L_Key.
- Raw Ethertype QP support (one QP per port) -- receive only.

Non-firmware dependent features:
- Allow 4K messages for UD QPs
- Allocate/free fast register MR page lists
- More efficient MTT allocator
- RESET->ERR QP state transition no longer supported (IB Spec 1.2.1)
- Pass congestion management class MADs to the HCA
- Enable firmware diagnostic counters available via sysfs
- Enable LSO support for IPOIB
- IB_EVENT_LID_CHANGE is generated more appropriately
- Fixed race condition between create QP and destroy QP (bugzilla 1389)


===============================================================================
5. InfiniBand new features and bug fixes since OFED 1.4
===============================================================================
- Enable setting via module param (set_4k_mtu) 4K MTU for ConnectX ports.
- Support optimized registration of huge pages backed memory.
  With this optimization, the number of MTT entries used is significantly
  lower than for regular memory, so the HCA will access registered memory with
  fewer cache misses and improved performance.
  For more information on this topic, please refer to Linux documentation file:
  Documentation/vm/hugetlbpage.txt
- Do not enable blueflame sends if write combining is not available
- Add write combining support for for PPC64, and thus enable blueflame sends.
- Unregister IB device before executing CLOSE_PORT.
- Notify and exit if the kernel module used does not support XRC. This is done
  to avoid libmlx4 compatibility problem.
- Added a module parameter (log_mtts_per_seg) for number of MTTs per segment.
  This enable to register more memory with the same number of segments.


===============================================================================
6. Eth (mlx4_en) new features and bug fixes since OFED 1.4
===============================================================================
6.1 Changes and New Features
----------------------------
- Added Tx Multi-queue support which Improves multi-stream and bi-directional
  TCP performance.
- Added IP Reassembly to improve RX bandwidth for IP fragmented packets.
- Added linear skb support which improves UDP performance.
- Removed the following module parameters:
   - rx/tx_ring_size
   - rx_ring_num - number of RX rings
   - pprx/pptx - global pause frames
   The parameters above are controlled through the standard Ethtool interface.

Bug Fixes
---------
- Memory leak when driver is unloaded without configuring interfaces first.
- Setting flow control parameters for one ConnectX port through Ethtool
  impacts the other port as well.
- Adaptive interrupt moderation malfunctions after receiving/transmitting
  around 7 Tera-bytes of data.
- Firmware commands fail with bad flow messages when bringing an interface up.
- Unexpected behavior in case of memory allocation failures.

===============================================================================
7. New features and bug fixes since OFED 1.4.1
===============================================================================
- Added support for new device ID: 0x6764: MT26468 ConnectX EN 10GigE PCIe gen2

===============================================================================
8. New features and bug fixes since OFED 1.4.2
===============================================================================
8.1 Changes and New Features
----------------------------
- mlx4_en is now supported on PPC and IA64.
- Added self diagnostics feature: ethtool -t eth<x>.
- Card's vpd can be accessed for read and write using ethtool interface.

8.2 Bug Fixes
-------------
- mlx4 can now work with MSI-X on RH4 systems.
- Enabled the driver to load on systems with 32 cores and higher.
- The driver is being stuck if the HW/FW stops responding, reset is done
  instead.
- Fixed recovery flows from memory allocation failures.
- When the system is low on memory, the mlx4_en driver now allocates smaller RX
  rings.
- The mlx4_core driver now retries to obtain MSI-X vectors if the initial request is
  rejected by the OS

===============================================================================
9. New features and bug fixes since OFED 1.5
===============================================================================
9.1 Changes and New Features
----------------------------
- Added RDMA over Converged Enhanced Ethernet (RoCEE) support
  See RoCEE_README.txt.
- Masked Compare and Swap (MskCmpSwap)
  The MskCmpSwap atomic operation is an extension to the CmpSwap operation
  defined in the IB spec. MskCmpSwap allows the user to select a portion of the
  64 bit target data for the "compare" check as well as to restrict the swap to
  a (possibly different) portion.
- Masked Fetch and Add (MFetchAdd)
  The MFetchAdd Atomic operation extends the functionality of the standard IB
  FetchAdd by allowing the user to split the target into multiple fields of
  selectable length. The atomic add is done independently on each one of this
  fields. A bit set in the field_boundary parameter specifies the field
  boundaries.
- Improved VLAN tagging performance for the mlx4_en driver.
- RSS support for Ethernet UDP traffic on ConnectX-2 cards with firmware
  2.7.700 and higher.

9.2 Bug Fixes
-------------
- Bonding stops functioning when one of the Ethernet ports is closed.
- "Scheduling while atomic" errors in /var/log/messages when working with
  bonding and mlx4_en drivers in several operating systems.

===============================================================================
10. New features and bug fixes since OFED 1.5.1
===============================================================================
10.1 Changes and New Features
----------------------------
1. Added RAW QP support
2. Extended the range of log_mtts_per_seg - upper bound moved from 5 to 7.
3. Added 0xff70 vendor ID support for MADs.
4. Added support for GID change event.
5. Better interrupts spreading under heavy RX load (mlx4_en)

10.2 Bug Fixes
-------------
1. Fixed chunk sg list overflow in mlx4_alloc_icm()
2. Fixed bug in invalidation of counter index.
3. Fixed bug in catching netdev events for updating GID table.
4. Fixed bug in populating GID table for RoCE.
5. Fixed XRC locking and prevention of null dereference.
6. Added spinlock to xrc_reg_list changes and scanning in interrupt context.
7. Fixed offload changes via Ethtool for VLAN interfaces

===============================================================================
11. New features and bug fixes since OFED 1.5.2
===============================================================================
11.1 Changes and new features
-----------------------------
1. RoCE counters are now added to the regular Ethernet counters. The counters
   for RoCE specific traffic are at the same place and are not changed.
2. Forward any vendor ID SMP MADs to firmware for handling.
3. Add blue flame support for kernel consumers. This allows lower latencies to
   be achieved. To use blue flame, a consumer needs to create the QP with
   inline support.
4. Enabled raw eth QPs to work with inline and blueflame
5. Enabled new steering model in mlx4_en. The RX packets are now steered
   through the MCG table instead of Mac table for unicast, and default entry
   for multicast.
6. Added support for promiscuous mode in the new steering model.

11.2 Bug fixes
--------------
1.  Fix race when reading node description through MADs.
2.  Fix modify CQ so each of moderation parameters is independent.
3.  Limit the number of fast registration work requests to match HW
    capabilities.
4   Changes to node-description via sysfs are now propagated to FW (for FW
    2.8.000 and later).  This enables FW to send a 144 trap to OpenSM regarding
    the change, so that OpenSM can read that nodes updated description.  This
    fixes an old race condition, where OpenSM read the nodes description before
    it was changed during driver startup.
5.  Fix max fast registration WRs that can be posted to CX.
6.  Fix port speed reporting for RoCE ports.
7.  Limit GID entries for VLAN to match hardware capabilities.
8.  Fix RoCE link state report.
9.  Workaround firmware bug, reporting wrong number of blue flame registers.
10. Bug fix in kernel pos_send when VLANs are used.
11. Fix in mlx4_en for handling VLAN operations when working under bond
    interfaces.
12. Fix Ethtool transceiver type report for mlx4_en
13. Avoid vunmpa invalid pointer in allocation bad flow
14. Fix mlx4_ib_reg_xrc_rcv_qp() locking

===============================================================================
12. New features and bug fixes since OFED 1.5.3
===============================================================================
1.  Fix the release func to be consistent with the allocation one
2.  Fix high priority attach
3.  Fix endianness with blue frame support
4.  Consider reserved_cqs
5.  Add debug messages when cannot perform SENSE_PORT
6.  Add sensing port only when supported by HW

===============================================================================
13. New features and bug fixes since OFED 1.5.4
===============================================================================
1. SRIOV basic
2. SRIOV EN
3. new link sensing
4. RAW QP
5. Flow Steering
6. QoS
7. aRFS TCP
8. huge pages
9. ipoib perf improvements
10. Thermal event
11. ipoib tss/rss
12. 64 byte CQE ConnectX-3
13. SRIOV IB
14. SRIOV RAW QP
15. ipoib-pv module
16. time stamping EN
17. Memory Window
18. macvlan
19. VF MAC spoof checking
20. mlx5 driver
21. XPS support
22. VF link state support
23. Low Latency Socket (busy poll)
24. Order-0 memory alloc in rx
25. Warning on TX timeout
26. Blueflame race fixed
27. NCSI support (using oprequest)

===============================================================================
14. Known Issues
===============================================================================
- The SQD feature is not supported
- To load the driver on machines with a 64KB default page size, the UAR bar
  must be enlarged. 64KB page size is the default of PPC with RHEL5 and Itanium
  with SLES 11 or when 64KB page size enabled.
  Perform the following three steps:
  1. Add the following line in the firmware configuration (INI) file under the
     [HCA] section:
       log2_uar_bar_megabytes = 5
  2. Burn a modified firmware image with the changed INI file.
  3. Reboot the system.

     
================================================================================
15. mlx4 available parameters
================================================================================
In order to set mlx4 parameters, add the following line(s) to /etc/modpobe.conf:
   options mlx4_core parameter=<value>
      and/or
   options mlx4_ib   parameter=<value>
      and/or
   options mlx4_en   parameter=<value>

mlx4_core parameters:
 set_4k_mtu:		try to set 4K MTU to all ConnectX ports (int)
 debug_level:		enable debug tracing if > 0 (int)
 block_loopback:	block multicast loopback packets if > 0 (int)
 msi_x:			attempt to use MSI-X if nonzero (int)
 log_num_mac:		log2 max number of MACs per ETH port (1-7, int)
 use_prio:		enable steering by VLAN priority on ETH ports
			(0/1, default 0) (bool)
 log_num_qp:		log maximum number of QPs per HCA (int)
 log_num_srq:		log maximum number of SRQs per HCA (int)
 log_rdmarc_per_qp:	log number of RDMARC buffers per QP (int)
 log_num_cq:		log maximum number of CQs per HCA (int)
 log_num_mcg:		log maximum number of multicast groups per HCA
			(int)
 log_num_mpt:		log maximum number of memory protection table
			entries per HCA	(int)
 log_num_mtt:		log maximum number of memory translation table
 			segments per HCA (int)
 log_mtts_per_seg:	log2 number of MTT entries per segment (1-5)
			(int)
 enable_qos:		enable Quality of Service support in the HCA
			(default: off) (bool)
 enable_pre_t11_mode:	set FCoXX to pre-T11 mode if non-zero
			(default 0) (int)
 internal_err_reset:	reset device on internal errors if non-zero
 			(default 1) (int)

mlx4_ib parameters:
 debug_level: 		enable debug tracing if > 0 (default 0)

mlx4_en parameters:
 udp_rss:        	enable RSS for incoming UDP traffic or disabled (0)
 tcp_rss:		enable RSS for incoming TCP traffic or disabled (0)
 num_lro:		number of LRO sessions per ring or disabled (0)
			(default is 32)
 ip_reasm:		allow reassembly of fragmented IP packets (default
			is enabled)
 pfctx:			priority based Flow Control policy on TX[7:0]
			per priority bit mask (default is 0)
 pfcrx:			priority based Flow Control policy on RX[7:0]
			per priority bit mask (default is 0)
 inline_thold:		threshold for using inline data (default is 128)

================================================================================
15. mlx5 available parameters
================================================================================
debug_mask:		debug mask: 1 = dump cmd data, 2 = dump cmd exec time, 3 = both. Default=0

