             Open Fabrics Enterprise Distribution (OFED)
                    SDP in OFED 1.5.1 Release Notes

                          March 2010



===============================================================================
Table of Contents
===============================================================================
1. Overview
2. Bug Fixes and Enhancements since OFED 1.5.1
3. Known Issues
4. Verification Applications/Flows/Tests

===============================================================================
1. Overview
===============================================================================
SDP in OFED is at GA level for OFED 1.5.1
Main changes are:
- Improve Copy stability
- PPC support
- Fix some installation problems
- Bug fixes

===============================================================================
2. Bug Fixes and Enhancements since OFED 1.5
===============================================================================
* Cleanup
    - Fix kernel crashes
    - No timeouts in ZCopy
    - SDP in QLogic <-> OFED stack compilancy 
    - Cleaned up bugzilla - removed old bugs that were fixed already
    - Removed unnecessary prints

* Bug fixes
    - No known kernel crash is caused by SDP.
    - Many small bugs fixed - see bugzilla.


===============================================================================
3. Known Issues
===============================================================================
- BUG1444 - setsockopt(SO_RCVBUF) is not working in sdp socket. To limit top
  system wide sdp memory usage for recv use module parameter top_mem_usage.

- BUG1331 - TCP allows connecting to IP_ANY - 0.0.0.0 (as a destination address!). SDP 
  does not allow - and will reject the connection.

- Each SDP socket currently consumes up to 2 MBytes of memory. If this value
  is high for your installation, it is possible to trade off performance
  for lower memory utilization per socket by reducing the value of the
  "rcvbuf_scale" module parameter (default: 16).

  Note: the minimum legal value for this parameter is 1.
  At this parameter value, each socket will consume approximately 128 KBytes.

- Small message size performance is low when messages are sent by client
  at a rate lower than the rate at which they are consumed by server,
  and when TCP_CORK is not set. This is observed, for example, with iperf
  benchmark. As a workaround, set the TCP_CORK socket option
  to ensure data is sent in at least 32K byte chunks.

- Performance is low on 32-bit kernels, as SDP utilizes high memory
  to ease memory pressure. Moving to a 64-bit kernel solves this
  problem even if the application remains a 32-bit one.

- By default, SDP utilizes a 2 Kbyte MTU size.  This may cause PCI-X cards
  using Mellanox Technologies "Infinihost" HCAs to experience low bandwidth.
  Workaround:  reset the MTU size to 1K in this situation, using either of
  the two methods below:

  1. Activate the "tavor quirk" workaround in opensm:
     a. Create an opensm options cache file (/var/cache/osm/opensm.opts):
          > opensm --cache-options -o
     b. Add the following line to /var/cache/osm/opensm.opts:
          enable_quirks TRUE
     c. Rerun opensm using your usual command line options to activate
        the opensm quirk option.

  2. Activate the "tavor quirk" workaround in cma:
       set the tavor_quirk module parameter of the rdma_cm module to value 1
       (default: 0).

- BZCopy is disabled by default. large block transfers will be using ZCopy.
  to reenable BZCopy or to change its threshold set the module parameter sdp_bzcopy_thresh
  to a non zero value - every block larger than that will be sent by using BZCopy.

- ZCopy is enabled by default for blocks larger than 64K. could be disabled by setting
  the module paramter sdp_zcopy_thresh to zero. Or to any other value by setting it to
  another non zero value.

- ZCOPY mode gives good performance for large blocks with very small cpu 
  utilization When in use,  All messages longer than 
  'sdp_zcopy_thresh' bytes in length will cause the user space buffer to
  be pinned and the data sent directly from the original buffer.  This 
  results in less CPU use and, on many systems, much better bandwidth.
  The default 64K value for 'sdp_zcopy_thresh' is sometimes too low for
  some systems.  You must experiment with your hardware to select the
  best value.


===============================================================================
4. Verification Applications/Flows/Tests
===============================================================================
- ssh/sshd
- wget/netscape/firefox/apache                  
- netpipe               
- netperf             
- LTP socket tests
- iperf-2.0.2         
- ttcp
- Threaded and forking echo client server examples
- Various Java client server applications (SUN:jre, BEA:jrockit/WebLogic, GNU:gij/gcj)
- Many UNIX utilities to verify that pre-load did not harm the applications


