| ======================================== | 
 | Symmetric Communication Interface (SCIF) | 
 | ======================================== | 
 |  | 
 | The Symmetric Communication Interface (SCIF, pronounced "skiff") is a | 
 | low-level communications API across PCIe, currently implemented for MIC. At | 
 | present, SCIF provides inter-node communication within a single host platform, | 
 | where a node is a MIC coprocessor or a Xeon-based host. SCIF abstracts the details of | 
 | communicating over the PCIe bus while providing an API that is symmetric | 
 | across all the nodes in the PCIe network. An important design objective for SCIF | 
 | is to deliver the maximum possible performance given the communication | 
 | abilities of the hardware. SCIF has been used to implement an offload compiler | 
 | runtime and OFED support for MPI implementations for MIC coprocessors. | 
 |  | 
 | SCIF API Components | 
 | =================== | 
 |  | 
 | The SCIF API has the following parts: | 
 |  | 
 | 1. Connection establishment using a client-server model | 
 | 2. Byte stream messaging intended for short messages | 
 | 3. Node enumeration to determine online nodes | 
 | 4. Poll semantics for detection of incoming connections and messages | 
 | 5. Memory registration to pin down pages | 
 | 6. Remote memory mapping for low-latency CPU accesses via mmap | 
 | 7. Remote DMA (RDMA) for high-bandwidth DMA transfers | 
 | 8. Fence APIs for RDMA synchronization | 
 |  | 
 | SCIF exposes the notion of a connection, which peer processes on nodes in a | 
 | SCIF PCIe "network" can use to share memory "windows" and to communicate. A | 
 | process on a SCIF node initiates a SCIF connection to a peer process on a | 
 | different node via a SCIF "endpoint". SCIF endpoints support messaging APIs | 
 | which are similar to connection-oriented socket APIs. Connected SCIF endpoints | 
 | can also register local memory and then transfer data using DMA, CPU copies or | 
 | remote memory mapping via mmap. SCIF supports both user and kernel mode | 
 | clients, which are functionally equivalent. | 
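
Because the messaging APIs resemble connection-oriented socket APIs, the order of calls can be illustrated with ordinary POSIX sockets. The following is a minimal sketch of that analogy only: it uses TCP over the loopback interface in a single process, it does not call SCIF, and the function name ``run_socket_analogy`` is hypothetical. The comments note which SCIF API component each step loosely corresponds to.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * POSIX socket analogy (NOT SCIF): one side listens and accepts, the other
 * connects and sends a short message. Returns the number of bytes received
 * into buf, or -1 on error.
 */
int run_socket_analogy(char *buf, size_t len)
{
    static const char msg[] = "ping";   /* 5 bytes including the NUL */
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof(addr);
    struct pollfd pfd = { .fd = -1, .events = POLLIN, .revents = 0 };
    int listen_fd = -1, client_fd = -1, peer_fd = -1;
    int received = -1;
    ssize_t n;

    assert(buf != NULL && len >= sizeof(msg));

    /* Listening side: open, bind, listen (connection establishment). */
    listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (listen_fd < 0)
        goto out;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                  /* let the kernel pick a free port */
    if (bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;
    if (listen(listen_fd, 1) < 0)
        goto out;
    if (getsockname(listen_fd, (struct sockaddr *)&addr, &addr_len) < 0)
        goto out;

    /* Connecting side: open and connect to the listener's port. */
    client_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (client_fd < 0)
        goto out;
    if (connect(client_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    /* Poll for the incoming connection, then accept it. */
    pfd.fd = listen_fd;
    if (poll(&pfd, 1, 1000) != 1)
        goto out;
    peer_fd = accept(listen_fd, NULL, NULL);
    if (peer_fd < 0)
        goto out;

    /* Short byte-stream message: send on one end, poll and receive on the other. */
    if (send(client_fd, msg, sizeof(msg), 0) != (ssize_t)sizeof(msg))
        goto out;
    pfd.fd = peer_fd;
    if (poll(&pfd, 1, 1000) != 1)
        goto out;
    n = recv(peer_fd, buf, len, 0);
    if (n > 0)
        received = (int)n;

out:
    /* Close whichever endpoints were opened. */
    if (peer_fd >= 0)
        close(peer_fd);
    if (client_fd >= 0)
        close(client_fd);
    if (listen_fd >= 0)
        close(listen_fd);
    return received;
}
```

A caller passes a buffer of at least 5 bytes and gets back the number of bytes received, with the message left in the buffer. Running both sides in one process only keeps the sketch short; real peers would be separate processes on different nodes.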
 |  | 
 | SCIF Performance for MIC | 
 | ======================== | 
 |  | 
 | A DMA bandwidth comparison between the TCP (over Ethernet over PCIe) stack | 
 | and SCIF shows the performance advantages of SCIF for HPC applications and | 
 | runtimes:: | 
 |  | 
 |              Comparison of TCP and SCIF based BW | 
 |  | 
 |   Throughput (GB/sec) | 
 |     8 +                                             PCIe Bandwidth ****** | 
 |       +                                                        TCP ###### | 
 |     7 +    **************************************             SCIF %%%%%% | 
 |       |                       %%%%%%%%%%%%%%%%%%% | 
 |     6 +                   %%%% | 
 |       |                 %% | 
 |       |               %%% | 
 |     5 +              %% | 
 |       |            %% | 
 |     4 +           %% | 
 |       |          %% | 
 |     3 +         %% | 
 |       |        % | 
 |     2 +      %% | 
 |       |     %% | 
 |       |    % | 
 |     1 + | 
 |       +    ###################################### | 
 |     0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+- | 
 |       1       10     100      1000   10000   100000 | 
 |                    Transfer Size (KBytes) | 
 |  | 
 | SCIF allows memory sharing via mmap(..) between processes on different PCIe | 
 | nodes and thus provides bare-metal PCIe latency. The round-trip SCIF mmap | 
 | latency from the host to an x100 MIC for an 8-byte message is 0.44 usecs. | 
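
The idea of one process reading memory that another process wrote, through a mapping rather than through messages, can be tried locally. The sketch below is an analogy under stated assumptions, not SCIF code: it shares an anonymous mapping between a parent and a forked child on one machine, and ``struct shared_window``, ``run_shared_window_analogy`` and the ``ready`` flag are hypothetical names. With SCIF the mapping would instead refer to memory registered by a peer on another PCIe node.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical layout of the shared region; this is not a SCIF structure. */
struct shared_window {
    atomic_int ready;   /* set to 1 once the payload is complete */
    char payload[32];
};

/*
 * Local analogy (NOT SCIF): a child process writes into a shared mapping and
 * the parent reads it with plain CPU loads. Copies the payload into out and
 * returns 0 on success, -1 on error.
 */
int run_shared_window_analogy(char *out, size_t out_len)
{
    struct shared_window *win;
    pid_t pid;
    int status = 0;

    assert(out != NULL && out_len >= sizeof(win->payload));

    /* Anonymous shared mapping: zero-filled and visible to both processes. */
    win = mmap(NULL, sizeof(*win), PROT_READ | PROT_WRITE,
               MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (win == MAP_FAILED)
        return -1;
    atomic_init(&win->ready, 0);

    pid = fork();
    if (pid < 0) {
        munmap(win, sizeof(*win));
        return -1;
    }
    if (pid == 0) {
        /* Writer: store the payload first, then publish the flag. */
        strcpy(win->payload, "hello from peer");
        atomic_store_explicit(&win->ready, 1, memory_order_release);
        _exit(0);
    }

    /* Reader: wait for the flag, then read the payload through the mapping. */
    while (atomic_load_explicit(&win->ready, memory_order_acquire) == 0)
        usleep(100);
    memcpy(out, win->payload, sizeof(win->payload));

    if (waitpid(pid, &status, 0) < 0)
        status = -1;
    munmap(win, sizeof(*win));
    return (status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The release/acquire pair on ``ready`` is a plain stand-in for synchronization between the two sides; it only ensures that the reader sees the complete payload once it observes the flag.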
 |  | 
 | SCIF has a user space library which is a thin IOCTL wrapper providing a user | 
 | space API similar to the kernel API in scif.h. The SCIF user space library | 
 | is distributed at https://software.intel.com/en-us/mic-developer | 
 |  | 
 | The following pseudocode shows how two applications on two PCIe nodes would | 
 | typically use the SCIF API:: | 
 |  | 
 |   Process A (on node A)			Process B (on node B) | 
 |  | 
 |   /* get online node information */ | 
 |   scif_get_node_ids(..)			scif_get_node_ids(..) | 
 |   scif_open(..)				scif_open(..) | 
 |   scif_bind(..)				scif_bind(..) | 
 |   scif_listen(..) | 
 |   scif_accept(..)				scif_connect(..) | 
 |   /* SCIF connection established */ | 
 |  | 
 |   /* Send and receive short messages */ | 
 |   scif_send(..)/scif_recv(..)		scif_send(..)/scif_recv(..) | 
 |  | 
 |   /* Register memory */ | 
 |   scif_register(..)			scif_register(..) | 
 |  | 
 |   /* RDMA */ | 
 |   scif_readfrom(..)/scif_writeto(..)	scif_readfrom(..)/scif_writeto(..) | 
 |  | 
 |   /* Fence DMAs */ | 
 |   scif_fence_signal(..)			scif_fence_signal(..) | 
 |  | 
 |   /* Map and access remote registered memory */ | 
 |   mmap(..)				mmap(..) | 
 |  | 
 |   /* Close the endpoints */ | 
 |   scif_close(..)				scif_close(..) |