Network latency is a fact of life. There is nothing you can do about it, except join the network queue and wait. But Bridgeworks thinks it has solved the latency problem with pipelining and artificial intelligence. Can it be true?
SANSlide is the product that does this. It aims to increase the speed of storage backup replication and SAN linking across TCP/IP wide area network links. The thinking behind it starts with network latency. Because the speed of light is finite, the time for data to cross a link increases with distance. For example, data travelling across a UK-to-Germany WAN link could take 6.6ms to traverse the wire. Add in the communications gear at each end of the line, which processes the signal, and the recorded latency could be 32ms.
It is a characteristic of TCP/IP that a stream of data to be transmitted, such as a file, is broken up into component packets. These are sent until the receive window is full, and the sender then waits for an acknowledgement of receipt (ack) from the remote site. The ack triggers the transmission of the next packet series. A missing ack triggers a packet retransmission, on the assumption that the original transmission has failed.
With a perfect link and streamed data, the link speed and data transmission speed would be the same. With packetisation and the send-ack sequence, the data transmission speed is slowed. Add in latency and the link's efficiency drops significantly. With the UK-Germany example above, a packet series is sent and 64ms later (32ms in each direction) the ack comes back. Every packet series transmission is followed by 64ms of waiting.
Bridgeworks illustrates the effect with a 1GbitE link, a 32KB window size and a 100ms round-trip latency: the transmission speed works out at 320KB/sec.
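The arithmetic behind that figure can be sketched in a few lines of Python. This is a simplified model, not anything published by Bridgeworks: it assumes a single connection sends exactly one full window per round trip and then sits idle until the ack arrives, and it takes 1KB as 1,000 bytes. The function names are made up for illustration.

```python
def window_limited_throughput(window_bytes: float, rtt_seconds: float) -> float:
    """Bytes per second when one window is sent per round trip (assumed model)."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes / rtt_seconds


def link_utilisation(throughput_bytes_per_s: float, link_bits_per_s: float) -> float:
    """Fraction of the raw link rate actually carrying data."""
    return throughput_bytes_per_s * 8 / link_bits_per_s


KB = 1000  # assumption: decimal kilobytes

# The article's example: 32KB window, 100ms round trip, 1Gbit/s link.
throughput = window_limited_throughput(32 * KB, 0.100)
utilisation = link_utilisation(throughput, 1e9)
print(f"{throughput / KB:.0f}KB/sec, {utilisation:.2%} of the link")
```

Under these assumptions the link carries about 320KB/sec, which is roughly a quarter of one per cent of what a 1Gbit/s line could move.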
Nothing can be done about latency directly. Instead suppliers such as QLogic, Cisco, Brocade and others, who make iSCSI and FCIP storage data WAN transmission products, have tried making data packets larger, and compressing and deduping data to avoid repetitious bits in packets. But the latency time between the data packets is fixed and, it's assumed, immutable.
Well, yes, but you can have more than one logical TCP/IP connection on a link. Bridgeworks makes storage protocol conversion bridges - SAS to Fibre Channel, that sort of thing - and it knows about dealing with protocol-wrapped data packets and converting them to another protocol. Its idea is to send a packet across a link, then open another connection and send a packet on that, and to keep opening connections until you get an ack on the first connection and send the next packet in the sequence. That way, you increase the number of packets in transit on the wire and could have, say, a pipeline of ten packets in transit on ten connections with an ack coming back on each connection, say, 64ms after the packet on that connection was sent.
The number of open connections depends upon the round trip time (RTT) for the ack. As the RTT shortens or lengthens, the number of connections needed to keep the pipeline full falls or rises in step. Bridgeworks CEO David Trossell says: "The number of connections created will come to a steady state when the packet time on the network × the number of connections = the bandwidth of the connection."
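A back-of-the-envelope way to see how the connection count tracks RTT is to divide the amount of data that must be in flight to fill the link (link rate × RTT) by the window each connection can have outstanding. The sketch below is only that simple model, not Bridgeworks' actual algorithm; the function name is hypothetical, and the 32KB window (taken as 32,000 bytes) and 1Gbit/s link are borrowed from the earlier example.

```python
def connections_to_fill_link(link_bits_per_s: int, rtt_ms: int, window_bytes: int) -> int:
    """Connections needed so that windows in flight cover link rate x RTT.

    Assumed model: each connection keeps exactly one window outstanding.
    Integer arithmetic is used so the ceiling is exact.
    """
    if window_bytes <= 0 or rtt_ms < 0:
        raise ValueError("window_bytes must be positive and rtt_ms non-negative")
    bits_in_flight_x1000 = link_bits_per_s * rtt_ms   # bits needed, scaled by 1000 (ms)
    window_bits_x1000 = window_bytes * 8 * 1000       # one window, same scaling
    return -(-bits_in_flight_x1000 // window_bits_x1000)  # ceiling division


# 1Gbit/s link, 32,000-byte window, three round trip times.
for rtt_ms in (32, 64, 100):
    needed = connections_to_fill_link(10**9, rtt_ms, 32_000)
    print(f"RTT {rtt_ms}ms: {needed} connections")
```

With these numbers the model gives 125 connections at 32ms, 250 at 64ms and 391 at 100ms: the longer the wait for each ack, the more connections are needed to keep the wire busy.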
He then says: "If 'virtual connections' are used rather than a series of physical connections, it is possible to send all the data down a single physical connection. At this point the line utilisation rate will increase to nearly 100 per cent and the effects of latency will be reduced to almost zero. With full link bandwidth utilisation irrespective of RTT and latency, it is now possible to transport data over vast distances without performance drop-off."
Bridgeworks is aiming to sell this technology through its OEM and distribution channel to small and medium businesses. To get over the complexity of TCP/IP network set-up and management, it has developed an artificial intelligence-based management function that makes the SANSlide link nearly autonomous.
The AI manager continuously varies TCP/IP network parameters, such as window scaling, maximum concurrent session number, compression level and transmit buffer sizes. Changes that improve performance are kept; those that don't are discarded. When the product is first installed on a customer's network it is switched on and given an IP address. A self-learn period follows in which initial optimisation parameters are set. After that it constantly monitors and adjusts all parameters to optimise data transmission performance. Trossell says this means there is no user set-up and no user maintenance.
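Bridgeworks has not described how the AI manager works beyond "try a change, keep it if it helps". The generic version of that idea is a keep-if-better search, sketched below purely as an illustration. The parameter names, the `tune` function and the `toy_measure` scoring function are all invented for this example; a real system would score each trial by measuring throughput on the live link.

```python
import random
from typing import Callable, Dict

Params = Dict[str, int]


def tune(params: Params, measure: Callable[[Params], float],
         steps: int, rng: random.Random) -> Params:
    """Nudge one parameter at a time and keep the change only if the score improves."""
    best = dict(params)                      # work on a copy of the starting point
    best_score = measure(best)
    for _ in range(steps):
        candidate = dict(best)
        name = rng.choice(sorted(candidate))                 # pick a parameter
        candidate[name] = max(1, candidate[name] + rng.choice((-1, 1)))
        score = measure(candidate)
        if score > best_score:               # improvement: keep the change
            best, best_score = candidate, score
        # otherwise the change is simply dropped
    return best


def toy_measure(p: Params) -> float:
    """Invented stand-in for a throughput measurement; peaks at 8 sessions, 64KB buffer."""
    return -((p["sessions"] - 8) ** 2) - ((p["buffer_kb"] - 64) ** 2)


start = {"sessions": 1, "buffer_kb": 60}     # hypothetical initial settings
tuned = tune(start, toy_measure, steps=200, rng=random.Random(0))
print(start, "->", tuned)
```

Because a change is only ever kept when the score goes up, the tuned settings can never score worse than the starting ones. The trade-off is that a search this simple can settle on a local optimum.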
He says SANSlide can be thought of, roughly, as a bridge product with a WAN link inserted in the middle: "Because we transport the same core data we use in our bridges, we can support the same protocol on both nodes or differing protocols on both nodes. In fact we support any of the major storage protocols – a unique feature of our product."
SANSlide is pitched as cheaper and less complex than Cisco (MDS 922i) and Brocade (7500) FCIP-type products, and as faster and less complex than QLogic's 6142 product. Its target markets include remote replication, offsite archives, disaster recovery, and remote backup. The development roadmap includes compression, encryption, FCoE, priority paths, InfiniBand and multi-node support.
This is new technology, a version one product. The theory looks valid and Bridgeworks says its internal testing shows it works. It begins to look as if a small and clever UK protocol bridging company has actually come up with a pipelining method to bypass WAN latency and so steal a march on Brocade, Cisco, QLogic, Riverbed and other very much larger companies.®