Sending and Receiving Binary Data

There are a few important development considerations when exchanging binary data over a network connection. For most applications, strings should never be used to send or receive binary data; instead, applications should use byte arrays to exchange binary data.

SocketTools .NET Edition

Many of the .NET classes include overloaded versions of the Read and Write methods, which are used to receive and send data. All implementations provide support for the native string type and byte arrays, and some also include support for MemoryStream objects. If your application only sends ASCII text, then it is safe to use strings. However, if the data is binary or uses extended ASCII characters, then using a string may have unexpected results. This is because all strings are internally represented as Unicode, and any strings that are sent using the Write method will be sent as UTF-8 encoded Unicode. Likewise, any data read into a string will be treated as Unicode and converted from UTF-8 encoding to UTF-16 encoding. This can corrupt data that is sent or received using a string but is not actually Unicode text.
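The effect of these conversions can be illustrated outside of SocketTools. The following Python sketch does not use the SocketTools API; the helper names write_as_string and read_as_string are hypothetical, and the way each byte is widened to a character and the way invalid sequences are replaced are assumptions made for the illustration.

```python
def write_as_string(payload: bytes) -> bytes:
    """Model sending binary data through a string-based write.

    Assumption: each byte is first widened to one character (U+0000..U+00FF),
    and the resulting string is then UTF-8 encoded for transmission.
    """
    as_text = payload.decode("latin-1")  # one character per byte
    return as_text.encode("utf-8")       # the bytes that would be transmitted


def read_as_string(received: bytes) -> str:
    """Model reading raw bytes into a string by decoding them as UTF-8.

    Assumption: invalid sequences are replaced with U+FFFD; a real
    implementation may handle them differently.
    """
    return received.decode("utf-8", errors="replace")


payload = bytes([0x01, 0x7F, 0x80, 0xFF])
wire = write_as_string(payload)

print(payload.hex(" "))  # 01 7f 80 ff
print(wire.hex(" "))     # 01 7f c2 80 c3 bf

# 0x80 and 0xFF are not valid UTF-8 on their own, so their original
# values cannot be recovered from the decoded string.
print(ascii(read_as_string(payload)))  # '\x01\x7f\ufffd\ufffd'
```

Bytes in the ASCII range pass through unchanged, which is why string-based calls appear to work until the first byte above 0x7F is encountered.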

It is recommended that you always use byte arrays or MemoryStream objects to exchange binary data, or text that uses extended ASCII characters and is not Unicode. For example, a string that contains extended ASCII (ANSI) characters used by the local code page should be converted to a byte array before being sent using the Write method. Typically this would only affect legacy applications that don’t support Unicode or the use of UTF-8 encoded text.
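As a minimal sketch of why the conversion to a byte array matters, the following Python example encodes the same text in two ways. It assumes the local code page is Windows-1252, which is only one possibility, and it does not use the SocketTools API.

```python
text = "café"

# Assumption: the legacy peer expects Windows-1252 (a common ANSI code page).
ansi_bytes = text.encode("cp1252")

# What a Unicode-aware string write would transmit instead.
utf8_bytes = text.encode("utf-8")

print(ansi_bytes.hex(" "))  # 63 61 66 e9
print(utf8_bytes.hex(" "))  # 63 61 66 c3 a9
```

A legacy application that expects the single byte 0xE9 would instead receive two bytes, so converting the string to a byte array in the expected code page and sending that array avoids the mismatch.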

Some classes include ReadLine and WriteLine methods that are used to read and write lines of text, up to a terminating newline character. These methods should never be used with binary data under any circumstances.
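To illustrate the problem, the following Python sketch uses an in-memory stream as a stand-in for a connection; it is not the SocketTools API, and an actual ReadLine implementation may also remove the line terminator. Binary data can contain the newline byte (0x0A) anywhere, so a line-oriented read returns an arbitrary fragment of the payload.

```python
import io

# Binary payload that happens to contain newline bytes (0x0A).
payload = bytes([0x89, 0x50, 0x0A, 0x00, 0x0D, 0x0A, 0xFF])

stream = io.BytesIO(payload)

# A line-oriented read stops at the first 0x0A it finds.
first_line = stream.readline()
print(first_line.hex(" "))  # 89 50 0a

# Reading an explicit number of bytes returns the payload intact.
stream.seek(0)
block = stream.read(len(payload))
print(block == payload)     # True
```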

SocketTools Library Edition

Many of the SocketTools libraries provide lower-level functions that enable you to read or write on the socket used to establish the connection. The C/C++ prototypes for these functions typically use either LPVOID or LPBYTE for the pointer to the buffer that contains the data to be sent or received. In this case, the function will send or receive the data as-is for the number of bytes specified. However, if the function expects a string type (for example, the InetReadLine or InetWriteLine functions), then the data will be handled as text. This means that if the ANSI version of the function is called, the data will be sent as text using the current code page, up to a terminating null character. If the Unicode version of the function is called, the text will be sent as UTF-8 encoded Unicode, up to the terminating null character.

You should never use functions that expect a pointer to a string when sending or receiving binary data. Null bytes in the data will be interpreted as the end of the string. If the project is compiled to use Unicode, these functions will automatically encode any written data as UTF-8 and attempt to convert any data received back to UTF-16 encoding. In both cases, this can result in corruption of the data.
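The null byte problem can be sketched in a few lines of Python. The helper c_string_view is hypothetical and only models how a function that expects a null-terminated string would interpret a buffer; it is not part of SocketTools.

```python
def c_string_view(buffer: bytes) -> bytes:
    """Return the portion of a buffer seen by a null-terminated string function."""
    end = buffer.find(b"\x00")
    return buffer if end == -1 else buffer[:end]


# Five bytes of binary data with a null byte in the middle.
payload = bytes([0x10, 0x20, 0x00, 0x30, 0x40])

seen = c_string_view(payload)
print(len(payload))   # 5
print(seen.hex(" "))  # 10 20
```

Everything after the first null byte is silently dropped, which is why functions that take a buffer pointer and an explicit byte count should be used for binary data.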

SocketTools ActiveX Edition

The SocketTools ActiveX controls have Read and Write methods which will accept both strings (BSTRs) and byte arrays as arguments. If your application only sends ASCII text, then it is safe to use strings. However, if the data is binary or uses extended ASCII characters, then using a string may have unexpected results. This is because all strings are internally represented as Unicode, and any strings that are sent using the Write method will be sent as UTF-8 encoded Unicode. Likewise, any data read into a string will be treated as Unicode and converted from UTF-8 encoding to UTF-16 encoding. This can corrupt data that is sent or received using a string but is not actually Unicode text.

It is recommended that you always use byte arrays to exchange binary data, or text that uses extended ASCII characters and is not Unicode. For example, a string that contains extended ASCII (ANSI) characters used by the local code page should be converted to a byte array before being sent using the Write method. Typically this would only affect legacy applications that don’t support Unicode or the use of UTF-8 encoded text.

Some controls include ReadLine and WriteLine methods that are used to read and write lines of text, up to a terminating newline character. These methods should never be used with binary data under any circumstances.

Note that although Visual Basic 6.0 uses Unicode internally, the common controls do not support Unicode and will represent text using the current code page. If you are reading input entered using a control like a TextBox, and international characters are used, then you may get unexpected results. A solution is to either use controls which natively support Unicode, or convert the ANSI text back to Unicode. The StrConv function can be used to convert to and from Unicode, and the vbFromUnicode option can be used to convert a string to a byte array which can then be passed to the Write method.

See Also

Unicode Support in SocketTools
Structured Data Over Stream Sockets
