Winsock Programmer's FAQ
Section 7: Articles: The Lame List
The Lame List
Introduction
I have reproduced The Lame List here because it is so valuable. This
text is cut-and-pasted directly from Appendix C of version 2.2.2 of the
Windows Sockets 2 Application Programming Interface. The list
originally started out as a list of complaints by Winsock stack vendors
about wrongheaded applications (we won't name names here). Despite
that, these items are still valuable because newbie Winsockers still make
the same wrongheaded mistakes. Avoiding the items on this list will take
you a long way along the road toward Winsock guruhood.
The original introduction to the List:
Keith Moore of Microsoft gets the credit for starting
this, but other folks have begun contributing as well. Bob Quinn, from
sockets.com, is the kind soul who provided the elaborations on why these
things are lame and what to do instead. This is a snapshot of the list
as we went to print (plus a few extras thrown in at the last minute).
This version of the List is slightly different from the original:
I have changed some punctuation, minor bits of phrasing, etc. And,
of course, I have added all the pretty HTML formatting.
The Windows Sockets Lame List
(or What's Weak This Week)
Brought to you by The Windows Sockets Vendor Community
Calling connect() on a non-blocking socket, getting
WSAEWOULDBLOCK, then immediately calling recv()
and expecting WSAEWOULDBLOCK before the connection
has been established. Lame.
Reason: This assumes that the connection will never be
established by the time the application calls recv(). Lame
assumption.
Alternative: Don't do that. An application using a non-blocking
socket must handle the WSAEWOULDBLOCK error value,
but must not depend on occurrence of the error.
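A minimal sketch of that alternative, assuming a hypothetical helper named try_recv() and result codes that are not part of Winsock. It treats WSAEWOULDBLOCK as one acceptable outcome among several, and accepts success if the connection happens to be up already. The #else branch is only a stand-in so the sketch also builds against BSD sockets.

```c
#ifdef _WIN32
#include <winsock2.h>
#else
/* Stand-ins (assumption) so the sketch also builds against BSD sockets. */
#include <errno.h>
#include <sys/socket.h>
typedef int SOCKET;
#define WSAEWOULDBLOCK EWOULDBLOCK
#define WSAGetLastError() errno
#endif

/* Hypothetical result codes for this sketch; not part of Winsock. */
#define TRY_RECV_NOT_READY (-1) /* nothing to read yet, try again later */
#define TRY_RECV_FAILED    (-2) /* any other error, e.g. not connected  */

/* Returns bytes read (> 0), 0 if the peer closed, or one of the codes above. */
int try_recv(SOCKET s, char *buf, int len)
{
    int n = (int)recv(s, buf, len, 0);

    if (n >= 0)
        return n; /* the connection may already be established: fine */
    if (WSAGetLastError() == WSAEWOULDBLOCK)
        return TRY_RECV_NOT_READY;
    return TRY_RECV_FAILED;
}
```

The caller handles all three outcomes and never assumes which one it will see first.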
Calling select() with three empty fd_sets
and a valid TIMEOUT structure as a sleazy delay
function. Inexcusably lame.
Reason: The select() function is intended as a network
function, not a general purpose timer.
Alternative: Use a legitimate system timer service.
Polling with connect() on a non-blocking socket to
determine when the connection has been established. Dog
lame.
Reason: The Winsock 1.1 spec does not define an error for
connect() when a non-blocking connection is pending,
so the error value returned may vary.
Alternative: Using asynchronous notification of connection
completion is the recommended alternative. An application that
prefers synchronous operation mode could use the select()
function (but see item 23).
Non-Alternative: Changing a non-blocking socket to blocking mode
to block on send() or recv() is even more lame than
polling on connect().
Applications that don't properly shut down when the
user closes the main window while a blocking API is in
progress. Totally lame.
Reason: Winsock applications that fail to close their sockets and
call WSACleanup() may not allow a Winsock implementation to
reclaim resources used by the application. Resource leakage can
eventually result in resource starvation for all other Winsock
applications (i.e. network system failure).
Alternative: While a blocking API is in progress in a 16-bit
Winsock 1.1 application, the proper way to abort is to:
1. Call WSACancelBlockingCall()
2. Wait until the pending function returns. If the
cancellation occurs before the operation completes, the
pending function will fail with the WSAEINTR
error, but applications must also be prepared for success,
due to the race condition involved with cancellation.
3. Close this socket, and all other sockets. Note: the
proper closure of a connected stream socket involves:
a. call shutdown() with the how parameter equal to 1 (SD_SEND)
b. loop on recv() until it returns 0 or fails with any error
c. call closesocket()
4. Call WSACleanup()
This procedure is not relevant to 32-bit Winsock 2 applications,
since they really block, so calling WSACancelBlockingCall()
from the same thread is impossible. (Therefore, this call is
deprecated under Winsock 2.) However, step 3 above is still
useful for shutting down a socket cleanly.
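The socket-closing sequence (shutdown, drain, close) can be sketched as below. The helper name graceful_close() and the discard buffer size are assumptions, not anything the spec prescribes, and the #else branch is only a stand-in so the sketch also builds against BSD sockets.

```c
#ifdef _WIN32
#include <winsock2.h>
#else
/* Stand-ins (assumption) so the sketch also builds against BSD sockets. */
#include <sys/socket.h>
#include <unistd.h>
typedef int SOCKET;
#define SD_SEND SHUT_WR
#define closesocket close
#endif

/* Hypothetical helper: closes a connected stream socket cleanly.
 * Returns the result of closesocket() (0 on success). */
int graceful_close(SOCKET s)
{
    char discard[512];
    int n;

    /* On Winsock, SD_SEND is 1: no more sends, receives still allowed. */
    shutdown(s, SD_SEND);

    /* Drain until the peer closes (0) or recv() fails (negative). */
    do {
        n = (int)recv(s, discard, (int)sizeof(discard), 0);
    } while (n > 0);

    return closesocket(s);
}
```

On a blocking socket the drain loop waits until the peer closes its side, so a program that cannot afford to wait would want some timeout around it.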
Out of band data. Intensely lame.
Reason: TCP can't do Out of Band (OOB) data reliably. If
that isn't enough, there are incompatible differences
in the implementation at the protocol level (in the
urgent pointer offset). Berkeley (BSD) Unix implements RFC 793
literally, and many others implement the corrected RFC 1122
version. (Some versions also allow multiple OOB data bytes by
using the start of the MAC frame as the starting point for the
offset.) If two TCP hosts have different OOB versions, they
cannot send OOB data to each other.
Alternative: Ideally, you can use a separate socket for urgent
data, although in reality it is inescapable sometimes. Some
protocols require it (see item 7), in
which case you need to minimize your dependence, or beef up your
technical support staff to handle user calls.
Calling strlen() on a hostent structure's ip address,
then truncating it to four bytes, thereby overwriting part of
malloc()'s heap header. In all my years of observing
lameness, I have seldom seen something this lame.
Reason: This doesn't really need a reason, does it?
Alternative: Clearly, the only alternative is a brain
transplant.
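For those not ready for surgery, a hedged sketch of copying the address correctly: the hostent address is raw binary whose size is given by h_length, so strlen() has no business near it. The helper name copy_host_addr() is an assumption for illustration, and the #else branch is only a stand-in so the sketch also builds against BSD sockets.

```c
#include <string.h>
#ifdef _WIN32
#include <winsock2.h>
#else
/* Stand-ins (assumption) so the sketch also builds against BSD sockets. */
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#endif

/* Hypothetical helper: copies the first IPv4 address out of a hostent.
 * Returns 0 on success, -1 if the entry is not a usable IPv4 address. */
int copy_host_addr(const struct hostent *he, struct in_addr *out)
{
    if (he == NULL || he->h_addr_list == NULL ||
        he->h_addr_list[0] == NULL ||
        he->h_addrtype != AF_INET ||
        he->h_length != (int)sizeof(*out))
        return -1;

    /* The address is binary, not a C string: 10.0.0.1 contains zero
     * bytes, so strlen() would report 1. Use h_length instead. */
    memcpy(out, he->h_addr_list[0], (size_t)he->h_length);
    return 0;
}
```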
Passing a longer buffer length than the actual buffer size
since you know you won't receive more than the actual buffer
size. Universally lame.
Reason: Winsock implementations often check buffers for
readability or writability before using them to avoid Protection
Faults. When a buffer length is longer than the actual buffer
length, this check will fail, so the function call will fail with
WSAEFAULT.
Alternative: Always pass a legitimate buffer length.
Bounding every set of operations with calls to
WSAStartup() and WSACleanup(). Pushing the
lameness envelope.
Reason: This is not illegal, as long as each WSAStartup()
has a matching call to WSACleanup(), but it is more work
than necessary.
Alternative: In a DLL, custom control or class library, it is
possible to register the calling client based on a unique task
handle or process ID. This allows automatic registration without
duplication. Automatic de-registration can occur when a process
closes its last socket. This is even easier if you use the process
notification mechanisms available in the 32-bit environment.
Ignoring API errors. Glaringly lame.
Reason: Error values are your friends! When a function fails,
the error value returned by WSAGetLastError() or included in
an asynchronous message can tell you why it failed. Based
on the function that failed, and the socket state, you can often
infer what happened, why, and what to do about it.
Alternative: Check for error values, and write your
applications to anticipate them, and handle them gracefully when
appropriate. When a fatal error occurs, always display an error
message that shows:
the function that failed
the Winsock error number, and/or macro
a short description of the error meaning
suggestions for how to remedy, when possible
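A sketch of an error message built from those four pieces. The struct, the function names, and the remedy wording are all assumptions for illustration; only the three error codes and their macro names come from Winsock, and a real table would cover many more.

```c
#include <stddef.h>
#include <stdio.h>

/* One row per error: number, macro name, meaning, suggested remedy. */
struct wsa_error_info {
    int code;
    const char *macro;
    const char *meaning;
    const char *remedy;
};

/* A deliberately tiny table; the remedy texts are illustrative only. */
static const struct wsa_error_info wsa_errors[] = {
    {10054, "WSAECONNRESET", "connection reset by peer",
     "reconnect and retry"},
    {10060, "WSAETIMEDOUT", "connection timed out",
     "check that the host is reachable"},
    {10061, "WSAECONNREFUSED", "connection refused",
     "check that the server is running"},
};

/* Returns the table row for code, or NULL if the code is not listed. */
const struct wsa_error_info *find_wsa_error(int code)
{
    size_t i;

    for (i = 0; i < sizeof(wsa_errors) / sizeof(wsa_errors[0]); i++)
        if (wsa_errors[i].code == code)
            return &wsa_errors[i];
    return NULL;
}

/* Writes "<func> failed: <macro> (<code>): <meaning>. Try: <remedy>."
 * into out and returns snprintf()'s result. */
int format_wsa_error(char *out, size_t cap, const char *func, int code)
{
    const struct wsa_error_info *e = find_wsa_error(code);

    if (e == NULL)
        return snprintf(out, cap, "%s failed: Winsock error %d", func, code);
    return snprintf(out, cap, "%s failed: %s (%d): %s. Try: %s.",
                    func, e->macro, e->code, e->meaning, e->remedy);
}
```

A caller would pass the name of the function that failed along with the value from WSAGetLastError(), for example format_wsa_error(msg, sizeof msg, "connect", err).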
Calling recv(MSG_PEEK) in response to an
FD_READ async notification message. Profoundly
lame.
Reason: It's redundant. It's redundant.
Alternative: Make a plain recv() call in response
to an FD_READ message. Even if it fails with
WSAEWOULDBLOCK, that error is easy to ignore, and you
are guaranteed to get another FD_READ message later
since there is data pending.
Polling with ioctlsocket(FIONREAD) on a stream
socket until a complete "message" arrives. Exceeds the bounds
of earthly lameness.
Reason and Alternative: See item 12.
Assuming that a UDP datagram of any length may be
sent. Criminally lame.
Reason: Various networks all have their limitations on maximum
transmission unit (MTU). As a result, fragmentation will occur,
and this increases the likelihood of a corrupted datagram (more
pieces to lose or corrupt). Also, the TCP/IP service providers
at the receiving end may not be capable of re-assembling a large,
fragmented datagram.
Alternative: Check for the maximum datagram size with the
SO_MAX_MSG_SIZE socket option, and don't send anything
larger. Better yet, be even more conservative. A max of 8K is
a good rule-of-thumb.
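A sketch of that check, assuming hypothetical helper names. datagram_limit() applies the 8K rule of thumb to whatever the stack reports, and query_datagram_limit() reads SO_MAX_MSG_SIZE on a Winsock 2 build. The Winsock-only part is compiled out elsewhere, since the option does not exist in BSD sockets.

```c
#ifdef _WIN32
#include <winsock2.h>
#endif

/* The rule-of-thumb ceiling from the text above. */
#define CONSERVATIVE_MAX 8192u

/* stack_max is the value reported by the stack, or 0 if unknown.
 * Returns the largest datagram payload the application should send. */
unsigned int datagram_limit(unsigned int stack_max)
{
    if (stack_max == 0 || stack_max > CONSERVATIVE_MAX)
        return CONSERVATIVE_MAX;
    return stack_max;
}

#ifdef _WIN32
/* Asks the stack for its maximum message size on socket s and clamps it. */
unsigned int query_datagram_limit(SOCKET s)
{
    unsigned int max_msg = 0;
    int len = sizeof(max_msg);

    if (getsockopt(s, SOL_SOCKET, SO_MAX_MSG_SIZE,
                   (char *)&max_msg, &len) == SOCKET_ERROR)
        max_msg = 0; /* unknown: fall back to the conservative limit */
    return datagram_limit(max_msg);
}
#endif
```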
Assuming that UDP transmissions (especially multicast
transmissions) are reliable. Sinking in a morass of
lameness.
Reason: UDP has no reliability mechanisms (that's why we have
TCP).
Alternative: Use TCP and keep track of your own message
boundaries.
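TCP delivers a byte stream, so keeping track of message boundaries needs some framing convention. The sketch below assumes a hypothetical format in which each message carries a 2-byte big-endian length prefix; the format and the function name next_frame() are illustrative choices, not something the List prescribes.

```c
#include <stddef.h>

/* Assumed frame layout: 2-byte big-endian payload length, then payload. */
#define FRAME_HEADER_LEN 2u

/* Looks for one complete frame at the start of buf[0..avail).
 * On success, stores the payload location and length and returns the
 * total number of bytes consumed. Returns 0 if the frame is incomplete,
 * in which case the caller should recv() more data and try again. */
size_t next_frame(const unsigned char *buf, size_t avail,
                  const unsigned char **payload, size_t *payload_len)
{
    size_t len;

    if (avail < FRAME_HEADER_LEN)
        return 0;
    len = ((size_t)buf[0] << 8) | buf[1];
    if (avail - FRAME_HEADER_LEN < len)
        return 0;
    *payload = buf + FRAME_HEADER_LEN;
    *payload_len = len;
    return FRAME_HEADER_LEN + len;
}
```

The receiver appends each recv() result to a buffer and calls next_frame() in a loop, removing the consumed bytes after every non-zero return.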
Applications that require vendor-specific extensions, and
cannot run (or worse yet, load) without them. Stooping to
unspeakable depths of lameness.
Reason: If you can't figure out the reason, it's time to hang
up your keyboard.
Alternative: Have a fallback position that uses only base
capabilities for when the extension functions are not
present.
Expecting errors when UDP datagrams are dropped by the sender,
receiver, or any router along the way. Seeping lameness from
every crack and crevice.
Reason: UDP is unreliable. TCP/IP stacks don't have to tell
you when they throw your datagrams away (a sender or receiver
may do this when they don't have buffer space available, and a
receiver will do it if they cannot reassemble a large fragmented
datagram).
Alternative: Expect to lose datagrams, and deal. Implement
reliability in your application protocol, if you need it (or
use TCP, if your application allows it).
Copyright owned by the authors of the Lame List items,
including, but not necessarily limited to, the people mentioned in the
introductory matter at the beginning of this article.