I recently developed an interest in programming Unix sockets, and in particular P2P applications. In C. Yeah, I know. Masochist. This is all much easier in slower languages. But slower languages are usually more resource intensive, sometimes less portable, and can hide ham-fisted implementations that then affect absolutely everything developed in those languages, so I’m going to try to get to the very fundamentals of it, in the same language the kernel is written in.
The first thing I did was to read online Beej’s guide to network programming, which is really excellent. Here’s the fundamental guide I’m going to start with, and y’all should really go read it to understand what’s going on.
Beej’s guide to network programming!
Beej gives the following as the server code for his “hello world” socket server. It’s pretty simple: it runs, you connect to it, and it says “Hello, world!” to you.
/*
** server.c -- a stream socket server demo
*/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <sys/wait.h>
#include <signal.h>

#define PORT "3490"  // the port users will be connecting to

#define BACKLOG 10   // how many pending connections queue will hold

void sigchld_handler(int s)
{
    while(waitpid(-1, NULL, WNOHANG) > 0);
}

// get sockaddr, IPv4 or IPv6:
void *get_in_addr(struct sockaddr *sa)
{
    if (sa->sa_family == AF_INET) {
        return &(((struct sockaddr_in*)sa)->sin_addr);
    }

    return &(((struct sockaddr_in6*)sa)->sin6_addr);
}

int main(void)
{
    int sockfd, new_fd;  // listen on sock_fd, new connection on new_fd
    struct addrinfo hints, *servinfo, *p;
    struct sockaddr_storage their_addr; // connector's address information
    socklen_t sin_size;
    struct sigaction sa;
    int yes=1;
    char s[INET6_ADDRSTRLEN];
    int rv;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE; // use my IP

    if ((rv = getaddrinfo(NULL, PORT, &hints, &servinfo)) != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rv));
        return 1;
    }

    // loop through all the results and bind to the first we can
    for(p = servinfo; p != NULL; p = p->ai_next) {
        if ((sockfd = socket(p->ai_family, p->ai_socktype,
                p->ai_protocol)) == -1) {
            perror("server: socket");
            continue;
        }

        if (setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &yes,
                sizeof(int)) == -1) {
            perror("setsockopt");
            exit(1);
        }

        if (bind(sockfd, p->ai_addr, p->ai_addrlen) == -1) {
            close(sockfd);
            perror("server: bind");
            continue;
        }

        break;
    }

    if (p == NULL) {
        fprintf(stderr, "server: failed to bind\n");
        return 2;
    }

    freeaddrinfo(servinfo); // all done with this structure

    if (listen(sockfd, BACKLOG) == -1) {
        perror("listen");
        exit(1);
    }

    sa.sa_handler = sigchld_handler; // reap all dead processes
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) == -1) {
        perror("sigaction");
        exit(1);
    }

    printf("server: waiting for connections...\n");

    while(1) {  // main accept() loop
        sin_size = sizeof their_addr;
        new_fd = accept(sockfd, (struct sockaddr *)&their_addr, &sin_size);
        if (new_fd == -1) {
            perror("accept");
            continue;
        }

        inet_ntop(their_addr.ss_family,
            get_in_addr((struct sockaddr *)&their_addr),
            s, sizeof s);
        printf("server: got connection from %s\n", s);

        if (!fork()) { // this is the child process
            close(sockfd); // child doesn't need the listener
            if (send(new_fd, "Hello, world!", 13, 0) == -1)
                perror("send");
            close(new_fd);
            exit(0);
        }
        close(new_fd);  // parent doesn't need this
    }

    return 0;
}
And this is Beej’s code for the client:
/*
** client.c -- a stream socket client demo
*/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <netdb.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <arpa/inet.h>

#define PORT "3490" // the port client will be connecting to

#define MAXDATASIZE 100 // max number of bytes we can get at once

// get sockaddr, IPv4 or IPv6:
void *get_in_addr(struct sockaddr *sa)
{
    if (sa->sa_family == AF_INET) {
        return &(((struct sockaddr_in*)sa)->sin_addr);
    }

    return &(((struct sockaddr_in6*)sa)->sin6_addr);
}

int main(int argc, char *argv[])
{
    int sockfd, numbytes;
    char buf[MAXDATASIZE];
    struct addrinfo hints, *servinfo, *p;
    int rv;
    char s[INET6_ADDRSTRLEN];

    if (argc != 2) {
        fprintf(stderr,"usage: client hostname\n");
        exit(1);
    }

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    if ((rv = getaddrinfo(argv[1], PORT, &hints, &servinfo)) != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rv));
        return 1;
    }

    // loop through all the results and connect to the first we can
    for(p = servinfo; p != NULL; p = p->ai_next) {
        if ((sockfd = socket(p->ai_family, p->ai_socktype,
                p->ai_protocol)) == -1) {
            perror("client: socket");
            continue;
        }

        if (connect(sockfd, p->ai_addr, p->ai_addrlen) == -1) {
            close(sockfd);
            perror("client: connect");
            continue;
        }

        break;
    }

    if (p == NULL) {
        fprintf(stderr, "client: failed to connect\n");
        return 2;
    }

    inet_ntop(p->ai_family, get_in_addr((struct sockaddr *)p->ai_addr),
            s, sizeof s);
    printf("client: connecting to %s\n", s);

    freeaddrinfo(servinfo); // all done with this structure

    if ((numbytes = recv(sockfd, buf, MAXDATASIZE-1, 0)) == -1) {
        perror("recv");
        exit(1);
    }

    buf[numbytes] = '\0';

    printf("client: received '%s'\n",buf);

    close(sockfd);

    return 0;
}
Okay, that’s a starting point. Of course, it isn’t exactly what I wanted. Beej is a Linux programmer who does a few things that are slightly unportable. His code probably elides certain issues that are important in the real world for the sake of making a nice easy intro guide. And besides, what I’m trying to do is build a P2P app.
So, first I’m going to restructure his server code. To start with I’ll address a few stylistic issues that you’re free to agree with or not; this is just the stuff I do when getting comfortable with the code. First, Beej packed a lot more stuff into one big ‘main’ function than I like, and I’d prefer to decompose it into procedures a bit. Second, I prefer a different indentation and brace style. Third, I’m still learning this stuff so I’m going to go read man pages and add a bunch more comments. Fourth, he’s using C++ comments in C code. Yes, I know C++ comments have been standard C since C99 and are accepted by every C compiler anybody is likely to use, but call me obsessive-compulsive here. I use C++ comments in C++ code, but C is a different language and C comments are absolutely portable across all C compilers.
/*
** server.c -- a stream socket server demo (slightly restructured, v1 .... )
*/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <sys/wait.h>
#include <signal.h>

#define PORT "3490"  /* the port users will be connecting to */
#define BACKLOG 10   /* how many pending connections queue will hold */

void sigchld_handler(int s){
    /* each waitpid() call reaps one dead child process and returns its PID.
       with WNOHANG it returns 0 immediately if no child has exited yet
       (and -1 if there are no children at all), so this loop reaps every
       child that has already died and then stops. */
    while(waitpid(-1, NULL, WNOHANG) > 0);
}

/* get sockaddr, IPv4 or IPv6: */
void *get_in_addr(struct sockaddr *sa){
    if (sa->sa_family == AF_INET) {
        return &(((struct sockaddr_in*)sa)->sin_addr);
    }
    return &(((struct sockaddr_in6*)sa)->sin6_addr);
}

int setup (int * sockfd){
    struct addrinfo hints, *servinfo, *paddr; /* beej named paddr 'p' which is useless. */
    int yes=1;
    int addrerror; /* beej called this rv which is a completely useless name. */

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* AF_UNSPEC means we don't care whether IP4 or IP6 */
    hints.ai_socktype = SOCK_STREAM; /* we want a stream (TCP) socket, not a datagram (UDP) socket. */
    hints.ai_flags = AI_PASSIVE;     /* use my own IP: with a NULL node name we get the
                                        wildcard address, suitable for bind() */

    /* getaddrinfo sets servinfo to point to a linked list of candidate addresses
       and returns a nonzero error code if not successful. */
    if ((addrerror = getaddrinfo(NULL, PORT, &hints, &servinfo)) != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(addrerror));
        /* return 1 meaning unable to get any addresses to bind */
        return 1;
    }

    /* loop through all the results and bind to the first we can */
    for(paddr = servinfo; paddr != NULL; paddr = paddr->ai_next) {
        if ((*sockfd = socket(paddr->ai_family,   /* for our purposes anything other than AF_INET or AF_INET6 is useless */
                              paddr->ai_socktype, /* for our purposes anything other than SOCK_STREAM is useless */
                              paddr->ai_protocol) /* the protocol getaddrinfo picked for this entry (TCP for a stream socket) */
            ) == -1) {
            /* we couldn't create a socket for this entry. */
            perror("server: socket");
            /* continue meaning try the next thing in the list */
            continue;
        }

        /* SOL_SOCKET means manipulating options at the level of the sockets API
           (rather than the protocol level).
           SO_REUSEADDR lets us bind the port even if an earlier connection on it is
           still lingering in TIME_WAIT, so a quick restart doesn't fail with
           "address already in use".
           &yes (actually a pointer at any nonzero int) means enable that option.
           (void* because different options take different types.)
           the last arg gives the size of the thing that the &yes argument points at,
           in our case an int. */
        if (setsockopt(*sockfd, SOL_SOCKET, SO_REUSEADDR, (void*)&yes, sizeof(int)) == -1) {
            /* we got a socket but we can't set the options the way we want. */
            perror("setsockopt");
            /* exit meaning kill this process; I guess an error here is unrecoverable? */
            exit(1);
        }

        /* okay, if we got to here we have a socket descriptor and we've set the
           options necessary to use it. Now we attempt to bind it. The arguments here
           are the socket descriptor, the address, and its length. The return value is
           0 for success and -1 (with an errno set) for failure. One errno that we're
           going to be interested in for a P2P app is EADDRINUSE - meaning something
           (quite possibly another instance of us) is already bound to the port. */
        if (bind(*sockfd, paddr->ai_addr, paddr->ai_addrlen) == -1) {
            close(*sockfd);
            /* we got a socket and set the options but we couldn't bind it (maybe
               somebody else bound it before we got to it, maybe something else changed) */
            perror("server: bind");
            /* continue meaning try the next thing in the list */
            continue;
        }

        /* okay, if we got here then we bound a port, so we break the loop. (if every
           entry fails, the loop runs off the end of the list and paddr ends up NULL.) */
        break;
    }

    /* this is a check to see why the loop stopped. If paddr is null we ran out of
       servinfo entries. */
    if (paddr == NULL) {
        fprintf(stderr, "server: failed to bind\n");
        /* return 2 for failure to bind a port. */
        return 2;
    }

    /* now that we've bound something we don't need the list any more. */
    freeaddrinfo(servinfo);

    /* we've bound a port, now we try to listen to it. listen returns 0 on success,
       -1 with an errno on failure. again, we will be interested in EADDRINUSE for
       a P2P app. */
    if (listen(*sockfd, BACKLOG) == -1) {
        /* error, listen failed. kill the process. */
        perror("listen");
        exit(1);
    }

    printf("server: waiting for connections...\n");
    /* 0 for success! */
    return 0;
}

int main(int argc, char * argv[]){
    socklen_t sin_size;
    int sockfd, new_fd; /* listen on sock_fd, new connection on new_fd */
    struct sockaddr_storage their_addr; /* connector's address information */
    char csixaddr[INET6_ADDRSTRLEN]; /* beej called this s which is a completely useless name */
    struct sigaction sa;
    int rtval;

    if ((rtval = setup(&sockfd))!= 0) return rtval;

    /* reap all dead processes */
    sa.sa_handler = sigchld_handler;
    /* sigemptyset empties sa.sa_mask, so no extra signals are blocked while the
       handler runs. */
    sigemptyset(&sa.sa_mask);
    /* some of the socket calls block. SA_RESTART says that blocked calls, if
       interrupted by this signal's handler, should be restarted. */
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) == -1) {
        perror("sigaction");
        exit(1);
    }

    while(1) { /* main accept() loop */
        sin_size = sizeof their_addr;
        new_fd = accept(sockfd, (struct sockaddr *)&their_addr, &sin_size);
        if (new_fd == -1) {
            perror("accept");
            continue;
        }

        inet_ntop(their_addr.ss_family,
                  get_in_addr((struct sockaddr *)&their_addr),
                  csixaddr, sizeof csixaddr);
        printf("server: got connection from %s\n", csixaddr);

        if (!fork()) { /* this is the child process */
            close(sockfd); /* child doesn't need the listener */
            if (send(new_fd, "Hello, world!", 13, 0) == -1)
                perror("send");
            close(new_fd);
            exit(0);
        }
        close(new_fd); /* parent doesn't need this */
    }

    return 0;
}
Now the above is a very simplistic restructuring. The brace style is just my personal thing and nobody else cares really, and my use of C-style comments rather than C++ style is pedantic and I know it. Extracting setup() is a step toward more modular code, though it doesn’t really make a big difference here. I’ve read a bunch of man pages and put in a whole lot more comments about what the system calls actually do for people learning this stuff (including me).
But I haven’t changed the way anything actually works. Probably the most significant change in the above is again a purely stylistic issue: Beej used a few variable names that are just plain stupid. A good variable name, to the extent possible, isn’t a substring of anything else in your program. And Beej used both ‘s’ and ‘rv’ as variable names in a program that is all about SERVERS. He also used ‘p’ as a variable name. In particular, ‘s’ is spectacularly wrong because it’s the seventh most useless variable name possible, right after e, t, o, a, i, and n!
Now let’s have a more critical look at the code. Since I’ve separated out the routine ‘setup’, whose job is to bind a listening port, let’s look at how calls to ‘setup’ can end. First, it can bind a port and return zero. That’s the hoped-for result. Second, it can fail to find any addresses it can bind and return 1. That’s a problem, but it’s not the routine’s fault. Third, it can find addresses but fail to bind any of them and return 2. Again, that’s a problem but it’s not the routine’s fault. Fourth, it can find an address and fail to set the options it needs to bind that port, and exit. (A failed listen() ends the same way.)
Wait, what? Exit? That’s inconsistent behavior, different from all the other things that the subroutine can do. You should never write a subroutine that sometimes returns and sometimes exits. That will screw up the caller, because the caller can’t know which one it’s going to get. In this case it’s me who introduced this inconsistency by abstracting out a subroutine that contained calls to exit(). It doesn’t actually cause an error in this program, because any time setup returns nonzero, main exits (by returning) anyway. But this is a broken abstraction: although the bug doesn’t bite us here, it will bite anyone who ever calls setup and needs it to return.
Second, why are we terminating the program when we have allocated something that we haven’t released? You probably don’t notice unless you actually read the man pages, which I just have, but getaddrinfo() returns a linked list that it allocates for us, and the interface doesn’t promise how or where. In practice it’s a library routine, and on Linux and the BSDs the list is ordinary heap memory in our own process, but the only contract we’re given is that we’re responsible for calling freeaddrinfo() to release it.
Heap memory in our own process gets reclaimed when the process exits, so skipping freeaddrinfo() on the way out doesn’t cause any failure on Linux. But relying on that is a bet on the platform rather than on the API. In fact, there used to be OS’s where memory you dynamically allocated in your own process wouldn’t get reclaimed when the process exited, though those have become thankfully rare. And the moment this code gets called more than once from a long-running process, every missed freeaddrinfo() is a plain old leak. The short version of this story is that every successful call to getaddrinfo() should be paired with exactly one call to freeaddrinfo(), on every code path.
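To make that pairing concrete, here’s a minimal sketch of a routine that calls getaddrinfo() and is guaranteed to call freeaddrinfo() exactly once whenever the lookup succeeded. This is my own illustration, not Beej’s code: the helper name count_passive_addrs is made up, and all it does is count the candidate addresses.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* count_passive_addrs is a hypothetical helper, for illustration only.
   It returns the number of candidate addresses getaddrinfo() offers for
   binding the given port, or -1 if the lookup itself failed. */
int count_passive_addrs(const char *port)
{
    struct addrinfo hints, *servinfo, *paddr;
    int addrerror;
    int count = 0;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;

    addrerror = getaddrinfo(NULL, port, &hints, &servinfo);
    if (addrerror != 0) {
        /* the lookup failed, so there is no list and nothing to free. */
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(addrerror));
        return -1;
    }

    /* from here on we own the list; there is a single exit path below,
       and it goes through freeaddrinfo(). */
    for (paddr = servinfo; paddr != NULL; paddr = paddr->ai_next)
        count++;

    freeaddrinfo(servinfo);
    return count;
}
```

The design choice is simply to have one early return before the list exists and one return after it has been freed, with nothing in between that can leave the function.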
Another thing about this code is that it calls exit or return with hard-coded integer values. This is fine for Beej, because he’s a Linux programmer and the values he’s passing are correct for Linux. On Linux, an exit status of zero means a program succeeded and nonzero means it failed. But that isn’t universal. It’s only almost universal. The C standard promises that zero means success everywhere, but what any particular nonzero value means is up to the implementation. So there are these two constants in stdlib.h, EXIT_SUCCESS and EXIT_FAILURE, which each OS defines so as to give the right semantics to a program’s exit status. We should use ’em.
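A tiny sketch of what that looks like, assuming setup-style return codes where zero means success (the helper name exit_status_for is mine, purely for illustration):

```c
#include <stdlib.h>

/* exit_status_for is a hypothetical helper: it maps an internal return
   code (0 for success, anything else for failure) onto the portable
   exit statuses from stdlib.h. */
int exit_status_for(int returncode)
{
    return (returncode == 0) ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

main could then end with something like return exit_status_for(rtval); and keep the detailed codes 1, 2 and so on for its own internal use.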
So, let’s modify this some more. Although this code runs and works fine on Linux, it contains incipient bugs. setup() is a broken abstraction, and calling it in any code that needs it to return (such as after ‘malloc’ and before ‘free’) will create a bug. There are code paths that call getaddrinfo() but don’t call freeaddrinfo(), which leaks memory anywhere the OS doesn’t clean up after us. And its return values aren’t completely portable.
Fourth, I’ll fix another style thing, which other people may not care about but which, for me, made these things a bit harder to find, trace, and be sure of: wherever I reasonably can, instead of continue and break and exit(), I’ll use if/else for control flow, reserve exit() for child processes, and use return from main rather than exit() for the main process. IMO this is simpler and clearer. So here’s the revised code:
/*
** server.c -- a stream socket server demo (slightly restructured, v2 .... )
*/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <sys/wait.h>
#include <signal.h>

#define PORT "3490"  /* the port users will be connecting to */
#define BACKLOG 10   /* how many pending connections queue will hold */

void sigchld_handler(int s){
    /* each waitpid() call reaps one dead child process and returns its PID.
       with WNOHANG it returns 0 immediately if no child has exited yet
       (and -1 if there are no children at all), so this loop reaps every
       child that has already died and then stops. */
    while(waitpid(-1, NULL, WNOHANG) > 0);
}

/* get sockaddr, IPv4 or IPv6: */
void *get_in_addr(struct sockaddr *sa){
    if (sa->sa_family == AF_INET) {
        return &(((struct sockaddr_in*)sa)->sin_addr);
    }
    return &(((struct sockaddr_in6*)sa)->sin6_addr);
}

int setup (int * sockfd){
    struct addrinfo hints, *servinfo, *paddr; /* beej named paddr 'p' which is useless. */
    int yes=1;
    int addrerror; /* beej called this rv which is a useless name. */
    int returncode = 0;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* AF_UNSPEC means we don't care whether IP4 or IP6 */
    hints.ai_socktype = SOCK_STREAM; /* we want a stream (TCP) socket, not a datagram (UDP) socket. */
    hints.ai_flags = AI_PASSIVE;     /* use my own IP: with a NULL node name we get the
                                        wildcard address, suitable for bind() */

    /* getaddrinfo sets servinfo to point to a linked list of candidate addresses
       and returns a nonzero error code if not successful. */
    if ((addrerror = getaddrinfo(NULL, PORT, &hints, &servinfo)) != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(addrerror));
        /* return 1 meaning unable to get any addresses to bind */
        returncode = 1;
    }

    /* loop through all the results and bind to the first we can */
    if (returncode != 1){
        for(paddr = servinfo; paddr != NULL; paddr = paddr->ai_next) {
            if ((*sockfd = socket(paddr->ai_family,   /* for our purposes anything other than AF_INET or AF_INET6 is useless */
                                  paddr->ai_socktype, /* for our purposes anything other than SOCK_STREAM is useless */
                                  paddr->ai_protocol) /* the protocol getaddrinfo picked for this entry (TCP for a stream socket) */
                ) == -1) {
                /* we couldn't create a socket for this entry. */
                perror("server: socket");
                /* falling out of the if/else chain means try the next thing in the list */
            }
            /* SOL_SOCKET means manipulating options at the level of the sockets API
               (rather than the protocol level).
               SO_REUSEADDR lets us bind the port even if an earlier connection on it is
               still lingering in TIME_WAIT, so a quick restart doesn't fail with
               "address already in use".
               &yes (actually a pointer at any nonzero int) means enable that option.
               (void* because different options take different types.)
               the last arg gives the size of the thing that the &yes argument points at,
               in our case an int. */
            else if (setsockopt(*sockfd, SOL_SOCKET, SO_REUSEADDR, (void*)&yes, sizeof(int)) == -1) {
                /* we got a socket but we can't set the options the way we want. */
                perror("server: setsockopt");
                close(*sockfd);
            }
            /* okay, if we got to here we have a socket descriptor and we've set the
               options necessary to use it. Now we attempt to bind it. The arguments here
               are the socket descriptor, the address, and its length. The return value is
               0 for success and -1 (with an errno set) for failure. One errno that we're
               going to be interested in for a P2P app is EADDRINUSE - meaning something
               (quite possibly another instance of us) is already bound to the port. */
            else if (bind(*sockfd, paddr->ai_addr, paddr->ai_addrlen) == -1) {
                /* we got a socket and set the options but we couldn't bind it (maybe
                   somebody else bound it before we got to it, maybe something else changed) */
                perror("server: bind");
                close(*sockfd);
            }
            /* okay, if we got here then we bound a port, so we break the loop. if any
               step above failed we go round again with the next entry instead. */
            else {
                break;
            }
        } /* ends for loop */

        /* this is a check to see why the for loop stopped. If paddr is null it means
           we ran out of servinfo entries. */
        if (paddr == NULL) {
            fprintf(stderr, "server: failed to bind\n");
            /* return 2 for failure to bind a port. */
            returncode = 2;
        }

        /* now that we've bound something or used up all our entries we don't need
           the list any more. having successfully called getaddrinfo(), we must not
           return or exit without calling freeaddrinfo(). (if getaddrinfo() failed
           there is no list, which is why this lives inside the if.) */
        freeaddrinfo(servinfo);
    }

    /* we've bound a port, now we try to listen to it. listen() returns 0 on success,
       -1 with an errno on failure. again, we will be interested in EADDRINUSE for
       a P2P app. */
    if (returncode == 0){
        if (listen(*sockfd, BACKLOG) == -1) {
            /* error, listen failed. report it and let the caller decide what to do. */
            perror("server: listen");
            close(*sockfd);
            returncode = 3;
        }
    }

    /* successfully bound and listening! Yay! */
    if (returncode == 0) printf("server: waiting for connections...\n");
    return returncode;
}

int main(int argc, char * argv[]){
    socklen_t sin_size;
    int sockfd, new_fd; /* listen on sock_fd, new connection on new_fd */
    struct sockaddr_storage their_addr; /* connector's address information */
    char csixaddr[INET6_ADDRSTRLEN]; /* beej called this s which is a completely useless name */
    struct sigaction sa;
    int rtval;

    /* first we try to bind a listening port. If we fail, we return. */
    if ((rtval = setup(&sockfd))!= 0) return EXIT_FAILURE;

    /* reap all dead processes */
    sa.sa_handler = sigchld_handler;
    /* sigemptyset empties sa.sa_mask, so no extra signals are blocked while the
       handler runs. */
    sigemptyset(&sa.sa_mask);
    /* some of the socket calls block. SA_RESTART says that blocked calls, if
       interrupted by this signal's handler, should be restarted. */
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) == -1) {
        perror("server: sigaction");
        return EXIT_FAILURE;
    }

    while(1) { /* main accept() loop */
        sin_size = sizeof their_addr;
        new_fd = accept(sockfd, (struct sockaddr *)&their_addr, &sin_size);
        if (new_fd == -1) {
            perror("server: accept");
            continue;
        }

        inet_ntop(their_addr.ss_family,
                  get_in_addr((struct sockaddr *)&their_addr),
                  csixaddr, sizeof csixaddr);
        printf("server: got connection from %s\n", csixaddr);

        /* okay, we have a connection. Now we need to fork off a child process to deal
           with that connection while we go on listening. fork() is called from one
           process, and returns in BOTH of two processes. The "extra" is a newly created
           copy of the existing process. fork() returns zero in the child process and
           nonzero in the parent process (the child's PID, or -1 if the fork failed). */
        if (!fork()) { /* Fork! If this is the child process then */
            close(sockfd); /* we don't need the listener */
            if (send(new_fd, "Hello, world!", 13, 0) == -1)
                perror("child: send");
            close(new_fd);
            /* child process exits */
            exit(0);
        }
        else /* this is the parent process */
            close(new_fd); /* parent doesn't need this */
    }

    return EXIT_SUCCESS;
}
Now we return EXIT_FAILURE if we can’t bind a port to listen (for any of several reasons), we return EXIT_FAILURE if we can’t install the SIGCHLD handler that reaps our child processes, and we return EXIT_SUCCESS if …. hmmm. We return EXIT_SUCCESS if a while(1) loop exits. But in normal flow control, while(1) loops don’t exit.
In practical terms that means that if we start it from a command line we can exit with control-c (which sends SIGINT to the foreground process) or by exiting the shell that started it (in which case it typically gets a SIGHUP), and if it’s started some other way we can kill it directly with SIGTERM or SIGKILL. This is not too unreasonable for a server process. But it’s different from the way most programs behave, so we should be aware of it. That ‘return EXIT_SUCCESS’ is just a decoration that shuts up a compiler warning. Flow of control never reaches it.
In fact we’d rather not have things like this need a signal to exit, because a signal can happen at any point in flow control. And that means it becomes possible to call, e.g., getaddrinfo() or malloc() and then get killed via a signal before calling freeaddrinfo() or free(). And on a system that doesn’t clean up after a dead process, that really matters. So we should do two things: first, we should do whatever we portably can to ensure that freeaddrinfo() gets called, and called exactly once, even if interrupted by a signal, provided getaddrinfo() has been called. Second, we should provide a nicer way for it to exit on command while doing all needed cleanup. But those bugs won’t bite me on Linux, they won’t bite anybody on any OS unless this process is killed during a very brief (milliseconds) interval right after it starts, and fixing them is a bit complicated and will probably require some code that’s otherwise nonportable. So I’m going to ignore it for now, even though I already said “airtight and portable” as a general goal.
I see that I’m a bit over 3500 words, so I’ll continue this in another post.