Monthly Archives: April 2014

Network programming – 2

Okay, it’s time to move on a little bit from where I left off in my last installment. As I said, I’m making a p2p application. And so far, what I’ve got is a ‘hello world’ server that I’ve tweaked a little bit for structure and portability. It listens for new connections, and when it gets them it talks to them, but so far it can’t connect to another copy of itself and be the one talked to.

Well, the first thing we need to add, then, is a way for it to know where to find other instances of itself which it can connect to. You can do this in dozens of ways, but I’m going to install some infrastructure to handle config files and command lines, which we’ll need later anyway.

Here is code I wrote, which checks and handles command line arguments. You can also specify a configuration file on the command line, and it’ll process the lines of that configuration file in the same way it processes command line options. For the first cut I’ve also implemented --listenport to specify which port to listen on, because this is still for the server. In a bit, I’ll add --connectnode, which is needed for client or p2p apps.

 

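int handleconfigfile(char *filename){
  FILE *config;
  char cmdbuffer[180];
  int bufferloc = 0;
  int rchar;
  int failed = 0;

  /* checkcommand passes us everything after "--configfile", including
     the separating whitespace, so skip that before opening the file. */
  while (*filename == ' ' || *filename == '\t') filename++;
  config = fopen( filename, "r" );

  if ( config == NULL ) {
    /* if we can't open the file, set failure. */
    fprintf( stderr, "Could not open configfile %s\n", filename );
    failed++;
  }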

  else {
    /* read one character at a time from file. */
    while  ((rchar = fgetc( config )) != EOF && !failed ){
      /* read to the end of a line. */
      if (rchar != '\n'){
	/* if there's a line longer than 180 characters 
	   in the config file we ignore the rest of it. */
	if (bufferloc+1 < 180) cmdbuffer[bufferloc++] = rchar;
      }
      else{ /* we got a newline, a command is complete */
	    /* add a null terminator. */
	cmdbuffer[bufferloc] = 0;
	/* reset bufferloc to read next command line */
	bufferloc = 0;
	/* process this line or set failed. */
	if (handlecommand(cmdbuffer)){
	  fprintf(stderr, "error in config file: this line is unrecognized. \n%s\n", cmdbuffer);
	  failed++;
	}
      }
    }
    /* okay, this is weird but occasionally we get a config file that
       ends with something besides a newline. So there might be a
       valid unhandled command in the cmdbuffer after the loop exits.
       If that happens bufferloc will be nonzero. So handle that. */
    if (bufferloc != 0 && !failed){
      cmdbuffer[bufferloc] = 0;
      if (handlecommand(cmdbuffer)){
	/* bad command, report failure and return failure */
	fprintf(stderr, "error in config file: this line is unrecognized. \n%s\n", cmdbuffer);
	failed++;
      }
    }
    fclose( config );
  }
  /* return 0 for success or >0 for failure. */
  return failed;
}

int handlelistenport(char *portstring){
  /* portstring is a number, possibly preceded by whitespace */
  int argpoint = 0;
  long int newval = 0;
  errno = 0;
  newval = strtol(portstring, NULL, 0);
  if (errno == ERANGE || errno == EINVAL){
    fprintf(stderr, "--listenport %s found but %s is not a valid port number.\n", portstring, portstring);
    return(1); /* failure */
  }
  configoptions.listenport = newval;
  return(0); /* success */
}

/* check a string against a particular command; pass remainder of string to appropriate handler if it matches */
int checkcommand(char *command, char *commandstring,  int(*handler)(char *harg), int *rcode) {
  int argloc = strlen(command);
  if (strncmp(command, commandstring, argloc) == 0){
    *rcode = handler(&(commandstring[argloc]));
    /* 1 for this command matched */
    return(1);
  }
  /* 0 for this particular command not matching */
  return (0);
}

/* handle some command.  We don't know what it is, so we check it against all possibilities */
int handlecommand(char *commandstring){
  /* add more lines and handler routines as needed. */
  int rcode = 0;
  if (checkcommand("--configfile",   commandstring,  handleconfigfile, &rcode) ||     /*
      checkcommand("--foo",          commandstring,  foohandler,       &rcode) ||
      checkcommand("--bar",          commandstring,  barhandler,       &rcode) ||
      checkcommand("--baz",          commandstring,  bazhandler,       &rcode) ||
      checkcommand("--whateverelse", commandstring,  handlewhatever,   &rcode) ||     */
      checkcommand("--listenport",   commandstring,  handlelistenport, &rcode))

    /* if checkcommand was true for anything, then rcode will be set according to the 
       success of the handler for that option.  If the handler succeeded, we return 
       rcode to signal success, otherwise we return rcode to signal failure. */
    return rcode; 

  /* otherwise none of the checkcommand lines matched so we report to stderr and return failure. */
  fprintf(stderr, "unrecognized command: %s\n", commandstring);
  return(1);
}

int handlearguments(int argc, char *argv[]){
  char cmdbuffer[180];
  int argcount;
  int written;
  /* first make sure argc is odd, 'cause all our arguments are spec,value pairs */
  if ((argc % 2) != 1){
    /* from the user's point of view the number of args must be even, because he's not counting argv[0].*/
    fprintf(stderr, "wrong number of arguments (must be even).\n");
    return 1;
  }
  for (argcount = 1; argcount+1 < argc; argcount += 2){
    /* join the option and its value with a space.  snprintf never writes
       past the end of cmdbuffer, and its return value tells us if it had to truncate. */
    written = snprintf(cmdbuffer, sizeof cmdbuffer, "%s %s", argv[argcount], argv[argcount+1]);
    if (written < 0 || written >= (int)sizeof cmdbuffer){
      fprintf(stderr, "argument too long: %s\n", argv[argcount]);
      return 1;
    }
    if (handlecommand(cmdbuffer)){
      /* handling this pair of arguments failed. */
      fprintf(stderr, "invalid command: %s\n", cmdbuffer );
      return 1;
    } /* end if checking one command */
  } /* end for checking all commands. */
  return 0;
}
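As a side note, the chain of checkcommand calls in handlecommand could also be written as a lookup table. The following is only a sketch under my own assumptions, not the code this series uses: the names option_table, dispatch_option, demo_listenport and demo_configfile are made up for illustration, and the demo handlers just record what they were given. It also shows two small refinements: requiring whitespace after the option name (so "--listenportx" is not mistaken for "--listenport"), and skipping that whitespace before the handler sees the value.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical handlers for illustration; they only record their argument. */
char last_port[32];
char last_file[128];

int demo_listenport(const char *arg){
  snprintf(last_port, sizeof last_port, "%s", arg);
  return 0;
}

int demo_configfile(const char *arg){
  snprintf(last_file, sizeof last_file, "%s", arg);
  return 0;
}

/* One row per option: its name and the routine that handles its value. */
struct option_entry {
  const char *name;
  int (*handler)(const char *arg);
};

const struct option_entry option_table[] = {
  { "--configfile", demo_configfile },
  { "--listenport", demo_listenport },
};

/* Returns the handler's result, or 1 if no option name matches. */
int dispatch_option(const char *line){
  size_t i;
  for (i = 0; i < sizeof option_table / sizeof option_table[0]; i++){
    size_t len = strlen(option_table[i].name);
    if (strncmp(line, option_table[i].name, len) != 0) continue;
    /* require whitespace after the name so "--listenportx" is rejected */
    if (line[len] != ' ' && line[len] != '\t') continue;
    /* skip the separating whitespace so the handler sees only the value */
    const char *value = line + len;
    while (*value == ' ' || *value == '\t') value++;
    return option_table[i].handler(value);
  }
  return 1;
}
```

With this shape, adding an option means adding one row to the table, and the same line (say "--listenport 4000") works whether it came from the command line or from a config file.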

The same handler handles the configfile options and the command line options. This is IMO good design because the user who knows how to use one therefore knows how to use the other. It’s good design unless an app needs configuration so complicated that command line arguments can’t reasonably express it, anyway. But there is good news and bad news about using the same infrastructure to handle both.

The good news is that you can put a ‘--configfile secondfile’ line in your configfile, and these routines will process another configuration file. And if you put ten --configfile directives in it, it will process ten more configuration files. And if each of those ten has another ten, it’ll process a hundred more than that. This ought to please the sort of people who like spraying their configuration for each app all over fifteen dozen files scattered all over the system. The guys who package software for various Linux distributions really like to torture the configurations that way, because they try to meet various goals, rules, and guidelines (none of which are ‘ease of understanding and modifying configuration’, unfortunately), many of which conflict with each other if there’s only one configuration file.

The bad news is that configuration files capable of including each other can be made into an “include loop.” If file A says to process B, and file B says to process A, then the app would continue processing its configuration files forever. Well, not forever. Just until it runs out of stack space and ignominiously crashes.  Even though this can only happen when the user makes an obvious mistake, a “loop until crash” condition is bad and we need to fix it so the user gets at least a relevant error message and the app gets a controlled exit.
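One way to catch an include loop directly, rather than waiting for the stack to give out, is to remember which files are currently being processed and refuse to open one that is already in that chain. Here is a minimal sketch of the idea. Everything in it is an assumption for illustration: the "files" are an in-memory table (fake_files) so the example stands alone, each one includes at most one other file, and MAX_INCLUDE_DEPTH is an arbitrary limit I picked.

```c
#include <stdio.h>
#include <string.h>

#define MAX_INCLUDE_DEPTH 16   /* arbitrary nesting limit for this sketch */

/* Stand-in for the filesystem: each hypothetical file includes at most one other. */
struct fake_file { const char *name; const char *includes; };

const struct fake_file fake_files[] = {
  { "a.conf", "b.conf" },
  { "b.conf", "a.conf" },   /* a.conf and b.conf include each other */
  { "c.conf", "d.conf" },
  { "d.conf", NULL },
};

/* Names of the files currently being processed, outermost first. */
const char *include_chain[MAX_INCLUDE_DEPTH];

/* Returns 0 on success, 1 on a missing file, an include loop, or excessive nesting. */
int process_config(const char *name, int depth){
  size_t i;
  int d;
  if (depth >= MAX_INCLUDE_DEPTH){
    fprintf(stderr, "config files nested too deeply at %s\n", name);
    return 1;
  }
  /* a file that is already in the chain would include itself forever */
  for (d = 0; d < depth; d++){
    if (strcmp(include_chain[d], name) == 0){
      fprintf(stderr, "include loop: %s is already being processed\n", name);
      return 1;
    }
  }
  include_chain[depth] = name;
  for (i = 0; i < sizeof fake_files / sizeof fake_files[0]; i++){
    if (strcmp(fake_files[i].name, name) == 0){
      if (fake_files[i].includes == NULL) return 0;
      return process_config(fake_files[i].includes, depth + 1);
    }
  }
  fprintf(stderr, "could not open %s\n", name);
  return 1;
}
```

Calling process_config("a.conf", 0) reports the loop as soon as a.conf comes around the second time. Comparing names as strings is a simplification: two different paths (or a symlink) can name the same real file, so the depth limit stays in as a backstop.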

So, let’s try to integrate command line and config file handling into the server code.  Here’s a first cut.

/*
** server.c -- a stream socket server demo (restructured, command line & config file processing added)
*/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <sys/wait.h>
#include <signal.h>

#define PORT "3490"  /* the default port users will be connecting to */

#define BACKLOG 10   /* how many pending connections queue will hold by default */

/* configoptions is file scope. */
struct {
  char listenport[9];
  int backlog;
} configoptions;

void initconfig(){
  /* PORT is a short string literal, so it fits in listenport with room to spare. */
  strcpy( configoptions.listenport, PORT);
  configoptions.backlog = BACKLOG;
}

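void sigchld_handler(int s){
  /* each waitpid call with WNOHANG reaps one dead child and returns its PID.  It returns 0
     if there are children but none have exited, and -1 if there are no children at all.
     Looping until the result is not positive reaps every child that has exited. */
  int saved_errno = errno;  /* waitpid may overwrite errno in the code we interrupted */
  (void)s;                  /* the signal number is not needed */
  while(waitpid(-1, NULL, WNOHANG) > 0);
  errno = saved_errno;
}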

/* get sockaddr, IPv4 or IPv6: */
void *get_in_addr(struct sockaddr *sa){
  if (sa->sa_family == AF_INET) {
    return &(((struct sockaddr_in*)sa)->sin_addr);
  }

    return &(((struct sockaddr_in6*)sa)->sin6_addr);
}

int setup (int * sockfd){
  struct addrinfo hints, *servinfo, *paddr; 
  int yes=1;
  int addrerror;  
  int returncode = 0;

  memset(&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;         /* AF_UNSPEC means we don't care whether IP4 or IP6 */
  hints.ai_socktype = SOCK_STREAM;     /* we are using stream sockets, not datagrams. */
  hints.ai_flags = AI_PASSIVE;         /* use my own IP */

  /* getaddrinfo sets servinfo to point to linked list of bindable sockets and 
     returns a nonzero error code if not successful. */  
  if ((addrerror = getaddrinfo(NULL, configoptions.listenport, &hints, &servinfo)) != 0) { 
    fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(addrerror));
    /* return 1 meaning unable to get any addresses to bind */
    returncode = 1;
  }

  /* loop through all the results and bind to the first we can */
  if (returncode != 1){
    for(paddr = servinfo; paddr != NULL; paddr = paddr->ai_next) {
      if ((*sockfd = socket(paddr->ai_family,     /* for our purposes anything other than AF_INET or AF_INET6 is useless */
                            paddr->ai_socktype,   /* for our purposes anything other than SOCK_STREAM is useless  */
                            paddr->ai_protocol)   /* filled in by getaddrinfo; normally TCP for a stream socket */
           ) == -1) {
        /* we found an address we can't make a socket for. */
        perror("server: socket");
        continue;  /* try the next thing in the list */
      }

      /* SOL_SOCKET means manipulating options at the level of the sockets API (rather than the protocol level)
         SO_REUSEADDR lets bind() reuse a local address that is still tied up by an old connection
         (for example one lingering in TIME_WAIT after the server was restarted).
         &yes (actually a pointer at any nonzero) means enable that option. (void* because various structs for protocols)
         the last arg gives the size of the thing that the &yes argument points at, in our case an int.   */

      if (setsockopt(*sockfd, SOL_SOCKET, SO_REUSEADDR, (void*)&yes, sizeof(int)) == -1) {
        /* we found a socket but we can't set the options the way we want. */
        perror("server: setsockopt");
        close(*sockfd);
        continue;
      }

      /* okay, if we got to here we have a socket descriptor and we've set the options necessary to use it. 
         Now we attempt to bind it.  The arguments here are the socket descriptor, the address, and its length.
         The return value is 0 for success and -1 (with an errno set) for failure. One errno that we're going to 
         be interested in for a P2P app is EADDRINUSE - meaning something (quite possibly another instance of us)
         already has the port. */
      if (bind(*sockfd, paddr->ai_addr, paddr->ai_addrlen) == -1) {
        /* we found a socket and set the options but we couldn't bind it (maybe somebody else bound it before 
           we got to it, maybe something else changed)*/
        perror("server: bind");
        close(*sockfd); 
        continue;
      }
      /* okay, if we got here then we bound a port, so we break the loop.  If every entry fails instead,
         the continues above walk us off the end of the list and the loop exits with paddr == NULL. */
      break;
    }  /* ends for loop*/

    /*this is a check to see why the for loop stopped.  If paddr is null it means we ran out of servinfo entries.*/
    if (paddr == NULL)  {
      fprintf(stderr, "server: failed to bind\n");
      /* return 2 for failure to bind a port. */
      returncode = 2;
    }
  }  
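  /* now that we've bound something or used up all our entries we don't need the list any more. 
     having called getaddrinfo() successfully, we must not return or exit without calling freeaddrinfo().
     If getaddrinfo() itself failed (returncode 1) there is no list, and servinfo must not be freed.
   */
  if (returncode != 1) freeaddrinfo(servinfo); 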

  /* we've bound a port, now we try to listen to it.  listen() returns 0 on success, -1 with an errno on failure.
     again, we will be interested in EADDRINUSE for a P2P app. */

  if (returncode == 0){
    if (listen(*sockfd, configoptions.backlog) == -1) {
      /* error, listen failed.  report it and give the socket back. */
      perror("server:listen");
      close(*sockfd);
      returncode = 3;
    }
  }
  /* successfully bound and listening!  Yay!*/
  if (returncode == 0) printf("server: waiting for connections...\n");
  return returncode;
}

int configfilehandler(char *filename){
  FILE *config = NULL;
  char cmdbuffer[180];
  int bufferloc = 0;
  int rchar;
  int failed = 0;
  static int filecount = 0;

  /* checkcommand passes us everything after "--configfile", including
     the separating whitespace, so skip that before using the name. */
  while (*filename == ' ' || *filename == '\t') filename++;

  /* yes, 180 == exactly fifteen dozen.  That's my canonical definition of 'too many config files'.  */
  if (filecount++ > 180){
    fprintf(stderr, "%s%s%s", 
            "Fifteen dozen configuration files is too many. Giving up while trying to process ", filename, 
            ".\nThere is probably an 'include loop' of --configfile directives in your config files.\n" );
    /* We can return (fail) directly because we haven't opened a file we need to close yet. */
    return 1;
  }

  /* only now do we open the file, so the early return above leaks nothing. */
  config = fopen( filename, "r" );
  if ( config == NULL ) {
    /* if we can't open the file, set failure. */
    fprintf( stderr, "Could not open configfile %s\n", filename );
    failed++;
  }

  else {
    /* read one character at a time from file. */
    while  ((rchar = fgetc( config )) != EOF && !failed ){
      /* read to the end of a line. */
      if (rchar != '\n'){
	/* if there's a line longer than 180 characters 
	   in the config file we ignore the rest of it. */
	if (bufferloc+1 < 180) cmdbuffer[bufferloc++] = rchar;
      }
      else{ /* we got a newline, a command is complete */
	    /* add a null terminator. */
	cmdbuffer[bufferloc] = 0;
	/* reset bufferloc to read next command line */
	bufferloc = 0;
	/* process this line or set failed. */
	if (handlecommand(cmdbuffer) != EXIT_SUCCESS ){
	  fprintf(stderr, "error in config file: this line is unrecognized. \n%s\n", cmdbuffer);
	  failed++;
	}
      }
    }
    /* okay, this is weird but occasionally we get a config file that
       ends with something besides a newline. So there might be a
       valid unhandled command in the cmdbuffer after the loop exits.
       If that happens bufferloc will be nonzero. So handle that. */
    if (bufferloc != 0 && !failed){
      cmdbuffer[bufferloc] = 0;
      if (handlecommand(cmdbuffer) != EXIT_SUCCESS ){
	/* bad command, report failure and return failure */
	fprintf(stderr, "error in config file: this line is unrecognized. \n%s\n", cmdbuffer);
	failed++;
      }
    }
    fclose( config );
  }
  /* returns 0 for success or >0 for failure. */
  return failed;
}

int listenporthandler(char *portstring){
  /* portstring is a number, possibly preceded by whitespace */
  int argpoint = 0;
  long int newval = 0;
  errno = 0;
  /* first make sure it's valid. */
  newval = strtol(portstring, NULL, 0);
  /* You can in practice send and receive on port zero, but it is defined by IANA to be an invalid port number. */
  if (errno == ERANGE || errno == EINVAL || newval <= 0 || newval >= 65536){
    fprintf(stderr, "--listenport %s found but %s is not a valid port number.\n", portstring, portstring);
    return(1); /* failure */
  }
  /* IANA (RFC 6335) sets aside 49152-65535 as the dynamic/private range, and many systems
     hand out ephemeral ports from it. */
  if (newval > 49151){
    fprintf(stderr, "warning: preparing to listen on port %ld. On some systems ports higher than 49151 "
            "are reserved for ephemeral connections.\n", newval);
  }
  if (newval < 1024){
    fprintf(stderr, "warning: preparing to listen on port %s. Ports below 1024 are reserved for standard protocols.\n", portstring);
  }

  /* Anyway, if we got here we have a valid port.  Save it as a string in a known format, no matter how we got it. */
  snprintf(configoptions.listenport, sizeof configoptions.listenport, "%ld", newval);
  return(0); /* success */
}

int backloghandler(char *backstring){
  /* backstring is a number, possibly preceded by whitespace */
  int argpoint = 0;
  long int newval = 0;
  errno = 0;
  newval = strtol(backstring, NULL, 0);
  if (errno == ERANGE || errno == EINVAL || newval < 0){
    fprintf(stderr, "--backlog %s found but %s is not valid (must be a number >= 0).\n", backstring, backstring);
    return(1); /* failure */
  }
  if (newval >= 100){
    /* 100 is way too many for a protocol with even moderately persistent connections.  */
    fprintf(stderr, "--backlog %s found but %s is too many waiting connections.  For best results try 1-15.\n", backstring, backstring);
    return(1);
  }
  else if (newval > 20){
    fprintf(stderr, "warning, --backlog %s found; a range of 1-15 maximum waiting connections is recommended for best results.\n", 
            backstring);
  }
  configoptions.backlog = newval;
  return(0); /* success */
}

/* check a string against a particular command; pass remainder of string to appropriate handler if it matches.
   The third argument is a pointer to a procedure which takes a string and returns an int.  */
int checkcommand(char *command, char *commandstring,  int(*handler)(char *harg), int *rcode) {
  int argloc = strlen(command);
  if (strncmp(command, commandstring, argloc) == 0){
    *rcode = handler(&(commandstring[argloc]));
    /* 1 means this command matched. success/failure is in rcode. */
    return(1);
  }
  /* 0 means no match.  rcode is irrelevant. */
  return (0);
}

/* handle some command.  We don't know what it is, so we check it against all possibilities */
int handlecommand(char *commandstring){
  /* add more lines and handler routines as needed. */
  int rcode = 0;
  if (checkcommand("--configfile",   commandstring,  configfilehandler, &rcode) ||     
      checkcommand("--backlog",      commandstring,  backloghandler,    &rcode) ||
      checkcommand("--listenport",   commandstring,  listenporthandler, &rcode))

    /* if checkcommand was true for anything, then rcode will be set according to the 
       success of the handler for that option.  If the handler succeeded, we return 
       rcode to signal success, otherwise we return rcode to signal failure. We don't 
       report to stderr because it's not us that hit the error, it's the handler. */
    return rcode; 

  /* otherwise none of the checkcommand lines matched so we report to stderr and return 1 for failure. */
  fprintf(stderr, "unrecognized command: %s\n", commandstring);
  return(1);
}

int handlearguments(int argc, char *argv[]){
  char cmdbuffer[180];
  int argcount;
  int written;
  /* first make sure argc is odd, 'cause all our arguments are spec,value pairs */
  if ((argc % 2) != 1){
    /* from the user's point of view the number of args must be even, because he's not counting argv[0].*/
    fprintf(stderr, "wrong number of arguments (must be even).\n");
    return 1;
  }
  for (argcount = 1; argcount+1 < argc; argcount += 2){
    /* join the option and its value with a space.  snprintf never writes
       past the end of cmdbuffer, and its return value tells us if it had to truncate. */
    written = snprintf(cmdbuffer, sizeof cmdbuffer, "%s %s", argv[argcount], argv[argcount+1]);
    if (written < 0 || written >= (int)sizeof cmdbuffer){
      fprintf(stderr, "argument too long: %s\n", argv[argcount]);
      return 1;
    }
    if (handlecommand(cmdbuffer)){
      /* handling this pair of arguments failed. */
      fprintf(stderr, "invalid command: %s\n", cmdbuffer );
      return 1;
    } /* end of checking one argpair */
  } /* end for checking all argpairs. */
  return 0;
}

int main(int argc, char * argv[]){
  socklen_t sin_size;
  int sockfd,   new_fd;  /* listen on sock_fd, new connection on new_fd */
  struct sockaddr_storage their_addr; /* connector's address information */
  char csixaddr[INET6_ADDRSTRLEN];  
  struct sigaction sa;
  int rtval;

  /* load the defaults first, so that arguments and config files can override them. */
  initconfig();

  /* process arguments.  If that causes an error, fail. */
  if (handlearguments(argc, argv)) return EXIT_FAILURE;

  /* Try to bind a listening port.  If we fail, we return. */
  if ((rtval = setup(&sockfd))!= 0) return EXIT_FAILURE;

  /* reap all dead processes */
  sa.sa_handler = sigchld_handler; 
  /* sigemptyset clears sa.sa_mask, so no extra signals are blocked while the handler runs. */
  sigemptyset(&sa.sa_mask);
  /* some of the socket calls block.  SA_RESTART says that blocked calls, 
     if interrupted by a sigaction, should be restarted. */
  sa.sa_flags = SA_RESTART;
  /* try to set sig handling for children.  If that causes an error, fail. */
  if (sigaction(SIGCHLD, &sa, NULL) == -1) {
    perror("server:sigaction");
    return EXIT_FAILURE;
  }

  while(1) {  /* main accept() loop */
    sin_size = sizeof their_addr;
    new_fd = accept(sockfd, (struct sockaddr *)&their_addr, &sin_size);
    if (new_fd == -1) {
      perror("server: accept");
      continue;
    }

    inet_ntop(their_addr.ss_family,
              get_in_addr((struct sockaddr *)&their_addr),
              csixaddr, sizeof csixaddr);
    printf("server: got connection from %s\n", csixaddr);

    if (!fork()) { /* Fork!  If this is the child process then */
      close(sockfd); /* we don't need the listener */
      if (send(new_fd, "Hello, world!", 13, 0) == -1)
        perror("child: send");
      close(new_fd);
      exit(0);
    }
    else /* otherwise this is the parent process and the */
      close(new_fd);  /* parent doesn't need the talker. */
  }
  return EXIT_SUCCESS;
}

So now you can use command lines or config files to change the port it will listen on or the number of backlogged connections it will allow on the listening port. I added some nice error messages for too many config files (meaning probably an include loop), invalid choices for ports, and backlog values drastically too high.

I also added warning messages for stuff you can probably do and might want to in some strange circumstance, but which is almost certainly a mistake in normal use. So far, that’s using reserved ports and excessive backlog values. This is part of my general belief that no application – especially a command line application – should fail without telling you why. Nor should it allow you to put it in a drastic configuration that will suck for normal use without complaining to you first.
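One gap worth knowing about in the port check: strtol on its own will happily turn "80abc" into 80, and on many systems it doesn't set errno when it finds no digits at all. A stricter check uses strtol's second argument to see where parsing stopped. The following is a sketch, not the code from the listing above; parse_port is a name I made up, and I've assumed base 10 rather than the base 0 used in the listing (base 0 would also accept hex and octal).

```c
#include <errno.h>
#include <stdlib.h>

/* Returns the port (1-65535) on success, or -1 if text is not a valid port. */
long parse_port(const char *text){
  char *end = NULL;
  long value;
  errno = 0;
  value = strtol(text, &end, 10);   /* strtol skips leading whitespace itself */
  if (end == text) return -1;       /* no digits were consumed */
  if (errno == ERANGE) return -1;   /* too big or too small for a long */
  while (*end == ' ' || *end == '\t') end++;
  if (*end != '\0') return -1;      /* trailing junk such as "80abc" */
  if (value < 1 || value > 65535) return -1;
  return value;
}
```

So parse_port(" 3490") gives 3490, while "80abc", an empty string, "0" and "65536" all come back as -1. The warnings for low and high ports could then be layered on top of the returned value.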

But it’s still just a ‘hello world’ server, and my goal here is still to create a P2P app. Now it’s time to integrate code from Beej’s ‘client’, so these things can talk to each other instead of just having one talk to another.

I see that I’m over 3000 words again, although some would claim code doesn’t count. So, join me again tomorrow as I continue the project.