Re: Performance analysis questions

Rob McCool (robm@ncsa.uiuc.edu)
Wed, 11 May 1994 18:40:10 -0500


/*
* Re: Performance analysis questions by George Phillips
* written on May 12, 1:29am.
*
* Andrew Payne said:
* >I started hacking some instrumentation into NCSA's httpd to see where it
* >was spending time (I couldn't find a profiling tool that would work through
* >the fork(), though in retrospect it probably would have been easier to run
* >the server in single connection mode and throw out all of the startup
* >stuff). NOT counting the fork() time, I found the server spending about
* >20-30% of its time (wall and CPU) in the code that reads the request
* >headers. Code like this doesn't help (getline() in util.c):
* >
* > if((ret = read(f,&s[i],1)) <= 0) {
* >
* >Your mileage may vary.
*
* Whilst looking through the httpd code, I noticed this too. I meant to
* send off a "bug" report to Rob, but never got around to it. This is
* a pretty expensive way to go about things. Sure, a big Sparc II can
* crank through read calls at 100,000 per second, but at around 1000
* characters per HTTP/1.0 header it adds up.

An unloaded sparc 2 can do that... the problem arises when that sparc 2 is
handling 100 connections.

* This is done because it wants to hand off the file descriptor to
* CGI scripts that handle POSTs. I'd suggest the right way to fix
* things is to read a bufferful and cat the extra to the scripts that
* need it. However, a quick hack could double the speed by doing
* read(f, &s[i], 2) because you know that at least CR LF will terminate
* the line. If it's the header boundary that's a problem, you could
* quadruple the speed with read(f, &s[i], 4) since you have at least
* "GET " for HTTP/0.9 requests and HTTP/1.0 headers will terminate
* with CR LF CR LF (well, they better!).
*/

I feel compelled to explain one of the worst implementation decisions I've
ever made. Yes, that's why it was done. Why? At the time I was implementing
it, Marc was testing my code by sending me forms literally megabytes long.
At roughly the same time, Mosaic/X started sending full Accept: headers that
often totalled over 1000 bytes. I wasn't aware the headers were so long (I
thought that they were MUCH shorter), and I figured that if processing forms
that were megabytes long turned out to be very common, this would be a big win.

As it turns out, the headers for Mosaic/X are over 1000 bytes long, forms
are almost always well under 1000 bytes themselves, and my implementation
loses big time and is not scalable. This is something I was meaning to fix
in 2.0, but for the NCSA version someone else will be taking up the reins.
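
For illustration only, the "read a bufferful and pass the extra to the
script" approach suggested in the quoted text might look roughly like the
sketch below. It is not the httpd code: struct hbuf, hbuf_init, hbuf_fill,
hbuf_getline, hbuf_rest and the 4096-byte buffer size are all made-up names
and values, and the sketch assumes a blocking descriptor and does not retry
on EINTR.

```c
#include <string.h>
#include <unistd.h>

#define HBUF_SIZE 4096            /* assumed size, not from httpd */

/* Hypothetical buffered reader wrapped around a file descriptor. */
struct hbuf {
    int    fd;
    size_t pos;                   /* index of the next unread byte */
    size_t len;                   /* number of valid bytes in data */
    char   data[HBUF_SIZE];
};

static void hbuf_init(struct hbuf *b, int fd)
{
    b->fd = fd;
    b->pos = 0;
    b->len = 0;
}

/* One read() per bufferful: returns 1 on data, 0 on EOF, -1 on error. */
static int hbuf_fill(struct hbuf *b)
{
    ssize_t r = read(b->fd, b->data, sizeof b->data);

    if (r <= 0)
        return (int)r;
    b->pos = 0;
    b->len = (size_t)r;
    return 1;
}

/*
 * Copy the next line into s (n >= 1), dropping CR and the terminating LF
 * and truncating anything past n - 1 characters. Returns the line length,
 * or -1 if EOF or an error occurs before any character is stored.
 */
static int hbuf_getline(struct hbuf *b, char *s, size_t n)
{
    size_t i = 0;
    char c;

    for (;;) {
        if (b->pos == b->len && hbuf_fill(b) <= 0) {
            s[i] = '\0';
            return i ? (int)i : -1;
        }
        c = b->data[b->pos++];
        if (c == '\n')
            break;
        if (c != '\r' && i + 1 < n)
            s[i++] = c;
    }
    s[i] = '\0';
    return (int)i;
}

/*
 * Bytes already read past the header. A server would write these to the
 * script's input before the script starts reading the descriptor itself.
 */
static size_t hbuf_rest(const struct hbuf *b, const char **p)
{
    *p = b->data + b->pos;
    return b->len - b->pos;
}
```

A caller would use hbuf_getline() until it returns 0 (the blank line that
ends the header), then hand whatever hbuf_rest() reports to the script. A
1000-byte header then costs one or two read() calls instead of about 1000.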

--Rob