New Message: HTTP 1.1 "HEAD" requests

webmaster at userland.com
Sat Nov 19 03:09:22 CST 2005


A new message was posted:

Address: http://manila.userland.com/discuss/msgReader$1483

By: Matt Deatherage (frontier at gcsf.com)

It's hell on whole wheat to get Google's new sitemap feature to verify a Manila site.

First, it's hard enough to obey Google's instructions to make some random page like http://www.example.com/GOOGLE11c2901dfb630838.html return no error. I had to do it through the site structure; the server's "domain" settings page implies that it can do this, but I couldn't make /anything/ work on that page (I'm talking about http://www.example.com:5336/settings?page=1.4 in case that's not clear). In no way could I ever convince that page to serve a file from the filesystem for a given domain and path.

Second, mainresponder itself prevents Google's second test from succeeding. Google uses HTTP 1.1 "HEAD" requests both for the page it tells you to create and for another random page whose name starts with "GOOGLE404probe". If the HEAD request for the page that should not exist does not return the "404" HTTP status code, Google refuses to verify that you own the site.
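For anyone who wants to reproduce the two checks by hand before asking Google to retry, here is a rough Python sketch. It only approximates what is described above: the function names, the default probe file name, and the assumption that "no error" means status 200 are mine, not anything Google documents.

```python
import http.client
from urllib.parse import urlsplit


def head_status(url, timeout=10):
    """Send one HEAD request and return the HTTP status code."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        return conn.getresponse().status
    finally:
        conn.close()


def passes_probe(base_url, verify_name,
                 probe_name="GOOGLE404probe0000.html", fetch=head_status):
    """Approximate the two checks: the verification page must answer
    without error (assumed here to mean 200) and the made-up probe
    page must answer 404. probe_name is a hypothetical example name."""
    real = fetch(base_url.rstrip("/") + "/" + verify_name)
    missing = fetch(base_url.rstrip("/") + "/" + probe_name)
    return real == 200 and missing == 404
```

Calling passes_probe("http://www.example.com", "GOOGLE11c2901dfb630838.html") against a site should come back True only if the server answers 404 for the probe page. The fetch parameter is there so the logic can be exercised without touching the network.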

MainResponder, however, /never/ returns 404 for a HEAD request. Here's the code, from mainresponder.respond:

bundle { //JES 11/5/02: handle HTTP 1.1 HEAD requests
    if adrparamtable^.method == "HEAD" {
        adrparamtable^.responseHeaders.["Content-Length"] = sizeOf (adrparamtable^.responseBody);
        adrparamtable^.responseBody = "";
        }
    }

I had presumed that some logic elsewhere would cause the code to skip this if the object was not found, but it doesn't. According to the Apache log, mainresponder always returns code 302 along with this Content-Length header, whether or not the object exists.

The workaround was to modify mainresponder.respond temporarily, for the duration of my attempts to verify a site:

bundle { //JES 11/5/02: handle HTTP 1.1 HEAD requests
    if adrparamtable^.method == "HEAD" {
        adrparamtable^.responseHeaders.["Content-Length"] = sizeOf (adrparamtable^.responseBody);
        adrparamtable^.responseBody = "";
        if adrparamtable^.request contains "GOOGLE404probe" { //workaround: force a 404 for Google's probe page
            adrparamtable^.code = 404}}}

It took me way too long to figure out where to put this; I don't have the energy to figure out how it /should/ know it was a 404 error. (Length of responseBody == 0 won't work because then you'll get a 404 for any blank page.)
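To make the ordering problem concrete, here is a small Python model of what I think the right behavior looks like. It is not Frontier code and says nothing about how mainresponder actually looks up objects; the respond function and the pages table are made up for illustration. The point is only that the status code gets decided by the lookup, before the HEAD handling strips the body, so a blank page and a missing page stay distinguishable.

```python
def respond(method, path, pages):
    """Hypothetical responder: 'pages' maps a path to its body text."""
    body = pages.get(path)
    if body is None:
        # The lookup failed, so the status is settled here,
        # not inferred later from an empty body.
        code, body = 404, "Not Found"
    else:
        code = 200

    headers = {"Content-Length": str(len(body.encode("utf-8")))}

    if method == "HEAD":
        # HEAD keeps the status and headers of the matching GET
        # but sends no body.
        body = ""
    return code, headers, body
```

With pages = {"/blank": ""}, a HEAD for "/blank" gives 200 with Content-Length 0, while a HEAD for "/GOOGLE404probe1234.html" gives 404, which is the distinction the length-of-responseBody test can't make.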

Since Google Sitemaps are going to become very, very important, and we're obviously going to have to deal with them ourselves in Manila 9.6 (I hope and pray that 9.7 or 10.0 or whatever will generate them automatically), we at least need mainresponder.respond to return the proper HTTP codes for pages that don't exist rather than always assuming they do.

Time to go home.

This is a Manila site... http://manila.userland.com/.



