ISAPI filters and Unicode URLs

If you write a custom ISAPI filter, you have several ways to get the URL, and the differences between them are not well documented (it seems that no one has bothered updating the documentation since IIS 6.0). So I ran a couple of tests with IIS 8.0. Here is what I saw when handling an SF_NOTIFY_PREPROC_HEADERS notification. pfc is the HTTP_FILTER_CONTEXT* and pHeaders is pvNotification cast to HTTP_FILTER_PREPROC_HEADERS*.

  • pHeaders->GetHeader(pfc, "URL", buf, &size)
    This is the complete raw URL (as a char array), exactly as requested by the client and still percent-encoded. For example, requesting

    /☃?foo=bar

    results in buf containing

    /%E2%98%83?foo=bar
  • pfc->GetServerVariable(pfc, "URL", buf, &size)
    This contains the decoded path part of the requested URL, also as a char array; the query string is not included. It is probably encoded in the system ANSI code page (ACP). Characters not present in the ACP are replaced with '?'. For example, requesting the above URL results in buf containing

    /?

    Note that according to this the server variable URL shouldn't even be available in SF_NOTIFY_PREPROC_HEADERS. Apparently it is.

  • pfc->GetServerVariable(pfc, "UNICODE_URL", buf, &size)
    This is the same as above except that buf now contains a wchar_t array. Characters outside the BMP seem to be handled correctly as surrogate pairs. So with the request from earlier, buf will contain (as a UTF-16LE string):

    /☃
  • pHeaders->GetHeader(pfc, "UNICODE_URL", buf, &size)
    As one would likely expect, this doesn't exist. The call fails with ERROR_INVALID_PARAMETER.
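
To make the relationship between the raw "URL" header and the "UNICODE_URL" server variable concrete, here is a minimal, portable C++ sketch that turns the raw form into the UTF-16 form by hand. It is not what IIS does internally and it does not call any ISAPI function. The helper names decode_path and utf8_to_utf16 are made up for this illustration, and the sketch assumes the client percent-encoded the path as UTF-8, which is what the test request above did.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Value of one hexadecimal digit, or -1 if c is not a hex digit.
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: take a raw URL, drop the query string, and replace
// every %XX escape in the path with the byte it denotes.
// Malformed escapes are copied through unchanged.
std::string decode_path(const std::string& raw_url) {
    const std::string path = raw_url.substr(0, raw_url.find('?'));
    std::string out;
    for (std::size_t i = 0; i < path.size(); ++i) {
        if (path[i] == '%' && i + 2 < path.size()) {
            const int hi = hex_value(path[i + 1]);
            const int lo = hex_value(path[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out.push_back(static_cast<char>(hi * 16 + lo));
                i += 2;
                continue;
            }
        }
        out.push_back(path[i]);
    }
    return out;
}

// Hypothetical helper: convert UTF-8 bytes to UTF-16 code units.
// Code points above U+FFFF become a surrogate pair. This sketch checks
// lead and continuation bytes but does not reject overlong encodings.
std::u16string utf8_to_utf16(const std::string& s) {
    std::u16string out;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        char32_t cp = 0;
        std::size_t extra = 0;
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + extra >= s.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // split into high and low surrogate
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Calling utf8_to_utf16(decode_path("/%E2%98%83?foo=bar")) yields the two code units '/' and U+2603, which matches what UNICODE_URL reported in the test above. The ANSI "URL" variable loses that information because the snowman has no representation in a typical Western code page.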
