Reserved & unsafe characters: URL encoding

The following characters are either unsafe on one operating system or another, or reserved by the HTML or URL specification. To be assured of cross-platform compatibility and functionality, replace each one with the hexadecimal code shown below (a percent sign followed by the character's two-digit ASCII value in hexadecimal) when it appears in a URL. Do not encode a reserved character when it is used in its conventional meaning for a URL; for example, do not encode the slash (/) when it separates parts of the URL path. Always encode the unsafe characters in URLs.
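As a rough sketch of this rule, the following Python snippet encodes each path segment on its own and then joins the segments with literal slashes, so that the slash keeps its conventional meaning while everything else is encoded. It assumes the standard library's `urllib.parse.quote` as the encoder; the segment names are made up for illustration.

```python
from urllib.parse import quote

# Hypothetical path segments; the names are illustrative only.
segments = ["course notes", "week #1", "a<b>.html"]

# Encode each segment separately with safe="" so that every reserved or
# unsafe character inside a segment is percent-encoded.
encoded_segments = [quote(s, safe="") for s in segments]

# Join with literal slashes, which act as URL syntax and stay unencoded.
path = "/" + "/".join(encoded_segments)
print(path)  # /course%20notes/week%20%231/a%3Cb%3E.html
```

A slash that is part of the data rather than the syntax would be encoded by the same call: `quote("either/or", safe="")` gives `either%2For`.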

Note: this encoding differs from the encoding for named entities in the non-URL text of your documents. Use URL encoding only when the URL appears in the HTML syntax, such as in an HREF, SRC, or LOWSRC attribute.

Reserved characters

Character Name URL code
; semicolon %3B
/ slash, virgule, separatrix, or solidus %2F
? question mark %3F
: colon %3A
@ at sign %40
= equals sign %3D
& ampersand %26 (but really use &amp;)

Special note about the ampersand (&)

The ampersand is a special case because it also has a special meaning in HTML: it begins an entity. In this course, the URLs we write into our files almost always appear in HTML, so an ampersand used as URL syntax should be written as the entity &amp; rather than as the hexadecimal code %26.
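One way to picture the ampersand rule is as two separate steps: build the URL first, then escape it for HTML. The sketch below assumes Python's standard `urllib.parse.urlencode` and `html.escape`; the script name `search.cgi` and its parameters are hypothetical.

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical query parameters, for illustration only.
query = urlencode({"term": "fall", "section": "2"})
url = "search.cgi?" + query  # search.cgi?term=fall&section=2

# Escape the finished URL for use inside an HTML attribute:
# the & that separates the parameters becomes &amp;
href = escape(url, quote=True)
link = '<a href="' + href + '">Search</a>'
print(link)  # <a href="search.cgi?term=fall&amp;section=2">Search</a>
```

The browser converts &amp; back to a plain ampersand before it requests the URL, so the server still sees two separate parameters.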

Special note about "mailto:" URLs

The mailto scheme for URLs further reserves the parentheses. You must encode parentheses in the address portion of a mailto URL.
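A minimal sketch of encoding parentheses in a mailto URL, assuming Python's standard `urllib.parse.quote`; the address is invented for illustration. The at sign is passed as a safe character because it keeps its conventional meaning in the address.

```python
from urllib.parse import quote

# Hypothetical address containing a parenthesized comment.
address = "jdoe(Jane Doe)@example.com"

# Keep "@" literal; quote() percent-encodes the parentheses and the space
# because none of them is in its set of always-safe characters.
mailto = "mailto:" + quote(address, safe="@")
print(mailto)  # mailto:jdoe%28Jane%20Doe%29@example.com
```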

Unsafe characters

Character Name URL code
  space %20
< less-than sign (left angle bracket) %3C
> greater-than sign (right angle bracket) %3E
" double quote %22
# hash mark, number sign, pound sign %23
% percent mark %25
[ left square bracket %5B
] right square bracket %5D
{ left brace %7B
} right brace %7D
| vertical bar %7C
\ backslash, reverse slash, slosh, backslant, or backwhack %5C
^ caret %5E
~ tilde %7E
` backquote or backtick %60
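The codes in these tables need not be memorized: each is a percent sign followed by the character's ASCII value as two hexadecimal digits. The helper below, whose name `percent_code` is an assumption made for this sketch, derives the code for any single ASCII character.

```python
def percent_code(ch: str) -> str:
    """Return the %XX URL code for a single ASCII character."""
    if len(ch) != 1 or ord(ch) > 0x7F:
        raise ValueError("expected exactly one ASCII character")
    # Two uppercase hexadecimal digits, zero-padded.
    return f"%{ord(ch):02X}"

# Print the code for each unsafe character listed in the table above.
for ch in ' <>"#%[]{}|\\^~`':
    print(repr(ch), percent_code(ch))
```

For example, `percent_code(" ")` returns `%20` and `percent_code(";")` returns `%3B`.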

© 1996-2001 Fred Condo, Ph.D. All rights reserved.

$Id: URL-encoding.html,v 1.38 2001/02/20 01:08:44 fred Exp $