URL Encoding and Decoding


Introduction to URL

What is a URL

URL is short for Uniform Resource Locator. A URL identifies the location of a resource on the internet.

The format of a URL is scheme:[//authority]path[?query][#fragment], where authority = [userinfo@]host[:port].

  1. scheme - The protocol, such as http or ftp. Required
  2. userinfo - User information for authentication, formatted as username:password. Optional
  3. host - The host, either a domain name or an IP address. Required whenever the authority part is present
  4. port - Port number. Optional; defaults to the scheme's standard port
  5. path - The location of the resource on the host, such as a directory and file name. May be empty
  6. query - Query parameters. Optional
  7. fragment - Identifies a part of the resource, such as a section within a page. Optional

An example of a simple URL: https://www.codeeeee.com. It uses only a scheme and a host; the port is omitted, so the https default of 443 applies.
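
As a rough illustration of how these components map onto a real address, the sketch below uses the standard URL class available in browsers and Node.js. The longer sample address, including its credentials and port, is made up for the example.

```javascript
// Hypothetical URL that uses every component from the list above.
const sample = new URL("https://user:secret@www.example.com:8443/docs/page.html?lang=en#intro");

const parts = {
  scheme: sample.protocol,   // "https:" (the URL API keeps the trailing colon)
  username: sample.username, // "user"
  password: sample.password, // "secret"
  host: sample.hostname,     // "www.example.com"
  port: sample.port,         // "8443"
  path: sample.pathname,     // "/docs/page.html"
  query: sample.search,      // "?lang=en" (includes the leading "?")
  fragment: sample.hash,     // "#intro" (includes the leading "#")
};
console.log(parts);

// When the port is omitted, the URL API reports an empty string
// and the scheme's default port (443 for https) applies.
const simple = new URL("https://www.codeeeee.com");
console.log(simple.port === ""); // true
console.log(simple.pathname);    // "/"
```

Note that the URL class normalizes an empty path to "/" for schemes such as http and https.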

What is a URI

URI stands for Uniform Resource Identifier, a string that identifies a resource on the internet. URLs and URIs share the same format and are conceptually similar, so the two terms are often used interchangeably. The difference is that a URL identifies a resource by its location, while a URI may identify it by location, by name, or both. Every URL is a URI.

Why encode URLs

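URLs may contain only a limited set of ASCII characters, so any character outside that set, such as a space or a non-ASCII character, must be encoded. In addition, reserved characters such as :, / and & have special meaning in a URL and must be encoded when they appear as ordinary data; otherwise the URL may be parsed incorrectly.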
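
The parsing confusion can be seen in a small sketch. The search address and the value below are hypothetical; the point is only that an unencoded & inside a value is read as a separator between parameters.

```javascript
// Hypothetical query value that happens to contain reserved characters.
const userInput = "rock&roll=fun";

// Naive concatenation: the "&" splits the value into two parameters.
const naiveUrl = new URL("https://example.com/search?q=" + userInput);
console.log(naiveUrl.searchParams.get("q"));    // "rock"
console.log(naiveUrl.searchParams.get("roll")); // "fun"

// Encoding the value first keeps it intact as a single parameter.
const safeUrl = new URL("https://example.com/search?q=" + encodeURIComponent(userInput));
console.log(safeUrl.search);                    // "?q=rock%26roll%3Dfun"
console.log(safeUrl.searchParams.get("q"));     // "rock&roll=fun"
```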

URL encoding rules

URL encoding, also called percent-encoding, replaces each byte that needs encoding with a percent sign (%) followed by two hexadecimal digits; a space, for example, becomes %20. The full rules are specified in RFC 3986.
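
To make the rule concrete, here is a minimal sketch of percent-encoding written by hand. The function name percentEncode is hypothetical, and the sketch assumes the text is first converted to UTF-8 bytes and that only the unreserved characters of RFC 3986 (letters, digits, -, ., _ and ~) are left as they are. It is an illustration, not a replacement for the built-in functions.

```javascript
// Illustrative helper (hypothetical name): percent-encode everything
// except the RFC 3986 unreserved characters.
function percentEncode(text) {
  const unreserved = /^[A-Za-z0-9\-._~]$/;
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the input
  let out = "";
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && unreserved.test(ch)) {
      out += ch; // safe ASCII character, copied unchanged
    } else {
      // "%" followed by the byte value as two uppercase hex digits
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("a b"));  // "a%20b"
console.log(percentEncode("é"));    // "%C3%A9" (two UTF-8 bytes)
console.log(percentEncode("你好")); // "%E4%BD%A0%E5%A5%BD"
```

A non-ASCII character takes more than one byte in UTF-8, which is why a single character can expand into several %XX groups.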

In JavaScript, encodeURIComponent and decodeURIComponent can be used to encode and decode the individual components of a URL, such as a query parameter value.
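
A short usage sketch follows. The sample strings are invented for illustration. It also shows the related built-in encodeURI, which is meant for a whole URL and therefore leaves reserved characters such as :, /, ? and & untouched.

```javascript
// Encode a single component (for example a query value), then decode it again.
const rawValue = "name=Tom & Jerry/猫";
const encodedValue = encodeURIComponent(rawValue);
console.log(encodedValue); // "name%3DTom%20%26%20Jerry%2F%E7%8C%AB"
console.log(decodeURIComponent(encodedValue) === rawValue); // true

// encodeURI keeps the URL structure and encodes only what cannot appear at all.
const wholeUrl = encodeURI("https://example.com/a b?x=1&y=2");
console.log(wholeUrl); // "https://example.com/a%20b?x=1&y=2"

// decodeURIComponent throws a URIError on malformed input, so guard untrusted text.
let decodeFailed = false;
try {
  decodeURIComponent("%E7%8C"); // truncated UTF-8 sequence
} catch (err) {
  decodeFailed = err instanceof URIError;
}
console.log(decodeFailed); // true
```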