url
1.12.11devel
The Sofia url module contains macros and functions for using the URL datatype url_t and for parsing and printing URLs.
The URL library provides the URL datatype and related helper functions. It includes a URL parser, which separates the components of a URL into the fields of the url_t structure.
The formal URI syntax is defined in RFC 3986.
URLs consist of a subset of printable ASCII (ECMA-5) characters. The subset excludes space and characters commonly used as delimiters in text-based protocols, such as < > # % and " (double quote), and the so-called unwise characters whose positions are reserved for national extensions in ECMA-5. In US-ASCII, those characters are: { } | \ ^ [ ] `
There are also nine characters that can have special syntactic meaning in some parts of the URI. These reserved characters are used to separate the syntactic parts of a URL from each other. The reserved characters are as follows: : @ / ; ? & = + and $.
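The character classes described above can be illustrated with a small standalone sketch. It is not part of the Sofia API: the enum sketch_class and the function sketch_classify() are hypothetical names introduced here, and the character sets are copied from the two paragraphs above.

```c
#include <string.h>

/* Hypothetical classification, following the character sets listed above. */
enum sketch_class { SKETCH_EXCLUDED, SKETCH_RESERVED, SKETCH_PLAIN };

enum sketch_class sketch_classify(int c)
{
  /* Control characters, space and anything outside printable ASCII */
  if (c <= 0x20 || c >= 0x7f)
    return SKETCH_EXCLUDED;
  /* Delimiters and "unwise" characters */
  if (strchr("<>#%\"{}|\\^[]`", c))
    return SKETCH_EXCLUDED;
  /* The nine reserved characters */
  if (strchr(":@/;?&=+$", c))
    return SKETCH_RESERVED;
  return SKETCH_PLAIN;
}
```

With this sketch, sketch_classify('@') yields SKETCH_RESERVED, while a space or a backquote yields SKETCH_EXCLUDED and would have to be %-encoded before it can appear in a URL.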
The URL library understands two alternative URL syntaxes. First, the basic syntax used by, e.g., ftp:, http: and rtsp: URLs:
scheme ":" ["//" [ user [":" password ] "@"] host [":" port ] ] ["/" path ] ["?" query ] ["#" fragment ]
Alternatively, the syntax used by mailto:, sip:, im:, tel: and pres: URLs:
scheme ":" [ [ user [":" password ] "@"] host [":" port ] ] [";" params ] ["?" query ] ["#" fragment ]
Note that the URL parser also considers "*" to be a valid URL (with type url_any).
For example:
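As a rough illustration of the second syntax, the sketch below splits a sip:-style URL in place by overwriting delimiters with NUL. It is not the library's parser: struct sketch_url, cut_at() and sketch_url_split() are hypothetical names. The sketch assumes that the user part contains no ";" or "?", that the host is not a bracketed IPv6 address, and it does not handle "*".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the components; absent parts stay NULL. */
struct sketch_url {
  char *scheme, *user, *password, *host, *port;
  char *params, *query, *fragment;
};

/* Terminate s at the first c; return the text after it, or NULL. */
static char *cut_at(char *s, char c)
{
  char *p = strchr(s, c);
  if (p)
    *p++ = '\0';
  return p;
}

/* Split "scheme:[[user[:password]@]host[:port]][;params][?query][#fragment]".
 * Modifies s. Returns 0 on success, -1 if there is no scheme. */
int sketch_url_split(char *s, struct sketch_url *u)
{
  static struct sketch_url const empty = { 0 };
  char *rest, *at;

  *u = empty;
  rest = cut_at(s, ':');
  if (rest == NULL || *s == '\0')
    return -1;
  u->scheme = s;

  /* Peel off trailing parts from the right-most syntactic element first. */
  u->fragment = cut_at(rest, '#');
  u->query = cut_at(rest, '?');
  u->params = cut_at(rest, ';');

  at = strchr(rest, '@');
  if (at) {
    *at = '\0';
    u->user = rest;
    u->password = cut_at(rest, ':');
    rest = at + 1;
  }
  u->host = rest;
  u->port = cut_at(rest, ':');
  return 0;
}
```

Given a writable buffer holding "sip:alice:secret@example.com:5060;transport=tcp?subject=hi", the sketch leaves scheme "sip", user "alice", password "secret", host "example.com", port "5060", params "transport=tcp" and query "subject=hi", with fragment left as NULL.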
The function url_make() converts a string to a freshly allocated url_t structure. The URL components are split into parts as shown above. The hex encoding using % is removed if the encoded character can syntactically be part of the field. For instance, "%41" is decoded as "A" in the user part, but "%40" (@) is left as is. (This is called canonization of the URL fields.)
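The canonization rule can be sketched in plain C as follows. This is an illustration under stated assumptions, not the code used by url_make(): the function sketch_canonize() and its allowed argument are hypothetical, and the set of characters that may be decoded in a given field is supplied by the caller rather than taken from the library.

```c
#include <ctype.h>
#include <string.h>

/* Value of a hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Decode %XX in place when the decoded character is alphanumeric or listed
 * in `allowed`; every other escape is copied through unchanged. */
char *sketch_canonize(char *s, char const *allowed)
{
  char *in = s, *out = s;

  while (*in) {
    int hi = -1, lo = -1;

    if (in[0] == '%' &&
        (hi = hex_value((unsigned char)in[1])) >= 0 &&
        (lo = hex_value((unsigned char)in[2])) >= 0) {
      int c = hi * 16 + lo;
      if (c > 0 && c < 128 && (isalnum(c) || strchr(allowed, c))) {
        *out++ = (char)c;       /* safe to store unescaped */
        in += 3;
        continue;
      }
    }
    *out++ = *in++;             /* keep the character (or escape) as is */
  }
  *out = '\0';
  return s;
}
```

Called on a buffer holding "%41lice%40home" with allowed set to "-_.!~*'()", the sketch produces "Alice%40home": the escaped "A" is decoded, while the escaped "@" stays encoded because an unescaped "@" would end the user part.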
The function url_format() is provided for generating the URL with printf()-like formatting.
For example, when we make the url from the string below
the components are NUL-terminated, canonized and assigned to the structure as follows:
You can use the functions url_param() and url_have_param() to access particular parameters in the url->url_params string.
The function url_as_string() converts the contents of a url_t structure to a newly allocated string.
The include file <sofia-sip/url.h> contains the types, functions and macros of the URL module. The functions and macros are also listed here for reference. The most important functions and macros for manipulating URLs are here:
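To show what such a parameter lookup involves, here is a standalone sketch that searches a ";"-separated parameter string. It does not reproduce url_param(): the name sketch_get_param(), its return convention and the case-insensitive name matching are assumptions made for this illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Look up `name` in a string such as "transport=tcp;lr;maddr=192.0.2.1".
 * Returns 1 if found (copying the value, possibly empty, into `value`),
 * 0 otherwise. The value is truncated to fit vlen bytes including NUL. */
int sketch_get_param(char const *params, char const *name,
                     char *value, size_t vlen)
{
  size_t nlen = strlen(name);

  while (params && *params) {
    size_t len = strcspn(params, ";");    /* length of this parameter */
    size_t i = 0;

    if (len >= nlen && (len == nlen || params[nlen] == '=')) {
      while (i < nlen &&
             tolower((unsigned char)params[i]) ==
             tolower((unsigned char)name[i]))
        i++;
      if (i == nlen) {
        size_t off = len > nlen ? nlen + 1 : nlen;  /* skip the '=' */
        size_t n = len - off;
        if (value && vlen > 0) {
          if (n >= vlen)
            n = vlen - 1;
          memcpy(value, params + off, n);
          value[n] = '\0';
        }
        return 1;
      }
    }
    params += len;
    if (*params == ';')
      params++;
  }
  return 0;
}
```

For the parameter string "transport=tcp;lr;maddr=192.0.2.1", looking up "maddr" copies "192.0.2.1" into the buffer, looking up "lr" succeeds with an empty value, and looking up "ttl" returns 0.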
There are functions for handling %-encoding used in URLs:
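The library's own functions are not reproduced here. As a rough standalone illustration of what %-encoding a string involves, the sketch below escapes every character that is excluded from URLs, plus any character the caller marks as reserved for the field in question. The names must_escape() and sketch_escape() and the buffer-size convention are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Nonzero if c cannot appear literally: control, space, non-ASCII,
 * an excluded character, or one the caller lists in `reserved`. */
static int must_escape(unsigned char c, char const *reserved)
{
  if (c <= 0x20 || c >= 0x7f)
    return 1;
  if (strchr("<>#%\"{}|\\^[]`", c))
    return 1;
  return reserved != NULL && strchr(reserved, c) != NULL;
}

/* Write the %-encoded form of src to dst. Returns the encoded length
 * (without NUL). Nothing is written if dst is NULL or size is too small,
 * so the caller can allocate length + 1 bytes and call again. */
size_t sketch_escape(char *dst, size_t size,
                     char const *src, char const *reserved)
{
  static char const hex[] = "0123456789ABCDEF";
  char const *s;
  size_t n = 0;

  for (s = src; *s; s++)
    n += must_escape((unsigned char)*s, reserved) ? 3 : 1;

  if (dst == NULL || n + 1 > size)
    return n;

  for (s = src; *s; s++) {
    unsigned char c = (unsigned char)*s;
    if (must_escape(c, reserved)) {
      *dst++ = '%';
      *dst++ = hex[c >> 4];
      *dst++ = hex[c & 15];
    } else {
      *dst++ = (char)c;
    }
  }
  *dst = '\0';
  return n;
}
```

Escaping "a b@c" with "@" passed as reserved gives "a%20b%40c" and a return value of 9. Passing a NULL destination only computes the required length.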
There are a few functions and macros that help with resolving URLs:
In addition to the basic URL structure, url_t, the library interface provides a union type url_string_t for passing unparsed strings instead of parsed URLs as function arguments:
There are macros for printf()-like formatting of URLs:
These functions calculate the MD5 digest of a URL or add the contents of a URL to an MD5 sum:
SIP or SIPS URIs have some parameters that control the transport of the request. In some cases, they should be detected and removed:
Finally, there are functions used as building blocks for protocol parsers using URLs: