public class URI
extends java.lang.Object
The elements are always stored in escaped form.
While RFC-2396 distinguishes between just two forms of URIs, those that follow the generic syntax and those that don't, this class knows about a third form, named semi-generic, used by quite a few popular schemes. Semi-generic syntax treats the path part as opaque, i.e. it has the form <scheme>://<authority>/<opaque>. Relative URIs of this type are only resolved as far as absolute paths; relative paths do not exist.
Ideally, java.net.URL should subclass URI.
Modifier and Type | Field and Description |
---|---|
protected static java.util.BitSet |
alphanumChar |
protected static java.util.Hashtable |
defaultPorts |
static boolean |
ENABLE_BACKWARDS_COMPATIBILITY
If true, then the parser will resolve certain URIs in a backwards
compatible (but technically incorrect) manner.
|
static java.util.BitSet |
escpdFragChar
list of characters which must not be escaped when escaping a fragment identifier
|
static java.util.BitSet |
escpdPathChar
list of characters which must not be escaped when escaping a path
|
static java.util.BitSet |
escpdQueryChar
list of characters which must not be escaped when escaping a query string
|
protected java.lang.String |
fragment |
protected static int |
GENERIC |
protected java.lang.String |
host |
protected static java.util.BitSet |
hostChar |
protected static java.util.BitSet |
markChar |
protected java.lang.String |
opaque |
protected static int |
OPAQUE |
protected static java.util.BitSet |
opaqueChar |
protected java.lang.String |
path |
protected static java.util.BitSet |
pcharChar |
protected int |
port |
protected java.lang.String |
query |
protected static java.util.BitSet |
reg_nameChar |
protected static java.util.BitSet |
reservedChar |
static java.util.BitSet |
resvdHostChar
list of characters which must not be unescaped when unescaping a host
|
static java.util.BitSet |
resvdPathChar
list of characters which must not be unescaped when unescaping a path
|
static java.util.BitSet |
resvdQueryChar
list of characters which must not be unescaped when unescaping a query string
|
static java.util.BitSet |
resvdSchemeChar
list of characters which must not be unescaped when unescaping a scheme
|
static java.util.BitSet |
resvdUIChar
list of characters which must not be unescaped when unescaping a userinfo
|
protected java.lang.String |
scheme |
protected static java.util.BitSet |
schemeChar |
protected static int |
SEMI_GENERIC |
protected int |
type |
protected static java.util.BitSet |
unreservedChar |
protected static java.util.BitSet |
uricChar |
protected java.net.URL |
url |
protected java.lang.String |
userinfo |
protected static java.util.BitSet |
userinfoChar |
protected static java.util.Hashtable |
usesGenericSyntax |
protected static java.util.Hashtable |
usesSemiGenericSyntax |
Constructor and Description |
---|
URI(java.lang.String uri)
Constructs a URI from the given string representation.
|
URI(java.lang.String scheme,
java.lang.String opaque)
Constructs an opaque URI from the given parts.
|
URI(java.lang.String scheme,
java.lang.String host,
int port,
java.lang.String path)
Constructs a URI from the given parts.
|
URI(java.lang.String scheme,
java.lang.String host,
java.lang.String path)
Constructs a URI from the given parts, using the default port for
this scheme (if known).
|
URI(java.lang.String scheme,
java.lang.String userinfo,
java.lang.String host,
int port,
java.lang.String path,
java.lang.String query,
java.lang.String fragment)
Constructs a URI from the given parts.
|
URI(URI base,
java.lang.String rel_uri)
Constructs a URI from the given string representation, relative to
the given base URI.
|
URI(java.net.URL url)
Constructs a URI from the given URL.
|
Modifier and Type | Method and Description |
---|---|
static java.lang.String |
canonicalizePath(java.lang.String path)
Remove all "/../" and "/./" from path, where possible.
|
static int |
defaultPort(java.lang.String protocol)
Return the default port used by a given protocol.
|
boolean |
equals(java.lang.Object other) |
static char[] |
escape(char[] elem,
java.util.BitSet allowed_char,
boolean utf8)
Escape any character not in the given character class.
|
static java.lang.String |
escape(java.lang.String elem,
java.util.BitSet allowed_char,
boolean utf8)
Escape any character not in the given character class.
|
java.lang.String |
getFragment() |
java.lang.String |
getHost() |
java.lang.String |
getOpaque() |
java.lang.String |
getPath() |
java.lang.String |
getPathAndQuery() |
int |
getPort() |
java.lang.String |
getQueryString() |
java.lang.String |
getScheme() |
java.lang.String |
getUserinfo() |
int |
hashCode()
The hash code is calculated over scheme, host, path, and query.
|
boolean |
isGenericURI()
Does the scheme specific part of this URI use the generic-URI syntax?
|
boolean |
isSemiGenericURI()
Does the scheme specific part of this URI use the semi-generic-URI syntax?
|
static void |
main(java.lang.String[] args)
Run test set.
|
java.lang.String |
toExternalForm() |
java.lang.String |
toString()
Return the URI as string.
|
java.net.URL |
toURL()
Will try to create a java.net.URL object from this URI.
|
static java.lang.String |
unescape(java.lang.String str,
java.util.BitSet reserved)
Unescape escaped characters (i.e.
|
static boolean |
usesGenericSyntax(java.lang.String scheme) |
static boolean |
usesSemiGenericSyntax(java.lang.String scheme) |
public static final boolean ENABLE_BACKWARDS_COMPATIBILITY
base = http://a/b/c/d;p?q
rel = http:g
result = http:g (correct)
result = http://a/b/c/g (backwards compatible)
See RFC-2396, section 5.2, step 3, second paragraph.
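The strict result above can be reproduced with the standard library's java.net.URI, which also resolves references according to RFC 2396. This is only a sketch: it uses java.net.URI rather than this class, and the class name StrictResolution is made up for illustration.

```java
import java.net.URI;

public class StrictResolution {
    /** Resolves rel against base with java.net.URI (RFC 2396 rules). */
    public static String resolve(String base, String rel) {
        return URI.create(base).resolve(rel).toString();
    }

    public static void main(String[] args) {
        String base = "http://a/b/c/d;p?q";
        // "http:g" carries its own scheme, so strict resolution returns it unchanged.
        System.out.println(resolve(base, "http:g"));
        // Without the scheme, "g" is a relative path and is merged with the base path.
        System.out.println(resolve(base, "g"));
    }
}
```

The second call shows the result that the backwards compatible mode would also produce for `http:g`.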
protected static final java.util.Hashtable defaultPorts
protected static final java.util.Hashtable usesGenericSyntax
protected static final java.util.Hashtable usesSemiGenericSyntax
protected static final java.util.BitSet alphanumChar
protected static final java.util.BitSet markChar
protected static final java.util.BitSet reservedChar
protected static final java.util.BitSet unreservedChar
protected static final java.util.BitSet uricChar
protected static final java.util.BitSet pcharChar
protected static final java.util.BitSet userinfoChar
protected static final java.util.BitSet schemeChar
protected static final java.util.BitSet hostChar
protected static final java.util.BitSet opaqueChar
protected static final java.util.BitSet reg_nameChar
public static final java.util.BitSet resvdSchemeChar
public static final java.util.BitSet resvdUIChar
public static final java.util.BitSet resvdHostChar
public static final java.util.BitSet resvdPathChar
public static final java.util.BitSet resvdQueryChar
public static final java.util.BitSet escpdPathChar
public static final java.util.BitSet escpdQueryChar
public static final java.util.BitSet escpdFragChar
protected static final int OPAQUE
protected static final int SEMI_GENERIC
protected static final int GENERIC
protected int type
protected java.lang.String scheme
protected java.lang.String opaque
protected java.lang.String userinfo
protected java.lang.String host
protected int port
protected java.lang.String path
protected java.lang.String query
protected java.lang.String fragment
protected java.net.URL url
public URI(java.lang.String uri) throws ParseException
Parameters:
uri - a String containing an absolute URI
Throws:
ParseException - if no scheme can be found or a specified port cannot be parsed as a number
public URI(URI base, java.lang.String rel_uri) throws ParseException
Parameters:
base - the base URI, relative to which rel_uri is to be parsed
rel_uri - a String containing a relative or absolute URI
Throws:
ParseException - if base is null and rel_uri is not an absolute URI, or if base is not null and the scheme is not known to use the generic syntax, or if a given port cannot be parsed as a number
public URI(java.net.URL url) throws ParseException
Parameters:
url - the URL
Throws:
ParseException - if url.toExternalForm() generates an invalid string representation
public URI(java.lang.String scheme, java.lang.String host, java.lang.String path) throws ParseException
Parameters:
scheme - the scheme (sometimes known as protocol)
host - the host
path - the path part
Throws:
ParseException - if scheme is null
public URI(java.lang.String scheme, java.lang.String host, int port, java.lang.String path) throws ParseException
Parameters:
scheme - the scheme (sometimes known as protocol)
host - the host
port - the port
path - the path part
Throws:
ParseException - if scheme is null
public URI(java.lang.String scheme, java.lang.String userinfo, java.lang.String host, int port, java.lang.String path, java.lang.String query, java.lang.String fragment) throws ParseException
Parameters:
scheme - the scheme (sometimes known as protocol)
userinfo - the userinfo
host - the host
port - the port
path - the path part
query - the query string
fragment - the fragment identifier
Throws:
ParseException - if scheme is null
public URI(java.lang.String scheme, java.lang.String opaque) throws ParseException
Parameters:
scheme - the scheme (sometimes known as protocol)
opaque - the opaque part
Throws:
ParseException - if scheme is null
public static java.lang.String canonicalizePath(java.lang.String path)
Parameters:
path - the path to canonicalize
public static boolean usesGenericSyntax(java.lang.String scheme)
public static boolean usesSemiGenericSyntax(java.lang.String scheme)
public static final int defaultPort(java.lang.String protocol)
Parameters:
protocol - the protocol
public java.lang.String getScheme()
public java.lang.String getOpaque()
public java.lang.String getHost()
public int getPort()
public java.lang.String getUserinfo()
public java.lang.String getPath()
public java.lang.String getQueryString()
public java.lang.String getPathAndQuery()
public java.lang.String getFragment()
public boolean isGenericURI()
In general, URIs are split into two categories: opaque-URI and generic-URI. The generic-URI syntax is the syntax most are familiar with from URLs such as ftp- and http-URLs, which is roughly:
generic-URI = scheme ":" [ "//" server ] [ "/" ] [ path_segments ] [ "?" query ]
(see RFC-2396 for exact syntax). Only URLs using the generic-URI syntax can be used to create and resolve relative URIs.
Whether a given scheme is parsed according to the generic-URI syntax or whether it is treated as opaque is determined by an internal table of URI schemes.
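As a rough illustration of the generic-URI parts named above, the following sketch takes a URI apart with the standard library's java.net.URI. Note the difference from this class: java.net.URI decides between opaque and hierarchical purely from the syntax of the string, not from a table of schemes. The class name GenericParts and the example URIs are assumptions made for the example.

```java
import java.net.URI;

public class GenericParts {
    /** Returns true if java.net.URI classifies the string as an opaque URI. */
    public static boolean isOpaque(String uri) {
        return URI.create(uri).isOpaque();
    }

    public static void main(String[] args) {
        URI u = URI.create("http://www.example.com:8080/docs/index.html?lang=en");
        System.out.println(u.getScheme()); // scheme
        System.out.println(u.getHost());   // server host
        System.out.println(u.getPort());   // server port
        System.out.println(u.getPath());   // path_segments, with the leading "/"
        System.out.println(u.getQuery());  // query
        // A mailto URI has no "//" or "/" after the scheme, so it is opaque.
        System.out.println(isOpaque("mailto:user@example.com"));
    }
}
```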
public boolean isSemiGenericURI()
Many schemes which don't follow the full generic syntax actually follow a reduced form where the path part is treated as opaque. This is used, for example, by ldap, smtp, pop, etc., and is roughly:
semi-generic-URI = scheme ":" [ "//" server ] [ "/" [ opaque_path ] ]
I.e. parsing is identical to the generic syntax, except that the path part is not further parsed. URLs using the semi-generic-URI syntax can be used to create and resolve relative URIs, with the restriction that all paths are treated as absolute.
Whether a given scheme is parsed according to the semi-generic-URI syntax is determined by an internal table of URI schemes.
See Also: isGenericURI()
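The semi-generic form described above can be sketched as a small splitter that stops parsing at the first "/" after the server part. This is a hypothetical helper written for illustration, not the parser used by this class; the name SemiGenericSplit, the null convention for missing parts, and the ldap example are all assumptions.

```java
public class SemiGenericSplit {
    /**
     * Splits scheme ":" [ "//" server ] [ "/" [ opaque_path ] ] into
     * {scheme, server, opaque_path}. Missing parts are returned as null.
     */
    public static String[] split(String uri) {
        int colon = uri.indexOf(':');
        if (colon < 1) {
            throw new IllegalArgumentException("no scheme in: " + uri);
        }
        String scheme = uri.substring(0, colon);
        String rest = uri.substring(colon + 1);
        String server = null;
        String opaque = null;
        if (rest.startsWith("//")) {
            int slash = rest.indexOf('/', 2);
            if (slash < 0) {
                server = rest.substring(2);
            } else {
                server = rest.substring(2, slash);
                // Everything after the first "/" is kept verbatim:
                // no path segments and no query are split off.
                opaque = rest.substring(slash + 1);
            }
        } else if (rest.startsWith("/")) {
            opaque = rest.substring(1);
        }
        return new String[] { scheme, server, opaque };
    }

    public static void main(String[] args) {
        String[] parts = split("ldap://ldap.example.com/o=Example,c=US?cn?sub");
        System.out.println(parts[0]);
        System.out.println(parts[1]);
        System.out.println(parts[2]);
    }
}
```

Because the opaque part is never broken into segments, there is nothing to merge a relative path against, which is consistent with the restriction that all paths are treated as absolute.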
public java.net.URL toURL() throws java.net.MalformedURLException
Throws:
java.net.MalformedURLException - if no handler is available for the scheme
public java.lang.String toExternalForm()
public java.lang.String toString()
Overrides: toString in class java.lang.Object
See Also: toExternalForm()
public boolean equals(java.lang.Object other)
Overrides: equals in class java.lang.Object
public int hashCode()
Overrides: hashCode in class java.lang.Object
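The method summary states that the hash code is calculated over scheme, host, path, and query. The sketch below shows how a hash over such a subset stays consistent with equals. The class UriKey is hypothetical, and which fields equals compares here (the four hashed ones plus the fragment) is an assumption, not taken from this class.

```java
import java.util.Objects;

public class UriKey {
    private final String scheme;
    private final String host;
    private final String path;
    private final String query;
    private final String fragment;

    public UriKey(String scheme, String host, String path, String query, String fragment) {
        this.scheme = scheme;
        this.host = host;
        this.path = path;
        this.query = query;
        this.fragment = fragment;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof UriKey)) return false;
        UriKey o = (UriKey) other;
        return Objects.equals(scheme, o.scheme)
            && Objects.equals(host, o.host)
            && Objects.equals(path, o.path)
            && Objects.equals(query, o.query)
            && Objects.equals(fragment, o.fragment);
    }

    @Override
    public int hashCode() {
        // Only a subset of the compared fields is hashed. Equal objects still
        // produce equal hash codes, which is all the contract requires.
        return Objects.hash(scheme, host, path, query);
    }

    public static void main(String[] args) {
        UriKey a = new UriKey("http", "a", "/b", "q", "top");
        UriKey b = new UriKey("http", "a", "/b", "q", "top");
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());
    }
}
```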
public static java.lang.String escape(java.lang.String elem, java.util.BitSet allowed_char, boolean utf8)
Parameters:
elem - the string to escape
allowed_char - the BitSet of all allowed characters
utf8 - if true, will first UTF-8 encode unallowed characters
public static char[] escape(char[] elem, java.util.BitSet allowed_char, boolean utf8)
Parameters:
elem - the array of characters to escape
allowed_char - the BitSet of all allowed characters
utf8 - if true, will first UTF-8 encode unallowed characters
public static final java.lang.String unescape(java.lang.String str, java.util.BitSet reserved) throws ParseException
Parameters:
str - the string to unescape
reserved - the characters which may not be unescaped, or null
Throws:
ParseException - if the two digits following a `%' are not a valid hex number
public static void main(java.lang.String[] args) throws java.lang.Exception
Throws:
java.lang.Exception - if any test fails
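A round trip through the escape and unescape methods documented above is a natural candidate for such a test. The following self-contained sketch mimics their documented behaviour with a BitSet of allowed characters and a BitSet of reserved characters. It is not this class's implementation: EscapeSketch and alphanum() are made-up names, UTF-8 encoding is always applied, and IllegalArgumentException stands in for ParseException.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

public class EscapeSketch {
    /** Builds a BitSet holding ASCII letters and digits (illustrative set). */
    public static BitSet alphanum() {
        BitSet set = new BitSet(128);
        for (char c = 'a'; c <= 'z'; c++) set.set(c);
        for (char c = 'A'; c <= 'Z'; c++) set.set(c);
        for (char c = '0'; c <= '9'; c++) set.set(c);
        return set;
    }

    /** Percent-escapes every UTF-8 byte that is not an allowed ASCII character. */
    public static String escape(String elem, BitSet allowed) {
        StringBuilder out = new StringBuilder();
        for (byte b : elem.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            if (c < 0x80 && allowed.get(c)) {
                out.append((char) c);
            } else {
                out.append('%');
                out.append(Character.toUpperCase(Character.forDigit(c >> 4, 16)));
                out.append(Character.toUpperCase(Character.forDigit(c & 0xF, 16)));
            }
        }
        return out.toString();
    }

    /** Decodes %xx sequences, leaving those that decode to a reserved character. */
    public static String unescape(String str, BitSet reserved) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (int i = 0; i < str.length(); i++) {
            char ch = str.charAt(i);
            if (ch > 0x7F) {
                throw new IllegalArgumentException("not ASCII at index " + i);
            }
            if (ch != '%') {
                buf.write(ch);
                continue;
            }
            if (i + 2 >= str.length()) {
                throw new IllegalArgumentException("truncated escape at index " + i);
            }
            int hi = Character.digit(str.charAt(i + 1), 16);
            int lo = Character.digit(str.charAt(i + 2), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("invalid hex digits at index " + i);
            }
            int value = hi * 16 + lo;
            if (reserved != null && reserved.get(value)) {
                // Keep the escape sequence as written.
                buf.write('%');
                buf.write(str.charAt(i + 1));
                buf.write(str.charAt(i + 2));
            } else {
                buf.write(value);
            }
            i += 2;
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String escaped = escape("a b/c", alphanum());
        System.out.println(escaped);
        System.out.println(unescape(escaped, null));
    }
}
```

Passing a reserved set that contains '/' to unescape leaves `%2F` untouched, which is the role the resvd*Char fields play for the individual URI parts.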