public class URI extends Object
The elements are always stored in escaped form.
While RFC-2396 distinguishes between just two forms of URIs, those that follow the generic syntax and those that don't, this class knows about a third form, named semi-generic, used by quite a few popular schemes. Semi-generic syntax treats the path part as opaque, i.e. has the form <scheme>://<authority>/<opaque>. Relative URIs of this type are only resolved as far as absolute paths; relative paths do not exist.
Ideally, java.net.URL should subclass URI.
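Since the elements are always stored in escaped form, it may help to see what escaping against a BitSet of allowed characters can look like. The following is a minimal sketch, not this class's implementation: the class name EscapeSketch, the helper lettersAndSlash() and the choice to always UTF-8 encode (the real escape method takes a utf8 flag) are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical stand-alone sketch; it does not use the URI class documented here.
final class EscapeSketch {
    private static final String HEX = "0123456789ABCDEF";

    /** Builds a small example character class: lowercase ASCII letters and '/'. */
    static BitSet lettersAndSlash() {
        BitSet allowed = new BitSet(128);
        for (char c = 'a'; c <= 'z'; c++) {
            allowed.set(c);
        }
        allowed.set('/');
        return allowed;
    }

    /** Percent-escapes every character of elem that is not set in allowed. */
    static String escape(String elem, BitSet allowed) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < elem.length(); ) {
            int cp = elem.codePointAt(i);
            i += Character.charCount(cp);
            if (cp < 128 && allowed.get(cp)) {
                out.append((char) cp);
            } else {
                // Assumption: always UTF-8 encode, then write each byte as %XX.
                byte[] bytes = new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    out.append('%')
                       .append(HEX.charAt((b >> 4) & 0xF))
                       .append(HEX.charAt(b & 0xF));
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The space and the literal '%' are outside the allowed set, so both are escaped.
        System.out.println(escape("/my docs/a%b", lettersAndSlash())); // /my%20docs/a%25b
    }
}
```

Note that a literal '%' is itself escaped here; a component that is already in escaped form should not be passed through such a routine a second time.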
Modifier and Type | Field and Description |
---|---|
protected static BitSet | alphanumChar |
protected static Hashtable | defaultPorts |
static boolean | ENABLE_BACKWARDS_COMPATIBILITY: If true, then the parser will resolve certain URIs in a backwards compatible (but technically incorrect) manner. |
static BitSet | escpdFragChar: list of characters which must not be escaped when escaping a fragment identifier |
static BitSet | escpdPathChar: list of characters which must not be escaped when escaping a path |
static BitSet | escpdQueryChar: list of characters which must not be escaped when escaping a query string |
protected String | fragment |
protected static int | GENERIC |
protected String | host |
protected static BitSet | hostChar |
protected static BitSet | markChar |
protected String | opaque |
protected static int | OPAQUE |
protected static BitSet | opaqueChar |
protected String | path |
protected static BitSet | pcharChar |
protected int | port |
protected String | query |
protected static BitSet | reg_nameChar |
protected static BitSet | reservedChar |
static BitSet | resvdHostChar: list of characters which must not be unescaped when unescaping a host |
static BitSet | resvdPathChar: list of characters which must not be unescaped when unescaping a path |
static BitSet | resvdQueryChar: list of characters which must not be unescaped when unescaping a query string |
static BitSet | resvdSchemeChar: list of characters which must not be unescaped when unescaping a scheme |
static BitSet | resvdUIChar: list of characters which must not be unescaped when unescaping a userinfo |
protected String | scheme |
protected static BitSet | schemeChar |
protected static int | SEMI_GENERIC |
protected int | type |
protected static BitSet | unreservedChar |
protected static BitSet | uricChar |
protected URL | url |
protected String | userinfo |
protected static BitSet | userinfoChar |
protected static Hashtable | usesGenericSyntax |
protected static Hashtable | usesSemiGenericSyntax |
Constructor and Description |
---|
URI(String uri): Constructs a URI from the given string representation. |
URI(String scheme, String opaque): Constructs an opaque URI from the given parts. |
URI(String scheme, String host, int port, String path): Constructs a URI from the given parts. |
URI(String scheme, String host, String path): Constructs a URI from the given parts, using the default port for this scheme (if known). |
URI(String scheme, String userinfo, String host, int port, String path, String query, String fragment): Constructs a URI from the given parts. |
URI(URI base, String rel_uri): Constructs a URI from the given string representation, relative to the given base URI. |
URI(URL url): Constructs a URI from the given URL. |
Modifier and Type | Method and Description |
---|---|
static String | canonicalizePath(String path): Remove all "/../" and "/./" from path, where possible. |
static int | defaultPort(String protocol): Return the default port used by a given protocol. |
boolean | equals(Object other) |
static char[] | escape(char[] elem, BitSet allowed_char, boolean utf8): Escape any character not in the given character class. |
static String | escape(String elem, BitSet allowed_char, boolean utf8): Escape any character not in the given character class. |
String | getFragment() |
String | getHost() |
String | getOpaque() |
String | getPath() |
String | getPathAndQuery() |
int | getPort() |
String | getQueryString() |
String | getScheme() |
String | getUserinfo() |
int | hashCode(): The hash code is calculated over scheme, host, path, and query. |
boolean | isGenericURI(): Does the scheme specific part of this URI use the generic-URI syntax? |
boolean | isSemiGenericURI(): Does the scheme specific part of this URI use the semi-generic-URI syntax? |
static void | main(String[] args): Run test set. |
String | toExternalForm() |
String | toString(): Return the URI as string. |
URL | toURL(): Will try to create a java.net.URL object from this URI. |
static String | unescape(String str, BitSet reserved): Unescape escaped characters (i.e. %xx sequences). |
static boolean | usesGenericSyntax(String scheme) |
static boolean | usesSemiGenericSyntax(String scheme) |
public static final boolean ENABLE_BACKWARDS_COMPATIBILITY
  base   = http://a/b/c/d;p?q
  rel    = http:g
  result = http:g            (correct)
  result = http://a/b/c/g    (backwards compatible)
See RFC-2396, section 5.2, step 3, second paragraph.
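The two results above can be reproduced with the JDK's own java.net.URI, which follows the strict reading. The sketch below is only an illustration under stated assumptions: the class SameSchemeResolution and its lenient() helper are hypothetical, and the way lenient() strips a scheme equal to the base's is one plausible reading of the backwards compatible behaviour, not the code of the class documented here.

```java
import java.net.URI;

// Hypothetical illustration built on java.net.URI from the JDK.
final class SameSchemeResolution {

    /** Strict resolution: a reference that carries a scheme is already absolute. */
    static String strict(String base, String rel) {
        return URI.create(base).resolve(rel).toString();
    }

    /** Lenient resolution: a leading scheme equal to the base's is dropped first. */
    static String lenient(String base, String rel) {
        URI b = URI.create(base);
        String prefix = b.getScheme() + ":";
        boolean sameScheme = rel.regionMatches(true, 0, prefix, 0, prefix.length());
        boolean hasAuthority = rel.startsWith("//", prefix.length());
        if (sameScheme && !hasAuthority) {
            // Treat "http:g" as if the reference had been just "g".
            rel = rel.substring(prefix.length());
        }
        return b.resolve(rel).toString();
    }

    public static void main(String[] args) {
        String base = "http://a/b/c/d;p?q";
        System.out.println(strict(base, "http:g"));  // http:g
        System.out.println(lenient(base, "http:g")); // http://a/b/c/g
    }
}
```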
protected static final Hashtable defaultPorts
protected static final Hashtable usesGenericSyntax
protected static final Hashtable usesSemiGenericSyntax
protected static final BitSet alphanumChar
protected static final BitSet markChar
protected static final BitSet reservedChar
protected static final BitSet unreservedChar
protected static final BitSet uricChar
protected static final BitSet pcharChar
protected static final BitSet userinfoChar
protected static final BitSet schemeChar
protected static final BitSet hostChar
protected static final BitSet opaqueChar
protected static final BitSet reg_nameChar
public static final BitSet resvdSchemeChar
public static final BitSet resvdUIChar
public static final BitSet resvdHostChar
public static final BitSet resvdPathChar
public static final BitSet resvdQueryChar
public static final BitSet escpdPathChar
public static final BitSet escpdQueryChar
public static final BitSet escpdFragChar
protected static final int OPAQUE
protected static final int SEMI_GENERIC
protected static final int GENERIC
protected int type
protected String scheme
protected String opaque
protected String userinfo
protected String host
protected int port
protected String path
protected String query
protected String fragment
protected URL url
public URI(String uri) throws ParseException
uri
- a String containing an absolute URI
ParseException
- if no scheme can be found or a specified port cannot be parsed as a number
public URI(URI base, String rel_uri) throws ParseException
base
- the base URI, relative to which rel_uri is to be parsed
rel_uri
- a String containing a relative or absolute URI
ParseException
- if base is null and rel_uri is not an absolute URI, or if base is not null and the scheme is not known to use the generic syntax, or if a given port cannot be parsed as a number
public URI(URL url) throws ParseException
url
- the URL
ParseException
- if url.toExternalForm() generates an invalid string representation
public URI(String scheme, String host, String path) throws ParseException
scheme
- the scheme (sometimes known as protocol)
host
- the host
path
- the path part
ParseException
- if scheme is null
public URI(String scheme, String host, int port, String path) throws ParseException
scheme
- the scheme (sometimes known as protocol)
host
- the host
port
- the port
path
- the path part
ParseException
- if scheme is null
public URI(String scheme, String userinfo, String host, int port, String path, String query, String fragment) throws ParseException
scheme
- the scheme (sometimes known as protocol)
userinfo
- the userinfo
host
- the host
port
- the port
path
- the path part
query
- the query string
fragment
- the fragment identifier
ParseException
- if scheme is null
public URI(String scheme, String opaque) throws ParseException
scheme
- the scheme (sometimes known as protocol)
opaque
- the opaque part
ParseException
- if scheme is null
public static String canonicalizePath(String path)
path
- the path to canonicalize
public static boolean usesGenericSyntax(String scheme)
public static boolean usesSemiGenericSyntax(String scheme)
public static final int defaultPort(String protocol)
protocol
- the protocol
public String getScheme()
public String getOpaque()
public String getHost()
public int getPort()
public String getUserinfo()
public String getPath()
public String getQueryString()
public String getPathAndQuery()
public String getFragment()
public boolean isGenericURI()
In general, URIs are split into two categories: opaque-URI and generic-URI. The generic-URI syntax is the syntax most are familiar with from URLs such as ftp- and http-URLs, which is roughly:
  generic-URI = scheme ":" [ "//" server ] [ "/" ] [ path_segments ] [ "?" query ]
(see RFC-2396 for exact syntax). Only URLs using the generic-URI syntax can be used to create and resolve relative URIs.
Whether a given scheme is parsed according to the generic-URI syntax or whether it is treated as opaque is determined by an internal table of URI schemes.
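A table-driven classification of this kind can be sketched as follows. This is only an illustration: the class SchemeTable, its enum, the handful of schemes listed and the fallback to opaque for unknown schemes are all assumptions, and the actual contents of the internal table are not given in this documentation.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of a scheme lookup table; the entries are illustrative only.
final class SchemeTable {
    enum Syntax { OPAQUE, SEMI_GENERIC, GENERIC }

    private static final Map<String, Syntax> TABLE = new HashMap<>();
    static {
        TABLE.put("http", Syntax.GENERIC);
        TABLE.put("ftp", Syntax.GENERIC);
        TABLE.put("ldap", Syntax.SEMI_GENERIC);
        TABLE.put("pop", Syntax.SEMI_GENERIC);
    }

    /** Looks up the syntax for a scheme; assumption: unknown schemes count as opaque. */
    static Syntax syntaxOf(String scheme) {
        return TABLE.getOrDefault(scheme.toLowerCase(Locale.ROOT), Syntax.OPAQUE);
    }

    public static void main(String[] args) {
        System.out.println(syntaxOf("HTTP"));   // GENERIC
        System.out.println(syntaxOf("ldap"));   // SEMI_GENERIC
        System.out.println(syntaxOf("mailto")); // OPAQUE
    }
}
```

Lower-casing the scheme before the lookup reflects that scheme names are case-insensitive under RFC-2396.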
public boolean isSemiGenericURI()
Many schemes which don't follow the full generic syntax actually follow a reduced form where the path part is treated as opaque. This is used for example by ldap, smtp, pop, etc., and is roughly
  semi-generic-URI = scheme ":" [ "//" server ] [ "/" [ opaque_path ] ]
I.e. parsing is identical to the generic syntax, except that the path part is not further parsed. URLs using the semi-generic-URI syntax can be used to create and resolve relative URIs with the restriction that all paths are treated as absolute.
Whether a given scheme is parsed according to the semi-generic-URI syntax is determined by an internal table of URI schemes.
See Also: isGenericURI()
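The reduced grammar above can be turned into a small splitter. The following is a sketch under assumptions, not the parser used by this class: the class SemiGenericSketch, its regular expression and the example LDAP URI are made up for illustration, and the server part is returned as a single unparsed string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical splitter for: scheme ":" [ "//" server ] [ "/" [ opaque_path ] ]
final class SemiGenericSketch {
    private static final Pattern FORM = Pattern.compile(
            "^([A-Za-z][A-Za-z0-9+.-]*):(?://([^/]*))?(?:/(.*))?$");

    /** Returns { scheme, server, opaque_path }; absent optional parts are null. */
    static String[] split(String uri) {
        Matcher m = FORM.matcher(uri);
        if (!m.matches()) {
            throw new IllegalArgumentException("not in semi-generic form: " + uri);
        }
        // The opaque path is handed back untouched; it is not split any further.
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = split("ldap://ldap.example.com/o=Example,c=US");
        System.out.println(parts[0]); // ldap
        System.out.println(parts[1]); // ldap.example.com
        System.out.println(parts[2]); // o=Example,c=US
    }
}
```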
public URL toURL() throws MalformedURLException
MalformedURLException
- if no handler is available for the scheme
public String toExternalForm()
public String toString()
Overrides: toString in class Object
See Also: toExternalForm()
public boolean equals(Object other)
public int hashCode()
public static String escape(String elem, BitSet allowed_char, boolean utf8)
elem
- the string to escape
allowed_char
- the BitSet of all allowed characters
utf8
- if true, will first UTF-8 encode unallowed characters
public static char[] escape(char[] elem, BitSet allowed_char, boolean utf8)
elem
- the array of characters to escape
allowed_char
- the BitSet of all allowed characters
utf8
- if true, will first UTF-8 encode unallowed characters
public static final String unescape(String str, BitSet reserved) throws ParseException
str
- the string to unescape
reserved
- the characters which may not be unescaped, or null
ParseException
- if the two digits following a '%' are not a valid hex number
Copyright © 2000-2014. All Rights Reserved.