public final class Pattern
extends Object
implements Serializable

java.lang.Object
   ↳ java.util.regex.Pattern

Class Overview

Represents a pattern used for matching, searching, or replacing strings. Patterns are specified in terms of regular expressions and compiled using an instance of this class. They are then used in conjunction with a Matcher to perform the actual search.

A typical use case looks like this:

 Pattern p = Pattern.compile("Hello, A[a-z]*!");
  
 Matcher m = p.matcher("Hello, Android!");
 boolean b1 = m.matches(); // true
  
 m.reset("Hello, Robot!");
 boolean b2 = m.matches(); // false
 

The above code could also be written more compactly, though this variant is less efficient because the Pattern and Matcher objects are created on the fly instead of being reused:

     boolean b1 = Pattern.matches("Hello, A[a-z]*!", "Hello, Android!"); // true
     boolean b2 = Pattern.matches("Hello, A[a-z]*!", "Hello, Robot!");   // false
 

Please consult the package documentation for an overview of the regular expression syntax used in this class as well as Android-specific implementation details.
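
As a rough illustration of the reuse advice above, the following sketch compiles the greeting pattern once and re-targets a single Matcher for each input. The class name GreetingChecker and the method countMatches are hypothetical names chosen for this example only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of java.util.regex.
final class GreetingChecker {
    // Compiled once and shared by every call.
    private static final Pattern GREETING = Pattern.compile("Hello, A[a-z]*!");

    /** Counts how many of the given inputs match the greeting pattern in full. */
    static int countMatches(String... inputs) {
        Matcher m = GREETING.matcher(""); // one Matcher, reset for each input
        int count = 0;
        for (String input : inputs) {
            if (m.reset(input).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("Hello, Android!", "Hello, Robot!", "Hello, Apple!"));
    }
}
```

Holding the Pattern in a static final field is one common way to avoid recompiling the same expression on every call.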

Summary

Constants
int CANON_EQ This constant specifies that a character in a Pattern and a character in the input string only match if they are canonically equivalent.
int CASE_INSENSITIVE This constant specifies that a Pattern is matched case-insensitively.
int COMMENTS This constant specifies that a Pattern may contain whitespace or comments.
int DOTALL This constant specifies that the '.' meta character matches arbitrary characters, including line endings, which is normally not the case.
int LITERAL This constant specifies that the whole Pattern is to be taken literally, that is, all meta characters lose their meanings.
int MULTILINE This constant specifies that the meta characters '^' and '$' match the beginning and end of each input line, respectively.
int UNICODE_CASE This constant specifies that a Pattern is matched case-insensitively with regard to all Unicode characters.
int UNIX_LINES This constant specifies that only the Unix line ending ('\n') is recognized as a line terminator by the '.', '^', and '$' meta characters.
Public Methods
static Pattern compile(String pattern, int flags)
Compiles a regular expression, creating a new Pattern instance in the process.
static Pattern compile(String pattern)
Compiles a regular expression, creating a new Pattern instance in the process.
int flags()
Returns the flags that have been set for this Pattern.
Matcher matcher(CharSequence input)
Returns a Matcher for the Pattern and a given input.
static boolean matches(String regex, CharSequence input)
Tries to match a given regular expression against a given input.
String pattern()
Returns the regular expression that was compiled into this Pattern.
static String quote(String s)
Quotes a given string using "\Q" and "\E", so that all other meta-characters lose their special meaning.
String[] split(CharSequence inputSeq, int limit)
Splits the given input sequence around occurrences of the Pattern.
String[] split(CharSequence input)
Splits a given input around occurrences of a regular expression.
String toString()
Returns a string containing a concise, human-readable description of this object.
Protected Methods
void finalize()
Called before the object's memory is reclaimed by the VM.
Inherited Methods
From class java.lang.Object

Constants

public static final int CANON_EQ

This constant specifies that a character in a Pattern and a character in the input string only match if they are canonically equivalent. It is (currently) not supported in Android.

Constant Value: 128 (0x00000080)

public static final int CASE_INSENSITIVE

This constant specifies that a Pattern is matched case-insensitively. That is, the patterns "a+" and "A+" would both match the string "aAaAaA".

Note: For Android, the CASE_INSENSITIVE constant (currently) always includes the meaning of the UNICODE_CASE constant. So if case insensitivity is enabled, this automatically extends to all Unicode characters. The UNICODE_CASE constant itself has no special consequences.

Constant Value: 2 (0x00000002)
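
A minimal sketch of the "a+" example above, comparing the default behavior with CASE_INSENSITIVE. The class CaseInsensitiveDemo and its matches helper are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for this example only.
final class CaseInsensitiveDemo {
    /** Compiles regex with the given flags and tests it against the whole input. */
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("a+", 0, "aAaAaA"));                        // false
        System.out.println(matches("a+", Pattern.CASE_INSENSITIVE, "aAaAaA")); // true
        System.out.println(matches("A+", Pattern.CASE_INSENSITIVE, "aAaAaA")); // true
    }
}
```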

public static final int COMMENTS

This constant specifies that a Pattern may contain whitespace or comments. Otherwise comments and whitespace are taken as literal characters.

Constant Value: 4 (0x00000004)
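
The sketch below shows one way COMMENTS can make a pattern self-documenting. It assumes the usual convention that a comment starts with '#' and runs to the end of the line; the class CommentsDemo and the field YEAR_MONTH are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for this example only.
final class CommentsDemo {
    // Whitespace and '#' comments inside the pattern are ignored under COMMENTS.
    static final Pattern YEAR_MONTH = Pattern.compile(
            "\\d{4}  # four-digit year\n"
          + "-       # separator\n"
          + "\\d{2}  # two-digit month",
            Pattern.COMMENTS);

    public static void main(String[] args) {
        System.out.println(YEAR_MONTH.matcher("2024-05").matches());   // true
        System.out.println(YEAR_MONTH.matcher("2024 - 05").matches()); // false
    }
}
```

Only whitespace in the pattern is ignored; whitespace in the input still has to be matched explicitly.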

public static final int DOTALL

This constant specifies that the '.' meta character matches arbitrary characters, including line endings, which is normally not the case.

Constant Value: 32 (0x00000020)
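
A short sketch of the difference DOTALL makes when the input contains a line ending. The class DotAllDemo and its helper are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for this example only.
final class DotAllDemo {
    /** Tests whether "a.b" matches the whole input under the given flags. */
    static boolean dotMatches(int flags, String input) {
        return Pattern.compile("a.b", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(dotMatches(0, "a\nb"));              // false: '.' skips '\n'
        System.out.println(dotMatches(Pattern.DOTALL, "a\nb")); // true
    }
}
```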

public static final int LITERAL

This constant specifies that the whole Pattern is to be taken literally, that is, all meta characters lose their meanings.

Constant Value: 16 (0x00000010)
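
As an illustration, the sketch below compiles the same text with and without LITERAL. The class LiteralDemo and its helper are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for this example only.
final class LiteralDemo {
    /** Tests whether the text "a.b", compiled with the given flags, matches the input. */
    static boolean matches(int flags, String input) {
        return Pattern.compile("a.b", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(0, "axb"));               // true: '.' is a meta character
        System.out.println(matches(Pattern.LITERAL, "axb")); // false: '.' is a plain dot
        System.out.println(matches(Pattern.LITERAL, "a.b")); // true
    }
}
```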

public static final int MULTILINE

This constant specifies that the meta characters '^' and '$' match the beginning and end of each input line, respectively. Normally, they match only the beginning and the end of the complete input.

Constant Value: 8 (0x00000008)
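
The following sketch counts how often "^\w+$" is found in a three-line input, with and without MULTILINE. The class MultilineDemo and the method countLines are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for this example only.
final class MultilineDemo {
    /** Counts occurrences of a whole-line word in the input under the given flags. */
    static int countLines(int flags, String input) {
        Matcher m = Pattern.compile("^\\w+$", flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "one\ntwo\nthree";
        System.out.println(countLines(0, text));                 // 0
        System.out.println(countLines(Pattern.MULTILINE, text)); // 3
    }
}
```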

public static final int UNICODE_CASE

This constant specifies that a Pattern is matched case-insensitively with regard to all Unicode characters. It is used in conjunction with the CASE_INSENSITIVE constant to extend its meaning to all Unicode characters.

Note: For Android, the CASE_INSENSITIVE constant (currently) always includes the meaning of the UNICODE_CASE constant. So if case insensitivity is enabled, this automatically extends to all Unicode characters. The UNICODE_CASE constant then has no special consequences.

Constant Value: 64 (0x00000040)

public static final int UNIX_LINES

This constant specifies that only the Unix line ending ('\n') is recognized as a line terminator by the '.', '^', and '$' meta characters.

Constant Value: 1 (0x00000001)
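
A small sketch of the effect on '.', assuming the usual default in which a carriage return ('\r') also counts as a line terminator. The class UnixLinesDemo and its helper are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for this example only.
final class UnixLinesDemo {
    /** Tests whether "a.b" matches the whole input under the given flags. */
    static boolean dotMatches(int flags, String input) {
        return Pattern.compile("a.b", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // By default '\r' is a line terminator, so '.' does not match it.
        System.out.println(dotMatches(0, "a\rb"));
        // With UNIX_LINES only '\n' terminates a line, so '.' matches '\r'.
        System.out.println(dotMatches(Pattern.UNIX_LINES, "a\rb"));
        // '\n' is still excluded in both modes.
        System.out.println(dotMatches(Pattern.UNIX_LINES, "a\nb"));
    }
}
```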

Public Methods

public static Pattern compile (String pattern, int flags)

Compiles a regular expression, creating a new Pattern instance in the process. The given flags modify the behavior of the Pattern.

Parameters
pattern the regular expression.
flags the flags to set. Any combination of the constants defined in this class is valid; combine several flags with the bitwise OR operator.

Note: Currently, the CASE_INSENSITIVE and UNICODE_CASE constants have slightly special behavior in Android, and the CANON_EQ constant is not supported at all.

Returns
  • the new Pattern instance.
Throws
PatternSyntaxException if the regular expression is syntactically incorrect.
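
One possible way to combine flags and to guard against a malformed expression is sketched below. The class CompileDemo and the method tryCompile are hypothetical; returning null on failure is a choice made for this example, not something the API prescribes.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class for this example only.
final class CompileDemo {
    /** Returns the compiled Pattern, or null if the regex is syntactically incorrect. */
    static Pattern tryCompile(String regex, int flags) {
        try {
            return Pattern.compile(regex, flags);
        } catch (PatternSyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Flags are bit masks, so several can be combined with '|'.
        Pattern p = tryCompile("^b.*$", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        System.out.println(p.matcher("Alpha\nBETA\n").find()); // true
        System.out.println(tryCompile("(unclosed", 0));        // null
    }
}
```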

public static Pattern compile (String pattern)

Compiles a regular expression, creating a new Pattern instance in the process. This is a convenience method that calls compile(String, int) with a flags value of zero.

Parameters
pattern the regular expression.
Returns
  • the new Pattern instance.
Throws
PatternSyntaxException if the regular expression is syntactically incorrect.

public int flags ()

Returns the flags that have been set for this Pattern.

Returns
  • the flags that have been set. A combination of the constants defined in this class.
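
Because the return value is a combination of bit masks, an individual flag can be tested with a bitwise AND. A minimal sketch, with the hypothetical class FlagsDemo and helper hasFlag:

```java
import java.util.regex.Pattern;

// Hypothetical demo class for this example only.
final class FlagsDemo {
    /** Returns true if the given flag constant is set on the Pattern. */
    static boolean hasFlag(Pattern p, int flag) {
        return (p.flags() & flag) != 0;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("abc", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        System.out.println(hasFlag(p, Pattern.DOTALL));    // true
        System.out.println(hasFlag(p, Pattern.MULTILINE)); // false
    }
}
```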

public Matcher matcher (CharSequence input)

Returns a Matcher for the Pattern and a given input. The Matcher can be used to match the Pattern against the whole input, find occurrences of the Pattern in the input, or replace parts of the input.

Parameters
input the input to process.
Returns
  • the resulting Matcher.
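
As an example of finding occurrences, the sketch below collects every match of a Pattern in an input. The class FindAllDemo and the method findAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for this example only.
final class FindAllDemo {
    /** Returns the text of every occurrence of the Pattern in the input, in order. */
    static List<String> findAll(Pattern p, CharSequence input) {
        List<String> found = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {          // advances to the next occurrence
            found.add(m.group());   // text of the current occurrence
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll(Pattern.compile("\\d+"), "a1b22c333")); // [1, 22, 333]
    }
}
```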

public static boolean matches (String regex, CharSequence input)

Tries to match a given regular expression against a given input. This is a convenience method that compiles the regular expression into a Pattern, builds a Matcher for it, and then performs the match. If the same regular expression is used for multiple operations, compile it into a Pattern explicitly and reuse a Matcher instead.

Parameters
regex the regular expression.
input the input to process.
Returns
  • true if and only if the Pattern matches the input.
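
The match is against the whole input, not a part of it. The sketch below contrasts that with searching for an occurrence through a Matcher; the class MatchesDemo and both helpers are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for this example only.
final class MatchesDemo {
    /** True only if the regex matches the entire input. */
    static boolean wholeMatch(String regex, CharSequence input) {
        return Pattern.matches(regex, input);
    }

    /** True if the regex occurs anywhere in the input. */
    static boolean occursIn(String regex, CharSequence input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("\\d+", "123"));    // true
        System.out.println(wholeMatch("\\d+", "abc123")); // false
        System.out.println(occursIn("\\d+", "abc123"));   // true
    }
}
```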

public String pattern ()

Returns the regular expression that was compiled into this Pattern.

Returns
  • the regular expression.

public static String quote (String s)

Quotes a given string using "\Q" and "\E", so that all other meta-characters lose their special meaning. If the string is used for a Pattern afterwards, it can only be matched literally.

Parameters
s the string to quote.
Returns
  • the quoted string.
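
A brief sketch of why quoting matters when the text to match contains meta characters such as '+'. The class QuoteDemo and its helper are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for this example only.
final class QuoteDemo {
    /** Tests whether the input is exactly the given text, treating the text literally. */
    static boolean matchesLiterally(String text, String input) {
        return Pattern.matches(Pattern.quote(text), input);
    }

    public static void main(String[] args) {
        // Unquoted, "1+" means "one or more 1s", so the literal text does not match.
        System.out.println(Pattern.matches("1+1=2", "1+1=2"));   // false
        System.out.println(Pattern.matches("1+1=2", "11=2"));    // true
        // Quoted, the '+' is an ordinary character.
        System.out.println(matchesLiterally("1+1=2", "1+1=2"));  // true
    }
}
```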

public String[] split (CharSequence inputSeq, int limit)

Splits the given input sequence around occurrences of the Pattern. The function first determines all occurrences of the Pattern inside the input sequence. It then builds an array of the "remaining" strings before, in-between, and after these occurrences. An additional parameter determines the maximal number of entries in the resulting array and the handling of trailing empty strings.

Parameters
inputSeq the input sequence.
limit Determines the maximal number of entries in the resulting array.
  • For limit > 0, the resulting array is guaranteed to contain at most limit entries.
  • For limit < 0, the length of the resulting array is exactly the number of occurrences of the Pattern plus one. All entries are included.
  • For limit == 0, the length of the resulting array is at most the number of occurrences of the Pattern plus one. Empty strings at the end of the array are not included.
Returns
  • the resulting array.
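
The three cases for the limit can be compared on a single input. A minimal sketch, with the hypothetical class SplitLimitDemo and helper splitOnComma:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class for this example only.
final class SplitLimitDemo {
    private static final Pattern COMMA = Pattern.compile(",");

    /** Splits the input around commas using the given limit. */
    static String[] splitOnComma(CharSequence input, int limit) {
        return COMMA.split(input, limit);
    }

    public static void main(String[] args) {
        String input = "a,b,c,,"; // four commas, two trailing empty strings
        System.out.println(Arrays.toString(splitOnComma(input, 2)));  // [a, b,c,,]
        System.out.println(Arrays.toString(splitOnComma(input, -1))); // [a, b, c, , ]
        System.out.println(Arrays.toString(splitOnComma(input, 0)));  // [a, b, c]
    }
}
```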

public String[] split (CharSequence input)

Splits a given input around occurrences of a regular expression. This is a convenience method that is equivalent to calling the method split(java.lang.CharSequence, int) with a limit of 0.

Parameters
input the input sequence.
Returns
  • the resulting array.
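
A short sketch of the single-argument form, splitting on runs of whitespace. The class SplitDemo and the method words are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class for this example only.
final class SplitDemo {
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    /** Splits the input around runs of whitespace; trailing empty strings are dropped. */
    static String[] words(CharSequence input) {
        return WHITESPACE.split(input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("one  two\tthree"))); // [one, two, three]
    }
}
```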

public String toString ()

Returns a string containing a concise, human-readable description of this object. Subclasses are encouraged to override this method and provide an implementation that takes into account the object's type and data. The default implementation simply concatenates the class name, the '@' sign and a hexadecimal representation of the object's hashCode(), that is, it is equivalent to the following expression:

 getClass().getName() + '@' + Integer.toHexString(hashCode())
 

Returns
  • a printable representation of this object.

Protected Methods

protected void finalize ()

Called before the object's memory is reclaimed by the VM. This can only happen once the VM has detected, during a run of the garbage collector, that the object is no longer reachable by any thread of the running application.

The method can be used to free system resources or perform other cleanup before the object is garbage collected. The default implementation of the method is empty, which is also expected by the VM, but subclasses can override finalize() as required. Uncaught exceptions which are thrown during the execution of this method cause it to terminate immediately but are otherwise ignored.

Note that the VM does guarantee that finalize() is called at most once for any object, but it doesn't guarantee when (if at all) finalize() will be called. For example, object B's finalize() can delay the execution of object A's finalize() method and therefore it can delay the reclamation of A's memory. To be safe, use a ReferenceQueue, because it provides more control over the way the VM deals with references during garbage collection.

Throws
Throwable