- SU.DBMS.SQL ------------------------------------------------------------------
 From : Andy Igoshin                         2:5020/400   17 Mar 2001  18:12:31
 To : All
 Subject : sql3: similar to ?
--------------------------------------------------------------------------------

Hi!

Does any database support this yet?

---------------------------------------------------------------------------
8.6 <similar predicate>

Function

Specify a character string similarity by means of a regular expression.

Format

<similar predicate> ::=
      <character match value> [ NOT ] SIMILAR TO <similar pattern>
        [ ESCAPE <escape character> ]

<similar pattern> ::= <character value expression>

<regular expression> ::=
      <regular term>
    | <regular expression> <vertical bar> <regular term>

<regular term> ::=
      <regular factor>
    | <regular term> <regular factor>

<regular factor> ::=
      <regular primary>
    | <regular primary> <asterisk>
    | <regular primary> <plus sign>

<regular primary> ::=
      <character specifier>
    | <percent>
    | <regular character set>
    | <left paren> <regular expression> <right paren>

<character specifier> ::=
      <non-escaped character>
    | <escaped character>

<non-escaped character> ::= !! See the Syntax Rules

<escaped character> ::= !! See the Syntax Rules

<regular character set> ::=
      <underscore>
    | <left bracket> <character enumeration>... <right bracket>
    | <left bracket> <circumflex> <character enumeration>... <right bracket>
    | <left bracket> <colon> <regular character set identifier> <colon>
        <right bracket>

<character enumeration> ::=
      <character specifier>
    | <character specifier> <minus sign> <character specifier>

Foundation (SQL/Foundation), ISO/IEC 9075-2:1999 (E)

<regular character set identifier> ::= <identifier>

Syntax Rules

 1) The declared types of <character match value>, <similar pattern>,
    and <escape character> shall be character string. <character match
    value>, <similar pattern>, and <escape character> shall be
    comparable.

 2) Let CM be the <character match value> and let SP be the
    <similar pattern>. If <escape character> EC is specified, then

      CM NOT SIMILAR TO SP ESCAPE EC

    is equivalent to

      NOT ( CM SIMILAR TO SP ESCAPE EC )

    If <escape character> EC is not specified, then

      CM NOT SIMILAR TO SP

    is equivalent to

      NOT ( CM SIMILAR TO SP )

 3) The value of the <identifier> that is a <regular character set
    identifier> shall be either ALPHA, UPPER, LOWER, DIGIT, or ALNUM.

 4) Case:

    a) If <escape character> is not specified, then the collating
       sequence used for the <similar predicate> is determined by
       Table 3, "Collating sequence usage for comparisons", taking
       <character match value> as comparand 1 (one) and <similar
       pattern> as comparand 2.

    b) Otherwise, let C1 be the coercibility characteristic and
       collating sequence of the <character match value>, and C2 be
       the coercibility characteristic and collating sequence of the
       <similar pattern>. Let C3 be the resulting coercibility
       characteristic and collating sequence as determined by Table 2,
       "Collating coercibility rules for dyadic operators", taking C1
       as the operand 1 (one) coercibility and C2 as the operand 2
       coercibility. The collating sequence used for the <similar
       predicate> is determined by Table 3, "Collating sequence usage
       for comparisons", taking C3 as the coercibility characteristic
       and collating sequence of comparand 1 (one) and <escape
       character> as comparand 2.

    It is implementation-defined, whether all, some, or no collating
    sequences other than the default collating sequence for the
    character set of the <character match value> can be used as the
    collating sequence of the <similar predicate>.

 5) A <non-escaped character> is any single character from the
    character set of the <similar pattern> that is not a <left
    bracket>, <right bracket>, <left paren>, <right paren>, <vertical
    bar>, <circumflex>, <minus sign>, <plus sign>, <asterisk>,
    <underscore>, <percent>, or the character specified by the result
    of the <character value expression> of <escape character>. A
    <character specifier> that is a <non-escaped character> represents
    itself.

 6) An <escaped character> is a sequence of two characters: the
    character specified by the result of the <character value
    expression> of <escape character>, followed by a second character
    that is a <left bracket>, <right bracket>, <left paren>, <right
    paren>, <vertical bar>, <circumflex>, <minus sign>, <plus sign>,
    <asterisk>, <underscore>, <percent>, or the character specified by
    the result of the <character value expression> of <escape
    character>. A <character specifier> that is an <escaped character>
    represents its second character.

 7) A <character enumeration> shall not be specified in a way that
    both its first and its last <character specifier>s are
    <non-escaped character>s that are <colon>s.

Access Rules

    None.

General Rules

 1) Let MCV be the result of the <character value expression> of the
    <character match value> and let PCV be the result of the
    <character value expression> of the <similar pattern>. If EC is
    specified, then let ECV be its value.

 2) If the result of the <character value expression> of the <similar
    pattern> is not a zero-length string and does not have the format
    of a <regular expression>, then an exception condition is raised:
    data exception - invalid regular expression.

 3) If an <escape character> is specified, then:

    If the length in characters of ECV is not equal to 1 (one), then
    an exception condition is raised: data exception - invalid escape
    character.

    a) If ECV is one of <left bracket>, <right bracket>, <left paren>,
       <right paren>, <vertical bar>, <circumflex>, <minus sign>,
       <plus sign>, <asterisk>, <underscore> or <percent> and ECV
       occurs in the <regular expression> except in an <escaped
       character>, then an exception condition is raised: data
       exception - invalid use of escape character.

    b) If ECV is a <colon> and the <regular expression> contains a
       <regular character set identifier>, then an exception condition
       is raised: data exception - escape character conflict.

 4) Case:

    a) If ESCAPE is not specified, then if either or both of MCV and
       PCV are the null value, then the result of

         CM SIMILAR TO SP

       is unknown.

    b) If ESCAPE is specified, then if one or more of MCV, PCV, and
       ECV are the null value, then the result of

         CM SIMILAR TO SP ESCAPE EC

       is unknown.

    NOTE 133 - If none of MCV, PCV, and ECV (if present) are the null
    value, then the result is either true or false.

 5) The set of characters in a <character enumeration> is defined as

    a) If the enumeration is specified in the form "<character
       specifier> <minus sign> <character specifier>", then the set of
       all characters that collate greater than or equal to the
       character represented by the left <character specifier> and
       less than or equal to the character represented by the right
       <character specifier>, according to the collating sequence of
       the pattern P.

    b) Otherwise, the character that the <character specifier> in the
       <character enumeration> represents.

 6) Let R be the result of the <character value expression> of the
    <similar pattern>. The regular language L(R) of the <similar
    pattern> is a (possibly infinite) set of strings.
    It is defined recursively for well-formed <regular expression>s
    Q, Q1, and Q2 by the following rules:

    a) L( Q1 <vertical bar> Q2 )
       is the union of L(Q1) and L(Q2)

    b) L( Q <asterisk> )
       is the set of all strings that can be constructed by
       concatenating zero or more strings from L(Q).

    c) L( Q <plus sign> )
       is the set of all strings that can be constructed by
       concatenating one or more strings from L(Q).

    d) L( <character specifier> )
       is a set that contains a single string of length 1 (one) with
       the character that the <character specifier> represents

    e) L( <percent> )
       is the set of all strings of any length (zero or more) from the
       character set of the pattern P.

    f) L( <left paren> Q <right paren> )
       is equal to L(Q)

    g) L( <underscore> )
       is the set of all strings of length 1 (one) from the character
       set of the pattern P.

    h) L( <left bracket> <character enumeration> <right bracket> )
       is the set of all strings of length 1 (one) from the set of
       characters in the <character enumeration>s.

    i) L( <left bracket> <circumflex> <character enumeration>
       <right bracket> )
       is the set of all strings of length 1 (one) with characters
       from the character set of the pattern P that are not contained
       in the set of characters in the <character enumeration>.

    j) L( <left bracket> <colon> ALPHA <colon> <right bracket> )
       is the set of all character strings of length 1 (one) that are
       <simple Latin letter>s.

    k) L( <left bracket> <colon> UPPER <colon> <right bracket> )
       is the set of all character strings of length 1 (one) that are
       <simple Latin upper case letter>s.

    l) L( <left bracket> <colon> LOWER <colon> <right bracket> )
       is the set of all character strings of length 1 (one) that are
       <simple Latin lower case letter>s.

    m) L( <left bracket> <colon> DIGIT <colon> <right bracket> )
       is the set of all character strings of length 1 (one) that are
       <digit>s.

    n) L( <left bracket> <colon> ALNUM <colon> <right bracket> )
       is the set of all character strings of length 1 (one) that are
       <simple Latin letter>s or <digit>s.

    o) L( Q1 || Q2 )
       is the set of all strings that can be constructed by
       concatenating one element of L(Q1) and one element of L(Q2).

    p) L( Q )
       is the set of the zero-length string, if Q is an empty regular
       expression.

 7) The <similar predicate>

      CM SIMILAR TO SP

    is true, if there exists at least one element X of L(R) that is
    equal to MCV according to the collating sequence of the <similar
    predicate>; otherwise, it is false.

    NOTE 134 - The <similar predicate> is defined differently from
    equivalent forms of the LIKE predicate. In particular, blanks at
    the end of a pattern and collating sequences are handled
    differently.

Conformance Rules

 1) Without Feature T141, "SIMILAR predicate", conforming SQL language
    shall contain no <similar predicate>.
---------------------------------------------------------------------------

--
Andy

--- ifmail v.2.15dev5
 * Origin: Network Operation Center of VSU (2:5020/400)
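
For anyone who wants to experiment with the rules quoted above without a database that implements Feature T141, here is a rough Python sketch. It translates a <similar pattern> into Python's `re` syntax and evaluates the predicate with the three-valued result from General Rule 4. The names similar_to_regex, similar_to and not_similar_to are hypothetical helpers, not part of any DBMS or library. The sketch assumes ASCII for the ALPHA/UPPER/LOWER/DIGIT/ALNUM sets, ignores collating sequences and the trailing-blank behaviour mentioned in NOTE 134, treats escape=None as "ESCAPE not specified", and does not implement the escape-character conflict checks of General Rule 3 beyond the length test.

```python
import re

# Character set identifiers allowed by Syntax Rule 3.
# Assumption: "simple Latin letter" is taken to mean ASCII A-Z / a-z.
_CLASSES = {
    "ALPHA": "A-Za-z",
    "UPPER": "A-Z",
    "LOWER": "a-z",
    "DIGIT": "0-9",
    "ALNUM": "A-Za-z0-9",
}
_CLASS_RE = re.compile(r"\[:([A-Za-z]+):\]")


def similar_to_regex(pattern, escape=None):
    """Translate a <similar pattern> into Python `re` syntax (hypothetical helper)."""
    if escape is not None and len(escape) != 1:
        raise ValueError("invalid escape character")
    out, i, in_set = [], 0, False
    while i < len(pattern):
        ch = pattern[i]
        if escape is not None and ch == escape:
            # <escaped character>: stands for its second character.
            if i + 1 >= len(pattern):
                raise ValueError("invalid use of escape character")
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if in_set:
            if ch == "]":
                in_set = False
                out.append("]")
            elif ch == "-":
                out.append("-")  # range inside a <character enumeration>
            else:
                out.append(re.escape(ch))
        elif ch == "[":
            m = _CLASS_RE.match(pattern, i)
            if m:
                # [:NAME:] form of <regular character set>
                name = m.group(1).upper()
                if name not in _CLASSES:
                    raise ValueError("invalid regular expression")
                out.append("[" + _CLASSES[name] + "]")
                i = m.end()
                continue
            in_set = True
            out.append("[")
            if pattern[i + 1 : i + 2] == "^":
                out.append("^")  # negated enumeration
                i += 1
        elif ch == "%":
            out.append(".*")  # any string, including the empty one
        elif ch == "_":
            out.append(".")  # exactly one character
        elif ch in "|*+()":
            out.append(ch)  # same meaning in both notations
        else:
            out.append(re.escape(ch))
        i += 1
    if in_set:
        raise ValueError("invalid regular expression")
    return "".join(out)


def similar_to(value, pattern, escape=None):
    """Return True/False, or None for SQL 'unknown' when an operand is NULL."""
    if value is None or pattern is None:
        return None
    regex = similar_to_regex(pattern, escape)
    # The whole value must belong to L(R), so match the full string.
    return re.fullmatch(regex, value, re.DOTALL) is not None


def not_similar_to(value, pattern, escape=None):
    """Syntax Rule 2: NOT SIMILAR TO is NOT ( ... SIMILAR TO ... )."""
    result = similar_to(value, pattern, escape)
    return None if result is None else not result
```

With these helpers, similar_to("abc", "(a|b)+c") gives True, similar_to("abc", "a_") gives False because the whole value has to match, and similar_to(None, "a%") gives None, standing in for unknown. Real implementations differ in details such as collation and which extensions they accept, so results from this sketch should only be read as an illustration of the quoted text.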