[php] Regular Expressions (Regular Expression Functions)
Author: rootman
Date: 2005-05-03 15:03

A regular expression is a combination of predefined special characters used to search for a particular pattern (that is, a string satisfying a particular condition) inside a file or a string. The special characters used in regular expressions are described below.
----------------------------------------------------------------------------------
1. Regular expressions
----------------------------------------------------------------------------------
(1) ^ (caret): marks the beginning of a line or of a string.
    Usage) Find the lines in /etc/services that begin with rsync (pattern: ^rsync).

(2) $ (dollar): marks the end of a line or of a string.
    Usage) Find the lines in /etc/mail/sendmail.cf that begin with #define and end with dnl (the trailing anchor is written dnl$).

(3) . (period): matches any single character.
    ex) ^a.c    (true for strings that begin with abc, adc, aZc, and so on; false for aa)
        a..b$   (true for strings that end with aaab, abbb, azzb, and so on)
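
The three items above can be tried out with grep -E (the POSIX spelling of egrep). This is only a sketch: the sample lines are hypothetical stand-ins for a file such as /etc/services, and the variable names are made up for illustration.

```shell
# Hypothetical sample lines (not the real contents of /etc/services).
sample='rsync 873/tcp
# rsync comment
abc
aZc
aa
xaaab'

# ^ anchors the match at the start of the line.
starts=$(printf '%s\n' "$sample" | grep -E '^rsync')
# . matches exactly one character, so ^a.c needs at least three characters.
dot=$(printf '%s\n' "$sample" | grep -E '^a.c')
# $ anchors the match at the end of the line.
ends=$(printf '%s\n' "$sample" | grep -E 'a..b$')

printf '%s\n' "$starts" "$dot" "$ends"
```

Only the first sample line is selected by ^rsync, because the comment line contains rsync but does not begin with it.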
(4) [] (bracket): denotes a set or a range of characters. A "-" between two characters denotes a range, and a leading "^" inside [] means "not". Inside brackets, a character class can also be written as [:class:], where class is one of alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper, xdigit.
    For details on these classes, see the character classification functions of the C library (ctype).
    For example, [[:digit:]] is equivalent to [0-9], and [[:alpha:]] is equivalent to [A-Za-z] in the C locale.
    In addition, [[:<:]] and [[:>:]] match the beginning and the end of a word (a run of digits, letters, and '_'). This is an extension that is not part of POSIX, so not every tool accepts it.
    ex) [abc]         (any one of a, b, c; same as [a-c])
        [Yy]          (Y or y)
        [A-Za-z0-9]   (any letter or digit)
        [-A-Z]        ("-" (hyphen) or any uppercase letter)
        [^a-z]        (any character other than a lowercase letter)
        [^0-9]        (any character other than a digit)
        [[:digit:]]   (same as [0-9])
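
A short sketch of bracket expressions with grep -E follows. The word list is hypothetical, and LC_ALL=C is set as an assumption about the environment, because the meaning of a range such as a-z depends on the locale.

```shell
# Ranges such as a-z are locale dependent; the C locale makes them plain ASCII ranges.
export LC_ALL=C

# Hypothetical word list.
words='Yes
yes
no
A-1
42
abc'

# [Yy] accepts either case for the first letter only.
yy=$(printf '%s\n' "$words" | grep -E '^[Yy]es$')
# [[:digit:]] is the character class form of [0-9].
digits=$(printf '%s\n' "$words" | grep -E '^[[:digit:]]+$')
# A leading ^ inside brackets negates the set: lines with no lowercase letter.
nolower=$(printf '%s\n' "$words" | grep -E '^[^a-z]+$')

printf '%s\n' "$yy" "$digits" "$nolower"
```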
(5) {} (brace): the numbers inside {} give the number of times, or the range of times, that the immediately preceding item is repeated. The descriptions below assume the pattern has to match the whole string; an unanchored pattern is also found inside longer strings.
    ex) a{3}          (only aaa, that is, 'a' repeated 3 times)
        a{3,}         ('a' repeated 3 or more times: aaa, aaaa, aaaaa, ...)
        a{3,5}        (only aaa, aaaa, aaaaa)
        ab{2,3}       (only abb and abbb)
        [0-9]{2}      (a two-digit number)
        doc[7-9]{2}   (doc77, doc87, doc97, and so on)
        [^Zz]{5}      (a string of 5 characters, none of which is Z or z: abcde, ttttt, and so on)
        .{3,4}er      (three or four characters followed by 'er': Peter, mother, and so on)
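
The interval forms can be checked with grep -E as sketched here. The sample names are hypothetical, and each pattern is anchored with ^ and $ so that the repeat count applies to the whole line.

```shell
export LC_ALL=C

# Hypothetical sample strings.
names='doc77
doc7
doc87x
aa
aaa
aaaaaa'

# Exactly two characters from 7-9 after "doc".
two=$(printf '%s\n' "$names" | grep -E '^doc[7-9]{2}$')
# Between three and five repetitions of a.
range=$(printf '%s\n' "$names" | grep -E '^a{3,5}$')
# Three or more repetitions of a.
atleast=$(printf '%s\n' "$names" | grep -E '^a{3,}$')

printf '%s\n' "$two" "$range" "$atleast"
```

Without the anchors, a{3,5} would also select aaaaaa, because a run of five a characters occurs inside it.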
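(6) * (asterisk): the item immediately before "*" occurs zero or more times.
    ex) ab*c        ('b' occurs zero or more times: ac, abc, abbc, abbbbbbbc; also found inside ackdddd)
        *           (there is no preceding item to repeat; POSIX leaves a leading * undefined in extended regular expressions and treats it as a literal * in basic ones, so it should not be relied on as a wildcard)
        .*          (the preceding item is ".", so this matches any string of zero or more characters, including the empty string)
        ab*         (a, ab, abb, abbbbbbb; also found at the start of accc)
        a*          ('a' occurs zero or more times, so an unanchored a* is found in every string: k, kdd, sdfrrt, a, aaaa, abb, the empty string, and so on)
        doc[7-9]*   (doc, doc7, doc777, doc778989, and so on)
        [A-Z].*     (an uppercase letter followed by any characters; a string made up only of uppercase letters is ^[A-Z]*$)
        like.*      (like followed by zero or more characters: like, likely, liker, likelihood, and so on)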
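
A small grep -E sketch of the zero-or-more behavior, using hypothetical sample lines. The last command counts matches on a single empty line to show that .* accepts the empty string.

```shell
export LC_ALL=C

# Hypothetical sample lines.
lines='ac
abc
abbbc
adc
like
likely'

# b may be absent or repeated any number of times.
star=$(printf '%s\n' "$lines" | grep -E '^ab*c$')
# .* accepts anything after "like", including nothing at all.
dotstar=$(printf '%s\n' "$lines" | grep -E '^like.*$')
# One empty input line: .* still matches it, so the count is 1.
empty_matches=$(printf '\n' | grep -E -c '^.*$')

printf '%s\n' "$star" "$dotstar" "$empty_matches"
```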
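(7) + (plus): the item immediately before "+" occurs one or more times.
    ex) ab+c     ('b' occurs one or more times: abc, abbc, abbbbbbbc; also found inside abckdddd; ac does not match)
        ab+      (ab, abb, abbbbbbb; also found at the start of abccc)
        like.+   (like followed by one or more characters: likely, liker, likelihood; like alone does not match)
        [A-Z]+   (a run of one or more uppercase letters; anchored as ^[A-Z]+$ it accepts only all-uppercase strings)
    ex) A pattern given for URLs such as http://www.rootman.co.kr: ^[[:alnum:]]+/*$
        (as written, this accepts only a run of alphanumeric characters followed by optional slashes, so the ':' and '.' of a full URL would have to be added to the pattern)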
(8) ? (question mark): the item immediately before "?" occurs zero times or once.
    ex) ab?c     ('b' occurs zero times or once: only ac and abc; unanchored, it is also found inside strings such as abcd; abbc does not match)
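
The difference between + (one or more) and ? (zero or one) can be sketched with grep -E on the same hypothetical input:

```shell
export LC_ALL=C

# Hypothetical sample lines.
lines='ac
abc
abbc
like
likely'

# + requires at least one b, so ac is rejected.
plus=$(printf '%s\n' "$lines" | grep -E '^ab+c$')
# ? allows at most one b, so abbc is rejected.
opt=$(printf '%s\n' "$lines" | grep -E '^ab?c$')
# .+ requires at least one character after "like".
likeplus=$(printf '%s\n' "$lines" | grep -E '^like.+$')

printf '%s\n' "$plus" "$opt" "$likeplus"
```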
(9) () (parenthesis): groups part of a pattern inside a regular expression.
(10) | (bar): means "or".
    ex) Search the file nationlist.txt for only the strings korea or china (pattern: korea|china).
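
A sketch of alternation, combined with the grouping from item (9), using grep -E. The list stands in for nationlist.txt; its contents are hypothetical.

```shell
# Hypothetical contents for a file such as nationlist.txt.
nations='korea
china
japan
south korea'

# Unanchored alternation: any line that contains korea or china.
either=$(printf '%s\n' "$nations" | grep -E 'korea|china')
# Parentheses group the alternatives so that ^ and $ apply to both of them.
exact=$(printf '%s\n' "$nations" | grep -E '^(korea|china)$')

printf '%s\n' "$either" "$exact"
```

Without the parentheses, ^korea|china$ would mean "begins with korea, or ends with china", which is a different condition.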

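(11) \ (backslash): to treat one of the special characters above as an ordinary character inside a regular expression, put '\' in front of it.
    ex) Search the file iplist.txt for the string 1.2.3, treating the special character "." as a literal (pattern: 1\.2\.3).
        [\?\[\\\]]    (one of '?', '[', '\', ']' in engines that honor backslash escapes inside brackets; POSIX bracket expressions treat '\' as a literal character)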
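
The effect of escaping the period can be sketched as follows. The address list is a hypothetical stand-in for iplist.txt, and grep -F is shown as an alternative that treats the whole pattern as a fixed string.

```shell
# Hypothetical contents for a file such as iplist.txt.
ips='1.2.3.4
1a2b3
10203'

# Unescaped, each . matches any character, so all three lines are selected.
loose=$(printf '%s\n' "$ips" | grep -E '1.2.3')
# Escaped, each \. matches only a literal period.
strict=$(printf '%s\n' "$ips" | grep -E '1\.2\.3')
# grep -F disables regular expression syntax altogether.
fixed=$(printf '%s\n' "$ips" | grep -F '1.2.3')

printf '%s\n' "$loose" "$strict" "$fixed"
```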
In a regular expression, every character other than the special characters listed above is treated as an ordinary character. Regular expressions can be used in the standard Unix utilities vi, emacs, ed, sed, awk, grep, egrep, and others. The following section shows examples of regular expressions used with grep.
----------------------------------------------------------------------------------
2. Examples using egrep regular expressions
----------------------------------------------------------------------------------
(1) Pick out only the directory entries in the /root directory.

(2) Pick out only the entries in the /root directory that are not directories.

(3) Extract only the site URLs from the file sitelist.txt.

(4) From the site URLs in sitelist.txt, find only those that end with index.html.
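
One possible way to carry out the four tasks above is sketched below with grep -E. The commands are an assumption, not the original author's: the listing imitates `ls -l /root` output, the site list imitates sitelist.txt, and both are hypothetical sample data.

```shell
export LC_ALL=C

# Hypothetical listing in the style of `ls -l /root`.
listing='drwxr-xr-x 2 root root 4096 Jan 10 00:00 bin
-rw-r--r-- 1 root root 120 Jan 10 00:00 notes.txt
drwx------ 3 root root 4096 Jan 10 00:00 mail'

# (1) Directory entries begin with d in a long listing.
dirs=$(printf '%s\n' "$listing" | grep -E '^d')
# (2) Everything else: -v inverts the selection.
files=$(printf '%s\n' "$listing" | grep -E -v '^d')

# Hypothetical contents for a file such as sitelist.txt.
sites='http://www.rootman.co.kr/
ftp://ftp.example.org/pub
http://www.example.com/index.html
plain text line'

# (3) Lines that begin with an http URL scheme.
urls=$(printf '%s\n' "$sites" | grep -E '^http://')
# (4) URLs that end with index.html; the period is escaped to keep it literal.
index=$(printf '%s\n' "$sites" | grep -E '^http://.*index\.html$')

printf '%s\n' "$dirs" "$files" "$urls" "$index"
```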

----------------------------------------------------------------------------------
3. PHP provides the following four functions for regular expressions
----------------------------------------------------------------------------------
These POSIX regex functions were deprecated in PHP 5.3.0 and removed in PHP 7.0.0; the preg_* functions (preg_match, preg_replace) are their replacements.

(1) int ereg(string givenPattern, string givenString, array matched);
- Suppose givenString has the form "string1stringAstring2stringBstring3 ... string9stringI". The separators stringA, stringB, ..., stringI may be empty (in which case givenString is simply "string1string2string3 ... string9").
- For such a givenString, givenPattern is written as "(pattern1)stringA(pattern2)stringB(pattern3) ... (pattern9)stringI". That is, pattern1, pattern2, ..., pattern9 are the regular expressions to be matched against string1, string2, ..., string9 respectively.
- The text that pattern1 matches in string1 is stored in $matched[1], the text that pattern2 matches in string2 is stored in $matched[2], ..., and the text that pattern9 matches in string9 is stored in $matched[9]. Note that in PHP3, ereg can capture at most 9 patterns.
- $matched[0] holds the whole match, that is, $matched[1]stringA$matched[2]stringB ... $matched[9]stringI.
- ereg returns the length (number of characters) of the string stored in $matched[0], or false if nothing matches.
- ereg is case sensitive.
- eregi is case insensitive.
1) Example 1
    Code =>
        print(ereg("(.*)ef([abc].*)", "abcdefabc", $matched));
        print(" ");
        // Print every non-empty entry of $matched as "index, text".
        while (list($a, $b) = each($matched))
            if ($b) print("$a, $b ");
    Result => 9 0, abcdefabc 1, abcd 2, abc
2) Example 2
    Code =>
        print(ereg("(.*)d(.*)e(.*)qrs(.*)", "abcdefghijklmnopqrstuvwxyz", $matched));
        print(" ");
        // $matched[2] is empty (nothing lies between d and e), so "if ($b)" skips it.
        while (list($a, $b) = each($matched))
            if ($b) print("$a, $b ");
    Result => 26 0, abcdefghijklmnopqrstuvwxyz 1, abc 3, fghijklmnop 4, tuvwxyz
3) Example 3
    Code =>
        $date = "1999-11-17";
        // Capture year, month, and day, then print them in day.month.year order.
        if (ereg("([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})", $date, $regs))
            print("$regs[3].$regs[2].$regs[1]");
        else
            print("Invalid date format: $date");
    Result => 17.11.1999
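4) Example 4 (checking the format of a resident registration number)
    Code =>
        $joomin = "711011-1234567";
        // The subject must be $joomin (not $date), and the second month digit is [0-9] (not [09]).
        if (ereg("([0-9]{2})([01]{1}[0-9]{1}[0-3]{1}[0-9]{1})-([12]{1}[0-9]{6})", $joomin, $regs))
            print("Valid");
        else
            print("Invalid format: $joomin");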
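
Numbered capture groups are not specific to PHP. As a rough parallel only (this is sed and grep, not ereg), the date and resident-number examples can be sketched with sed -E, which GNU and BSD sed support, and grep -E. The variable names are made up for illustration.

```shell
export LC_ALL=C

date_in='1999-11-17'
# \1, \2, \3 refer to the first, second, and third parenthesized groups.
swapped=$(printf '%s\n' "$date_in" |
    sed -E 's/^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})$/\3.\2.\1/')

joomin='711011-1234567'
# grep -q prints nothing; only its exit status says whether the format matched.
if printf '%s\n' "$joomin" |
    grep -E -q '^[0-9]{2}[01][0-9][0-3][0-9]-[12][0-9]{6}$'; then
    valid=yes
else
    valid=no
fi

printf '%s\n' "$swapped" "$valid"
```

The check only validates the shape of the number (digits, month and day ranges by first digit, hyphen); it does not verify a checksum or a real calendar date.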
(2) int eregi(string givenPattern, string givenString, array matched);
- The case-insensitive version of ereg.
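1) Code =>
        $email = "xs9_tx-abc.yyy_c@cne.kyungsung.ac.kr";
        // Group 1: local part, group 2: domain labels, group 3: last label matched, group 4: top-level domain.
        eregi("(^[_\.0-9a-z-]+)@(([0-9a-z][0-9a-z-]+\.)+)([a-z]{2,3}$)", $email, $matched);
        while (list($a, $b) = each($matched))
            if ($b) print("$a, $b ");
    Result => 0, xs9_tx-abc.yyy_c@cne.kyungsung.ac.kr 1, xs9_tx-abc.yyy_c 2, cne.kyungsung.ac. 3, ac. 4, kr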
2) Code =>
        // Same check with a single group; a repeated group keeps only its last match.
        eregi("^[_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3}$", $email, $matched);
        while (list($a, $b) = each($matched))
            if ($b) print("$a, $b ");
    Result => 0, xs9_tx-abc.yyy_c@cne.kyungsung.ac.kr 1, ac.
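
The difference between case-sensitive and case-insensitive matching can be sketched with grep -E and its -i option. This is a parallel in grep, not the eregi function itself; the address is hypothetical and the pattern is a simplified form of the one above.

```shell
export LC_ALL=C

# Hypothetical address containing uppercase letters.
email='User@Example.COM'
pattern='^[_.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3}$'

# grep -c prints the number of matching lines; "|| true" hides the
# non-zero exit status that grep returns when nothing matches.
sensitive=$(printf '%s\n' "$email" | grep -E -c "$pattern" || true)
insensitive=$(printf '%s\n' "$email" | grep -E -i -c "$pattern" || true)

printf '%s\n' "$sensitive" "$insensitive"
```

The pattern lists only lowercase ranges, so the match succeeds only when case is ignored.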
(3) string ereg_replace(string givenPattern, string replacementPattern, string givenString);
- Finds the text in givenString that matches givenPattern (the matched text) and replaces it with replacementPattern.
- If givenPattern contains parenthesized subpatterns "(pattern)", replacementPattern may contain references of the form "\\digit" (digit is one of 0, 1, ..., 9). Each "\\digit" is replaced by the text that the digit-th parenthesized subpattern matched, counting subpatterns by their opening parenthesis. "\\0" stands for the entire matched text.
- Returns the modified string.
- Case sensitive.
1) Code =>
        $string = "This is a test";
        print(ereg_replace(" is", " was", $string));
        print(" ");
        print(ereg_replace("( )is", "\\1was", $string));
        print(" ");
        print(ereg_replace("(( )is)", "\\2was", $string));
        print(" ");
        // Groups are numbered by opening parenthesis, so the single-space groups are 2, 4, and 6.
        print(ereg_replace("(( )is)(( )a)(( )test)", "\\2was\\4an\\6exam", $string));
    Result => "This was a test"; "This was a test"; "This was a test"; "This was an exam";
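
Group numbering in replacements can be sketched with sed -E, whose \1 to \9 references are counted the same way, by opening parenthesis. This is a parallel in sed, not ereg_replace itself, and the variable names are made up for illustration.

```shell
string='This is a test'

# Plain replacement, no groups.
plain=$(printf '%s\n' "$string" | sed -E 's/ is/ was/')
# Group 1 is " is" and group 2 is the single space nested inside it.
second=$(printf '%s\n' "$string" | sed -E 's/(( )is)/\2was/')
# With three nested pairs, the space groups are numbers 2, 4, and 6.
even=$(printf '%s\n' "$string" |
    sed -E 's/(( )is)(( )a)(( )test)/\2was\4an\6exam/')
# Using 1, 2, 3 instead re-inserts the outer groups " is" and " a".
odd=$(printf '%s\n' "$string" |
    sed -E 's/(( )is)(( )a)(( )test)/\1was\2an\3exam/')

printf '%s\n' "$plain" "$second" "$even" "$odd"
```

The last command yields "This iswas an aexam", which shows why the group numbers have to be counted carefully when groups are nested.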
2) Example 2: removing redundant whitespace
    Code =>
        $str = "~    s/\s+/    /g";
        // Collapse every run of whitespace into a single space.
        $str = eregi_replace("[[:space:]]+", " ", $str);
        print("$str ");
    Result => ~ s/\s+/ /g
(4) string eregi_replace(string givenPattern, string replacementPattern, string givenString);
- The case-insensitive version of ereg_replace.
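
Collapsing redundant whitespace, as in the [[:space:]]+ example, can also be sketched with sed -E. This is a parallel in sed, not the PHP function, and the input text is hypothetical.

```shell
export LC_ALL=C

# printf expands \t into a tab, so the input mixes tabs and runs of spaces.
squeezed=$(printf 'redundant \t  white   space\n' |
    sed -E 's/[[:space:]]+/ /g')

printf '%s\n' "$squeezed"
```

The g flag applies the substitution to every run of whitespace on the line rather than only the first.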