regex - Extracting a subsection from a String in Java


I have one huge string of the form:

markerbeg 1

... ...

markerend 1

markerbeg 2

...

markerend 2

I have information in this string and want to extract the text between each pair of markers (the ... parts). Is there a way to do this using regex, or with simple String methods that look for each marker?

Regards,

[Edited because the question became clearer]

Here's the problem as I understand it: you have a long string with blocks of text in it, delimited by "markerbeg [identifier]" and "markerend [identifier]". Not all of the text in the string is inside one of these blocks, and blocks cannot be nested. The identifiers can be any arbitrary string (here I'm assuming they contain only characters in the \w class: letters, numbers, and underscores). You need to extract both the identifiers and the strings inside the blocks.

Here's code that does what you want:

    import java.util.regex.*;

    public class Hello {
        public static void main(String[] args) {
            String s = "markerbeg 1\n text\nmarkerend 1\nxxx\nmarkerbeg 2\nhi there :) \nmarkerend 2\nxyz\nmarkerbeg hello\nzgfds\nmarkerend hello";
            System.out.println("source string:\n" + s);

            // Group 1 is the identifier, group 2 the block contents;
            // \1 requires the end marker to repeat the same identifier.
            Pattern p = Pattern.compile("markerbeg\\s+(\\w+)\\s+(.*)\\s+markerend\\s+\\1");
            Matcher m = p.matcher(s);

            System.out.println("\nextracted:");
            while (m.find()) {
                String ident = m.group(1);
                String string = m.group(2);
                System.out.println(ident + ": " + string);
            }
        }
    }

This prints out:

    source string:
    markerbeg 1
     text
    markerend 1
    xxx
    markerbeg 2
    hi there :) 
    markerend 2
    xyz
    markerbeg hello
    zgfds
    markerend hello

    extracted:
    1: text
    2: hi there :) 
    hello: zgfds

The regex works as follows:
regex: "markerbeg\s+(\w+)\s+(.*)\s+markerend\s+\1" (the backslashes are escaped in the original code)
markerbeg: taken literally
\s+: 1 or more whitespace characters
(\w+): 1 or more letter, number, or underscore characters, placed in a capturing group to extract later (your identifier)
\s+: as above
(.*): 0 or more characters other than line terminators (no DOTALL flag is set), placed in a capturing group (the contents of the text block)
\s+: as above
markerend: taken literally
\s+: as above
\1: the contents of the first capturing group (i.e. the identifier)
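One caveat about (.*): because . does not match line terminators by default, the pattern above only finds blocks whose contents fit on a single line, as in the sample string. If your blocks span several lines, something along these lines may work. This is a sketch, not a drop-in answer: the class name MultiLineBlocks and the extract method are made up for illustration, and the trailing \b (so that "markerend 1" does not match the start of "markerend 10") is my own addition.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultiLineBlocks {
    // DOTALL lets '.' match line terminators; the reluctant (.*?) stops at the
    // first "markerend <same identifier>" rather than the last one.
    private static final Pattern BLOCK = Pattern.compile(
            "markerbeg\\s+(\\w+)\\s+(.*?)\\s+markerend\\s+\\1\\b", Pattern.DOTALL);

    // Returns identifier -> block contents, in order of appearance.
    // A repeated identifier overwrites the earlier entry.
    static Map<String, String> extract(String s) {
        Map<String, String> blocks = new LinkedHashMap<>();
        Matcher m = BLOCK.matcher(s);
        while (m.find()) {
            blocks.put(m.group(1), m.group(2));
        }
        return blocks;
    }

    public static void main(String[] args) {
        String s = "markerbeg 1\nline one\nline two\nmarkerend 1\nxxx\n"
                 + "markerbeg 2\nhi there\nmarkerend 2";
        extract(s).forEach((ident, text) -> System.out.println(ident + ": " + text));
    }
}
```

The reluctant quantifier matters once DOTALL is on: a greedy (.*) could run past the first matching end marker if the same identifier shows up again later in the string.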

Then, in the while (m.find()) loop, use m.group(1) and m.group(2) to get the contents of the first and second capturing groups.
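Since the question also asks about plain String methods, here is a rough sketch of the same extraction with indexOf and substring. It assumes each marker is followed by exactly one space and the identifier, and that the begin marker's line ends with \n; the class name MarkerScan and the extract method are hypothetical. It is also looser than the regex: it does not check that the identifier consists of \w characters, and "markerend 1" would match the start of "markerend 10".

```java
import java.util.ArrayList;
import java.util.List;

class MarkerScan {
    // Returns {identifier, contents} pairs in order of appearance.
    static List<String[]> extract(String s) {
        final String beg = "markerbeg ";
        List<String[]> blocks = new ArrayList<>();
        int pos = 0;
        while (true) {
            int start = s.indexOf(beg, pos);
            if (start < 0) break;                       // no more begin markers
            int identEnd = s.indexOf('\n', start);      // identifier runs to end of line
            if (identEnd < 0) break;
            String ident = s.substring(start + beg.length(), identEnd).trim();
            String endMarker = "markerend " + ident;
            int end = s.indexOf(endMarker, identEnd);
            if (end < 0) break;                         // unterminated block
            // trim() drops the whitespace around the contents
            blocks.add(new String[] { ident, s.substring(identEnd, end).trim() });
            pos = end + endMarker.length();
        }
        return blocks;
    }

    public static void main(String[] args) {
        String s = "markerbeg 1\n text\nmarkerend 1\nxxx\nmarkerbeg 2\nhi there :) \n"
                 + "markerend 2\nxyz\nmarkerbeg hello\nzgfds\nmarkerend hello";
        for (String[] block : extract(s)) {
            System.out.println(block[0] + ": " + block[1]);
        }
    }
}
```

Unlike the regex version, this one handles multi-line blocks without any flags, at the cost of the looser matching described above.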

