Regular Expressions newbie question
12-23-2006, 05:37 PM
ADAM Web Design
Canadastaninianite

Posts: 5,945
Name: Adam for web page design, not program
Location: Toronto, Ontario, Canada
I don't know if anyone on here has touched regular expressions yet, but I'm trying to learn them, so here goes.

I have some text that I'm trying to remove certain HTML elements from.
Code:
<b>Bold</b>
<a href="anchor text">Anchor</a>
<i>Italic</i>
<strong>Strong</strong>
<em>em</em>
<blockquote>Blockquote</blockquote>
<ol>
<li>Ordered List</li>
</ol>
<ul>
<li>Ordered List</li>
</ul>
<p>Paragraph</p>
<sup>Superscript</sup>
<br />
<sub>Subscript</sub>
<div>Div</div>
<img src="something-cool.jpg" />
<h1>Heading 1</h1>
Now, when I apply this regular expression:

<([^bai](.*))>

I get the following, which is what I expected:
Code:
</b>
</a>
</i>
<strong>Strong</strong>
<em>em</em>
</blockquote>
<ol>
<li>Ordered List</li>
</ol>
<ul>
<li>Ordered List</li>
</ul>
<p>Paragraph</p>
<sup>Superscript</sup>
<sub>Subscript</sub>
<div>Div</div>
<h1>Heading 1</h1>
That makes sense, since the pattern doesn't check for the closing tags </b>, </a>, </i>, and </blockquote>.

However, what's weird is that I get the exact same thing when I apply any of the following expressions:

</?([^bai](.*))>
</*([^bai](.*))>
<\047*([^bai](.*))>
<\047?([^bai](.*))>
</{0,1}([^bai](.*))>
<\047{0,1}([^bai](.*))>

I don't understand why that would be, since these patterns should match both </ (which they don't) and < (which they do).

Any help? Thanks.
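
A quick way to see what is going on is to run the patterns side by side. The sketch below uses Python's re module, which is an assumption: the post doesn't say which tool produced the output above, but the constructs involved (negated character class, greedy .*, optional slash) behave the same way in most engines. It suggests that [^bai] is happy to match the / itself, so making the slash optional changes nothing: the engine drops the optional slash and lets the negated class consume it instead. Separately, \047 is the octal code for an apostrophe, not a slash (the slash is \057), which is another reason those variants behave like the original.

```python
import re

# Sample lines taken from the post above.
lines = [
    "<b>Bold</b>",
    '<a href="anchor text">Anchor</a>',
    "<strong>Strong</strong>",
    "<blockquote>Blockquote</blockquote>",
    "<br />",
]

# The original pattern and two of the "optional slash" variants from the post.
patterns = [
    r"<([^bai](.*))>",
    r"</?([^bai](.*))>",
    r"</{0,1}([^bai](.*))>",
]

def first_matches(pattern, lines):
    """Return the first match on each line, or None where nothing matches."""
    results = []
    for line in lines:
        m = re.search(pattern, line)
        results.append(m.group(0) if m else None)
    return results

results = [first_matches(p, lines) for p in patterns]
for p, r in zip(patterns, results):
    print(p, "->", r)

# Octal 047 is the apostrophe; the forward slash is octal 057.
print(repr(chr(0o47)), repr(chr(0o57)))
```

All three patterns give the same list. On the line with <b>Bold</b>, the opening tag is rejected because b is excluded, and the closing tag is accepted because the character right after < is /, which is not b, a, or i.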

Last edited by ADAM Web Design : 12-23-2006 at 05:38 PM.
12-23-2006, 06:49 PM Re: Regular Expressions newbie question
chrishirst
Super Moderator

Posts: 13,658
Location: Blackpool. UK
You need to combine two patterns, one for the start tags:

<[bai][^>]*>

and one for the end tags:

<\/[bai][^>]*>

so the whole regex pattern will be:

<[bai][^>]*>|<\/[bai][^>]*>

The | (pipe) symbol is the "alternation" operator: it tells the regex to match either everything to the left of it or everything to the right of it.
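
As a rough illustration of the alternation idea, here is a small Python sketch (Python and the sample string are assumptions, not something from this thread). Note that a class like [bai] only tests the first letter of the tag name, so the start-tag pattern also picks up <blockquote> and <br />. The TAG pattern below is a hypothetical tightened variant, not the pattern given above: it adds a word boundary (\b) so that only the b, a, and i tags themselves match.

```python
import re

sample = ('<b>Bold</b> <a href="x">Anchor</a> <i>Italic</i> '
          '<blockquote>Quote</blockquote> <br />')

# The start-tag pattern from the reply: [bai] checks one letter only,
# so tags that merely begin with b, a, or i are matched as well.
loose = re.findall(r"<[bai][^>]*>", sample)
print(loose)

# Hypothetical tightened variant: alternation between an opening-tag
# branch and a closing-tag branch, with \b ending the tag name.
TAG = re.compile(r"<(?:b|a|i)\b[^>]*>|</(?:b|a|i)\s*>")

stripped = TAG.sub("", sample)
print(stripped)
```

With the tightened pattern, the <b>, <a>, and <i> tags are removed while <blockquote> and <br /> are left alone.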
12-24-2006, 10:31 PM Re: Regular Expressions newbie question
ADAM Web Design
Thanks, dude.

It wasn't quite what I wanted to accomplish, but it was enough to get me there (which means it's my fault for not explaining clearly.) Got what I wanted.