How To Remove Header Tags and Their Content in WordPress Excerpt

A WordPress excerpt is a summary of your content. In this tutorial, I will show you how to remove header tags (h1 through h6) and their content from the automatically generated excerpt. I will also show you how to remove header tags and their content while preserving the formatting of other HTML tags. For a manually typed excerpt, WordPress keeps all the HTML formatting you included in the excerpt.

Excerpts are generated from the post content by a WordPress filter using the function wp_trim_excerpt() located in wp-includes/formatting.php of your core WordPress code.

This tutorial is a continuation of my two previous tutorials: How To preserve HTML Tags in WordPress Excerpt and How To Improve WordPress Excerpt.

The relationship between the Manual Excerpt and Automatic Excerpt is this: When a post has no manually typed excerpt and the post template uses the the_excerpt() tag, WordPress will automatically generate an excerpt by selecting the first 55 words of the post followed by the unlinked ellipsis “[...]”.

Manually Typed Post Excerpt

For a manually typed post excerpt, WordPress will keep all HTML tags you included in your excerpt. If you don't know where and how to add a manual excerpt, read the section “How To: Manually Add a Post Excerpt” in my first related tutorial.

Automatically Generated Post Excerpt

In WordPress, if you do not provide a manually typed excerpt for a post, WordPress will display an automatically generated excerpt. By default, WordPress displays the first 55 words of the post's content, with an unlinked '[...]' string at the end and all HTML tags stripped. This makes the excerpt one text paragraph without any line breaks.
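To make the default behavior concrete, here is a minimal sketch in plain PHP that approximates it outside WordPress. The function name demo_auto_excerpt() and its parameters are hypothetical, and this is not the actual wp_trim_excerpt() code; it only mimics the three steps described above: strip the tags, keep the first words up to the limit, and append the ending.

```php
<?php
// Hypothetical helper, for illustration only. It approximates the default
// auto-generated excerpt; it is NOT WordPress's own implementation.
function demo_auto_excerpt( $content, $length = 55, $more = ' [...]' ) {
	// Remove every HTML tag. Note that the text inside the tags stays.
	$text = strip_tags( $content );

	// Split on any run of whitespace and drop empty pieces.
	$words = preg_split( '/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY );

	// Short content is returned whole, without the ending string.
	if ( count( $words ) <= $length ) {
		return implode( ' ', $words );
	}

	// Otherwise keep the first $length words and append the ending.
	return implode( ' ', array_slice( $words, 0, $length ) ) . $more;
}

// A 3-word limit keeps the example short.
$sample_excerpt = demo_auto_excerpt( '<h2>Intro</h2> <p>One two three four</p>', 3 );
```

Here $sample_excerpt holds "Intro One two [...]". The header text ("Intro") survives as plain words, which is the behavior the rest of this tutorial changes.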

In this tutorial, I will show you how to remove header tags and their content only, and how to remove header tags and their content while preserving other HTML tags in the automatically generated excerpt, along with other settings that you can modify. All these changes are made in your theme's functions.php file.

Strip Header Tags and Their Content in the Auto-Generated Excerpt

Use CODE-1 below if you want to remove header tags (h1 to h6) along with their content. CODE-1 also lets you change the default excerpt length and how the excerpt ends. However, CODE-1 does NOT preserve any HTML formatting in the auto-generated excerpt.

Open the functions.php file located in your current theme folder and add (copy and paste) the following code. Save the file and upload it to your server. You can edit lines 28 and 32 (the $excerpt_word_count and $excerpt_end assignments) to suit your needs.

CODE-1:

<?php
/***********************CODE-1************************************************
* @Author: Boutros AbiChedid 
* @Date:   April 18, 2012
* @Websites: bacsoftwareconsulting.com/ ; blueoliveonline.com/
* @Description: Remove header tags and their content From the automatically 
* generated Excerpt.
* Code modifies default excerpt_length and excerpt_more filters.
* Code Does NOT preserve any other HTML formatting in the excerpt.
* @Tested on: WordPress version 3.3.1 
****************************************************************************/ 

function bac_wp_strip_header_tags( $text ) {
	$raw_excerpt = $text;
	if ( '' == $text ) {
		//Retrieve the post content.
		$text = get_the_content(''); 
		//remove shortcode tags from the given content.
		$text = strip_shortcodes( $text );
		$text = apply_filters('the_content', $text);
		$text = str_replace(']]>', ']]&gt;', $text);
	
		//Regular expression that strips the header tags and their content.
        $regex = '#(<h([1-6])[^>]*>)\s?(.*)?\s?(<\/h\2>)#';
		$text = preg_replace($regex,'', $text);
	
		/***Change the excerpt word count.***/
		$excerpt_word_count = 55; //This is WP default.
		$excerpt_length = apply_filters('excerpt_length', $excerpt_word_count);
		
		/*** Change the excerpt ending.***/
		$excerpt_end = '[...]'; //This is the WP default.
		$excerpt_more = apply_filters('excerpt_more', ' ' . $excerpt_end);
		
		//Store the result in $text so a manually typed excerpt is returned unchanged.
		$text = wp_trim_words( $text, $excerpt_length, $excerpt_more );
	}
	return apply_filters('wp_trim_excerpt', $text, $raw_excerpt);
}
add_filter( 'get_the_excerpt', 'bac_wp_strip_header_tags', 5);
?>

Watch out for the <!--more--> tag. If you use it in your content, make sure that it sits beyond your defined excerpt length; otherwise your excerpt will end without an ending link. If there is a More tag in your post, the excerpt stops either where the More tag is placed (without the excerpt's trailing characters) or at your defined excerpt length, whichever comes first.
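A quick way to check a post for this situation is to count the words that come before the More tag. The sketch below is plain PHP and only an illustration: the function name demo_more_tag_cuts_excerpt() is hypothetical, it works on raw post content passed in as a string, and it does not account for the header text that CODE-1 removes before counting.

```php
<?php
// Hypothetical helper, for illustration only: reports whether a More tag
// appears before the excerpt word limit is reached in the raw post content.
function demo_more_tag_cuts_excerpt( $raw_content, $excerpt_length = 55 ) {
	// Split once at the first More tag.
	$parts = preg_split( '/<!--more.*?-->/', $raw_content, 2 );

	// No More tag in the content, so nothing can cut the excerpt short.
	if ( count( $parts ) < 2 ) {
		return false;
	}

	// Count the words that come before the More tag, ignoring HTML tags.
	$before = preg_split( '/\s+/', strip_tags( $parts[0] ), -1, PREG_SPLIT_NO_EMPTY );

	return count( $before ) < $excerpt_length;
}
```

A return value of true means the More tag sits inside the excerpt length, so the excerpt would end at the More tag without its ending string.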

CODE-1 works on WordPress 3.3.0 and higher. But I hope that you will always upgrade to the latest version.

The regular expression defined on line 24 has a few shortcomings. It is not perfect, but it works in all normal cases. For instance, the regular expression fails when:

  1. You comment out the headers. In this case the excerpt will be displayed blank.
  2. Header tags are not properly closed (e.g. <h1>header1</h2>). In this case the header content will show unformatted.
  3. Header tags are written in all capital letters (e.g. <H1></H1>) or in a mix of upper and lower case (e.g. <h1></H1>). The good news is that even though the RegEx fails on such tags, WordPress converts all upper-case tags to lower case, so you will not notice the failure unless you test it in the Regular Expression Test Tool.

To test the Regular expression on line 24, you could use this Regular Expression Test Tool.
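The regular expression can also be tested locally with a few lines of plain PHP. The sketch below is not the pattern used in CODE-1; it is a hypothetical variant (the function name demo_strip_headings() is made up) that adds the i and s modifiers and a lazy quantifier, so it tolerates upper-case tags, headers that span several lines, and several headers on one line. It still does not handle commented-out headers or mismatched closing tags.

```php
<?php
// Hypothetical helper, for illustration only; not part of CODE-1.
function demo_strip_headings( $html ) {
	// \b      : "h1" to "h6" must be the whole tag name
	// .*?     : lazy, stops at the first matching closing tag
	// </h\1>  : the closing tag must have the same level as the opening tag
	// i flag  : accept upper-case and mixed-case tags
	// s flag  : let the dot match line breaks inside a header
	$regex = '#<h([1-6])\b[^>]*>.*?</h\1>#is';

	return preg_replace( $regex, '', $html );
}

// Two headers on one line: only the paragraph between them should remain.
$cleaned = demo_strip_headings( '<h2>A</h2><p>keep</p><h2>B</h2>' );
```

With the original greedy pattern, the same one-line input would lose the paragraph as well, because the match runs from the first opening tag to the last closing tag on the line.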

CODE-1 works only for the automatically generated excerpt but NOT the manually typed excerpt. If you only use manually typed excerpts for your posts, then you should use CODE-3 below to strip header tags and their content from the excerpt.

Result of CODE-1:

The image below shows an example of an edited post in the dashboard, which has header tags (h1 – h6) inserted with their content.

A sample WordPress Post as shown in the dashboard with header tags inserted.

The image below shows the result after using CODE-1.

Result of CODE1: The post showing in the frontend (browser) with header tags and their content removed.

Strip Header Tags and Their Content While Preserving Other HTML Tags in the Excerpt

Use CODE-2 below if you want to remove header tags (h1 to h6) with their content AND you also want to preserve other chosen HTML formatting in the auto-generated excerpt. Like CODE-1, CODE-2 also lets you change the default excerpt length and the excerpt's ending.

Open the functions.php file located in your current theme folder and add (copy and paste) the following code. You can edit lines 29, 33, and 37 (the $allowed_tags, $excerpt_word_count, and $excerpt_end assignments) to suit your needs.

CODE-2:

<?php
/*************************CODE-2**********************************************
* @Author: Boutros AbiChedid 
* @Date:   April 18, 2012
* @Websites: bacsoftwareconsulting.com/ ; blueoliveonline.com/
* @Description: Remove header tags and their content from the automatically 
* generated Excerpt WHILE AT THE SAME TIME preserving other chosen HTML tags.
* Also code modifies default excerpt_length and excerpt_more filters.
* @Tested on: WordPress version 3.3.1
*****************************************************************************/ 

function bac_wp_strip_header_tags_keep_other_formatting( $text ) {

	$raw_excerpt = $text;
	if ( '' == $text ) {
		//Retrieve the post content.
		$text = get_the_content('');
		//Remove shortcode tags from the given content.
		$text = strip_shortcodes( $text );
		$text = apply_filters('the_content', $text);
		$text = str_replace(']]>', ']]&gt;', $text);

		//Regular expression that removes the h1-h6 tags with their content.
		$regex = '#(<h([1-6])[^>]*>)\s?(.*)?\s?(<\/h\2>)#';
		$text = preg_replace($regex,'', $text);

		/***Add the allowed HTML tags separated by a comma.
		h1-h6 header tags are NOT allowed. DO NOT add h1,h2,h3,h4,h5,h6 tags here.***/
		$allowed_tags = '<p>,<em>,<strong>';  //I added p, em, and strong tags.
		$text = strip_tags($text, $allowed_tags);

		/***Change the excerpt word count.***/
		$excerpt_word_count = 55; //This is WP default.
		$excerpt_length = apply_filters('excerpt_length', $excerpt_word_count);

		/*** Change the excerpt ending.***/
		$excerpt_end = '[...]'; //This is the WP default.
		$excerpt_more = apply_filters('excerpt_more', ' ' . $excerpt_end);

		$words = preg_split("/[\n\r\t ]+/", $text, $excerpt_length + 1, PREG_SPLIT_NO_EMPTY);
		if ( count($words) > $excerpt_length ) {
			array_pop($words);
			$text = implode(' ', $words);
			$text = $text . $excerpt_more;
		} else {
			$text = implode(' ', $words);
		}
	}
	return apply_filters('wp_trim_excerpt', $text, $raw_excerpt);
}
add_filter( 'get_the_excerpt', 'bac_wp_strip_header_tags_keep_other_formatting', 5);
?>

Watch out for the <!--more--> tag. If you use it in your content, make sure that it sits beyond your defined excerpt length; otherwise your excerpt will end without an ending link. If there is a More tag in your post, the excerpt stops either where the More tag is placed (without the excerpt's trailing characters) or at your defined excerpt length, whichever comes first.

CODE-2 works on WordPress 2.9 and higher. But I hope that you will always upgrade to the latest version.

To test the regular expression on line 24 of CODE-2, you could use this Regular Expression Test Tool.

CODE-2 works only for the automatically generated excerpt but NOT the manually typed excerpt. If you only use manually typed excerpts for your posts, then use CODE-3 to remove header tags and their content from the excerpt.

A Word of Appreciation: Thanks to these great guys for all their help and feedback: M0R7IF3R, alchymyth and Jeff Lambert. Without them this tutorial might not have been possible.

Result of CODE-2:

The image below shows an example of an edited post in the dashboard, which has header tags (h1 – h6) inserted with their content.

A sample WordPress Post as shown in the dashboard with header tags inserted.

The image below shows the result after using CODE-2.

Result of CODE2: The post showing in the frontend (browser) with header tags and their content removed AND also preserving HTML formatting.

How To Preserve HTML Formatting in CODE-2

If you want to keep some chosen HTML tags, for example the p, a, em, strong, and img tags, find line 29 of CODE-2 and replace it with this:

<?php 
//Replace line 29 of CODE-2 with this:
//Separate tags by commas. Add or Remove tags as you wish. Do NOT add any header tags.
$allowed_tags = '<p>,<a>,<em>,<strong>,<img>'; 
?>

The content of the HTML tags counts toward the excerpt's total word count, because CODE-2 splits the text on whitespace while the allowed tags are still in place. Therefore you need to be careful about which tags you allow. For example, the words in the title attribute of an image or link are counted as part of the excerpt length.
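The effect is easy to see with the same whitespace split that CODE-2 uses. The sketch below is plain PHP; the function name demo_count_excerpt_words() and the sample link are made up for illustration.

```php
<?php
// Hypothetical helper, for illustration only: counts "words" the way the
// whitespace split in CODE-2 does, so pieces of markup are counted too.
function demo_count_excerpt_words( $html ) {
	$pieces = preg_split( "/[\n\r\t ]+/", $html, -1, PREG_SPLIT_NO_EMPTY );

	return count( $pieces );
}

// Made-up sample: one visible word wrapped in a link with a title attribute.
$link = '<a href="/post" title="Read the full story">link</a>';

$with_markup    = demo_count_excerpt_words( $link );               // 6 pieces
$without_markup = demo_count_excerpt_words( strip_tags( $link ) ); // 1 piece
```

A single visible word uses up six of the available words here, so an excerpt with many links or images will look shorter than the configured length suggests.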

Add the tags that you want to allow, separated by commas. Be careful about which tags you add: if you have the “continue reading” link and the excerpt stops in the middle of an unclosed tag, the link will take on that tag's formatting. For instance, if the excerpt stops inside an open “strong” tag, the link will be bold. Most importantly, if the excerpt stops in the middle of an unclosed tag (such as an anchor tag), you may get XML validation errors or warnings that keep your RSS feed from working properly. Therefore, test your blog and your RSS feed.
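One way to reduce the risk of unclosed tags is to append the missing closing tags to the truncated text. The following is a minimal sketch in plain PHP, not something CODE-2 does: the function name demo_close_open_tags() is hypothetical, it assumes well-nested markup, and it cannot repair a tag that was cut in the middle of its attributes.

```php
<?php
// Hypothetical helper, for illustration only. It appends closing tags for
// elements that are still open at the end of a truncated HTML fragment.
function demo_close_open_tags( $html ) {
	// Elements that never take a closing tag.
	$void  = array( 'br', 'hr', 'img' );
	$stack = array();

	// Find every opening and closing tag, in document order.
	preg_match_all( '#<(/?)([a-z][a-z0-9]*)\b[^>]*>#i', $html, $tags, PREG_SET_ORDER );

	foreach ( $tags as $tag ) {
		$name = strtolower( $tag[2] );

		if ( in_array( $name, $void, true ) ) {
			continue;
		}

		if ( '' === $tag[1] ) {
			// Opening tag: remember it.
			$stack[] = $name;
		} elseif ( end( $stack ) === $name ) {
			// Closing tag that matches the most recent opening tag.
			array_pop( $stack );
		}
	}

	// Close whatever is still open, innermost element first.
	while ( $stack ) {
		$html .= '</' . array_pop( $stack ) . '>';
	}

	return $html;
}

$balanced = demo_close_open_tags( '<p>Some <strong>bold text' );
```

Here $balanced holds '<p>Some <strong>bold text</strong></p>'. In CODE-2, a helper like this would presumably be called on the truncated text before the ending string is appended. WordPress also ships its own force_balance_tags() function, which may be worth trying instead; it has not been tested with CODE-2 here.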

How To Change the Excerpt Length in CODE-1 or CODE-2

Whichever code you chose, if you would like to change the automatically generated excerpt length from the default 55 words, find line 28 of CODE-1 or line 33 of CODE-2 and replace it with this:

<?php 
//Replace line 28 of the CODE-1 or Line 33 of CODE-2 with this:	
$excerpt_word_count = 65; /* Choose any number you like. But keep it reasonable.*/
?>
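Because CODE-1 and CODE-2 both pass the word count through apply_filters('excerpt_length', ...), the length could also be set from a separate filter callback instead of editing the function. A minimal sketch, assuming the hypothetical callback name demo_excerpt_length and a priority of 999 (chosen here only so that it runs late). The function_exists() guard is there only so the snippet also loads outside WordPress.

```php
<?php
// Hypothetical callback name; returns the new excerpt length in words.
function demo_excerpt_length( $length ) {
	return 65;
}

// add_filter() only exists inside WordPress, hence the guard.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'excerpt_length', 'demo_excerpt_length', 999 );
}
```

A callback registered this way overrides the $excerpt_word_count value set inside CODE-1 or CODE-2, so use one approach or the other, not both.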

How To Replace the Excerpt More String with a Link in CODE-1 or CODE-2

By default, WordPress outputs an unlinked “[...]” at the end of each excerpt, which is not useful for accessibility or SEO.

Whichever code you chose, if you want to replace the default excerpt more string with a link, find line 32 of CODE-1 or line 37 of CODE-2 and replace it with this line of code (Line-A):

<?php 
//Replace line 32 of CODE-1 or Line 37 of CODE-2 with this:	
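//get_permalink() without arguments uses the current post; $post is not defined inside the function.
$excerpt_end = ' <a href="'. esc_url( get_permalink() ) . '">' . '&raquo; Continue Reading.' . '</a>'; 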
?>

The image below shows the result when using CODE-2 modified with Line-A.

Result of CODE2: The post showing in the browser with header tags and their content removed and also preserving HTML formatting and also adding a link to the excerpt more string.

Is There a Simpler Code to Use?

I just want to strip the header tags and their content in the excerpt and nothing else, but I don't want to rewrite the wp_trim_excerpt() function. Is there another method?

Yes, there is, but it works on some themes and NOT on others. If all you want to do is remove header tags and their content, I suggest that you try CODE-3 first to see if it works for you.

Open functions.php file located in your current theme folder and add (copy and paste) the following CODE-3.

CODE-3:

<?php
/***********************CODE-3**********************************************
* @Author: Boutros AbiChedid 
* @Date:   April 18, 2012
* @Description: ONLY Remove header tags and their content from the Excerpt.
* DOES NOT modify anything else in the excerpt.
* @Tested on: WordPress version 3.3.1 
****************************************************************************/ 

function bac_wp_strip_header_tags_only( $excerpt ) {

	$regex = '#(<h([1-6])[^>]*>)\s?(.*)?\s?(<\/h\2>)#';
	$excerpt = preg_replace($regex,'', $excerpt);

	return $excerpt;
}
add_filter( 'the_excerpt', 'bac_wp_strip_header_tags_only', 0);
?>

Thanks to M0R7IF3R for the code. Without him (especially for the RegEx part), I would not know where to start.

CODE-3 works on some themes or child themes but not others. So all you have to do is to try the code out.

If CODE-3 does not work for your theme (or child theme), then try replacing line 17 with the following. The point here is to use a different filter hook.

<?php 
//Replace line 17 of CODE-3 with this line:	
add_filter( 'get_the_excerpt', 'bac_wp_strip_header_tags_only', 0);
?>

If neither works, then you really need to rewrite the wp_trim_excerpt() function by using CODE-1 or CODE-2 above, or you could spend hours like I did trying to figure out why CODE-3 is not working.

In my case, CODE-3 works for the manually typed excerpt but not the automatically generated one. For the manual excerpt it works with either filter hook, “the_excerpt” or “get_the_excerpt”. Interesting!
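One possible explanation, stated here as an assumption rather than something confirmed in this tutorial: by the time a filter on the_excerpt runs, the automatically generated excerpt has already had all of its HTML tags stripped, so there are no header tags left for the regular expression to find, while a manual excerpt still carries its tags. Similarly, a get_the_excerpt callback at priority 0 would run before WordPress's own wp_trim_excerpt() (hooked at the default priority of 10) has generated anything. The plain PHP sketch below imitates the first situation with strip_tags(); the sample strings are made up.

```php
<?php
// The regular expression used in CODE-3.
$regex = '#(<h([1-6])[^>]*>)\s?(.*)?\s?(<\/h\2>)#';

// Made-up sample: a manual excerpt keeps its HTML tags.
$manual = "<h2>Title</h2>\n<p>Body text.</p>";

// Made-up sample: an automatic excerpt after its tags were stripped.
$automatic = strip_tags( $manual );

// The header is removed from the manual excerpt.
$manual_result = preg_replace( $regex, '', $manual );

// Nothing matches in the tag-free text, so the header words remain.
$automatic_result = preg_replace( $regex, '', $automatic );
```

In this sketch $manual_result no longer contains "Title", while $automatic_result is identical to $automatic, header words included.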

Conclusion

In this tutorial, I showed you three different pieces of code, any one of which you can use to modify the default WordPress excerpt. I showed you how to remove header tags and their content from the excerpt, and how to remove header tags and their content while preserving other HTML tags in the excerpt. I also revisited the excerpt_more and excerpt_length filters and how they can be used in a more convenient way.

Do you have any questions, or anything else to say? If so, please share your opinion in the comments section. Your opinion matters, unless it is spam.

About the Author
Boutros is a professional Drupal & WordPress developer, Web developer, Web designer, Software Engineer and Blogger. He strives for pixel perfect design, clean robust code, and user-friendly interface. If you have a project in mind and like his work, feel free to contact him. Connect with Boutros on Twitter, and LinkedIn.
Visit Boutros AbiChedid Website.

5 Responses to “How To Remove Header Tags and Their Content in WordPress Excerpt”

  1. Ron Carter says:

    Great tutorials on your site, thank you so much. Is it possible to remove other tags, such as <p (paragraph), <ul (lists), etc, instead of just the H tags? If so, how would it be written into the code?

  2. Gareth James says:

    Lovely stuff! Pasted code-1 into my theme’s functions.php file and it worked like a treat! Many thanks…

  3. [...] Remove Header Tags and Their Content in WordPress Excerpt Without a Plugin. [...]

  4. Prb says:

    Very useful info. thanks for the graphics, they really helped out a lot.

  5. Kat Skinner says:

    Thanks for helping out mate; I really appreciate it! About to go through your tutorial and give it a go! Thanks again so much!