Allowing Hyperlinks in Your WordPress Excerpts
September 22nd, 2010 | Published in Mind
By default, WordPress strips out all the HTML tags from your post excerpts. I needed to allow hyperlinks, but there is a problem when WordPress tries to truncate the post’s content. The wp_trim_excerpt function is what WordPress uses to do all the trimming work, I simply copied the code, modified it, and stuck my new version in the mu-plugins directory (or you could add it to your functions.php).
The first line I needed to change was the PHP function call strip_tags(). I need to set it to allow the <a> tag… very simple to do:
Original Line:
$text = strip_tags($text);
New Line:
$text = strip_tags($text, '<a>');
The second line I needed to change was where WordPress splits the excerpt up into separate words. This was a bit tricky because you run into an issue where you could be cutting of the </a>, which could screw up your entire page. I decided the best thing to do would be to treat the entire hyperlink as a single word. So, <a href=”http://lewayotte.com/”>My Website</a>, would count as a single word. By default, WordPress creates a 55 word excerpt. That is pretty easy to change as well.
Original Line:
$words = preg_split('/[\n\r\t ]', $text, $excerpt_length + 1, PREG_SPLIT_NO_EMPTY );
New Line:
$words = preg_split('/(<a.*?a>)|\n|\r|\t|\s/', $text, $excerpt_length + 1, PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE );
Next, you simple need to remove the current wp_trim_excerpt filter and add a new one that points to your new function. Here is what the entire block of code looks like for me:
<?php
function new_wp_trim_excerpt($text) {
$raw_excerpt = $text;
if ( '' == $text ) {
$text = get_the_content('');
$text = strip_shortcodes( $text );
$text = apply_filters('the_content', $text);
$text = str_replace(']]>', ']]>', $text);
$text = strip_tags($text, '<a>');
$excerpt_length = apply_filters('excerpt_length', 55);
$excerpt_more = apply_filters('excerpt_more', ' ' . '[...]');
$words = preg_split('/(<a.*?a>)|\n|\r|\t|\s/', $text, $excerpt_length + 1, PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE );
if ( count($words) > $excerpt_length ) {
array_pop($words);
$text = implode(' ', $words);
$text = $text . $excerpt_more;
} else {
$text = implode(' ', $words);
}
}
return apply_filters('new_wp_trim_excerpt', $text, $raw_excerpt);
}
remove_filter('get_the_excerpt', 'wp_trim_excerpt');
add_filter('get_the_excerpt', 'new_wp_trim_excerpt');
Regular Expression Explained
I changed /[\n\r\t\s ]/ to /(<a.*?a>)|\n|\r|\t|\s/ because I needed to capture everything in the <a> HTML tags and count it as a single word. \n, \r, \t, \s are all pretty basic regex characters that preg_split() is using to break up the content. (<a.*?a>) is what captures everything in the <a> tag. The .* means all “characters” adding the ? to .* makes it “non-greedy.” This for the case where there are multiple hyperlinks in the content. It prevents the regex from thinking <a href=”link1″>link 1</a>, <a href=”link2″>link 2</a> is a single word. The parenthesis simply group the <a.*?a> together, so it doesn’t try to split those items separately and then I needed to add PREG_SPLIT_DELIM_CAPTURE to preg_split, or preg_split would have just removed the link as garbage.
You should be able to use this as a pretty solid base for adding other HTML tags.
Tags: functions.php, get_the_excerpt, mu-plugins, php, preg_split, strip_tags, wordpress, wp_trim_excerpt

