Webmaster Key - Discussion Forums



Topic: Keyword Filtering Function in PHP
Andy
« on: June 21, 2008, 03:04:52 PM »

I started coding this today as part of a project, so I thought I would pose it as a coding challenge to others and then post my own solution once I have ironed it out.

The challenge is to code a PHP function that accepts a string that is supposed to be a comma-delimited list of keywords/phrases and returns a list of usable keywords in lower case, separated by commas.

So the function sanitizes the input and removes duplicate keywords. Then it returns the keyword list.

The input can be any sequence of characters.

Here is what I have coded so far, but it is not finished yet:

Code:
<?php
function clean_keywords($keywords) {
    // preg_replace() is used here; ereg_replace() was removed in PHP 7
    $keywords = preg_replace('/[^, a-z]/', '', trim(strtolower($keywords))); // remove unwanted chars
    $keywords = preg_replace('/ +/', ' ', $keywords); // reduce long spaces to one space
    $keywords = preg_replace('/[ ,]*,[ ,]*/', ',', $keywords); // after the above reductions condense any wiped out entries
    $keywords = trim($keywords, ', '); // get rid of stragglers at the left and right extremities
    $ka = explode(',', $keywords); // separate out the remaining keywords
    // Next we will remove duplicate keywords
    // If exists $newkey which is $value then remove the current - thinking out loud now ...
    foreach ($ka as $key => $value) echo strpos($keywords, $value); // if (strpos($keywords,$value) !== false) unset($ka[$key]);

    // Remove short words

    // Remove offensive words

    // Spell check the remaining words

    $keywords = implode(', ', $ka); // Convert the array of words back into a string
    return $keywords;
}
?>


As you can see, it may be trickier than you first thought once you have to imagine everything your website visitors might enter into your web form and submit.

Further thoughts: the keywords should be limited to words that are longer than, say, 2 characters. Should this also apply to the individual words of a phrase?
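If the length rule were applied to the individual words of a phrase, it could look something like the sketch below. The helper name strip_short_words() and its $min_len parameter are made up for illustration, and the sketch assumes the phrase has already been reduced to lower-case words separated by single spaces.

```php
<?php
// Hypothetical helper (not part of the original code): remove every word of
// $min_len characters or fewer from a phrase of space-separated words.
function strip_short_words($phrase, $min_len = 2) {
    $kept = array();
    foreach (explode(' ', $phrase) as $word) {
        if (strlen($word) > $min_len) {
            $kept[] = $word; // keep only words longer than the threshold
        }
    }
    return implode(' ', $kept);
}
```

One thing to weigh up: filtering inside phrases can change their meaning, and a phrase made up only of short words disappears completely, so the result may need a check for an empty string.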

There should perhaps also be a limit on the number of keywords in the list, with only the best keywords retained. This is because the keyword list will be used in a meta tag and to form a URL stub for a web page.
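A minimal sketch of capping the list follows. What counts as the "best" keywords is not defined here, so the sketch simply assumes that visitors enter their most important keywords first and keeps the first few. The function name limit_keywords() and the default limit of 10 are assumptions for illustration.

```php
<?php
// Hypothetical helper: keep only the first $max entries of a keyword list
// that is already cleaned and separated by ", ".
function limit_keywords($keywords, $max = 10) {
    $list = explode(', ', $keywords);
    return implode(', ', array_slice($list, 0, $max));
}
```

For example, limit_keywords('php, seo, web design, forums', 2) keeps only the first two entries.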

Also, the words should be passed through a bad words filter to remove offensive words.
« Last Edit: June 21, 2008, 03:26:46 PM by Andy »

Andy
« Reply #1 on: June 22, 2008, 07:15:17 AM »

Here is the finished version of the code that I came up with:

Code:
<?php
function word_ok($w) {
    global $bad_words; // This is an array of bad words that is defined outside of the function.
    foreach ($bad_words as $bad) if (strpos($w, $bad) !== false) return false;
    return true;
}

function clean_keywords($keywords) {
    // preg_replace() is used here; ereg_replace() was removed in PHP 7
    $keywords = preg_replace('/[^, a-z]/', '', trim(strtolower($keywords)));
    $keywords = preg_replace('/ +/', ' ', $keywords);
    $keywords = preg_replace('/[ ,]*,[ ,]*/', ',', $keywords); // collapse empty entries and spaces around commas
    $keywords = trim($keywords, ', ');
    $ka = explode(',', $keywords);
    $words = array();
    // Reject short words, bad words and avoid duplicating words
    foreach ($ka as $value) {
        if (strlen($value) > 2 && word_ok($value)) $words[$value] = $value;
    }
    $keywords = implode(', ', $words);
    return $keywords;
}
?>


In this code I use arrays to easily split and combine the list of keywords.

I used the keyword as an index key for a temporary array, so if there are duplicate words, they are not added as new array elements. This is much simpler than counting how many times words appear in the list and so on.
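To show the array-key trick in isolation, here is a small sketch with a made-up input list. It also compares the result with PHP's built-in array_unique(), which gives the same list here.

```php
<?php
// Example input with repeated entries (made up for illustration).
$ka = array('php', 'seo', 'php', 'web design', 'seo');

$words = array();
foreach ($ka as $value) {
    // A repeated keyword writes to the slot it already occupies,
    // so no new element is added and the original order is kept.
    $words[$value] = $value;
}
$unique = implode(', ', $words);

// The built-in alternative keeps the first occurrence of each value.
$unique_builtin = implode(', ', array_unique($ka));
```

The advantage of the key-based version in the function above is that the duplicate check happens in the same loop as the length and bad-word checks.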

To filter out bad words, I added another function, word_ok().
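Because word_ok() uses strpos(), it rejects any keyword that merely contains a bad word as a substring. If that turns out to be too strict, a whole-word comparison is one possible alternative. The sketch below is only an illustration: the name word_ok_whole() is made up, and the bad-word list is passed in as a parameter instead of being read from a global.

```php
<?php
// Hypothetical variant of word_ok(): a keyword or phrase is rejected only
// when one of its space-separated words exactly equals a listed bad word.
function word_ok_whole($keyword, array $bad_words) {
    foreach (explode(' ', $keyword) as $word) {
        if (in_array($word, $bad_words, true)) {
            return false;
        }
    }
    return true;
}
```

The trade-off is that exact matching lets through bad words that are glued onto other letters, which the substring check would catch.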

Related to this, I developed a function to create keyword-rich URLs. It takes the keyword list or, if this is empty, the title of the web page (in which case the title is passed to the keywords script to clean it up).
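The URL function itself is not posted in this thread, so the following is only a sketch of how a stub might be built from an already cleaned keyword list. The name keywords_to_stub(), the hyphen separator and the limit of five words are all assumptions. The fallback to the page title is left out; it would amount to running the title through clean_keywords() first.

```php
<?php
// Hypothetical helper: turn a cleaned keyword list such as
// "web design, php, seo" into a URL stub.
function keywords_to_stub($keywords, $max_words = 5) {
    // After cleaning, the list holds only a-z, spaces and commas,
    // so splitting on runs of spaces and commas yields the single words.
    $words = preg_split('/[ ,]+/', $keywords, -1, PREG_SPLIT_NO_EMPTY);
    $words = array_slice($words, 0, $max_words); // keep the stub short
    return implode('-', $words);
}
```

An empty keyword list gives an empty stub, which is the point where the title fallback would come in.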

Then the software checks the database for an already existing URL like this and, if it finds one, appends a number to make the new URL unique. WordPress does this too when you choose to have keyword-rich URLs; however, I did not examine their code, since tracing through all the functions involved can be very complicated. Also, I wanted to have my own solution to avoid having to apply the GPL licence.
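The uniqueness step could be sketched as follows. To keep the example self-contained, the array $existing stands in for the database lookup. The function name unique_stub(), the "-2", "-3" suffix format and the starting number are assumptions, not the actual implementation.

```php
<?php
// Hypothetical helper: $existing is a plain array standing in for the
// database query that would normally report which URL stubs are taken.
function unique_stub($stub, array $existing) {
    if (!in_array($stub, $existing, true)) {
        return $stub; // not used yet, keep it as it is
    }
    $n = 2;
    while (in_array($stub . '-' . $n, $existing, true)) {
        $n++; // try the next number until a free stub is found
    }
    return $stub . '-' . $n;
}
```

With a real database, two requests could pick the same stub at the same moment, so a unique index on the URL column would be a sensible extra safeguard.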

This is all aimed at automated SEO of user-entered data.

The other thing I will do is write some JavaScript to filter the characters as they are typed on the keyboard. This helps to force/educate users to enter data in the correct format. The PHP is more complex since it has to assume that any kind of data may be entered, such as by hackers trying to break the script.
« Last Edit: June 22, 2008, 12:47:53 PM by Andy »
