PHP paragraph function

Status
This thread has been Locked and is not open to further replies. Please start a New Thread if you're having a similar issue. View our Welcome Guide to learn how to use this site.

dudeking

Thread Starter
Joined
Feb 7, 2007
Messages
483
Hey, I have made a very simple CMS so a client can update text on the site. The only thing he needs to be able to add is paragraphs, so I have written this function, which takes its input from a textarea.

PHP:
function addParagraphs($string)
{
	//Remove HTML
	$string = strip_tags($string);
	
	//Remove Key Words
	$string = str_replace("{Line_Break}","[Line_Break]",$string);
	
	//Strip Line Breaks
	$string = str_replace("\r\n","\n",$string);
	$string = str_replace("\r","\n",$string);
	$string = str_replace("\n\n","\n",$string);
	
	//Remove Multiple Spaces
	$string = str_replace("  "," ",$string);
	
	//Add Tag Where break is Needed
	$string = str_replace("\n","{Line_Break}",$string);
	
	//Replace Tag With HTML
	$string = str_replace("{Line_Break}","</p><p>",$string);
	$string = '<p>'.$string.'</p>';
	
	//Return Formatted String
	return $string;
}
It works in my closed testing, but is there anything that could crash it or create a vulnerability? The data from here goes into a MySQL database.

Thanks for any advice :)
 

tomdkat

Retired Trusted Advisor
Joined
May 6, 2006
Messages
7,148
Maybe try throwing random garbage at it and see what happens. Also, try embedding HTML tags in the text being passed to the function to make sure that's handled.

What if someone entered text that contained an HTML link or reference to another site which tried to load something when referenced? That sort of thing. :)

Peace...
 

dudeking

Thread Starter
Joined
Feb 7, 2007
Messages
483
Okay, after lots of testing I found that in some cases empty paragraphs were created, e.g. <p></p>, so I have added a bit to remove them. After your comment about HTML I have also added support for bold text, but I don't want links to be added to the text. In testing, when an <a> tag is entered only the link text remains; the tag and the URL are removed.

New script...

PHP:
function addParagraphs($string)
{
	//Remove HTML
	$string = strip_tags($string);
	
	//Remove Key Words
	$string = str_replace("{Line_Break}","[Line_Break]",$string);
	
	//Strip Line Breaks
	$string = str_replace("\r\n","\n",$string);
	$string = str_replace("\r","\n",$string);
	$string = str_replace("\n\n","\n",$string);
	
	//Remove Multiple Spaces
	$string = str_replace("  "," ",$string);
	
	//Add Tag Where break is Needed
	$string = str_replace("\n","{Line_Break}",$string);
	
	//Replace Tag With HTML - Line Break
	$string = str_replace("{Line_Break}","</p><p>",$string);
	$string = '<p>'.$string.'</p>';
	
	//Replace Tag With HTML - Bold
	$string = str_replace("{bold}",'<span class="bold">',$string);
	$string = str_replace("{/bold}",'</span>',$string);
	
	//Remove Useless Formatting
	$string = str_replace('<span class="bold"></span>','',$string);
	$string = str_replace('<p></p>','',$string);
	
	//Return Formatted String
	return $string;
}
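
A possible variation on the script above, offered as a sketch rather than a drop-in replacement. A single str_replace("\n\n","\n",...) pass only halves a run of line breaks, which is the likely source of the empty <p></p> pairs; splitting on runs of line breaks avoids producing them in the first place. The function name addParagraphsSketch is hypothetical, and escaping with htmlspecialchars() after strip_tags() is an added assumption, not something the original script does.

```php
<?php
// Hypothetical rewrite of addParagraphs(); the {bold} marker and the
// "bold" class mirror the thread, everything else is an assumption.
function addParagraphsSketch(string $text): string
{
    // Drop tags as in the original, then normalise line endings to "\n".
    $text = strip_tags($text);
    $text = str_replace(["\r\n", "\r"], "\n", $text);

    // Collapse every run of spaces or tabs to a single space.
    $text = preg_replace('/[ \t]+/', ' ', $text);

    // Split on one or more line breaks and discard empty pieces,
    // so no empty paragraph is ever generated.
    $lines = preg_split('/\n+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    $html = '';
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // the line held only whitespace
        }
        // Escape &, <, > and quotes that strip_tags() left behind.
        $line = htmlspecialchars($line, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
        // Convert only balanced {bold}...{/bold} pairs into spans.
        $line = preg_replace(
            '/\{bold\}(.+?)\{\/bold\}/',
            '<span class="bold">$1</span>',
            $line
        );
        $html .= '<p>' . $line . '</p>';
    }
    return $html;
}

echo addParagraphsSketch("Hello\r\n\r\n\r\nWorld  {bold}now{/bold}");
// prints <p>Hello</p><p>World <span class="bold">now</span></p>
```

Because only matched {bold} pairs are converted, an unclosed {bold} stays as literal text instead of leaving an open <span> in the page.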
 

tomdkat

Retired Trusted Advisor
Joined
May 6, 2006
Messages
7,148
What if they enter a paragraph tag in the text they enter, like this:
This is a test of creating <p>a paragraph within a paragraph</p>
and of course, the permutations:
This is a test of creating a <p> paragraph with a paragraph.
This is a test of creating a paragraph </p> with a paragraph.
This is a test of creating a <p> </p>paragraph with a paragraph.
This is a test of creating a <p>paragraph with
a paragraph.</p>
Also, will you allow other HTML tags?
Check out my swell image! <img src="http://www . badsite-do-not-visit-me .com/sexygirl.jpg"/>
Or JavaScript:
<html>
<head>
<script type="text/javascript" src="http://www . badsite-do-not-visit-me .com/badscript.js"></script>
<script type="text/javascript">
// call bad function in badscript.js here
</script>
</head>
<body onload="runBadScript()">
This is some text <p>to hack the site!
All your base are belong to us!</p>
</body>
</html>
And so on. :)

Peace...
 

dudeking

Thread Starter
Joined
Feb 7, 2007
Messages
483
Beautiful :) I just tested it with all your examples. It removes the HTML, keeps the paragraph breaks as typed, and adds no images or scripts.

Thanks
 

tomdkat

Retired Trusted Advisor
Joined
May 6, 2006
Messages
7,148
Excellent! I suggest giving others some time to think of things we've missed so far.

One thing that comes to my mind is embedding PHP code, like this:

Code:
This is a <p>paragraph <?php call some function to do something obscure ?></p>
This is another <b>paragraph
<?php call some function to do something obscure ?>
that should be in bold
</b>
</p>
You get the idea. :)

Peace...
 

dudeking

Thread Starter
Joined
Feb 7, 2007
Messages
483
Arrr I hadn't thought of that....I'll give it a go now.
Thanks
 

tomdkat

Retired Trusted Advisor
Joined
May 6, 2006
Messages
7,148
I'm just trying to think "outside the box". :)

I don't know how bulletproof you'll need that to be, but I would also look at filtering attempts to embed objects with the "<object>" and "<embed>" tags.

EDIT: The PHP example might be a hole if someone can figure out some kind of SQL injection exploit using existing database connections, or something else I don't know how to do. :)

Peace...
 

dudeking

Thread Starter
Joined
Feb 7, 2007
Messages
483
I can't actually find anything strip_tags doesn't remove. Are there any known cases that stop it working, or any tags that are allowed through?
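
One thing worth checking here, shown as a small sketch: strip_tags() removes tags but does not escape what is left, so ampersands and quotes pass through unchanged. The PHP manual also warns that partial or broken tags can cause more text to be removed than expected. The sample string below is made up for illustration; escaping with htmlspecialchars() when the text is written into the page is a common complement, not part of the script in this thread.

```php
<?php
// Made-up sample input: a link wrapped around text with & and quotes.
$input = '<a href="http://example.com">Tom & "Jerry"</a>';

// The tag and its URL are removed, but & and " are left as they are.
$stripped = strip_tags($input);

// Escaping at output time turns the leftovers into entities.
$escaped = htmlspecialchars($stripped, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n"; // prints Tom & "Jerry"
echo $escaped, "\n";  // prints Tom &amp; &quot;Jerry&quot;
```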
 

tomdkat

Retired Trusted Advisor
Joined
May 6, 2006
Messages
7,148
I don't know. So, I'm thinking up as much random stuff as I can to make sure something unwanted doesn't creep through. :)

Peace...
 

dudeking

Thread Starter
Joined
Feb 7, 2007
Messages
483
My site's pretty secure against SQL injection. I have the server set up to add \ to all quotes in $_POST data, so unless they can edit my .ini file I don't think that's too much of an issue. But I'm sure there are people out there who can.
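
For comparison, a sketch of the prepared-statement approach, which does not depend on server-side quote escaping (the magic_quotes_gpc setting was deprecated in PHP 5.3 and removed in PHP 5.4). This is an illustration under assumptions: an in-memory SQLite database stands in for the MySQL server so the example is self-contained (it needs the pdo_sqlite extension), and the pages table, body column and savePageBody() helper are hypothetical names.

```php
<?php
// Sketch only: SQLite in memory replaces the real MySQL connection.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE pages (id INTEGER PRIMARY KEY, body TEXT)');

// Hypothetical helper: the value is bound as a parameter, so it is
// never spliced into the SQL text, whatever quotes it contains.
function savePageBody(PDO $pdo, string $body): int
{
    $stmt = $pdo->prepare('INSERT INTO pages (body) VALUES (:body)');
    $stmt->execute([':body' => $body]);
    return (int) $pdo->lastInsertId();
}

// Input that would break a query built by string concatenation.
$id = savePageBody($pdo, "It's a test'); DROP TABLE pages; --");

// Read the row back: the text is stored exactly as it was typed.
$stmt = $pdo->prepare('SELECT body FROM pages WHERE id = :id');
$stmt->execute([':id' => $id]);
$stored = $stmt->fetchColumn();
$rowCount = (int) $pdo->query('SELECT COUNT(*) FROM pages')->fetchColumn();
```

With a real MySQL server only the connection string passed to PDO would change; the prepare() and execute() calls stay the same.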
 

colinsp

Colin
Joined
Sep 5, 2007
Messages
2,354
Not sure about the vulnerabilities, but what about giving them a WYSIWYG editor? That may help, as there is a lot of code checking built in.

A couple that spring to mind are TinyMCE and OpenWYSIWYG.
 

dudeking

Thread Starter
Joined
Feb 7, 2007
Messages
483
I have tried that, but most of them lack support for Safari, and it's kind of surplus to requirements. It really just needs to be plain text, which is then styled to fit in with the rest of the site. The only functionality I needed was paragraphs (they need to be detected and have <p> tags placed in) and a bold tag for when people want to be really direct.

I'm just concerned about this box letting things be added to the database. How does PHP handle weird character codes, like the hearts and things people put on social networks? Can I force UTF-8?
 

tomdkat

Retired Trusted Advisor
Joined
May 6, 2006
Messages
7,148
dudeking said: "My site's pretty secure against SQL injection. I have the server set up to add \ to all quotes in $_POST data, so unless they can edit my .ini file I don't think that's too much of an issue. But I'm sure there are people out there who can."

Remember to think outside the box. As soon as you open up the database to receiving input from a user, you're opening yourself up for who knows what to be thrown at the database. :)

dudeking said: "I'm just concerned about this box letting things be added to the database. How does PHP handle weird character codes, like the hearts and things people put on social networks? Can I force UTF-8?"

Great question! I know some malicious JavaScript is obfuscated, and I wonder if a malicious PHP exploit could be obfuscated in a similar fashion and include binary data that would be interpreted as control codes (the symbols you mention). Since you're using an HTML form, you might be able to control the character encoding through one of the attributes of the <form> tag, such as accept-charset.

Additionally, we need to consider the exposure of the input fields. Will random people be submitting text or will one or two specific people you know be the only ones?

Peace...
 

dudeking

Thread Starter
Joined
Feb 7, 2007
Messages
483
In this instance only the client and members of staff will have access. But I do need a commenting system for a band's website I'm working on, so eventually it will be used for that too.

The thing is, the encoding set on the <form> tag can easily be changed by the user. I've just done some googling and there's a function called utf8_encode(), so I'm guessing that will make sure everything is saved in the database correctly.
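
A caution on that guess, with a sketch: utf8_encode() converts ISO-8859-1 text to UTF-8 and does not check anything, so input that is already UTF-8 ends up double-encoded (the function is also deprecated as of PHP 8.2). Validating the input may fit the goal better. The example assumes the mbstring extension is available; isValidUtf8() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: true only if the bytes are well-formed UTF-8.
function isValidUtf8(string $input): bool
{
    return mb_check_encoding($input, 'UTF-8');
}

$heart = "\u{2665}";   // a heart symbol, 3 bytes in UTF-8
$broken = "\xC3\x28";  // an invalid UTF-8 byte sequence

var_dump(isValidUtf8("I $heart PHP")); // bool(true)
var_dump(isValidUtf8($broken));        // bool(false)

// The same conversion utf8_encode() performs, done through mbstring:
// every byte of the heart is treated as a Latin-1 character and
// re-encoded, so 3 bytes become 6 and the symbol is mangled.
$double = mb_convert_encoding($heart, 'UTF-8', 'ISO-8859-1');
var_dump(strlen($heart), strlen($double)); // int(3) int(6)
```

Rejecting or cleaning input that fails the check, and making sure the database connection itself uses UTF-8, is one way to keep symbols like hearts intact.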
 