Merging HTML Files, How do I?


HowdeeDoodee's Avatar
Senior Member with 218 posts.
 
Join Date: Aug 2004
Experience: Intermediate
26-May-2006, 10:54 AM #1
Merging HTML Files, How do I?
I need to merge thousands of HTML files. I have stripped out the header and footer info, so basically what I have are text files with an html extension. I have tried changing the file extensions from html to txt, but doing this leaves some files with corrupted text.

I have tried inserting html files into a Word document but that only works for a few files and does not work for as many files as I have.

I have not been able to find a file merge utility that will merge all the html files together.

So...

Question 1:
Is there a way, perhaps by vba code, of inserting a directory of html files into a Word document regardless of how many files are in the directory? The average directory size is about 5 meg.

OR

Question 2:
Is there a way, any way, of merging or combining html files?

OR

Question 3:
Is there a way, any way, of inserting a directory of html files into an editor?

Thank you in advance for any replies.
Rockn's Avatar
Distinguished Member with 17,888 posts.
 
Join Date: Jul 2001
Location: Mexico of the North, MN
Experience: Disenfranchised American Male
26-May-2006, 11:57 AM #2
How do you want to merge them? Do you want a Word document or an HTML document? What about PDF format?
cristobal03's Avatar
Distinguished Member with 2,992 posts.
 
Join Date: Aug 2005
Experience: Advanced
26-May-2006, 12:06 PM #3
If you're merging a 5MB directory exclusively of HTML files, you must be talking about hundreds of files. Why do you want to do this? If I understand correctly, you want to merge everything from <BODY> to </BODY> in each file into one single file?

chris.
HowdeeDoodee's Avatar
Senior Member with 218 posts.
 
Join Date: Aug 2004
Experience: Intermediate
26-May-2006, 01:30 PM #4
Thank you for the responses. OK, here are the questions and answers.

Quote:
How do you want to merge them? Do you want a Word document or an HTML document?
What about PDF format?
I would prefer a Word document or txt. The problem I am having is that when I try to change the file extension, the file contents can end up corrupted. No pdf. If I have one large html file, I think I can open it in FrontPage (FP) and get at the contents that way.


Quote:
If you're merging a 5MB directory exclusively of HTML files, you must be talking about hundreds of files. Why do you want to do this? If I understand correctly, you want to merge everything from <BODY> to </BODY> in each file into one single file?
Actually the number of files is over 12,000. I want to do this so I can convert the body section of each file into plain text, place that text in Excel, and then shape it into a file for MySQL input. The content of the files becomes part of a MySQL database. I have deleted all the header information in each file, so all that is left is the content between the header and footer that you see on the screen.
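For the "file for MySQL input" step, here is a minimal sketch in Python of one possible way to produce such a file directly. It writes one CSV row per HTML file (file name plus contents), which an import tool or MySQL's LOAD DATA statement could then read. The function name, the column names, and the UTF-8 encoding are all assumptions for illustration; nothing here is from the original poster's workflow.

```python
import csv
from pathlib import Path


def write_rows_for_mysql(src_dir, out_csv, pattern="*.html"):
    """Write one CSV row per matching file: (file name, file contents).

    Returns the number of files written. Hypothetical helper for illustration.
    """
    src = Path(src_dir)
    out = Path(out_csv)
    count = 0
    # newline="" lets the csv module control line endings itself.
    with out.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["filename", "body"])  # assumed column names
        for path in sorted(src.glob(pattern)):
            # UTF-8 is an assumption; errors="replace" avoids stopping
            # on a file that turns out to use a different encoding.
            text = path.read_text(encoding="utf-8", errors="replace")
            writer.writerow([path.name, text])
            count += 1
    return count
```

Called as `write_rows_for_mysql(r"c:\TempStore", r"c:\TempStore\rows.csv")`, this would cover every `.html` file in the folder in one pass. The csv module quotes commas and line breaks inside the text, so each file stays in its own row.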

I have tried stripping out the html tags with a tag stripper but some files still ended up corrupted because the file names were changed.

Thank you again for your time and the replies.
Rockn's Avatar
Distinguished Member with 17,888 posts.
 
Join Date: Jul 2001
Location: Mexico of the North, MN
Experience: Disenfranchised American Male
26-May-2006, 01:41 PM #5
Do you have a sample of the HTML you can post here?
cristobal03's Avatar
Distinguished Member with 2,992 posts.
 
Join Date: Aug 2005
Experience: Advanced
26-May-2006, 01:47 PM #6
I don't see how changing the extension from html to txt would cause file corruption. What did you use to generate the HTML files?
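One way to test this is to hash a file before and after renaming it. The sketch below is a hypothetical helper in Python, not something anyone in the thread ran. If the two hashes match, the rename left the bytes alone, and any garbled text more likely comes from how the program opening the renamed file interprets it.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def rename_keeps_bytes(path, new_suffix=".txt"):
    """Rename `path` to use `new_suffix`; return True if the bytes are unchanged."""
    old = Path(path)
    before = sha256_of(old)
    new = old.with_suffix(new_suffix)  # e.g. page.html -> page.txt
    old.rename(new)
    return sha256_of(new) == before
```

Try it on a copy of one of the files that ended up corrupted, for example `rename_keeps_bytes(r"c:\TempStore\sample.html")`, where the path is a placeholder.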

chris.
Rockn's Avatar
Distinguished Member with 17,888 posts.
 
Join Date: Jul 2001
Location: Mexico of the North, MN
Experience: Disenfranchised American Male
26-May-2006, 03:01 PM #7
Neither do I. Parsing out all of the HTML would make it even more unreadable as the output would be all strung together without any formatting.
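To illustrate the formatting concern, here is a rough sketch of a tag stripper that keeps line breaks, using Python's standard `html.parser` module. The set of tags treated as line breaks and the names `TextExtractor` and `html_to_text` are assumptions made for this example; real pages may need a different set.

```python
from html.parser import HTMLParser

# Tags treated as line breaks. This set is an assumption; adjust it to the markup.
BLOCK_TAGS = {"p", "br", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collect text content, inserting a newline around block-level tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # turns &amp; into & and so on
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:  # HTMLParser passes tag names in lowercase
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Strip tags from `html`, keeping one line of text per block element."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = [" ".join(line.split()) for line in "".join(parser.parts).splitlines()]
    return "\n".join(line for line in lines if line)
```

With this approach, paragraphs and list items come out on separate lines instead of running together, while inline tags such as `<b>` simply disappear.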
HowdeeDoodee's Avatar
Senior Member with 218 posts.
 
Join Date: Aug 2004
Experience: Intermediate
26-May-2006, 04:26 PM #8
Thank you for the replies.

This issue has been solved.

Here is the solution.

Remember DOS?

Go to...

> Start
> Programs
> Accessories
> Command Prompt
Type the following command and press the Enter key:

C:\>copy c:\TempStore\a*.html c:\TempStore\AllLettTwo.html

All files beginning with the letter "a" will be merged into the single file AllLettTwo.html.
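For anyone who would rather script the merge, here is a minimal sketch of the same idea in Python. The function name `merge_files` is hypothetical. Two things it does differently from the command above, both as a precaution: it skips the output file if that file matches the pattern (AllLettTwo.html itself matches a*.html, since Windows file names are not case-sensitive), and it copies raw bytes. As far as I know, `copy` treats files as ASCII text when combining them unless the `/b` switch is given, so check the `copy` documentation if the merged file looks cut short.

```python
from pathlib import Path


def merge_files(src_dir, pattern, out_path):
    """Concatenate files in `src_dir` matching `pattern` into `out_path`.

    Each file's bytes are copied unchanged, followed by one newline as a
    separator. Returns the names of the merged files, in merge order.
    """
    src = Path(src_dir)
    out = Path(out_path).resolve()
    # Build the input list before opening the output file, and leave the
    # output file out so it is never merged into itself.
    inputs = sorted(
        p for p in src.glob(pattern) if p.is_file() and p.resolve() != out
    )
    with out.open("wb") as merged:
        for path in inputs:
            merged.write(path.read_bytes())
            merged.write(b"\n")
    return [p.name for p in inputs]
```

A call matching the command above would be `merge_files(r"c:\TempStore", "a*.html", r"c:\TempStore\AllLettTwo.html")`.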


After the html files are merged, I can open the merged file with FrontPage or another html editor and copy its contents into the Word document.

You know that little message you get when you try to change a filename extension that says something to the effect "You may lose data"? There is a reason for that message.

Keywords: Merging html files, joining html files, combining html files, merge html files, join html files
Rockn's Avatar
Distinguished Member with 17,888 posts.
 
Join Date: Jul 2001
Location: Mexico of the North, MN
Experience: Disenfranchised American Male
26-May-2006, 06:20 PM #9
Well that was a simple and unexpected solution...COOL! Sometimes ya gotta take a step back and approach things from a different angle.
Copyright © 1996 - 2008 TechGuy, Inc. All rights reserved.