1. Computer problem? Tech Support Guy is completely free -- paid for by advertisers and donations. Click here to join today! If you're new to Tech Support Guy, we highly recommend that you visit our Guide for New Members.

Replacing HTML in Excel cells with formatted text

Discussion in 'Business Applications' started by JNTM, May 2, 2010.

Thread Status:
Not open for further replies.
Advertisement
  1. JNTM

    JNTM Thread Starter

    Joined:
    May 2, 2010
    Messages:
    11
    I have as Excel 2007 spreadsheet exported from another piece of software that includes a column of 1000 or so short reports, one report per Excel cell. The reports were originally formatted documents (headings, bullet points, etc. - even one or two small graphics), but the exporting process has converted them into HTML.

    I want to retain the structure of the spreadsheet (really just a three column data-base of text items) but convert the HTML back into readable formatted text with the original headings, etc. Since the spreadsheet export happens repeatedly, I'm ideally looking for a simple process that could be done as a regular routine.

    Suggestions gratefully received!
     
  2. Keebellah

    Keebellah Trusted Advisor

    Joined:
    Mar 27, 2008
    Messages:
    6,582
    First Name:
    Hans
    welcome to the forum,
    To get an idea, could you post an Excel sheet with let's say one or two rows (non sensitve data) so that we can picture it?

    Maybe its simple but it's easier to 'talk' about if you can see it.
    Thanks.
     
  3. JNTM

    JNTM Thread Starter

    Joined:
    May 2, 2010
    Messages:
    11
    Thanks for your interest.

    The attached demo is one row of the data-base, and contains an artificial 'sampler' report that uses many more style options that most students would use. It also includes one graphics link (with URL disabled).

    It is a kind of 'lab-book' for students, with each row giving the name of the activity, the student's (usually quite brief) report, and the ID of the student. Since you might have, say, a dozen activities and a hundred students, the data-base could have 1-2000 reports.

    The students enter their reports via a standard text-entry window that allows quite a range of style elements. The system holds the reports internally in HTML, and so its Export facility (which generates the data-base I am concerned with) provides the reports in the HTML format shown in my example.

    The people who have to mark the reports have to use the data-base, so offering them raw HTML or even cleaned up plain text isn't likely to be appreciated - nor would it be fair to the student who has taken the trouble to lay out their report neatly. Hence the need for a way to get the database to present its content as properly formatted reports.

    While there may well be proprietary products that could do this for me, I'd prefer to stick with something like Excel that is universally available.

    I know that a single Excel cell is capable of holding (and displaying) a suitably formatted version of one of the HTML reports, because I've manually created a formatted report in a single cell. But so far I've failed to find a way to do this automatically (and my skills don't yet run to macro-writing!).

    I've tried copying the Excel cells into another package, formatting them, and pasting them back into Excel, but I get caught in a Catch 22: If I paste formatted text into Excel, it puts each line in a separate cell. If I try to temporarily create a single paragraph, so that it goes into a single cell, with the intention of then restoring the paragraphs, I lose the formatting!

    Help gratefully received!
     

    Attached Files:

  4. Keebellah

    Keebellah Trusted Advisor

    Joined:
    Mar 27, 2008
    Messages:
    6,582
    First Name:
    Hans
    Okay I have it, I'm not that good with 2007 or higher but will see if I understand it and can help.
    I can't promise success yet.
     
  5. Keebellah

    Keebellah Trusted Advisor

    Joined:
    Mar 27, 2008
    Messages:
    6,582
    First Name:
    Hans
    Okay, I hope this does what you want.
    I added a module to your sheet, the modules name and the function's name is HTML2Txt
    If you look in the code I think you'll get the idea.
    It searches and replaces html code that you have to update if run into codes that aren't present.
    I vertiaclly alligned the cells to top and you will have to format the cells to wrap text etc etc for you final version but that will be simple.

    Pass the cell value to the function and itt will return plain text
    The HTML text is in cell A2 then this is the formaula for let's say B2:

    Syntax: =HTML2Txt(Range("A2")
     

    Attached Files:

  6. JNTM

    JNTM Thread Starter

    Joined:
    May 2, 2010
    Messages:
    11
    Many thanks, Keebellah! Looks most interesting - I'll have a play. I'm most impressed by the speed of response!
     
  7. JNTM

    JNTM Thread Starter

    Joined:
    May 2, 2010
    Messages:
    11
    I've had a chance to look at your suggestion more closely. It has done a great job of clearing the tags to leave plain text (thank you!).

    However, what I was really hoping for was something with the original styling. See attached thumbnail, which was created by copying the report in the demo I sent you into Notepad, getting rid of duplicate quote marks, changing the file extension from '.txt' to '.htm' and then viewing in a browser (my attachment to this message is a screen dump of that).

    I can certainly view the fully styled reports using the Notepad/Browser route, but I'm not sure how well that will scale up to 1000 reports or more, and it loses the manipulability of a spreadsheet.

    So in an ideal world the styled copy would still be inside an Excel cell - just as in your reply, but with styled text rather than plain text.

    Any suggestions?
     

    Attached Files:

    • Demo.jpg
      Demo.jpg
      File size:
      54.6 KB
      Views:
      433
  8. Keebellah

    Keebellah Trusted Advisor

    Joined:
    Mar 27, 2008
    Messages:
    6,582
    First Name:
    Hans
    This last one I'm not so sure.
    I think the vba code could carry out something like that but then there will have be quite some programming to be done.
    You would have to go through the whole text and then based upon the formatting code <b> xxxxxxx </b> you would in this case select that part of the text en set the attribute to BOLD.
    The same with all the other codes.

    It will be a search and replace set the attributes and then remove teh <b> and the </b>
    <i> Italics </i>
    <ul> underline </ul>
    These are some of the clues and ideas but I don't know if I can put it together for you.
    I suggest Google and something like enterig HTML formatted text in cells ?
    If I find something I'll letb you know. I'll see if I can do something for the bold part and if that works you will have to add the other coding yourslef. Okay? I'll try
     
  9. slurpee55

    slurpee55

    Joined:
    Oct 20, 2004
    Messages:
    7,837
    Hans, VBA help has this - I don't know if you could use it:
    WebPreFormattedTextToColumns Property
    See AlsoApplies ToExampleSpecificsReturns or sets whether data contained within HTML <PRE> tags in the Web page is parsed into columns when you import the page into a query table. The default is True. Read/write Boolean.

    Remarks
    This property is used only when the QueryType property of the query table is xlWebQuery and the query returns a HTML document.

    Example
    This example adds a new Web query table to the first worksheet in the first workbook. Note that the example doesn&#8217;t parse into columns any data located between the HTML <PRE> tags.

    Set shFirstQtr = Workbooks(1).Worksheets(1)
    Set qtQtrResults = shFirstQtr.QueryTables _
    .Add(Connection := "URL;http://datasvr/98q1/19980331.htm", _
    Destination := shFirstQtr.Cells(1,1))
    With qtQtrResults
    .WebFormatting = xlNone
    .WebPreFormattedTextToColumns = False
    .Refresh
    End With
     
  10. Keebellah

    Keebellah Trusted Advisor

    Joined:
    Mar 27, 2008
    Messages:
    6,582
    First Name:
    Hans
    Hi Slurpee,
    I must confess that I didn't take the time for that.
    I use the function I posted because I have a helpdesk support programma that exports RTF coded text and that needs to 'undone' of all the formatting, so that's why I used it in this case.
    It's true that HTML is more structured than RTF.
    I'll see if I can do something with it.

    Thanks, I'll let you know too.(y)
     
  11. Keebellah

    Keebellah Trusted Advisor

    Joined:
    Mar 27, 2008
    Messages:
    6,582
    First Name:
    Hans
    Since I don't have the source documents I think you should test what Sulpee has suggested and see it it does what you expect.
    Reading the code I expect it does.
    Let us know.
     
  12. JNTM

    JNTM Thread Starter

    Joined:
    May 2, 2010
    Messages:
    11
    Hi, Keebellah and Slurpee55

    Many thanks for your obviously erudite comments on this. I'm afraid I'm well out of my depth here, but I'll see if one of my more tech literate colleagues can interpret it for me!
     
  13. Keebellah

    Keebellah Trusted Advisor

    Joined:
    Mar 27, 2008
    Messages:
    6,582
    First Name:
    Hans
    if you attach one of the documents I'll give it a swing, no promises though.
    Okay?
     
  14. JNTM

    JNTM Thread Starter

    Joined:
    May 2, 2010
    Messages:
    11
    Thank you! Your offer to have a go is much appreciated!

    I attach a version of my earlier demo now with five rows rather than just one - the real thing could have 1000 rows or more, but I hope that 5 gives you enough to work with.

    The five rows are essentially duplicates, but with small changes to each cell so that they are uniquely identifiable.

    I've also added a graphic (with apologies to Wikipedia, but it seemed a safe and neutral source to link to).

    Many thanks! JNTM
     

    Attached Files:

  15. Keebellah

    Keebellah Trusted Advisor

    Joined:
    Mar 27, 2008
    Messages:
    6,582
    First Name:
    Hans
    Well, It's a little out of my league but I'm going to give it some thought.
    You want formatted text and the image? You want to see it in the cell?
    Like I said, itá a challenge and i'll take it step by step.
    Like I said in one of the other replies in another post, 'miracles take a little longer'.
     
  16. Sponsor

As Seen On
As Seen On...

Welcome to Tech Support Guy!

Are you looking for the solution to your computer problem? Join our site today to ask your question. This site is completely free -- paid for by advertisers and donations.

If you're not already familiar with forums, watch our Welcome Guide to get started.

Join over 733,556 other people just like you!

Thread Status:
Not open for further replies.

Short URL to this thread: https://techguy.org/920679

  1. This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
    By continuing to use this site, you are consenting to our use of cookies.
    Dismiss Notice