Solved Script To Automatically Trim Text Data Into Several Lines

Status
This thread has been Locked and is not open to further replies. Please start a New Thread if you're having a similar issue. View our Welcome Guide to learn how to use this site.

hamtaroid

Castielus
Thread Starter
Joined
Aug 26, 2017
Messages
38
Hello. I work in a computer store in Indonesia as an administrator. My job is to create invoices. My crazy supervisor wants to see trimmed label address. She said it's more beautiful to look at -,-' Problem is the address I receive is always jumbled into one long line. Separating them (based on street address, etc) takes time. During the national shopping day, where we can have like 500 invoices in a day, I don't think I can handle it.

Below is one example of the address data that I have to work with:
17082509333FJ40
Alamat Pengiriman
Agus, 62812345678
Perumahan Orang Kaya, Jl.Jalan 2 BLOK H1/1001, Katulampa, Bogor Barat, KOTA BOGOR, BOGOR Barat - KOTA, JAWA BARAT, ID, 12345

The result I want is:
17082509333FJ40
Agus
Perumahan Orang Kaya
Jl.Jalan 2
BLOK H1/1001
Katulampa
Bogor Barat
KOTA BOGOR
12345
'0812345678

Of course, I understand it is impossible for a script to wholly replace man's logic. But if the script can help me create new lines based on several conditions, it will greatly help me. The conditions are as follow..
New line is made immediately before:
1. Word "Jalan" or "Jl." or "Jl"
2. Word "Perumahan" or "Perum"
3. Word "Blok" or "Block"
4. Word "RT" or "RT***" (*=number)
5. Word "Gang" or "Gg." or "Gg"
6. Word "Kel." or "Kel" or "Kelurahan"
7. Word "Kec." or "Kec" or "Kecamatan"
8. Word "Kota"
9. 5 random number (post code)

New line is made immediately after:
10. A comma and a space ", "

(NOTE1: IF the condition described above has already had a new line, the script doesn't need to re-add new line for that condition. In my example above, the word "Perumahan" starts the new line. Therefore there is no need for additional new line for that word)

(NOTE 2: The words described above doesn't necessarily starts with a capital letter. It can starts with a small letters. For example the word "perumahan" instead of "Perumahan"


Things to erase/replace
1. Word "ID"
2. A comma and a space ", "
3. Word "Alamat Pengiriman"
4. 62 is replaced with "'0" (yes, there is an ' inside)

Please teach me how to automatically sort the address data. I prefer to use WordPad. But if it's impossible, Notepad++ or MS Word is fine. Please help, thank you very much in advance for your time and willingness to help. God bless :)
 
Joined
Jun 8, 2001
Messages
2,583
Without a sample file to work from, any work done at this point would likely be inaccurate speculation.
The new line criteria described above would be tough since 'random number' occurs 6 times in the example provided. The data appears comma delimited but OBPs question would be answered if you would upload a test file.

Once we see a test file (replace personal identifying data before posting) in as close to the original format as possible we could determine a better end of line.

For example, I'm seeing an 11 digit number (17082509333) as a possible begin new record trigger.

If you are wanting to "sort" Word / Word pad will NOT be part of the solution. You will need to use Excel. If you open your present file with Excel (and indicate its comma delimited on open) you will see a different view.
 

hamtaroid

Castielus
Thread Starter
Joined
Aug 26, 2017
Messages
38
Is this data coming from a .csv or .txt file?
The real data is from ecommerce website. I paste the address data to a text file in Wordpad and edit manually there.

Without a sample file to work from, any work done at this point would likely be inaccurate speculation.

If you are wanting to "sort" Word / Word pad will NOT be part of the solution. You will need to use Excel.
How about MS Word Macro? Using the Find-Replace method? Will it work? I know it's not perfect but it helps if the data can divide into several lines.
The problem is, I do not know how to specifically move data to a certain line.

I include 2 files. One is the source. The other is the result. I have changed the customer data, so it's safe to upload.
Please help, thank you
 

Attachments

Last edited:

OBP

Joined
Mar 8, 2005
Messages
19,896
The best program to do this in is Access, which will not only parse the Address but will also store it in a searchable table with great searching facilities for future use.
The next best is Excel which can also store the Addresses on a seperate worksheet for future use.
Word may be able to do this with VBA, but I have never programmed Word.
So my recommendation would be to do this in Excel, unless you have Access and then send the finished article to Word if that is required for actually using the data.
The whole point of this exercise is to remove practically all the human intervention in the process, so that you will not need to "work in" the software.
So, do you have Access or Excel?
 

OBP

Joined
Mar 8, 2005
Messages
19,896
Looking at the source and result data there is one thing that VBA cannot do unless there is some prior decision making and that is converting this "pondok keranjang ciputat" to
pondok keranjang
ciputat
because there is no Cue that ciputat is not part of the original address.
 

OBP

Joined
Mar 8, 2005
Messages
19,896
Similarly, what is the difference between
Jl. Rusuh raya no.66 which is on one line
and
Jl. Jago Dalam Gg. 5 No. 777 which is over 2 lines

For vba to work there needs to be a certain amount of logic to the decisions.
If that logic exists, then it needs to be predetermined and hard coded in to the code using arrays
 

OBP

Joined
Mar 8, 2005
Messages
19,896
Here is an Excel Worksheet with some VBA that does most of what you want apart from the items that I mentioned.
The VBA is intiated by pressing Ctrl + q.
It parses the data and places the results in seperate Columns.

ps this version is not working quite right due to differences in the layout of the no & No so I will look at that.

pps added new version with changes, I doubt that will cover all the variations you will come accross, but it handles most of those in the data supplied.
 

Attachments

Last edited:

hamtaroid

Castielus
Thread Starter
Joined
Aug 26, 2017
Messages
38
So, do you have Access or Excel?
Yes I use Excel. In fact, the end data I must work with is Excel actually, to print invoice and address label.

there is no Cue that ciputat is not part of the original address.
Ciputat is sub-district name. Is there anyway I can insert some data to the script? So whenever the script comes across the word Ciputat, it will make new line? Yes I know the data will be long because there are a lot of sub-districts in Indonesia. But I think that is the only way?

Similarly, what is the difference between
Jl. Rusuh raya no.66 which is on one line
and
Jl. Jago Dalam Gg. 5 No. 777 which is over 2 lines
Jl. jago Dalam has two lines because there is "Gg". In Indonesia, it is called Gang. It's like a street but very narrow one. You can say it's like a "sub-street" if you will. So, within a street, there are several gangs along it. Actually it is not a must to make "Gg" a new line.


pps added new version with changes, I doubt that will cover all the variations you will come accross, but it handles most of those in the data supplied.
Wow, it's like magic :) Thank you so much .
Yes I know it's not perfect. Is there any way for me to edit the script manually?
 

OBP

Joined
Mar 8, 2005
Messages
19,896
First question first, Items like Ciputat can be built in to the VBA code, so it would be worth while collecting any others like that together and add them all.
As to Manually editing the script, it depends on what you mean.
If you mean interacting with it while it is running that is possible.
If you mean editing the actual VBA code then of course, however be cautioned that it is quite easy to break the code if you are not famiar with VBA.
If you have specific items that you would like added I can do so.
 

hamtaroid

Castielus
Thread Starter
Joined
Aug 26, 2017
Messages
38
If you have specific items that you would like added I can do so.
Sorry to trouble you but could you please help me modify the script. It was great until I tested it with several same data. The result was jumbled although the data was the same. I saved it and attached it.

1. Line Confusion
Maybe you could make the script to accept one address data only, to avoid confusion? Because I don't know on which line I must start pasting the data. In the example file I attached, you can see I pasted the same data several times on several lines.

2. Blank data
When I pasted blank data, the previous data didn't go away even after I pressed Ctrl-Q several times. Is it possible to make the script to accept blank data and process it by deleting the previous result?

3. Commas
Is it possible to make commas as cues for new line? Of course, the other conditions are still applied, for example the word "ID" must be erased.

Thank you so much for your time and willingness to help me ^_^
You're such an angel.
 

Attachments

Last edited:

OBP

Joined
Mar 8, 2005
Messages
19,896
The code was written expecting you to paste the data with 3 lines between each address as I thought that is how it was spaced in the file you are pasting from, it can be made to accomodate varying spacing but it would be better if you kept it uniform.
Do you wish to clear all old parsed addresses?
The comma is already processed as a new line, that is what commastring does, which one is not working, I may be able to fix it.
 

hamtaroid

Castielus
Thread Starter
Joined
Aug 26, 2017
Messages
38
The code was written expecting you to paste the data with 3 lines between each address as I thought that is how it was spaced in the file you are pasting from, it can be made to accomodate varying spacing but it would be better if you kept it uniform.
Do you wish to clear all old parsed addresses?
Yes, I think it's best to clear the old parsed data. Because I will use it by clearing old data then pasting new data. I tried pasting as you suggested, after 3 lines. But the result was the same. Old data still attached itself on the result. I attached the saved result for you to analyze :)

The comma is already processed as a new line, that is what commastring does, which one is not working, I may be able to fix it.
I haven't figured that out. I think the jumbled result I mentioned earlier happened because of the unerased previously parsed data.

Please help :) Thank you, OBP.
 

Attachments

OBP

Joined
Mar 8, 2005
Messages
19,896
This version clears the data from a fairly large area, but you could of course clear the whole sheet prior to pasting the data by clicking on the to left corner of the worksheet to "select all" and then delete.
 

Attachments

hamtaroid

Castielus
Thread Starter
Joined
Aug 26, 2017
Messages
38
"select all" and then delete.
Oh I'm very sorry. I did NOT know that I could clear the whole sheet without mesing the script o_O I thought it's a no-no to erase the parsed data.

1. Is it possible to add new cues? I see that "Jl"/"Jalan" and "Gg"/"Gang" have not triggered new line.

2. Right after the comma of KOTA data, is it possible to move the data behind it on top of the parsed result of KOTA data? In example below, I want the word "CIPUTAT SELATAN" placed on top of KOTA TANGERANG SELATAN. It's the fixed line location for sub district/district.

3. I see the the data of "No" and "RT" immediately disappear on the parsed result? Any way to save it?


Example:
"Jl. Rusuh raya no.66 pondok keranjang ciputat, KOTA TANGERANG SELATAN, CIPUTAT SELATAN, BANTEN, ID, 15412"
becomes
"...
CIPUTAT SELATAN
KOTA TANGERANG SELATAN
...

"

Thank you again OBP. You have no idea how much you have helped me :)
 
Last edited:
Status
This thread has been Locked and is not open to further replies. Please start a New Thread if you're having a similar issue. View our Welcome Guide to learn how to use this site.

Users Who Are Viewing This Thread (Users: 0, Guests: 1)

As Seen On
As Seen On...

Welcome to Tech Support Guy!

Are you looking for the solution to your computer problem? Join our site today to ask your question. This site is completely free -- paid for by advertisers and donations.

If you're not already familiar with forums, watch our Welcome Guide to get started.

Join over 807,865 other people just like you!

Latest posts

Staff online

Members online

Top