Tech Support Guy Forums > Operating Systems > DOS/PDA/Other >
Append to end of line


devil_himself's Avatar
Distinguished Member with 4,779 posts.
 
Join Date: Apr 2007
Location: India
Experience: Advanced
04-May-2008, 02:51 AM #16
Now This Script Doesn't Care About The File Type

Code:
@echo off
setlocal enabledelayedexpansion
set tp=c:\tp.txt
if exist c:\tmpfile.txt del /q c:\tmpfile.txt
(for /f "tokens=*" %%a in ('dir /b /a-d') do (
   if not %%a==%~nx0 (
   (for /f "delims=" %%i in ('cmd /c for /f "delims=" %%j in (%%a^) do echo %%j ^^^& exit') do set f=%%i)
     (for /f "skip=1 usebackq delims=" %%b in ("%%~dpnxa") do (
     echo %%b:"%%~nxa" >> "%tp%"
    )
  ))
))

echo %f%:"Filename">c:\tmpfile.txt
type "%tp%">>c:\tmpfile.txt
del /q "%tp%"
file1.txt
Code:
"Name":"Address":"City":"State":"ZipCode"
"John Doe":"503 Grand Ave":"San Jose":"CA":"93456"
file2.txt
Code:
"Name":"Address":"City":"State":"ZipCode"
"Jim Andersen":"512 Grand Ave":"San Diego":"WA":"93456"
output.txt
Code:
"Name":"Address":"City":"State":"ZipCode" :"Filename"
"John Doe":"503 Grand Ave":"San Jose":"CA":"93456":"file1.txt" 
"Jim Andersen":"512 Grand Ave":"San Diego":"WA":"93456":"file2.txt"
devil_himself's Avatar
Distinguished Member with 4,779 posts.
 
Join Date: Apr 2007
Location: India
Experience: Advanced
04-May-2008, 03:38 AM #17
if you want the output to be like this

Code:
"Name"         :"Address"      :"City"         :"State"        :"ZipCode"      :"Filename"      
"John Doe"     :"503 Grand Ave":"San Jose"     :"CA"           :"93456"        :"file1.txt"     
"Jim Andersen" :"512 Grand Ave":"San Diego"    :"WA"           :"93456"        :"file2.txt"
then this should do it
Code:
@echo off & setlocal enableextensions enabledelayedexpansion
  for /f "tokens=1-6 delims=:" %%a in ('type Myfile.txt') do (
   set l1=%%a               .
   set l2=%%b               .
   set l3=%%c               .
   set l4=%%d               .
   set l5=%%e               .
   set l6=%%f               .
   echo !l1:~0,15!:!l2:~0,15!:!l3:~0,15!:!l4:~0,15!:!l5:~0,15!:!l6:~0,15! >>newfile.txt
   )
  endlocal & goto :EOF
Squashman's Avatar
Distinguished Member with 12,231 posts.
 
Join Date: Apr 2003
Location: 1265 Lombardi Ave
04-May-2008, 12:00 PM #18
Thanks guys. I will give it a try when I get back to work on Monday. I don't want any spaces in between the delimiters but that is pretty cool. I think I might have a use for that on another order I coordinate. How does it know to expand it out to the longest variable of each field?
__________________
I hate asking the same question twice!
How to ask questions the smart way!
Microsoft MVP - Windows Shell/User
devil_himself's Avatar
Distinguished Member with 4,779 posts.
 
Join Date: Apr 2007
Location: India
Experience: Advanced
04-May-2008, 01:06 PM #19
How does it know to expand it out to the longest variable of each field?

>>> 15 characters
space after the variable is what makes it work

Code:
set l1=%%a               .
i think we can also expand it by finding the length of the longest record !
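A minimal sketch of the padding trick described above, assuming delayed expansion is enabled as in the script. The names sample and padded are placeholders, not part of the original batch file. It appends spaces to a value and then keeps only the first 15 characters, so short values come out padded.

```batch
@echo off
setlocal enabledelayedexpansion
rem "sample" and "padded" are illustrative names, not from the scripts above.
set sample="CA"
rem Append 15 spaces so the value is at least 15 characters long.
set "padded=!sample!               "
rem Keep only the first 15 characters: the value followed by its padding.
echo [!padded:~0,15!]
endlocal
```

Because the width is fixed at 15, a field longer than that is cut off. Sizing each column to the longest record would need an extra pass over the file first.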
Squashman's Avatar
Distinguished Member with 12,231 posts.
 
Join Date: Apr 2003
Location: 1265 Lombardi Ave
05-May-2008, 12:31 PM #20
I gave Devil's a test at home. I noticed in your example above that it put a Space in the Header record after Zipcode. I can't have any spaces in between the delimiters.
"Name":"Address":"City":"State":"ZipCode" :"Filename"
"Name":"Address":"City":"State":"ZipCode":"Filename"

Also need the output written back to the directory that the input files are located. These files need to stay on the Network file Server.
Squashman's Avatar
Distinguished Member with 12,231 posts.
 
Join Date: Apr 2003
Location: 1265 Lombardi Ave
05-May-2008, 12:42 PM #21
Quote:
Originally Posted by TheOutcaste View Post
My script doesn't care about the extension, devil's looks for *.txt -- changing that to *.chr in the For statement would fix that. Don't know how his would work with quoted data, or adding a comma, but shouldn't take much of a change.

Tweaked mine to work with quoted data strings and using the colon, and this should work. Be sure to change the output file name (Combined.txt in the Set _f{1} line) to use the extension you want:

Code:
@Echo off
::Set Output file name here
Set _f{1}=Combined.txt
If EXIST "%_f{1}%" Del "%_f{1}%"
::Gets first filename in alphabetical order, excluding the batch file
For /F "tokens=*" %%A In ('dir /b /a-d /o:-n ^|Find /I /V "%~nx0"') Do Set _t0=%%A
::Read Header from first line in first file
For /F "usebackq tokens=*" %%A In (`Find /V /N "" "%_t0%" ^|Findstr /B /C:"[1]"`) Do Set _t1=%%A
::Output Header to temp file
>%temp%\_f{0} Echo.%_t1:~3%:"Filename"
::Read lines from each file excluding the batch file and excluding the header line
::Output to temp file adding :"filename" to end of line
For /F "tokens=*" %%A In ('dir /b /a-d /o:n ^|Find /I /V "%~nx0"') Do (
For /F "usebackq skip=2 tokens=1* delims=]" %%B In (`Find /V /N "" "%%A" ^|Findstr /I /V /B /C:"[1"`) Do @Echo %%C:"%%A">>%temp%\_f{0}
)
Move %temp%\_f{0} "%_f{1}%"
For /L %%A In (0,1,1) Do Set _t%%A=
HTH

Jerry
Thanks. That worked just fine. I will test it out on our live data when I get to work this afternoon. I will have to walk thru all the code to make sure I understand it all, just in case I ever need to tweak it. I am going to do a little write up on the code you guys provided, whatever I don't understand I will post back in here and maybe you can explain it a little better. It would help if I could throw a few echo and pause statements in there to see what each statement is doing.
TheOutcaste's Avatar
Senior Member with 1,537 posts.
 
Join Date: Aug 2007
Location: Oregon, USA
Experience: Intermediate
05-May-2008, 02:24 PM #22
Be glad to answer any questions. I guess I should use more descriptive temp variable and temp file names, as it would make it easier to follow, but using numbered ones makes clean-up real easy. Not a lot of them used in this script though.

Would be nice if the For statement would let you use something other than just a letter.

For the main output loop, instead of %%A, %%B, %%C, using StrFileName, StrRecordInput, StrRecordOutput would make it easier to follow.

Also just noticed I left an @echo in that line, the @ is not needed and can be removed. I tend to use @Echo a lot in testing to suppress the Echo command line and just leave the output:
Code:
For /F "tokens=*" %%A In ('dir /b /a-d /o:n ^|Find /I /V "%~nx0"') Do (
For /F "usebackq skip=2 tokens=1* delims=]" %%B In (`Find /V /N "" "%%A" ^|Findstr /I /V /B /C:"[1"`) Do @Echo %%C:"%%A">>%temp%\_f{0}
)
Jerry
__________________
Of course I know all the answers ; I just don't always match the answers to the right questions

Warning -- Windows spoken here. (Rated R for Strong Language and Violence -- When your Windows PC flies through a window, that's violent, right?)
TheOutcaste's Avatar
Senior Member with 1,537 posts.
 
Join Date: Aug 2007
Location: Oregon, USA
Experience: Intermediate
05-May-2008, 02:46 PM #23
Quote:
Originally Posted by Squashman View Post
I gave Devil's a test at home. I noticed in your example above that it put a Space in the Header record after Zipcode. I can't have any spaces in between the delimiters.
"Name":"Address":"City":"State":"ZipCode" :"Filename"
"Name":"Address":"City":"State":"ZipCode":"Filename"

Also need the output written back to the directory that the input files are located. These files need to stay on the Network file Server.
Hope Devil_himself doesn't mind me jumping in, but to remove the space, you need to remove the space between %%j and ^^^ in this line:
Code:
(for /f "delims=" %%i in ('cmd /c for /f "delims=" %%j in (%%a^) do echo %%j ^^^& exit') do set f=%%i)
To keep the files in the same directory, remove c:\ from in front of the tp.txt and tmpfile.txt filenames.

If your file names have spaces (file 1.chr instead of file1.chr) you will get the error "The system cannot find the file file." on each filename with a space. If ALL the files have spaces in the names, the header line won't get read, but if at least one filename has no spaces, it will read the header line from that file.
That can be fixed by adding usebackq and putting quotes around %%a, as in this line:
Code:
(for /f "delims=" %%i in ('cmd /c for /f "usebackq delims=" %%j in ("%%a"^) do echo %%j ^^^& exit') do set f=%%i)
Jerry

Edit: Just found out that since Devil's file doesn't delete tp.txt when it starts, if you remove the c:\ from the set tp=c:\tp.txt line and run this file twice, it will never end, as it will read the tp.txt file from the previous run. As that is also the output file, it may never end. So leave that line alone.
off to delete a 2,000,000+ line file....

Last edited by TheOutcaste : 06-May-2008 06:57 AM.
Squashman's Avatar
Distinguished Member with 12,231 posts.
 
Join Date: Apr 2003
Location: 1265 Lombardi Ave
05-May-2008, 06:53 PM #24
Where could I put in an echo statement to see what file it is working on at the moment? A lot of the files I work with are really big. Sometimes millions of records. Was hoping I could see the progression of files so I know how far it is along.
Squashman's Avatar
Distinguished Member with 12,231 posts.
 
Join Date: Apr 2003
Location: 1265 Lombardi Ave
05-May-2008, 09:56 PM #25
Houston we have a problem.
I ran Outcaste's batch file on some data I have here at work and it took an eternity to run. Roughly about 3 hours for it to run. It then didn't output all the records from the input files. I should have had roughly 572,458 lines but I only ended up with 409,667. Not sure how I can troubleshoot this.

The other weird thing is that the software I use to view large files is having a heck of a time handling the output file. It takes forever to open it and this software is designed to open large files. It chunks them into smaller sections and shows you one chunk at a time. It was taking about 15 seconds to go from chunk to chunk when it should only take about 2.

I ran the data thru my script with the unix utilities and it doesn't seem to have any problems handling the output file from that and I also got all the output records.

Outcaste what can we do to debug your batch file? I unfortunately can't send you our customer data to test your batch file with so we are going to have to do it all on my end. Still hoping I can use your batch file or Devil's to do this.

I am going to test Devil's batch file next. Will let you know how that one comes out.
Squashman's Avatar
Distinguished Member with 12,231 posts.
 
Join Date: Apr 2003
Location: 1265 Lombardi Ave
05-May-2008, 11:19 PM #26
Well here are some more results. Devil's batch file took about 32 minutes to run thru those 500,000 lines.

Then I ran my script. I put time stamps into a log file when each batch file started and stopped.

Devil's
Mon 05/05/2008 21:07:53.31
Mon 05/05/2008 21:39:11.28
Squashman's
Mon 05/05/2008 21:51:33.20
Mon 05/05/2008 21:51:43.56

I really can't explain why mine only takes 10 seconds. It is beyond my comprehension.

I ran it with the data on the Network drive vs my hard drive and it took about 4 minutes.
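One way to capture start and stop times like those above is to echo %date% and %time% into a log file around the call. This is only a sketch: timing.log and combine.bat are hypothetical names, and the exact date format depends on the machine's regional settings.

```batch
@echo off
rem timing.log and combine.bat are hypothetical names used for illustration.
rem Record the start time, run the script under test, then record the stop time.
>>timing.log echo %date% %time%
call combine.bat
>>timing.log echo %date% %time%
```

Each run appends two lines to the log, so the elapsed time is the difference between a start line and the stop line after it.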
devil_himself's Avatar
Distinguished Member with 4,779 posts.
 
Join Date: Apr 2007
Location: India
Experience: Advanced
05-May-2008, 11:24 PM #27
hmm .. let me see if i can tweak it .. i think the nested for loops are the problem
devil_himself's Avatar
Distinguished Member with 4,779 posts.
 
Join Date: Apr 2007
Location: India
Experience: Advanced
05-May-2008, 11:26 PM #28
Quote:
Originally Posted by Squashman View Post
Where could I put in an echo statement to see what file it is working on at the moment? A lot of the files I work with are really big. Sometimes millions of records. Was hoping I could see the progression of files so I know how far it is along.

Run The Script With "ECHO ON" << First Line

It Also Helps To Understand How The Batch Works ... But The Command Shell Only Displays A Limited Amount of Text
devil_himself's Avatar
Distinguished Member with 4,779 posts.
 
Join Date: Apr 2007
Location: India
Experience: Advanced
05-May-2008, 11:32 PM #29
Can You Test This Script . At This Moment it Will Not Output the First Line ...

Code:
@echo off
setlocal enabledelayedexpansion
for /f "tokens=*" %%a in ('dir /b /a-d *.chr') do (
        for /f "skip=1 usebackq delims=" %%b in ("%%~dpnxa") do (
     echo %%b:"%%~nxa" >> Output.txt
  )
)
How Large Are The Text Files ?
Your Batch Uses SED - Stream Editor .. which is made for text manipulation .. i think that's why it takes less time
TheOutcaste's Avatar
Senior Member with 1,537 posts.
 
Join Date: Aug 2007
Location: Oregon, USA
Experience: Intermediate
06-May-2008, 05:41 AM #30
Quote:
Originally Posted by Squashman View Post
Where could I put in an echo statement to see what file it is working on at the moment? A lot of the files I work with are really big. Sometimes millions of records. Was hoping I could see the progression of files so I know how far it is along.
You can add an Echo command right after the DO portion of the statement in the for loop if you want to see the For variable as it progresses. Just add @Echo %%X && after the DO part (be sure to leave a space after the DO).
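As a sketch of that suggestion, the loop below prints each file name to the console before its records are appended. Only the inner Echo is redirected, so the progress line stays on screen. The output name combined.tmp is a placeholder, and the *.chr filter is taken from the examples in this thread.

```batch
@echo off
rem Sketch only: combined.tmp is a hypothetical output name.
for /f "tokens=*" %%A in ('dir /b /a-d /o:n "*.chr"') do (
  rem This line goes to the console because it is not redirected.
  echo Processing %%A
  rem Skip the header line and append each record with the file name.
  for /f "usebackq skip=1 delims=" %%B in ("%%A") do echo.%%B:"%%A">>"%temp%\combined.tmp"
)
```

This shows progress per file, not per record, so a single very large file still produces only one line of output while it is being read.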
Quote:
Originally Posted by Squashman View Post
Houston we have a problem.
I ran Outcaste's batch file on some data I have here at work and it took an eternity to run. Roughly about 3 hours for it to run. It then didn't output all the records from the input files. I should have had roughly 572,458 lines but I only ended up with 409,667. Not sure how I can troubleshoot this.

The other weird thing is that the software I use to view large files is having a heck of a time handling the output file. It takes forever to open it and this software is designed to open large files. It chunks them into smaller sections and shows you one chunk at a time. It was taking about 15 seconds to go from chunk to chunk when it should only take about 2.

I ran the data thru my script with the unix utilities and it doesn't seem to have any problems handling the output file from that and I also got all the output records.

Outcaste what can we do to debug your batch file? I unfortunately can't send you our customer data to test your batch file with so we are going to have to do it all on my end. Still hoping I can use your batch file or Devil's to do this.

I am going to test Devil's batch file next. Will let you know how that one comes out.
I was going to say if the files are large, or there are a large number of them, Devil's code will be much more efficient. Mine is so slow because it basically reads each file twice: once to generate a numbered list of all records, then a second time to remove the header line. I started that way to cover a more generic situation where the header line may be repeated every XXX number of records, such as a file ready to print with the header on each page. I just modified that to exclude the line that is numbered "one" -- should have just skipped the first line and not used the Find and findstr statements.
Plus I read the entire first file just to get the header line (once with find, then with findstr), then read it again to actually process the file, so it was getting read 4 times.
I can change those, but then the only difference between devil's file and mine is our choice of variable names.

The missing lines are a typo in my file -- There is a missing ] in the 3rd line from the end:
Code:
For /F "usebackq skip=2 tokens=1* delims=]" %%B In (`Find /V /N "" "%%A" ^|Findstr /I /V /B /C:"[1"`) Do @Echo %%C:"%%A">>%temp%\_f{0}
should be 
For /F "usebackq skip=2 tokens=1* delims=]" %%B In (`Find /V /N "" "%%A" ^|Findstr /I /V /B /C:"[1]"`) Do @Echo %%C:"%%A">>%temp%\_f{0}
The editor makes it look like there is a space between the ] and the " because I colored it red -- there is no space.

I was using find to number lines (adds [number] to the start of each line), then findstr to exclude line 1; without the last bracket, it excludes every line that starts with 1, so lines 10-19, 100-199, 1000-1999, etc. were dropped, which would account for 111,110 lines out of the 162,791 missing lines. Not sure about the rest of the missing lines.
I'd specifically created files with 20 lines to check that, but never noticed the change in the output file when the ] got dropped. You could add that in and see if you get all the lines, but it will take even longer.
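The effect of the missing bracket can be reproduced on a small scale. The sketch below builds a hypothetical 12-line scratch file (lines.tmp) and runs both patterns against the numbered output of find, so the two results can be compared side by side.

```batch
@echo off
rem Sketch only: lines.tmp is a hypothetical scratch file with 12 numbered rows.
(for /l %%N in (1,1,12) do echo row %%N)>lines.tmp
echo Excluding lines that start with [1 also drops lines 10 to 12:
find /v /n "" lines.tmp | findstr /b /v /c:"[1"
echo Excluding lines that start with [1] drops only the first line:
find /v /n "" lines.tmp | findstr /b /v /c:"[1]"
del lines.tmp
```

The first listing is missing rows 1, 10, 11 and 12, while the second is missing only row 1. Both listings also include the blank line and file name banner that find prints before the numbered lines.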
This will be much more efficient, which is the same way devil's file processes the records:
Code:
For /F "usebackq skip=1 delims=" %%B In ("%%A") Do Echo.%%B:"%%A">>%temp%\_f{0}
Quote:
Originally Posted by Squashman View Post
Well here are some more results. Devil's batch file took about 32 minutes to run thru those 500,000 lines.

Then I ran my script. I put time stamps into a log file when each batch file started and stopped.

Devil's
Mon 05/05/2008 21:07:53.31
Mon 05/05/2008 21:39:11.28
Squashman's
Mon 05/05/2008 21:51:33.20
Mon 05/05/2008 21:51:43.56

I really can't explain why mine only takes 10 seconds. It is beyond my comprehension.

I ran it with the data on the Network drive vs my hard drive and it took about 4 minutes.
A Command Prompt (aka DOS) is going to be much slower. It was never really meant to deal with the contents of files, just the files themselves. The batch file commands have to be interpreted, and the external commands like find have to be called and passed parameters, whereas SED will have machine language routines to do its manipulation all internally, which can easily be hundreds of times faster as you can see.

I'm also just guessing that your software may be taking so long with the output file because DOS uses Carriage Return/LineFeed (CR/LF) to end lines. Most *nix systems just use LF. Find and For will read in lines terminated with just LF, but when the filename is added to each record, and the line written to the combined output file, each line will end with CR/LF instead of just the LF. If your software has to convert the CR/LF to just LF before displaying each chunk, it will slow it considerably.

If you can hard code the header line in the batch file, devil's code above or the one I show below will be about the fastest you can get in a batch script.
Devil's method of reading the header only takes about 0.55 to 0.60 seconds for a 700,000 line (avg 88 char/line) file on my system. That shouldn't change on a per file basis, so hard coding the header would only shave about one minute off the time to process about 100 large files.
A visual basic script might be a bit faster, but I'm not at all proficient writing those.
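Another way to read just the header, not used in either script above, is set /p with input redirection, which reads only the first line of a file. A minimal sketch, assuming file1.txt from the earlier example, a header shorter than about 1,000 characters, and no special characters such as & outside the quotes. The variable name hdr is hypothetical.

```batch
@echo off
setlocal
rem file1.txt is the sample input from earlier in the thread; hdr is a hypothetical name.
rem Clear the variable first so a missing or empty file does not leave an old value behind.
set "hdr="
rem set /p reads a single line from the redirected input, here the header row.
set /p hdr=<file1.txt
rem Append the extra column name to the header.
echo %hdr%:"Filename"
endlocal
```

Since only one line is read, the cost should not grow with the size of the file, but this has not been timed against the method discussed above.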

Pick ONE of the two outer For lines, depending on whether you want to process only the one extension or all files except the batch file (they were colored red and blue in the original post). The .CHR variant is commented out here; to use it, remove its leading REM and comment out the other For line instead.

Code:
@Echo off
::Set Output file name here
Set _f{1}=Combined.txt
If EXIST "%_f{1}%" Del "%_f{1}%"
::Output Header to temp file
>%temp%\_f{0} Echo."Name":"Street Address":"City":"St":"Zip":"Filename"
::Read lines from each file excluding the batch file and excluding the header line
::Output to temp file adding :"filename" to end of line

::This line processes only files with a .CHR extension
REM For /F "tokens=*" %%A In ('dir /b /a-d /o:n "*.chr"') Do (
::This line processes every file in the folder except this batch file
For /F "tokens=*" %%A In ('dir /b /a-d /o:n ^|Find /I /V "%~nx0"') Do (
  For /F "usebackq skip=1 delims=" %%B In ("%%A") Do Echo.%%B:"%%A">>%temp%\_f{0}
)
Move %temp%\_f{0} "%_f{1}%"
For /L %%A In (0,1,1) Do Set _t%%A=
I'm running a test with this using the "more efficient" line shown above with a sample file with 600,000 lines to see how long it takes.
Will then try this script to see the difference by hard coding the header.
Then will try devil's file
Running on a 3.0 GHz Pentium D

Jerry