1. Computer problem? Tech Support Guy is completely free -- paid for by advertisers and donations. Click here to join today! If you're new to Tech Support Guy, we highly recommend that you visit our Guide for New Members.

Solved: csv file splitting and unique names

Discussion in 'Software Development' started by amstro, Feb 7, 2013.

Thread Status:
Not open for further replies.
Advertisement
  1. amstro

    amstro Thread Starter

    Joined:
    Feb 7, 2013
    Messages:
    10
    Hi friends, I need help with editing a csv file. I have multiple rows in a file and I want each row to be transfered to a separate csv file. For ex, if my csv file is as below
    crj,ckjck,dbnch,dnj,ch
    152,15,15,487
    abc,def,ghi,jkl,abc,def,ghi,jkl
    ........ and so on

    I want n files created where n is the number of lines in my csv files. Each csv file will have exactly one line from the source csv file. I found a code elsewhere in this forum (http://forums.techguy.org/software-development/1023949-solved-split-100000-line-csv.html). It does exactly what I need with one exception. It generates 5000 lines into one csv file, while I need a single line. This is an easy fix. I change the variable 5000 to 1 and I got the results. Worked like magic. However, I have another issue; the file name. It uses a standard file name and increments it by 1 and so all file names have the same file name with a number at the end that keeps incrementing. I cannot use this naming convention. I want each file to use the first word from that line as the file name. So in my case. The first file will have name crj.csv, the 2nd will have name 152.csv, the third abc.csv and so on.

    I am not a programming guy and I need help with this task. I am very grateful for any help in advance. Thank you so much.
     
  2. Rollin_Again

    Rollin_Again

    Joined:
    Sep 4, 2003
    Messages:
    4,912
    Post your existing macro code please.

    Rollin
     
  3. amstro

    amstro Thread Starter

    Joined:
    Feb 7, 2013
    Messages:
    10
    Code:
    @echo off setlocal ENABLEDELAYEDEXPANSION REM Edit this value to change the name of the file that needs splitting. Include the extension. SET BFN=BigFile.csv REM Edit this value to change the number of lines per file. SET LPF=5000 REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list. SET SFN=SplitFile REM Do not change beyond this line. SET SFX=%BFN:~-3% SET /A LineNum=0 SET /A FileNum=1 For /F "delims==" %%l in (%BFN%) Do ( SET /A LineNum+=1 echo %%l >> %SFN%!FileNum!.%SFX% if !LineNum! EQU !LPF! ( SET /A LineNum=0 SET /A FileNum+=1 ) ) endlocal Pause
    Thanks again!
     
  4. Squashman

    Squashman Trusted Advisor

    Joined:
    Apr 4, 2003
    Messages:
    19,786
    How about putting some CR\LF in there.
    Code:
    @echo off 
    setlocal ENABLEDELAYEDEXPANSION 
    
    REM Edit this value to change the name of the file that needs splitting. Include the extension. 
    SET BFN=BigFile.csv 
    
    REM Edit this value to change the number of lines per file. 
    SET LPF=5000 
    
    REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list. 
    SET SFN=SplitFile 
    
    REM Do not change beyond this line. 
    SET SFX=%BFN:~-3% 
    SET /A LineNum=0 
    SET /A FileNum=1 
    For /F "delims==" %%l in (%BFN%) Do ( 
    	SET /A LineNum+=1 
    	echo %%l >> %SFN%!FileNum!.%SFX% 
    	if !LineNum! EQU !LPF! (
    		SET /A LineNum=0 
    		SET /A FileNum+=1
    	)
    )
    endlocal
    Pause
     
  5. Squashman

    Squashman Trusted Advisor

    Joined:
    Apr 4, 2003
    Messages:
    19,786
    Can you guarantee that the first delimited column will never have the same information otherwise you will have files that will overwrite each other.

    Don't even need most of this code if you are doing 1 line per output file. It is a much smaller and less complicated batch file as long as you don't have to keep track of possible file overwrites.
     
  6. foxidrive

    foxidrive Banned

    Joined:
    Oct 20, 2012
    Messages:
    793
    This should work. If your file.csv has a long filename with spaces etc then rename it for this.

    Code:
    @echo off
    for /f "delims=" %%a in (file.csv) do (
    for /f "delims=," %%b in ("%%a") do (
    > "%%b.csv" echo %%a
    )
    )
    If doesn't check for existing files, as Squashman has mentioned.
     
  7. amstro

    amstro Thread Starter

    Joined:
    Feb 7, 2013
    Messages:
    10
    Thanks for the reply Squashman. Yes, I can guarantee that the first word will never be repeated in any of the lines. I will try the code and update back the result.
     
  8. amstro

    amstro Thread Starter

    Joined:
    Feb 7, 2013
    Messages:
    10
    Thank you foxidrive. I will check and update back with the result. Thanks everyone for helping me out.
     
  9. amstro

    amstro Thread Starter

    Joined:
    Feb 7, 2013
    Messages:
    10
    Foxidrive, this worked like magic. I get exactly what I needed. I have to ask one more help. How can I make it so that the output csv file has the results in a column fashion and not in a row. For ex, Genius.csv file has the result as Genius,talent,smart. How can I edit your batch file so that the result is
    Genius
    talent
    smart
    So basically I will have a big csv file and then once I run the batch file, multiple csv files are created with the first word from each line as the file name and each single line as content of the resulting files and also the resulting files have the data in column format and not in a line

    Thanks a billion for your help
     
  10. foxidrive

    foxidrive Banned

    Joined:
    Oct 20, 2012
    Messages:
    793
    This uses a second batch file called batch2.bat (see below).
    It replaces every comma with a newline sequence

    *) save the top batch file
    *) save the second batch file and call it batch2.bat
    *) put them both in the same folder with file.csv
    *) run the first batch file.

    See how it goes.


    Code:
    @echo off
    for /f "delims=" %%a in (file.csv) do (
    for /f "delims=," %%b in ("%%a") do (
    > "a.tmp" echo %%a
    call batch2.bat
    del "a.tmp"
    ren "b.tmp" "%%b.csv"
    )
    )
    batch2.bat

    Code:
    @set @JScript=1/*
    @echo off
    setlocal
    cscript //nologo //E:JScript "%~f0"
    goto :eof
    */
    // JScript
    //
    var ForReading= 1
    var ForWriting = 2
    var fso = new ActiveXObject("Scripting.FileSystemObject");
    // paths must have doubled backslashes if used
    // var input = fso.OpenTextFile("c:\\temp\\filein.txt", ForReading)
    // var output = fso.OpenTextFile("c:\\temp\\fileout.txt", ForWriting, true)
    var input = fso.OpenTextFile("a.tmp", ForReading)
    var output = fso.OpenTextFile("b.tmp", ForWriting, true)
    var data = input.Readall()
    data = data.replace(/\x2c/gi, "\x0D\x0A")
    output.Write(data)
    input.Close()
    output.Close()
     
  11. Squashman

    Squashman Trusted Advisor

    Joined:
    Apr 4, 2003
    Messages:
    19,786
    You could in theory do this all with a pure batch sub-routine by calling the sub-routine with a call statement. The commas will work as the delimiter to the sub routine and you can reference them as %1 etc. You could use the shift statement to continue processing each argument passed to the sub routine and you can keep checking if %1 is not nul to keep processing the input.
     
  12. foxidrive

    foxidrive Banned

    Joined:
    Oct 20, 2012
    Messages:
    793
    Yes, Squashman, that's feasible. It does depend on the file content however because poison characters have to have special handling.

    The method I used is impervious to any poison characters, but
    a possible flaw in my method is that for an entry like this:
    one,two,"three pears, and four boxes",five
    it will also split after the 'three pears' where it is one double quoted entry.
     
  13. amstro

    amstro Thread Starter

    Joined:
    Feb 7, 2013
    Messages:
    10
    Foxidrive, that is awesome! This worked exactly like I wanted it to. I am soo so grateful for your help. Considering the amount of work that I would have to do without these batch files, I would have stayed in my lab straight for a whole month. Thanks a million. I do have one more task at hand. But I will give it a shot based on your batch files and report back. Thanks again for all your help.
     
  14. amstro

    amstro Thread Starter

    Joined:
    Feb 7, 2013
    Messages:
    10
    Foxidrive, I really need your help again. I have this file which is just one big long statement. Imagine a text file about 3 MB with all kinds of text and special characters and everything (it almost take 20 to 30 secs for notepad to open it). The guy never hit the enter key and so there is just one big line. I have to separate this long statement into several statements. A statement starts at the first character in the file and ends at a semicolon. The next line starts after the semicolon and ends at the next semi-colon, and then the next line starts and ends at the next semicolon and so on and on and finally the last line ends at the last semicolon. I tried using your 2 batch files but I get several different files :(
    I just need one result file which has all the content from the original file but there should be several statements as opposed to the original one statement. Also note, that it is a .txt file
    I hope you can help me out again and I promise not to bother you again (for a whole month). Thank you so much.
     
  15. foxidrive

    foxidrive Banned

    Joined:
    Oct 20, 2012
    Messages:
    793
    Try this:

    call it split.bat and launch it like this: split "filein.txt" "fileout.txt"


    Code:
    @echo off
     >_.vbs echo set regex=new regexp
    >>_.vbs echo regex.global=true
    >>_.vbs echo regEx.IgnoreCase=True            
    >>_.vbs echo regex.pattern=";"
    >>_.vbs echo wscript.stdOut.write regex.replace(wscript.stdin.readall,";"+vbCRLF)
    cscript /nologo _.vbs <"%~1" >"%~2"
    del _.vbs
     
  16. Sponsor

As Seen On
As Seen On...

Welcome to Tech Support Guy!

Are you looking for the solution to your computer problem? Join our site today to ask your question. This site is completely free -- paid for by advertisers and donations.

If you're not already familiar with forums, watch our Welcome Guide to get started.

Join over 733,556 other people just like you!

Loading...
Thread Status:
Not open for further replies.

Short URL to this thread: https://techguy.org/1088574

  1. This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
    By continuing to use this site, you are consenting to our use of cookies.
    Dismiss Notice