1. Computer problem? Tech Support Guy is completely free -- paid for by advertisers and donations. Click here to join today! If you're new to Tech Support Guy, we highly recommend that you visit our Guide for New Members.

Solved: split a 100000 line csv into 5000 line csv files with DOS batch

Discussion in 'Software Development' started by bphilp, Oct 25, 2011.

Thread Status:
Not open for further replies.
Advertisement
  1. bphilp

    bphilp Thread Starter

    Joined:
    Oct 25, 2011
    Messages:
    4
    I have a large .csv file I wish to break into smaller (5000 line) files such that

    sourceFile.csv (100000 lines)

    becomes

    outputFile1.csv (lines 1-5000)
    outputFile2.csv (lines 5001-10001)
    etc till end of sourceFile.csv providing me 20 outputFiles

    Any ideas?
     
  2. Ent

    Ent Trusted Advisor

    Joined:
    Apr 11, 2009
    Messages:
    5,467
    First Name:
    Josiah
    Is the only thing that needs counting the number of line breaks? i.e. can the actual commas and contents in the thing be ignored?
    Do you mean 5001-10001, or 5001-10000?
    Does it have to be batch, or would another language be possible?

    Also, how many of these files do you have? If you only have a few it would probably be faster to do it by hand.
     
  3. bphilp

    bphilp Thread Starter

    Joined:
    Oct 25, 2011
    Messages:
    4
    yes ignore commas and just spit the file up into smaller files

    file 1 should be rows 1-5001 yielding 5000 rows...iterated though till end of the big file... may be 100000 rows or more

    needs to be a batch file

    needs to be programmed as part of an automation workflow moving data between systems

    thanks!
     
  4. Ent

    Ent Trusted Advisor

    Joined:
    Apr 11, 2009
    Messages:
    5,467
    First Name:
    Josiah
    I have made a batch program that does what you ask: it splits the larger file into smaller ones each with a designated number of lines. The last small file may be shorter if the number of lines in the big file isn't a multiple of the number you're splitting into. You will need to edit some values to fit your situation; follow the instructions in the REMarks. Then simply place the batch file in the same folder as the file that needs splitting and double-click.

    I must warn you that I expect the process to take a very long time and you may be tempted to believe that it's hung. This is partly inevitable due to the simple size of your file that needs splitting. It is worsened considerably by the fact that batch is particularly slow even among interpreted scripting languages. My own tests were conducted on a small file of only 15 or 16 lines split into 5s, and I don't know quite how it will scale.

    Code:
    @echo off
    setlocal ENABLEDELAYEDEXPANSION
    REM Edit this value to change the name of the file that needs splitting. Include the extension.
    SET BFN=BigFile.csv
    REM Edit this value to change the number of lines per file.
    SET LPF=5000
    REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list.
    SET SFN=SplitFile
    
    REM Do not change beyond this line.
    
    SET SFX=%BFN:~-3%
    
    SET /A LineNum=0
    SET /A FileNum=1
    
    For /F "delims==" %%l in (%BFN%) Do (
    SET /A LineNum+=1
    
    echo %%l >> %SFN%!FileNum!.%SFX%
    
    if !LineNum! EQU !LPF! (
    SET /A LineNum=0
    SET /A FileNum+=1
    )
    
    )
    endlocal
    Pause
    
     
  5. bphilp

    bphilp Thread Starter

    Joined:
    Oct 25, 2011
    Messages:
    4
    Hey this is really great... almost there... now it just needs to iterate through the whole file creating SplitFile1...n ... as it is now it just creates 1 SplitFile1.csv.
     
  6. Squashman

    Squashman Trusted Advisor

    Joined:
    Apr 4, 2003
    Messages:
    19,783
    Need to fix that delims option in the FOR loop. One too many Equal signs.
     
  7. bphilp

    bphilp Thread Starter

    Joined:
    Oct 25, 2011
    Messages:
    4
    You guys are awesome.. it's working perfect thanks kindly!
     
  8. Sponsor

As Seen On
As Seen On...

Welcome to Tech Support Guy!

Are you looking for the solution to your computer problem? Join our site today to ask your question. This site is completely free -- paid for by advertisers and donations.

If you're not already familiar with forums, watch our Welcome Guide to get started.

Join over 733,556 other people just like you!

Thread Status:
Not open for further replies.

Short URL to this thread: https://techguy.org/1023949

  1. This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
    By continuing to use this site, you are consenting to our use of cookies.
    Dismiss Notice