Reading strings from a Unicode file using getline in C++

Status
This thread has been Locked and is not open to further replies. Please start a New Thread if you're having a similar issue. View our Welcome Guide to learn how to use this site.

dev_tyagi

Thread Starter
Joined
Jul 2, 2005
Messages
2
Hi everyone,
I am using C++ to read a Unicode-formatted file. I have to read strings up to a specified delimiter, but getline is not working properly with a Unicode file.
Please help me out.
The code I'm writing to read the strings is:
fstream fp,fp1;
fp.open("filename",ios::in);
if (!fp.is_open())
{
cout << "couldn't open file" << endl;
exit(3);
}


else
while(!fp.eof())
{
fp.getline(buffer,2500,'¶');
fp1.open("c:\\notfounddata2.txt",ios::out);
fp1.write(buffer,2500);
cout<<buffer;
}
fp.close();
fp1.close();
But the problem is that it's not reading the file properly.
What should I do? I'm new to C++.

with regards
dev
 
Joined
Jun 15, 2005
Messages
431
I'm a C (not C++) programmer, but doesn't getline() read only until it encounters a newline character (or CR/LF for DOS)?

Also, I see a problem in that you are opening your output file inside the loop and closing it outside the loop. Open it once and close it once.
 
Joined
Apr 30, 2001
Messages
2,636
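^^ Only by default. Both the member and the non-member getline() stop at a newline unless you pass a different delimiter.

For member getline(), you specify the maximum number of characters to read and, optionally, the delimiter.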

@dev_tyagi
If it were a regular (single-byte) file, you could just use 182 or 0xB6 for the getline delimiter, but I'm not sure about Unicode.

For Visual C++, I think you can use wfstream, but I'm not sure.
 

dev_tyagi

Thread Starter
Joined
Jul 2, 2005
Messages
2
Hi,
Thanks for your reply.
That was just sample code, so I wrote it like that by mistake. My actual code is written like this, but it's still not reading the file:
char buffer[300][2500];
fstream fp,fp1;
int i=0;
fp.open("c:\\notfounddata1.txt",ios::in|std::ios::binary);
fp1.open("c:\\notfounddata2.txt",ios::out|std::ios::binary);
if (!fp.is_open())
{
std::cout<<"couldn't open file" << endl;
exit(3);
}

else
while(!fp.eof())
{
fp.getline(buffer,2500);
fp1<<buffer<<endl;
std::cout<<buffer<<endl;
i++;
}


fp.close();
fp1.close();
 
Joined
Apr 30, 2001
Messages
2,636
Still not sure about the unicode part and getline, but you can ask here and you'll get your answer.

A couple of tips though.

instead of

fstream instance1, instance2;
instance1.open("file.txt", ios::in | ios::binary);
instance2.open("file.txt", ios::out | ios::binary);


you can do

ifstream instance1("file.txt", ios::binary);
ofstream instance2("file.txt", ios::binary);

Also, you shouldn't use exit(). Destructors for local objects won't be called. In main(), when you want to exit with an error, use return so the destructors get called.

Here's just a simple text file copier as an example.

Code:
#include <iostream>
#include <string>
#include <fstream>

using namespace std;

int main() {
    ifstream in("in.txt");
    if (!in) {
        cout << "\n" << "Error reading in.txt" << endl;
        return 1;
    }
    ofstream out("out.txt");
    if (!out) {
        cout << "\n" << "Error writing to out.txt" << endl;
        return 1;
    }
    for (string s; getline(in,s) ; ) {
        out << s;
        if ( !in.eof() ) {
            out << "\n";
        }
    }
}
In this case, I don't have to .close() the streams as they will be destructed when main() reaches the end of its scope. Now for example, if the stream is open and I need to delete the file, I'd need to .close() it, but instead of using .close(), it's usually better to put the stream operations in a function and let the stream be closed when the function reaches the end of its scope.
 
Joined
Apr 30, 2001
Messages
2,636
I did figure this out though.


Code:
#include <iostream>
#include <fstream>

using namespace std;

int main() {
    // Write ¶ (U+00B6) to the file as UTF-16, little-endian
    ofstream out("out.txt", ios::binary);
    if (!out) {
        return 1;
    }
    out << static_cast<char>(0xFF)   // byte order mark: FF FE
        << static_cast<char>(0xFE)
        << static_cast<char>(0xB6)   // low byte of U+00B6
        << static_cast<char>(0x00);  // high byte of U+00B6
}

When you save as Unicode with EditPlus, for example, each character is 16 bits or 32 bits long depending on the encoding (UTF-16 or UTF-32). So if "&" is the first character in the file, it is not a single char when you read the file: after the byte order mark, the next two bytes (UTF-16) or four bytes (UTF-32) together make up the bits of "&".

Are you talking about utf-8, utf-16 or utf-32 specifically?

If utf-8, you'd write ¶ to a file like this.

Code:
out << static_cast<char>(0xC2);  // first byte of ¶ in UTF-8
out << static_cast<char>(0xB6);  // second byte of ¶ in UTF-8
That'll give you some hints, but I would ask at that link I posted.

(You don't really need to use binary mode though)
 