Reading a file and translating the content from UTF8 to 8-bit
Hi,
I need to read a UTF8 encoded text file and translate the content to 8-bit.
Using the %File class and $ZCVT(TXT,"I","UTF8") works, but if the content is longer than the maximum string length (32000) and we cut it into
max-string chunks, we can get a <TRANSLATE> error if we cut it at the "wrong" point.
Is there a better way to do this task?
My code looks like this:
S file=##class(%File).%New(..LocalFileName)
D file.Open("R")
While 'file.AtEnd {
S Line=$ZCVT(Line,"I","UTF8")
}
D file.Close()
And an example of such an error:
USER>s str=$C(215)
USER>w $ZCVT(str,"I","UTF8")
W $ZCVT(str,"I","UTF8")
^
<TRANSLATE>
Regards,
Nael
Comments
I would recommend using a more suitable class for this: %Stream.FileCharacter, where you can set the TranslateTable property.
Set stream=##class(%Stream.FileCharacter).%New()
Set sc=stream.LinkToFile("c:\myfile.txt")
Set stream.TranslateTable = "UTF8"
While 'stream.AtEnd {
    Set line=stream.Read()
    ; Process the chunk here
}
And you don't need any conversions after that.
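A slightly fuller sketch of the same idea, with status checks and ReadLine instead of Read. The class name Demo.Utf8Reader and the method name ReadFile are made up for illustration, and the processing step is left as a placeholder:

```objectscript
Class Demo.Utf8Reader Extends %RegisteredObject
{

/// Hypothetical helper: reads a UTF-8 file line by line, already decoded.
ClassMethod ReadFile(path As %String) As %Status
{
    Set stream=##class(%Stream.FileCharacter).%New()
    Set sc=stream.LinkToFile(path)
    If $SYSTEM.Status.IsError(sc) { Quit sc }
    // Let the stream decode UTF-8 while it reads
    Set stream.TranslateTable="UTF8"
    While 'stream.AtEnd {
        // ReadLine returns at most 32000 characters per call by default
        Set line=stream.ReadLine(,.sc)
        If $SYSTEM.Status.IsError(sc) { Quit }
        // process line here (placeholder)
    }
    Quit sc
}

}
```

It could be called as Set sc=##class(Demo.Utf8Reader).ReadFile("c:\myfile.txt"). Because the stream does the decoding, the caller never sees a partial multi-byte sequence.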
Thanks Dmitry!
That does solve the problem.
Regards,
Nael
Thank you Robert,
We are aware of this parameter and we have changed it on some of our servers,
but right now we need to write code that works even without it being enabled.
Regards,
Nael
I like that!
BUT: as in the original request, I seem to miss the file.Read() to fill Line somewhere in the loop.
Thanks Alexander!
That's a very useful tip; I should have read the whole $ZCONVERT documentation.
Still, I think the most elegant solution to this specific problem is Dmitry's suggestion:
not using $ZCVT at all, but using the TranslateTable property of %Stream.FileCharacter.
Regards,
Nael
Thanks for the tip, it helped me a lot. I just changed Read to ReadLine.
Enable long strings in the System Management Portal and get strings of up to about 3.6 MB (3,641,144 characters):
System > Configuration > Memory and Startup
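To check which limit a given server is running with, something like the following should work from the terminal. This is a small sketch that assumes the $SYSTEM.SYS.MaxLocalLength() method is available on your version:

```objectscript
 // Reports the maximum string length for the current instance:
 // 32767 without long strings, 3641144 with long strings enabled
 Write $SYSTEM.SYS.MaxLocalLength()
```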
Thank you, Robert. I've updated it.
Sure! Let %Stream.FileCharacter do its job.
Nael,
I think you need to use the 4th argument of $ZCONVERT:
Set file=##class(%File).%New(..LocalFileName)
Do file.Open("R")
Set handle=""
While 'file.AtEnd {
    Set Line=$ZCVT(file.Read(),"I","UTF8",handle)
    // do something with Line
}
Do file.Close()
The handle "contains the remaining portion of string that could not be converted at the end of $ZCONVERT, and supplies this remaining portion to the next invocation of $ZCONVERT."
Please see the reference for $ZCONVERT.
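To see the handle at work without a file, here is a minimal sketch. It assumes a Unicode instance. The two "chunks" are made up so that the split falls inside the two-byte sequence $C(215,144), which is the UTF-8 encoding of U+05D0:

```objectscript
 Set handle=""
 // First chunk ends with the lead byte 215 of a two-byte sequence
 Set out1=$ZCVT("ab"_$C(215),"I","UTF8",handle)
 // Second chunk starts with the matching continuation byte 144
 Set out2=$ZCVT($C(144)_"cd","I","UTF8",handle)
 // Inspect the results: lengths of both pieces, and the code of the
 // first character of the second piece
 Write $Length(out1),!
 Write $Length(out2),!
 Write $ASCII(out2),!
```

According to the documentation quoted above, the first call should keep the dangling byte in handle instead of raising <TRANSLATE>, and the second call should prepend it before converting. If handle is still non-empty after the last chunk of a real file, the file presumably ended in the middle of a character.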