Written by

Developer at Matirx, Israel
Question Nael Nasereldeen · Apr 16, 2018

Reading a file and translating the content from UTF8 to 8-bit

Hi,

I need to read a UTF8 encoded text file and translate the content to 8-bit.

Using the %File class and $ZCVT(TXT,"I","UTF8") works, but if the content is larger than the maximum string length (32000) and we cut it into maximum-length chunks, we can get a <TRANSLATE> error when a cut falls at the "wrong" point, that is, in the middle of a multi-byte UTF-8 character.

Is there a better way to do this task?

My code looks like this:

    S file=##class(%File).%New(..LocalFileName)
    D file.Open("R")
    While 'file.AtEnd {    
        S Line=$ZCVT(Line,"I","UTF8")
    }
    D file.Close()

and an example of such an error:

USER>s str=$C(215)
USER>w $ZCVT(str,"I","UTF8")
W $ZCVT(str,"I","UTF8")
^
<TRANSLATE>

Regards,

Nael

Comments

Dmitry Maslennikov · Apr 16, 2018

I would recommend using a more suitable class for this: %Stream.FileCharacter, where you can set the TranslateTable property.

Set stream=##class(%Stream.FileCharacter).%New()
Set sc=stream.LinkToFile("c:\myfile.txt")
Set stream.TranslateTable = "UTF8"
While 'stream.AtEnd {
	Set line=stream.Read()
	; Process the chunk here
}

You don't need any conversions after that.
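For readers who want status checking around the same idea, here is a minimal sketch. The class name `Demo.Utf8Reader` and the method `CountChars` are hypothetical and exist only for illustration; the stream calls (`LinkToFile`, `TranslateTable`, `Read`, `AtEnd`) are the ones used above, and the chunk size of 32000 is an assumption matching the short-string limit mentioned in the question.

```objectscript
/// Hypothetical demo class, not part of any library.
Class Demo.Utf8Reader Extends %RegisteredObject
{

/// Reads a UTF-8 encoded file in chunks and returns the number of
/// decoded characters in count. The stream does the decoding, so no
/// $ZCONVERT call is needed on the chunks.
ClassMethod CountChars(path As %String, Output count As %Integer) As %Status
{
    Set count=0
    Set stream=##class(%Stream.FileCharacter).%New()
    Set sc=stream.LinkToFile(path)
    If $$$ISERR(sc) Quit sc
    Set stream.TranslateTable="UTF8"
    While 'stream.AtEnd {
        // Ask for at most 32000 characters; len is updated by Read
        Set len=32000
        Set chunk=stream.Read(.len,.sc)
        If $$$ISERR(sc) Quit
        // Process the decoded chunk here
        Set count=count+$Length(chunk)
    }
    Quit sc
}

}
```

It could be called as `Set sc=##class(Demo.Utf8Reader).CountChars("c:\myfile.txt",.n)`.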

Nael Nasereldeen  Apr 16, 2018 to Dmitry Maslennikov

Thanks Dmitry!

That does solve the problem.

Regards,

Nael

Nael Nasereldeen  Apr 16, 2018 to Robert Cemper

Thank you Robert,

We are aware of this parameter and have changed it on some of our servers, but right now we need to write code that works even when it is not enabled.

Regards,

Nael

Robert Cemper  Apr 16, 2018 to Alexander Koblov

I like that!

BUT: as in the original request, I seem to miss the file.Read() to fill Line somewhere in the loop.

Nael Nasereldeen  Apr 16, 2018 to Alexander Koblov

Thanks Alexander!

That's a very useful tip; I should have read the whole $ZCONVERT documentation.

Still, I think the most elegant solution to this specific problem is Dmitry's suggestion: not using $ZCVT at all, but using the TranslateTable property of %Stream.FileCharacter.

Regards,

Nael

Jean Cruz  Nov 4, 2020 to Dmitry Maslennikov

Thanks for the tip, it helped me a lot. I just changed Read to ReadLine.
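A line-oriented variant of Dmitry's loop might look like the sketch below. The class and method names are hypothetical, and it assumes each line fits within the default length limit of `ReadLine`.

```objectscript
/// Hypothetical demo class, not part of any library.
Class Demo.Utf8Lines Extends %RegisteredObject
{

/// Writes each line of a UTF-8 encoded file to the current device.
ClassMethod PrintLines(path As %String) As %Status
{
    Set stream=##class(%Stream.FileCharacter).%New()
    Set sc=stream.LinkToFile(path)
    If $$$ISERR(sc) Quit sc
    Set stream.TranslateTable="UTF8"
    While 'stream.AtEnd {
        // ReadLine returns the text up to the next line terminator,
        // already decoded from UTF-8 by the stream
        Set line=stream.ReadLine()
        Write line,!
    }
    Quit $$$OK
}

}
```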

Robert Cemper · Apr 16, 2018

Enable long strings in the System Management Portal to get strings of up to 3.4 MB:

System > Configuration > Memory and Startup

Alexander Koblov  Apr 16, 2018 to Robert Cemper

Thank you, Robert. I've updated it.

Alexander Koblov  Apr 16, 2018 to Nael Nasereldeen

Sure! Let %Stream.FileCharacter do its job.

Alexander Koblov · Apr 16, 2018

Nael,

I think you need to use the 4th argument of $ZCONVERT:

Set file=##class(%File).%New(..LocalFileName)
Do file.Open("R")
Set handle=""
While 'file.AtEnd {
    // handle carries any incomplete trailing character over to the next call
    Set Line=$ZCVT(file.Read(),"I","UTF8",handle)
    // do something with Line
}
Do file.Close()

Handle "contains the remaining portion of string that could not be converted at the end of $ZCONVERT, and supplies this remaining portion to the next invocation of $ZCONVERT."

Please see the reference for $ZCONVERT.
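To see the handle at work on the failing case from the question, here is a small sketch. The class and method names are hypothetical. It feeds the two bytes $C(215,144), which are the UTF-8 encoding of U+05D0 (Hebrew letter alef), in two separate calls, as would happen if a chunk boundary fell between them. What the converted character looks like depends on whether the instance is Unicode or 8-bit and on its locale, so no output is shown.

```objectscript
/// Hypothetical demo class, not part of any library.
Class Demo.ZcvtHandle Extends %RegisteredObject
{

/// Converts a two-byte UTF-8 character that arrives split across two chunks.
ClassMethod SplitChar() As %String
{
    Set handle=""
    Set out=""
    // First chunk ends in the middle of the character. With the 4th
    // argument the incomplete byte is kept in handle instead of
    // raising <TRANSLATE>.
    Set out=out_$ZCVT($C(215),"I","UTF8",handle)
    // Second chunk supplies the rest; $ZCONVERT prepends what was
    // left in handle before converting.
    Set out=out_$ZCVT($C(144),"I","UTF8",handle)
    Quit out
}

}
```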
