Question Natasa Klenovsek Arh · Jan 19, 2017

Import data with special characters (čšž)

Does anyone have any experience importing data into Caché that contains special characters like ščž?

I have tried several options, but nothing really works. :)

thanks

Comments

Dmitry Maslennikov · Jan 19, 2017

Please add more details: how did you try to do it, and what is the $zversion of your instance?

Alexander Koblov · Jan 19, 2017

Natasha,

We need many more details.

What exactly did you try? How do you import the data? What is the source of the data? What do you mean by "nothing works"?

What version of Caché do you have (exact $zv)? What locale does this instance have?

I have no problems importing this data from a UTF-8 file.

USER>set f = ##class(%Stream.FileCharacter).%New()

USER>write f.LinkToFile("c:\temp\demo.txt")
1
USER>set line = f.Read()
 
USER>write line
ščž
Natasa Klenovsek Arh  Jan 19, 2017 to Alexander Koblov

The Caché version is 2016.2.1.803, and the source of the data is a CSV file exported from MySQL.

I tried importing with different charset types, and I tried importing with $SYSTEM.SQL.DDLImport. I also tried the same as you, but I get ??? instead of ščž.

%SQL.Import.Mgr

Dmitry Maslennikov  Jan 19, 2017 to Natasa Klenovsek Arh

This version number is not enough; please show the output of this command:

write $zv

It looks like your installation is in 8-bit mode instead of Unicode. In that case such behavior is possible. That's why we are asking for the full version string from $zv.

Natasa Klenovsek Arh  Jan 19, 2017 to Dmitry Maslennikov

Aha, I see :)

Output of write $zv: Cache for Windows (x86-64) 2016.2.1 (Build 803U_SU)

Alexander Koblov  Jan 19, 2017 to Natasa Klenovsek Arh

Thanks.

803U means that the installation is Unicode.

  1. What locale do you have? You can check it in Management Portal -> System Administration -> Configuration -> National Language Settings -> Locale Definitions.

  2. Please double-check that the file is indeed in UTF-8 format.

  3. Can you reproduce this problem with a small file? For example, create a small text file with just three symbols (ščž), save it as UTF-8, and read it as in my example above. Does it work?

Natasa Klenovsek Arh  Jan 19, 2017 to Alexander Koblov

Thank you.

You were right with the first suggestion: the language was set to English :)

Thank you again, it's working now.

Alexey Maslov  Jan 20, 2017 to Alexander Koblov

This seems to be a pretty peculiar case, at least to me. It sounds like Caché is supplied with an English Unicode locale that does not support all Unicode characters. What for?

Alexander Koblov  Jan 23, 2017 to Alexey Maslov

The enuw locale uses the RAW translation table by default when reading from files. Some other locales, for example rusw, use the UTF8 translation table by default.

So when reading UTF-8 data in the enuw locale, you need to specify the translation table explicitly, or else use a locale whose default file translation table is UTF8.
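
A minimal sketch of the explicit approach, reusing the stream from the earlier example in this thread: the TranslateTable property of %Stream.FileCharacter can be set to "UTF8" before the first read. The file path is the same illustrative one as above, and the sketch assumes the terminal itself can display Unicode characters.

```objectscript
 // Create a character stream and link it to the (illustrative) UTF-8 file
 set f = ##class(%Stream.FileCharacter).%New()
 set sc = f.LinkToFile("c:\temp\demo.txt")
 // Override the locale's default file translation table (RAW in enuw)
 set f.TranslateTable = "UTF8"
 // Read the first chunk and display it
 write f.Read()
```

Setting the table on the stream affects only this stream, so the instance's locale and its defaults for other file I/O stay unchanged.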

Evgeny Shvarov  Jan 24, 2017 to Natasa Klenovsek Arh

Hi, Natasa!

If Alexander's answer resolves the question for you, would you please mark it as "Accepted"?

Thank you in advance!
