Sean Connelly · Jul 28, 2019

Hi David,

In general +1 for the generic boilerplate approach.

In terms of a generic JSON solution, it could do with some additional type checks to make sure values are output correctly, e.g. booleans as true / false / null, empty numbers being returned as null and not as an empty string, etc.

So just for booleans as an untested example you might replace...

set $PROPERTY(tRow,tColumnName) = $PROPERTY(rSet,tColumnName)


with this..

do tRow.%Set(tColumnName,$PROPERTY(rSet,tColumnName),$Select(tColumn.clientType=16:"boolean",1:""))
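
Extending the same idea to cover empty numerics as well (an untested sketch, and the ODBCType range check is an assumption that would need verifying against %SQL.StatementColumn)...

set val=$PROPERTY(rSet,tColumnName)
if tColumn.clientType=16 {
    // boolean column, output as true/false
    do tRow.%Set(tColumnName,val,"boolean")
} elseif (val="")&&(tColumn.ODBCType>=2)&&(tColumn.ODBCType<=8) {
    // empty numeric column, output as null rather than ""
    do tRow.%Set(tColumnName,"","null")
} else {
    do tRow.%Set(tColumnName,val)
}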


The alternative, if just going from SQL to a serialised JSON string, could be to use SQL and JSON_ARRAYAGG as per the examples here...

https://community.intersystems.com/post/how-do-i-return-json-database-sql-call

Sean Connelly · Aug 3, 2019

Hi Evgeny,

The current GitHub version will only serialise and deserialise to and from class based objects.

I do however have several other solutions in the unofficial version which will efficiently serialise and deserialise to and from globals. I also have a polyfill solution for DynamicObject and DynamicArray that uses a type mixer class, which would allow older versions of Cache to work with these classes now.

However, I've not used these in production, only unit tested. I am happy to release them if there is a need / someone is willing to collaborate on production level testing and debugging.

Sean Connelly · Mar 31, 2017

Hi Scott,

Sounds like classic teapotism from the vendor.

Typically at this stage I would put Wireshark on the TCP port so that I have absolute truth as to what's going on at the TCP level.

If you see no evidence of these messages in Wireshark then you can bounce the problem back to the vendor with the Wireshark logs.

If you see evidence of messages, then you will have something more to go on.

One thing to look out for is if the HL7 messages are correctly wrapped. If you don't see evidence of ending 1c 0d hex values then the message will get stuck in the buffer. If they are dropping the connection then this can get discarded. You might see warnings relating to this, something like "discarding TCP buffer".

The fact that they think they are getting HL7 level ACKs back is a bit odd. Again, with Wireshark you will be able to prove or disprove their observations. There is a scenario whereby a timed-out connection can collect the previous message's ACK; again, it would be obvious once you look at the Wireshark logs.

If you need help with Wireshark then I can dig around for some notes that might help.

Sean.

Sean Connelly · Apr 1, 2017

Hi Evgeny,

Not exactly one command, but it can be done on one line...

set file="foo.zip" do $System.OBJ.ExportToStream("foo*.GBL",.s) open file:("WNS":/GZIP=1) use file Do s.OutputToDevice() close file do s.Clear()

This should work in reverse: open the file with the GZIP flag, read the contents into a temporary binary stream, and then use $System.OBJ.LoadStream on that stream.
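
An untested sketch of that reverse direction (the read loop, chunk size and qspec flags here are assumptions)...

set file="foo.zip"
set s=##class(%Stream.TmpBinary).%New()
open file:("RS":/GZIP=1) use file
try {
    // read until <ENDOFFILE> is thrown, appending each chunk to the temp stream
    for  { read chunk#32000 do s.Write(chunk) }
} catch e {
    // end of file reached
}
close file
do $System.OBJ.LoadStream(s,"ck")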

Sean.

Sean Connelly · Apr 3, 2017

Have you tried to set the segment back to the message?

Do target.SetSegmentAt(seg,idx)
Sean Connelly · Apr 3, 2017

Hi Tom,

Should have spotted this earlier. GetSegmentAt will return an immutable segment, so you shouldn't be able to recycle that segment for other purposes.

If you create a new segment, then you might be able to set it at the old segment's idx, but having never done it this way I wouldn't be 100% sure it would work.

By all means give it a go, but you should at least test the status and bubble it back up the stack so that you don't end up with silent failures. If there is an error it will appear in the logs.

set sc=target.SetSegmentAt(newsegment,idx)
if $$$ISERR(sc) quit sc

BUT, if I was doing this by hand, I would remove the segment with...

set sc=target.SetValueAt(,"PIDgrpgrp(1).ORCgrp(1).OBXgrp(1).OBX","remove","")

I've explicitly hard-coded the groups. Note that this path is a 2.4 schema path and may be different for other schemas. If your data does have repeating groups in it, then you will need to set these logically.

I would then set the two values using...

set sc=target.SetValueAt(pObservationIdentifier,"PIDgrpgrp(1).ORCgrp(1).NTE(1):SetIDNTE","set","") 
set sc=target.SetValueAt(pObservationValue,"PIDgrpgrp(1).ORCgrp(1).NTE(1):SourceofComment","set","")

However, I wouldn't do this by hand at all. Having developed thousands of DTLs over the years, 95% of them have been done via the Data Transformation Build tool. The code it generates will have no typos in the schema paths, it will handle immutability for you, it will trap errors for you, and you will end up with a more maintainable solution.

If anything, use the tool and inspect the code it generates to see the right way to develop by hand.

Sean

Sean Connelly · Apr 4, 2017

Try this...

ClassMethod Transform(source As EnsLib.HL7.Message, Output target As EnsLib.HL7.Message) As %Status
{
    set target=source.%ConstructClone(1)
    set seg=target.FindSegment("OBX",.idx,.sc)
    while idx'="",$$$ISOK(sc)
    {
        set ntestr = "NTE|"_$I(ident)_"|"_seg.GetValueAt(5)
        set nte = ##class(EnsLib.HL7.Segment).ImportFromString(ntestr,.sc,source.Separators) if $$$ISERR(sc) goto ERROR
        set sc=target.SetSegmentAt(nte,idx) if $$$ISERR(sc) goto ERROR        
        set seg=target.FindSegment("OBX",.idx,.sc)
    }
ERROR
    quit sc
}
Sean Connelly · Apr 4, 2017

Hi Paul,

Quotes inside quotes need to be escaped; your condition is only matching a single double quote. You will need to try this...

source.{PV1:DischargeDateTime()}=""""""

On a side note, quotes sent in HL7 can be used to nullify a value, e.g. if a previous message had sent a discharge date and time by mistake then "" would be a request to delete that value (as opposed to an empty value).

Sean.

Sean Connelly · Apr 6, 2017

Hi Bapu,

There is a really simple solution, no Zen required.

Put some pre tags on your web page...

<pre id="json-preview-panel"></pre>

If your JSON is an object then...

document.getElementById("json-preview-panel").innerHTML=JSON.stringify(json, undefined, 2);

Note that the third argument in stringify() is the number of spaces to insert for prettifying.

If your JSON is a string already then you will need to convert it to an object and then back again...

document.getElementById("json-preview-panel").innerHTML=JSON.stringify(JSON.parse(json),undefined,2);

Sean.

Sean Connelly · Apr 11, 2017

Probably not, looking at the underlying code.

I would say it's being raised by cspxmlhttp.js when it gets a non-200 status code.

If there was a server-side option then we would probably see some kind of conditional around either of these two functions...

function cspProcessResponse(req) {
  if(req.status != 200) {
    var errText='Unexpected status code, unable to process HyperEvent: ' + req.statusText + ' (' + req.status + ')';
    var err = new cspHyperEventError(req.status,errText);
    return cspHyperEventErrorHandler(err);
  }

...

}

function cspHyperEventErrorHandler(error)
{
  if (typeof cspRunServerMethodError == 'function') return cspRunServerMethodError(error.text,error);
  alert(error.text);
  return null;
}

Sean Connelly · Apr 12, 2017

As an alternative to...

   s cn=##Expression($$$quote(%classname))

You could just do...

  set cn=$CLASSNAME()

Sean Connelly · Apr 12, 2017

Sounds like you might be adding unnecessary complexity.

> What is the most efficient way to process this large file?

That really depends on your definition of efficiency.

If you want to solve the problem with the least amount of watts then solving the problem with a single process would be the most efficient.

If you add more processes then you will be executing additional code to co-ordinate responsibilities. There is also the danger that competing processes will flush data blocks out of memory in a less efficient way.

If you want to solve the problem with speed then it's important to understand where the bottlenecks are before trying to optimise anything (avoid premature optimisation).

If your process is taking a long time (hours not minutes) then you will most likely have data queries that have a high relative cost. It's not uncommon to have a large job like this run 1000x quicker just by adding the right index in the right place.

Normally I would write a large (single) process job like this and then observe it in the management portal (System>Process>Process Details). If I see it's labouring over a specific global then I can track back to where the index might be needed.

You will then get further efficiencies / speed gains by making sure the tables are tuned and that Cache has as much configured memory cache as you can afford.

If you are writing lots of data during this process then also consider using a temporary global that won't hit the transaction files. If the process is repeatable from the file then there is no danger of losing these temp globals during a crash as you can just restart the job after the restore.
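
For example (the global name here is made up; anything mapped to a non-journaled temp database behaves the same way)...

// stage intermediate results in a scratch global; ^CacheTemp* globals are
// mapped to the CACHETEMP database by default and are not journaled
set ^CacheTempMyLoad($job,rowNumber)=rowData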

Lastly, I would avoid using Ensemble for this. The last thing you want to do is generate 500,000 Ensemble messages if there is no need to integrate the rows of data with anything other than internal data tables.

Correction. It's perfectly fine (for Ensemble) to ingest your file and process it as a single message stream. What I wouldn't do is split the file into 500,000 messages when there is no need to do this. Doing so would obviously cause additional IO. 

Sean Connelly · Apr 12, 2017

You could create a macro that conditionally compiles the break points into your code based on the existence of a global variable.
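
A rough sketch of the idea (untested, and the macro and global names are made up)...

#define BREAKPOINT ##Expression($select($data(^DebugBreakpoints):"break",1:""))

ClassMethod DoWork()
{
    // ...
    // expands to a real "break" command only if ^DebugBreakpoints existed at compile time
    $$$BREAKPOINT
    // ...
}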

Sean Connelly · Apr 13, 2017

Hi James,

Nothing simple that I can think of (perhaps DeepSee?).

Alternatively, I normally bash out a few lines of code, something like this... 

ClassMethod DisplaySegmentStats()
{
  write !!,"Segment Statistics...",!!
  &sql(declare hl7 cursor for select rawContent into :raw from EnsLib_HL7.Message)
  &sql(open hl7)
  &sql(fetch hl7)
  while SQLCODE=0
  {
    for i=1:1:$l(raw,$C(13))-1
    {
      set seg=$p($p(raw,$c(13),i),"|")
      set stats(seg)=$G(stats(seg))+1
    }
    &sql(fetch hl7)
  }
  &sql(close hl7)
  zw stats
}
Sean Connelly · Apr 13, 2017

Various Options...

1. Call out to Java on the command line using $ZF

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KE…

2. Access POJO's directly using Jalapeño

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KE…

3. Consume a web service using the Cache soap wizard...

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KE…

4. Publish a web service from Cache...

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KE…

Sean Connelly · Apr 18, 2017

Hi Alexandr,

If the property is in the context of the current class then you could try

set value=$property($THIS,propName)

If you make the code block's outer method a generator, e.g. [ CodeMode = objectgenerator ]

then you can bake the property accessor into the underlying INT code, e.g.

do %code.WriteLine(" set value=.."_propName)

This approach means you don't have to make any repetitive IO calls to the dictionary at run time.

If you need an expanded example then I can bash something out.

Sean.

Sean Connelly · Apr 19, 2017

Hi Greg,

The only zip utility that I have come across is in Healthshare (core 10+).

If you have Healthshare then take a look at...

HS.Util.Zip.Adapter


If you don't have Healthshare then it's still easy enough to do via the command line with $zf...

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=RCOS_fzf-1

First, if you are on Windows then there is no built-in command line zip tool outside of PowerShell. You will need to install 7zip (btw, Healthshare defaults to 7zip on Windows as well). If you are on Linux then there is a built-in zip command, but you might also choose to install 7zip.

Couple of trip hazards.

If you are building the command line on Windows then 7zip will be installed in "Program Files" with a space, so you will need to wrap quotes around the exe path, which will need double quoting in a Cache string.

If you are unzipping to a directory, the directory needs to exist first. Take a look at CreateDirectoryChain on the %File class to make this easier to do.

A simple untested example...

ClassMethod ZipFile(pSourceFile As %String, pTargetFile As %String) As %Status
{
    set cmd="""C:\Program Files\7-Zip\7z.exe"" a "_pTargetFile _" "_pSourceFile
    set status=$zf(-1,cmd)
    if status=0 quit $$$OK
    quit $$$ERROR($$$GeneralError,"Failed to zip, reason code: "_status)
}
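
And the reverse, an equally untested sketch (note the directory point above)...

ClassMethod UnzipFile(pSourceFile As %String, pTargetDir As %String) As %Status
{
    // the target directory must exist before 7zip can extract into it
    if '##class(%File).CreateDirectoryChain(pTargetDir) quit $$$ERROR($$$GeneralError,"Unable to create directory: "_pTargetDir)
    set cmd="""C:\Program Files\7-Zip\7z.exe"" x "_pSourceFile_" -o"_pTargetDir
    set status=$zf(-1,cmd)
    if status=0 quit $$$OK
    quit $$$ERROR($$$GeneralError,"Failed to unzip, reason code: "_status)
}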


For anyone landing here who is happy just to use gzip, there was a recent discussion here...

https://community.intersystems.com/post/there-option-export-globals-archive

Hope that helps.

Sean.

Sean Connelly · Apr 19, 2017

Have you tried...

set p=$property(parentObject,"childRefProperty")
do p.Insert(childObject)
Sean Connelly · Apr 19, 2017

I just tried the above out and it works.

You can do it more succinctly, but you must use "set" on $property as a "do" will throw a compile error for some reason...

set sc=$property(parentObject,"childRefProperty").Insert(childObject)
Sean Connelly · Apr 19, 2017

Hi Scott,

The %Stream package superseded the stream classes in the %Library package. If you look at the class documentation you will see in the descriptions that the %Library stream classes have been deprecated in favour of the %Stream variants. The only reason they still exist would be for legacy implementations.

The other difference is that one is a character stream and the other is a binary stream. As a general rule you should only write text to the character stream and non text (e.g. images) to the binary stream. The main reason for this is to do with unicode characters. You may not have seen issues writing text to %FileBinaryStream, but that might well be because your text didn't have any unicode conversions going on.
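
For reference, a minimal logIt-style append with %Stream.FileCharacter might look something like this (untested, and the file path is made up)...

set stream=##class(%Stream.FileCharacter).%New()
set sc=stream.LinkToFile("/logs/messages.log")
// move to the end so each write appends rather than overwrites
do stream.MoveToEnd()
do stream.WriteLine(messageText)
set sc=stream.%Save()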

Performance wise I'm not sure there would be much in it between the two. You can access the source code of both and they both use the same underlying raw code for reading and writing to files. If you benchmarked them then I guess you would see a marginal difference, but not enough to question which one to use for best performance.

I wonder, how did you determine that the logIt code was the reason for messages slowing down? On the surface it should only have a small impact on the message throughput. If messages are queueing up then it almost feels like this is just the first observation of an overall performance issue. I guess you have monitored overall IO performance; if it's already under strain then this could be the straw that breaks the camel's back.

On a curious note, whilst you might have needed to log messages in eGate, I wonder why this would be necessary in Ensemble. Unless you are using in-memory messaging, all of your messages will be automatically logged internally, as well as being tailed to the transaction logs. By adding your own logging you are effectively writing the same message to disk not twice but three times. If you also have IO logging enabled on your operation then it will be four times. Not to mention how many times the message was logged before the operation. On top of that, if you have log trace events enabled in production then the IO overhead for just one message is going to thrash the disks more than it needs to. Multiply that across your production(s) and how well IO is (or is not) spread over disks and it would be easy to see how a peak flow of messages can start to queue.

Another reason I see for messages queuing (due to IO thrashing) is because of poor indexes elsewhere in the production. A data store that worked fast in development will now be so large that even simple lookups will hog the disks and flush out the memory cache, putting an exponential strain on everything else. Suddenly a simple bespoke logger feels like it's writing at the speed of a ZX Spectrum to a tape recorder.

Of course you may well have a highly tuned system and production and all of this is rambling spam from me. In which case, nine times out of ten if I see messages queuing it's just because the downstream system can't process messages as quickly as Ensemble can send them.

Sean.

Sean Connelly · Apr 20, 2017

Hi Everardo,

There are a couple of extra compilation steps required for the web method.

Each web method requires its own separate message descriptor class. This class contains the arguments of your method as properties of the class, e.g.
 

Property file As %Library.String(MAXLEN = "", XMLIO = "IN");
Property sql As %Library.String(MAXLEN = "", XMLIO = "IN");


This extra class is required to provide a concrete API to your web method. The web service description will project this class as a complex type that the calling services needs to adhere to.

What I think is happening is that when you have an argument called args... the compiler is trying to compile
 

Property args... As %Library.String(MAXLEN = "", XMLIO = "IN");


Which would fail with an invalid member name error (which correlates with the 5130/5030 error code you have).

I think the main issue here is that there is nothing (to the best of my knowledge) in the SOAP specification that allows for variadic types.

Instead what you want is an argument type that can be projected as a list or an array, e.g.
 

ClassMethod GenerateFileFromSQL(file As %String, sql As %String, delimiter As %String = "", args As %ListOfDataTypes) As %String [ WebMethod ]


That will then be projected in the WSDL as a complex type with an unbounded max occurs, allowing the client to send any number of repeating XML elements for the property args.

If you pass args as %ListOfDataTypes to your non web method then you will need to decide if that method should have the same formal spec, or overload it, something like...
 

if $IsObject(args(1)),args(1).%IsA("%Library.ListOfDataTypes") {
  set list=args(1)
  for i=1:1:list.Count() {
    write !,list.GetAt(i)
  }
} else {
  for i=1:1:args {
      write !,args(i)
  }
}


Sean.

Sean Connelly · Apr 20, 2017

Check that you have not lost connection just before the Store() method...

ftp.Connected

Also check the value of

ftp.ReturnMessage

just after the Store() method; if there was a failure then this should have something useful to go on.
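
Something along these lines (a rough sketch, variable names are illustrative)...

// bail out if the connection was dropped before the upload
if 'ftp.Connected write !,"Connection lost before Store()" quit
set ok=ftp.Store(remoteFilename,fileStream)
// on failure the server's reply usually explains why
if 'ok write !,"Store failed: ",ftp.ReturnCode," ",ftp.ReturnMessage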

Sean.

Sean Connelly · Apr 20, 2017

The error message is heavily escaped; unescaped it would look like this...

{"Info":{"Error":"ErrorCode":"5001","ErrorMessage":"ERROR #5001: Cannot find Subject Area: 'SampleCube'"} } }

This error is only raised in the %ParseStatement method of the %DeepSee.Query.Parser class.

I'm at the limits of what I know on DeepSee, but if I read this at face value, there is a missing cube called SampleCube?

Sean Connelly · Apr 20, 2017

This is what I would do.

Create a custom process, extract the value using GetValueAt, and put it into a string container. String containers are handy Ens.Request messages that you can use to move strings around without needing to create a custom Ens.Request class. Then just send it async to an operation that will decode the base64 and write it to a file. Two lines of code, nice and simple...

Class My.DocExtractor Extends Ens.BusinessProcess [ ClassType = persistent ]
{
Method OnRequest(pRequest As Ens.Request, Output pResponse As Ens.Response) As %Status
{
    Set msg=##class(Ens.StringContainer).%New(pRequest.GetValueAt("OBX(1):5.5"))
    Quit ..SendRequestAsync("FILE OUT",msg,,"Send DOC as Base64 to a file writer")
}
}


To decode the base64 use this method inside your operation.

set decodedString=##class(%SYSTEM.Encryption).Base64Decode(pRequest.StringValue)

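Put together, the operation might look something like this (an untested sketch; the class name, adapter choice and file name are illustrative)...

Class My.DocFileWriter Extends Ens.BusinessOperation
{

Parameter ADAPTER = "EnsLib.File.OutboundAdapter";

Method OnMessage(pRequest As Ens.StringContainer, Output pResponse As Ens.Response) As %Status
{
    // decode the base64 payload and hand it to the file adapter as a binary stream
    set decoded=##class(%SYSTEM.Encryption).Base64Decode(pRequest.StringValue)
    set stream=##class(%Stream.TmpBinary).%New()
    do stream.Write(decoded)
    quit ..Adapter.PutStream("document.pdf",stream)
}

}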

Things to consider...

1. You mention the message is 2.3, but the MSH has a 2.4
2. If you set your inbound service to use either of the default schemas for these two then you will have a problem with required EVN and PV1 segments
3. Therefore you will need to create a custom schema and make these optional.
4. The base64 decode method is limited to a string, so your PDF documents cannot be greater than ~3.6MB (assuming large string support is on, which it is by default).
5. You probably don't want to decode the document into another message too soon, do this just before writing to a file

T02 to ITK should just be a matter of creating a new transform and dragging the OBX(1):5.5 field onto the corresponding target field.

Sean Connelly · Apr 21, 2017

You could buffer it up, but you will still have a period of writing to disk where the other collection process could grab it mid write.

I normally write the file to a temp folder or use a temp file name, and then change the file name once it's been fully written, making sure the collection process ignores the temp file extension or the temp folder location.
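
For example, something like this (a rough sketch, names are made up)...

set tmpFile=targetFile_".tmp"
open tmpFile:("WNS") use tmpFile
write data
close tmpFile
// rename only once the file is complete, so the collector never sees a partial file
do ##class(%File).Rename(tmpFile,targetFile)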

Sean Connelly · Apr 25, 2017

Hi Andre,

1. Description might be empty whilst NTE might actually exist, so it would be better to do...

request.GetValueAt("NTE(1)")'=""


2. If you are looking to shave off a few characters then take a look at $zdh...

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KE…

$zdh(source.{G62:Date},8)


3. For time there is $zth, but it expects a : in the time (e.g. HH:MM).

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KE…

It doesn't look like you have a colon, so you could use $tr to reformat the time first. This will save you 10 or so characters...

$zth($tr("Hh:Mm","HhMm",source.{G62:Time}))


4. You can have more than one PID group, but you should not see more than one PID segment in a PID group. Not all message types define (or allow) a repeating PID group. You might see multiple PID groups in an ADT merge message. You might also see bulk records in an ORU message, but in the real world probably not. If you know the implementation only ever sends one PID group then you will see many developers just hard coding its ordinal key to 1, e.g.

request.GetValueAt("PIDgrpgrp(1).ORCgrp(notHardcodedKey)")


Developing with HL7 can feel verbose to begin with. Tbh, what you are currently doing is all perfectly fine and acceptable.

Sean.