Traverse Global Subscripts while using Indirection
Hi all,
I am trying to create a method to count the number of entries in a global, including all subscripts. I am having a bit of trouble getting the code to reach the second subscript level. When my key is "Canada" and I append a comma and empty quotes to it, $Order returns "USA" as the new key. Is $Order, or the global reference, unable to use a single string to represent multiple subscripts?
Here is my global structure:
^Locations("Canada",1)="Montreal"
^Locations("Canada",2)="Vancouver"
^Locations("USA",1)="Michigan"
^Locations("USA",2)="Ohio"
^Locations("USA",3)="Florida"
Here is my method:
ClassMethod RecursiveGlobalCount(pGlobalName As %String, pKey As %String, pCount As %Integer) As %Integer
{
///if pCount is not populated set to zero for first run
if $Data(pCount)=0
{
set tCount = 0
}
else
{
set tCount = pCount
}
//pKey should only be undefined on first run
if $Data(pKey)=0
{
set tKey = $Order(@pGlobalName@(""))
w $Data(@pGlobalName@(tKey))
while tKey'=""
{
//check to see if global has descendents
if $Data(@pGlobalName@(tKey))=10
{
do ..RecursiveGlobalCount(pGlobalName,tKey,tCount)
}
set tKey = $Order(@pGlobalName@(tKey))
set tCount=tCount+1
}
}
else
{
set tKey = pKey
if ($Data(@pGlobalName@(tKey))=1)
{
set tKey=""
set tKey = $Order(@pGlobalName@(""))
while tKey'=""
{
set tKey = $Order(@pGlobalName@(tKey))
set tCount=tCount+1
}
}
elseif ($Data(@pGlobalName@(tKey))=10)
{
set tKey=pKey_","""""
set tKey = $Order(@pGlobalName@(tKey))
while tKey'=""
{
set tKey = $Order(@pGlobalName@(tKey))
set tCount=tCount+1
}
}
}
}
Comments
The following will count the number of data nodes under a given ^Location(country)
USER>S G1=$NA(^Locations("Canada"))
USER>S G=$E(G1,1,$L(G1)-1)
USER>W G
^Locations("Canada"
USER>W G1
^Locations("Canada")
USER>F  S G1=$Q(@G1) Q:G1=""!($E(G1,1,$L(G))'=G)  S CT=$I(CT)
USER>W CT
2
Yes, you are right, thank you for the hint. One should never add an alternate function without testing it!
The correct form is:
ClassMethod CountQ(node) As %Integer
{
    s end=node
    if $data(@node)#10 { set sum=1 } else { set sum=0 }
    while 1 {
        set node=$query(@node)
        quit:node=""||($name(@node,$qlength(end))'=end)
        if $increment(sum)
    }
    quit sum
}
I noticed that a couple of folks changed the original:
set tCount=tCount+1
to:
if $increment(sum)
I wondered if that was in fact a performance improvement, so wrote:
s lim=1000000
s start=+$p($now(),",",2)
s count=0
for i=1:1:lim { s count=count+1 }
w count_" count=count+1: "_((+$p($now(),",",2))-start)_" seconds",!
s start=+$p($now(),",",2)
s count=0
for i=1:1:lim { s count=1+count }
w count_" count=1+count: "_((+$p($now(),",",2))-start)_" seconds",!
s start=+$p($now(),",",2)
s count=0
for i=1:1:lim { if $i(count) }
w count_" if $i(count): "_((+$p($now(),",",2))-start)_" seconds",!
The result is:
1000000 count=count+1: .010256 seconds
1000000 count=1+count: .008554 seconds
1000000 if $i(count): .024483 seconds
So, "s count=1+count" is a little faster than "s count=count+1", and about 3 times faster than "if $i(count)".
set count=count+1 and set count=1+count generate identical object code, so I think we found your margin of error.
if $increment(count) has to set $test, so I would expect it to be slower for a local variable. (I'm not sure about a global.) In IRIS 2018.2 and later, do $increment(count) may close the gap a bit.
Also, if you use $I() or $seq() for id generation, it's not comparable with count=count+1, because $I is not reversed by rollbacks.
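To illustrate the rollback point, here is a minimal sketch. The globals ^demoCounter and ^demoData are hypothetical scratch names, and the sketch assumes the database is journaled so that TROLLBACK can undo the ordinary SET.

```objectscript
 ; ^demoCounter and ^demoData are hypothetical scratch globals for this sketch
 kill ^demoCounter,^demoData
 tstart
 set id=$increment(^demoCounter)   ; allocate an id inside the transaction
 set ^demoData(id)="some value"    ; ordinary SET, undone by a rollback
 trollback
 write "counter: ",$get(^demoCounter),!         ; the increment is expected to survive
 write "data defined: ",$data(^demoData(id)),!  ; the SET is expected to be gone
```

A plain set ^demoCounter=^demoCounter+1 inside the transaction would be rolled back like any other SET, which is why the two are not interchangeable for id generation.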
With time measurements, keep in mind:
- usually, you are not alone on a Cache server
There are many other processes; some of them belong to Cache, others to the OS
- the time resolution (whatever you use: $now(), $zh) is also limited
- it also depends on when, and for how long, your measurement runs (you are not alone!)
This is my short test routine:
Times(iter=1E3,count=4) ; show times
w ?3,"count num+1 1+num =$i() $i()",!
w ?15,"times in microseconds",!
w $tr($j("",40)," ",-1),!
f i=1:1:count d time(iter) s iter=iter*10
q
time(iter)
{
s f=1E6/iter // factor for "one operation in microseconds"
w $j(iter,8)
s num=0,t=$zh f i=1:1:iter { s num=num+1 } d t($zh-t*f)
s num=0,t=$zh f i=1:1:iter { s num=1+num } d t($zh-t*f)
s num=0,t=$zh f i=1:1:iter { s num=$i(num) } d t($zh-t*f)
s num=0,t=$zh f i=1:1:iter { i $i(num) } d t($zh-t*f)
w !
}
t(t)
{
w $j(t,8,3)
}
and this is the output
USER>d ^Times(1,8)
count num+1 1+num =$i() $i()
times in microseconds
----------------------------------------
1 2.000 1.000 2.000 1.000
10 0.100 0.100 0.100 0.200
100 0.030 0.030 0.080 0.080
1000 0.044 0.042 0.088 0.090
10000 0.028 0.028 0.075 0.077
100000 0.027 0.027 0.064 0.050
1000000 0.018 0.014 0.031 0.032
10000000 0.011 0.011 0.031 0.032
USER>d ^Times(1,8)
count num+1 1+num =$i() $i()
times in microseconds
----------------------------------------
1 4.000 0.000 2.000 1.000
10 0.100 0.100 0.100 0.100
100 0.040 0.030 0.080 0.580
1000 0.044 0.041 0.088 0.088
10000 0.028 0.028 0.075 0.077
100000 0.027 0.027 0.073 0.076
1000000 0.027 0.021 0.032 0.032
10000000 0.011 0.011 0.031 0.032
USER>d ^Times(1,8)
count num+1 1+num =$i() $i()
times in microseconds
----------------------------------------
1 3.000 1.000 2.000 1.000
10 0.100 0.000 0.100 0.100
100 0.040 0.030 0.080 0.590
1000 0.045 0.041 0.088 0.090
10000 0.028 0.028 0.075 0.077
100000 0.027 0.027 0.073 0.075
1000000 0.015 0.012 0.031 0.032
10000000 0.011 0.011 0.031 0.032
USER>
USER>
USER>d ^Times(1,8)
count num+1 1+num =$i() $i()
times in microseconds
----------------------------------------
1 3.000 0.000 3.000 1.000
10 0.100 0.000 0.100 0.100
100 0.030 0.030 0.080 0.630
1000 0.046 0.042 0.088 0.090
10000 0.028 0.028 0.075 0.077
100000 0.027 0.027 0.073 0.075
1000000 0.014 0.012 0.032 0.032
10000000 0.011 0.011 0.031 0.032
USER>
I consider time measurements only as rough approximations.
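One common way to damp the noise from other processes is to repeat the same loop several times and keep the smallest elapsed time. The following is only a rough sketch of that idea, not part of the routine above; the variable names best, run and elapsed, and the choice of 5 repetitions, are assumptions made for illustration.

```objectscript
BestOf ; hypothetical routine: repeat one timing loop and keep the minimum
 set best=""
 for run=1:1:5 {
     set num=0,t=$zhorolog
     for i=1:1:1E6 { set num=num+1 }
     set elapsed=$zhorolog-t
     ; keep the smallest elapsed time seen so far
     if (best="")||(elapsed<best) { set best=elapsed }
 }
 write "best of 5: ",best," seconds",!
 quit
```

The minimum is still only an approximation, but it tends to be less sensitive to a single run that happened to be interrupted by the OS.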
Interesting. I added a loop for if $increment(num) {} (i.e., a new-style if statement that doesn't set $test): no measurable improvement over legacy if.
I also added a loop for do $increment(num) (i.e., a do statement that neither sets $test nor returns a value): ever so slightly slower.
USER>d ^Times(1,8)
count num+1 1+num =$i() $i() $i(){} d $i()
times in microseconds
--------------------------------------------------------
1 1.000 0.000 1.000 0.000 0.000 1.000
10 0.000 0.000 0.100 0.100 0.000 0.000
100 0.010 0.010 0.490 0.030 0.030 0.040
1000 0.042 0.011 0.029 0.034 0.029 0.033
10000 0.011 0.010 0.030 0.032 0.032 0.031
100000 0.009 0.010 0.030 0.028 0.027 0.031
1000000 0.009 0.010 0.028 0.028 0.027 0.031
10000000 0.010 0.010 0.028 0.028 0.028 0.031
Incidentally, here are some results with num renamed to ^num:
USER>d ^Times(1,8)
count num+1 1+num =$i() $i() $i(){} d $i()
times in microseconds
--------------------------------------------------------
1 2.000 0.000 2.000 1.000 0.000 0.000
10 0.100 0.200 0.100 0.100 0.100 0.100
100 1.070 0.280 0.130 0.110 0.100 0.110
1000 0.142 0.144 0.142 0.102 0.102 0.106
10000 0.142 0.141 0.110 0.116 0.104 0.108
100000 0.142 0.141 0.102 0.101 0.100 0.104
1000000 0.139 0.140 0.100 0.098 0.100 0.102
10000000 0.138 0.138 0.098 0.098 0.099 0.102
For "=$i()", I assigned a local, rather than redundantly assigning the global.
Timings are always variable, but the general trends are clear ("count=1+count" still wins).
I added a $seq test:
SET $SEQ(^myseq)=1
for i=1:1:lim { if $SEQ(^myseq) }
w count_" if $SEQ(^myseq)): "_((+$p($now(),",",2))-start)_" seconds",!
Results:
1000000 count=count+1: .010362 seconds
1000000 count=1+count: .007998 seconds
1000000 if $i(count): .025006 seconds
1000000 if $SEQ(^myseq)): .099028 seconds
Do you really think it makes a difference if my routine contains "set xx=xx+1" instead of "set xx=1+xx"?
If yes, try the following:
Times2 ; execution time measurement
s num=0,t=$zh f i=1:1:1E6 { s num=num+1 } w $j($zh-t,8,6),!
s num=0,t=$zh f i=1:1:1E6 { s num=num+1 } w $j($zh-t,8,6),!
q
my output values are
USER>d ^Times2
0.047048
0.038218
USER>d ^Times2
0.034727
0.035160
USER>d ^Times2
0.044252
0.036175
USER>d ^Times2
0.045639
0.035366
Both loops are exactly the same! So please explain why some of the times differ by more than 20%.
Sorry, I was hasty in my judgement of "num+1". "1+num" is not faster.
I am not a performance/benchmark expert, but, as noted earlier, timings will vary because the OS is doing other things.
Increasing the loop from 1E6 to 1E7, and repeating/alternating the tests in the program, my laptop was fairly consistent:
+1: 0.204812
1+: 0.201091
+1: 0.201526
1+: 0.207091
+1: 0.20308
1+: 0.201488
+1: 0.202613
1+: 0.202009
With the code below, you can count all nodes.
USER>S COUNTRY="",COUNT=0,NIV=""
USER>F  S COUNTRY=$O(^Locations(COUNTRY)) Q:COUNTRY=""  F  S NIV=$O(^Locations(COUNTRY,NIV)) Q:NIV=""  S COUNT=COUNT+1
USER>W COUNT
5
Count only USA
USER>S COUNTRY="USA",COUNT=0,NIV=""
USER>F  S NIV=$O(^Locations(COUNTRY,NIV)) Q:NIV=""  S COUNT=COUNT+1
USER>W COUNT
3
Count only Canada
USER>S COUNTRY="Canada",COUNT=0,NIV=""
USER>F  S NIV=$O(^Locations(COUNTRY,NIV)) Q:NIV=""  S COUNT=COUNT+1
USER>W COUNT
2
Hi Flávio,
That works with this particular global, but the method I am trying to create would ideally accept any global with any number of subscripts.
$Query is the function that will traverse the global. Below is my version of the task at hand, with testing.
Class Test.Test1 Extends (%RegisteredObject, %XML.Adaptor)
{
ClassMethod RecursiveGlobalCount(pGlobalName As %String, pKey As %String, pCount As %Integer) As %Integer
{
/// pCount is zero if not provided
Set pCount = +$Get(pCount)
Set tKey = $Get(pKey)
//pKey should only be undefined on first run
Set tKey = $Query(@pGlobalName@(tKey))
If (tKey '= "") Set pCount = 1 + pCount // setting tKey got the first node
For {
Set tKey = $Query(@tKey)
If ((tKey = "") || (tKey '[ pKey)) Quit
Set pCount = 1 + pCount
}
Quit pCount
}
}
Testing:
Write ##class(Test.Test1).RecursiveGlobalCount("^Locations","",0)
5
Write ##class(Test.Test1).RecursiveGlobalCount("^Locations","Canada",0)
2
Write ##class(Test.Test1).RecursiveGlobalCount("^Locations","USA",0)
3
Wow! Very clean and dynamic.
Hi Alan,
Thanks! That works exactly like I was thinking. Can you change your comment to an answer, or post again as an answer so I can accept it? Thanks again!
When you're using subscript indirection with a recursive $order traversal, you may find the $name function useful; e.g.,
do ..RecursiveGlobalCount($na(@pGlobalName@(tKey)),"",.tCount)
As the other answers suggest, you probably want $query instead of $order, but $order can be useful for summarizing on multiple subscript levels (e.g., count, min, and max per country, state, and city).
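As far as I can tell, this also explains the behaviour in the question: with subscript indirection, the value in parentheses is always one subscript, so a string such as Canada,"" is treated as a single first-level subscript that collates between "Canada" and "USA". A minimal sketch, assuming the ^Locations global from the question (the variable names sub and key are just for illustration):

```objectscript
NameDemo ; hypothetical routine name
 set pGlobalName="^Locations",tKey="Canada"
 ; a comma inside the string does not create a second subscript level;
 ; this looks up the next first-level subscript after the string Canada,""
 write $order(@pGlobalName@(tKey_",""""")),!
 ; $name builds a reference one level deeper, here ^Locations("Canada")
 set sub=$name(@pGlobalName@(tKey))
 write sub,!
 ; $order on that reference walks the second-level subscripts
 set key=""
 for {
     set key=$order(@sub@(key))
     quit:key=""
     write key," = ",$get(@sub@(key)),!
 }
 quit
```

Passing the reference built by $name into the next recursive call keeps each call working on exactly one subscript level, which is the pattern the do ..RecursiveGlobalCount(...) line above relies on.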
Class community.counter Extends %RegisteredObject
{

/// Example:
/// set ^x(1)=111
/// set ^x(3,5)=222
/// set ^x(3,7)=333
/// 
/// The above global has 5 nodes:
/// ^x without a value
/// ^x(1) with value
/// ^x(3) without a value
/// ^x(3,5) with value
/// ^x(3,7) with value
/// 
/// write ##class(community.counter).CountQ($name(^x)) --> 3
/// write ##class(community.counter).CountR($name(^x)) --> 3
/// 
/// Using your example:
/// write ##class(community.counter).CountQ($name(^Locations)) --> 5
/// write ##class(community.counter).CountQ($name(^Locations("USA"))) --> 3
/// 
/// N.B.
/// Recursion is a tricky thing!
/// It helps one to get a clearly laid out solution
/// but you should take care about runtimes.
/// 
/// CountQ(...) is about 4-5 times faster than CountR(...)
/// 
/// --------------------------------------------------------
/// 
/// Return the count of nodes of a global or a local variable
/// which have a value, using the $QUERY() function
/// 
/// node:
/// a local or global variable, example: $na(^myGlobal), $na(abc)
/// or a local or global reference, example: $na(^myGlobal(1,2))
/// 
ClassMethod CountQ(node) As %Integer
{
    if $data(@node)#10 { set sum=1 } else { set sum=0 }
    while 1 {
        set node=$query(@node)
        quit:node=""
        if $increment(sum)
    }
    quit sum
}

/// Return the count of nodes of a global or a local variable
/// which have a value, using recursion
/// 
/// node:
/// a local or global variable, example: $na(^myGlobal), $na(abc)
/// or a local or global reference, example: $na(^myGlobal(1,2))
/// 
ClassMethod CountR(node) As %Integer
{
    set sum=0
    do ..nodeCnt($name(@node), .sum)
    quit sum
}

ClassMethod nodeCnt(ref, ByRef sum) As %Integer [ Internal, Private ]
{
    if $data(@ref)#10, $increment(sum)
    set i=""
    while 1 {
        set i=$order(@ref@(i))
        quit:i=""
        do ..nodeCnt($na(@ref@(i)),.sum)
    }
}

}
Hi.
Julius's CountQ function will give a wrong answer for ^Locations("Canada"), since it will also count the "USA" nodes.
Here is code that will do the trick:
ClassMethod Count(node)
{
S QLen=$QL(node) I QLen S Keys=$QS(node,QLen)
F Count=0:1 S node=$Query(@node) Q:(node="")||(QLen&&($QS(node,QLen)'=Keys))
Quit Count
}
W ##class(Yaron.test).Count($name(^Locations))
5
w ##class(Yaron.test).Count($name(^Locations("USA")))
3
w ##class(Yaron.test).Count($name(^Locations("Canada")))
2