Arrays vs Object for Key/Value pair lookups

Arrays vs Object for Key/Value pair lookups

4D Tech mailing list
Hi

I remember at last year’s summit, JPR was emphasising how objects were far more optimised than arrays for doing lookups over large numbers of key value pairs.

e.g. we usually do this:

$x:=Find in array(myKEYS;"product_code_x")

If ($x>0)
  $0:=myPRICES{$x}
End if

How do people prefer to do this with objects? Enumerate the keys in some systematic way and then populate the object like this:

For ($i;1;$SIZE)

  $key:=String($i)
  $value:=myarrayVAL{$i}
  OB SET($object;$key;$value)

End for

Then for retrieving:

$key:=String($1)

$0:=OB Get($object;$key)

…or was JPR suggesting we use object arrays and do some kind of “find” over the object arrays ?
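
To make the two options concrete, here is a minimal sketch in Python rather than 4D, so it can be run anywhere. A dict stands in for the 4D object, and the product codes and prices are invented for illustration. It shows the lookup value itself being used as the key, instead of an enumerated index:

```python
# Parallel arrays: the position in `keys` gives the position in `prices`.
keys = ["product_code_a", "product_code_x", "product_code_z"]
prices = [1.50, 9.99, 4.25]

def price_from_arrays(code):
    """Sequential scan, comparable to Find in array on an unsorted array."""
    try:
        return prices[keys.index(code)]
    except ValueError:
        return None  # code not present

# Keyed container: the product code itself is the key.
price_by_code = dict(zip(keys, prices))

def price_from_object(code):
    """Direct keyed lookup, comparable to OB Get on an object."""
    return price_by_code.get(code)
```

Both functions return the same answers; they differ only in how much work a lookup costs as the number of keys grows.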

Best Regards

Peter

**********************************************************************
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:[hidden email]
**********************************************************************

Re: Arrays vs Object for Key/Value pair lookups

4D Tech mailing list
I did a lot of testing for this, as I need to keep a dictionary of some 300,000 words identified by word IDs.
I need to retrieve the words based on their ID.
Using objects was orders of MAGNITUDE faster than synchronised arrays (I cannot find the numbers anymore, but we are talking measurable differences here: 1 ms versus several hundred), so I immediately trashed the old array-based code and rewrote it with objects.
Never looked back :-)

Cheers
Alex

> On 17.07.2017 at 12:46, Peter Jakobsson via 4D_Tech <[hidden email]> wrote:
>
> Hi
>
> I remember at last year’s summit, JPR was emphasising how objects were far more optimised than arrays for doing lookups over large numbers of key value pairs.

Re: Arrays vs Object for Key/Value pair lookups

4D Tech mailing list
Interesting subject.

Just to make sure I am able to interpret the findings correctly, were
you comparing Find in array with object find, or *binary* find in array on
a sorted array? Binary search is very hard (but certainly not impossible)
to compete with. If you have a sorted array of 1,000,000 elements, it takes
about 20 probes at most to find a value (log2 of 1,000,000 is just under
20), each costing a comparison or two. If you have an unsorted
array of 1,000,000 items then it can take 1,000,000 comparisons to check
for a value.
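
The difference is easy to check by counting probes. This is a minimal Python sketch, not 4D code; the function names are made up here, and it counts how many elements each strategy examines on a sorted list of 1,000,000 integers:

```python
def sequential_probes(items, target):
    """Count elements examined by a left-to-right scan."""
    for probes, item in enumerate(items, start=1):
        if item == target:
            return probes
    return len(items)

def binary_probes(sorted_items, target):
    """Count elements examined by a binary search on a sorted list."""
    lo, hi, probes = 0, len(sorted_items) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return probes

data = list(range(1_000_000))  # already sorted
worst_sequential = sequential_probes(data, 999_999)
worst_binary = max(binary_probes(data, t) for t in (0, 499_999, 999_999))
```

Searching for the last element costs the sequential scan one probe per element, while the binary search never needs more than about log2(n) probes.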

Kind of a big deal.

A naive, sequential Find in array and a smart binary search on a sorted
array are *very* different animals. Conflating the two makes search results
based on one meaningless.

That's why I'm trying to sort out which of these animals you were comparing
with searches on objects. Objects may be using some kind of hash table
which, for sure, ought to beat a sequential find in array. We don't know
how many buckets are in the hash table, but say that it's 4,096. With
1,000,000 keys you cut your initial search space down to roughly 244 values
on average. (This could be 0 values or it could be 1,000 - it depends on the
data and the hashing function.) That gives you a *massive* optimization very
inexpensively. It's still hard to beat a binary search under a normal
distribution of values and searches, but it's still way faster than a
sequential search.
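
The bucket arithmetic can be tried out with a small Python sketch. Everything here is an assumption for illustration: the 4,096 bucket count is the hypothetical figure from the paragraph above, CRC32 is a stand-in hash function, and the keys are synthetic. Nothing here reflects how 4D actually stores object keys:

```python
from collections import Counter
import zlib

BUCKETS = 4096  # hypothetical bucket count; 4D's real value is unknown

def bucket_of(key):
    """Map a text key to a bucket, using CRC32 as a stand-in hash function."""
    return zlib.crc32(key.encode("utf-8")) % BUCKETS

keys = [f"key{i}" for i in range(1_000_000)]
sizes = Counter(bucket_of(k) for k in keys)

average = len(keys) / BUCKETS   # keys per bucket if spread perfectly evenly
largest = max(sizes.values())   # the fullest bucket actually observed
```

A lookup then only has to search one bucket instead of all 1,000,000 keys; how close the fullest bucket stays to the average depends on the data and the hash function.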

Then again, we don't actually know *anything* about the way object searches
work in 4D so anything is possible. 4D won't say anything on the subject
for reasons they will not discuss. I find this completely puzzling, but
there it is.

Re: Arrays vs Object for Key/Value pair lookups

4D Tech mailing list
Take a look at the new Find in sorted array command: http://livedoc.4d.com/4D-Language-Reference-15.4/Arrays/Find-in-sorted-array.301-3274895.en.html. This would change the Find in array (FIA) side of the equation.
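
For readers who have not used a sorted-array search before, here is a rough Python analogue built on the standard `bisect` module. It is not 4D's implementation; `find_in_sorted` and the sample data are invented for illustration. It returns both a found flag and a position, and uses the position to read a parallel array:

```python
from bisect import bisect_left

def find_in_sorted(sorted_keys, value):
    """Return (found, position) for value in an ascending list.

    position is the 0-based index of the match, or the index where the
    value would have to be inserted to keep the list sorted.
    """
    pos = bisect_left(sorted_keys, value)
    found = pos < len(sorted_keys) and sorted_keys[pos] == value
    return found, pos

codes = ["a100", "b200", "c300", "d400"]  # must already be sorted
prices = [1.0, 2.0, 3.0, 4.0]             # parallel array, same order

found, pos = find_in_sorted(codes, "c300")
price = prices[pos] if found else None
```

The precondition matters: if the key array is not sorted, a binary search silently returns wrong answers instead of failing.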

Keith - CDI


Re: Arrays vs Object for Key/Value pair lookups

4D Tech mailing list
Thanks Alexander.

Which style of implementation did you use ? Did you use the old array lookup key as the new object key in the key/value pair ? i.e. did you enumerate the keys like this: ?

======= OLD WAY =======

ARRAY LONGINT(vArrKeysID; 1000)
ARRAY LONGINT(vArrKeysNames; 1000)

$x:=Find in array(vArrKeysID;345)

If ($x>0)
  $0:=vArrKeysNames{$x}
End if

======= NEW WAY =======

C_OBJECT($myOBJECT)

For ($i;1;1000)

  $key:=String($i)
  $value:=$i
  OB SET($myOBJECT;$key;$value)

End for

…then for finding (passing the ID in $1):

$key:=String($1)

$0:=OB Get($myOBJECT;$key)

======================

Is that how you did it ? (i.e. with calculated/hashed keys).

Peter


On 17 Jul 2017, at 13:17, Herr Alexander Heintz via 4D_Tech <[hidden email]> wrote:

> Using objects was MAGNITUDES faster than synchronised arrays


Re: Arrays vs Object for Key/Value pair lookups

4D Tech mailing list
That’s basically it.
Only I do not need the wrapper anymore; I go directly to
word:=OB Get(<>Dict;$t_MyKey;Is text)
Using arrays, I sorted the key array and used my own optimized array query routine (same as the new Find in sorted array introduced in V16).
With objects, there is no need to sort; the object system optimizes it by itself.
There is also no need to calculate keys, as my dictionary table is quite simple:

WordKey
Language
Word

so I queried for the language I needed and then

APPLY TO SELECTION([dict];OB SET(<>Dict;[dict]WordKey;[dict]Word))

ready
set
go

could not conceivably be easier
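
The same load-then-look-up pattern, sketched in Python for readers who want to experiment outside 4D. The rows are hypothetical stand-ins for the dictionary table (WordKey, Language, Word), and `load_dictionary` is a made-up name:

```python
# Hypothetical rows standing in for the dictionary table: (WordKey, Language, Word).
rows = [
    ("greeting", "en", "Hello"),
    ("greeting", "de", "Hallo"),
    ("farewell", "en", "Goodbye"),
    ("farewell", "de", "Tschüss"),
]

def load_dictionary(rows, language):
    """Keep only one language, then key each word by its WordKey."""
    return {key: word for key, lang, word in rows if lang == language}

words = load_dictionary(rows, "de")
word = words.get("greeting")  # direct lookup, no wrapper or sorting needed
```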

cheers

> On 17.07.2017 at 16:45, Peter Jakobsson via 4D_Tech <[hidden email]> wrote:
>
> Thanks Alexander.

Re: Arrays vs Object for Key/Value pair lookups

4D Tech mailing list

On 17 Jul 2017, at 17:03, Herr Alexander Heintz via 4D_Tech <[hidden email]> wrote:

> so I queried for the language I needed and then
> apply to selection([dict];ob set(<>Dict;[dict]WordKey;[dict]Word)

Ah !

So you just ‘hoover up’ into your dictionary object.

Like a hoover ?

Peter


Re: Arrays vs Object for Key/Value pair lookups

4D Tech mailing list
I did a 2014 Summit presentation (5 JSON Tips), which should be available
for download, that demonstrated the benefits of using objects for key/value
pair cache lookups, but in the end it’s pretty easy to demonstrate. The
benefits start to show up with a few hundred keys, but at 100,000 keys it’s
easily 20x faster to look up object keys than to use Find in array in
interpreted mode. And when you compile, it’s literally hundreds of times
faster (400-500x) at 100k keys - and the benefits just get bigger and bigger
with more keys. That’s both for filling the cache and for retrieving values
(objects save you from having to check if a key is already in the array
before adding it).
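
The point about filling the cache can be illustrated with a short Python sketch (function names invented here; a dict plays the role of the 4D object). With parallel arrays every insert has to scan for an existing key first, whereas a keyed container creates or overwrites in one step:

```python
def fill_with_arrays(pairs):
    """Parallel lists: every insert first scans for an existing key."""
    keys, values = [], []
    for key, value in pairs:
        if key in keys:                      # sequential scan on every insert
            values[keys.index(key)] = value  # second scan to locate the slot
        else:
            keys.append(key)
            values.append(value)
    return keys, values

def fill_with_object(pairs):
    """Keyed container: assignment creates or overwrites in one step."""
    cache = {}
    for key, value in pairs:
        cache[key] = value
    return cache

pairs = [("a", 1), ("b", 2), ("a", 3)]
keys, values = fill_with_arrays(pairs)
cache = fill_with_object(pairs)
```

The array version does work proportional to the current size on each insert, so filling n keys costs on the order of n² comparisons; that is one plausible reason the gap widens as the key count grows.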

--
Justin Leavens
[hidden email]   (818) 986-7298 x701
Just In Time Consulting, Inc.
Custom software for unique businesses
http://www.linkedin.com/in/justinleavens


Re: Arrays vs Object for Key/Value pair lookups

4D Tech mailing list
Hello all, I've been interested in this topic for some time but have never
taken the time to run any tests. I don't have the time now (for sure), but
I took some anyway. What I did was grab a huge unique word file, clear out
words that are obviously illegal JSON key names and tried doing lookups
three ways:

* Sequential Find in array
* Binary Find in sorted array
* Object lookup

For reference, here's a link to the original file full of words:
https://github.com/dwyl/english-words/blob/master/words.zip

I tried not to bias the tests as what I'd like are useful results. Still, I
didn't test a whole lot of different ways and bias is nearly impossible to
avoid. Even if my test is totally fair, there's no way that it's complete -
results always depend on the data under test. This is part of why
algorithms are described using big O terminology. You have a way of talking
about the performance envelope around the algorithm under various sorts of
conditions. (That's a terrible description, but so be it.) As an example,
it's very easy to compare a sequential find in array with a binary search.
There are only a couple of cases where sequential is faster, no matter the
size of the array. With 4D's object lookups, we just don't know.  Even if
they are a hash table (likely but not confirmed), this doesn't tell us
much. (Hash tables have a whole lot of components in their implementation,
some of which can behave in weird ways, depending on your data set+hashing
function. It also matters what you use to find actual values, not just hash
bins.)

Anyway, here are a set of results in a compiled system with ~465,000
words/keys:

Words: 466,474
Tests: 10,000
Sequential: 107,777
Binary: 153
Object: 9

The three times are in milliseconds. As in, "Searching for 10,000 different
words in an array of 466,474 unique words took a little over 1/10th of a
second using a sorted array." That's roughly 1/2 of a blink. (Not kidding.)
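
To put the three totals on a per-lookup footing, here is the arithmetic as a small Python snippet. It uses only the figures reported above; the variable names are made up:

```python
tests = 10_000
elapsed_ms = {"sequential": 107_777, "binary": 153, "object": 9}

# Average cost of one lookup, in microseconds.
per_lookup_us = {name: ms * 1000 / tests for name, ms in elapsed_ms.items()}

# How many times faster each approach is than the sequential scan.
speedup_vs_sequential = {
    name: elapsed_ms["sequential"] / ms for name, ms in elapsed_ms.items()
}

binary_vs_object = elapsed_ms["binary"] / elapsed_ms["object"]
```

Per lookup that works out to roughly 10.8 ms for the sequential scan, 15.3 µs for the binary search and 0.9 µs for the object. The object is about 17 times faster than the binary search in relative terms, but the absolute gap is only about 14 µs per lookup.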

Comments and take-aways:

* Binary search is great.

* Object search is great.

* Sequential search is not so great: the 10,000 lookups took about 108
seconds in total (roughly 11 ms each).

* I noticed that setting up the sorted array took no time and that setting
up the object took time that I could feel. I didn't do timing results on
this. But if it's true, the *overall* time (including setup) for the object
was *unfavorable.*

Conclusion: I'll use objects when I need them and sorted arrays when I need
them. The performance difference is too small to be a factor, it will come
down to other properties of these data structures.

++ To Justin on the whole 'use the lookup value as the key' tip (Rob has
mentioned this too). I use that all of the time in objects; it's a really
excellent practice.

If anyone wants to re-run the tests or check my code for logic errors, dumb
errors, bias, etc., here's the code with comments:

If (False)
 // https://github.com/dwyl/english-words/blob/master/words.zip
 // Imported into a new table in 4D.
 // Cleared ones starting with numbers or punctuation.
 // 466,475 words left.
End if

  //----------------------------------------------------------------
  // Setup
  //----------------------------------------------------------------
ALL RECORDS([word])
ORDER BY([word];[word]word) // 4D indexed sort

ARRAY TEXT($words_at;0)
SELECTION TO ARRAY([word]word;$words_at)  // Sorted array of 466K+ words

C_OBJECT($words_object)
C_LONGINT($words_count)
C_LONGINT($word_index)
$words_object:=JSON Parse("{}")
$words_count:=Size of array($words_at)

For ($word_index;1;$words_count)
C_TEXT($word)
$word:=$words_at{$word_index}
 // {"hello":"HELLO"} - no reason for the lower/upper other than to make it read in the Debugger.
OB SET($words_object;Lowercase($word);Uppercase($word))
End for

  // Let's build an array of random words from the main array of words.
C_LONGINT($test_words_count)
$test_words_count:=10000  // Note: Cannot be larger than the $words_count


  // Hmmm. Not getting a good distribution of indexes from Random.
  // Instead, I'll pick words from different positions along the array.

ARRAY TEXT($test_words_at;$test_words_count)
$test_words_at{1}:=$words_at{1}  // Best case for a sequential scan
$test_words_at{2}:=$words_at{$words_count}  // Worst case for a sequential scan

C_LONGINT($interval)
  // Now we want to fill in the rest of the test array.
  // The selected words are grabbed from even intervals along the array.

  // The speed difference for a sequential search should be linear.

  // The speed difference for a binary search should be very small amongst words.
  // It should take up to about ~18 reads to find the word.

  // The speed difference for the object? No clue, we don't know how they're implemented.
  // With a fast hash and a large hash table, it could be very quick. Hard to say.
$interval:=$words_count\$test_words_count

C_LONGINT($test_word_index)
For ($test_word_index;3;$test_words_count)  // Start at 3 because we just filled in 1 & 2 by hand.
C_LONGINT($word_index)
$word_index:=$interval*$test_word_index
$test_words_at{$test_word_index}:=$words_at{$word_index}
End for

  //----------------------------------------------------------------
  // Tests
  //----------------------------------------------------------------
  // And we're good to go.

  //---------------------------
  // Sequential search in array
  //---------------------------
C_LONGINT($sequential_start)
$sequential_start:=Milliseconds

C_LONGINT($test_word_index)
For ($test_word_index;1;$test_words_count)
C_TEXT($word)
C_LONGINT($match_index)
$word:=$test_words_at{$test_word_index}
$match_index:=Find in array($words_at;$word)
End for

C_LONGINT($sequential_elapsed)
$sequential_elapsed:=Milliseconds-$sequential_start

  //---------------------------
  // Binary search in array
  //---------------------------
C_LONGINT($binary_start)
$binary_start:=Milliseconds

C_LONGINT($test_word_index)
For ($test_word_index;1;$test_words_count)
C_TEXT($word)
C_LONGINT($match_index)
C_BOOLEAN($found)
$word:=$test_words_at{$test_word_index}
$found:=Find in sorted array($words_at;$word;>;$match_index)
End for

C_LONGINT($binary_elapsed)
$binary_elapsed:=Milliseconds-$binary_start

  //---------------------------
  // Object lookup
  //---------------------------
C_LONGINT($object_start)
$object_start:=Milliseconds

C_LONGINT($test_word_index)
For ($test_word_index;1;$test_words_count)
C_TEXT($word)
C_TEXT($word_returned)
$word:=$test_words_at{$test_word_index}
$word_returned:=OB Get($words_object;$word)
End for

C_LONGINT($object_elapsed)
$object_elapsed:=Milliseconds-$object_start

  //---------------------------
  // Results summary
  //---------------------------
C_TEXT($tab)
C_TEXT($cr)
$tab:=Char(Tab)
$cr:=Char(Carriage return)

C_TEXT($results_text)
$results_text:=""
$results_text:=$results_text+"Words:"+$tab+String($words_count;"###,###,###,##0")+$cr
$results_text:=$results_text+"Tests:"+$tab+String($test_words_count;"###,###,###,##0")+$cr
$results_text:=$results_text+"Sequential:"+$tab+String($sequential_elapsed;"###,###,###,##0")+$cr
$results_text:=$results_text+"Binary:"+$tab+String($binary_elapsed;"###,###,###,##0")+$cr
$results_text:=$results_text+"Object:"+$tab+String($object_elapsed;"###,###,###,##0")+$cr

SET TEXT TO PASTEBOARD($results_text)

ALERT($results_text)
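
For anyone who wants to reproduce the shape of this test outside 4D, here is a scaled-down Python sketch of the same harness. It is not a port and its timings say nothing about 4D: the keys are synthetic rather than the imported word list, a dict stands in for the object, and all function names are invented. It keeps the same sampling scheme (first word, last word, then words at even intervals):

```python
import time
from bisect import bisect_left

def build_words(count):
    """Synthetic sorted, unique keys; a stand-in for the imported word list."""
    return [f"w{i:07d}" for i in range(count)]

def sample_evenly(words, how_many):
    """First word, last word, then words at even intervals along the list."""
    step = len(words) // how_many
    picks = [words[0], words[-1]]
    picks += [words[step * i - 1] for i in range(3, how_many + 1)]
    return picks

def time_ms(lookup, targets):
    """Run every lookup once; return (number of hits, elapsed milliseconds)."""
    start = time.perf_counter()
    hits = sum(1 for t in targets if lookup(t))
    return hits, (time.perf_counter() - start) * 1000

words = build_words(20_000)  # kept small so the sequential pass stays quick
targets = sample_evenly(words, 200)
as_dict = {w: w.upper() for w in words}

def sequential(t):
    return t in words  # linear scan of the list

def binary(t):
    pos = bisect_left(words, t)
    return pos < len(words) and words[pos] == t

def keyed(t):
    return t in as_dict  # hash lookup

results = {name: time_ms(fn, targets)
           for name, fn in (("sequential", sequential),
                            ("binary", binary),
                            ("object", keyed))}
```

Checking that each strategy reports the same number of hits is a cheap guard against timing a lookup that is quietly failing.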

Re: Arrays vs Object for Key/Value pair lookups

4D Tech mailing list
The "Find" commands accept wildcards and evaluate using collation algorithms (case-insensitive comparison plus some other locale-specific rules).
Is it really fair to compare the two against object keys?
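
The difference in matching rules can be shown with a small Python sketch. It assumes, as the question implies, that object keys are matched exactly while the Find commands compare case-insensitively; `FoldedDict` is a made-up helper, and Python's `casefold` is only a rough stand-in for a real collation:

```python
class FoldedDict:
    """Keyed lookup that ignores case by normalising keys on the way in and out."""

    def __init__(self):
        self._items = {}

    def set(self, key, value):
        self._items[key.casefold()] = value

    def get(self, key, default=None):
        return self._items.get(key.casefold(), default)

exact = {"Hello": 1}  # exact-match keys
folded = FoldedDict()
folded.set("Hello", 1)
```

With exact keys, `exact.get("hello")` finds nothing, while `folded.get("HELLO")` finds the value. A like-for-like comparison therefore needs the keys normalised once at insert time, which costs far less than a collation-aware comparison on every probe.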

> On 2017/07/18 at 9:44, David Adams via 4D_Tech <[hidden email]> wrote:
>
> * Sequential Find in array
> * Binary Find in sorted array
> * Object lookup





Re: Arrays vs Object for Key/Value pair lookups

4D Tech mailing list
> "Find" commands accept wild cards and evaluate using collation algorithms
> (case-insensitive comparison plus some other locale specific rules) is it
> really fair to compare the two against object keys?

I'm not sure what "fair" means here, but it's definitely not an
apples-to-apples comparison. Find in array and Find in sorted array are
variants on the same thing, so they're easy to compare. Object keys? No
idea. As far as I can tell, 4D has declined to offer any information about
how they work that the company will stand behind publicly. That's okay, I'm
used to black box testing and I like it...but it's time-consuming.

The point of these comparisons isn't to figure out if one approach is
"better" than another so much as how they work *in the real world.* Put
another way, the goal is to come up with some rules of thumb about what to
use when. Binary search kicks ass, and I know why. Object key lookups kick
ass, and I don't know why. My take-away is to

* Use objects when they're easier or more appropriate for the problem
at hand.

* Use sorted arrays when they're easier or more appropriate for the
problem at hand.

* Don't shift from arrays to objects based on a notion that they're
"faster."

* Consider objects instead of arrays if you don't have or can't be sure of
a sorted array order because object key lookups are way faster than
sequential array traversal (Find in array.)

* Don't worry about speed at all unless you've got a solid reason to.

Thinking back on my tests, a few points for anyone that wants to tweak them:

-- If you want sequential searches to look better, just search for the
first items. Search time should be directly related to the position of the
target in the array. I avoided this trap on purpose.

-- I used very small text values for lookups and keys! Long strings might
behave differently, I don't know. I would actually find that an interesting
result, if anyone feels like checking.

-- The object keys are inserted into the test object in sorted order. This
should not make any difference if there's a hash underneath, but we don't
actually know that. Although it does seem likely. From the few results I've
gotten, I'd wildly guess that there's:

-- An excellent hash function where "excellent" means "low collision, high
dispersal and fast."

-- A secondary structure off the hash bins that is itself smart. So, not a
linked list (The CS 101 approach), but a second good hash or a tree of some
kind. Or something else.

-- A pretty large range of hash slots to reduce secondary lookup times.

-- Probably some smart scheme for changing the hash table size dynamically
under stress. That's an expensive maneuver (or normally is, I can think of
ways to make it not too expensive.)

Just speculating, I'm probably wrong in every detail here. Doesn't matter.
It's a black box.
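
For readers unfamiliar with the moving parts being guessed at, here is a deliberately simple textbook hash table in Python. It is purely illustrative and says nothing about how 4D implements objects. It uses plain list buckets (the "CS 101" approach mentioned above, not a smarter secondary structure) and doubles its bucket count when the load gets high:

```python
class ChainedTable:
    """Textbook hash table: list-based buckets, doubled when the load gets high."""

    def __init__(self, slots=8, max_load=0.75):
        self._buckets = [[] for _ in range(slots)]
        self._count = 0
        self._max_load = max_load

    def _bucket(self, key):
        return self._buckets[hash(key) % len(self._buckets)]

    def set(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))
        self._count += 1
        if self._count > self._max_load * len(self._buckets):
            self._resize()

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default

    def _resize(self):
        """The expensive step: rehash every entry into twice as many buckets."""
        old = [pair for bucket in self._buckets for pair in bucket]
        self._buckets = [[] for _ in range(2 * len(self._buckets))]
        for k, v in old:
            self._buckets[hash(k) % len(self._buckets)].append((k, v))

    def slots(self):
        return len(self._buckets)

table = ChainedTable()
for i in range(1000):
    table.set(f"key{i}", i)
```

Resizing rehashes every stored entry, which is why growing the table is the costly maneuver, and why a setup phase that inserts hundreds of thousands of keys one at a time can take noticeable time even when lookups afterwards are fast.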

Re: Arrays vs Object for Key/Value pair lookups

4D Tech mailing list
Case-sensitive comparison support in Find in array/Find in sorted array is long overdue. Even better would be support in the string comparison operators (or new operators for strings):

$true:=($string1=*$string2)
$true:=($string1>*$string2)
$true:=($string1<*$string2)

Alan Chan

4D iNug Technical <[hidden email]> writes:
>the "Find" commands accept wild cards and evaluate using collation algorithms (case-insensitive comparison plus some other locale specific rules)
>is it really fair to compare the two against object keys?


Use of Objects vs Global Variables (Was 'Arrays vs Objects...)

4D Tech mailing list

This is just a brief commentary on my experience with objects after about a year of using them intensively. I now use them extensively for certain things but have ditched them for others where I had initially thought they may have a role.

But one of the reasons I don’t use 4D objects too widely is because I find 4D global variables to really represent the skeleton of an application. I don’t mean that in a functional sense, but in an architectural sense, because I see 4GLs as tending to support a “top down” implementation and 3GLs as more suited to “bottom up” implementations. “Top down” being where properties & logic from systems analysis manifest themselves explicitly and detectably in the code, whereas “bottom up” means you’re just developing a load of tools to do…whatever.

If you use objects to represent major business logic properties that persist across methods, I’ve found you can end up with a whole load of extra work to do at the architecture level AND at the code level. For example:

 • one is constantly stuffing and unstuffing all kinds of “property bags” in order to manipulate the properties, to the point that the business logic gets drowned in a morass of utility code
 • much of the ‘formal verification’ role of the compiler is lost, since objects obfuscate the identity of high level business logic properties that are otherwise completely unambiguous when they manifest as globals
 • refactoring is an order of magnitude more of a pain, since the IDE is almost oblivious to property naming collisions and because we now have two levels of hierarchy to deal with instead of one (object and property)

The original thread on “Arrays vs Object for Key/Value pair lookups” is a good case in point. Yes, there is a huge performance boost in using objects (I hold my hands up - I’m doing it ! LoL), but I’m retaining the use of ‘parallel arrays’ for much of the code simply because they’re so nice and trackable everywhere. As David Adams says, the performance gain is huge but has to be weighed against other priorities.

The area where I find objects of most benefit is in transport. For example, you need to populate a field with some complex hierarchical data that gets passed to another table in a trigger or something like that. Here it really does something which 4D had very little support for before objects arrived.

Just my 2c on objects so far !

Regards

Peter

Re: Use of Objects vs Global Variables (Was 'Arrays vs Objects...)

4D Tech mailing list
> The original thread on “Arrays vs Object for Key/Value pair lookups” is a good case in point. Yes there is a huge
> performance boost in using objects (I hold my hands up - I’m doing it ! LoL), but I’m retaining the use of ‘parallel arrays’
> for much of the code simply because they’re so nice and trackable everywhere. As David Adams says, the performance
> gain is huge but has to be weighed against other priorities.

Hey Peter, my point was arguably the opposite - the performance gains for
using an object instead of a sorted array are *trivial* (best case) and
negative (worst case.) Object setup/teardown is more time-consuming and a
binary search is fast and remains fast even with billions of items. (In
theory.)

Unsorted array searches are just slower by nature...they're sequential, so
the pain only gets worse with larger arrays and when you're searching for
elements towards the end of the array.

I'm very pleased that 4D now has a native binary search on arrays, and I'm
happy with how it's implemented. It's perfect for everything I needed it
for. I needed such a thing earlier this year, rewrote an old Dave Terry tech
note, posted the code here and...someone quickly said, "Urrrr, have you
seen the Language Reference lately?"

If anyone out there doesn't understand why binary searches are faster (so
much faster) than standard sequential scans, and why they depend on sorted
values, it's easy to understand. Google it and you should be able to figure
it out in a few minutes.

And, lest anyone forget, you can use binary search logic on sorted
selections with GOTO SELECTED RECORD. Why not?
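
The idea carries over to anything that can be read by position. Here is a minimal Python sketch in which a `read(position)` callback stands in for GOTO SELECTED RECORD plus reading the sorted field. The function name and the sample data are invented for illustration. Once it lands on a match, it widens outwards to cover the whole run of equal values:

```python
def find_matching_range(count, read, value):
    """Binary search over positions 1..count using a read(position) callback.

    Returns (first, last) positions of the run of matching values, or None
    when the value is not present. Positions are 1-based, like a selection.
    """
    lo, hi = 1, count
    while lo <= hi:
        mid = (lo + hi) // 2
        current = read(mid)
        if current == value:
            first = last = mid
            while first > 1 and read(first - 1) == value:
                first -= 1  # widen towards the start of the run
            while last < count and read(last + 1) == value:
                last += 1   # widen towards the end of the run
            return first, last
        if current < value:
            lo = mid + 1
        else:
            hi = mid - 1
    return None

names = ["apple", "banana", "cherry", "cherry", "cherry", "damson"]  # sorted

def read(position):
    return names[position - 1]
```

Tracking explicit low and high bounds is what makes each step halve the remaining range; the boundary checks in the widening loops keep the search from reading position 0 or past the end.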

Re: Use of Objects vs Global Variables (Was 'Arrays vs Objects...)

4D Tech mailing list
Long ago - and I am pretty sure the code has been lost to time -
in v2.2.3 (I think) I wrote code to do binary searching on a selection,
as at that time Search Selection (the old command name) was sequential,
but sorting the selection was indexed.

Even with the overhead of managing the pointers, and stepping through
records to find all matching records in the selection, as memory
serves, it was faster (interpreted) to use the binary search once the
selection reached about 6-8 records.

For those who care, here is an outline of a binary search (on records in
a selection). The following was written off the top of my head, in the
email editor, BUT it gets to the basics of a binary search on a selection
in 4D.

- get size of selection ($Selection_Size).
- make sure it is sorted on the field you want to search on.

  // Assumes the current selection of [table] is sorted ascending on
  // [table]field and that Value holds the value being searched for.
CREATE EMPTY SET([table];"Matching")
$Selection_Size:=Records in selection([table])
$Low:=1
$High:=$Selection_Size
$Found:=False

  // Binary search: halve the range [$Low;$High] until a match is found
While (($Low<=$High) & Not($Found))
   $Current_record:=($Low+$High)\2
   GOTO SELECTED RECORD([table];$Current_record)
   Case of
      : ([table]field=Value)  // found it
         $Found:=True
      : ([table]field>Value)  // current record value > search value
         $High:=$Current_record-1
      Else  // current record value < search value
         $Low:=$Current_record+1
   End case
End while

If ($Found)
   ADD TO SET([table];"Matching")

     // step backwards to collect matching records before the hit
   $Previous:=$Current_record-1
   While ($Previous>=1)
      GOTO SELECTED RECORD([table];$Previous)
      If ([table]field=Value)
         ADD TO SET([table];"Matching")
         $Previous:=$Previous-1
      Else
         $Previous:=0  // first non-match: stop
      End if
   End while

     // step forwards to collect matching records after the hit
   $Next:=$Current_record+1
   While ($Next<=$Selection_Size)
      GOTO SELECTED RECORD([table];$Next)
      If ([table]field=Value)
         ADD TO SET([table];"Matching")
         $Next:=$Next+1
      Else
         $Next:=$Selection_Size+1  // first non-match: stop
      End if
   End while
End if

At this point the set "Matching" contains all records matching the
criteria, or nothing.
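
The same idea can be sketched outside 4D. The Python example below is a variation rather than a translation of the outline above: instead of stepping record by record away from the first hit, it runs two binary searches (the standard library's `bisect_left` and `bisect_right`) to find both ends of the run of matches, which stays fast even when many records match. The list `field_values` and the function name `matching_positions` are hypothetical stand-ins for a sorted selection.

```python
from bisect import bisect_left, bisect_right


def matching_positions(sorted_values, value):
    """Return the 1-based positions of every element equal to value.

    sorted_values must be sorted ascending, like a sorted selection.
    """
    first = bisect_left(sorted_values, value)    # index of first match
    last = bisect_right(sorted_values, value)    # index just past last match
    return list(range(first + 1, last + 1))      # empty when nothing matches


# Hypothetical sorted field values standing in for a sorted selection
field_values = ["apple", "kiwi", "kiwi", "kiwi", "pear", "plum"]
print(matching_positions(field_values, "kiwi"))  # [2, 3, 4]
print(matching_positions(field_values, "fig"))   # []
```

Positions are reported 1-based to mirror how selected records are numbered in the outline above.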

On Tue, 18 Jul 2017 22:16:18 +1000, David Adams via 4D_Tech wrote:
>
> And, lest anyone forget, you can use binary search logic on sorted
> selections with GOTO SELECTED RECORD. Why not?
---------------
Gas is for washing parts
Alcohol is for drinkin'
Nitromethane is for racing

Re: Arrays vs Object for Key/Value pair lookups

4D Tech mailing list

[JPR]

Hi Guys,

What I explained was how to use objects to get Associative Arrays in 4D. Associative Arrays are widely used in other languages like PHP or JavaScript.

In computer science, an Associative Array is an abstract data type composed of a collection of (key, value) pairs, such that each possible key appears at most once in the collection. In 4D, the JSON-type Objects are perfect for Associative Arrays.

In many cases, you will have 2 parallel arrays, let's say one for the product code ($arCodes), and one for the product name ($arNames). You want to find the name for a specific code (classic 4D way):

- You fill the arrays in a loop, adding pairs of elements $myCode and $myName:

$k:=Find in array($arCodes;$myCode)
If ($k>0)
        $arNames{$k}:=$myName
Else
        APPEND TO ARRAY($arCodes;$myCode)
        APPEND TO ARRAY($arNames;$myName)
End if

- Then you can find a name from a code:

$k:=Find in array($arCodes;$myCode)
If ($k>0)
        $myName:=$arNames{$k}
Else
        $myName:=""
End if

Now if you use an Object:

C_OBJECT($myArray)

- You fill the object in a loop:

OB SET($myArray;$myCode;$myName)

- and you find a name from a code with:

$myName:=OB Get($myArray;$myCode)

...much simpler, and much faster! Why is it faster? With a classic array, the Find in array command has to scan the array, element by element, until the correct element is found. (The exception is a sorted array, but sometimes you can't sort the array because the index of an element can be meaningful for your method.)

In the case of an object, the properties are 'indexed' by an internal hash table, so access to one particular property doesn't need a sequential scan of the list of values; it is almost direct. I confirm what Justin says: the bigger the array, the more efficient associative arrays are compared with a classic array scan.
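
The contrast described above can be sketched in Python, where a `dict` plays the role of the 4D object. This is only an illustration under that assumption: the helper names and sample product codes are made up, and it says nothing about how 4D implements its objects internally.

```python
# Parallel lists: the classic approach, with a sequential scan per access.
codes = []
names = []


def set_name_lists(code, name):
    """Update the name for code, or append a new pair if code is unknown."""
    if code in codes:                      # sequential scan
        names[codes.index(code)] = name    # second scan to find the position
    else:
        codes.append(code)
        names.append(name)


def get_name_lists(code):
    """Return the name for code, or "" when the code is absent."""
    return names[codes.index(code)] if code in codes else ""


# Associative array: the dict hashes the key and goes (almost) straight
# to the value, with no existence check needed before storing.
names_by_code = {}


def set_name_dict(code, name):
    names_by_code[code] = name


def get_name_dict(code):
    return names_by_code.get(code, "")


for code, name in [("A1", "Widget"), ("B2", "Gadget"), ("A1", "Widget v2")]:
    set_name_lists(code, name)
    set_name_dict(code, name)

print(get_name_lists("A1"), "|", get_name_dict("A1"))
print(repr(get_name_lists("ZZ")), "|", repr(get_name_dict("ZZ")))
```

Both versions return the same answers; the difference is that the list version rescans from the start on every call, so its cost grows with the number of pairs, while the dict version does not.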

My very best,

JPR



> Message: 7
> Date: Mon, 17 Jul 2017 11:43:12 -0700
> From: Justin Leavens <[hidden email]>
> To: 4D iNug Technical <[hidden email]>
> Subject: Re: Arrays vs Object for Key/Value pair lookups
>
> I did a 2014 Summit presentation (5 JSON Tips) which should be available
> for download that demonstrated the benefits of using objects for key/value
> pair cache lookups, but in the end it’s pretty easy to demonstrate. The
> benefits start to show up with a few hundred keys, but at 100,000 it’s
> easily 20x faster looking up object keys as opposed to find in array in
> interpreted. And when you compile, it’s literally hundred of times faster
> (400-500x) at 100k keys - and the benefits just get bigger and bigger with
> more keys. That’s both for filling the cache and for retrieving values
> (objects save you from having to check if a key is already in the array
> before adding it).
>
> --
> Justin Leavens
> [hidden email]   (818) 986-7298 x <//(818) 986-7298 x301>701
> Just In Time Consulting, Inc.
> Custom software for unique businesses
> http://www.linkedin.com/in/justinleavens
>


Re: Arrays vs Object for Key/Value pair lookups

4D Tech mailing list

On 20 Jul 2017, at 11:39, JPR via 4D_Tech <[hidden email]> wrote:

> In case of an object, the properties are 'indexed' by using an internal Hash table, so the access to one particular Property doesn't need a sequential parsing of the list of values, but an almost direct access. I confirm what Justin says, that is to say that the bigger will be the array, the more efficient will be associative arrays compared with classic parsing of arrays.


Many thanks for refreshing your advice JPR !

Very useful.

Regards

Peter


Resizing Window in Code keeping Background Image centered

4D Tech mailing list
On Mac. 4D v16

There is an image that can be associated with menu bars and initially it displays in the center of the window. (This image is called the background image in the 4D Tool Box)

If you resize the window by dragging on the sides or corner, that image keeps itself centered.

But if I resize the window in code  (SET WINDOW RECT) the window does resize itself as requested but the image does not keep itself centered.

The only way that I can get it back in the center is to change the Menu Bar and then return to the original Menu Bar and the picture will be recentered.


Is it not intended to use SET WINDOW RECT on this window?

Is there some trick to resize this window in code and have the background image stay centered?

Thanks in advance.




Re: Resizing Window in Code keeping Background Image centered

4D Tech mailing list
SET WINDOW RECT is not the command to resize the window,
it is (since v2004) a command to resize the form.

http://doc.4d.com/4Dv15/4D/15.4/RESIZE-FORM-WINDOW.301-3274595.en.html

but since RESIZE FORM WINDOW does not take the window reference as an argument,
you would need to use CALL FORM in order to apply it to the splash window.

On 2017/07/21 at 8:22, Robert Livingston via 4D_Tech <[hidden email]> wrote:
> Is it not intended to use SET WINDOW RECT on this window?




Re: Resizing Window in Code keeping Background Image centered

4D Tech mailing list
I noticed after sending the message that the sentence quoted below makes no sense.

But the main point stands: you would want to use the other command
(RESIZE FORM WINDOW) if you want the resize to work as if done manually.

> On 2017/07/21 at 9:26, miyako <[hidden email]> wrote:
> SET WINDOW RECT is not the command to resize the window,
> it is (since v2004) a command to resize the form.



