discuss@lists.openscad.org

OpenSCAD general discussion Mailing-list

Digging into search( )

clothbot
Mon, Apr 20, 2015 12:26 PM

It's aliiiiive!

First attempt at search simplification (passes regressions) is here:

https://github.com/openscad/openscad/pull/1318

See the "Files Changed" report for how I've simplified the usage:

https://github.com/openscad/openscad/pull/1318/files

I updated the example023.scad to wrap the built-in search() in a
user-defined search_vector_one() function to take advantage of simple
[for(i=...)] list building:

function search_vector_one(vec,table,col=0) = [for(i=[0:len(vec)-1])
search(vec[i],table,col)[0]];

https://github.com/clothbot/openscad/blob/search_simplify/examples/Old/example023.scad

I used the same search_vector_one() function in text-search-test.scad to
make it "just work":

https://github.com/clothbot/openscad/blob/search_simplify/testdata/scad/2D/features/text-search-test.scad

The two test files, "search-tests-unicode.scad" and "search-tests.scad", have
been significantly modified to reflect the simplified search behaviour.

https://github.com/clothbot/openscad/blob/search_simplify/testdata/scad/misc/search-tests-unicode.scad

https://github.com/clothbot/openscad/blob/search_simplify/testdata/scad/misc/search-tests.scad

As outlined in the comment here:

https://github.com/clothbot/openscad/blob/search_simplify/src/func.cc#L667

--snip--

Pattern:
"search" "(" match_value  "," string_or_vector_or_table
("," index_col_num )?
")";
match_value : ( Value::NUMBER | Value::STRING );
string_or_vector_or_table : ( Value::STRING | "[" Value ("," Value)* "]" |
"[" ("[" Value ("," Value)* "]")+ "]" );
index_col_num : int;

--end-snip--

  • A string 'match_value' searches for full-string matches.
    • It does not iterate over each character in the string and return a
      list of matches per character any more.
  • All matches are returned every time.
    • No more 'num_returns_per_match' parameter.
    • Use user-defined functions like the search_vector_one() example above
      to massage search results to your liking.
  • The no-matches condition returns 'undef' instead of an empty vector '[]'.
    • Conditional expressions based on no-search-results will work now.
  • Assigning any vector to 'match_value' throws a WARNING and returns 'undef'.
    • I started trying to get smart and 'collapse vectors of length=1' for
      backward compatibility but... no. Better to rip this bandaid off clean.
    • Perhaps a future enhancement could support vector-type match_value for
      things like searching for points... That could be handy for processing
      polygon() and polyhedron() point sets.
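
To illustrate the 'undef' versus '[]' point, here is a minimal sketch of a wrapper that treats both no-match conventions the same way, so calling code does not depend on which one search() uses. The name no_match() and the three stand-in values are hypothetical; they are not part of OpenSCAD or of the pull request.

```openscad
// Hypothetical helper (not part of OpenSCAD or the pull request):
// true when a search() result means "nothing found", under either
// the proposed convention (undef) or the empty-vector convention ([]).
function no_match(r) = r == undef ? true : len(r) == 0;

// Hypothetical stand-ins for search() results.
proposed_miss = undef;  // proposed "no matches" value
empty_miss    = [];     // empty-vector "no matches" value
hit           = [2];    // a result with one matching index

echo(no_match(proposed_miss), no_match(empty_miss), no_match(hit));
```

The ternary keeps len() from being called on undef, so the helper should not raise a warning for the proposed convention.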

Thoughts? Comments?

Speak now or fix it yourself.?. ;-)

Andrew.

--
View this message in context: http://forum.openscad.org/Digging-into-search-tp12421p12442.html
Sent from the OpenSCAD mailing list archive at Nabble.com.

runsun
Mon, Apr 20, 2015 2:08 PM

Wow, Andrew, that was quick !!

Without going over the links in detail, here is my quick view:

It looks great. The removal of iteration over match_value and the
num_returns_per_match is very significant.

One note:

match_value doesn't have to exclude lists. You just treat it as a value and
don't iterate over it. This way, it can be used to search points like you
wish.

In fact, since it is "a value", there's no need to enforce any type
constraint on match_value. It could be anything, even a boolean or undef.
Thus, there is no need for the warning. It wouldn't be too hard to
check why a search doesn't return the expected indices.

Certainly, if match_value=vector is allowed,  we have to think about how to
deal with this:

search(  ["abc",1],  [ ["abc",1], [ ["abc",1],2 ], ["ghi",3] ...]  )

Will it give [0] ? [1]? [0,1] ?

This can be controlled by index_col_num, for example,

index_col_num = 0 ==> [1]  (match the column #0 )
index_col_num = -1 ==> [0] (means, no selection of column, so match the
whole item, in this case, ["abc",1], and return [0] )
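
The column rule suggested here can be sketched as a user-defined function, assuming an OpenSCAD version with list comprehensions (2015.03 or later). match_items() and its col parameter are hypothetical names for illustration; this is not how the built-in search() behaves.

```openscad
// Hypothetical user-defined function, not the built-in search():
// returns the indices of all items in 'table' that equal 'value'.
//   col >= 0 : compare 'value' against that column of each item
//   col = -1 : compare 'value' against the whole item
function match_items(value, table, col = -1) =
    [for (i = [0 : 1 : len(table) - 1])
        if ((col < 0 ? table[i] : table[i][col]) == value) i];

table = [ ["abc", 1], [ ["abc", 1], 2 ], ["ghi", 3] ];

echo(match_items(["abc", 1], table, 0));   // compares column 0 of each item
echo(match_items(["abc", 1], table, -1));  // compares whole items
```

With this table the two calls give [1] and [0] respectively, matching the outcomes described for index_col_num = 0 and index_col_num = -1.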

Lastly, a side note:

Since search( ) now seems to accept a flat list (which I believe it was not
originally designed for), what it does is return the index of each match:

search( "def", ["abc",1,"def",2,"ghi",3] )

That is one step short of serving as a hash-like feature, because this
will fail:

search( "def", ["abc","def","def",2,"ghi",3] )

It returns 1, the index of the value paired with key "abc", rather than 2,
the index of the key "def".

That is, unless a new argument, every, is introduced: every=1 (the default)
checks each item, while every=2 checks every other item, which allows key
search in a list of key-value pairs. Whether to add it depends on how
important you feel this "key-value pairs" use is, and whether search() wants
to play that role.
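
As a rough sketch of what such an every argument would do, the stride can be written as a user-defined function over a flat list. search_every() is a hypothetical name, and every is not a parameter of the built-in search(); the sketch assumes list comprehensions are available (2015.03 or later).

```openscad
// Hypothetical user-defined function modelling the suggested 'every'
// argument. Only items at indices 0, every, 2*every, ... are compared
// with 'key'; the indices of the matching items are returned.
function search_every(key, list, every = 1) =
    [for (i = [0 : every : len(list) - 1]) if (list[i] == key) i];

pairs = ["abc", "def", "def", 2, "ghi", 3];

echo(search_every("def", pairs));     // every item is a candidate
echo(search_every("def", pairs, 2));  // only keys (items 0, 2, 4) are candidates
```

With every=2 the value "def" at index 1 is skipped, so only the key at index 2 is reported.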

BTW: I have a whole set of test cases for search(). Once it is merged into
the nightly, I can try them out.

clothbot wrote

  • A string 'match_value' searches for full-string matches.

    • It does not iterate over each character in the string and return a
      list of matches per character any more.
  • All matches are returned every time

    • no more 'num_returns_per_match' parameter.
    • use user-defined functions like the above search_vector_one() example
      to massage search results to your liking.
  • the no-matches condition returns 'undef' instead of an empty vector '[]'

    • conditional expressions based on no-search-results will work now.
  • Assigning any vector to 'match_value' throws a WARNING and returns
    'undef'

    • I started trying to get smart and 'collapse vectors of length=1' for
      backward compatibility but... no. Better to rip this bandaid off clean.
    • Perhaps a future enhancement could support vector-type match_value for
      things like searching for points... That could be handy for processing
      polygon() and polyhedron() point sets.

Thoughts? Comments?

Speak now or fix it yourself.?. ;-)

Andrew.


$  Runsun Pan, PhD

$ -- OpenScad_DocTest: doc and unit test ( Github , Thingiverse  )

$ -- hash parameter model: here , here

$ -- Linux Mint 17.1 Rebecca x64  + OpenSCAD 2015.03.15/2015.04.01.nightly

--
View this message in context: http://forum.openscad.org/Digging-into-search-tp12421p12443.html

clothbot
Mon, Apr 20, 2015 3:38 PM

Hi Runsun,

Very briefly, vector and string 'index' counting starts at 0, not 1.

list1= ["abc",1,"def",2,"ghi",3]
search( "def", list1 )

...will return '[2]' because list1[2]=="def"; list1[0]=="abc"

list2=["abc","def","def",2,"ghi",3]
search( "def", list2 )

...will now return '[1,2]' because list2[1]=="def" and list2[2]=="def";
list2[0]=="abc"

In my simplified search, all matches are always returned.  It is now up to
the user to decide how many/few to filter off and by what
mechanism/algorithm.
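
One possible filter of this kind, as a minimal sketch: a helper that keeps at most n of the returned indices. first_n() is a hypothetical name, not part of OpenSCAD; it also accepts 'undef' so that it works with the proposed no-match value.

```openscad
// Hypothetical helper: keep at most n entries of a search() result.
// Both 'undef' and '[]' are treated as "no matches" and yield [].
function first_n(r, n = 1) =
    r == undef ? [] : [for (i = [0 : 1 : min(n, len(r)) - 1]) r[i]];

all_matches = [1, 2];  // the result described above for list2

echo(first_n(all_matches));     // keeps only the first match
echo(first_n(all_matches, 5));  // fewer than 5 available, keeps both
echo(first_n(undef));           // no matches
```

The explicit step of 1 in the range keeps the loop empty when there is nothing to copy.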

I think that yes, I'll eventually add list/vector support to match_value,
however it will be considerably more involved to implement than the simple
'atomic' data structures.

Supporting search for an N-dimensional vector match could be fun+useful:

  • add 'tol[erance]' parameter to allow for 'close enough' floating point
    'distance' matches.
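
A tolerance-based point match of this kind can already be sketched as a user-defined function, assuming norm() and list comprehensions are available. search_point() and tol are hypothetical names for illustration, not parameters of the built-in search().

```openscad
// Hypothetical user-defined function, not part of OpenSCAD:
// indices of all points in 'pts' whose Euclidean distance from 'p'
// is at most 'tol'. With tol = 0 only exact matches are returned.
function search_point(p, pts, tol = 0) =
    [for (i = [0 : 1 : len(pts) - 1]) if (norm(pts[i] - p) <= tol) i];

pts = [ [0, 0, 0], [1, 0, 0.0005], [1, 0, 0] ];

echo(search_point([1, 0, 0], pts));         // exact matches only
echo(search_point([1, 0, 0], pts, 0.001));  // "close enough" matches
```

This is a linear scan over the point list, which is probably fine for small polygon() or polyhedron() point sets.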

Picking my battles. :-)

Andrew.

--
View this message in context: http://forum.openscad.org/Digging-into-search-tp12421p12444.html

runsun
Mon, Apr 20, 2015 4:23 PM

Ok. I think I've bugged you enough. Whatever you decide, I think it's in a
good direction :) :)


--
View this message in context: http://forum.openscad.org/Digging-into-search-tp12421p12445.html

MichaelAtOz
Mon, Apr 20, 2015 10:15 PM

I think you missed the point @runsun.

"Very briefly, vector and string 'index' counting starts at 0, not 1. "

This explains why you had so much difficulty understanding it.

search("a", [ "d", "c", "b", "a" ]); // returns [3]
               0    1    2    3
__


Unless specifically shown otherwise above, my contribution is in the Public Domain; To the extent possible under law, I have waived all copyright and related or neighbouring rights to this work. This work is published globally via the internet. :) Inclusion of works of previous authors is not included in the above.

The TPP is no simple “trade agreement.”  Fight it! http://www.ourfairdeal.org/

View this message in context: http://forum.openscad.org/Digging-into-search-tp12421p12451.html

runsun
Mon, Apr 20, 2015 10:49 PM

@Michael, I didn't explain the context in much detail; I guess that's why
you (and Andrew) misunderstood.

The reason I argue that

 search( "def", [ "abc","def","def",1 ] ) 

will cause confusion is that this whole discussion stems from a discussion
in another thread about hash parameter mapping. That includes the example
you gave using lookup.

Then search() was mentioned. search() was not designed for this type of
key-value mapping, but I tried to use it that way, by applying search() to a
flat list as above. Note that all the examples in the docs for search() run
either against strings:

search( ... "abcdef")

or against a list of vectors:

search( ... [["abc",1],["def",2] ...])

but never against a flat list.

So if search() is to be used for key-value mapping against a flat list,
as I would like, it has to be able to match against every other item.
That is, in

search( "def", ["abc", "def", "def", 1] )

"def" has to be compared with item 0, that is "abc", skip item 1 (which is
the value associated with item 0), and then be compared with item 2, that is
"def".

In this case, it should have returned [2]. Or, if returning all, [1,2].

But, like I said, search() can't do that, and when set to return only one
item, it will return [1], but not [2], which is not what I want. So using
search() in a key-value mapping manner will fail.

This can be solved if search() can match every other item (see my previous
post).

But I understand that this is probably just my way of using it, so I leave
the decision to Andrew. It would probably make it too complicated, anyway.

So this is not a problem of mistaking the index base. I guess I was just
too lazy to explain the entire context. :( :(

MichaelAtOz wrote

I think you missed the point @runsun.

"Very briefly, vector and string 'index' counting starts at 0, not 1. "

This explains why you had so much difficulty understanding it.

search("a", [ "d", "c", "b", "a" ]); // returns [3]
               0    1    2    3
__


--
View this message in context: http://forum.openscad.org/Digging-into-search-tp12421p12452.html
