[OpenSCAD] Digging into search( )

Andrew Plumb andrew at plumb.org
Sun Apr 19 00:47:55 EDT 2015


Hi Runsun,

Let me lead by saying thank you for taking the time to collect and share your thoughts and observations. It is very much appreciated!

A few comments/history behind my original writing of search():

When I wrote it (in 2012):

	1. ‘undef’ didn’t exist as a return option (search() predates the Value rewrite of the code-base), so I settled on returning empty lists, which could be detected (list of length 0) as ‘no match’ conditions.

	2. The ‘concat’ list construction operator didn’t exist; I needed a way to search for a string-of-characters (aka. an ordered list of character values) and get the results back in order, as a list.

	3. The ‘let’ operator didn’t exist.

	4. Lists were statically defined; [ for() … ] dynamically generated lists weren’t possible.

	5. Function recursion was (and still is to some degree) dog-slow for more ‘elegant’ list construction approaches.

	6. The text() module didn’t exist.
		- See example023.scad combined with MCAD/fonts.scad for insight into how I was generating text, and the original motivation behind coding up search().

	7. The no-match warnings are gone as of last week; you’ll have to build from source to see that.

It’s not buggy, just written within the constraints of the time. ;-)

All that said, I agree that now would be a good time to simplify+rewrite!


Rough outline of hypothetical simplified behaviour I’ll start looking at implementing:

1. search( substring, string):
	- return list of substring match indices

Example 1:

	string1="abcdabcabcdd";
	search("abc",string1);
		[0,4,7]
	search("efg",string1);
		undef
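
Until something like this exists in the built-in, the Example 1 results can be approximated in user space. This is only a rough sketch: matches_at() and substring_indices() are made-up helper names, not built-ins, and it assumes a build with let() and list comprehensions (2015.03 or later). It also assumes overlapping matches should be reported, which the outline above doesn't specify.

```openscad
// Hypothetical helper (not a built-in): true when `sub` occurs in
// `haystack` starting at index i. Compares one character per recursion step.
function matches_at(sub, haystack, i, k = 0) =
    k >= len(sub)               ? true
  : sub[k] != haystack[i + k]   ? false
  : matches_at(sub, haystack, i, k + 1);

// Hypothetical stand-in for the proposed search(substring, string):
// list of start indices, or undef when nothing matches.
function substring_indices(sub, haystack) =
    (len(sub) == 0 || len(sub) > len(haystack)) ? undef
  : let(hits = [ for (i = [0 : len(haystack) - len(sub)])
                   if (matches_at(sub, haystack, i)) i ])
    len(hits) > 0 ? hits : undef;

string1 = "abcdabcabcdd";
echo(substring_indices("abc", string1));  // [0, 4, 7]
echo(substring_indices("efg", string1));  // undef
```

The recursion here is only as deep as the substring is long, so the recursion-speed concern from point 5 above shouldn't bite for short search terms.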

2. search( fullstring, vector_of_strings):
	- return list of indices (set of ‘i’ values) where fullstring == vector_of_strings[i]
	- do not attempt substring matches since [ for() …] list traversal and construction works

Example 2:

	list2=["caterpillar",3,"cat",2,"dog",2,"cattle",5,"cod",42];
	search("cat", list2);
		[2]
	search(2,list2);
		[3,5]
	search("bird",list2);
		undef
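
For comparison, the same whole-value matching can be written today as a one-line list comprehension. A minimal sketch, assuming a non-empty list; exact_indices() is a hypothetical name used only for illustration:

```openscad
// Hypothetical stand-in for the proposed search(fullstring, vector_of_strings):
// indices i where list[i] == value, or undef when there are none.
// Assumes `list` is non-empty.
function exact_indices(value, list) =
    let(hits = [ for (i = [0 : len(list) - 1]) if (list[i] == value) i ])
    len(hits) > 0 ? hits : undef;

list2 = ["caterpillar", 3, "cat", 2, "dog", 2, "cattle", 5, "cod", 42];
echo(exact_indices("cat", list2));   // [2]
echo(exact_indices(2, list2));       // [3, 5]
echo(exact_indices("bird", list2));  // undef
```

Because == compares whole values, "cat" does not match "caterpillar" or "cattle", which is the point of dropping substring matching in this case.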

3. search( match_value, vector_of_vectors [, index_col_num] ):
	- return list of indices (set of ‘i’ values) where match_value == vector[i][index_col_num]
	- this simplification should make it even more powerful+useful for hash-style table lookup operations

Example 3:

	table3 =[ ["caterpillar",3],["cat",2],["dog",2],["cattle",5],["cod",42]];
	search("cat", table3);
		[1]
	search(2,table3,1);
		[1,2]
	search("bird",table3);
		undef
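
To make the hash-style lookup idea concrete, here is a rough user-space sketch of the same behaviour. column_indices() is a hypothetical helper, not the planned implementation; it assumes a non-empty table and defaults the column to 0, as the existing index_col_num does.

```openscad
// Hypothetical stand-in for the proposed
// search(match_value, vector_of_vectors [, index_col_num]):
// indices i where table[i][col] == value, or undef when there are none.
// Assumes `table` is non-empty.
function column_indices(value, table, col = 0) =
    let(hits = [ for (i = [0 : len(table) - 1]) if (table[i][col] == value) i ])
    len(hits) > 0 ? hits : undef;

table3 = [ ["caterpillar", 3], ["cat", 2], ["dog", 2], ["cattle", 5], ["cod", 42] ];
echo(column_indices("cat", table3));    // [1]
echo(column_indices(2, table3, 1));     // [1, 2]
echo(column_indices("bird", table3));   // undef

// Hash-style lookup: value stored next to the first "cat" key.
echo(table3[column_indices("cat", table3)[0]][1]);  // 2
```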

4. search( match_vector, string_or_vector [, index_col_num]):
	- deprecate confusing legacy behaviour

Example 4:

	search(["abc"],string1);
		undef // Throw WARNING about deprecated usage; use new list comprehension capabilities.
	search(["cat"],list2);
		undef // Throw WARNING about deprecated usage; use new list comprehension capabilities.
	search(["cat"],table3);
		undef // Throw WARNING about deprecated usage; use new list comprehension capabilities.
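
As a rough illustration of what "use list comprehension" could look like for the multi-value case: loop over the match values and run one single-value lookup per element. The sketch below uses a hypothetical column_indices() helper in place of the not-yet-written simplified search(), so it runs on a current build.

```openscad
// Hypothetical single-value lookup standing in for the simplified search():
// indices i where table[i][col] == value, or undef when there are none.
// Assumes `table` is non-empty.
function column_indices(value, table, col = 0) =
    let(hits = [ for (i = [0 : len(table) - 1]) if (table[i][col] == value) i ])
    len(hits) > 0 ? hits : undef;

table3  = [ ["caterpillar", 3], ["cat", 2], ["dog", 2], ["cattle", 5], ["cod", 42] ];
queries = ["cat", "bird", "cod"];

// One result per query, in query order; undef marks a miss.
results = [ for (q = queries) column_indices(q, table3) ];
echo(results);  // [[1], undef, [4]]
```

Each entry can then be tested on its own, since undef is false and a non-empty list is true in a conditional.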


Just to re-iterate, thank you for taking the time to collect and share your thoughts and observations. Please *do* continue this!

This is very much in the spirit of keeping OpenSCAD compact and closer to a synthesizable HDL, rather than letting it bloat into a poor substitute for a scripting language like Python.

Gotta pick your battles. :-)

Andrew.

> On Apr 18, 2015, at 10:39 PM, runsun <runsun at gmail.com> wrote:
> 
> 
> Spent some time digging into the built-in search() <http://en.wikibooks.org/wiki/OpenSCAD_User_Manual/The_OpenSCAD_Language#Search> :
> 
>   search( match_value , string_or_vector [, num_returns_per_match [, index_col_num ] ] )
> 
> Conclusion first: complicated, buggy, unpredictable, giving out unnecessary warnings. 
> 
> 
> Two points to note about its design:
> 
> 1.  match_value:
>     
>     1.1. Can be a single value or vector of values.
>     1.2. Strings are treated as vectors-of-characters to iterate over;
>     1.3. If match_value is a vector of strings, search will look for exact string matches.
> 
>     In practice, match_value can be: list, number, string (treated as a collection of characters)
> 
> 2. The return should be either a list, when: num_returns_per_match is unset or set to 1:
> 
>       search( "a","abcdabcd" )= [0]
>       search( "abc","abcdabcd" )= [0, 1, 2]
> 
>     or a list of lists, when: num_returns_per_match is set to anything other than 1 :
> 
>      search( "a","abcdabcd",0 )= [[0, 4]]
>      search( "abc","abcdabcd",0 )= [[0, 4], [1, 5], [2, 6]] 
> 
>      data4= [["a", 1], ["b", 2], ["c", 3], ["a", 4], ["b", 5]]
>      search( "abc",data4,2 )= [[0, 3], [1, 4], [2]]
> 
> 
> 
> Observations:
> 
> A. Since a match_value is treated as a list of chars, the following 2 should give same results:
> 
>     search( "abc","abcdabcd" )= [0, 1, 2]
>     search( ["a","b","c"],"abcdabcd" ) want: [0, 1, 2] got: [[], [], []]
> 
> B. The following searches give same return. Users have no way to know what are matched:
> 
>     search( "bc","abcdabcd" )= [1, 2]
>     search( "xbc","abcdabcd" )= [1, 2] 
>     search( "xbzjck","abcdabcd" )= [1, 2]
> 
> C. Users can't possibly predict the following return:
> 
>     data9= [ ["cat", 1], ["b", 2], ["c", 3], ["dog", 4]
>                 , ["a", 5], ["b", 6], ["c", 7], ["d", 8]
>                 , ["e", 9], ["apple", 10], ["a", 11] ] 
>     q= ["b", "zzz", "a", "c", "apple", "dog"]
> 
>     search( "cat",data9 ) want: [2,4] got: [0, 4]
>     
>     This also gives out a warning:  WARNING: search term not found: "t"
> 
> D. This is unpredictable, too:
> 
>      a1= [["ab",1],["bc",2],["cd",3]] 
> 
>      search( "ab", a1) want: [ ] got: [0, 1]
> 
> E. This gives correct answer, but showing two warnings:
>    
>      WARNING: search term not found: "p"
>      WARNING: search term not found: "q"
> 
>      search( "pq",a1 )= [ ]
> 
> F. Inconsistent return when match not found:
> 
>     search( "e","abcdabcd" )= [] 
>     search( ["zzz"],data9 )= [[]] 
> 
>     Since [[]] is treated as true, the following would be impossible:
> 
>     search(...) ? do_found : do_not_found
> 
> 
> So what I think so far are:
> 
> 1) It is very difficult to understand how it works. 
> 2) It is still buggy.
> 3) It would take users a lot of effort to understand it, and a lot of effort to debug it, if that is even possible.  
> 
> My conclusion is that search() is still buggy and it is probably not a good idea to include it in the release yet. 
> 
> I believe it tries to accommodate too many usages in a single function. For example, match_value could have been designed as just a value (one of number, string or list), but not a list of values. Since we have list comprehensions, it would be extremely easy to achieve this:
> 
>    [ for (m = match_values) search( m, ...) ]
> 
> This will take away a large chunk of complication inside the coding of search().  
> 
> 
> $ <http://forum.openscad.org/mailing_list/MailingListOptions.jtp?forum=1> Runsun Pan, PhD 
> $ -- OpenScad_DocTest: doc and unit test ( Github <https://github.com/runsun/openscad_doctest>, Thingiverse <https://www.thingiverse.com/thing:410831> )
> $ -- hash parameter model: here <http://forum.openscad.org/parameterized-models-td8303.html#a8306>, here <http://forum.openscad.org/Can-I-get-some-code-review-up-in-here-tp12341p12355.html>
> $ -- Linux Mint 17.1 Rebecca x64 + OpenSCAD 2015.03.15/2015.04.01.nightly
