discuss@lists.openscad.org

OpenSCAD general discussion Mailing-list

braille-en-us-g1.scad

John Heim
Tue, Aug 23, 2022 4:40 PM

All,

Below is code I wrote to take US English and generate a unicode braille
string that can then be passed to my generic braille library. It
transcribes plain text into grade 1, US English braille. Grade 1 braille
is uncontracted. Mostly, it's letter for letter. The complicated thing
is capitalization. In fact, there is a known bug -- a string of caps is
not transcribed correctly. The code below will correctly transcribe
"Hello, world!" but it messes up "OpenSCAD".

Work is proceeding on a grade 2 transcriber written entirely in OpenSCAD
code.

=== cut here ===

/*
This library generates grade 1, US English braille.
It requires the general braille library, braille.scad.
*/
include <braille.scad>;

// For a given North American Braille ASCII character, return the
// corresponding unicode braille character.
brailleASCII   = " a1b'k2l@cif/msp\"e3h9o6r^djg>ntq,*5<-u8v.%[$+x!&;:4\\0z7(_?w]#y)=";
brailleUnicode = " ⠁⠂⠃⠄⠅⠆⠇⠈⠉⠊⠋⠌⠍⠎⠏⠐⠑⠒⠓⠔⠕⠖⠗⠘⠙⠚⠛⠜⠝⠞⠟⠠⠡⠢⠣⠤⠥⠦⠧⠨⠩⠪⠫⠬⠭⠮⠯⠰⠱⠲⠳⠴⠵⠶⠷⠸⠹⠺⠻⠼⠽⠾⠿";
function asciiCharToUnicodeChar (character) =
    brailleUnicode[ search(character,brailleASCII)[0] ];

// Convert a North American Braille ASCII string to a Unicode braille
// string.
function asciiStrToUnicodeStr (str) =
    [for (char = str) asciiCharToUnicodeChar (char)];

function str_cmp(str,sidxex,pattern) =
    len(str)-sidxex <len(pattern)? false :
    _str_cmp_recurse(str,sidxex,pattern,len(pattern));

function _str_cmp_recurse(str,sidxex,pattern,plen,pidxex=0,) =
    pidxex < plen && pattern[pidxex]==str[sidxex]
       ? _str_cmp_recurse(str,sidxex+1,pattern,plen ,pidxex+1)
        : (pidxex==plen);

// Transcribe US English text into North American ASCII braille.
function transcribe(text,idx=0, result="") =
    idx==len(text) ? result
  : let (
      next=
      str_cmp(text,idx,"ing") ? [3,"+"]
      : str_cmp(text,idx,"en") ? [2,"5"]
      : str_cmp(text,idx,"er") ? [2,"]"]
      : (ord(text[idx]) >= ord("A") && ord(text[idx]) <= ord("Z")) ?
[1, str(",", chr(ord(text[idx]) + ord("a") - ord("A") ))]
      : [1, text[idx]]
   )
   transcribe(text,idx+next[0], str(result, next[1]));

// EOF

=== cut here ===

Jordan Brown
Sun, Sep 4, 2022 8:59 PM

[ I believe I'm looking at the second version you sent. ]

Again, mostly straightforward.  As somebody not familiar with the
ASCII-BrailleASCII-UnicodeBraille translation conventions, I would have
been helped by more comments.

> include <braille.scad>;

Include is not exactly a statement and doesn't need a semicolon.

> // For a given North American Braille ASCII character, return the
> // corresponding unicode braille character.
> charList = " a,b k l cif msp e h o!r djg ntq -u?v x . z w y ";
> brailleASCII = " a1b'k2l`cif/msp\"e3h9o6r^djg>ntq,*5<-u8v.%[$+x!&;:4\\0z7(_?w]#y)=";

It would be helpful to describe how these strings relate ASCII to
Braille.  The second appears to mostly be a superset of the first, but
handles numbers completely differently.  I assume that they are two
different mappings of Braille symbols to ASCII, each mapping a 0-63
Braille code to ASCII.

My instinctive reaction is that these tables would be better if they
were transposed, if they were indexed by the ASCII character code to get
the corresponding Braille character, either as a 0-63 code or directly
to Unicode.  However, they wouldn't really be any easier to read, and I
see that this presentation is used in other references.  For small
numbers of characters to translate it won't matter; for large numbers it
might make sense to derive that table at the start and then use it for
translation.

Note:  That table is easy to generate in conventional languages, but
actually kind of hard in OpenSCAD, because OpenSCAD doesn't have a
way to initialize an array in a random-access way.  The easiest and
maybe only way to do it would be to linear-search for each of the codes.
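A minimal sketch of deriving such a transposed table once, by linear search, so that later lookups are a single index operation.  The names _cellForCode, asciiToCell and cellForChar are hypothetical (they are not in the posted library), and the table is the brailleASCII string from the first posted version:

```openscad
// Table as posted in the library (index = 0-63 cell number).
brailleASCII = " a1b'k2l@cif/msp\"e3h9o6r^djg>ntq,*5<-u8v.%[$+x!&;:4\\0z7(_?w]#y)=";

// Hypothetical helper: cell number of the character with this code point,
// found by linear search; undef if the character is not in the table.
function _cellForCode(code) =
    let (hits = [for (i = [0:1:len(brailleASCII)-1])
                     if (ord(brailleASCII[i]) == code) i])
    len(hits) > 0 ? hits[0] : undef;

// Built once: asciiToCell[c] is the cell number for ASCII code c.
asciiToCell = [for (code = [0:1:127]) _cellForCode(code)];

// Lookup is now a single index operation per character.
function cellForChar(character) = asciiToCell[ord(character)];

echo(cellForChar("a"), cellForChar("="), cellForChar("A"));  // 1, 63, undef
```

Characters that are not in the table (upper-case letters, for example) come back as undef, so a caller would still need to decide what to do with them.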

> brailleUnicode = " ⠁⠂⠃⠄⠅⠆⠇⠈⠉⠊⠋⠌⠍⠎⠏⠐⠑⠒⠓⠔⠕⠖⠗⠘⠙⠚⠛⠜⠝⠞⠟⠠⠡⠢⠣⠤⠥⠦⠧⠨⠩⠪⠫⠬⠭⠮⠯⠰⠱⠲⠳⠴⠵⠶⠷⠸⠹⠺⠻⠼⠽⠾⠿";

Note that brailleUnicode[i] is equal to chr(10240+i); you don't really
need the table.
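For example, something like this would do (brailleCell is a hypothetical name, not something in the library):

```openscad
// Hypothetical helper: Unicode character for a 0-63 Braille cell number.
// 10240 is U+2800, the first code point of the Braille Patterns block.
function brailleCell(i) = chr(10240 + i);

echo(brailleCell(1), brailleCell(63));
```

The same expression could replace the brailleUnicode[...] lookup inside asciiCharToUnicodeChar.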

> function xlate (character) =
>     brailleASCII[ search(character, charList)[0] ];

Since there are several kinds of translations in this program, you might
want to give this a more descriptive name.

> // Convert a North American Braille ASCII string to a Unicode braille
> // string.
> function asciiStrToUnicodeStr (str) =
>     [for (char = str) asciiCharToUnicodeChar (char)];

Note that the result here is an array of strings, each of which is a
single character.  It is unfortunately tedious to turn that into a
string.  This means that if you pass the result to brailleLabel, as your
test program does, you get a vertical line of Braille symbols.

Here's a function that concatenates an array of strings into one string:

function arrayToString(a, i=0) =
    i < len(a)
    ? str(a[i], arrayToString(a, i+1))
    : "";

I see that you figured that pattern out for transcribe().

I would change asciiStrToUnicodeStr to call that, or to do the recursion
itself, so that it returns a string rather than a list of one-letter
strings.
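A sketch of the first option, assuming the table from the first posted version and using chr() on the cell number in place of the brailleUnicode table.  The parameter is renamed to s here purely for readability; that rename is my choice, not something the library requires:

```openscad
brailleASCII = " a1b'k2l@cif/msp\"e3h9o6r^djg>ntq,*5<-u8v.%[$+x!&;:4\\0z7(_?w]#y)=";

// Concatenate a list of strings into one string.
function arrayToString(a, i=0) =
    i < len(a)
    ? str(a[i], arrayToString(a, i+1))
    : "";

// One Braille ASCII character to one Unicode Braille character.
function asciiCharToUnicodeChar(character) =
    chr(10240 + search(character, brailleASCII)[0]);

// Returns a single string rather than a list of one-letter strings.
function asciiStrToUnicodeStr(s) =
    arrayToString([for (c = s) asciiCharToUnicodeChar(c)]);

echo(asciiStrToUnicodeStr(",hello"));
```

An empty input yields an empty string, since the list comprehension produces an empty list and arrayToString() returns "" for it.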

> function isUpper (character) = (ord(character) >= ord("A") &&
>     ord(character) <= ord("Z")) ? true : false;

Your comparisons and && are already yielding a boolean true or false;
you don't need the explicit "? true : false".

I don't immediately see any reason why you need to call ord().  You
should be able to just directly compare the character with "A" and "Z".

function isUpper (character) = character >= "A" && character <= "Z";

It's kind of unfortunate that OpenSCAD doesn't have at least a basic
suite of character tools like this and the string comparison functions
below.  (You can implement most of them in base OpenSCAD, but to be
really correct they should be Unicode-aware, and that is hard in base
OpenSCAD.)

> function toLower (character) = chr(ord(character) + ord("a") - ord("A"));

Here you do need the ord().  You might want to include an "isUpper()" in
there so that it doesn't do stupid things to non-upper-case characters.
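A possible guarded version, as a sketch that only handles ASCII A-Z and passes everything else through unchanged:

```openscad
function isUpper(character) = character >= "A" && character <= "Z";

// Only shift characters that really are upper case; return the rest as-is.
function toLower(character) =
    isUpper(character)
    ? chr(ord(character) + ord("a") - ord("A"))
    : character;

echo(toLower("Q"), toLower("q"), toLower("!"));
```

With the guard in place, transcribe() could call toLower() on any character without first checking its case.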

> function str_cmp(str,sidxex,pattern) =
>     len(str)-sidxex <len(pattern)? false :
>     _str_cmp_recurse(str,sidxex,pattern,len(pattern));

"cmp" is a poor abbreviation to use here, because it doesn't tell the
reader what the result means.  The only time I would use "cmp" is when
I'm writing a function in the style of C's strcmp, where it returns
negative, zero, or positive to indicate less-than, equal-to, or
greater-than.

And note that this function doesn't exactly compare strings.  Rather, it
checks to see whether the first string (as offset by the index) begins
with the second string.  As a result, I would probably call this
function strbw(), or if I wanted to be more verbose but probably more
screen-reader friendly, stringBegins().

Regardless, it needs an explanatory comment; "sidxex" is not very
self-explanatory.  I would probably use "start" or "startIndex" for that
parameter name.

> function _str_cmp_recurse(str,sidxex,pattern,plen,pidxex=0,) =
>     pidxex < plen && pattern[pidxex]==str[sidxex]
>        ? _str_cmp_recurse(str,sidxex+1,pattern,plen ,pidxex+1)
>         : (pidxex==plen);

Same basic comments.

Unnecessary comma at the end of the parameter list.  (I'm actually
surprised that isn't a syntax error.)

I would probably rearrange it into a two-step test, instead of the one
step with an && and then at the end figuring out which case you were in:

function strBegins(str, start, pattern) =
    len(str)-start < len(pattern)
    ? false
    : _strBeginsRecurse(str, start, pattern, len(pattern));

function _strBeginsRecurse(str, start, pattern, plen, pstart=0) =
    pstart >= plen
    ? true
    : pattern[pstart]!=str[start]
    ? false
    : _strBeginsRecurse(str,start+1,pattern,plen, pstart+1);

This variant also might be better in terms of tail-recursion optimization.
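A few quick checks of how that two-step version behaves, written as assert() calls so that OpenSCAD stops with an error if any expectation is wrong.  The test strings are made up for illustration:

```openscad
function strBegins(str, start, pattern) =
    len(str)-start < len(pattern)
    ? false
    : _strBeginsRecurse(str, start, pattern, len(pattern));

function _strBeginsRecurse(str, start, pattern, plen, pstart=0) =
    pstart >= plen
    ? true
    : pattern[pstart]!=str[start]
    ? false
    : _strBeginsRecurse(str, start+1, pattern, plen, pstart+1);

assert(strBegins("singing", 1, "ing"));    // "ing" starts at index 1
assert(!strBegins("singing", 0, "ing"));   // index 0 is "sin"
assert(!strBegins("in", 0, "ing"));        // too few characters left
assert(strBegins("abc", 3, ""));           // empty pattern always matches
```

The third case is the one the length test in strBegins() exists for: without it the recursion would index past the end of the string.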

> // Transcribe US English text into North American ASCII braille.
> function transcribe(text,idx=0, result="") =
>     idx==len(text) ? result
>   : let (
>       next=
>       str_cmp(text,idx,"ing") ? [3,"+"]
>       : str_cmp(text,idx,"en") ? [2,"5"]
>       : str_cmp(text,idx,"er") ? [2,"]"]
>       : isUpper(text[idx]) ? [1, str(",",toLower(text[idx]))]
>       : [1, xlate(text[idx]) ]
>    )
>    transcribe(text,idx+next[0], str(result, next[1]));

This demonstrates a nuisance aspect of OpenSCAD's scoping and variable
rules:  there's no good way for one test to control two resulting
values.  I don't know of any way better than what you've done.

In the isUpper case I would probably call xlate(), just for consistency,
even though it's not needed.  That would keep all of the knowledge of
how to translate letters in that one function.

As noted on the other file, I probably would have rearranged it to avoid
needing "result", by having the last line look something like so:

str(next[1], transcribe(text, idx+next[0]));

(Though I suspect that would be worse for tail-recursion reasons.)

Let's take a quick look at your test program:

> include <braille-en-us-g1.scad>;
> text = "";
> if (len(text) > 0) {
>     asciiBraille = transcribe (text);
>     echo (asciiBraille);
>     unicodeBraille = asciiStrToUnicodeStr (asciiBraille);
>     echo (unicodeBraille);
>     brailleLabel (unicodeBraille);
> } // fi
> // EOF

Get rid of the "if (len(text) > 0)" test.  It's best if the functions
all handle empty strings sensibly.  Your labels may not be very
interesting with no text, but there might be other applications for
which it's the right answer.

And in fact passing in an empty string reveals an issue:  drawLine()
doesn't handle empty strings, because it uses a range [0:len(line)-1]
and that will be [ -1, 0 ] here.  (In a now-deprecated behavior,
OpenSCAD will automatically reverse the direction of a range with an end
less than the start.)  That in turn runs into trouble when it tries to
index the string by -1.  You can force it to go forwards, and thus to do
nothing with an empty string, by explicitly including the step: 
[0:1:len(line)-1].
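A small illustration of the explicit-step form.  The indices() function is hypothetical and only stands in for the loop inside drawLine(), which I have not reproduced here:

```openscad
// With an explicit step of 1, a range whose end is below its start is empty.
function indices(line) = [for (i = [0:1:len(line)-1]) i];

echo(indices("abc"));   // [0, 1, 2]
echo(indices(""));      // []
```

The same range can be used directly in a module-level for() loop, which then simply generates nothing for an empty string.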

As mentioned above, asciiStrToUnicodeStr returns an array of
one-character strings, so the result is a vertical stack.  Naïvely
converting that array to a single string still yields a vertical stack,
because what brailleLabel is expecting is a list of strings, and a single
string looks like a list of one-letter strings.  Wrapping the string in
brackets, to turn it into a list with one entry, does the trick.

Net (after changing asciiStrToUnicodeStr so that it returns a single
string):

include <braille-en-us-g1.scad>

text = "";
asciiBraille = transcribe (text);
echo (asciiBraille);
unicodeBraille = asciiStrToUnicodeStr (asciiBraille);
echo (unicodeBraille);
brailleLabel ([unicodeBraille]);
// EOF

Random note: since there are Braille symbols in Unicode, and since those
symbols are in several standard fonts, one might expect that text()
would be able to generate them.  However, experiments say no; in all of
the fonts that I have tried, including ones that appear to include the
Braille symbols when used for ordinary text, those Unicode code points
end up as the replacement glyph, usually a box.  Even if they were
available, it might be better to do it the way you've done it, because
rounding the tops would be no fun.  (You have to do it using multiple
layers and offset().)

Anyhow, hope that was all helpful.
