discuss@lists.openscad.org

OpenSCAD general discussion Mailing-list

text(): the 2.4% problem

Jordan Brown
Fri, Jan 1, 2021 6:41 PM

[ As with most of the text() related threads, if you don't care about
fine-grained behavior of text layout you can completely ignore this
message. ]

There are a few references to the 2.4% problem scattered around, and since
I think I've started to understand it pretty completely I thought I'd
write it up.

See also text() metrics / glyph size problem #2436
https://github.com/openscad/openscad/issues/2436 which accurately
points out the problem but doesn't go into detail on what it means.

Overview

The core of the problem is that OpenSCAD asks FreeType to generate
glyphs with a nominal size of 100,000 units, and then when processing
some of the metrics it scales those metrics down by a factor of
102,400.  Net, the metrics end up 2.4% smaller than the font is designed
for.  This does not affect the sizes of the glyphs themselves, only
how they are laid out.  (You might think it is that the glyphs are too
large, but they have symmetric scaling factors; it is the metrics that
have 100,000 on one side and 102,400 on the other side.)
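
As a quick check of the arithmetic, here is a minimal Python sketch. The names SCALE and UNSCALE are only illustrative labels for the two numbers quoted above; they are not copied from the OpenSCAD sources.

```python
# Illustrative constants taken from the numbers in the message above.
SCALE = 100_000     # nominal glyph size requested from FreeType
UNSCALE = 102_400   # divisor applied to some of the metrics

# Glyph outlines are scaled up and back down by the same number,
# so their net factor is exactly 1.
glyph_factor = SCALE / SCALE

# The affected metrics are scaled up by one number and down by the other.
metric_factor = SCALE / UNSCALE                   # 0.9765625, i.e. 1000/1024

shortfall_percent = (1 - metric_factor) * 100     # about 2.34
divisor_excess_percent = (UNSCALE / SCALE - 1) * 100   # about 2.4

print(metric_factor, shortfall_percent, divisor_excess_percent)
```

Read this way, the "2.4%" is how much larger the divisor is than the multiplier; the metrics themselves come out roughly 2.34% smaller than the design values.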

To find the places affected, search the OpenSCAD sources for references
to "unscale".  The key constants are at
https://github.com/openscad/openscad/blob/7618d4d0f3d6cd1a43da7f3598885fc6e20849d2/src/FreetypeRenderer.cc#L49

Effects

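There are approximately six things affected:

  • The ascent and descent of the text, used for valign of horizontal
    text, are based on the text's actual bounding box, times this
    1000/1024 factor.  Thus, as noted in #2436, valign=top will leave
    you with a small amount of the character above the X axis, and
    similarly with valign=bottom.
    Ref
    https://github.com/openscad/openscad/blob/7618d4d0f3d6cd1a43da7f3598885fc6e20849d2/src/FreetypeRenderer.cc#L284
    Observe that the top of this M is slightly above the X axis.
    [image]
  • The width metric of a vertical-layout string (direction=ttb/btt) is
    based on the text's actual bounding box, times this 1000/1024
    factor.  This leaves the glyphs slightly offset to the right in the
    default (wrong, see other messages) centered horizontal alignment.
    Ref
    https://github.com/openscad/openscad/blob/7618d4d0f3d6cd1a43da7f3598885fc6e20849d2/src/FreetypeRenderer.cc#L290
    Note how the center of this X is slightly off the Y axis:
    [image]
  • The "offset" values returned by HarfBuzz are multiplied by this
    factor.  It is not clear to me when these values are used.  So far
    I've seen them in vertical text and in some parts of Hebrew text.
    The impact would be that the glyphs are shifted 2.4% of the offset
    value.  Offset values appear to be between 0 and 0.5 in X on both
    vertical and Hebrew, and typically a bit over 1.0 in Y on vertical
    text, times the text size, so the shift would be ~2.4% of the text
    size.  I suspect that the offsets are used in Hebrew to position
    the diacritic marks that indicate vowels.  In vertical text they
    seem to be used to center the glyphs around the Y axis and to shift
    the origin down so that the glyphs lie below the X axis.
    Ref
    https://github.com/openscad/openscad/blob/7618d4d0f3d6cd1a43da7f3598885fc6e20849d2/src/FreetypeRenderer.h#L117
    Ref
    https://github.com/openscad/openscad/blob/7618d4d0f3d6cd1a43da7f3598885fc6e20849d2/src/FreetypeRenderer.cc#L301
  • The "advance" values returned by HarfBuzz are multiplied by this
    factor.
      o These values are used to step from one glyph to the next.  The
        impact is that the glyphs are packed together slightly more
        closely than the font design intends, for both horizontal and
        vertical text.
        Ref
        https://github.com/openscad/openscad/blob/7618d4d0f3d6cd1a43da7f3598885fc6e20849d2/src/FreetypeRenderer.h#L119
        Ref
        https://github.com/openscad/openscad/blob/7618d4d0f3d6cd1a43da7f3598885fc6e20849d2/src/FreetypeRenderer.cc#L305
        This is the point where two Liberation Sans Ws meet, as
        OpenSCAD renders them:
        [image]
        and this is how the font is designed:
        [image]
      o These values are used to calculate the width of the text for
        halign of horizontal text.  The impact is that the width is
        slightly underestimated, by 2.4% of the width of the last
        glyph.  (The error in previous glyphs corresponds to them being
        packed together more tightly, and so although it is not as the
        font is designed it is consistent.)  The impact is that
        halign=right and halign=center yield text that is slightly
        right of where it should be.
        Ref
        https://github.com/openscad/openscad/blob/7618d4d0f3d6cd1a43da7f3598885fc6e20849d2/src/FreetypeRenderer.cc#L286
        This is the top right corner of a Liberation Sans W, right
        aligned:
        [image]
      o These values are used to calculate the height of the text for
        valign of vertical text.  The result is that with valign=top
        and valign=center, the text is shifted slightly down from its
        design location.  As noted in other messages, valign=top yields
        unexpected results; it places the text above the origin.
        Ref
        https://github.com/openscad/openscad/blob/7618d4d0f3d6cd1a43da7f3598885fc6e20849d2/src/FreetypeRenderer.cc#L292
        Here is the bottom of a Liberation Sans open parenthesis:
        [image]
        and here it is as designed:
        [image]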
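
To get a feel for the size of the spacing effect, here is a rough Python sketch. It assumes that layout simply multiplies each advance by 100,000/102,400 and that each glyph's ink fills its whole advance; the advance values are made up for illustration, and pen_positions is a hypothetical helper, not an OpenSCAD or HarfBuzz function.

```python
RATIO = 100_000 / 102_400   # the 1000/1024 factor applied to advances


def pen_positions(advances, ratio=1.0):
    """Return each glyph's x origin and the final pen position (the text width)."""
    positions = []
    x = 0.0
    for advance in advances:
        positions.append(x)
        x += advance * ratio
    return positions, x


# Hypothetical design advances for three identical glyphs, in text-size units.
design_advances = [10.0, 10.0, 10.0]

design_pos, design_width = pen_positions(design_advances)
packed_pos, packed_width = pen_positions(design_advances, RATIO)

# The last glyph is still drawn at full size, so it extends past the
# computed width by the shortfall in its own advance.
overhang = packed_pos[-1] + design_advances[-1] - packed_width

print(design_pos, design_width)
print(packed_pos, packed_width, overhang)
```

Under these assumptions each glyph starts a little earlier than the font intends, and the computed width falls short of the drawn right edge by the shortfall on the last glyph only, which is the quantity that matters for alignment.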

History

The "unscale" constant was added ~9 months ago in response to #3262.  It
perpetuated an existing scale mismatch, with commentary about being
uncertain whether the mismatch was intentional but that it was being
left for backwards compatibility.

The original 1000/1024 factor dates back to the introduction of the
text() module in 2014.  In that version, the "scale" value was 1000 (vs
100,000 today) and the unscale value, expressed as numeric constants
rather than a symbolic constant, was 64*16 = 1024.

The 1000 factor is intended, I believe, to yield glyphs at a
conveniently large size so that their rendering resolution will not be
an issue.  I suspect that it's a constant size so that the underlying
FreeType will only need to render one copy of the font, rather than one
for each size that the OpenSCAD user requests, with OpenSCAD scaling as
required for the individual request.  (Note:  I believe that obsessive
typographers would say that this is wrong, that fonts should not be
simply scaled as they are enlarged but rather may need their dimensions
(e.g. line widths) adjusted for the size.  But I'm not that obsessive.)

I suspect that the 1000/1024 mismatch derives from a misunderstanding
about the values accepted by and returned by the FreeType and HarfBuzz
interfaces.  The 64*16 factor aligns, at least somewhat, with the fact
that the values are 26.6 fixed-point values - or, in other words,
values in 1/64 of a point or maybe 1/64 of a pixel.  The mismatch is, I
suspect, that the author didn't realize that the input size (the
argument to FT_Set_Char_Size) is also in 26.6 fractional points and so
the request for a size of 1000 was a request for a size of 15.625 points.
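
The 26.6 arithmetic can be sketched in a few lines of Python. The helper names are hypothetical and this does not call FreeType; it only shows the conversion, assuming a 26.6 value is an integer count of 1/64 units.

```python
FRACTION_BITS = 6
UNITS_PER_WHOLE = 1 << FRACTION_BITS   # 64 units per point (or pixel)


def from_26_6(raw: int) -> float:
    """Convert a 26.6 fixed-point integer to a floating-point value."""
    return raw / UNITS_PER_WHOLE


def to_26_6(value: float) -> int:
    """Convert a floating-point value to the nearest 26.6 fixed-point integer."""
    return round(value * UNITS_PER_WHOLE)


# A raw size argument of 1000, read as 26.6, is 15.625 points.
requested_points = from_26_6(1000)

# To actually ask for 1000 points, the raw argument would be 64000.
raw_for_1000_points = to_26_6(1000)

# The old "unscale" divisor: 64 * 16.
old_unscale = UNITS_PER_WHOLE * 16

print(requested_points, raw_for_1000_points, old_unscale)
```

On this reading, the 64 in 64*16 accounts for the fixed-point format of the returned values; the sketch says nothing about where the remaining factor of 16 came from.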

Related (or, really, unrelated)

The as-rendered size of the glyphs is unrelated to this issue.  For
those cases, the value coming out of FreeType is de-scaled by "scale",
100,000, rather than by 102,400.  (It's further scaled up to the
requested size in DrawingCallback.)
Ref
https://github.com/openscad/openscad/blob/7618d4d0f3d6cd1a43da7f3598885fc6e20849d2/src/FreetypeRenderer.cc#L45
Ref
https://github.com/openscad/openscad/blob/7618d4d0f3d6cd1a43da7f3598885fc6e20849d2/src/FreetypeRenderer.cc#L72
(et seq.)

One might think that the size parameter to the text() module would
control the height of the glyphs rendered, in some obvious way.  (Maybe
it's the height of a capital letter, or that plus descenders, or the
interline spacing.)  Whether or not that's true depends on the font
design.  When I print a 72-point Liberation Sans X from MS Word, you
might think it'd be an inch tall... but it isn't; it's closer to 3/4". 
Top of capital letters to bottom of descenders?  No, 7/8".  Maybe the
interline spacing?  Nope, that's closer to 1 1/8".  Times New Roman is
about the same.  Arial is a little larger; Courier New is quite a bit
smaller.

I'm not sure that the scaling calculations are truly correct - as #2436
also says, it looks like there may be a 72/100 problem too - but any
discrepancies between the specified size and the dimensions of the
glyphs as rendered are definitely not tied to the 2.4% problem.

Why does Jordan care?

It's a small thing, but I'm trying to write a function that returns
metric information about text and fonts, and I need to decide whether to
return information that matches the glyphs that are rendered, or that
matches what OpenSCAD will do to them, or perhaps both.  Also, the
current behavior is, well, wrong and makes it hard(er) to follow what's
going on in the implementation.

Future?

  • We could leave it alone.  My new metrics functions can return both
    the "as designed" values and the "as OpenSCAD calculates" values, so
    that you could figure out whatever you need to know.
  • We could fix it, and eat the various compatibility breaks.  Quite
    possibly nobody would actually care.
  • We could fix it, optionally, by having either a configuration switch
    or a $ variable that sets whether to be in old incorrect mode or new
    correct mode.
lar3ry
Fri, Jan 1, 2021 7:43 PM

It was not until reading this that I found out that HarfBuzz is Persian.

When I typed حرف‌باز  into Google translate, it translated it as
"talkative". Hence my comment in another thread.

--
Sent from: http://forum.openscad.org/

adrianv
Sat, Jan 2, 2021 1:24 PM

My vote is FIX IT.  I'd rather keep things clean and simple and not have a
compatibility mode personally, but I also don't have any big text-dependent
projects to worry about fixing.

--
Sent from: http://forum.openscad.org/

Troberg
Sat, Jan 2, 2021 4:38 PM

adrianv wrote
> My vote is FIX IT.  I'd rather keep things clean and simple and not have a
> compatibility mode personally, but I also don't have any big text-dependent
> projects to worry about fixing.

I agree. Perpetuating bugs is perpetuating trouble.

--
Sent from: http://forum.openscad.org/
