tp3 wrote
I disagree it's up to OpenSCAD trying to fix that. OpenSCAD
files are UTF-8 and that's it.
In Windows they are NOT. That's the point.
Hm? Of course you can encode file content as UTF-8 on Windows
using any reasonable editor. Notepad++ can handle UTF-8 encoded
files just fine - minus the guessing game shown there:
https://notepad-plus-plus.org/community/topic/10981/np-6-8-8-file-encode-type-does-not-stay-as-utf8
I'd prefer to not see a similar discussion for the OpenSCAD
editor.
I guess Notepad++ has to deal with that as general purpose
editor. For OpenSCAD there is no need for that.
ciao,
Torsten.
tp3 wrote
I'd prefer to not see a similar discussion for the OpenSCAD editor.
I thought I was clear, but maybe there is a misunderstanding:
Both OpenSCAD editors do not save UTF8-encoded files on Windows. You can use
Notepad to check that. Save a file containing some comment "//öäü" with
QScintilla and open it with any other editor doing no translation. You will
find "//öäü".
So, where is the logic in converting a filename from native to UTF8, if the
OpenSCAD editors do not save UTF8 on windows?
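For reference, a small sketch of what the Notepad test described above can reveal. It is written in Python purely for illustration, and it assumes the Windows "ANSI" code page is cp1252 (Western European); other locales use other code pages. If the file holds UTF-8 bytes, a viewer that decodes them as cp1252 shows two characters per umlaut:

```python
# Bytes a UTF-8 writer would put on disk for the comment
data = "//\u00f6\u00e4\u00fc".encode("utf-8")   # "//öäü"

# A viewer assuming the cp1252 code page (an assumption) maps each byte to one character
misread = data.decode("cp1252")

print(" ".join(f"{b:02x}" for b in data))
print(misread)
```

If the viewer instead shows one character per umlaut with no translation, the file was written in a single-byte encoding rather than UTF-8.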
--
View this message in context: http://forum.openscad.org/After-changement-it-doesn-t-work-any-longer-use-seems-to-work-the-wrong-way-or-work-not-any-more-tp20017p20054.html
Sent from the OpenSCAD mailing list archive at Nabble.com.
I am struggling to understand this discussion. Surely files don't have an
encoding. They are just a sequence of bytes that get read back the same as
they are written. It is the editor that decides how to interpret them.
When I save your comment in OpenSCAD and read it in Notepad++ it looks the
same to me. Two slashes, o umlaut, a umlaut, u umlaut. What is the problem?
OpenSCAD mailing list
Discuss@lists.openscad.org
http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org
On 01/12/2017 07:48 PM, Parkinbot wrote:
tp3 wrote
I'd prefer to not see a similar discussion for the OpenSCAD editor.
I thought I was clear, but maybe there is a misunderstanding:
Both OpenSCAD editors do not save UTF8-encoded files on Windows. You can use
Notepad to check that. Save a file containing some comment "//öäü" with
QScintilla and open it with any other editor doing no translation. You will
find "//öäü".
That's really bad then. The scintilla editor is explicitly set to UTF8
https://github.com/openscad/openscad/blob/master/src/scintillaeditor.cpp#L145
and the writer is also forced to UTF8
https://github.com/openscad/openscad/blob/master/src/mainwin.cc#L1438
If the text still does not end up as UTF-8 on disk, that's not good
and certainly not expected.
So, where is the logic in converting a filename from native to UTF8, if the
OpenSCAD editors do not save UTF8 on windows?
The first part is needed to have some common ground, but it relies on
the second part (always use UTF-8). If that's not working, then it's
likely the reason for the initially reported issue.
ciao,
Torsten.
On Linux and Macintosh, UTF-8 is the standard default text encoding. This
makes things very simple.
On Windows, there are two different 8-bit text file encodings, which we'll
call UTF-8 and ANSI. UTF-8 files written by Windows tools often begin with a
3-byte magic number called the BOM (EF BB BF), which isn't present in the
ANSI encoding; the BOM is optional, though, so a file without one can still
be UTF-8. The ANSI encoding is some non-Unicode encoding, based on how the
Windows installation is localized. It is whatever encoding is specified by
the current "code page" in the Windows registry. So passing Windows ANSI text
files around can be tricky, since you don't know how they are encoded.
If you are unfortunate enough to maintain a Windows program, then command
line arguments, such as file names, are either passed in ANSI encoding (if
you declare main as main(int argc, char **argv)), or they are passed in
16-bit UTF-16 Unicode encoding (if you declare it as wmain(int argc,
wchar_t **argv)). When you read a text file, you check for the BOM to know
whether you have UTF-8 encoding or ANSI encoding. There is a different BOM
you check for that indicates UTF-16 text file encoding. Anything written to
standard output is either in ANSI encoding or UTF-16 encoding, depending on
what options were used to compile the program.
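The BOM check described above can be sketched roughly as follows. This is an illustration in Python, not how OpenSCAD or any Windows API does it; the function name sniff_encoding is made up for the example, and a UTF-32 BOM (which starts with the same two bytes as the UTF-16 little-endian BOM) is deliberately ignored.

```python
import codecs

def sniff_encoding(data: bytes) -> str:
    """Guess a text encoding from a leading byte-order mark, if there is one.

    Hypothetical helper for illustration only.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"      # Python codec that strips the UTF-8 BOM on decode
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"         # Python codec that reads the BOM to pick the byte order
    # No BOM: BOM-less UTF-8 or some ANSI code page; the bytes alone do not say which
    return "unknown"

print(sniff_encoding(b"\xef\xbb\xbf// comment"))
print(sniff_encoding("// comment".encode("utf-8")))
```

The "unknown" branch is the awkward case: a file without a BOM has to be either guessed at or decoded with a fixed, agreed encoding.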
On 01/12/2017 08:07 PM, nop head wrote:
I am struggling to understand this discussion. Surely files don't
have an encoding.
They have, but in most cases it's not explicitly stated in the
file content (except when using BOM headers which can be quite
annoying in itself).
They are just a sequence of bytes that get read back the
same as they are written. It is the editor that decides how
to interpret them.
Exactly, and OpenSCAD is supposed to interpret the binary
data as UTF-8. This means the German a-umlaut (ä) is encoded
as two bytes 0xc3 0xa4.
http://unicode-search.net/unicode-namesearch.pl?term=%C3%A4&.submit=Search
These decode to code point U+00E4 in the Unicode table. When
encoding the same character in Latin-1 it's just a single byte
(0xe4, the same value as the code point).
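The byte values quoted above can be checked with a few lines of Python, used here only as a convenient calculator:

```python
ch = "\u00e4"                      # a-umlaut, written as an escape so the source stays ASCII

code_point = ord(ch)               # position in the Unicode table
as_utf8 = ch.encode("utf-8")       # two bytes
as_latin1 = ch.encode("latin-1")   # one byte with the same value as the code point

print(hex(code_point), as_utf8.hex(), as_latin1.hex())
```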
So as there's no info in the file on how to interpret it, most
editors including Notepad++ try to guess the encoding.
OpenSCAD is supposed to simply always use UTF-8, but maybe
that's broken on Windows right now. And that would be a
problem for both exchanging files with users on other systems
and also for accessing files on filesystems that don't encode
as UTF-8 (e.g. NTFS uses UTF-16 at least natively).
ciao,
Torsten.
So on Win7 with UK settings the file has no BOM but those characters are
encoded as 22 2F 2F C3 B6 C3 A4 C3 BC 22 0D 0A. Is it just a problem in
other locales?
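A quick way to check a hex dump like this one is to decode it strictly as UTF-8, which fails on byte sequences that are not valid UTF-8. A minimal Python sketch using the bytes quoted above:

```python
dump = "22 2F 2F C3 B6 C3 A4 C3 BC 22 0D 0A"
raw = bytes.fromhex(dump)

# Strict decoding raises UnicodeDecodeError if the bytes are not valid UTF-8
text = raw.decode("utf-8")

# The UTF-8 BOM would be the three bytes EF BB BF at the very start
has_bom = raw.startswith(b"\xef\xbb\xbf")

print(repr(text), has_bom)
```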
On 01/12/2017 08:39 PM, nop head wrote:
So on Win7 with UK settings the file has no BOM but those
characters are encoded as 22 2F 2F C3 B6 C3 A4 C3 BC 22 0D 0A.
Is it just a problem in other locales?
Right, that's nice and clean UTF-8. That's how it's supposed
to look. So it seems it's not just as simple as a Windows
issue.
ciao,
Torsten.
HxD shows "C3 A4 C3 B6 C3 BC" for "äöü" saved with the OpenSCAD editor on
Windows. So far this is as expected, and it is UTF-8.
Now: I can save my scadfile using "äöü.scad" as filename and open it again,
no problem. I can also export it as STL. This is done in native coding!!!,
the screenshot shows this. But I can't include/use this file or import it
into any other scad file. Have a look at the screenshot.
Further note that it is not possible to write something like
include <\uC3A4\uC3B6\uC3BC.scad>
but that wouldn't help either, because also the following doesn't work
import ("äöü.stl"); // WARNING: Can't open import file 'C:/Users/Rudi/Documents/3D Designs\äöü.stl'.
import ("���.stl"); // WARNING: Can't open import file 'C:/Users/Rudi/Documents/3D Designs\���.stl'.
import ("\uC3A4\uC3B6\uC3BC.stl"); // WARNING: Can't open import file 'C:/Users/Rudi/Documents/3D Designs\쎤쎶쎼.stl'.
with ��� copied from the console.
http://forum.openscad.org/file/n20062/Umlaute.png
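A side note on the \u escapes tried above: judging from the warning text, they were read as Unicode code points, not as UTF-8 byte values, so \uC3A4 names the single character U+C3A4 rather than the two bytes C3 A4. Python's \u escapes behave the same way, so the difference can be sketched there (this only illustrates the escape semantics; it says nothing about why the import itself fails):

```python
# Escapes built from the UTF-8 byte values: three unrelated characters
from_bytes = "\uC3A4\uC3B6\uC3BC"

# Escapes built from the code points of a-umlaut, o-umlaut, u-umlaut
from_code_points = "\u00E4\u00F6\u00FC"

print([hex(ord(c)) for c in from_bytes])
print(from_code_points.encode("utf-8").hex())
```

Only the second string encodes to the bytes C3 A4 C3 B6 C3 BC that HxD showed.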
Ah, so it's filenames that can't be UTF8, not file contents.