IFS=$'\n'

lemming741@lemmy.world · 2 days ago

IFS=$'\n'

rdri@lemmy.world · 9 hours ago

I hate software that doesn’t support Unicode, especially since it’s not difficult to implement. At one point I wrote a DLL that patched how one app handled filenames, forcing it to call CreateFileW instead of CreateFileA. That alone was basically enough for it to support Unicode filenames.

notarobot@lemmy.zip · edit-2 11 hours ago

I can’t believe I’m mentioning SvelteKit file structure twice in like a week. I made an example to show in another comment, but when I tried to copy and paste it here it escaped all the characters, so here is a link

https://lemmy.zip/comment/21657580

Edit: I got it. The / at the end is not part of the name; it’s just there to show that it’s a folder

routes/
  +page.ts
  (admin)/
    +page.ts
    [user=uuid]/
      [[community]]/
        +page.ts
      posts/
        [...postIds@]/
          +page.ts

nialv7@lemmy.world · 12 hours ago

“winamp and play D:”

What an awful folder name D:

lemming741@lemmy.world · 10 hours ago

That post inspired this one :)

https://lemmyverse.link/lemmy.world/comment/19538308

stupidcasey@lemmy.world · 15 hours ago

But I want to make sure my program is absolutely incompatible with anything except the exact system I am building on, and I don’t know how to hash a motherboard’s serial number.

lemming741@lemmy.world · 10 hours ago

Surely the TPM has this power

stupidcasey@lemmy.world · edit-2 10 hours ago

🤷‍♂️ IDK, I only had to do it once and they never had a problem with it.

Edit: also the TPM module wasn’t as standard back then.

I Cast Fist@programming.dev · 22 hours ago

Just you wait until Latin languages need to throw áccênts and çedils around

stupidcasey@lemmy.world · 15 hours ago

deleted by creator

tourist@lemmy.world · 1 day ago

if you think this is bad, wait for the .🙏🥴❤️ TLDs

WIZARD POPE💫@lemmy.world · 15 hours ago

Wait for it to be one of these. 🦶🦶🏻🦶🏼🦶🏽🦶🏾🦶🏿 Good luck not getting scammed by .🦶🏼 when it was actually .🦶🏽 you wanted.

tourist@lemmy.world · edit-2 14 hours ago

Oh my god, I did some research

They’re selling emoji domains

https://xn--i-7iq.ws/ is ground zero

I strongly advise against buying one. It resolves to unreadable asdfgjkl-Hdlslfjrms.tld anyway, which entirely defeats the purpose.

Give your money to a charity instead.

Or give it to me. I will spend it on cigarettes.

Grendel@tiny.tilde.website · 13 hours ago

@tourist @Wizard_Pope

After I discovered Egyptian hieroglyphs have Unicode chars, I bought 𓅃.com as a joke last year.

Even better, a friend bought 𓂺.com

WIZARD POPE💫@lemmy.world · 12 hours ago

They sadly also resolve to gibberish for me.

Grendel@tiny.tilde.website · 11 hours ago

@Wizard_Pope oh yeah, they do that for everyone. It’s a safety feature; the gibberish you’re seeing is called Punycode.

People were using weird chars to impersonate well-known domains for phishing, so if you register a domain using unusual or mixed-language characters, it gets rendered as Punycode to prevent spoofing.
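For anyone who wants to see where the gibberish comes from, here is a minimal sketch using Python's built-in `punycode` codec. The helper names `to_ascii_label` and `to_unicode_label` are made up for illustration, and this deliberately skips the normalization and validation steps that real IDNA processing performs, so treat it as a rough approximation rather than what a registrar or browser actually runs.

```python
def to_ascii_label(label: str) -> str:
    """Encode a single DNS label into its ASCII form (hypothetical helper).

    Real IDNA handling also normalizes and validates the label;
    this sketch only applies the raw Punycode step.
    """
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def to_unicode_label(label: str) -> str:
    """Reverse of to_ascii_label for labels carrying the xn-- prefix."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


if __name__ == "__main__":
    # The emoji domain linked earlier in the thread: i + heavy black heart
    print(to_ascii_label("i\u2764"))   # xn--i-7iq
    # The hieroglyph label mentioned above; output left for you to run
    print(to_ascii_label("𓅃"))
```

So the "gibberish" is just a reversible ASCII spelling of the same label, which is why it shows up identically for everyone.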

WIZARD POPE💫@lemmy.world · 11 hours ago

Yeah I know about the spoofing thing, like using Cyrillic letters and shit. I just thought hieroglyphs would resolve to actual hieroglyphs

WIZARD POPE💫@lemmy.world · 12 hours ago

Fuck it just buy one because I want gibberish domains

lemming741@lemmy.world · 1 day ago

Don’t you put that voodoo on me!

NeatNit@discuss.tchncs.de · edit-2 2 days ago

You complain about ASCII filenames but a few of the examples are obviously Unicode, namely using emoji, well outside of the ASCII character set. But since you’ve brought up Unicode file names, let me introduce you to bidirectional text!

If you use Hebrew or Arabic, some of your directories or files will have right-to-left text in them. This is a recipe for disaster.

If in English you’d have “C:\Users\Adam\Documents\Research\Paper.pdf”, which breaks down to:

C:\
Users\
Adam\
Documents\
Research\
Paper.pdf

In Hebrew you’d have: “C:\משתמשים\אדם\מסמכים\מחקר\מאמר.pdf”, which breaks down to:

C:\
משתמשים\
אדם\
מסמכים\
מחקר\
מאמר.pdf

The entire path goes backwards, and the “.pdf” extension is visually attached to the “Users” folder if the text is rendered naively. It’s insane. Fortunately many GUI shells nowadays separate each path item so they can’t get intermixed like this. Example:

But still, if you copy a path into plaintext, it will still visually look wrong, and there is literally nothing that anyone can do about it. This is the correct way to render this text.

The exact same issues occur in Arabic and the few other RTL languages used in the world. It’s a massive pain.

Edit: oh, and on commandline on Windows, the required characters aren’t even available by default so you get this lovely thing

Redex@lemmy.world · 20 hours ago

How would it look if you intermingled Hebrew and English in the path? E.g. C:\English\Hebrew\Hebrew\English\Hebrew

NeatNit@discuss.tchncs.de · edit-2 7 hours ago

In most cases, a consecutive run of RTL or neutral characters would be rendered RTL, while the rest would be rendered LTR. However, if it’s within an RTL paragraph, this would be reversed.

For example, the following two paragraphs have the same path, but the surrounding text is translated:

Open C:\Users\אדם\Documents\דברים\מסמך.pdf and click “Sign”.

פתח את C:\Users\אדם\Documents\דברים\מסמך.pdf ולחץ על “חתימה”.

Depending on your client, these should be rendered differently. If they aren’t, click here to see it: https://jsfiddle.net/jex3yfrw/

Edit: looks like Voyager needs a bug report! The web Lemmy seems to render it RTL (correctly) but still left-aligned which is not ideal.
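If you want to poke at the "runs" idea yourself, here is a rough sketch with Python's `unicodedata` module. It only groups characters by their bidi class; it is not an implementation of the bidirectional algorithm, and lumping every weak or neutral class into one "neutral" bucket is my simplification. The function names are made up for illustration.

```python
import unicodedata
from itertools import groupby

# Strong right-to-left classes (Hebrew letters are "R", Arabic letters "AL")
STRONG_RTL = {"R", "AL"}


def coarse_direction(ch: str) -> str:
    """Collapse a character's bidi class into LTR / RTL / neutral."""
    cls = unicodedata.bidirectional(ch)
    if cls == "L":
        return "LTR"
    if cls in STRONG_RTL:
        return "RTL"
    return "neutral"  # separators, digits, punctuation, etc. (simplified)


def direction_runs(text: str) -> list[tuple[str, str]]:
    """Split text into consecutive runs sharing a coarse direction."""
    return [(d, "".join(g)) for d, g in groupby(text, key=coarse_direction)]


if __name__ == "__main__":
    # Same path as in the two example paragraphs, in storage (logical) order
    path = "C:\\Users\\אדם\\Documents\\דברים\\מסמך.pdf"
    for direction, chunk in direction_runs(path):
        print(f"{direction:8} {chunk!r}")
```

The backslashes and the dot come out as neutral, which is exactly why their visual position depends on what surrounds them.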

squaresinger@lemmy.world · 24 hours ago

Crazy… Btw, how does this work for top-to-bottom languages?

sarmale@lemmy.zip · edit-2 11 hours ago

I’m pretty sure you can’t do that in Windows, but I have https://president.mn/mng for you, that’s pretty cool

squaresinger@lemmy.world · 11 hours ago

It is. Did they just rotate the page after rendering?

NeatNit@discuss.tchncs.de · 20 hours ago

I’m not sure, as I’ve never used them. But I imagine this is a lot more straightforward.

The problem with bidirectional text is that it’s bi-directional. Parts of it are RTL and parts are LTR. The main problem is how to order the characters visually, assuming that they are stored in memory in the order in which they are intended to be read.

For text that goes in only one direction this is trivial. LTR: characters are arranged from left to right. RTL: characters are arranged from right to left. Easy peasy!

The problem, as I’ve said, is when you have a sentence or paragraph with both LTR and RTL text inside it. Then the algorithm is needed.

To my knowledge, there is no bottom-to-top language, and certainly not one that would be mixed in within top-to-bottom text or vice versa. So an algorithm isn’t necessary: if TTB (top-to-bottom) is used, characters simply need to be arranged top to bottom.

To add on to this, I believe TTB text is only used explicitly. By default, all text is rendered horizontally (usually LTR) unless you explicitly set the software to render it top-to-bottom. So if you just have plain text in Japanese or whatever language with no additional markup, it will be rendered horizontally and subject to the UBA.

squaresinger@lemmy.world · edit-2 20 hours ago

There are a lot of top-to-bottom languages in Asia. Some Chinese languages for example are traditionally written top to bottom.

Bidirectional text only really occurs when mixing languages, like in the example above where RTL Hebrew is mixed with LTR English (or in this case specifically LTR file paths that have originally been created in the context of an LTR language and thus are LTR).

If there was actual TTB language support in Windows Explorer, and you had a file path incorporating both TTB file names and LTR file endings and drive letters, then you’d have the same issue as with mixing LTR and RTL, except that you’d now be mixing writing directions in two dimensions.

But I’m guessing even though Unicode’s stated goal is to encode all writing, TTB is probably where they drew the line.

NeatNit@discuss.tchncs.de · 7 hours ago

“Bidirectional text only really occurs when mixing languages”

And also any time numbers are used in RTL text*, which is pretty common. Besides, you might be surprised how often English words or acronyms are used in everyday texts. If there’s a news story covering the American FBI, there’s no way to avoid writing it as “FBI”, in Latin letters.

“There are a lot of top-to-bottom languages in Asia. Some Chinese languages for example are traditionally written top to bottom.”

But is that how it’s rendered by default when typed into a computer, for example into Notepad? Or into a chatting app like WhatsApp, Telegram, Discord, etc.? To my knowledge, they are rendered horizontally unless the software is specifically configured to render them TTB.

“But I’m guessing even though Unicode’s stated goal is to encode all writing, TTB is probably where they drew the line.”

I believe there actually are a few TTB properties in the Unicode database.

* except that one language that also uses RTL digits. I don’t remember its name or where it’s used, but it exists.
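On the numbers point, a quick sketch with Python's `unicodedata` shows that digits carry their own bidi classes, separate from the letters around them. The sample labels are mine, and this only reads the class from the Unicode database; it does not lay anything out.

```python
import unicodedata

# Label -> sample character (labels are just for display)
samples = {
    "ASCII digit 1": "1",
    "Arabic-Indic digit 1": "\u0661",
    "Hebrew letter alef": "\u05d0",
    "Latin letter F": "F",
}

for label, ch in samples.items():
    # bidirectional() returns the class abbreviation, e.g. "EN", "AN", "R", "L"
    print(f"{label:22} {unicodedata.bidirectional(ch)}")
```

"EN" (European number) and "AN" (Arabic number) are weak classes, distinct from the strong "R" of Hebrew letters, so a price or a year inside an otherwise all-Hebrew sentence is already enough to involve the bidi algorithm.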

MonkderVierte@lemmy.zip · 22 hours ago

They usually have an LTR or RTL version of it.

squaresinger@lemmy.world · 22 hours ago

Makes sense. Can’t imagine a lot of software supports top-to-bottom.

Does Unicode even support that?

optional@sh.itjust.works · 2 days ago

Why not use

ꟻbq.משתמשים/אדם/מסמכים/מחקר/מאמר/:ↄ

instead? If you want to write from right to left, you should go all the way.

NeatNit@discuss.tchncs.de · 2 days ago

You maniac!

Optional@lemmy.world · 2 days ago

Daaammn youuuuuu! Damn you all to hellllll! *sobs*

lemming741@lemmy.world · edit-2 2 days ago

Excuse me, officer- this guy right here

this meme was fueled by sleep deprivation, alcohol, and caffeine. any views implied or expressed are not to be taken seriously and may result in side effects such as nausea and vomiting

Cethin@lemmy.zip · 2 days ago

The way this should work is it’s set as either left-to-right or right-to-left. (C:)/1/2/3.ext or ext.3/2/1/(:C). It shouldn’t render part of it one direction and part of it the other direction logically. It’s probably impossible to fix at this point, but this makes a lot more sense.

NeatNit@discuss.tchncs.de · edit-2 2 days ago

Yeah, that is pretty much how it works in some GUIs like in the screenshot, where each slash is replaced by >. But if you represent the path in a string, and put that string in some context that doesn’t know it’s a path and that it should be rendered by some special rules, then it’ll just be subject to the usual Unicode Bidirectional Algorithm (UBA).

The UBA is a masterpiece, and I’m not being sarcastic. For everyday text with mixed directionality, such as a WhatsApp chat in Arabic/Hebrew with a bit of English or just some numbers mixed in, the UBA’s default output is the ideal way to order the characters.

The problem is, special cases (such as file paths) just can’t be covered by a universal algorithm. You can insert special characters into the path, namely FSI and PDI (“First Strong Isolate” and “Pop Directional Isolate”), to make the text render the way you want under the UBA… But then, when you copy that path, the special characters would still be there, so software would consider them part of the path, and then of course, File Not Found.
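To make the FSI/PDI trade-off concrete, here is a small sketch in Python. The helper names `isolate_segments` and `strip_bidi_controls` are hypothetical, and wrapping every path segment is just one possible approach; the point is that the display copy and the real path become different strings.

```python
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

# Invisible bidi controls to delete: isolates, embeddings/overrides, marks
BIDI_CONTROLS = dict.fromkeys(
    map(ord, "\u2066\u2067\u2068\u2069\u202a\u202b\u202c\u202d\u202e\u200e\u200f")
)


def isolate_segments(path: str, sep: str = "\\") -> str:
    """Return a display-only copy with each path segment wrapped in FSI...PDI."""
    return sep.join(f"{FSI}{part}{PDI}" if part else part for part in path.split(sep))


def strip_bidi_controls(text: str) -> str:
    """Remove the invisible controls so the string is usable as a path again."""
    return text.translate(BIDI_CONTROLS)


if __name__ == "__main__":
    real = "C:\\Users\\אדם\\Documents\\מסמך.pdf"
    shown = isolate_segments(real)
    print(len(real), len(shown))               # the display copy is longer
    print(shown == real)                       # False: pasting it would not match the file
    print(strip_bidi_controls(shown) == real)  # True once the controls are removed
```

Any tool that accepts pasted paths would have to do something like `strip_bidi_controls` itself, which most don't, and that is the File Not Found case.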

AnarchistArtificer@slrpnk.net · 1 day ago

I was already interested based on your first comment but this:

“The UBA is a masterpiece, and I’m not being sarcastic.”

has thoroughly piqued my interest. Thank you for being an opinionated nerd on the internet.

MonkderVierte@lemmy.zip · edit-2 22 hours ago

(:-C) boah or (C:) nose?

kopasz7@sh.itjust.works · edit-2 1 day ago

Make it vertical? ¯⁠\⁠_⁠(⁠ツ⁠)⁠_⁠/⁠¯

Korne127@lemmy.world · 2 days ago

The creator of this meme does not know what ASCII is

lemming741@lemmy.world · 2 days ago

Korne127@lemmy.world · 19 hours ago

Welp, it’s not a good one

lemming741@lemmy.world · 16 hours ago

🥲.png

rtxn@lemmy.world · 2 days ago

Fun fact: C:\: is a perfectly valid NTFS path. Windows won’t let you create it, though, because Windows doesn’t even fully support the NTFS specification. That’s why you have to specify the windows_names option when mounting an NTFS filesystem on Linux.

Optional@lemmy.world · 2 days ago

Me: *slowly reaches for loaded weapon*

Me: You best just keep on drivin’

Victor@lemmy.world · 2 days ago

TIL. Thanks!

DarkAri@lemmy.blahaj.zone · 2 days ago

I guess the most annoying part of it to me is that you have to put your locations in quotes if you use them in a shell. I do use spaces in file names sometimes, except when writing code or something; then I use underscores.

azuth@sh.itjust.works · 21 hours ago

You can just escape the spaces with a \ .
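If you're building shell commands from a script, you don't have to do either by hand. A minimal sketch using Python's standard `shlex` module (the file name is made up), assuming a POSIX-style shell:

```python
import shlex

name = "my holiday photos/beach day.jpg"  # hypothetical file name with spaces

# shlex.quote wraps the string so a POSIX shell sees it as one argument
quoted = shlex.quote(name)
print(f"ls -l {quoted}")

# shlex.split parses a command line roughly the way a POSIX shell would,
# so both the quoted and the backslash-escaped spellings give the same name
print(shlex.split(quoted))
print(shlex.split(r"my\ holiday\ photos/beach\ day.jpg"))
```

Both spellings end up as the same single argument, so which one you type is mostly a matter of taste.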

DarkAri@lemmy.blahaj.zone · 15 hours ago

Cool I didn’t know that

katy ✨@piefed.blahaj.zone · 2 days ago

TEXTFI~2.TXT

S_H_K@lemmy.dbzer0.com · 2 days ago

I remember hiding Wacky Wheels for DOS using ASCII characters in the folder name. The professor tried to delete it using Win 3.1… Hahahahaha… Good luck with that!