Re-posted to fix my filename emoji. You can’t make this shit up

  • NeatNit@discuss.tchncs.de
    1 day ago

    I’m not sure, as I’ve never used them. But I imagine this is a lot more straightforward.

    The problem with bidirectional text is that it’s bi-directional. Parts of it are RTL and parts are LTR. The main problem is how to order the characters visually, assuming that they are stored in memory in the order in which they are intended to be read.

    For text that goes in only one direction this is trivial. LTR: characters are arranged from left to right. RTL: characters are arranged from right to left. Easy peasy!

    The problem, as I’ve said, is when you have a sentence or paragraph with both LTR and RTL text inside it. Then the algorithm is needed.
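    As a rough illustration of when that case arises, here is a minimal Python sketch that checks whether a string contains characters with both strong directions. It only reads the bidi class of each character via the standard `unicodedata` module and does not implement the reordering algorithm itself. The helper names `strong_directions` and `mixes_directions` and the sample file path are made up for this example.

```python
import unicodedata

# Bidi classes that carry a "strong" direction in the Unicode Character Database.
STRONG_LTR = {"L"}
STRONG_RTL = {"R", "AL"}


def strong_directions(text: str) -> set:
    """Return the strong directions ("LTR", "RTL") that occur in text."""
    found = set()
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in STRONG_LTR:
            found.add("LTR")
        elif bidi_class in STRONG_RTL:
            found.add("RTL")
    return found


def mixes_directions(text: str) -> bool:
    """True if text contains both strong LTR and strong RTL characters."""
    return strong_directions(text) == {"LTR", "RTL"}


if __name__ == "__main__":
    # Hypothetical samples: pure English, pure Hebrew, and a mixed file path.
    for sample in ["hello", "שלום", "C:\\Users\\שלום.txt"]:
        print(repr(sample), sorted(strong_directions(sample)), mixes_directions(sample))
```

    Punctuation such as the backslash and the dot has no strong direction, so it is ignored here. Deciding where those neutral characters end up visually is the job of the real algorithm.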

    To my knowledge, there is no bottom-to-top language, and certainly not one that would be mixed in within top-to-bottom text or vice versa. So an algorithm isn’t necessary: if TTB (top-to-bottom) is used, characters simply need to be arranged top to bottom.

    To add on to this, I believe TTB text is only used explicitly. By default, all text is rendered horizontally (usually LTR) unless you explicitly set the software to render it top-to-bottom. So if you just have plain text in Japanese or whatever language with no additional markup, it will be rendered horizontally and subject to the Unicode Bidirectional Algorithm (UBA).

    • squaresinger@lemmy.world
      1 day ago

      There are a lot of top-to-bottom languages in Asia. Some Chinese languages, for example, are traditionally written top to bottom.

      Bidirectional text only really occurs when mixing languages, like in the example above where RTL Hebrew is mixed with LTR English (or in this case specifically LTR file paths that have originally been created in the context of an LTR language and thus are LTR).

      If Windows Explorer actually supported TTB text, and you had a file path combining TTB file names with LTR file extensions and drive letters, you'd have the same problem as with mixing LTR and RTL, except that you'd now be mixing writing directions in two dimensions.

      But I’m guessing even though Unicode’s stated goal is to encode all writing, TTB is probably where they drew the line.

      • NeatNit@discuss.tchncs.de
        17 hours ago

        > Bidirectional text only really occurs when mixing languages

        And also any time numbers are used in RTL text*, which is pretty common. Besides, you might be surprised how often English words or acronyms are used in everyday texts. If there’s a news story covering the American FBI, there’s no way to avoid writing it as “FBI”, in Latin letters.
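        A small Python sketch of why digits matter, assuming the standard `unicodedata` module: it lists the bidi class of each character. ASCII digits report the weak class `EN` (European Number) rather than the strong `R` of the Hebrew letters around them, and Latin letters report `L`. The helper name `bidi_classes` and the sample strings are made up for this example.

```python
import unicodedata


def bidi_classes(text: str) -> list:
    """Return the Unicode bidi class of every character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


if __name__ == "__main__":
    # Hypothetical samples: a Hebrew letter next to an ASCII digit,
    # a Latin acronym, and an Arabic-Indic digit (U+0663).
    print(bidi_classes("א1"))      # Hebrew letter, then ASCII digit
    print(bidi_classes("FBI"))     # Latin letters
    print(bidi_classes("\u0663"))  # Arabic-Indic digit three
```

        The digit classes (`EN`, and `AN` for Arabic-Indic digits) are distinct from the letter classes, which is why a number inside an otherwise RTL sentence already makes the text bidirectional.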

        > There are a lot of top-to-bottom languages in Asia. Some Chinese languages, for example, are traditionally written top to bottom.

        But is that how it’s rendered by default when typed into a computer, for example into Notepad? Or into a chatting app like WhatsApp, Telegram, Discord, etc.? To my knowledge, they are rendered horizontally unless the software is specifically configured to render them TTB.

        > But I’m guessing even though Unicode’s stated goal is to encode all writing, TTB is probably where they drew the line.

        I believe there actually are a few TTB properties in the Unicode database.

        * except that one language that also uses RTL digits. I don’t remember its name or where it’s used, but it exists.