if you could pick a standard format for a purpose what would it be and why?

e.g. flac for lossless audio because…

(yes you can add new categories)

summary:

  1. photos .jxl
  2. open domain image data .exr
  3. videos .av1
  4. lossless audio .flac
  5. lossy audio .opus
  6. subtitles srt/ass
  7. fonts .otf
  8. container mkv (doesn’t contain .jxl)
  9. plain text utf-8 (many also say markup but disagree on the implementation)
  10. documents .odt
  11. archive files (this one is causing a bloodbath so i picked randomly) .tar.zst
  12. configuration files toml
  13. typesetting typst
  14. interchange format .ora
  15. models .gltf / .glb
  16. daw session files .dawproject
  17. otdr measurement results .xml
  • Elise
    18
    edited
    1 year ago

    I wish there was a more standardized open format for documents. And more people and software should use markdown/.md because you just don’t need anything fancier for most types of documents.

      • SunRed
        1
        1 year ago

        I am surprised no one mentioned HCL yet. It’s just as sane as toml but it is also properly nestable, like yaml, while being easily parsable and formattable. I wish it was used more as a config language.

  • @neomis@sh.itjust.works
    20
    1 year ago

    Data output from manufacturing equipment. Just pick a standard. JSON works. TOML / YAML if you need to write as you go. Stop creating your own format that’s 80% JSON anyways.
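A rough sketch of what "just pick JSON" could look like for equipment output, using only Python's standard `json` module. The field names (`machine_id`, `timestamp`, `readings`) and the helper functions are made up for illustration and not taken from any real equipment. Writing one JSON object per line is an assumption on my part, as one possible way to keep appending results as they arrive; it is not something the comment above proposes.

```python
import io
import json


def append_record(stream, record):
    """Write one measurement as a single JSON line so results can be appended as they arrive."""
    stream.write(json.dumps(record, sort_keys=True) + "\n")


def read_records(stream):
    """Parse every non-empty line back into a dict."""
    return [json.loads(line) for line in stream if line.strip()]


# Hypothetical measurement records; the field names are illustrative only.
log = io.StringIO()
append_record(log, {"machine_id": "press-01", "timestamp": "2024-01-01T12:00:00Z", "readings": [1.5, 1.7]})
append_record(log, {"machine_id": "press-01", "timestamp": "2024-01-01T12:00:05Z", "readings": [1.6, 1.8]})

log.seek(0)
records = read_records(log)
print(len(records))
```

Any consumer with a JSON parser can read this back without a vendor-specific tool, which is the point being made.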

  • @DigitalJacobin@lemmy.ml
    English
    59
    edited
    1 year ago

    This is the kind of thing i think about all the time so i have a few.

    • Archive files: .tar.zst
      • Produces better compression ratios than the DEFLATE compression algorithm (used by .zip and gzip/.gz) and does so faster.
      • By separating the jobs of archiving (.tar), compressing (.zst), and (if you so choose) encrypting (.gpg), .tar.zst follows the Unix philosophy of “Make each program do one thing well.”.
      • .tar.xz is also very good and seems more popular (probably since it was released 6 years earlier in 2009), but, when tuned to its maximum compression level, .tar.zst can achieve a compression ratio pretty close to LZMA (used by .tar.xz and .7z) and do it faster[1].

        zstd and xz trade blows in their compression ratio. Recompressing all packages to zstd with our options yields a total ~0.8% increase in package size on all of our packages combined, but the decompression time for all packages saw a ~1300% speedup.

    • Image files: JPEG XL/.jxl
      • “Why JPEG XL”
      • Free and open format.
      • Can handle lossy images, lossless images, images with transparency, images with layers, and animated images, giving it the potential of being a universal image format.
      • Much better quality and compression efficiency than current lossy and lossless image formats (.jpeg, .png, .gif).
      • Produces much smaller files for lossless images than AVIF[2]
      • Supports much larger resolutions than AVIF’s 9-megapixel limit (important for lossless images).
      • Supports up to 24-bit color depth, much more than AVIF’s 12-bit color depth limit (which, to be fair, is probably good enough).
    • Videos (Codec): AV1
      • Free and open format.
      • Much more efficient than H.264 (as encoded by x264 and commonly found in .mp4 files) and VP9[3].
    • Documents: OpenDocument / ODF / .odt

      it’s already a NATO standard for documents. Because the Microsoft Word ones (.doc, .docx) are unusable outside the Microsoft Office ecosystem. I feel outraged every time I need to edit a .docx file because it breaks the layout easily. And some older .doc files cannot even work with Microsoft Word.


    1. https://archlinux.org/news/now-using-zstandard-instead-of-xz-for-package-compression/ ↩︎

    2. https://tonisagrista.com/blog/2023/jpegxl-vs-avif/ ↩︎

    3. https://engineering.fb.com/2018/04/10/video-engineering/av1-beats-x264-and-libvpx-vp9-in-practical-use-case/ ↩︎

          • Gamma
            English
            4
            1 year ago

            I get your point. Since a .tar.zst file can be handled natively by tar, using .tzst instead does make sense.

          • @sebsch@discuss.tchncs.de
            3
            1 year ago

            I would argue what windows does with the extensions is a bad idea. Why do you think engineers should do things in favour of these horrible decisions the most insecure OS is designed with?

              • @DigitalJacobin@lemmy.ml
                English
                1
                edited
                1 year ago

                I get the frustration, but Windows is the one that strayed from convention/standard.

                Also, i should’ve asked this earlier, but doesn’t Windows also only look at the characters following the last dot in the filename when determining the file type? If so, then this should be fine for Windows, since there’s only one canonical file extension at a time, right?
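To make the "last dot" question concrete, here is a small sketch using Python's `pathlib`. It only shows how a rule of "everything after the last dot" sees a name like `backup.tar.zst` (a made-up filename); it is not a claim about how Windows itself is implemented.

```python
from pathlib import PurePath

# Hypothetical archive name used only for illustration.
name = PurePath("backup.tar.zst")

last_extension = name.suffix    # only the part after the last dot
all_extensions = name.suffixes  # every dot-separated suffix, in order

print(last_extension)
print(all_extensions)
```

A tool that only looks at `suffix` would treat the file as a plain Zstandard stream, while one that looks at `suffixes` can tell there is a tar archive inside.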

    • @jackpot@lemmy.mlOP
      7
      1 year ago
      • By separating the jobs of archiving (.tar), compressing (.zst), and (if you so choose) encrypting (.gpg), .tar.zst follows the Unix philosophy of “Make each program do one thing well.”.

      wait so does it do all of those things?

      • @DigitalJacobin@lemmy.ml
        English
        14
        1 year ago

        So there’s a tool called tar that creates an archive (a .tar file). Then there’s a tool called zstd that can be used to compress files, including .tar files, which then become .tar.zst files. And then you can encrypt your .tar.zst file using a tool called gpg, which would leave you with an encrypted, compressed .tar.zst.gpg archive.

        Now, most people aren’t doing everything in the terminal, so the process for most people would be pretty much the same as creating a ZIP archive.
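The separation of archiving and compressing can be sketched with Python's standard library. This is only an illustration under a couple of assumptions: `lzma` (the algorithm behind .tar.xz) stands in for zstd because older Python versions ship no Zstandard module, the encryption step is left out entirely, and the file names and helper functions are made up.

```python
import io
import lzma
import tarfile


def make_tar(files):
    """Archive step: bundle named byte strings into an uncompressed .tar held in memory."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_tar(tar_bytes):
    """Read the member names back out of an in-memory .tar."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as archive:
        return archive.getnames()


# Hypothetical input files.
files = {"notes.txt": b"hello " * 1000, "todo.txt": b"buy milk\n" * 500}

tar_bytes = make_tar(files)            # step 1: archive (.tar)
compressed = lzma.compress(tar_bytes)  # step 2: compress (.tar.xz here, standing in for .tar.zst)
restored = lzma.decompress(compressed)

print(len(tar_bytes), len(compressed), list_tar(restored))
```

Because each step only sees bytes from the previous one, the compressor could be swapped without touching the archiving code, which is the "one thing well" idea from the comment.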

    • @piexil@lemmy.world
      3
      1 year ago

      I get a better compression ratio with xz than with zstd, both at their highest levels, when building an Ubuntu SquashFS image.

      Zstd is way faster though
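Claims like this depend a lot on the input, so it can be worth measuring on your own data. Below is a minimal sketch that compares DEFLATE (`zlib`) and LZMA (`lzma`, as used by xz) from Python's standard library. zstd is not included because it is an assumption that your Python has no built-in Zstandard module; the sample data is synthetic and the `compare` helper is made up for illustration.

```python
import lzma
import zlib


def compare(data):
    """Return compressed sizes for DEFLATE (zlib) and LZMA (xz) at their highest presets."""
    deflated = zlib.compress(data, 9)
    xz = lzma.compress(data, preset=9)
    # Make sure both round-trip before trusting the sizes.
    assert zlib.decompress(deflated) == data
    assert lzma.decompress(xz) == data
    return {"original": len(data), "deflate": len(deflated), "xz": len(xz)}


# Synthetic, repetitive sample data; real results depend heavily on the input.
sample = b"".join(
    b"line %d: the quick brown fox jumps over the lazy dog\n" % i for i in range(5000)
)
sizes = compare(sample)
print(sizes)
```

This only measures size. Timing each call, for example with `time.perf_counter`, would be needed to see the speed side of the trade-off.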

    • @ronweasleysl@lemmy.ml
      English
      2
      1 year ago

      Damn didn’t realize that JXL was such a big deal. That whole JPEG recompression actually seems pretty damn cool as well. There was some noise about GNOME starting to make use of JXL in their ecosystem too…

  • danielfgom
    English
    8
    1 year ago

    Definitely FLAC for audio because it’s lossless, if you record from a high fidelity source…

    exFAT for external hard drives and SD cards because both Windows and Mac can read and write to it as well as Linux. And you don’t have the permission pain…

      • danielfgom
        English
        3
        1 year ago

        If you were to format the drive with ext4 and then copy something to it from Linux, then try to open it on another Linux machine (e.g. you distro hop after this event), it won’t open the file because you aren’t the owner.

        Then you have to jump through hoops trying to make yourself the owner just so you can open your own file.

        I learnt this the hard way so I just use exFAT and it all works.

    • dinckel
      19
      1 year ago

      The existence of zip, and especially rar files, actually hurts me. It’s slow, it’s insecure, and the compression is from the jurassic era. We can do better

      • It’s a 30 year old format, and large amounts of research and innovation in lossy audio compression have occurred since then. Opus can achieve better quality in like 40% the bitrate. Also, the format is, much like zip, a mess of partially broken implementations in the early days (although now everyone uses LAME so not as big of a deal). Its container/stream format is very messy too. Also no native tag format so it needs ID3 tags which don’t enforce any standardized text encoding.

    • @TheAnonymouseJoker@lemmy.ml
      -1
      edited
      1 year ago

      (mp3 needs to die)

      How are you going to recreate the MP3 audio artifacts that give a lot of music its originality, when encoding to OPUS? Past audio recordings cannot be fiddled with too much.

      Also, fuck Zstandard, it’s a problematic format because it can only compress a single file, is hard to repair, is not fully stable, and lacks too many features compared to 7Z/RAR. Zst is also 15-20% worse at compression ratio. It’s only a good format for temporary fast data transit applications (webpage/CDN serving, quick temporary database backups).

      • Zip has terrible compression ratio compared to modern formats, it’s also a mess of different partially incompatible implementations by different software, and also doesn’t enforce utf8 or any standard for that matter for filenames, leading to garbled names when extracting old files. Its encryption is vulnerable to a known-plaintext attack and its key-derivation function is very easy to brute force.

        Rar is proprietary. That alone is reason enough not to use it. It’s also very slow.
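The garbled-filename problem mentioned above can be reproduced in a few lines. This sketch assumes the common failure case where a name was written as UTF-8 bytes but read back as CP437, the legacy code page ZIP tools tend to fall back to when nothing says otherwise. The filename and both helper functions are made up for illustration.

```python
def garble(filename):
    """Simulate writing a name as UTF-8 bytes and reading it back as CP437."""
    return filename.encode("utf-8").decode("cp437")


def repair(garbled):
    """Undo the mistake, assuming the bytes really were UTF-8 misread as CP437."""
    return garbled.encode("cp437").decode("utf-8")


original = "résumé.txt"
garbled = garble(original)

print(garbled)
print(repair(garbled))
```

The repair only works when you guess the wrong encoding correctly, which is exactly why a format that does not enforce one encoding is painful with old archives.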

  • @rtxn@lemmy.world
    English
    16
    edited
    1 year ago

    XML for machine-readable data because I live to cause chaos

    Either markdown or Org for human-readable text-only documents. MS Office formats and the way they are handled have been a mess since the 2007 -x versions were introduced, and those and Open Document formats are way too bloated for when you only want to share a presentable text file.

    While we’re at it, standardize the fucking markdown syntax! I still have nightmares about Reddit’s degenerate four-space-indent code blocks.

  • raubarno
    83
    1 year ago

    Open Document Standard (.odt) for all documents. In all public institutions (it’s already a NATO standard for documents).

    Because the Microsoft Word ones (.doc, .docx) are unusable outside the Microsoft Office ecosystem. I feel outraged every time I need to edit .docx file because it breaks the layout easily. And some older .doc files cannot even work with Microsoft Word.

    Actually, IMHO, there should be some better alternative to .odt as well. Something more out of a declarative/scripted fashion like LaTeX but still WYSIWYG. LaTeX (and XeTeX, for my use cases) is too messy for me to work with, especially when a package is Byzantine. And it can be non-reproducible if I share/reuse the same document somewhere else.

    Something has to be made with document files.

    • megane-kun
      10
      1 year ago

      I was too young to use it in any serious context, but I kinda dig how WordPerfect does formatting. It is hidden by default, but you can show them and manipulate them as needed.

      It might already be a thing, but I am imagining a LaTeX-based standard for document formatting would do well with a WYSIWYG editor that would hide the complexity by default, but is available for those who need to manipulate it.

      • raubarno
        6
        1 year ago

        There are programs (LyX, TeXmacs) that implement WYSIWYG for LaTeX; TeXmacs is exceptionally good. I don’t know about the standards, though.

        Another problem with LaTeX and most of the other document formats is that they are so bloated and depend on many other tasks that it is hardly possible to embed the tool into a larger document. That’s a bit of criticism for UNIX design philosophy, as well. And LaTeX code is especially hard to make portable.

        There used to be a similar situation with PDFs, it was really hard to display a PDF embedded in application. Finally, Firefox pdf.js came in and solved that issue.

        The only embedded and easy-to-implement standard that describes a ‘document’ is HTML, for now (with JavaScript for scripting). Only that it’s not aware of page layout. If only there were an extension standard that could make an HTML page into a document…

        • megane-kun
          English
          3
          1 year ago

          I was actually thinking of something like markdown or HTML forming the base of that standard. But it’s almost impossible (is it?) to do page layout with either of them.

          But yeah! What I was thinking when I mentioned a LaTeX-based standard is to have a base set of “modules” (for a lack of a better term) that everyone should have and that would guarantee interoperability. That it’s possible to create a document with the exact layout one wants with just the base standard functionality. That things won’t be broken when opening up a document in a different editor.

          There could be additional modules to facilitate things, but nothing like the 90’s proprietary IE tags. The way I’m imagining this is that the additional modules would work on the base modules, making things slightly easier but that they ultimately depend on the base functionality.

          IDK, it’s really an idea that probably won’t work upon further investigation, but I just really like the idea of an open standard for documents based on LaTeX (kinda like how HTML has been for web pages), where you could work on it as a text file (with all the tags) if needed.

      • @DigitalJacobin@lemmy.ml
        English
        8
        1 year ago

        What’s messed up is that, technically, we do. Originally, OpenDocument was the ISO standard document format. But then, baffling everyone, Microsoft got the ISO to also have .docx as an ISO standard. So now we have 2 competing document standards, the second of which is simply worse.

    • @erogenouswarzone@lemmy.ml
      English
      6
      1 year ago

      Bro, trying to give padding in MS Word, when you know… YOU KNOOOOW… they can convert to HTML. It drives me up the wall.

      And don’t get me started on excel.

      Kill em all, I say.

  • @seaQueue@lemmy.world
    16
    edited
    1 year ago

    I’d like an update to the epub ebook format that leverages zstd compression and jpeg-xl. You’d see much better decompression performance (especially for very large books,) smaller file sizes and/or better image quality. I’ve been toying with the idea of implementing this as a .zpub book format and plugin for KOReader but haven’t written any code for it yet.

  • @mindbleach@sh.itjust.works
    18
    1 year ago

    I don’t give a shit which debugging format any platform picks, but if they could each pick one that every emulator reads and every compiler emits, that’d be fucking great.

    • @brax@sh.itjust.works
      8
      edited
      1 year ago

      Even simpler, I’d really like it if we could just unify whether or not $ is needed for variables, and pick # or // for comments. I’m sick of breaking my brain when I flip between languages because of these stupid little inconsistencies.

      • @Spore@lemmy.ml
        1
        1 year ago

        It does not work like that. $ is required in shell languages because they have quoteless strings and need to be super concise when calling commands. # and // are valid identifiers in many languages and all of them are well beyond the point of no return. My suggestion is to make use of your editor’s “turn this line into line comment” function and stop remembering them by yourself.

      • @mindbleach@sh.itjust.works
        4
        edited
        1 year ago

        Don’t forget ; is a comment in assembly.

        For extra fun, did you know // wasn’t standardized until C99? Comments in K&R C are all /* */. Possibly the most tedious commenting format ever devised.

        • @brax@sh.itjust.works
          2
          edited
          1 year ago

          /* */ is used in CSS as well, I think.

          Also we’ve got VB (and probably BASIC) out there using ' because why not lol

          [EDIT] I stand corrected by another comment REM is what BASIC uses. DOS batch files use that, too. They’re old though, maybe we give them a pass “it’s okay grampa, let’s get you back to the museum” 🤣 (disclaimer: I am also old, don’t worry)

    • @iegod@lemm.ee
      3
      1 year ago

      That just sounds impossible given the absolute breadth of all platforms and architectures though.

  • glibg10b
    14
    1 year ago

    JPEG XL for images because it compresses better than JPEG, PNG and WEBP most of the time.

    XZ because it theoretically offers the highest compression ratio in most circumstances, and long decompression time isn’t really an issue when the alternative is downloading a larger file over a slow connection.

    Config files stored as serialized data structures instead of in plain text. This speeds up read times and removes the possibility of syntax or type errors. Also, fuck JSON.

    I wish there were a good format for typesetting. Docx is closed and inflexible. LaTeX is unreadable, inefficient to type and hard to learn due to the inconsistencies that arise from its reliance on third-party packages and its lack of guidelines for their design.

    • @davefischer@beehaw.org
      English
      4
      1 year ago

      TeX / LaTeX documentation is infuriating. It’s either “use your university’s package to make a document that looks like this:” -or- program in alien assembly language.

      I like postscript for graphic design, but not so much for typesetting. For a flyer or poster, PS is great.

  • darcy
    8
    1 year ago

    i hate to be that guy, but pick the right tool for the right job. use markdown for a readme and latex for a research paper. you don’t need to create ‘the ultimate file format’ that can do both, but worse and less compatible

    • @intrepid@lemmy.ca
      2
      1 year ago

      I agree with your assertion that there isn’t a perfect format. But the example you gave - markdown vs latex has a counter example - org mode. It can be used for both purposes and a load of others. Matroska container is similarly versatile. They are examples that carefully designed formats can reach a high level of versatility, though they may never become the perfect solution.