if you could pick a standard format for a purpose what would it be and why?

e.g. flac for lossless audio because…

(yes you can add new categories)

summary:

  1. photos .jxl
  2. open domain image data .exr
  3. videos .av1
  4. lossless audio .flac
  5. lossy audio .opus
  6. subtitles srt/ass
  7. fonts .otf
  8. container mkv (doesn't contain .jxl)
  9. plain text utf-8 (many also say markup but disagree on the implementation)
  10. documents .odt
  11. archive files (this one is causing a bloodbath so i picked randomly) .tar.zst
  12. configuration files toml
  13. typesetting typst
  14. interchange format .ora
  15. models .gltf / .glb
  16. daw session files .dawproject
  17. otdr measurement results .xml
  • DigitalJacobin@lemmy.ml
    1 year ago

    This is the kind of thing i think about all the time so i have a few.

    • Archive files: .tar.zst
      • Produces better compression ratios than the DEFLATE compression algorithm (used by .zip and gzip/.gz) and does so faster.
      • By separating the jobs of archiving (.tar), compressing (.zst), and (if you so choose) encrypting (.gpg), .tar.zst follows the Unix philosophy of “Make each program do one thing well.”
      • .tar.xz is also very good and seems more popular (probably because xz was released in 2009, 6 years before zstd), but, when tuned to its maximum compression level, .tar.zst can achieve a compression ratio pretty close to LZMA (used by .tar.xz and .7z) while decompressing far faster[1].

        zstd and xz trade blows in their compression ratio. Recompressing all packages to zstd with our options yields a total ~0.8% increase in package size on all of our packages combined, but the decompression time for all packages saw a ~1300% speedup.

    • Image files: JPEG XL/.jxl
      • “Why JPEG XL”
      • Free and open format.
      • Can handle lossy images, lossless images, images with transparency, images with layers, and animated images, giving it the potential of being a universal image format.
      • Much better quality and compression efficiency than current lossy and lossless image formats (.jpeg, .png, .gif).
      • Produces much smaller files for lossless images than AVIF[2]
      • Supports much larger resolutions than AVIF’s 9-megapixel limit (important for lossless images).
      • Supports up to 24-bit color depth per channel, much more than AVIF’s 12-bit per-channel limit (which, to be fair, is probably good enough).
    • Videos (Codec): AV1
      • Free and open format.
      • Much more efficient than H.264 (the codec most often found in .mp4 files; x264 is its popular encoder) and VP9[3].
    • Documents: OpenDocument / ODF / .odt

      It’s already a NATO standard for documents. Also, the Microsoft Word formats (.doc, .docx) are unusable outside the Microsoft Office ecosystem. I feel outraged every time I need to edit a .docx file because the layout breaks so easily. And some older .doc files don’t even work with Microsoft Word.


    1. https://archlinux.org/news/now-using-zstandard-instead-of-xz-for-package-compression/ ↩︎

    2. https://tonisagrista.com/blog/2023/jpegxl-vs-avif/ ↩︎

    3. https://engineering.fb.com/2018/04/10/video-engineering/av1-beats-x264-and-libvpx-vp9-in-practical-use-case/ ↩︎

    • jackpot@lemmy.mlOP
      1 year ago
      • By separating the jobs of archiving (.tar), compressing (.zst), and (if you so choose) encrypting (.gpg), .tar.zst follows the Unix philosophy of “Make each program do one thing well.”

      wait so does it do all of those things?

      • DigitalJacobin@lemmy.ml
        1 year ago

        So there’s a tool called tar that creates an archive (a .tar file). Then there’s a tool called zstd that can be used to compress files, including .tar files, which then become .tar.zst files. And then you can encrypt your .tar.zst file using a tool called gpg, which would leave you with an encrypted, compressed .tar.zst.gpg archive.

        Now, most people aren’t doing everything in the terminal, so the process for most people would be pretty much the same as creating a ZIP archive.
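
        The layering described above can be sketched in Python. This is only an illustration, not how tar, zstd, or gpg are implemented: it uses the standard library's tarfile for the archive step and swaps in lzma (the xz format) for the compression step, because Python only gained a built-in zstd module in 3.14. The encryption step is left out, and the file names are made up.

```python
import io
import lzma
import tarfile


def make_tar(files: dict[str, bytes]) -> bytes:
    """Archive step: bundle files into an uncompressed .tar byte string."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_tar(tar_bytes: bytes) -> dict[str, bytes]:
    """Unpack step: read every regular file back out of a .tar byte string."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        for member in tar.getmembers():
            if member.isfile():
                out[member.name] = tar.extractfile(member).read()
    return out


# Hypothetical sample files, repetitive on purpose so they compress well.
files = {"notes.txt": b"hello " * 1000, "data.csv": b"a,b,c\n" * 1000}

tar_bytes = make_tar(files)            # archiving only, no compression
compressed = lzma.compress(tar_bytes)  # compression only (stand-in for zstd)

# Undo the layers in reverse order: decompress first, then unpack.
restored = read_tar(lzma.decompress(compressed))
print(len(tar_bytes), "bytes archived,", len(compressed), "bytes compressed")
```

        Each function only knows about its own layer, which is why the compressor could be swapped without touching the archive code.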

    • piexil@lemmy.world
      1 year ago

      I get a better compression ratio with xz than zstd, both at their highest levels, when building an Ubuntu squashFS.

      Zstd is way faster though

    • ronweasleysl@lemmy.ml
      1 year ago

      Damn didn’t realize that JXL was such a big deal. That whole JPEG recompression actually seems pretty damn cool as well. There was some noise about GNOME starting to make use of JXL in their ecosystem too…

              • DigitalJacobin@lemmy.ml
                1 year ago

                I get the frustration, but Windows is the one that strayed from convention/standard.

                Also, i should’ve asked this earlier, but doesn’t Windows also only look at the characters following the last dot in the filename when determining the file type? If so, then this should be fine for Windows, since there’s only one canonical file extension at a time, right?
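
                A quick way to see what a "last dot only" rule does with these names is Python's pathlib. This is a sketch of that rule as pathlib implements it, assuming it matches the behavior asked about; it is not a claim about how the Windows shell actually resolves file types. The helper name and file names are made up.

```python
from pathlib import PureWindowsPath


def last_dot_extension(filename: str) -> str:
    """Return only the part after the final dot, lowercased (hypothetical helper)."""
    return PureWindowsPath(filename).suffix.lower()


names = ["backup.tar.zst", "invoice.pdf.exe", "photo.jxl"]
for name in names:
    all_suffixes = PureWindowsPath(name).suffixes
    print(name, "->", last_dot_extension(name), "| all suffixes:", all_suffixes)
```

                Under this rule a .tar.zst file is simply a .zst file, and the same rule is what makes a name like invoice.pdf.exe resolve to .exe.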

                  • DigitalJacobin@lemmy.ml
                    1 year ago

                    However, getting people used to double extensions is one quick way of increasing the success rate of attacks such as the infamous “.pdf.exe” invoice from an email attachment.

                    Very good point. Though, i would argue that this would be much less of a problem if Windows stopped sometimes hiding file extensions.

                    I can’t see how Windows’ convention is worse

                    I don’t believe what you’re referring to is really a Windows versus Linux/Unix thing.

                    If I zip a file, it doesn’t matter what it was in a previous life, it’s now a zip - this is also how Unix deals with many filetypes, I’ve never seen a .h264.mp4 file, even though the .mp4 container can actually represent different types of encoding.

                    I disagree, but i do get what you’re saying here. I don’t think that example really works though, because a .mp4 file isn’t derived from a .h264 file. A .mp4 is a container that may include h264-encoded video, but it may also have a channel with Opus-encoded audio or something. It’s apples and oranges.

                    Also, even though there shouldn’t be any technical issues with this on Windows, you can still use a typical short filename suffix if you wish, though i would argue that using the long filename suffix is more expressive. From “tar (computing)” on Wikipedia:

                    Compressor  Long       Short
                    bzip2       .tar.bz2   .tb2, .tbz, .tbz2, .tz2
                    gzip        .tar.gz    .taz, .tgz
                    lzip        .tar.lz    (none)
                    lzma        .tar.lzma  .tlz
                    lzop        .tar.lzo   (none)
                    xz          .tar.xz    .txz
                    compress    .tar.Z     .tZ, .taZ
                    zstd        .tar.zst   .tzst
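
                    If you want to accept the short suffixes but store the long ones, a small lookup is enough. This is a minimal sketch covering only a subset of the table; the mapping name, the function name, and the file names are made up for illustration.

```python
# Subset of the short-to-long suffix pairs from the table above.
SHORT_TO_LONG = {
    ".tgz": ".tar.gz",
    ".tbz2": ".tar.bz2",
    ".tlz": ".tar.lzma",
    ".tzst": ".tar.zst",
}


def expand_short_suffix(filename: str) -> str:
    """Rewrite a short tarball suffix to its long form; leave other names alone."""
    lowered = filename.lower()
    for short, long_form in SHORT_TO_LONG.items():
        if lowered.endswith(short):
            return filename[: -len(short)] + long_form
    return filename


for name in ["site.tgz", "photos.tzst", "already.tar.zst", "notes.txt"]:
    print(name, "->", expand_short_suffix(name))
```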