• Deckweiss@lemmy.world
    7 months ago

    On Linux you could do this easily by running a script every n minutes that takes a screenshot, pipes it to an AI, and stores the result wherever you want (including a local NAS).
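
A minimal sketch of such a loop, under some assumptions: `grim` is used as the screenshot tool (a Wayland utility; on X11 something like `scrot` would take its place), and `describe` is a hypothetical placeholder for whatever local model you would actually call. None of these names come from the thread.

```python
import subprocess
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable

# Assumption: `grim` is installed and takes the output file as its argument.
# Swap in whatever screenshot tool your setup provides.
SCREENSHOT_CMD = ["grim"]


def take_screenshot(path: Path) -> None:
    """Capture the screen to `path` by running an external tool."""
    subprocess.run([*SCREENSHOT_CMD, str(path)], check=True)


def describe(path: Path) -> str:
    """Hypothetical hook: send the image to a local model, return its text."""
    return f"(no model configured) {path.name}"


def capture_once(out_dir: Path,
                 shoot: Callable[[Path], None] = take_screenshot,
                 summarize: Callable[[Path], str] = describe) -> Path:
    """Take one screenshot and store the model's description next to it."""
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    image = out_dir / f"{stamp}.png"
    shoot(image)
    image.with_suffix(".txt").write_text(summarize(image), encoding="utf-8")
    return image


def run_forever(out_dir: Path, every_minutes: float) -> None:
    """Repeat capture_once every n minutes until interrupted."""
    while True:
        capture_once(out_dir)
        time.sleep(every_minutes * 60)
```

Pointing `out_dir` at a mounted NAS share would cover the storage part; a cron job or systemd timer calling `capture_once` would work just as well as the `run_forever` loop.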

    With some extra effort you could tie it into the DE/WM and take a screenshot when a new app is opened, on a focus switch, on a virtual desktop change, or whatever, and then slow down the periodic ones so you don’t end up with 3000 screenshots of the same long gaming session. Or just constantly log the currently running processes as well, to give the AI additional context.
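
One way to sketch that capture policy: a focus change triggers a screenshot immediately, and while the same window stays focused the periodic captures back off. How you learn the focused window's name depends on your DE/WM and is left out here; the class name, field names, and default intervals are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CapturePolicy:
    """Decide when a screenshot is worth taking.

    A focus change triggers a capture right away; while the focused window
    stays the same, periodic captures back off up to `max_interval` seconds.
    """
    base_interval: float = 60.0
    max_interval: float = 1800.0
    _last_window: Optional[str] = None
    _last_capture: Optional[float] = None
    _interval: float = 0.0

    def should_capture(self, window: str, now: float) -> bool:
        if window != self._last_window:
            # New app or focus switch: capture and reset the backoff.
            self._last_window = window
            self._interval = self.base_interval
            self._last_capture = now
            return True
        if now - self._last_capture >= self._interval:
            # Same window for a while: capture, then wait twice as long.
            self._last_capture = now
            self._interval = min(self._interval * 2, self.max_interval)
            return True
        return False
```

With the defaults above, a long gaming session settles at one screenshot every 30 minutes, while switching to another app produces a capture right away.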

    There are so many cool opportunities with this. I hope somebody makes something cool and useful with it for Linux that runs completely locally. You could then ask your computer, “Hey, what video did I watch about Godot a week ago? It had something with tilesets in it and I was coding alongside it.” Or “How many hours have I been working on that project in the last 2 weeks?”
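
The "how many hours" question becomes simple arithmetic once every capture carries a project label. The sketch below assumes that labelling has already happened somehow, which is the hard part; the `Sample` shape and the 10-minute gap cap are illustrative choices, not anything an existing tool prescribes.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple

Sample = Tuple[datetime, str]  # (when the capture happened, project label)


def hours_on_project(samples: Iterable[Sample], project: str,
                     since: datetime,
                     max_gap: timedelta = timedelta(minutes=10)) -> float:
    """Estimate hours spent on `project` from timestamped samples.

    Each sample is credited with the time until the next sample, capped at
    `max_gap` so that nights and lunch breaks are not counted as work.
    The final sample has no successor and is credited nothing.
    """
    ordered = sorted(s for s in samples if s[0] >= since)
    total = timedelta()
    for (start, label), (end, _) in zip(ordered, ordered[1:]):
        if label == project:
            total += min(end - start, max_gap)
    return total.total_seconds() / 3600
```

For the "last 2 weeks" version, `since` would be `datetime.now() - timedelta(weeks=2)`.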

    • taladar@sh.itjust.works
      7 months ago

      Or “How many hours have I been working on that project in the last 2 weeks?”

      I highly doubt current AI models are capable of figuring out which bit of work you do is related to which project.

      • marcos@lemmy.world
        7 months ago

        AIs have been capable of doing this for ages already.

        It just falls into the set of useful stuff that LLMs trained as chatbots suck at because they had the useless goal of convincing people they are smart.

        • taladar@sh.itjust.works
          7 months ago

          I mean sure, if you program in folder A for project A and folder B for project B, that is easy and doesn’t even require any machine learning. But I was thinking of research in the browser or writing documents that are not directly labelled with any project information.
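
The folder case really is just a lookup. A minimal sketch, assuming you keep a hand-written mapping from project root folders to project names (the function name and mapping are hypothetical):

```python
from pathlib import PurePosixPath
from typing import Mapping, Optional


def project_for_path(cwd: str, roots: Mapping[str, str]) -> Optional[str]:
    """Return the project whose root folder contains `cwd`, if any.

    When roots are nested, the deepest matching root wins.
    """
    path = PurePosixPath(cwd)
    best = None
    for root, name in roots.items():
        root_path = PurePosixPath(root)
        if path == root_path or root_path in path.parents:
            if best is None or len(root_path.parts) > len(best[0].parts):
                best = (root_path, name)
    return best[1] if best else None
```

Anything outside the mapped folders, such as a browser tab or a document in a downloads folder, comes back as `None`, which is exactly the case that stays hard.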

          • marcos@lemmy.world
            7 months ago

            No, AIs have been capable of looking at your code|text|image|whatever and telling the project apart. For ages. It’s not even impressive anymore.

            • taladar@sh.itjust.works
              7 months ago

              Even humans couldn’t do that. How would an AI know that the API documentation for the standard library I am looking at is something I am looking at because I need it for code in a specific project? That information just isn’t there unless you can also read my mind at the time.

      • Deckweiss@lemmy.world
        7 months ago

        It would not work for certain cases, you’re right, like stuff with many differently named documents or doing research in the browser.

        But for what I had in mind, I’ve checked: on my setup the project’s name is in the app’s window decoration, in the CLI when I do the commits, in the directory view sidebar, in the OS taskbar, etc.

        It should be pretty straightforward to figure out from a screenshot, even when the app is not in the foreground.
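
If the project name shows up in the window title, even plain string matching on the title gets you a label without looking at pixels. A rough sketch, assuming you supply the list of known project names yourself (the function and the example titles are invented for illustration):

```python
import re
from typing import Iterable, Optional


def project_from_title(title: str, projects: Iterable[str]) -> Optional[str]:
    """Return the first known project whose name appears as a whole word in `title`."""
    for name in projects:
        # Lookarounds keep "blog" from matching inside "weblog".
        if re.search(rf"(?<!\w){re.escape(name)}(?!\w)", title, re.IGNORECASE):
            return name
    return None
```

A title like `main.py - tileset-editor - Visual Studio Code` resolves to `tileset-editor`, while a browser tab showing standard library docs resolves to nothing, so the browser-research case stays unlabelled.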

        • andrew_bidlaw@sh.itjust.works
          7 months ago

          We can log active processes and services, window titles and states, window and mouse positions, and integrate that with git history and browser history, plus a history of all file moves done by select programs and/or by the user. If there’s an AI assistant like M$ Copilot, also log every request and output in text form, log keys, and back up settings and configs. As for screenshots, a pure text table is as light as a feather and easier to work with, so this 3-second delay looks like overkill, even though they’d find a way to compress it. With enough data, it’s probably easier to take the time and reconstruct an approximate screencap than to hoard them.
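
A text log along those lines can be very small. The sketch below appends one CSV row per snapshot and reads process names from `/proc`, so it is Linux-only; the focused window title has to come from your DE/WM and is simply passed in. The file layout and function names are assumptions for illustration.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def running_process_names(proc: Path = Path("/proc")) -> List[str]:
    """List the names of running processes by reading /proc (Linux only)."""
    names = set()
    for entry in proc.iterdir():
        if entry.name.isdigit():
            try:
                names.add((entry / "comm").read_text().strip())
            except OSError:
                continue  # the process exited while we were reading
    return sorted(names)


def append_snapshot(log: Path, window_title: str, processes: List[str]) -> None:
    """Append one row: UTC timestamp, focused window title, process names."""
    is_new = not log.exists()
    with log.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        if is_new:
            writer.writerow(["timestamp", "window_title", "processes"])
        writer.writerow([datetime.now(timezone.utc).isoformat(),
                         window_title, ";".join(processes)])
```

Each row is a short line of text, so logging every few seconds costs far less storage than keeping a screenshot at the same rate.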

          I imagine dragging your position on a timeline across entire months may be a fun novelty. But I don’t see myself having a reason to use it, and I’d rather lose information than log so much of it, even if it’s secure.