I’m still new and not very well versed when it comes to Wayland and X11.

I’ve had CachyOS with KDE Plasma installed on my gaming PC for almost 2 months now and hadn’t had many issues until last week.

Before I begin, here are my specs: AMD Ryzen 5 5600X, 64GB DDR4 across 4 sticks, RTX 4080 Super.

Last week, I booted up my computer and logged into CachyOS. About 10 minutes in, the computer suddenly restarted itself. Weird, but okay: I logged in again and the issue didn’t recur. I always shut down when I’m done, so I shut down and booted up again the next day. A few days later it happened again in the same way, and a few days after that it happened once more. Then on Tuesday of this week it got a lot worse. I booted up and logged in, and after about 7 minutes it logged me out. When I entered my password and hit enter, the system froze, and about 3 minutes later the computer restarted.

This kept happening and getting worse, with the time I could stay logged in getting shorter, until today, when I logged in after a restart and was almost immediately kicked back to the login screen, after which the system froze and restarted. I was barely able to run commands to get logs. I then switched to a TTY to apply fixes and copy logs out to read on my laptop.

In hindsight, shortly before all this started, I did have a few black screens on login, where I’d log in and the screen would stay black for many minutes before the desktop finally loaded. This was before the freezing and logouts.

Today, I dug deep into the journalctl logs with an LLM, and it suggested that the logs showed an issue relating to Wayland. It also saw an error involving an AMD graphics card, but that can’t apply because I only use the Nvidia RTX 4080 Super, so my guess is that the error is not relevant to me and was just a generic one. It still suggested we try switching over to X11, and after following its steps, I am now in X11 and no longer having any of these issues.

I’m not seeing any real differences or anything that makes me want to go back to Wayland, but again, I’m still not very knowledgeable on this.

I think the most likely explanation is that something got updated and began going haywire. I have been learning how to use pacman, and around this time I learned about “sudo pacman -Syu” for updating packages, which I’ve been running more often to make sure my machine receives updates. Prior to this, I hadn’t updated anything on the system and was using whatever came with the image I downloaded.
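
One rough way to test the update theory would be to see which graphics-related packages changed around the time the problems started. This is only a sketch: it assumes pacman’s default log at /var/log/pacman.log, and both the function name and the list of package-name prefixes are my own guesses at likely suspects, not a confirmed list.

```shell
# Show the most recent upgrades of packages that plausibly affect the graphics stack.
# Reads pacman log text on stdin, so it can be pointed at any copy of the log.
# The package-name prefixes are an assumption; adjust them as needed.
recent_graphics_upgrades() {
    grep -E '\[ALPM] upgraded (linux|nvidia|kwin|plasma|wayland|mesa|xorg)' | tail -n 20
}

# Usage (default pacman log location on Arch-based systems):
# recent_graphics_upgrades < /var/log/pacman.log
```

Each matching line carries a timestamp and the old and new versions, which can be compared against the dates of the restarts.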

So I have a few questions I was hoping the community could help me with:

  1. Has anyone heard of anything like this happening before, and do you know what could cause it? The way it kept getting worse almost felt like hardware degradation, the way physical objects degrade over time if you don’t fix them. With software I’d expect to hit the same issue after roughly the same amount of time on each boot, not for it to get progressively worse each day. The LLM had suggested hardware as a possible cause and I was leaning towards that, but not having issues on X11 makes me think that’s not the case here.

  2. My main use on this system is gaming. Are there any differences between Wayland and X11 that would make me want to go back to Wayland? Are there any other reasons I may want to go back and figure out what caused this problem and fix it permanently? Or any reasons I may end up preferring X11?

  3. Is it possible X11 will encounter the same issues eventually that Wayland did based on the behaviors described?

  • just_another_person@lemmy.world
    16 days ago

    More likely to be hardware than software. Have you checked your temps and voltages lately?

    Check dmesg for errors related to memory, and after the machine comes back up, check your logs prior to it shutting down.

    Overheating and bad memory both present similarly to this, and bad voltage could as well in some cases.
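
    As a starting point for the dmesg check, something like the filter below could narrow the output down. It is a minimal sketch: the function name is made up, and the keyword list is an assumption about what memory, thermal, and machine-check messages tend to contain, so it will miss some things and match some harmless lines.

    ```shell
    # Keep only kernel log lines that hint at memory, thermal, or machine-check trouble.
    # Reads log text on stdin; the keyword list is a guess, not an exhaustive set.
    hardware_hints() {
        grep -iE 'hardware error|machine check|edac|thermal|out of memory'
    }

    # Usage (dmesg usually needs root):
    # sudo dmesg | hardware_hints
    ```

    No output does not prove the hardware is fine; it only means none of these keywords showed up in the current boot’s log.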

    • mranderson17@infosec.pub
      16 days ago

      This. I had symptoms extremely similar to OP’s and saw this in the kernel logs immediately before the system would reset:

      [    0.705185] mce: [Hardware Error]: Machine check events logged
      [    0.705187] mce: [Hardware Error]: CPU 17: Machine Check: 0 Bank 5: baa0000000090150
      [    0.705190] fbcon: Taking over console
      [    0.705191] mce: [Hardware Error]: TSC 0 MISC d012000200000000 SYND 4d000020 IPID 500b000000000
      [    0.705195] mce: [Hardware Error]: PROCESSOR 2:a20f12 TIME 1678252812 SOCKET 0 APIC 3 microcode a20120a
      

      It turned out to be a hardware issue with my CPU (AMD Ryzen 9 5950X). I got it replaced under warranty (twice actually, the first replacement had other issues) and everything is fine now. Definitely check what the kernel logs say.

      You can look at the kernel log from the boot before the last reboot with journalctl -k -b -1 (as root).
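
      If that log turns out to contain machine-check lines like the ones above, a small filter can show which CPU and bank they come from and how often. This is a sketch only: the function name is hypothetical, and it assumes the lines follow the same “CPU N: Machine Check: N Bank N” wording as in my log.

      ```shell
      # Count machine-check reports per CPU and bank from kernel log text on stdin.
      # Assumes the line format shown in the log excerpt above.
      summarize_mce() {
          grep -oE 'CPU [0-9]+: Machine Check: [0-9]+ Bank [0-9]+' \
              | sed -E 's/^CPU ([0-9]+): Machine Check: [0-9]+ Bank ([0-9]+)$/cpu=\1 bank=\2/' \
              | sort | uniq -c
      }

      # Usage (previous boot's kernel log, as root):
      # sudo journalctl -k -b -1 --no-pager | summarize_mce
      ```

      Repeated hits on the same CPU and bank across several boots would point more strongly at hardware than a single one-off entry.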