trinity-users@lists.pearsoncomputing.net

Message: previous - next
Month: February 2019

Kdesktop 100% CPU issue

From: Russell Brown <russell@...>
Date: Wed, 6 Feb 2019 12:44:14 +0000 (GMT)

Sorry; here's another R14.0.6 issue we're seeing (Yes, I can/will report
this as a bug but thought I'd throw it out here first just to see if
anyone's got any ideas).



Logging in (via XDMCP) multiple times as the same user/uid is causing
kdesktop to do weird refresh stuff then get stuck in a loop (100% CPU
usage).  I have seen this once-in-a-blue-moon under 14.0.5/Debian Jessie
but 14.0.6/Debian Stretch seems to make it fail every time.

Yes, I know it's bad practise to login multiple times as the same user
but people will do these things despite being told not to :-(

So, before I get into the nitty gritty about creating a failing case, is
there anything obvious I should set/unset to stop this?  Is it a known
problem?





How to recreate.

Hard: have a bunch of X-Terminals.

Easy:  Use vnc :-) (see below for setup).

User logs in (XDMCP/tdm) on first xterm/vnc session.

Same user logs in again from second xterm/vnc session.

The cursor on each desktop flickers with a busy?  sort of pointer and
the desktop refreshes sparodically.  NB.  Logging out of one session
seems to set a desktop lock on the second.

kdesktop for each session will probably have a high CPU use.

Same user logs in a third time and you end up with a kdesktop running
100% CPU and the desktop(s) get unresponsive (can happen with just two
sessions but three makes it fail quickly).

  PID   USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  10285 utest     20   0  258196  23212  19128 S 100.0  0.0  45:48.54 kdesktop
  21584 lls       20   0   44608   4636   2804 R  26.3  0.0   0:00.07 top
      1 lls       20   0  205688   7984   5312 S   0.0  0.0   3:50.32 systemd

Looking at the 100% kdesktop process, it has a thread (10287) running in
a constant poll timeout loop (strace streams the lines below):

poll([{fd=13, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
poll([{fd=13, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
poll([{fd=13, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
poll([{fd=13, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)

Looking at /proc/10287/fd I see:

lr-x------ 1 utest tract 64 Feb  6 11:17 0 -> pipe:[11722107]
l-wx------ 1 utest tract 64 Feb  6 11:17 1 -> /home/utest/.xsession-errors
lr-x------ 1 utest tract 64 Feb  6 11:17 10 -> pipe:[11687858]
l-wx------ 1 utest tract 64 Feb  6 11:17 11 -> pipe:[11687858]
lrwx------ 1 utest tract 64 Feb  6 11:17 12 -> socket:[11667305]
lr-x------ 1 utest tract 64 Feb  6 11:17 13 -> pipe:[11688700]
l-wx------ 1 utest tract 64 Feb  6 11:17 14 -> pipe:[11688700]
lrwx------ 1 utest tract 64 Feb  6 11:17 15 -> anon_inode:[eventfd]
lr-x------ 1 utest tract 64 Feb  6 11:17 16 -> /var/tmp/tdecache-utest/tdesycoca
l-wx------ 1 utest tract 64 Feb  6 11:17 2 -> /home/utest/.xsession-errors
lrwx------ 1 utest tract 64 Feb  6 11:17 3 -> socket:[11746354]
lrwx------ 1 utest tract 64 Feb  6 11:17 4 -> socket:[11687855]
lrwx------ 1 utest tract 64 Feb  6 11:17 5 -> socket:[11687856]
lr-x------ 1 utest tract 64 Feb  6 11:17 6 -> pipe:[11687857]
l-wx------ 1 utest tract 64 Feb  6 11:17 7 -> pipe:[11687857]
lrwx------ 1 utest tract 64 Feb  6 11:17 8 -> anon_inode:[eventfd]
lrwx------ 1 utest tract 64 Feb  6 11:17 9 -> anon_inode:[eventfd]

The pipe on fd 13 for the thread seems to be connected to it's parent:

# (find /proc -type l | xargs ls -l | grep 11688700) 2>/dev/null
lr-x------ 1 utest tract 64 Feb  6 11:57 /proc/10285/fd/12 -> pipe:[11688700]
l-wx------ 1 utest tract 64 Feb  6 11:57 /proc/10285/fd/13 -> pipe:[11688700]
lr-x------ 1 utest tract 64 Feb  6 11:58 /proc/10285/task/10285/fd/12 -> pipe:[11688700]
l-wx------ 1 utest tract 64 Feb  6 11:58 /proc/10285/task/10285/fd/13 -> pipe:[11688700]
lr-x------ 1 utest tract 64 Feb  6 11:58 /proc/10285/task/10287/fd/12 -> pipe:[11688700]
l-wx------ 1 utest tract 64 Feb  6 11:58 /proc/10285/task/10287/fd/13 -> pipe:[11688700]
#

-------
VNC setup.


I'm using Debian Stretch.  The easiest way to setup a VNC test case is
to add this to /etc/services on the TDE server system (192.168.1.1 in my
case).

# VNC Servers
vnc0            5900/tcp
vnc1            5901/tcp
vnc2            5902/tcp
#

Then create /etc/xinetd.d/vnc0 like so:

service vnc0
{
        user = nobody
        disable = no
        protocol = tcp
        socket_type = stream
        flags = KEEPALIVE
        wait = no
        group = tract
        instances = 1
        log_type = FILE /var/log/vnc
        log_on_failure += HOST USERID ATTEMPT

        server = /usr/bin/Xvnc
        server_args = -inetd -NeverShared -query localhost -once -depth 16 -geometry 1024x768 -SecurityTypes none
}

Copy /etc/xinetd.d/vnc0 to /etc/xinetd.d/vnc1 and change "service vnc0"
to "service vnc1".  Repeat for vnc2.  Restart xinetd.

From another system on the network fire up a vnc viewer for the first session:

vncviewer 192.168.1.1:5900

Login as test user.

and for the second:

vncviewer 192.168.1.1:5901

Login as same user in on that one.

You'll now see the desktops in each vncviewer window refreshing themselves
and a kdesktop process for each one using significant CPU:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                
26694 utest     20   0  275004  33772  27016 R  49.7  0.0   1:35.19 kdesktop                                                                               
27099 utest     20   0  275004  33544  26784 S  47.0  0.0   1:38.19 kdesktop      

Fire up a third:

vncviewer 192.168.1.1:5902

Login as same user on that.

You'll now get a kdesktop consuming 100% CPU:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                
10805 utest     20   0  278088  37156  27332 S 100.3  0.0   4:15.84 kdesktop                                                                               
26694 utest     20   0  275004  33772  27016 S  43.3  0.0   3:37.91 kdesktop                                                                               
27099 utest     20   0  275004  33544  26784 S  42.0  0.0   3:36.62 kdesktop                                                                               

On this third session, kdesktop becomes unresponsive (not surprising as
it's stuck in a loop!).

NB.  I've also tried this using lightdm instead of tdm and exactly the
same thing happens.

-- 
 Regards,
     Russell
 --------------------------------------------------------------------
| Russell Brown          | MAIL: russell@... PHONE: 01780 471800 |
| Lady Lodge Systems     | WWW Work: http://www.lls.com              |
| Peterborough, England  | WWW Play: http://www.ruffle.me.uk         |
 --------------------------------------------------------------------