Taskomatic troubleshooting, systems always show available updates – Spacewalk 2.3 / Satellite 5.x

"Consistency is the last refuge of the unimaginative."


Lately, one of my projects at work has been to get a patch management solution in place for our RHEL/CentOS Linux systems.  Since Red Hat’s Satellite product doesn’t manage CentOS systems, and because there is no easy method of configuring the pieces that Satellite 6 is based on, we decided to go with Spacewalk.  For longer than I am willing to admit, one bug in the program was bothering me: I would apply a patch to a system or upgrade a package, but Spacewalk would still show the package as having an update available, even showing the same package name and version number in the “installed” and “available” columns (e.g. glibc-2.12-1.166.el6_7.1.x86_64 available, glibc-2.12-1.166.el6_7.1.x86_64 installed).

I’d been trawling the mailing lists, checking forum posts, reading through Red Hat’s “solutions” (e.g. https://access.redhat.com/solutions/1237493, which tells you to gather logs and create a ticket with Red Hat, no thanks), and generally banging my head on the wall trying to figure it out.  I’d enabled DEBUG logging in /usr/share/rhn/classes/log4j.properties, tried increasing the number of workers in /etc/rhn/rhn.conf, increased the JVM memory parameters in /usr/share/rhn/config-defaults/rhn_taskomatic_daemon.conf, and so on.

On one of the sites I visited, there was a mention of the rhntaskqueue table in the Spacewalk database.  For us, that’s an external Postgres database, but it’s easily accessible with “sudo spacewalk-sql -i”.  Looking through the table, I found an old job that was causing all of the other jobs to fail.

Initially, I would have thought a simple “DELETE FROM rhntaskqueue WHERE task_data='156'” would finish quickly and I’d be on my way, but with ~1,800 table locks held while Spacewalk was running, I had to first stop the Spacewalk services, then delete the table’s contents (though a “TRUNCATE rhntaskqueue” would have done it also).

[me@work ~]$ date
Mon Aug 17 21:37:53 CDT 2015
[me@work ~]$ sudo spacewalk-sql -i
[sudo] password for <thisguy> :
psql (8.4.20)
SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256)
Type "help" for help.
rhnschema=# select * from rhntaskqueue;
 org_id |           task_name            | task_data  | priority |           earliest           
--------+--------------------------------+------------+----------+-------------------------------
      1 | update_server_errata_cache     | 1000012176 |        0 | 2015-08-17 19:27:52.318568-05
... some others
      1 | update_errata_cache_by_channel |        156 |        0 | 2015-08-11 12:51:16.753-05
(14 rows)
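
For anyone who wants the whole sequence in one place, here is a rough sketch of the steps described above, not a tested recovery script.  It assumes the services are managed with spacewalk-service (I only said “stop Spacewalk services” above, so substitute whatever your install uses), and the helper names (inspect_sql, delete_sql, run) are made up for illustration.  It only prints what it would do unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch of the recovery sequence; dry-run unless RUN=1 is set.
# TASK_DATA defaults to the stuck row's task_data from the listing above.
TASK_DATA="${TASK_DATA:-156}"

# Query to spot stale entries: oldest "earliest" timestamps come first.
inspect_sql() {
    printf '%s\n' "SELECT task_name, task_data, earliest FROM rhntaskqueue ORDER BY earliest;"
}

# DELETE for one stuck entry (quoted literal, same form as the statement above).
delete_sql() {
    printf "DELETE FROM rhntaskqueue WHERE task_data = '%s';\n" "$1"
}

# Print a command, or execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Assumption: spacewalk-service is the service wrapper on this host.
run sudo spacewalk-service stop      # stop services so the table locks are released
echo "SQL to paste at the rhnschema=# prompt:"
inspect_sql
delete_sql "$TASK_DATA"
run sudo spacewalk-sql -i            # interactive session, as in the transcript
run sudo spacewalk-service start
```

Run it once as-is to read the plan, check the SQL against your own rhntaskqueue contents, and only then consider RUN=1.  Deleting a single row is more conservative than emptying the table, so that is what the sketch prints.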

So, with that, I’m pretty happy to have figured this out and hopefully it saves someone else working with Spacewalk days/weeks/months of restarting services to get it working.  Now hopefully I can avoid restarting Taskomatic and having hundreds of notification e-mails sent out to the poor souls I’ve registered on the system.

 
