Hamster Republic -> James's Stuff -> James's GNU/Linux Gremlins Home




This page is just a log of various problems I have had with Linux, along with their solutions. So often I have run into things that didn't work quite the way they were expected to, as if there were always one or two terribly critical steps left out of the instructions. This page is partly meant to fill in the gaps for anybody who is having the same problems that I did, and partly just a log of my own stupid blunders.


June 20 2008

Why this page is so quiet lately

  1. Linux is getting better over time, and I am running into fewer problems overall.
  2. I'm sufficiently experienced with Linux now that I tend to solve problems more quickly, and I find answers with less searching, so I am less often driven to the old frustration that used to cause me to write about problems here.
  3. If I did write about any more Linux problems, I would write about them on my WordPress blog, since blogging via WordPress's GUI is orders of magnitude easier than editing these silly HTML files here :)

April 24 2006

How SDL_mixer and timidity work together

I have been aware for quite some time that SDL_Mixer uses Timidity for MIDI music output, but I have not been aware of how the details of the implementation actually work.

Recently, a user of the OHRRPGCE reported having problems with music on GNU/Linux (Mandriva). He said that he already had timidity installed. This prompted me to research how the SDL_mixer/Timidity connection works.

SDL_Mixer does NOT dynamically link to libtimidity. It does not call the timidity command-line binary. Instead, it uses an embedded fork of the timidity code (based on the older 0.2 version, not the newer-yet-still-obsolete timidity++ branch). I figured this out by downloading the latest SDL_Mixer tarball, and then reading the text files inside the timidity subfolder thereof. I read that SDL_mixer does require instrument patches (sound samples) to be installed. Presumably this is why installing the timidity package on my Debian systems caused it to "just work". The timidity package pulls in the freepats package, which contains the instrument patches.

I figured out which file my SDL_mixer library is by running ldd on a binary that I know links to it:

james@gilgamesh:~/src/ohrrpgce/wip$ ldd ohrrpgce-game | grep "mixer"
        libSDL_mixer-1.2.so.0 => /usr/lib/libSDL_mixer-1.2.so.0 (0xa7b59000)

I then searched that library binary for strings related to midi:

james@gilgamesh:~/src/ohrrpgce/wip$ strings /usr/lib/libSDL_mixer-1.2.so.0 | grep "midi"
/usr/local/lib/timidity
Warning: "%s": Short midi file.
%s: read_midi_event: %s
/etc/timidity/timidity.cfg
EXT midi %s , %d 
can't get info about midi ports
/dos/ultrasnd/midi/%s.pat

That information led me to make the following guesses: /usr/local/lib/timidity is the default location for the patches. SDL_Mixer can look for timidity config in /etc/timidity/timidity.cfg, and apparently it can also read Ultrasnd patches from a DOS partition, if you happen to have one mounted.
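For anyone without the strings utility handy, here is a minimal sketch of the same search in Python. It only approximates what strings does (runs of printable ASCII of at least four bytes), and the function names are made up for this example.

```python
import re

def printable_strings(data, min_len=4):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

def grep_strings(data, needle, min_len=4):
    """Keep only the extracted strings that contain needle (like `| grep`)."""
    return [s for s in printable_strings(data, min_len) if needle in s]

# Possible usage, with the library path taken from the ldd output above:
#   with open("/usr/lib/libSDL_mixer-1.2.so.0", "rb") as f:
#       for s in grep_strings(f.read(), "midi"):
#           print(s)
```

Note that "timidity" itself contains "midi", which is why the config paths show up in the grep output.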

My next test was to uninstall timidity on my Debian box, but to leave freepats installed. After doing this, my midi music stopped working. Next I downloaded the timidity.tar.gz file that the SDL_mixer page recommends. I installed it to the default location of /usr/local/lib/timidity/ ... still no music. Then I checked out /etc/timidity/timidity.cfg ... or at least, I tried to check it out. Fact was, it was gone, since I had just uninstalled timidity. I reinstalled and looked at it, and the interesting thing I noted was that it contains a bit at the end that says:

# By default, try to use the instrument patches from freepats:

source /etc/timidity/freepats.cfg

/etc/timidity/freepats.cfg is an instrument patch config file. It is part of the freepats package, and will exist even if timidity is not installed. And what is this? Lo and behold, the tarball I had manually installed to /usr/local/lib/timidity contained a timidity.cfg file that had instrument definition commands in it in exactly the same format as the /etc/timidity/freepats.cfg file.
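To see which other files a timidity.cfg pulls in, something like the following rough sketch would do. It assumes a simplified reading of the format (a `source` keyword followed by one file name, with `#` starting a comment) and is not how SDL_mixer's embedded timidity actually parses the file. The function name is hypothetical.

```python
def sourced_configs(cfg_text):
    """List the files named by `source` directives, ignoring comments."""
    paths = []
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "source":
            paths.append(parts[1])            # assumption: one file per directive
    return paths
```

Run against the snippet quoted above, this returns just /etc/timidity/freepats.cfg.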


January 20 2006

Apache + mod_python module import extreme weirdness

If this page seems awfully quiet, it is not because I have lost interest in GNU/Linux; rather, I have accumulated enough experience that solutions to problems that would have stumped me years ago are now second nature. Plus, the quality of the software that runs on Linux is steadily improving, and vast troves of documentation are readily available all over the web. (Honestly I don't know how I survived before the invention of the ubiquitous "documentation wiki")

Anyway. This week I have been fighting a new problem which seems to me to be utterly undocumented.

I love python. Ever since I discovered it, I have been tinkering with it for various game projects using PyGame, as well as numerous work-related projects, but I have never tried web-development with python.

I recently started to teach myself mod_python. Installation was trivially easy (what Debian packaged installation isn't trivially easy?), and although configuration required a little bit of hacking the infamous and ominous apache .htaccess file, it was nothing that wasn't documented well enough to figure out.

But when I actually started writing code, things started to get strange. I have two modules. One, main.py is used for all my PythonHandler directives, and it works just peachy. The second, wca_identity.py is a separate module intended to look up user authentication information from our company's mysql database. I would make changes to it, and those changes seemed to have no effect. I added debugging statements, and they didn't work... then they worked... then they didn't work again. I had PythonDebug enabled, so whenever my code would crash, I would see a python stack trace, like the following:

Mod_python error: "PythonAuthenHandler main"

Traceback (most recent call last):

  File "/usr/lib/python2.3/site-packages/mod_python/apache.py", line 193, in Dispatch
    result = object(req)

  File "/var/www/quote/main.py", line 20, in authenhandler
    id = wca_identity.Identity(log_function = log)

  File "/var/www/quote/wca_identity.py", line 10, in __init__
    self.log(self.db)

  File "/var/www/quote/wca_identity.py", line 6, in <lambda>
    if log_function: self.log = lambda s:log_function(s)

  File "/var/www/quote/main.py", line 5, in log
    apache.log_error(s)

TypeError: argument 1 must be string or None, not Connection

... which is good, right, because I like to see stack traces. They help me find my errors... except what is going on here? In the stack traces I would see references to code I had deleted already. The above stacktrace shows a lambda function 10 minutes after I had completely removed it from my code. Sometimes simply reloading several times would produce several different error messages, all related to code that didn't even exist anymore.

Apparently, mod_python was keeping around old copies of my code in memory and re-running them without reloading them from disk. I can understand why this might be beneficial for performance if I was only ever importing standard modules, but how can I possibly develop and import my own modules if I never know whether or not I am running the current version of my code, or some randomly selected obsolete copy? The only workaround I could find was to re-start apache after every single edit, which is of course unacceptable. This apache server is doing other things, and needs to stay up.
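The general mechanism can be reproduced in plain Python, outside of Apache. The sketch below is only an illustration of import caching in a long-running process, written for a modern Python 3 rather than the Python 2.3 in the traceback, and it is not mod_python's actual importer. The module name stale_demo_mod is made up.

```python
import importlib
import os
import shutil
import sys
import tempfile

def demo_stale_import():
    """Return VERSION as seen at first import, on re-import, and after reload."""
    name = "stale_demo_mod"              # hypothetical module name
    old_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True       # keep .pyc caching out of the picture
    tmp = tempfile.mkdtemp()
    path = os.path.join(tmp, name + ".py")
    sys.path.insert(0, tmp)
    try:
        with open(path, "w") as f:
            f.write("VERSION = 'old'\n")
        importlib.invalidate_caches()
        mod = importlib.import_module(name)
        first = mod.VERSION

        # Edit the file on disk, as one would during development.
        with open(path, "w") as f:
            f.write("VERSION = 'new, edited on disk'\n")

        # A second import just hands back the cached module object.
        cached = importlib.import_module(name).VERSION

        # Only an explicit reload re-reads the source file.
        importlib.invalidate_caches()
        importlib.reload(mod)
        reloaded = mod.VERSION
        return first, cached, reloaded
    finally:
        sys.path.remove(tmp)
        sys.modules.pop(name, None)
        sys.dont_write_bytecode = old_flag
        shutil.rmtree(tmp, ignore_errors=True)
```

In a web server with several long-lived processes, each one can be holding a different cached copy, which would fit the "several different error messages" symptom.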

My only clue as to what to do about this problem came from the mod_python documentation page which speaks of Multiple Interpreters. Adding "PythonInterpreter quote_interpreter" to my .htaccess seemed to solve the problem, apparently by forcing all python code in that directory to be executed in a single interpreter named "quote_interpreter" (I chose "quote" because it is the name of the project).

January 24 2006

Dang it! That workaround solves nothing! It is still wrong! Can mod_python really be this broken? How can I be the only person who has noticed this?

...Is what I said to myself. So I started browsing the mod_python mailing list, but found nothing, so I just subscribed and asked directly. I soon received an answer in which Graham Dumpleton explained the problem to me, provided a workaround, and pointed me to his page which documents the flaw extensively: http://www.dscpl.com.au/articles/modpython-003.html

This is the workaround:

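  from mod_python import apache
  import os

  # Directory containing this handler module
  __here__ = os.path.dirname(__file__)

  def handler(req):
      req.write("Hello World!")
      # Load testmodule through mod_python's importer instead of a plain
      # import statement, looking only in this directory
      testmodule = apache.import_module("testmodule", path=[__here__])
      testmodule.foo(req)
      return apache.OK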

And I figure I can wrap that up into something a little prettier and use it. Thanks, Graham!
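One way the "prettier" wrapper might look is sketched below. This is a guess at a convenient shape, not code from Graham's article. The names make_local_importer and import_local are invented, and the importer is passed in as a parameter so the helper itself does not depend on mod_python.

```python
def make_local_importer(import_module, here):
    """Build a one-argument importer that always searches the directory `here`.

    import_module is expected to accept (name, path=[...]), which is how
    apache.import_module is called in the workaround above.
    """
    def import_local(name):
        return import_module(name, path=[here])
    return import_local
```

Inside a handler module this would be set up once, for example as import_local = make_local_importer(apache.import_module, os.path.dirname(__file__)), and then used as wca_identity = import_local("wca_identity").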


June 16 2005

Mixed stable/unstable debian installation

I can't believe I waited so long to learn how to do this. Often I want to run the stable version of Debian, but I want access to one or two packages from the unstable version. How do I do it? Previously my only choices were to (A) upgrade the whole system, (B) manually install the particular package, and manually install its dependencies, or (C) hope that a backport existed for it.

But none of those three options is what I really want to do. I want to have stable installed, but still have the option of asking for an unstable package, and having it just work. There are two things I had to do. First, I had to put both stable and unstable in my /etc/apt/sources.list file:

deb http://ftp.us.debian.org/debian/ stable main
deb-src http://ftp.us.debian.org/debian/ stable main
deb http://security.debian.org/ stable/updates main

deb http://ftp.us.debian.org/debian/ unstable main
deb-src http://ftp.us.debian.org/debian/ unstable main

This alone would cause my entire system to be upgraded to unstable, since the higher version numbers in unstable would always override those from stable. What I needed to do next was to set stable to a higher priority. I did this by creating a /etc/apt/preferences file:

Package: *
Pin: release a=stable
Pin-Priority: 500

Package: *
Pin: release a=unstable
Pin-Priority: 200

Because I have given a higher priority to stable packages, apt will use the stable ones when given the choice, but when I ask for a package that only exists in unstable, it will do the right thing.
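The selection rule can be pictured with a small Python sketch. This is a deliberately simplified model, not apt's real algorithm: it ignores the installed version, the special handling of priorities above 1000, and real Debian version comparison (versions here are plain tuples). The function name is hypothetical.

```python
def pick_candidate(available, priorities):
    """Pick the version to install.

    available:  list of (version_tuple, release_name) pairs
    priorities: mapping of release_name -> pin priority
    Highest pin priority wins; ties go to the highest version.
    """
    if not available:
        return None
    # 500 is used here as the priority of a release with no pin entry.
    return max(available, key=lambda c: (priorities.get(c[1], 500), c[0]))
```

With the priorities from the preferences file above (stable 500, unstable 200), a package present in both releases resolves to the stable version even if unstable has a newer one, while a package that exists only in unstable resolves to unstable.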

If my desire was to install an unstable package that also existed in stable, I would have to add an additional entry to my /etc/apt/preferences file, giving it a higher priority by name, but that is not what I want to do. If you want to do that, I suggest reading man apt_preferences.

Also, it is worth noting that running a mixed stable/unstable system is not always reliable. You may install packages that have mutually conflicting dependencies, so whenever a backport is available for the package you care about, you should always use that first.


Feb 3 2005

Kernel 2.6 Hardware Troubles

So I finally decided it was time to bite the bullet and solve the hardware incompatibility problems I have encountered the previous times I have tried to upgrade to kernel 2.6. I learned from http://wiki.debian.net/?Kernel2dot6 that it would be good to install discover and udev, so I did so.

The first thing I tested was my all-important sound card. Attempting to play an ogg file in xmms failed, and with a little poking around I could see that alsa was definitely working. Trouble was, my bt878 TV capture card had been detected first, and was therefore assigned as the default card 0. My Soundblaster emu10k1 was working; it had just been assigned card ID 1, so none of my sound-playing apps like xmms could find it.

[james:~]ls -l /proc/asound
total 2
lrwxrwxrwx  1 root root 5 Feb  3 20:00 Bt878 -> card0
lrwxrwxrwx  1 root root 5 Feb  3 20:00 Live -> card1
dr-xr-xr-x  8 root root 0 Feb  3 20:00 card0
dr-xr-xr-x  4 root root 0 Feb  3 20:00 card1
-r--r--r--  1 root root 0 Feb  3 20:00 cards
-r--r--r--  1 root root 0 Feb  3 20:00 devices
-r--r--r--  1 root root 0 Feb  3 20:00 hwdep
-r--r--r--  1 root root 0 Feb  3 20:00 modules
dr-xr-xr-x  2 root root 0 Feb  3 20:00 oss
-r--r--r--  1 root root 0 Feb  3 20:00 pcm
dr-xr-xr-x  2 root root 0 Feb  3 20:00 seq
-r--r--r--  1 root root 0 Feb  3 20:00 timers
-r--r--r--  1 root root 0 Feb  3 20:00 version
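The symlinks in that listing are what reveal the ordering. As a small sketch, the mapping from card name to index can be pulled out of the ls output like this (the function names are made up, and it only understands lines of the `Name -> cardN` form shown above):

```python
import re

# Matches the tail of a symlink line such as "Bt878 -> card0"
LINK_RE = re.compile(r"(\S+) -> card(\d+)$")

def card_indexes(ls_output):
    """Map each ALSA card name to its index, based on `ls -l /proc/asound`."""
    cards = {}
    for line in ls_output.splitlines():
        m = LINK_RE.search(line.strip())
        if m:
            cards[m.group(1)] = int(m.group(2))
    return cards

def default_card(ls_output):
    """Return the name of whichever card ended up as card 0, if any."""
    for name, index in card_indexes(ls_output).items():
        if index == 0:
            return name
    return None
```

For the listing above, the default card comes out as Bt878 rather than Live, which is exactly the problem.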

http://alsa.opensrc.org/index.php?page=MultipleCards had some suggestions that did not help. In the process of trying things I discovered that /etc/modutils has been obsoleted in favor of /etc/modprobe.d, but putting the correct "options" line in either of those places made no difference. Finally I ended up adding snd-emu10k1 to my /etc/modules file to make it load right away on boot. Kinda inelegant IMHO, but who cares? It works!

Next of importance is my hp niash scanner. I depend heavily on this puppy. This one was tricky. Running xsane just locked up on detection. Running cat /proc/bus/usb/devices hung too. I learned several important things. First, that SANE (http://sane-project.org/) requires one to use libusb with kernel 2.6, because old-style device-based access to USB scanners has been removed in kernel 2.6. They recommended using hotplug to detect the scanner and get the permissions set up correctly. I already had both libusb and hotplug installed, but until now, I didn't know why ;) . One reboot later, I could not reproduce the lockup when running cat /proc/bus/usb/devices and I have no idea why that was happening. It makes me sad that I don't know why, but since I can't reproduce the problem, I can't find out, so I have to be content to be happy that it isn't locking up anymore.

Running sane-find-scanner still found no scanner. I could now cat /proc/bus/usb/devices and I could see that my scanner was listed. It was there as /proc/bus/usb/001/004 ... why wouldn't sane detect it? Then I remembered the mention of hotplug setting permissions on usb devices. I ran sane-find-scanner as root, and sure enough, it found my scanner. I checked the permissions on /proc/bus/usb/001/004 and its ownership was correct. It belonged to the scanner group ... so what was wrong? DUH! I was not a member of the scanner group! With the old way of doing things, I had not needed to be a member (presumably because my scanner was insecurely available to all users, or somesuch). So I simply added myself to the scanner group, logged out, logged back in, and it worked.
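The logic behind that "DUH!" is the ordinary Unix permission check, sketched below in simplified form. The mode 0o660, the gid 105 for the scanner group and the uid 1000 are all assumed values for illustration; the actual numbers on this system are not recorded here.

```python
def can_read_write(mode, owner_uid, group_gid, uid, gids):
    """Simplified Unix check: does (uid, gids) get both read and write access?"""
    if uid == 0:
        return True                     # root bypasses the permission bits
    if uid == owner_uid:
        bits = (mode >> 6) & 0o7        # owner bits
    elif group_gid in gids:
        bits = (mode >> 3) & 0o7        # group bits
    else:
        bits = mode & 0o7               # "other" bits
    return (bits & 0o6) == 0o6

SCANNER_GID = 105                       # hypothetical gid of the scanner group

# Device node owned by root:scanner with mode rw-rw---- (assumed)
print(can_read_write(0o660, 0, SCANNER_GID, uid=1000, gids={1000}))               # not in group
print(can_read_write(0o660, 0, SCANNER_GID, uid=1000, gids={1000, SCANNER_GID}))  # in group
print(can_read_write(0o660, 0, SCANNER_GID, uid=0, gids={0}))                     # root
```

Group membership is only picked up at login, which is why logging out and back in was needed after joining the group.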

I was worried about my DVD/CD-Writer, but without cause. When I had first booted up kernel 2.6, I had tried to play Memento, and it had not worked. My DVD program, Ogle had just sat there silently without opening a movie window, but after installing udev it worked just fine. So did the burner.

I had previously been having a ton of trouble with my USB gamepads. They would work for about 30 seconds, and then stop responding unless I unplugged all my other USB devices (Scanner, Printer, Midi-Keyboard) and if I watched my ALT+1 console, I would see an error message about "usb interrupt" spontaneously pop up every so often (I should have written down that exact error! Grr!). Anyway, the upgrade to Kernel 2.6 (or maybe the upgrade to udev) fixed this problem, and my gamepads work now, which is cool, because I was previously thinking I had overloaded my USB hub and needed to buy a powered one.

And finally, 3D acceleration. I found myself retracing the steps I had been following last May when I set up 3D acceleration in the first place. I followed the debugging steps at the wonderful DRI Wiki until I discovered in my /var/log/XFree86.0.log an error message saying:

[drm:radeon_cp_init] *ERROR* radeon_cp_init called without lock held

And once I knew that, it was easy to find (on the same wiki page) that kernel 2.6 requires agp-chipset-specific modules in addition to agpgart. So I edited my /etc/modules and added intel-agp right after agpgart (but before radeon) and then rebooted, and it worked. Hooray for the DRI Wiki!
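Since the order of lines in /etc/modules is what matters here, a quick sanity check could look like the following sketch. It assumes one module name per line with `#` comments, and the function name is invented for this example.

```python
def module_order_ok(modules_text, required):
    """True if every module in `required` is listed, in that relative order."""
    loaded = []
    for raw in modules_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # ignore comments and blank lines
        if line:
            loaded.append(line.split()[0])    # first word is the module name
    positions = [loaded.index(m) for m in required if m in loaded]
    return len(positions) == len(required) and positions == sorted(positions)

# Possible usage:
#   with open("/etc/modules") as f:
#       print(module_order_ok(f.read(), ["agpgart", "intel-agp", "radeon"]))
```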

So all of that took me about four hours, which I think is not bad, especially considering that most of that time doesn't count since I was playing Katamari Damacy in-between steps and while waiting for downloads ;)


May 30 2004

3D Acceleration

I have survived for too long without 3D hardware acceleration. I usually just play games that don't require it, but how else to get my playstation emulator working optimally? I can play using Pete's Soft-X GPU, but display quality and speed are not as good as a real playstation. I know that ePSXe is capable of being better than a real playstation.

My current video card, an S3 Savage 4, is not yet supported by the X11 DRI Project. There is good work going on to add support for the Savage in the future, but I am not that patient.

[james:~]lspci
pcilib: Cannot open /sys/bus/pci/devices
0000:00:00.0 Host bridge: Intel Corp. 82815 815 Chipset Host Bridge and Memory Controller Hub (rev 02)
0000:00:01.0 PCI bridge: Intel Corp. 82815 815 Chipset AGP Bridge (rev 02)
0000:00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to PCI Bridge (rev 01)
0000:00:1f.0 ISA bridge: Intel Corp. 82801BA ISA Bridge (LPC) (rev 01)
0000:00:1f.1 IDE interface: Intel Corp. 82801BA IDE U100 (rev 01)
0000:00:1f.2 USB Controller: Intel Corp. 82801BA/BAM USB (Hub #1) (rev 01)
0000:00:1f.3 SMBus: Intel Corp. 82801BA/BAM SMBus (rev 01)
0000:00:1f.4 USB Controller: Intel Corp. 82801BA/BAM USB (Hub #2) (rev 01)
0000:01:0a.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 07)
0000:01:0a.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 07)
0000:01:0b.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 02)
0000:01:0d.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 02)
0000:01:0d.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 02)
0000:02:00.0 VGA compatible controller: S3 Inc. Savage 4 (rev 03)
[james:~]glxinfo | grep 'direct'
Xlib:  extension "XFree86-DRI" missing on display ":0.0".
direct rendering: No
OpenGL renderer string: Mesa GLX Indirect

So I checked the DRI wiki and concluded that my best bet was an ATI radeon 9200 card, that being the newest radeon chipset supported by the open-source ATI DRI drivers. Luckily my local Staples carried this exact just-slightly-obsolete model.

My decision to choose radeon over the other available supported 3D-accelerated cards was based on reading several success stories on the swsusp2 mailing lists where people described getting software suspend to work with radeon cards with full acceleration enabled. Suspending or "Hibernating" is a very important feature to me, and I do not wish to break it.

So I shut down, swapped cards, and booted back up. The very first thing I did was edit my /etc/X11/XF86Config-4 file to change my video card device from savage->radeon

Section "Device"
        Identifier      "Generic Video Card"
-        Driver          "savage"
+        Driver          "radeon"
EndSection

And then I started X. X configuration is still a weak point in the Linux desktop IMHO. And now I test, and find to my delight that acceleration appears to be enabled:

[james:~]lspci
pcilib: Cannot open /sys/bus/pci/devices
0000:00:00.0 Host bridge: Intel Corp. 82815 815 Chipset Host Bridge and Memory Controller Hub (rev 02)
0000:00:01.0 PCI bridge: Intel Corp. 82815 815 Chipset AGP Bridge (rev 02)
0000:00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to PCI Bridge (rev 01)
0000:00:1f.0 ISA bridge: Intel Corp. 82801BA ISA Bridge (LPC) (rev 01)
0000:00:1f.1 IDE interface: Intel Corp. 82801BA IDE U100 (rev 01)
0000:00:1f.2 USB Controller: Intel Corp. 82801BA/BAM USB (Hub #1) (rev 01)
0000:00:1f.3 SMBus: Intel Corp. 82801BA/BAM SMBus (rev 01)
0000:00:1f.4 USB Controller: Intel Corp. 82801BA/BAM USB (Hub #2) (rev 01)
0000:01:0a.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 07)
0000:01:0a.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 07)
0000:01:0b.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 02)
0000:01:0d.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 02)
0000:01:0d.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 02)
0000:02:00.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200] (rev 01)
0000:02:00.1 Display controller: ATI Technologies Inc RV280 [Radeon 9200] (Secondary) (rev 01)
[james:~]glxinfo | grep 'direct'
direct rendering: Yes

I am curious about the fact that my radeon appears to be dual, in spite of the fact that there is only one CRT port. Perhaps the S-Video output is a separate display?

Visually, glxgears seems to be running much slower than it did on my old savage card, but it reports an fps more than 10 times what I had before.

[james:~]glxgears
6351 frames in 5.0 seconds = 1270.200 FPS
7488 frames in 5.0 seconds = 1497.600 FPS
7475 frames in 5.0 seconds = 1495.000 FPS
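The FPS figure glxgears reports is just frames divided by seconds, which is easy to double-check. A minimal sketch for parsing output in the format shown above (function names are made up):

```python
import re

GEARS_RE = re.compile(r"(\d+) frames in ([\d.]+) seconds = ([\d.]+) FPS")

def parse_glxgears(output):
    """Return (computed_fps, reported_fps) pairs, one per glxgears line."""
    samples = []
    for m in GEARS_RE.finditer(output):
        frames = int(m.group(1))
        seconds = float(m.group(2))
        reported = float(m.group(3))
        samples.append((frames / seconds, reported))
    return samples

def mean_fps(output):
    """Average the computed FPS over all samples (0.0 if there are none)."""
    samples = parse_glxgears(output)
    if not samples:
        return 0.0
    return sum(computed for computed, _ in samples) / len(samples)
```

Averaging over several lines is handy because the first sample is usually lower than the rest, as it is in the run above.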

But I know it is working, because I can now, for the very first time in my GNU/Linux life, play TuxRacer at full speed! Whee! (and that suddenly makes me want to go snowboarding. Rotten summertime!)

But sadly, software suspend no longer works. I can suspend, but when I resume, I see a garbled screen. I will track this problem further on my Software Suspend page.

2004-06-01 Note: In the process of fighting with my software-suspend problem, I have switched to the latest DRI Snapshots for Debian and gotten a noticeable improvement in FPS, but the suspend problem still eludes me...

[james:~]glxgears
9609 frames in 5.0 seconds = 1921.800 FPS
10349 frames in 5.0 seconds = 2069.800 FPS
10354 frames in 5.0 seconds = 2070.800 FPS
10330 frames in 5.0 seconds = 2066.000 FPS

April 7 2004

/usr/lib/dpkg/methods/apt/install: line 57: 15737 Segmentation fault

I was running an update of my debian/sarge box today, and encountered the following error message at the [I]nstall step in dselect.

Reading Package Lists... Done
/usr/lib/dpkg/methods/apt/install: line 57: 15737 Segmentation fault      $APTGET $OPTS "$APT_OPT0" "$APT_OPT1" dselect-upgrade

I searched for the error message on google groups, and found a few complaints about it, but most had no reply at all, and the few that did have replies were totally unhelpful. I did find Debian bug 196070 which was marked fixed in July of last year with no useful information posted in the bug report, other than the clarification that the crash was being caused by apt, not by dselect. So I cut dselect out of the loop, and tried apt-get from the command-line.

[root:~]apt-get upgrade
Reading Package Lists... Done
Segmentation faulty Tree... 50%

The "Segmentation Fault" error was being drawn over the top of the last line of text. "apt-get clean" did nothing, and "apt-get check" segfaulted the same way as "apt-get upgrade". "apt-get update" seemed to download the package lists just fine, but then "apt-get upgrade" would still crash.
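The funny message is what you get when a shorter string is printed over the start of a line already on the terminal, presumably apt's "Building Dependency Tree... 50%" progress line. A tiny sketch of the effect, with a made-up function name:

```python
def overwrite(existing, new_text):
    """Simulate printing new_text at column 0 over a line already on screen."""
    # Characters beyond the length of new_text are left untouched.
    return new_text + existing[len(new_text):]

print(overwrite("Building Dependency Tree... 50%", "Segmentation fault"))
# prints "Segmentation faulty Tree... 50%"
```

The trailing "y" is the last letter of "Dependency" peeking out from under "fault".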

But the error message "Segmentation faulty" was cute, and more clear and concise than the error I was getting by way of dselect, so I searched google groups for "Segmentation faulty" and that got some more useful posts. One suggested deleting /var/cache/apt/*.bin

[root:~]ls -l /var/cache/apt/*.bin
-rw-r--r--    1 root     root      4973681 Apr  7 15:55 /var/cache/apt/pkgcache.bin
-rw-r--r--    1 root     root      4882303 Apr  7 15:08 /var/cache/apt/srcpkgcache.bin
[root:~]mkdir bak
[root:~]mkdir bak/var              
[root:~]mkdir bak/var/cache        
[root:~]mkdir bak/var/cache/apt    
[root:~]mkdir bak/var/cache/apt/bin
[root:~]mv /var/cache/apt/*.bin ~/bak/var/cache/apt/bin

And after that, everything was happy again. I was able to run "apt-get update" and then "apt-get upgrade" and everything worked.

Afterwards, knowing the problem, a search of the debian bug archives turned up Bug 81829 and many duplicates. I read all the duplicates, and they were all the same story. Somebody reports segfaults in apt-get. A developer is unable to reproduce the error, and suggests to the reporter, "hey, I'll bet your apt cache got corrupted somehow" and advises the reporter to run "rm /var/cache/apt/*.bin" ... Only one reporter bothered to save the corrupt bin files, and then he deleted them before anybody asked for them. But I still have mine, and can consistently reproduce the segmentation fault on demand:

[root:~]cp -p ~/bak/var/cache/apt/bin/*.bin /var/cache/apt/
[root:~]ls -l /var/cache/apt/
total 9660
drwxr-xr-x    3 root     root         8192 Apr  7 16:12 archives
-rw-r--r--    1 root     root      4973681 Apr  7 15:55 pkgcache.bin
-rw-r--r--    1 root     root      4882303 Apr  7 15:08 srcpkgcache.bin
[root:~]apt-get upgrade
Reading Package Lists... Done
Segmentation faulty Tree... 65%

So. Anybody want my corrupt /var/cache/apt/*.bin files? I am eager to do whatever I can to assist in resolving this lil' bug once and for all.

2004-05-30 Well, it never happened again, and although the first time I tried copying back the bin files and reproducing the crash it crashed, when I tried the same again a few days later, I could no longer reproduce the crash. I guess it was a fluke. I hate flukes.

2004-07-07 It happened again on a different computer. A server at work.

/etc/cron.daily/debian-updates:
Hit http://non-us.debian.org sarge/non-US/main Packages
Hit http://non-us.debian.org sarge/non-US/main Release
Get:1 http://ftp.us.debian.org sarge/main Packages [2982kB]
Hit http://security.debian.org sarge/updates/main Packages
Hit http://security.debian.org sarge/updates/main Release
Get:2 http://ftp.us.debian.org sarge/main Release [81B]

gzip: stdin: invalid compressed data--crc error

gzip: stdin: invalid compressed data--length error
Err http://ftp.us.debian.org sarge/main Packages
  Sub-process gzip returned an error code (1)
Fetched 2982kB in 44s (67.1kB/s)
Reading Package Lists...Failed to fetch http://ftp.us.debian.org/debian/dists/sarge/main/binary-i386/Packages.gz  Sub-process gzip returned an error code (1)

W: Couldn't stat source package list http://ftp.us.debian.org sarge/main Packages (/var/lib/apt/lists/ftp.us.debian.org_debian_dists_sarge_main_binary-i386_Packages) - stat (2 No such file or directory)
W: Couldn't stat source package list http://ftp.us.debian.org sarge/main Packages (/var/lib/apt/lists/ftp.us.debian.org_debian_dists_sarge_main_binary-i386_Packages) - stat (2 No such file or directory)
W: You may want to run apt-get update to correct these problems
E: Some index files failed to download, they have been ignored, or old ones used instead.
Reading Package Lists...
Building Dependency Tree...
W: Couldn't stat source package list http://ftp.us.debian.org sarge/main Packages (/var/lib/apt/lists/ftp.us.debian.org_debian_dists_sarge_main_binary-i386_Packages) - stat (2 No such file or directory)
W: You may want to run apt-get update to correct these problems
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

After that, I started getting the same error:

/usr/lib/dpkg/methods/apt/install: line 57: 13081 Segmentation fault      $APTGET $OPTS "$APT_OPT0" "$APT_OPT1" dselect-upgrade

So it appears that an interrupted download was the original cause of corruption. I still firmly believe that apt should be smart enough not to segfault in such a situation, but it is presently beyond my hacking abilities to write a patch, so I will just let it go.
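The kind of defensive check being wished for can be sketched in a few lines of Python: verify that a downloaded .gz file decompresses cleanly before trusting it. This is only an illustration of the idea, not how apt works internally, and the function name is made up.

```python
import gzip
import zlib

def is_valid_gzip(data):
    """True if data decompresses cleanly, including its CRC and length check."""
    try:
        gzip.decompress(data)
    except (OSError, EOFError, zlib.error):
        # Truncated streams, bad headers and CRC mismatches all land here.
        return False
    return True

# Possible usage on a downloaded index (path is hypothetical):
#   with open("Packages.gz", "rb") as f:
#       if not is_valid_gzip(f.read()):
#           print("download is corrupt, keep the old package list")
```

Both a truncated file and one with a damaged checksum are rejected, which matches the "crc error" and "length error" complaints from gzip in the log above.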


April 2 2004

PS/2 mouse with kernel 2.6.x

I finally set up a machine with Linux kernel 2.6.4, but when I tried to start X-Windows, X choked, saying it could not find a mouse ("No core pointer"). By default, X tries both /dev/input/mice (USB mice) and /dev/psaux (PS/2 mice) and I knew my mouse was PS/2. I typed cat /dev/psaux to see if I could get gibberish when moving the mouse, but it told me that device did not exist. After much searching of message board and newsgroup posts, I found that PS/2 mouse support is not enabled by default in kernel 2.6.x (at least not in Debian it isn't). So I solved that by loading the mousedev module with modconf (or I could have also used modprobe -v mousedev).


October 3 2003

Numlock and key-modifiers in fvwm2

Wow. It has been ages since I ran into a really tough linux configuration puzzle. Here is one worth mentioning here. My X11 Window manager of choice is good old fvwm2 . Why? I dunno. I guess I just like the minimalist do-it-yourself approach.

Now, I am not a 10-key typer. In fact, I usually never touch the keypad unless I am playing a first-person shooter >:) So I never bother with the NumLock key, and therefore never noticed the weirdness it can create in Fvwm2. However, a few months back, I snuck a Debian Linux computer into the Sales department at work (not literally "snuck". I am half the I.T. department, I can do that sort of thing if I want). It is primarily an e-mail workstation, with Mozilla and ssh access to our inventory program. It is a comparatively low-end machine, so for performance reasons I did not want to install Gnome or KDE on it. At first I was running qvwm as the window manager, which certainly looks nice enough for users accustomed to Windows 9X, but it was having some problems with window focus, so I switched over to my old standby fvwm2. A little hacking at the .fvwm2rc file gave me a simple interface that is a no-brainer for the Windows-trained Sales staff to use-- but then I encountered a weird problem. On several occasions I would be in the Sales department, and have to do some quick work, and naturally I gravitate towards the only Linux box in the room. But what is this? Several of my oft-used keybindings are broken! Most notably, ALT+TAB which is bound to the Window List. It took me a while to figure out that this was being caused by the NumLock key. Any time a 10-key typist used the workstation, they would turn on the NumLock key, and leave it that way. But why was that breaking fvwm's keybindings?

I turned to the universal repository of all knowledge ever and found that I was not the only person with the problem. However, most of the responses to the problem were not satisfactory. Some people were of the "well, then turn off your NumLock key, dummy!" mindset. Many others were of the "It's a feature, not a bug" mindset -- I found that this latter group was technically correct, if not practically correct. Other X11 window managers have no problem with NumLock, why should fvwm be different? It turns out that this weirdness actually belongs to X11's keyboard mapping. Typing xmodmap -pm gives the following output on this system:

sales@bardling:~$ xmodmap -pm
xmodmap:  up to 2 keys per modifier, (keycodes in parentheses):

shift       Shift_L (0x32),  Shift_R (0x3e)
lock        Caps_Lock (0x42)
control     Control_L (0x25),  Control_R (0x6d)
mod1        Alt_L (0x40),  Alt_R (0x71)
mod2        Num_Lock (0x4d)
mod3      
mod4        Meta_L (0x73)
mod5        Scroll_Lock (0x4e)

From an X point of view, NumLock is just another modifier key. Fvwm2 simply takes this more literally than most other window managers do. Fvwm2's powerful configuration language would let me make special keybindings for NumLock state... but I don't want to do that. I don't want to have to set up separate numlock-on versions of all my keybindings. That would be a royal pain. I have put way too much time into manually tweaking my .fvwm2rc file already!

So what to do? Some people suggested fixing the problem by running xmodmap -e "clear mod2" which worked only in the sense that it completely disabled NumLock -- that would be fine for ME, but it would make some members of the sales staff rather unhappy with me. (I had to run xmodmap -e "add mod2 = Num_Lock" to re-enable the modifier)

Finally I found this post from the wise Reid Rivenburgh, who suggested adding the cryptic IgnoreModifiers L25 to my .fvwm2rc file. I did so, and boom! It worked! NumLock functioned as advertised, and fvwm2's keybindings cleanly ignored it. I dug through fvwm2's FAQ and found an explanation of this command. That kind of configuration power is why I LOVE fvwm2... and also why I would never recommend fvwm2 to a Linux newbie :)
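What IgnoreModifiers L25 appears to do can be modelled with X11's modifier bitmask. In the sketch below the mask values are the standard X11 ones (ShiftMask is bit 0 up through Mod5Mask at bit 7), and L, 2 and 5 are read as lock, mod2 and mod5, which on this keyboard map are Caps Lock, Num Lock and Scroll Lock. The matching function is a simplified model, not fvwm2's actual code.

```python
# Standard X11 modifier masks, one bit each
SHIFT, LOCK, CONTROL, MOD1, MOD2, MOD3, MOD4, MOD5 = (1 << i for i in range(8))

def binding_matches(binding_mods, key_state, ignored=0):
    """A binding fires only if the non-ignored modifier bits match exactly."""
    return (key_state & ~ignored) == binding_mods

alt_tab = MOD1                    # Alt is mod1 in the xmodmap output above
with_numlock = MOD1 | MOD2        # Alt held while NumLock (mod2) is on

print(binding_matches(alt_tab, MOD1))                  # True
print(binding_matches(alt_tab, with_numlock))          # False: the mod2 bit spoils the match
print(binding_matches(alt_tab, with_numlock,
                      ignored=LOCK | MOD2 | MOD5))     # True: L, 2 and 5 are masked out
```

The keypad still sees NumLock as usual, since only the window manager's binding comparison ignores the bit.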


September 12 2003

No news is good news

I haven't had much to say on this page lately-- partly because I haven't run into any problems that weren't already really well documented elsewhere, and partly because I have gotten enough experience with Linux that the sort of things that would have puzzled me for days in the past are now no-brainers, so I don't bother to write them down.

August 01 2002

Debian 3.0 PCMCIA

I just started updating my laptop to Debian 3.0. Back when I installed 2.2, the PCMCIA setup and the network card support worked so easily that it confused me (I kept looking around for more options, not realizing that it was already working perfectly). But this time it isn't going so smoothly. I can enable PCMCIA support, and it beeps at me, and if I switch to console #3 and look at the messages I can see it correctly recognizing my 3Com 572/574 Fast Ethernet card and trying to load the module for it, 3c574_cs, but after it loads the module it gives me this error:

daemon.info cardmgr[1561]: + cat: /var/lib/misc/pcmcia-scheme: No such file or directory

I have checked, and indeed, this file does not exist, neither there nor in any other location. What is its purpose?

I found a newsgroup post that suggested that it was not important, and that the error could be suppressed by making a blank file. I touched the filename, and that suppressed the error, but the network config still failed. The error message may be a red herring.

I found another interesting newsgroup post which I will quote in part here (the full post is at google groups)

From: Tim Freeman (tim@fungible.com)
...
Flipping to the extra console, /var/log/messages ends with the lines

   Starting 'cardmgr '
   starting, version is 3.1.31
   watching 1 sockets
   socket 0: Linksys Ethernet
   executing: 'modprobe 8390'
   executing: 'modprobe pcnet_cs'
   executing: './network start eth0'
   + cat: /var/lib/misc/pcmcia-scheme: No such file or directory

which all look pretty normal when I compare with what happens with the
potato install.  ifconfig reports that eth0 is not up.  At this point
I had to visit #debian-boot and ask for help, which was very useful.
If I say

   ifconfig eth0 64.161.114.5 up

the network does come up, but the routing table is wrong.  It says I
have all of 64.0.0.0, which I don't, and there's no default route.
Giving these commands cleans it up (the commands are from memory, so
they may be slightly off).

   route del -net 64.0.0.0 netmask 255.0.0.0 dev eth0
   route add -net 64.161.114.0 netmask 255.255.255.248 dev eth0
   route add -net 0.0.0.0 netmask 0.0.0.0 gateway 64.161.114.1 dev eth0

After this I could continue with the network install which appeared to
complete successfully.

So I gave it a try. I got up to the stage in the install where my network was configured but not activated. I switched to the console and ran the ifconfig command. I was then able to ping another IP on my network, but not able to ping out to the internet. I looked at my routing table. It was a little less screwed up than Tim's. All I had to do was the last route add command to set my correct gateway, and after that everything worked. Thanks, Tim!
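
For the record, here is roughly what that boiled down to in my case, written as a small sketch that only prints the commands instead of running them. The function name and the addresses are placeholders (192.0.2.x is a documentation range, not my network), and I am using the shorter "default gw" spelling of Tim's last route command, which should be equivalent with the net-tools route.

```shell
# Print (do not run) the commands for bringing an interface up by hand
# and adding a default route. Hypothetical helper; all values are placeholders.
manual_net_commands() {
    iface=$1
    addr=$2
    mask=$3
    gateway=$4
    echo "ifconfig $iface $addr netmask $mask up"
    echo "route add default gw $gateway dev $iface"
}

# Example with made-up addresses:
manual_net_commands eth0 192.0.2.10 255.255.255.0 192.0.2.1
```

The printed lines can be typed at the installer's second console once the values are replaced with real ones.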


July 20 2002

Trouble Compiling Zsnes

I went to compile zsnes, and I was getting an error message like so:

/usr/bin/nasm  -w-orphan-labels -D__LINUX__ -f elf -o init.o init.asm
init.asm:4617: error: symbol `.nextsearch' undefined
(null):1074980520: error: phase error detected at end of assembly.
make: *** [init.o] Error 1

I looked at the init.asm source file at line 4617 and I could see nothing wrong with it. The label it referred to, just a few lines earlier, was definitely there. Then I noticed a comment right before it:

    mov edi,fname+1
    mov byte[.cchar],'\'
    ; get position of . or \
.nextsearch
    cmp byte[edi],0
    je .nomore
    cmp byte[edi],'.'

That trailing backslash on the comment made me suspicious, so I removed it. After that, the file compiled with no troubles. Apparently the version of NASM I was using (0.98.28cvs) had a bug in its comment parsing.

2004-10-24 Footnote:
I got some mail from a "Clint" thanking me for this info, and telling me that nasm 0.98.38 also suffers from the same problem when a | pipe character is present in a comment.

  From: Mr H3x 
  To: Bob-LinuxGremlins@HamsterRepublic.com
  Subject: Good info on the site

  I was having some similar problems with nasm complaining of the "phase
  detected at end of assembly". It had me buggered for awhile until I
  found the bit about forward slashes that you had used. Anyhow, the
  information helped me so I figured I would give some back. My problem
  stemmed from having "|"s in my comments in nasm 0.98.38, I removed
  those and everything worked well.

  Anyhow, use it or leave it, thanks for the information on the site.

  Regards,
  Clint

There! I hope that turns out to be helpful for future nasm-trouble-searchers :)
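
If you want to hunt for both kinds of suspicious comments in one pass, something like the following sketch might save some scrolling. The function name is made up, and the pattern is deliberately crude: it treats everything after the first ; on a line as a comment, so a ; inside a quoted string would give a false hit.

```shell
# List lines whose comment ends in a backslash or contains a pipe character.
# Hypothetical helper; pass one or more .asm files as arguments.
find_risky_comments() {
    grep -nE ';.*(\\[[:space:]]*$|[|])' "$@"
}
```

Running find_risky_comments init.asm prints each matching line with its line number in front.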


May 06 2002

More than 640k in Dosemu

I have been working on a big ongoing dosemu project at work, trying to get an old DOS app to run in dosemu on a Linux box so it can be accessed remotely over ssh. At one point, the app calls another app to do a special printing function, and the two of them together required more than 640K of memory. On the old DOS workstations where this app used to run, this was accomplished with a little utility called "stretch" from the "Headroom" program. Stretch would unmap the reserved VGA area of memory. As the DOS app only required text mode, this was no problem. I spent a great deal of time fiddling around with the settings to dosemu, trying to get the same effect. I knew I needed to prevent dosemu from marking the memory range A0000-AFFFF as reserved... but how? Nothing seemed to help. I was running this in console-text mode over a remote ssh login, so why was dosemu insisting on allocating video memory?

The answer, as it often does, turned out to be absurdly simple. I finally got to the point where I had downloaded the source code for dosemu, and I was reading through it, trying to make sense of things, and I realized: Hey! There is code in here that automatically frees the video memory when you are running in console-text mode! Then why don't I have that extra memory already? I looked in my dosemu.conf file again, and saw this line:

$_dosmem = (640)        # in Kbyte, < 640

I had not bothered to fiddle with this setting before, because I had read the comment and figured, oh, this won't help me, all it is good for is reducing the available memory (and I wondered why anybody would need to do that, but hey, somebody might want to). So I decided to ignore the comment and change it anyway.

$_dosmem = (680)        # in Kbyte, < 640

Not being greedy, I increased it only as much as I knew I needed, and then I ran dosemu again, not really expecting much... but when I typed MEM at the DOS prompt:

Memory Type        Total  =   Used  +   Free
----------------  -------   -------   -------
Conventional         680K       18K      662K
Upper                  0K        0K        0K
Reserved             344K      344K        0K
Extended (XMS)     1,024K        0K    1,024K
----------------  -------   -------   -------
Total memory       2,048K      362K    1,686K

Total under 1 MB     680K       18K      662K

Largest executable program size       661K (677,312 bytes)
Largest free upper memory block         0K       (0 bytes)
MS-DOS is resident in the high memory area.

Yowza! Largest executable program size 661K! I was in business. It had been that simple all along. The good programmers of dosemu had been wise enough to simply add automatic support for the sort of memory management I wanted, and all I had to do was ask, "Please sir, can I have some more?"

But it was that comment that threw me. So the moral of the story kids, is "Read the comments... but never trust them!"
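
As a quick sanity check on the numbers in this entry, shell arithmetic can turn the addresses into kilobytes. This is only arithmetic on the A0000-AFFFF range mentioned above; it says nothing about what limit dosemu itself enforces, and addr_to_kb is a name I made up.

```shell
# Convert a real-mode address to kilobytes.
# Hypothetical helper; accepts anything shell arithmetic understands.
addr_to_kb() {
    echo $(( $1 / 1024 ))
}

addr_to_kb 0xA0000    # start of the A0000-AFFFF range -> 640
addr_to_kb 0xB0000    # first byte past the end of it  -> 704
```

So the 680 I asked for sits inside the 64K block that starts where conventional memory normally ends.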


Feb 13 2002

Mystery error from Send_Job

My printer at work was working perfectly. It is an HP Laserjet 4 connected to the network with a Jetdirect card. My workstation was printing to it using LPRng and apsfilter. As I said, it was working perfectly. I had just printed something only an hour before. I decided to manually add a second printcap entry for an identical printer in the engineering department. I copied the original printer's printcap entry and changed only the printer queue name and the IP hostname. I carefully constructed a new spool directory in /var/spool/lpd and made sure its permissions were right. I restarted lpd and tried to print to the new printer. Nothing happened. lpq gave me the following output:

Printer: engineering@shoat 'ljet4;r=600x600;q=medium;c=full;p=letter;m=auto' (dest engineering@engineering-printer)
 Queue: no printable jobs in queue
 Server: no server active
 Rank   Owner/ID                  Class Job Files                 Size Time
error  james                        A   259 ERROR: Mystery error from Send_job
JetDirect lpd: no jobs queued on this port

Well, I have to give them points for using such a nice descriptive error string like "Mystery error". I double checked everything. It *ought* to be working. My original printer still worked, but this one threw hissy-fits. So I gave up and changed my printcap file back how it was before and erased the new failed spool directory, and restarted lpd again. Now my regular printer will not print. It does nothing, and lpq shows me the same "Mystery Error from Send_job" message.

I fixed the error, but failed to write down or remember exactly what I did. Many months later, the same error occurred again when I was setting up printing on a different computer. It seems that this is the error message you get when apsfilter fails. I was able to read the error messages in /var/spool/lpd/printername/status.pr to figure out what I was missing. In this second case, I had failed to install a2ps, but I think something else was the problem before. These errors are not hard to solve once you find the log. I was very frustrated that the logs were not being kept in /var/log where I expected them.
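
So that I don't have to remember the path next time, here is a small sketch that shows the end of that log for a given queue. The function name is made up, and the default spool root is simply where the file lived on my machines; yours may differ.

```shell
# Show the last lines of the filter status log for one print queue.
# Hypothetical helper: $1 = queue name, $2 = spool root (optional).
show_filter_status() {
    queue=$1
    spool=${2:-/var/spool/lpd}
    log="$spool/$queue/status.pr"
    if [ -r "$log" ]; then
        tail -n 20 "$log"
    else
        echo "no readable log at $log" >&2
        return 1
    fi
}
```

For the queue in the lpq output above that would be show_filter_status engineering.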


Jan 2002

LPRng keeps giving me error messages about 'missingprinter'

Printer: missingprinter@localhost - ERROR: spool queue for  'missingprinter'
does not exist on server localhost
   non-existent printer or you need to run 'checkpc -f'

I installed LPRng, Ghostscript, and apsfilter on my Linux box, and I ran the apsfilter setup program and configured my /etc/printcap file to print to my HP Laserjet 4 using Jetdirect. But then anytime I would try to use lpr or lpq, I would get this error about 'missingprinter'. I read the LPRng docs, and they said that was the error I should expect if my printcap file was empty... I checked, and obviously it wasn't. I finally realized that LPRng had been installed to look for /usr/local/etc/printcap rather than /etc/printcap. After symlinking the printcap file to the new location, the error message was gone.
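
The symlink step, sketched as a tiny function so the two paths are explicit. The function name is mine; the paths in the comment are the ones from my system, and a different build of LPRng could be looking somewhere else entirely.

```shell
# Make an existing printcap visible at the path LPRng was built to read.
# Hypothetical helper: $1 = existing printcap, $2 = path LPRng expects.
link_printcap() {
    real=$1
    expected=$2
    # Create the parent directory if needed, then add the symlink
    mkdir -p "$(dirname "$expected")" &&
    ln -s "$real" "$expected"
}

# On my system, as root, this would have been:
#   link_printcap /etc/printcap /usr/local/etc/printcap
```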


Jan 2002

When printing; Error: /invalidfont in findfont

I tested out my newly set-up printing by opening up Abiword and typing "Lorem ipsum dolor" in a weird font with large print size, and then printing it. It worked. Then I loaded a short story I had typed weeks ago, switched it to Arial font (cause Times is ugly) and printed. It also worked. Wonderful! Shortly afterward, I attempted to print an e-mail from Mozilla MailNews. I got one sheet out of the printer, and all it said was this:

Error: /invalidfont in findfont
                               Operand stack:
                                                 F0    Times-Roman  Font    Times-

I was baffled by this. I tried printing a simple text file from the command line, and it gave me the same error, except the value of the operand stack was "iso1dict fCourier --nostring"

I searched around and found a few other people complaining about similar errors, but no replies any more helpful than "I think you are missing some fonts, dude" and "That happened to me once, so I re-installed my font package in [insert name of distribution-specific package manager]". Finally I went back to http://www.ghostscript.org and read the instructions again, and realized that because I was installing from sources, I needed to download and install the separate ghostscript-fonts package into /usr/local/share/ghostscript . You would think that by now I would have learned to read directions, but no. :)
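
A quick way to check for this before wasting paper is to see whether the font directory has anything in it. The sketch below only tests that a directory exists and is not empty; the function name is made up, and the fonts subdirectory in the usage line is my assumption about where the fonts package unpacks, so adjust it to whatever your install actually created.

```shell
# Report whether a Ghostscript font directory exists and contains files.
# Hypothetical helper: $1 = directory to check.
check_gs_fonts() {
    dir=$1
    if [ -d "$dir" ] && [ -n "$(ls -A "$dir")" ]; then
        echo "fonts present in $dir"
    else
        echo "no fonts found in $dir"
        return 1
    fi
}
```

For a source install like mine it might be called as check_gs_fonts /usr/local/share/ghostscript/fonts.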


Dec 2001

That bloody common_x86.o file

Each time I install X11 from the sources I seem to run into this same problem, and I always forget how I fixed it last time. So anyway. make World goes fine. Then I start my make install. Partway through it dies with an error about common_x86.o or some other similar filename being missing from the mesa GL directory of the source tree. After much pulling my hair out, I finally figured out why. I was doing this on a Debian potato distribution. That ships with kernel 2.2.x. I had compiled and installed kernel 2.4.14 from the sources. What I had forgotten to do was to update my /usr/include directories, so the X build system saw that I was using kernel 2.4 but then it couldn't find the headers it needed to properly compile this common_x86 thingy. So I renamed /usr/include/linux and /usr/include/asm and then symbolically linked those directories to the similarly named ones in the kernel source tree under ./include/linux and ./include/asm . I guess that if your distribution comes with kernel 2.4, or if you install the kernel from an rpm or something like that, then this will be done for you, but when you do it from the sources you have to remember this step for yourself.
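
Since I always forget, here is the rename-and-symlink step as a sketch. The function name and the .orig suffix are my own choices for illustration, and the kernel source location in the comment is an assumption; use wherever you actually unpacked and configured the kernel.

```shell
# Point the linux and asm header directories at a kernel source tree.
# Hypothetical helper: $1 = include directory, $2 = kernel source tree.
link_kernel_headers() {
    inc=$1
    src=$2
    for d in linux asm; do
        # Keep the old headers around under a .orig name
        if [ -e "$inc/$d" ] && [ ! -L "$inc/$d" ]; then
            mv "$inc/$d" "$inc/$d.orig" || return 1
        fi
        ln -sfn "$src/include/$d" "$inc/$d" || return 1
    done
}

# As root, assuming the kernel tree is in /usr/src/linux:
#   link_kernel_headers /usr/include /usr/src/linux
```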


Dec 2001

X spews errors on first start after installing from the sources

xauth: error in loading shared libraries: libXmuu.so.1: cannot open shared
object file: No such file or directory
xauth: error in loading shared libraries: libXmuu.so.1: cannot open shared
object file: No such file or directory
xauth: error in loading shared libraries: libXmuu.so.1: cannot open shared
object file: No such file or directory
xauth: error in loading shared libraries: libXmuu.so.1: cannot open shared
object file: No such file or directory
xinit: error in loading shared libraries: libXmuu.so.1: cannot open shared
object file: No such file or directory
xauth: error in loading shared libraries: libXmuu.so.1: cannot open shared
object file: No such file or directory

This is an easy one actually, I just always forget. To fix it, I need to add /usr/X11R6/lib to my /etc/ld.so.conf file, and then run ldconfig as root. That will make libXmuu (and many other important X shared libraries) available.
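
Here is that fix as a sketch that is safe to run more than once, since it only appends the directory if the exact line is not already there. The function name is made up; the real file and directory are in the comment at the bottom.

```shell
# Add a library directory to an ld.so.conf-style file if not listed yet.
# Hypothetical helper: $1 = directory, $2 = config file.
add_lib_dir() {
    dir=$1
    conf=$2
    # -x matches whole lines only, -F treats the path as a fixed string
    grep -qxF "$dir" "$conf" 2>/dev/null || echo "$dir" >> "$conf"
}

# As root on my system this would have been:
#   add_lib_dir /usr/X11R6/lib /etc/ld.so.conf && ldconfig
```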


Late 2001

Don't try to upgrade your glibc until you know what you are doing!

On more than one occasion, I have hosed my whole system because I said to myself, hey! Why don't I upgrade my glibc? Not that there is anything at all wrong with my current version, but I just don't have the newest and greatest... For most software, that isn't a terribly bad philosophy, but for vital shared libraries, you had better make sure you do it right. I downloaded the latest sources, did ./configure , make , make install , and everything seemed to go wonderfully... until I ran ldconfig ... then everything crashed and burned. I still had my bash prompt, but *every* command would give me a library error. I couldn't even do an ls. I was horrified to think that I was going to have to completely rebuild that system from scratch, and my very faith in the spiffiness of Linux was shaken to its foundations. However, once I cleared my head, I went to my work computer and I found myself some rescue disks. Debian has a nice rescue disk that I had never bothered to write to a floppy. Armed with that, I got back home and booted with it. Then I looked through all of my library directories until I found the new files in /usr/local/lib and I moved everything with yesterday as the file creation date into another directory. After I did that, I was able to boot again, and everything was pristine and happy! I danced with joy.
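
The "move everything from yesterday" step could be sketched like this, to be run from the rescue environment against the mounted disk rather than on the broken system itself. It goes by modification time, which is what find can actually test, and the function name, the quarantine directory, and the mount point in the usage line are all hypothetical.

```shell
# Move recently modified files and symlinks out of a library directory.
# Hypothetical helper: $1 = library dir, $2 = quarantine dir, $3 = age in days.
quarantine_recent_libs() {
    libdir=$1
    dest=$2
    days=$3
    mkdir -p "$dest" || return 1
    # Top level only; skip directories; match entries newer than $days days
    find "$libdir" -maxdepth 1 ! -type d -mtime "-$days" -exec mv {} "$dest"/ \;
}
```

With the damaged system mounted on /mnt, it might be called as quarantine_recent_libs /mnt/usr/local/lib /mnt/root/glibc-oops 2. Looking at the list with ls -l before moving anything is probably wise.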

A couple weeks later, I ran into residual problems when I tried to compile a C++ program. It spewed messages like: `struct streampos' used where a `long long int' was expected /usr/local/lib/gcc-lib/i586-pc-linux-gnu/2.95.2/../../../../include/g++-3/streambuf.h:364 . It turns out that although I had removed the new library files from my dismally failed glibc-2.2 upgrade, the upgrade had changed some of my header files. I tried removing and reinstalling all my g++ and libstdc++ libraries with Debian's dselect, but it made no difference. After much newsgroup searching I found that there were incompatibilities in the header files when using g++ 2.95.2 and glibc-2.2. http://www.gnu.org/software/libc/FAQ.html#s-2.34 described this problem, and had a link to a patch, which was unfortunately a dead link. But armed with the filename, gccinclude-glibc-2.2-compat.diff , I was able to find it mirrored elsewhere.... Eh, well, that was a rambling drawn-out boring story, wasn't it?


What huh? This is the bottom of the file. New stuff goes at the top! If you have anything of Linuxey interest which you wish to share with me, you can find me at Bob-LinuxGremlins@HamsterRepublic.com