Gas station without pumps

2013 August 1

Mac OS 10.6.8 kernel bug

Filed under: Uncategorized — gasstationwithoutpumps @ 20:40

Today I managed to tickle a rather nasty kernel bug in Mac OS 10.6.8 on my laptop.  The bug resulted in unkillable zombie processes—including one stuck in a busy-wait loop that took up 100% of the CPU (in kernel mode). You can’t kill the processes (not even “kill -9” works), and you can’t shut down or restart the computer—only power cycling works.

Here is how I tickled the bug:

  1. I had a KL25Z board plugged into a USB slot, providing data at 115200 baud.
  2. I started a python program that read the stream from the USB and echoed it to the terminal window:
    #!/usr/bin/env python
    
    from __future__ import print_function,division
    from glob import glob
    from serial import Serial
    
    USB_serial_ports = glob("/dev/tty.usb*")
    print ("# Possible ports = {}".format(USB_serial_ports))
    usb_serial = Serial(USB_serial_ports[0],baudrate=115200,timeout=5)
    print ("# Opened {}".format(usb_serial))
    
    try:
        for line in usb_serial:
            print (line.strip())
    except KeyboardInterrupt:
        usb_serial.close()
    
  3. I used ^C to kill the python program and close the port.  So far, everything was working great.  The program echoed the stream from the KL25Z board nicely to the terminal window, and the try-except block caught the ^C interrupt and (presumably) closed the port.
  4. I started the python program again.  This time, it was unable to open the USB serial port (never getting to the “Opened” print statement), and could not be killed. The Activity Monitor showed that the kernel was using 100% of a CPU core (I have a dual-core MacBook Pro, so this was only half the available CPU). I tried ^C, Force Quit, and “kill -9” with the process number of the Python process—none had any effect. ^Z claimed that the process was suspended, but the kernel was still in a busy-wait loop, and attempting to kill the suspended process had no effect on either the kernel busy-wait or the Python process.
  5. Unplugging the USB cable stopped the kernel busy-wait loop (at least the system CPU usage dropped from 100% down to 2%), but the Python process was still unkillable.

I could do this cycle repeatedly, since each time I plugged in the cable to the KL25Z board the Mac assigned a new device number, and I could thereby build up a lot of unkillable Python processes.  Eventually, I got tired of the large number of unkillable processes, and tried restarting the computer.  It got to the blue screen, then spun the waiting icon forever (well, longer than I was willing to wait), so I had to use option-power to turn off the computer.  It rebooted fine.

I have no such trouble with the Arduino Duemilanove board (which resets whenever the USB serial port is opened).  The same Python program also works fine with the Arduino Leonardo board, which does not reset.  I can run my little echoing program, ^C it, and run it again to continue getting the stream from the board.

There must be some difference between how the Leonardo sets up the USB serial connection and how the KL25Z board does it—a difference that causes a kernel busy-wait when reopening the stream from the KL25Z, but not when re-opening the stream from the Leonardo.  Since the default serial setup is the same for both, I’m a little mystified what that could be.

Of course, I’m now left with a bit of a dilemma—how do I record data from the KL25Z board without having to unplug and replug the USB cable for every event I want to record?
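One possible way around this, untested against the bug described here, would be to avoid reopening the port at all: open it once and keep the same process running across recordings, since the hang only appeared on the second open. The sketch below is a minimal illustration under the assumption that each event is a fixed number of lines; `record_events`, `lines_per_event`, and the fake stream are hypothetical names, and with real hardware `stream` would be the `Serial` object from the program above.

```python
from itertools import islice

def record_events(stream, n_events, lines_per_event):
    """Read n_events recordings from one already-open line stream.

    The caller opens the stream once; it is never closed or reopened here.
    """
    lines = iter(stream)
    events = []
    for _ in range(n_events):
        # Take the next block of lines without touching the port itself.
        block = [line.strip() for line in islice(lines, lines_per_event)]
        if not block:
            break   # stream ended (e.g. read timeout) before another event
        events.append(block)
    return events

if __name__ == "__main__":
    # Stand-in for the serial port: any iterable of lines works.
    fake_stream = ["1\n", "2\n", "3\n", "4\n", "5\n"]
    print(record_events(fake_stream, 3, 2))
```

Whether this actually sidesteps the kernel hang is an open question; it only removes the close-and-reopen step that preceded it.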

I looked around the Web to see if anyone else had similar problems.  It does seem to have been a problem with OS 10.6 in other USB contexts:

http://support.plugable.com/plugable/topics/baud_rate_switches_cause_driver_hangs_in_the_kernel

http://stackoverflow.com/questions/4064832/open-function-hangs-never-returns-when-trying-to-open-serial-port-in-mac-os

The Stack Overflow answers suggest that the problem is a buggy device driver, but I’ve no idea which device driver either the Leonardo or the KL25Z board (with MBED firmware) causes to be used on the Mac.  Are they using different drivers?  How do I find out?

 

According to the USB Prober application, the KL25Z board opens with

           4: MBED CMSIS-DAP@4100000  <class IOUSBDevice>
               AppleUSBCDC  <class AppleUSBCDC>
               USB_MSC@0  <class IOUSBInterface>
                   IOUSBMassStorageClass  <class IOUSBMassStorageClass>
               IOUSBInterface@1  <class IOUSBInterface>
                   AppleUSBCDCACMControl  <class AppleUSBCDCACMControl>
               IOUSBInterface@2  <class IOUSBInterface>
                   AppleUSBCDCACMData  <class AppleUSBCDCACMData>
                       IOModemSerialStreamSync  <class IOModemSerialStreamSync>
               MBED CMSIS-DAP@3  <class IOUSBInterface>
                   IOUSBHIDDriver  <class IOUSBHIDDriver>
                       IOHIDInterface  <class IOHIDInterface>

while the Leonardo opens with

           3: Arduino Leonardo@6200000  <class IOUSBDevice>
               AppleUSBCDC  <class AppleUSBCDC>
               IOUSBInterface@0  <class IOUSBInterface>
                   AppleUSBCDCACMControl  <class AppleUSBCDCACMControl>
               IOUSBInterface@1  <class IOUSBInterface>
                   AppleUSBCDCACMData  <class AppleUSBCDCACMData>
                       IOModemSerialStreamSync  <class IOModemSerialStreamSync>
               IOUSBInterface@2  <class IOUSBInterface>
                   IOUSBHIDDriver  <class IOUSBHIDDriver>
                       IOHIDInterface  <class IOHIDInterface>

The biggest difference seems to be that the KL25Z board (with MBED firmware) offers one more interface—the mass-storage interface that is used for downloading programs to the board. I don’t understand a lot of what I can see poking around with USB Prober, but I’m not seeing anything else that looks like a significant difference between the Leonardo board and the KL25Z (with MBED firmware). Certainly the serial interface specs look identical, so far as I can tell.
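The comparison of the two listings can be done mechanically. The following is a rough sketch that counts the `<class …>` names in each USB Prober dump (copied from above) and reports what the KL25Z has beyond the Leonardo; `class_counts` is a hypothetical helper name. It compares only driver class names, not the descriptors themselves, so it cannot rule out subtler differences.

```python
import re
from collections import Counter

KL25Z_TREE = """
MBED CMSIS-DAP@4100000  <class IOUSBDevice>
    AppleUSBCDC  <class AppleUSBCDC>
    USB_MSC@0  <class IOUSBInterface>
        IOUSBMassStorageClass  <class IOUSBMassStorageClass>
    IOUSBInterface@1  <class IOUSBInterface>
        AppleUSBCDCACMControl  <class AppleUSBCDCACMControl>
    IOUSBInterface@2  <class IOUSBInterface>
        AppleUSBCDCACMData  <class AppleUSBCDCACMData>
            IOModemSerialStreamSync  <class IOModemSerialStreamSync>
    MBED CMSIS-DAP@3  <class IOUSBInterface>
        IOUSBHIDDriver  <class IOUSBHIDDriver>
            IOHIDInterface  <class IOHIDInterface>
"""

LEONARDO_TREE = """
Arduino Leonardo@6200000  <class IOUSBDevice>
    AppleUSBCDC  <class AppleUSBCDC>
    IOUSBInterface@0  <class IOUSBInterface>
        AppleUSBCDCACMControl  <class AppleUSBCDCACMControl>
    IOUSBInterface@1  <class IOUSBInterface>
        AppleUSBCDCACMData  <class AppleUSBCDCACMData>
            IOModemSerialStreamSync  <class IOModemSerialStreamSync>
    IOUSBInterface@2  <class IOUSBInterface>
        IOUSBHIDDriver  <class IOUSBHIDDriver>
            IOHIDInterface  <class IOHIDInterface>
"""

def class_counts(tree_text):
    """Count how often each <class ...> name appears in a dump."""
    return Counter(re.findall(r"<class (\w+)>", tree_text))

# Counter subtraction keeps only the classes that occur more often on the left.
extra = class_counts(KL25Z_TREE) - class_counts(LEONARDO_TREE)
print(sorted(extra.items()))
# prints [('IOUSBInterface', 1), ('IOUSBMassStorageClass', 1)]
```

The result matches the reading above: one extra interface, bound to the mass-storage driver, and nothing on the Leonardo that the KL25Z lacks.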

2011 July 13

Simple timer

Filed under: Data acquisition,Software — gasstationwithoutpumps @ 01:51

One of the simplest and most useful tasks that a data acquisition system can do is to measure the time between two events.

On the Arduino, this is easily done with digitalRead() to detect the transition and micros() to time the events in microseconds. First you have to figure out what pins the inputs will use, and in the setup() routine, specify that they will be used for input, not output. In my example, I have used pin 4 for the first event and pin 5 for the second event.

In the loop() routine, have the Arduino do a busy-wait for the first event. Busy-waiting is a crude technique for determining when an event happens—you just keep checking the input until it is the value you want for starting the event. Modern computers rarely do busy waiting, because it ties up the processor doing nothing useful. The more common approach is to use an interrupt, which is possible with the Arduino, but a bit trickier to code. Since we weren’t doing anything else with the Arduino while waiting for the initial event, it made sense to me to do a busy wait.

Immediately after the starting event, record the current time. I also turned on the on-board LED (pin 13) to indicate that the board was now waiting for the ending event.

Then busy-wait for the ending event, and record the time right after it.

My example reports the difference in times on the USB line, but you could store the results in an array or do further processing of the times.

// Timer test
// Kevin Karplus
// 12 July 2011

// One of the simplest data acquisition tasks is
// to time the interval between two events.

// This program waits for pin 4 to go high,
//   starts a timer, lights the on-board LED,
//   then waits for pin 5 to go low,
//   when it turns off the LED and stops the timer.
// It reports the time between pin 4 going high and pin 5 going low
//   in microseconds.

// It is easy to change the code to use
// either transitions to high or transitions to low.
// Using a pull-up resistor can convert any make or break contact to
// a low-going or high-going edge.
// Opposite polarities were used in this example, so that the
// same square wave could be fed to both pins and the width of the
// positive pulse timed.


// The minimum time it can report (if pin 5 is already low when
// pin 4 goes high) is about 12 microseconds +- 4 microseconds.
// The reported time seems to be always a multiple of 4 microseconds,
// which is most likely the resolution of the micros() call.

// Timing seems to be fairly reliable down to about 60 microseconds.

void setup()
{
    pinMode(4, INPUT);
    pinMode(5, INPUT);

    pinMode(13, OUTPUT);    // LED output
    Serial.begin(115200);
}

void loop()
{
    unsigned long start_time;
    unsigned long stop_time;

    digitalWrite(13,0);
    while(digitalRead(4)==0) {}
    start_time=micros();
    digitalWrite(13,1);
    while (digitalRead(5)) {}
    stop_time=micros();
    digitalWrite(13,0);
    Serial.print("time hi->lo=");
    Serial.print(stop_time-start_time);
    Serial.println(" microsec");
    delay(500);
}

[Update: 8 October 2011. I redid the code using the sourcecode tag, so that WordPress no longer discards the spacing.]

I put this little test program on gist as git://gist.github.com/1079561.git
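On the computer side, the report lines can be turned back into numbers for further processing. Here is a minimal sketch, assuming the lines arrive as text in the form `time hi->lo=N microsec` (bytes read from a serial port would need decoding first); `parse_times` and the sample values are made up for illustration.

```python
import re

# Matches the report format printed by the Arduino program above.
REPORT = re.compile(r"time hi->lo=(\d+) microsec")

def parse_times(lines):
    """Return the reported intervals (in microseconds) as a list of ints."""
    times = []
    for line in lines:
        match = REPORT.search(line)
        if match:   # skip anything that is not a report line
            times.append(int(match.group(1)))
    return times

if __name__ == "__main__":
    # Invented sample input, not measured data.
    sample = [
        "time hi->lo=500 microsec",
        "garbled line",
        "time hi->lo=508 microsec",
    ]
    times = parse_times(sample)
    print(times)
    print("mean:", sum(times) / float(len(times)))
```

Lines that do not match the pattern are silently dropped, which seems a reasonable default for a stream that may start mid-line.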
