| Title: | DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1) |
| Notice: | Welcome to the Digital UNIX Conference |
| Moderator: | SMURF::DENHAM |
| Created: | Thu Mar 16 1995 |
| Last Modified: | Fri Jun 06 1997 |
| Last Successful Update: | Fri Jun 06 1997 |
| Number of topics: | 10068 |
| Total number of notes: | 35879 |
Hi,
A customer is porting to V4.0B an application that used to work fine under V3.2G.
The processing may be summarized as follows:
( ---> represents shared memory segments used to communicate between the
different processes )
        data-input ---> filter ---> treatment ---> data-output
SCHED   FIFO            RR          RR             RR
PRIO    FIFO_MAX        RR_MAX      RR_MAX         RR_MAX
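The policy/priority rows above are presumably set up by each process calling sched_setscheduler() on itself. A minimal sketch using only standard POSIX.1b calls follows; `set_max_priority` and `policy_name` are hypothetical helpers, not names from the customer's code, and FIFO_MAX / RR_MAX are assumed to mean the value returned by sched_get_priority_max() for the policy.

```c
#include <sched.h>
#include <string.h>

/* Hypothetical helper: put the calling process under `policy` at that
 * policy's maximum priority (assumed meaning of FIFO_MAX / RR_MAX).
 * Returns 0 on success, -1 with errno set otherwise (for example EPERM
 * when the caller lacks the privilege for realtime policies). */
int set_max_priority(int policy)
{
    struct sched_param sp;
    int max = sched_get_priority_max(policy);

    if (max == -1)
        return -1;                      /* unknown policy */

    memset(&sp, 0, sizeof sp);
    sp.sched_priority = max;

    /* pid 0 means "the calling process". */
    return sched_setscheduler(0, policy, &sp) == -1 ? -1 : 0;
}

/* Hypothetical helper: printable name for a POSIX policy constant. */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    case SCHED_OTHER: return "SCHED_OTHER";
    default:          return "unknown";
    }
}
```

In this sketch 'data-input' would call `set_max_priority(SCHED_FIFO)` and the three other processes `set_max_priority(SCHED_RR)`. The call names a process, not a thread, which is the distinction the rest of this note turns on.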
For historical reasons, the 'data-input' process is built with the -laio and -threads
options (despite the fact that asynchronous I/O is not used at all in this
process).
When the read operation is not restarted within a few ms, the board generates
an 'overflow' interrupt (which is where we crashed the system).
The crash analysis revealed the following:
The 3 processes running with sched RR have the correct policy/priority;
The 'data-input' process is multithreaded (this is the effect of -threads,
or -pthread under V4.0), and all of its kernel threads are scheduled with
SCHED_OTHER, except one which is correctly scheduled with SCHED_FIFO;
As you may guess, the thread running the critical code is SCHED_OTHER.
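The per-thread policies above were read from a crash dump. From user space, a thread can ask the threads library what policy it runs under with pthread_getschedparam(). The sketch below uses generic POSIX calls; `current_thread_policy` and `spawned_thread_policy` are hypothetical helpers. One caveat, stated as an assumption: on a two-level implementation this reports the policy of the user-level thread as the library sees it, which may not be the policy of the kernel thread that a crash dump shows.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: scheduling policy of the calling thread as
 * reported by the threads library, or -1 on error. */
int current_thread_policy(void)
{
    int policy;
    struct sched_param sp;

    if (pthread_getschedparam(pthread_self(), &policy, &sp) != 0)
        return -1;
    return policy;
}

/* Thread body: store the policy this thread sees into *out. */
static void *report_policy(void *out)
{
    *(int *)out = current_thread_policy();
    return NULL;
}

/* Hypothetical helper: create a thread with default attributes and
 * return the policy that thread reports, or -1 on error. */
int spawned_thread_policy(void)
{
    pthread_t t;
    int policy = -1;

    if (pthread_create(&t, NULL, report_policy, &policy) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return policy;
}
```

Comparing `current_thread_policy()` in the initial thread with `spawned_thread_policy()` shows whether a thread created with default attributes ends up under the policy that was requested for the process.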
Removing the unneeded -laio -pthread options makes the problem disappear.
I suppose this phenomenon is related to two-level scheduling/contention scope.
In any case, from the customer's point of view, sched_setscheduler()
does not seem to do what it should.
Any detailed explanation is welcome.
What solution do we have to work around this (in case we really need async I/O)?
Denis.
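For reference on the contention-scope point: POSIX threads lets a thread request an explicit policy, priority and system contention scope through its creation attributes. The sketch below shows those generic POSIX calls only. It is not a confirmed fix for V4.0B, since whether that release honors system contention scope is not established in this note and reply .1 reports no known workaround prior to PtMin. `init_fifo_system_attr` is a hypothetical helper.

```c
#include <pthread.h>
#include <sched.h>
#include <string.h>

/* Hypothetical helper: initialise `attr` so that a thread created with
 * it requests SCHED_FIFO at maximum priority with system contention
 * scope. Returns 0 on success or an error number; on failure the
 * attribute object is destroyed again. */
int init_fifo_system_attr(pthread_attr_t *attr)
{
    struct sched_param sp;
    int rc;

    if ((rc = pthread_attr_init(attr)) != 0)
        return rc;

    memset(&sp, 0, sizeof sp);
    sp.sched_priority = sched_get_priority_max(SCHED_FIFO);

    /* PTHREAD_EXPLICIT_SCHED is required, otherwise the new thread
     * inherits the creator's scheduling and the settings below are
     * ignored. The policy is set before the priority so the priority
     * can be validated against the right range. */
    if ((rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED)) != 0 ||
        (rc = pthread_attr_setscope(attr, PTHREAD_SCOPE_SYSTEM)) != 0 ||
        (rc = pthread_attr_setschedpolicy(attr, SCHED_FIFO)) != 0 ||
        (rc = pthread_attr_setschedparam(attr, &sp)) != 0) {
        pthread_attr_destroy(attr);
        return rc;
    }
    return 0;
}
```

The attribute object would then be passed as the second argument of pthread_create() for the thread that runs the critical read loop. An implementation without system contention scope is expected to reject the pthread_attr_setscope() call with an error, and thread creation itself may fail without the privilege for realtime policies.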
| T.R | Title | User | Personal Name | Date | Lines |
|---|---|---|---|---|---|
| 9139.1 | Known problem. | WTFN::SCALES | Despair is appropriate and inevitable. | Wed Mar 12 1997 14:03 | 11 |
| 9139.2 | One workaround (makes it hard to maintain, though) | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Wed Mar 12 1997 14:41 | 2 |
| 9139.3 | Yep, that should work... | WTFN::SCALES | Despair is appropriate and inevitable. | Wed Mar 12 1997 16:24 | 4 |

9139.1

.0> I suppose this phenomenon is related to two-level scheduling/contention scope

Correct. The kernel folks are familiar with this problem and are devising a way to address it.

.0> What solution do we have to work around this

There is none, as far as I'm aware, prior to PtMin.

Webb

9139.2

Webb, if it worked in 3.2*, can they link it (non_shared?) there, and get the same behavior on 4.0*?

9139.3

.2> if it worked in 3.2*, can they link it (non_shared?) there, and
.2> get the same behavior on 4.0*?

Yes, that's true...that would be a workaround.