RU.LINUX
---------------------------------------------------------------------
From    : Sergey Lentsov (2:4615/71.10), 02 Aug 2001 16:37:48
To      : All
Subject : URL: http://www.lwn.net/2001/0802/kernel.php3
---------------------------------------------------------------------
See also: [14]last week's Kernel page.
Kernel development
The current kernel release is still 2.4.7. The 2.4.8 prepatch is
currently at [15]2.4.8pre3; it includes the usual collection of fixes,
along with the use-once patch from Daniel Phillips which was covered
[16]last week. There have been [17]complaints that the 2.4.8pre series
is much slower on systems with large amounts of memory; the VM hackers
are currently hot on the trail of those problems.
Users of Adaptec adapters (i.e. your editor, grumble grumble...) on
SMP systems were unpleasantly surprised by 2.4.8pre2, which crashed
on boot. The check that caused the crash has been removed, but a
strange problem still appears to lurk in there somewhere.
Alan Cox's latest patch is [18]2.4.7ac3. It contains a great many
architecture-specific changes; slowly the kernel trees for the various
ports are finding their way back toward the mainline. There are also
some enhancements for User-Mode Linux and many miscellaneous fixes.
A new kernel API for completion events. It is common in kernel code to
set some sort of process in motion, then to go to sleep and wait until
that process completes. There are several ways of implementing the
"wait for completion" part; which is the proper one to use depends on
the specific situation. Until 2.4.7 came out, one technique in use
involved semaphores. The initiating process would declare a semaphore
as a local variable (i.e. on the stack), starting out in the locked
state; the process would do what was needed to arrange for some work
to be done, then wait on the semaphore. The code actually doing the
work would simply unlock the semaphore when the task was complete.
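
The pattern just described can be sketched in user space. This is only
an analogy, not kernel code: it assumes POSIX threads and unnamed POSIX
semaphores (sem_init(), sem_wait(), sem_post()) in place of the kernel's
own semaphore calls, and the names struct job, worker() and
start_and_wait() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

/* Hypothetical job descriptor shared by the initiator and the worker. */
struct job {
    sem_t *done;   /* points at a semaphore on the initiator's stack */
    int result;    /* filled in by the worker */
};

/* The code "actually doing the work": store a result, then unlock. */
static void *worker(void *arg)
{
    struct job *job = arg;

    job->result = 42;
    sem_post(job->done);         /* unlock: the task is complete */
    return NULL;
}

/* The initiator: the semaphore is a local variable and starts locked. */
int start_and_wait(void)
{
    sem_t done;
    struct job job = { &done, 0 };
    pthread_t tid;
    int rc;

    rc = sem_init(&done, 0, 0);  /* initial value 0 means "locked" */
    assert(rc == 0);
    rc = pthread_create(&tid, NULL, worker, &job);
    assert(rc == 0);

    /* Returns at once if the worker has already posted; retry on EINTR. */
    while (sem_wait(&done) == -1 && errno == EINTR)
        continue;

    /* Join before returning so `done` outlives the worker's use of it. */
    pthread_join(tid, NULL);
    sem_destroy(&done);
    return job.result;
}
```

Calling start_and_wait() returns the value stored by the worker. The
pthread_join() call is specific to this sketch; it keeps the stack
variable alive until the worker thread has finished.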
On the surface, this technique is appealing because it avoids some
obvious race conditions. If, for example, the work gets done before
the initiating process gets around to waiting on the semaphore, that
process finds the semaphore already unlocked and simply doesn't wait.
The sleep_on() and wake_up() calls can be much trickier to use
correctly in this situation. But, as it turns out, there is a race
condition here too, which is a result of how the semaphores themselves
work.
When a semaphore is to be unlocked, the code (1) sets the semaphore
itself to the unlocked state, then (2) calls wake_up() to notify any
processes that might have been waiting on the semaphore. If the waiter
tests the semaphore between those two steps, it will never actually
wait, and may well execute the rest of its code before the wake_up()
call happens. That is not normally a problem, but, if the semaphore is
sitting on a kernel stack somewhere, it could cease to exist before
the wake_up() call, which requires data from the semaphore, runs. In
other words, wake_up() could be working with a pointer into random
memory; the technical term for this is "oops." This particular race is
highly unlikely to ever actually happen, but it's still a race.
The performance of this approach is also suboptimal, because
semaphores are optimized for the unlocked case. In this particular
situation, the semaphore will almost always be locked.
Linus chose not to change the semaphore implementation (it's "painful
as hell"); instead, he [19]created a new interface for the handling of
completion events. All a process need do to use this facility is to
create and initialize a completion structure:
    struct completion event;
    init_completion(&event);
Then it can set things in motion, and call:
    wait_for_completion(&event);
to sleep until things are done. The task actually doing the work can
perform a simple call to
    complete(&event);
and the waiting process wakes up.
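
To make the three calls concrete, here is a user-space sketch with the
same names. It is not the kernel's implementation: the structure layout
and the use of a pthread mutex and condition variable are assumptions
made for illustration, and do_work() and demo_completion() are
hypothetical helpers. In this sketch the "done" flag is set and the
waiter is woken under a single lock, so the waiter cannot see the event
as finished while complete() is still touching the structure.

```c
#include <assert.h>
#include <pthread.h>

/* User-space stand-in for struct completion (layout is an assumption). */
struct completion {
    pthread_mutex_t lock;
    pthread_cond_t wait;
    int done;
};

void init_completion(struct completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->wait, NULL);
    c->done = 0;
}

/* Sleep until complete() has been called on this event. */
void wait_for_completion(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)
        pthread_cond_wait(&c->wait, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

/* Mark the event done and wake the waiter, all under one lock. */
void complete(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = 1;
    pthread_cond_signal(&c->wait);
    pthread_mutex_unlock(&c->lock);
}

/* Hypothetical worker: does nothing but signal completion. */
static void *do_work(void *arg)
{
    complete(arg);
    return NULL;
}

/* Hypothetical initiator: the event lives on this function's stack. */
int demo_completion(void)
{
    struct completion event;
    pthread_t tid;
    int rc;

    init_completion(&event);
    rc = pthread_create(&tid, NULL, do_work, &event);
    assert(rc == 0);

    wait_for_completion(&event);
    pthread_join(tid, NULL);
    return event.done;
}
```

demo_completion() returns 1 once the worker has signalled the event.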
It's a relatively straightforward solution, even if changing APIs in
the middle of a stable kernel series may look a little strange. If
nothing else, the whole affair makes it clear, once again, just how
hard it is to avoid race conditions in kernel code.
The first initramfs patch was [20]posted by Alexander Viro this week.
This patch is the implementation of the new 2.5 boot process that was
first discussed in the [21]July 12 kernel page. In this scheme, the
kernel executable image carries with it a cpio archive containing the
contents of the initial root filesystem. That archive is loaded into a
ramdisk at boot time, at which point it can be used to continue the
system initialization process.
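
For readers unfamiliar with cpio, the following sketch shows what
reading one archive entry header might look like. The text above says
only that the archive is a cpio archive; the sketch assumes the ASCII
"newc" variant (magic "070701" followed by thirteen 8-digit hexadecimal
fields, 110 bytes in all). struct newc_entry, parse_hex8() and
parse_newc_header() are hypothetical names and are not taken from the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NEWC_MAGIC      "070701"
#define NEWC_HEADER_LEN 110      /* 6 magic bytes + 13 fields * 8 digits */

/* The few header fields this sketch extracts. */
struct newc_entry {
    unsigned long mode;      /* file type and permission bits */
    unsigned long filesize;  /* length of the file data */
    unsigned long namesize;  /* length of the name, including its NUL */
};

/* Decode exactly eight hex digits; return -1 on any other character. */
static int parse_hex8(const char *p, unsigned long *out)
{
    unsigned long v = 0;

    for (int i = 0; i < 8; i++) {
        char c = p[i];
        int d;

        if (c >= '0' && c <= '9')
            d = c - '0';
        else if (c >= 'a' && c <= 'f')
            d = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F')
            d = c - 'A' + 10;
        else
            return -1;
        v = (v << 4) | (unsigned long)d;
    }
    *out = v;
    return 0;
}

/* Parse one header; return 0 on success, -1 if it is short or malformed. */
int parse_newc_header(const char *hdr, size_t len, struct newc_entry *e)
{
    assert(hdr != NULL && e != NULL);

    if (len < NEWC_HEADER_LEN || memcmp(hdr, NEWC_MAGIC, 6) != 0)
        return -1;

    /* Field order: ino, mode, uid, gid, nlink, mtime, filesize,
     * devmajor, devminor, rdevmajor, rdevminor, namesize, check. */
    if (parse_hex8(hdr + 6 + 8 * 1, &e->mode) != 0 ||
        parse_hex8(hdr + 6 + 8 * 6, &e->filesize) != 0 ||
        parse_hex8(hdr + 6 + 8 * 11, &e->namesize) != 0)
        return -1;
    return 0;
}
```

A loader built along these lines would read the name and the file data
that follow each header, then move on to the next entry.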
The hope is to move much of the kernel initialization code out of kernel
space and into this ramdisk. The result is a smaller kernel and more
flexibility in how the bootstrap process is set up. For the moment,
the tasks that have been moved to user space include:
* Finding and mounting the real (permanent) root filesystem. NFS
root filesystems are handled here as well.
* Setting up any initial ramdisk (usually for the purpose of loading
kernel modules needed for the boot process).
* Running the linuxrc boot script.
* Finding the real init process and running it.
There is more that can be moved into this filesystem, but that's a
good start. The claim is that kernels running with this patch will
function identically; no boot setups should be broken or require
changes. Mr. Viro would, of course, like to hear from anybody with
evidence to the contrary.
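
The last of the user-space tasks listed above, finding the real init
process and running it, might look roughly like the following. This is
a sketch under stated assumptions and not code from the patch:
find_init() and run_init() are hypothetical names, and the caller
supplies the list of candidate paths.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Return the first candidate that exists and is executable, or NULL. */
const char *find_init(const char *const candidates[], size_t n)
{
    assert(candidates != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        if (access(candidates[i], X_OK) == 0)
            return candidates[i];
    return NULL;
}

/* Replace the current process with init; returns -1 only on failure. */
int run_init(const char *path)
{
    char *const argv[] = { (char *)path, NULL };

    execv(path, argv);
    return -1;
}
```

A caller might pass a list such as "/sbin/init" followed by "/bin/sh"
as a fallback, then hand the result to run_init().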
Heading toward ext3 1.0. [22]ext3 2.4-0.9.5 was released by Andrew
Morton. This version continues the work toward a truly stable ext3
journaling filesystem release, fixing a number of bugs. Much work has
also gone into performance improvements on a number of fronts. Among
other things, synchronous operations happen more quickly; this should
make people running large mail systems happy, since many mail transfer
agents make heavy use of synchronous directory operations.
Another change in 0.9.5 is the ability to use an external journal.
External journals live on a separate device (perhaps a non-volatile
RAM device), and, in theory, can speed up the operation of the
filesystem. Writes to an external journal should be very quick, and
journal operations will not contend with writes to the rest of the
disk. The initial performance results with external journals appear to
be mixed, however.
Those interested in ext3 may also want to see [23]an older patch
announcement from Andrew which contains a detailed explanation of the
three journaling modes supported.
Much slower routing performance in 2.4 has been reported by some
users. The common factor in these reports is that the people involved
are still using the 2.2 ipchains interface to set up their
firewalling. The ipchains module in 2.4 carries full connection
tracking along with it; most people setting up ipchains rules probably
do not need that feature. The solution is to switch to iptables.
Other patches and updates released this week include:
* Daniel Phillips has posted [24]a new version of his patch for the
handling of pages that are used only once.
* Anton Altaparmakov has [25]released version 1.2.0 of the
Linux-NTFS support tools.
* Also from Anton is [26]this patch which adds support for Windows
2000/XP dynamic disks.
* David Schleef has posted [27]Comedi-0.7.60, a collection of data
acquisition device drivers.
* Alan Cox has [28]modified the kernel Makefile to add a "make rpm"
target. The result, of course, is an RPM file containing the
compiled kernel. A "make deb" option will likely be added in the
near future.
* Milan Pikula has [29]started a new mailing list for those who are
interested in filesystem repair and crash recovery topics.
* An [30]Mwave modem driver for 2.4.7 has been released by Paul
Schroeder.
* [31]devfsd v1.3.12 was released by Richard Gooch.
* Richard also released [32]a patch that, when used with devfs,
enables a 2.4 kernel to support up to ("approximately") 2144 SCSI
disks. He warns that it is untested and could result in filesystem
corruption. There have been few problem reports, but it turns out
that, for now, limitations in other parts of the system will still
limit the maximum number of disks to far less than 2144.
* The Linux Test Project has released [33]ltp-20010801, the latest
version of its kernel test suite.
* Andreas Gruenbacher has posted [34]an access control list patch
for 2.4.7.
* Constantin Loizides has been working with a number of journaling
filesystems to determine the degree to which they experience
fragmentation under long-term, sustained use. He has [35]posted
his findings; the results vary significantly between the various
filesystems.
* Keith Owens has released [36]a 2.5 kbuild release candidate.
* Adam Goode has [37]started a project to write a driver for the
Logitech iFeel mouse. This device is fun in that it can be used to
provide tactile feedback to the user - little bumps as the pointer
moves over buttons and such.
* [38]mdctl 0.4 was released by Neil Brown; it is a utility for
controlling RAID devices, meant to replace mkraid, raidstart, etc.
* Version 2.2.0 of the [39]Functionally Overloaded Linux Kernel
patch is now available; it has almost anything one could imagine,
including several kitchen sinks. FOLK creator Jonathan Day informs
us that the size of the patch is now 1/3 that of the standard
kernel.
Section Editor: [40]Jonathan Corbet
August 2, 2001
For other kernel news, see:
* [41]Kernel traffic
* [42]Kernel Newsflash
* [43]Kernel Trap
Other resources:
* [44]Kernel Source Reference
* [45]L-K mailing list FAQ
* [46]Linux-MM
* [47]Linux Scalability Project
* [48]Kernel Newbies
[50]Eklektix, Inc. Linux powered! Copyright © 2001 [51]Eklektix, Inc.,
all rights reserved
Linux (R) is a registered trademark of Linus Torvalds
References
1. http://lwn.net/
2. http://ads.tucows.com/click.ng/pageid=001-012-132-000-000-003-000-000-012
3. http://lwn.net/2001/0802/
4. http://lwn.net/2001/0802/security.php3
5. http://lwn.net/2001/0802/dists.php3
6. http://lwn.net/2001/0802/desktop.php3
7. http://lwn.net/2001/0802/devel.php3
8. http://lwn.net/2001/0802/commerce.php3
9. http://lwn.net/2001/0802/press.php3
10. http://lwn.net/2001/0802/announce.php3
11. http://lwn.net/2001/0802/history.php3
12. http://lwn.net/2001/0802/letters.php3
13. http://lwn.net/2001/0802/bigpage.php3
14. http://lwn.net/2001/0726/kernel.php3
15. http://lwn.net/2001/0802/a/2.4.8pre3.php3
16. http://lwn.net/2001/0726/kernel.php3
17. http://lwn.net/2001/0802/a/2.4.8-vm.php3
18. http://lwn.net/2001/0802/a/2.4.7ac3.php3
19. http://lwn.net/2001/0802/a/lt-completions.php3
20. http://lwn.net/2001/0802/a/initramfs.php3
21. http://lwn.net/2001/0712/kernel.php3
22. http://lwn.net/2001/0802/a/ext3.php3
23. http://lwn.net/2001/0802/a/ext3-modes.php3
24. http://lwn.net/2001/0802/a/use-once.php3
25. http://lwn.net/2001/0802/a/ntfs.php3
26. http://lwn.net/2001/0802/a/dynamic-disks.php3
27. http://lwn.net/2001/0802/a/comedi.php3
28. http://lwn.net/2001/0802/a/make-rpm.php3
29. http://lwn.net/2001/0802/a/fs-salvage.php3
30. http://lwn.net/2001/0802/a/mwave.php3
31. http://lwn.net/2001/0802/a/devfsd.php3
32. http://lwn.net/2001/0802/a/lotsa-scsi.php3
33. http://lwn.net/2001/0802/a/ltp.php3
34. http://lwn.net/2001/0802/a/acl.php3
35. http://lwn.net/2001/0802/a/fragmentation.php3
36. http://lwn.net/2001/0802/a/kbuild.php3
37. http://lwn.net/2001/0802/a/tactile.php3
38. http://lwn.net/2001/0802/a/mdctl.php3
39. http://folk.sourceforge.net/
40. mailto:lwn@lwn.net
41. http://kt.zork.net/
42. http://www.atnf.csiro.au/~rgooch/linux/docs/kernel-newsflash.html
43. http://www.kerneltrap.com/
44. http://lksr.org/
45. http://www.tux.org/lkml/
46. http://www.linux.eu.org/Linux-MM/
47. http://www.citi.umich.edu/projects/linux-scalability/
48. http://www.kernelnewbies.org/
49. http://lwn.net/2001/0802/dists.php3
50. http://www.eklektix.com/
51. http://www.eklektix.com/
--- ifmail v.2.14.os7-aks1
* Origin: Unknown (2:4615/71.10@fidonet)