RU.LINUX
--------------------------------------------------------------------------------
From    : Sergey Lentsov 2:4615/71.10 10 May 2001 17:11:13
To      : All
Subject : URL: http://lwn.net/2001/0510/kernel.php3
--------------------------------------------------------------------------------
See also: [14]last week's Kernel page.
Kernel development
The current kernel release is 2.4.4. There have been no kernel
releases (not even prepatches) from Linus since [15]2.4.5pre1 came out
on May 2.
Alan Cox remains busy; his latest is [16]2.4.4ac6, which contains
another long list of fixes but nothing radical.
To top it off, Alan has also started the 2.2.20 prepatch series with
[17]2.2.20pre1. Only serious fixes are going in at this point:
"Expect me to be very picky on changes to the core code now."
Moving block devices to the page cache. In [18]last week's kernel page
we looked at a subtle metadata corruption bug brought about by the
fact that I/O to block devices uses the buffer cache, while the
filesystem code uses the page cache. Conversation on this topic has
continued in this (otherwise slow) week, so it's worth another look.
Some background first...
Linux systems use two distinct caches to improve performance. Both are
used to keep copies of disk-resident data in main memory, and thus to
avoid excessive disk I/O operations. These caches are:
* The buffer cache holds individual disk blocks; entries in the
cache are indexed by the device and block numbers. Unix-like
systems have had a buffer cache for a very long time, and the
block I/O system is built around the "buffer head" structure used
to implement the buffer cache.
* The page cache, instead, holds full pages. The pages come from
files in the file system, and, in fact, page cache entries are
indexed (more or less) by the file's inode number and the offset
within the file. A page is almost invariably larger than a single
disk block, and the blocks that make up a single page cache entry
may not be contiguous on the disk.
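As a rough illustration of the two indexing schemes just described, the
following Python sketch models each cache as a dictionary. The class names,
the 1 KB block size, and the 4 KB page size are assumptions made for the
example; the kernel's real structures are considerably more involved.

```python
BLOCK_SIZE = 1024   # assumed disk block size, for illustration only
PAGE_SIZE = 4096    # assumed page size, for illustration only


class BufferCache:
    """Toy buffer cache: one entry per disk block, keyed by (device, block)."""

    def __init__(self):
        self.entries = {}

    def put(self, device, block, data):
        self.entries[(device, block)] = data

    def get(self, device, block):
        return self.entries.get((device, block))


class PageCache:
    """Toy page cache: one entry per page, keyed by (inode, page-aligned offset)."""

    def __init__(self):
        self.entries = {}

    def put(self, inode, offset, data):
        # Round the offset down to the start of the page that contains it.
        self.entries[(inode, offset - offset % PAGE_SIZE)] = data

    def get(self, inode, offset):
        return self.entries.get((inode, offset - offset % PAGE_SIZE))


def blocks_for_page(block_map, page_index):
    """Return the disk blocks backing one page of a file.

    block_map is a hypothetical list mapping each logical file block to its
    physical disk block; the blocks of a page need not be contiguous.
    """
    per_page = PAGE_SIZE // BLOCK_SIZE
    return block_map[page_index * per_page:(page_index + 1) * per_page]
```

With a block map such as [10, 11, 50, 51, ...], the first page of the file is
backed by blocks 10, 11, 50 and 51, which shows how a single page cache entry
can span scattered disk blocks.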
The page cache tends to be easier to deal with, since it more directly
represents the concepts used in higher levels of the kernel code.
Thus, over time, parts of the kernel have shifted over from using the
buffer cache to using the page cache.
The individual blocks of a page cache entry, of course, are still
managed through the buffer cache. But, as we saw last week, accessing
the buffer cache directly can create confusion between the two levels
of caching.
Reading and writing a block device directly, as is done by utilities
like dump and fsck, works only with the buffer cache. It turns out
that Linus [19]wants to change this behavior, even though he is not
tremendously concerned about the corruption problem discussed last
week. Having block devices use the page cache will clean up a lot of
design issues, improve performance, and get away from the idea of
using the buffer cache as a cache. The buffer cache, for Linus, really
should just be a low-level block I/O mechanism that leaves the actual
caching tasks to higher levels.
Not much time passed before Andrea Arcangeli [20]released a patch
moving block I/O into the page cache. Essentially, he has eliminated
the special-purpose block_read and block_write functions, and made a
block device look like a large file. So now the general-purpose file
I/O functions may be used instead.
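Seen from user space, "a block device looks like a large file" means that
block N simply lives at byte offset N times the block size. The sketch below
illustrates that view with ordinary file I/O; it is not Andrea's patch, and
the function names and the 4096-byte block size are assumptions. A regular
file can stand in for the device when trying it out.

```python
def read_block(path, block, block_size=4096):
    """Read one block with ordinary file I/O: seek to block * block_size."""
    with open(path, "rb") as dev:
        dev.seek(block * block_size)
        return dev.read(block_size)


def write_block(path, block, data, block_size=4096):
    """Overwrite one block in place; data must be exactly one block long."""
    if len(data) != block_size:
        raise ValueError("data must be exactly one block long")
    with open(path, "r+b") as dev:
        dev.seek(block * block_size)
        dev.write(data)
```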
As an added bonus, Andrea has obsoleted the raw I/O interface,
implementing instead an O_DIRECT flag which may be used to perform I/O
directly between the device and user space. This change makes raw I/O
a much more straightforward affair, since it's no longer necessary to
set up and bind the separate /dev/raw devices.
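For readers who want to see what an O_DIRECT read looks like, here is a
minimal sketch using the flag as Python's os module exposes it on Linux
today, which may differ in detail from the interface in Andrea's patch.
Direct I/O generally requires the buffer, offset, and length to be aligned;
the 4096-byte alignment used here is an assumption, as is the function name.
Some filesystems reject O_DIRECT altogether.

```python
import mmap
import os

ALIGN = 4096  # assumed alignment; the real requirement depends on the device


def read_direct(path, offset=0, length=ALIGN):
    """Read `length` bytes at `offset`, bypassing the kernel's caches."""
    if offset % ALIGN or length % ALIGN:
        raise ValueError("offset and length must be multiples of ALIGN")
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        # An anonymous mmap is page-aligned, which satisfies the buffer
        # alignment that direct I/O typically demands.
        buf = mmap.mmap(-1, length)
        try:
            os.lseek(fd, offset, os.SEEK_SET)
            n = os.readv(fd, [buf])
            return buf[:n]
        finally:
            buf.close()
    finally:
        os.close(fd)
```

No /dev/raw binding step is involved: the flag is simply passed when the
device (or file) is opened.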
A change of this magnitude, of course, would not normally be expected
to go into the 2.4 kernel - though some other surprising things have
made it in. Expect something like Andrea's patch to be
incorporated early in the 2.5 cycle, however.
ReiserFS - ready for prime time. Hans Reiser has [21]posted a note
saying, essentially, that all of the real bugs in the ReiserFS
filesystem have been fixed as of 2.4.4. Since the filesystem was
included in 2.4.1, its user base has grown greatly and that has, not
surprisingly, led to an increase in bug reports. The ReiserFS hackers
have been tracking down these problems quickly, and many fixes have
come out. As a result, the "beta period" appears to have come to a
close.
There are a few outstanding issues, though. ReiserFS still only works
on little-endian machines, for example (a patch exists which fixes this
problem, but it hasn't seen wide testing yet). You still need to apply
an additional patch to use ReiserFS and the NFS server together. And
the filesystem checker tool still needs some work. But the biggest
problems appear to have been overcome; the "experimental" label may be
removed from ReiserFS in a kernel release soon.
The problem of broken configurations in CML2. Now that a lot of the
CML2 issues have been resolved, people are starting to think more
about how they will actually use the new kernel configuration system.
And a bit of a problem has come up.
Anybody who builds a lot of kernels quickly becomes enamored of the
"make oldconfig" operation, which makes a configuration from an old
kernel work with a new one. It will stop and ask about any new
configuration options, and it makes some attempts to resolve things
when an old configuration violates the rules in the new kernel.
Some hackers noticed that CML2 did not handle things well when a new
kernel adds rules that make an old configuration invalid. Eric
Raymond's initial response was [22]to say that recovering from broken
configurations was too hard. He had the numbers to back the point up:
But wait! There's more! If some of the variables participate in
multiple constraints, the numbers get *really* large. Worst-case
you wind up having to filter 3^1976 or
61886985104344314262549831301497223184442226760005632366142367454062\
53798069007245829607511803014461980205195265648765807533359692422405\
26663343478651948197640717559171334587246360190820597462466618699616\
83769466038480440588536443139761873343981834731232898868121056624288\
25175698197266097855144317654507849536499564272166336474891989097438\
35187399533347347604275259693285565328638904436467418552386274533685\
91327533953419273284845915678229675363862482902467758788105098892672\
89040426968478652648633090613090819909922898996729964073665423236084\
87819939319685920863027286269975666073166040062426792612975756185462\
81534154977458915332736966975415596732075433912438120798023875787687\
12139869442963906795755406077094024235937984546041146032870399467676\
50750114775766120549985366981610796100249952621482595580440335923663\
89536648507944663518188694691546583650254496327051865064380044199561\
11898186436375597975714968012719658007155903874756222061921
distinct configurations. The heat-death of the Universe happens
while you're still crunching.
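The size of that figure is easy to check. The short Python sketch below
computes 3^1976 exactly and counts its digits; reading the base as three
possible states per configuration symbol and the exponent as the number of
symbols is an assumption, since the quoted message does not spell it out.

```python
import math

SYMBOLS = 1976  # exponent from the quoted message
STATES = 3      # base from the quoted message (assumed: states per symbol)

count = STATES ** SYMBOLS          # Python integers are arbitrary precision
digits = len(str(count))           # exact digit count
estimate = math.floor(SYMBOLS * math.log10(STATES)) + 1  # same count via logs

print(digits, estimate)
```

Both methods give 943 digits, which matches the length of the number quoted
above.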
People might have been more impressed with this display of
mathematical analysis skills if it weren't for the fact that
make oldconfig works with the old configuration system. The problem,
perhaps, is that the technique used (configure out anything that
breaks the rules in the new kernel) [23]lacks the sort of elegance
that Eric would like to see in his code:
I guess you didn't know that I trained as a mathematical logician.
On the one hand, that predisposes me to try to find "elegant"
solutions where you might regard brutality and heuristics as more
appropriate.
Elegance appears to have lost, though - witness the announcement of
[24]CML2 1.4.0, the "brutality and heuristics" release...
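The "configure out anything that breaks the rules" approach can be sketched
in a few lines. This is not CML2's code or the old configuration system's
code: the rule format (a symbol that requires another symbol), the function
name, and the symbol names in the usage note are all hypothetical, chosen
only to show that the heuristic avoids searching the whole configuration
space.

```python
def repair(config, requires):
    """Switch off every symbol whose requirement is not met.

    config   -- dict mapping symbol name to "y", "m" or "n"
    requires -- list of (symbol, needed) pairs: symbol may only be enabled
                if needed is enabled (hypothetical rule format)
    """
    fixed = dict(config)
    changed = True
    while changed:                      # repeat until nothing else breaks
        changed = False
        for symbol, needed in requires:
            enabled = fixed.get(symbol, "n") != "n"
            satisfied = fixed.get(needed, "n") != "n"
            if enabled and not satisfied:
                fixed[symbol] = "n"     # brutal, but always terminates
                changed = True
    return fixed
```

For example, with an old configuration {"A": "y", "B": "m", "C": "y"} and new
rules [("C", "B"), ("B", "X")], the missing X disables B, which in turn
disables C, while A is left alone. Each pass can only turn symbols off, so
the loop finishes after at most one pass per enabled symbol, plus a final
pass that finds nothing to change.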
Other patches and updates released this week include:
* Manfred Spraul has [25]released a new version of his kiobuf-based
single-copy pipe implementation.
* A new [26]hot-swap CPU patch has been released by Anton Blanchard.
This version includes support for CPU swapping on S/390 systems.
* [27]Modutils 2.4.6 was released by Keith Owens.
* [28]packet-0.0.2k, a module which allows transparent, packet-mode
writing to CD-RW drives, was released by Jens Axboe.
* IBM has released [29]version 0.3.1 of its JFS journaling
filesystem.
* Chris Wright has released [30]a new security module patch against
the 2.4.4 kernel. Other security module related postings include
[31]a sample hook function implementation for SELinux, and two
sets of benchmark results posted by [32]J. Melvin Jones and
[33]Greg Kroah-Hartman.
Section Editor: [34]Jonathan Corbet
May 10, 2001
For other kernel news, see:
* [35]Kernelnotes
* [36]Kernel traffic
* [37]Kernel Newsflash
* [38]Kernel Trap
Other resources:
* [39]Kernel Source Reference
* [40]L-K mailing list FAQ
* [41]Linux-MM
* [42]Linux Scalability Project
* [43]Kernel Newbies
[45]Eklektix, Inc. Linux powered! Copyright © 2001 [46]Eklektix, Inc.,
all rights reserved
Linux (R) is a registered trademark of Linus Torvalds
References
1. http://lwn.net/
2. http://ads.tucows.com/click.ng/pageid=001-012-132-000-000-003-000-000-012
3. http://lwn.net/2001/0510/
4. http://lwn.net/2001/0510/security.php3
5. http://lwn.net/2001/0510/dists.php3
6. http://lwn.net/2001/0510/desktop.php3
7. http://lwn.net/2001/0510/devel.php3
8. http://lwn.net/2001/0510/commerce.php3
9. http://lwn.net/2001/0510/press.php3
10. http://lwn.net/2001/0510/announce.php3
11. http://lwn.net/2001/0510/history.php3
12. http://lwn.net/2001/0510/letters.php3
13. http://lwn.net/2001/0510/bigpage.php3
14. http://lwn.net/2001/0503/kernel.php3
15. http://lwn.net/2001/0510/a/2.4.5pre1.php3
16. http://lwn.net/2001/0510/a/2.4.4ac6.php3
17. http://lwn.net/2001/0510/a/2.2.20pre1.php3
18. http://lwn.net/2001/0503/kernel.php3
19. http://lwn.net/2001/0510/a/lt-blkdev-pc.php3
20. http://lwn.net/2001/0510/a/aa-blkdev-pc.php3
21. http://lwn.net/2001/0510/a/reiserfs-stable.php3
22. http://lwn.net/2001/0510/a/cml2-broken-configs.php3
23. http://lwn.net/2001/0510/a/brutality.php3
24. http://lwn.net/2001/0510/a/cml2-1.4.0.php3
25. http://lwn.net/2001/0510/a/scp.php3
26. http://lwn.net/2001/0510/a/hot-swap-cpu.php3
27. http://lwn.net/2001/0510/a/modutils.php3
28. http://lwn.net/2001/0510/a/packet.php3
29. http://lwn.net/2001/0510/a/jfs.php3
30. http://lwn.net/2001/0510/a/sm.php3
31. http://lwn.net/2001/0510/a/selinux-hook.php3
32. http://lwn.net/2001/0510/a/jj-benchmarks.php3
33. http://lwn.net/2001/0510/a/gkh-benchmarks.php3
34. mailto:lwn@lwn.net
35. http://www.kernelnotes.org/
36. http://kt.zork.net/
37. http://www.atnf.csiro.au/~rgooch/linux/docs/kernel-newsflash.html
38. http://www.kerneltrap.com/
39. http://lksr.org/
40. http://www.tux.org/lkml/
41. http://www.linux.eu.org/Linux-MM/
42. http://www.citi.umich.edu/projects/linux-scalability/
43. http://www.kernelnewbies.org/
44. http://lwn.net/2001/0510/dists.php3
45. http://www.eklektix.com/
46. http://www.eklektix.com/
--- ifmail v.2.14.os7-aks1
* Origin: Unknown (2:4615/71.10@fidonet)