RU.LINUX
From : Sergey Lentsov 2:4615/71.10 07 Jun 2001 17:16:12
To : All
Subject : URL: http://lwn.net/2001/0607/kernel.php3
See also: [14]last week's Kernel page.
Kernel development
The current kernel release is still 2.4.5. Linus is back from his trip
to Japan, and has released [15]the first 2.4.6 prepatch. It contains
the usual scattering of fixes, including some aimed at the ongoing
virtual memory problems with the 2.4 kernel series.
The prepatch also introduces a bug that can cause unresolved symbols
in some modular kernels. Ingo Molnar produced [16]a
simple fix which gets around the problem; after several iterations he
also released [17]a much more involved fix dealing with a number of
other difficulties introduced in 2.4.6pre1.
Alan Cox, meanwhile, is up to [18]2.4.5ac9. Along with the usual fixes
he has included a new driver for the Sony Vaio I/O controller, the new
improved Configure.help file (see below), and a number of fixes for
problems found by the Stanford checker.
Another approach to bounce buffers. The discussion [19]last week on
virtual memory and bounce buffers passed over one interesting approach
to fixing the problem. We'll try to make it up this week, but doing so
requires a little bit of background in how Linux memory management
works. The following discussion is somewhat specific to the x86
architecture, but the concepts carry over to any 32-bit system.
On a processor with 32-bit addresses, a total of 4GB of memory may be
addressed. Linux systems have traditionally not been able to handle
that much memory, however, due to the way memory is laid out. For some
time, the virtual address space has been broken up as shown in this
diagram:
[Virtual memory layout]
(Please excuse your editor's crude use of the "dia" tool...).
Thus, any individual user-space process may have up to 3GB of address
space, with the uppermost 1GB being reserved for the kernel. 2.2
kernels always laid out memory in this way, and 2.4 still does by
default. Up through 2.2, the kernel mapped the entire range of physical
memory into its portion of the address space, since that mapping
provided easy, direct access to all of the memory on the system. It
made life easy for kernel hackers, but it also limited the total
amount of memory on the system to the amount that could be mapped in
the kernel segment - 1GB, with subtractions for things like the PCI
I/O memory space. That is why 2.2 kernels could only make use of about
960MB of memory.
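The arithmetic behind that limit can be sketched in a few lines of
Python. The 64MB reserve is an assumption, chosen only because it
reproduces the 960MB figure quoted above; the names are hypothetical
and are not kernel identifiers.

```python
# Back-of-the-envelope model of the 3GB/1GB split on 32-bit x86.
# KERNEL_RESERVE is an assumed figure (not taken from the kernel source),
# picked so that the result matches the "about 960MB" quoted in the text.

GB = 1 << 30
MB = 1 << 20

ADDRESS_SPACE = 1 << 32                     # 32-bit addresses: 4GB in total
USER_SPACE = 3 * GB                         # per-process user address space
KERNEL_SPACE = ADDRESS_SPACE - USER_SPACE   # uppermost 1GB for the kernel
KERNEL_RESERVE = 64 * MB                    # assumption: PCI I/O space and the like


def direct_map_limit(kernel_space=KERNEL_SPACE, reserve=KERNEL_RESERVE):
    """Largest amount of physical memory a fully direct-mapped kernel can use."""
    return kernel_space - reserve


if __name__ == "__main__":
    print(f"kernel segment: {KERNEL_SPACE // MB} MB")
    print(f"usable physical memory: {direct_map_limit() // MB} MB")
```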
The 2.4 release lifted that restriction by enabling the kernel to work
with memory that is not directly mapped. The result was (1) the
ability to handle up to 64GB of memory on x86 systems, and (2) the
creation of a new class of memory, "high memory," which is a little
trickier to work with. So physical memory is now divided into three
zones, as shown by another ugly diagram:
[Physical memory layout]
The "DMA" zone is the first 16MB of memory, which is addressable by
old ISA peripherals that can only do 24-bit DMA; "normal" is memory
above 16MB which is directly mapped into the kernel, and "high memory"
is memory which is
not directly mapped. On systems with tremendous amounts of memory,
most of that memory is "high memory."
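As a rough illustration of the three zones, the following Python
sketch classifies a physical address. The 960MB top of the direct
mapping is an assumption borrowed from the figure quoted earlier, and
zone_of() is a hypothetical helper, not a kernel function.

```python
# Toy classifier for the three physical memory zones described above.
# NORMAL_LIMIT is an assumption (the article's ~960MB direct-map limit);
# the real boundary depends on the kernel configuration.

MB = 1 << 20
GB = 1 << 30

DMA_LIMIT = 1 << 24          # 24-bit ISA DMA reaches only the first 16MB
NORMAL_LIMIT = 960 * MB      # assumed top of the direct kernel mapping


def zone_of(phys_addr):
    """Return the name of the zone that a physical address falls into."""
    if phys_addr < 0:
        raise ValueError("physical addresses are non-negative")
    if phys_addr < DMA_LIMIT:
        return "DMA"
    if phys_addr < NORMAL_LIMIT:
        return "normal"
    return "high memory"


if __name__ == "__main__":
    for addr in (1 * MB, 512 * MB, 2 * GB):
        print(f"{addr // MB:5d} MB -> {zone_of(addr)}")
```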
Now, finally, we can get to the bounce buffer problem. With current
2.4 kernels, any memory which is in the DMA or normal zones may be
used in DMA operations with reasonable devices on reasonable buses.
When I/O must be performed to or from high memory, however, a bounce
buffer is allocated in one of the lower zones. The data is copied
through the bounce buffer in its travels between the device and its
high memory home. On I/O bound systems with a lot of high memory,
bounce buffers can create a lot of pressure in the normal and DMA
zones, leading to memory shortage problems. All that copying isn't
entirely desirable either.
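The cost of a bounce buffer can be shown with a toy model. This is a
sketch only: the "device" is a plain Python list, write_to_device() is
a hypothetical name, and the 960MB boundary is the assumed top of the
normal zone.

```python
# Toy model of a write to a device, with and without a bounce buffer.
# Nothing here is kernel code; it only illustrates the extra copy.

MB = 1 << 20
NORMAL_LIMIT = 960 * MB   # assumption: top of the directly mapped zones


def write_to_device(phys_addr, data, device):
    """Send data located at phys_addr to the device.

    Returns the number of bytes that had to pass through a bounce buffer.
    """
    if phys_addr < NORMAL_LIMIT:
        # DMA or normal zone: the device can take the data where it lies.
        device.append(bytes(data))
        return 0
    # High memory: copy into a buffer that stands in for low memory first.
    bounce = bytearray(data)
    device.append(bytes(bounce))
    return len(bounce)


if __name__ == "__main__":
    disk = []
    print(write_to_device(1 * MB, b"low page", disk))       # no bounce
    print(write_to_device(2048 * MB, b"high page", disk))   # bounced bytes
```

Each bounced request costs both a copy and a temporary allocation in
the lower zones, which is where the memory pressure comes from.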
Jens Axboe looked at this problem and made an observation that, in
retrospect, should have been fairly obvious. PCI devices can (usually)
address 32 bits (4GB) of memory. When the kernel uses a bounce buffer
for high memory below 4GB, it is really wasting time and memory. The
kernel may not be able to address that memory directly, but the
peripheral can. So why not just do the DMA operation directly and skip
the bounce buffer?
So Jens [20]announced a patch which does exactly that - at least, for
block devices. (He neglected the little detail of where to find the
patch; he [21]filled that in a little later). This patch adds a fourth
memory zone, called "DMA32," that sits between the top of the normal
zone and the 4GB barrier. Whenever block I/O is being performed on
memory in the DMA32 zone, it is done directly without the use of a
bounce buffer. Bounce buffers are still required above 4GB; it's a
rare peripheral that can reach memory that high. But, even in that
case, the bounce buffer can live in the DMA32 zone.
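The effect of the extra zone on the bounce decision can be sketched as
follows. This is not Jens's code; the function names are hypothetical
and the 960MB normal-zone limit is an assumption.

```python
# Which pages need a bounce buffer, before and after a DMA32-style zone?
# A sketch under stated assumptions, not the patch itself.

MB = 1 << 20
GB = 1 << 30

NORMAL_LIMIT = 960 * MB   # assumption: top of the directly mapped zones
DMA32_LIMIT = 4 * GB      # what a 32-bit PCI device can address


def needs_bounce_before(phys_addr):
    """Without the patch, anything in high memory is bounced."""
    return phys_addr >= NORMAL_LIMIT


def needs_bounce_with_dma32(phys_addr):
    """With the patch, only memory above 4GB is bounced."""
    return phys_addr >= DMA32_LIMIT


def bounced_pages(page_addrs, needs_bounce):
    """Count the pages in page_addrs that require a bounce buffer."""
    return sum(1 for addr in page_addrs if needs_bounce(addr))


if __name__ == "__main__":
    pages = [1 * MB, 2 * GB, 3 * GB, 5 * GB]
    print("before:", bounced_pages(pages, needs_bounce_before))
    print("after: ", bounced_pages(pages, needs_bounce_with_dma32))
```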
The benefits of this patch are clear. Given that, in all likelihood,
most systems with high memory have no more than 4GB, bounce buffers
can be eliminated entirely in many cases. And for the rest, the
available memory for the allocation of these buffers has increased.
The patch was not included in 2.4.6pre1, but chances are good that a
version of it will appear in a future release.
About that swapping problem. Problems with the use of swap space in
2.4.x were also mentioned last week. The amount of complaining has
gone up recently, as more people try out the 2.4.5 kernel, which
appears to be worse.
The response from the kernel hackers so far has been "make sure your
swap area is at least twice as large as the amount of RAM in the
system." That allows the kernel, essentially, to waste half of the
swap space as a copy of what is currently in RAM, and actually swap to
the other half. That technique helps, but a number of people are, not
surprisingly, unimpressed with that requirement. 2.2 systems seemed to
work better, after all. In fact, 2.2 had the same problem with
swapping, but the more aggressive approach to caching in 2.4 has made
the problem bite a lot more people.
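A deliberately crude model shows why the "twice RAM" advice helps. It
assumes the worst case described above, where swap holds a full copy
of everything in RAM; effective_swap() is a hypothetical helper and
the sizes are in megabytes.

```python
# Crude worst-case model of usable swap under 2.4, as described above:
# up to one RAM's worth of swap is tied up mirroring what is in memory.
# This is an illustration, not a description of the kernel's accounting.


def effective_swap(ram_mb, swap_mb):
    """Swap left over once a copy of RAM has been set aside."""
    return max(0, swap_mb - ram_mb)


def total_virtual(ram_mb, swap_mb):
    """RAM plus whatever swap is still genuinely usable."""
    return ram_mb + effective_swap(ram_mb, swap_mb)


if __name__ == "__main__":
    for swap in (128, 256, 512):
        print(f"256MB RAM, {swap}MB swap -> "
              f"{effective_swap(256, swap)}MB usable swap")
```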
Help is on the way, however. Marcelo Tosatti has posted [22]a patch
which cleans the junk out of swap space. Some testers have reported
that it improves things for them. There is currently some debate,
however, as to whether the locking used by the patch is safe. So it's
probably not for everybody, yet. [23]A different swap patch was posted
by Mike Galbraith; it is new as of this writing and has not seen much
testing yet. With luck, however, some variant of one of these patches
will make it into a 2.4 kernel soon.
How should the kernel handle temperatures? David Welton [24]pointed
out that parts of the kernel that handle temperatures (generally
watchdog drivers) are not consistent - some code uses Fahrenheit, and
other parts use Celsius. He proposed a global configuration option to
decide what should be used kernel-wide.
The response that came back will be familiar to linux-kernel watchers;
the kernel should use one standard temperature format, and user-space
tools can convert to other units if necessary. Not surprisingly,
Fahrenheit has very few defenders. But the proponents
of Celsius look like they will lose as well. If one is going to use
standard units, one should do it right and [25]use kelvins. That way
nobody is happy.
Then again, one reader proposed that BogoDegrees be used instead...
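The division of labor being proposed is simple enough to sketch: the
kernel reports kelvins, and user space converts. The helper names
below are hypothetical; only the conversion formulas are standard.

```python
# User-space conversions around a kernel that reports kelvins.
# Function names are illustrative, not an existing interface.

ZERO_CELSIUS_IN_KELVIN = 273.15


def celsius_to_kelvin(celsius):
    return celsius + ZERO_CELSIUS_IN_KELVIN


def fahrenheit_to_kelvin(fahrenheit):
    return (fahrenheit - 32.0) * 5.0 / 9.0 + ZERO_CELSIUS_IN_KELVIN


def kelvin_to_celsius(kelvin):
    return kelvin - ZERO_CELSIUS_IN_KELVIN


def kelvin_to_fahrenheit(kelvin):
    return (kelvin - ZERO_CELSIUS_IN_KELVIN) * 9.0 / 5.0 + 32.0


if __name__ == "__main__":
    reading = celsius_to_kelvin(45.0)   # what a driver might hand back
    print(f"{reading:.2f} K = {kelvin_to_celsius(reading):.1f} C "
          f"= {kelvin_to_fahrenheit(reading):.1f} F")
```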
Configure.help is complete. Eric Raymond has [26]announced that, after
great effort, the kernel Configure.help file now contains help entries
for every one of the 2699 known configuration symbols.
Of course, Eric knows how ephemeral such a victory can be. So he is
also proposing a policy that no patches will be accepted unless they
contain help entries for any new configuration symbols they introduce.
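A check of that sort could be automated along these lines. The sketch
assumes that each Configure.help entry carries its CONFIG_ symbol on
an unindented line of its own; CONFIG_EXAMPLE_DRIVER and both function
names are made up for illustration.

```python
# Find configuration symbols that lack a help entry.
# Assumption: every entry in the help file has its CONFIG_ symbol on an
# unindented line; the sample text below is abbreviated and illustrative.

SAMPLE_HELP = """\
Prompt for development and/or incomplete code/drivers
CONFIG_EXPERIMENTAL
  Some of the various things that Linux supports are still
  in a testing phase.
"""


def documented_symbols(help_text):
    """Collect every symbol that starts an unindented line of the help text."""
    return {line.strip() for line in help_text.splitlines()
            if line.startswith("CONFIG_")}


def missing_help(new_symbols, help_text):
    """Return, sorted, the symbols that have no help entry."""
    return sorted(set(new_symbols) - documented_symbols(help_text))


if __name__ == "__main__":
    patch_symbols = ["CONFIG_EXPERIMENTAL", "CONFIG_EXAMPLE_DRIVER"]
    for symbol in missing_help(patch_symbols, SAMPLE_HELP):
        print(f"no help entry for {symbol}")
```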
Other patches and updates released this week include:
* Stephen Tweedie has released [27]ext3-0.0.7a. This release fixes a
"major bug;" ext3 users should probably upgrade.
* Dawson Engler and his group are still finding problems with their
checker system. The latest include [28]two new floating point
bugs, [29]several security holes from careless use of
user-supplied information, and [30]three use-after-free memory
bugs.
* [31]A new directory index patch has been posted by Daniel
Phillips.
* Stelian Pop has [32]released a driver for the Auvertech TurboPAM
ISDN card.
* Erik Mouw has [33]posted a new version of his guide to programming
with procfs.
Section Editor: [34]Jonathan Corbet
June 7, 2001
For other kernel news, see:
* [35]Kernelnotes
* [36]Kernel traffic
* [37]Kernel Newsflash
* [38]Kernel Trap
Other resources:
* [39]Kernel Source Reference
* [40]L-K mailing list FAQ
* [41]Linux-MM
* [42]Linux Scalability Project
* [43]Kernel Newbies
[45]Eklektix, Inc. Linux powered! Copyright © 2001 [46]Eklektix, Inc.,
all rights reserved
Linux (R) is a registered trademark of Linus Torvalds
References
1. http://lwn.net/
2. http://ads.tucows.com/click.ng/pageid=001-012-132-000-000-003-000-000-012
3. http://lwn.net/2001/0607/
4. http://lwn.net/2001/0607/security.php3
5. http://lwn.net/2001/0607/dists.php3
6. http://lwn.net/2001/0607/desktop.php3
7. http://lwn.net/2001/0607/devel.php3
8. http://lwn.net/2001/0607/commerce.php3
9. http://lwn.net/2001/0607/press.php3
10. http://lwn.net/2001/0607/announce.php3
11. http://lwn.net/2001/0607/history.php3
12. http://lwn.net/2001/0607/letters.php3
13. http://lwn.net/2001/0607/bigpage.php3
14. http://lwn.net/2001/0531/kernel.php3
15. http://lwn.net/2001/0607/a/2.4.6pre1.php3
16. http://lwn.net/2001/0607/a/im-fix1.php3
17. http://lwn.net/2001/0607/a/im-fix2.php3
18. http://lwn.net/2001/0607/a/2.4.5ac9.php3
19. http://lwn.net/2001/0531/kernel.php3
20. http://lwn.net/2001/0607/a/dma32.php3
21. http://lwn.net/2001/0607/a/dma32-where.php3
22. http://lwn.net/2001/0607/a/swap-patch.php3
23. http://lwn.net/2001/0607/a/mg-swap-patch.php3
24. http://lwn.net/2001/0607/a/temperature.php3
25. http://lwn.net/2001/0607/a/kelvin.php3
26. http://lwn.net/2001/0607/a/configure.help.php3
27. http://lwn.net/2001/0607/a/ext3.php3
28. http://lwn.net/2001/0607/a/sc-fp.php3
29. http://lwn.net/2001/0607/a/sc-security.php3
30. http://lwn.net/2001/0607/a/sc-use-after-free.php3
31. http://lwn.net/2001/0607/a/directory-index.php3
32. http://lwn.net/2001/0607/a/turbopam.php3
33. http://lwn.net/2001/0607/a/procfs-guide.php3
34. mailto:lwn@lwn.net
35. http://www.kernelnotes.org/
36. http://kt.zork.net/
37. http://www.atnf.csiro.au/~rgooch/linux/docs/kernel-newsflash.html
38. http://www.kerneltrap.com/
39. http://lksr.org/
40. http://www.tux.org/lkml/
41. http://www.linux.eu.org/Linux-MM/
42. http://www.citi.umich.edu/projects/linux-scalability/
43. http://www.kernelnewbies.org/
44. http://lwn.net/2001/0607/dists.php3
45. http://www.eklektix.com/
46. http://www.eklektix.com/