 - RU.LINUX ---------------------------------------------------------------------
 From : Sergey Lentsov                       2:4615/71.10   08 Feb 2001  18:31:34
 To : All
 Subject : URL: http://lwn.net/2001/0208/kernel.php3
 -------------------------------------------------------------------------------- 
 
    See also: [13]last week's Kernel page.
    
 Kernel development
 
    The current stable kernel release is still 2.4.1. The two usual
    prepatch tracks are in full swing. On the Linus side, there is
    [14]2.4.2pre1, released just after LinuxWorld. It contains a small set
    of fixes, and doesn't yet deal with the known 2.4.1 problems (see
    below). Alan Cox, meanwhile, has released [15]2.4.1ac5, which contains
    a much larger set of fixes.
    
    On the 2.2 kernel front Alan has released [16]2.2.19pre8. There are
    still, apparently, a few things yet to go into this patch, so the real
    2.2.19 release is not yet imminent.
    
    Some difficulties with 2.4.1. While many (most) users are running
    2.4.1 without trouble, there are a couple of issues that have come up
    which are worth knowing about. They are:
      * There is a bug in the handling of Unix datagram sockets which
        locks up the kernel - or at least one processor on SMP systems.
        Chris Evans has posted [17]a simple test program which
        demonstrates the bug - don't run it on your big server. [18]A
        patch for this bug exists, and will certainly be merged into the
        next kernel prepatches.
      * Hans Reiser has posted [19]a message on stability problems with
        ReiserFS. There are currently [20]five outstanding bugs with this
        filesystem, not all of which yet have fixes available (one of them
        looks like hardware problems, rather than a real ReiserFS bug).
        
    Neither of these issues is all that surprising. Every major stable
    kernel release seems to have one denial of service bug lurking
    somewhere; it takes a larger testing community to flush it out.
    Similarly, ReiserFS is now seeing testing on a far larger scale than
    it ever has in the past, and a few surprises are certain to show up.
    This is the late stage of the free software development process in
    action; fixes are being made quickly, and the end result will be a
    more stable kernel.
    
    ReiserFS can also cause system crashes, but this is not a ReiserFS
    bug. It seems that some people are building the 2.4.x kernel with Red
    Hat's "gcc-2.96" compiler that was shipped with Red Hat 7. That
    compiler has some, um, issues, and it miscompiles some of the ReiserFS
    code. If you're running a recent Red Hat system, be sure to build your
    kernels with "kgcc," or at least get the latest, patched gcc from Red
    Hat (which is said to work much better).
    
    The great kiobuf debate. Recently, a fairly fierce debate has been
    filling up mailboxes on the linux-kernel and kiobuf-io-devel mailing
    lists. It all has to do with the kiobuf data structure, which was,
    until recently, seen as a generally good addition to the kernel in the
    2.3 series.
    
    The kiobuf structure was added, initially, to support raw disk I/O;
    kiobufs and their supporting routines make it easy for kernel code to
    move data directly between user space and a device, without an
    intervening copy into kernel space, and without having to worry about
    the ugly details of memory management. Their use has slowly grown; in
    the 2.4.1 kernel kiobufs can be found in the generic SCSI (sg) driver
    and in the logical volume manager code. There is also [21]a patch
    floating around that uses kiobufs to implement direct, user-to-user
    pipes. And SGI's [22]XFS patch not only uses kiobufs, but modifies the
    block I/O subsystem to make them integral to disk I/O.
    
    One would think that kiobufs were taking over, except for the little
    fact that the zero-copy networking patches do not use them. Instead, a
    new and completely different mechanism for direct userspace access was
    created. In the discussion that followed, it turned out that quite a
    few people, including Linus, are not pleased with the kiobuf design.
    
    In a (very) simplified way, that design is as follows: a kiobuf, in
    the end, consists of an array of struct page structures, along with an
    initial offset and a total length value. By using page structures
    directly, the kiobuf allows the code using it to avoid dealing with
    the virtual memory entirely - a struct page refers directly to a
    physical page. The initial offset tells where, in the first page, the
    data starts; all the remaining pages are filled with data starting at
    the beginning. A kiobuf thus describes a single, contiguous area;
    working with multiple areas requires using a "kiovec" - an array of
    kiobufs - instead.
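
    The layout just described can be pictured with a small user-space
    model. This is only a sketch under stated assumptions: toy_kiobuf,
    toy_segment, TOY_PAGE_SIZE, TOY_MAX_PAGES and toy_kiobuf_segment are
    hypothetical names, pages are stood in for by plain numbers rather
    than real page structures, and none of it is the kernel's actual
    definition. It shows how one offset and one total length are enough
    to work out which bytes live in each page.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size, for illustration only */
#define TOY_MAX_PAGES 8       /* arbitrary cap for this sketch */

/* Hypothetical, simplified model of the kiobuf idea: a list of pages
 * plus ONE initial offset and ONE total length. */
struct toy_kiobuf {
    size_t nr_pages;                     /* pages in use */
    unsigned long pages[TOY_MAX_PAGES];  /* stand-ins for page references */
    size_t offset;                       /* where data starts in page 0 */
    size_t length;                       /* total bytes described */
};

/* The part of the buffer that falls within a single page. */
struct toy_segment {
    unsigned long page;
    size_t offset;
    size_t len;
};

/* Work out which bytes of page `idx` belong to the buffer.
 * Returns 0 on success, -1 if `idx` is out of range or holds no data. */
int toy_kiobuf_segment(const struct toy_kiobuf *kb, size_t idx,
                       struct toy_segment *seg)
{
    assert(kb != NULL && seg != NULL);
    assert(kb->offset < TOY_PAGE_SIZE);

    if (idx >= kb->nr_pages)
        return -1;

    /* Bytes of the buffer that precede page `idx`: the first page holds
     * PAGE_SIZE - offset bytes, every later page holds a full page. */
    size_t start = (idx == 0) ? 0 : idx * TOY_PAGE_SIZE - kb->offset;
    if (start >= kb->length)
        return -1;

    seg->page = kb->pages[idx];
    seg->offset = (idx == 0) ? kb->offset : 0;  /* only page 0 may be unaligned */

    size_t room = TOY_PAGE_SIZE - seg->offset;  /* space left in this page */
    size_t left = kb->length - start;           /* bytes still to describe */
    seg->len = left < room ? left : room;
    return 0;
}
```

    In this model, a 6000-byte buffer starting at offset 1000 of its
    first page puts 3096 bytes in that page and the remaining 2904 at the
    very start of the next one; there is no way to say that the second
    page's data begins anywhere but at its beginning.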
    
    The objections to this design include:
      * It is said to be a very heavyweight structure. Kiobufs are a bit
        large, mostly due to the incorporation of an array for the page
        structures. Ingo Molnar has [23]characterized kiobufs as "big fat
        monster-trucks of IO workload."
      * Kiobufs do not handle scatter/gather operations (those which work
        from multiple, noncontiguous memory areas) very gracefully; such
        an operation requires setting up a kiovec and using several
        kiobufs which, as previously noted, are already criticized as
        being too large. Networking, in particular, makes heavy use of
        scatter/gather I/O, and needs to be able to set up and tear down
        structures very quickly.
      * One of the reasons that kiobufs are difficult for scatter/gather
        operations is that they assume that all data is aligned on page
        boundaries, with the exception of the first page. That tends to be
        true for disk I/O, but is rarely the case for networking. Linus,
        in particular, [24]doesn't want any page alignment assumptions in
        this sort of code.
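
    To make the scatter/gather objection concrete, one can simply count
    descriptors. The sketch below assumes, as described above, that one
    kiobuf covers exactly one contiguous area; demo_area and
    demo_descriptors_needed are hypothetical names invented for this
    illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One piece of memory taking part in a scatter/gather operation. */
struct demo_area {
    uintptr_t start;  /* address of the first byte */
    size_t len;       /* number of bytes */
};

/* Count how many single-area descriptors (one kiobuf each, under the
 * assumption stated above) are needed to describe `n` areas. An area
 * shares a descriptor with its predecessor only when it begins exactly
 * where the predecessor ended; empty areas are ignored. */
size_t demo_descriptors_needed(const struct demo_area *areas, size_t n)
{
    assert(areas != NULL || n == 0);

    size_t count = 0;
    uintptr_t prev_end = 0;
    int have_prev = 0;

    for (size_t i = 0; i < n; i++) {
        if (areas[i].len == 0)
            continue;
        if (!have_prev || areas[i].start != prev_end)
            count++;  /* noncontiguous: needs a descriptor of its own */
        prev_end = areas[i].start + areas[i].len;
        have_prev = 1;
    }
    return count;
}
```

    A packet assembled from a header, a payload and a trailer that sit in
    three unrelated places would need three descriptors in this model,
    each of them carrying the full weight of the structure.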
        
    In the end, the fight seems to boil down to this: should a kiobuf
    include an array of offset/length pairs for each page within the
    buffer? With such an array, scatter/gather operations could be
    described with a single kiobuf, and the kiovec idea could go away.
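
    The alternative being argued for might look roughly like the
    following user-space sketch. It is an illustration only, not code
    from any of the patches under discussion: sg_frag, sg_buf,
    sg_buf_add, sg_buf_length, SG_PAGE_SIZE and SG_MAX_FRAGS are all
    hypothetical names, and pages are again represented by plain numbers.

```c
#include <assert.h>
#include <stddef.h>

#define SG_PAGE_SIZE 4096u  /* assumed page size, for illustration only */
#define SG_MAX_FRAGS 16     /* arbitrary cap for this sketch */

/* Hypothetical per-page entry: every page carries its own offset and
 * length, so no page has to start or end on a page boundary. */
struct sg_frag {
    unsigned long page;  /* stand-in for a page reference */
    size_t offset;       /* where the data starts within the page */
    size_t len;          /* bytes of data in this page */
};

/* A single buffer able to describe a whole scatter/gather operation. */
struct sg_buf {
    size_t nr_frags;
    struct sg_frag frags[SG_MAX_FRAGS];
};

/* Append one fragment. Returns 0 on success, -1 if the buffer is full
 * or the fragment is empty or would run past the end of its page. */
int sg_buf_add(struct sg_buf *b, unsigned long page, size_t offset, size_t len)
{
    assert(b != NULL);

    if (b->nr_frags >= SG_MAX_FRAGS)
        return -1;
    if (len == 0 || offset >= SG_PAGE_SIZE || len > SG_PAGE_SIZE - offset)
        return -1;

    b->frags[b->nr_frags].page = page;
    b->frags[b->nr_frags].offset = offset;
    b->frags[b->nr_frags].len = len;
    b->nr_frags++;
    return 0;
}

/* Total number of bytes described by the buffer. */
size_t sg_buf_length(const struct sg_buf *b)
{
    assert(b != NULL);

    size_t total = 0;
    for (size_t i = 0; i < b->nr_frags; i++)
        total += b->frags[i].len;
    return total;
}
```

    Here a 54-byte header in one page and a 1400-byte payload in the
    middle of another fit into one sg_buf with two entries. The price is
    that the buffer no longer has a single offset and length of its own,
    only the sum of its pieces.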
    
    Linus, certainly, [25]takes the position that the offset and length
    values should be pushed down deep in the structure in this way. Kiobuf
    designer Stephen Tweedie, however, [26]disagrees. In his view, putting
    the length and offset at that level would make it hard to get the
    completion status of any individual segment and would tend to split
    apart large requests which should really stay together.
    
    The discussion then wandered into whether the venerable buffer head
    structure could be made to do what kiobufs do. A number of people seem
    to think that it could, especially if the block I/O API were modified
    to make it easy to submit large chains of buffer heads as a single
    operation. But no code for this use of buffer heads has, as yet, been
    forthcoming.
    
    This issue, clearly, goes pretty deeply into how fundamental
    operations are performed in the kernel. For this reason, the design
    issues involved seem to touch a number of nerves. It will probably be
    some time before a real resolution is reached; those who are
    programming with kiobufs, however, should be prepared to see the
    interface change...
    
    The first public Linux-NTFS release is out; see [27]the announcement
    for details. This release makes it possible to mount NT filesystems in
    a writable mode under Linux. It's not yet perfect, however; when it
    writes to an NTFS partition it leaves a bit of damage behind. For the
    short term, it was evidently easier to provide a separate utility
    ("ntfsfix") which fixes things up afterwards.
    
    Other patches and updates released this week include:
    
      * David Miller continues to put out [28]frequent zero-copy
        networking patches; this patch also, currently, contains the fix
        to the Unix datagram bug.
      * Jeff Merkey has released [29]version v1.1-7 of his driver for
        Dolphin Scalable Coherent Interface adapters.
      * [30]A new kernel development mailing list has been created by Ingo
        Oeser; it is intended to host discussion of a wide range of
        operating system techniques, not just those in use in the Linux
        kernel.
      * [31]devfs-v99.19 was posted by Richard Gooch; it is a backport of
        the latest devfs code to the 2.2.18 kernel. He has also posted
        [32]devfsd-v1.3.11, the devfs daemon that is needed to use a
        devfs-enabled kernel.
      * Rusty Russell has released [33]code to generate a graph of the
        2.4.0 kernel. It requires several hours to run, and, on some
        systems, the graph has proven a little difficult to generate.
      * Juergen Schneider has posted [34]a patch which adds an animated
        boot logo to the framebuffer driver.
      * Robert H. de Vries has posted [35]a new version of his POSIX
        timers patch. This time around, Linus [36]responded that he'll not
        be applying the patch anytime soon, since he does not like the
        implementation.
      * The USAGI Project (USAGI = "UniverSAl playGround for Ipv6") has
        [37]announced the second stable release of its system, which
        features support for both the 2.2.18 and 2.4.0 kernels.
        
    Section Editor: [38]Jonathan Corbet
    February 8, 2001
    
    For other kernel news, see:
      * [39]Kernelnotes
      * [40]Kernel traffic
      * [41]Kernel Newsflash
      * [42]Kernel Trap
    
    Other resources:
      * [43]Kernel Source Reference
      * [44]L-K mailing list FAQ
      * [45]Linux-MM
      * [46]Linux Scalability Project
    
    
    
                                                   [47]Next: Distributions
    
    [48]Eklektix, Inc. Linux powered! Copyright © 2001 [49]Eklektix, Inc.,
    all rights reserved
    Linux (R) is a registered trademark of Linus Torvalds
 
 References
 
    1. http://lwn.net/
    2. http://ads.tucows.com/click.ng/pageid=001-012-132-000-000-003-000-000-012
    3. http://lwn.net/2001/0208/
    4. http://lwn.net/2001/0208/security.php3
    5. http://lwn.net/2001/0208/dists.php3
    6. http://lwn.net/2001/0208/devel.php3
    7. http://lwn.net/2001/0208/commerce.php3
    8. http://lwn.net/2001/0208/press.php3
    9. http://lwn.net/2001/0208/announce.php3
   10. http://lwn.net/2001/0208/history.php3
   11. http://lwn.net/2001/0208/letters.php3
   12. http://lwn.net/2001/0208/bigpage.php3
   13. http://lwn.net/2001/0201/kernel.php3
   14. http://lwn.net/2001/0208/a/2.4.2pre1.php3
   15. http://lwn.net/2001/0208/a/2.4.1ac5.php3
   16. http://lwn.net/2001/0208/a/2.2.19pre8.php3
   17. http://lwn.net/2001/0208/a/unix-datagram-bug.php3
   18. http://lwn.net/2001/0208/a/unix-datagram-fix.php3
   19. http://lwn.net/2001/0208/a/reiserfs-stability.php3
   20. http://lwn.net/2001/0208/a/reiserfs-bugs.php3
   21. http://lwn.net/2001/0208/a/kiobuf-pipe.php3
   22. http://oss.sgi.com/projects/xfs/
   23. http://lwn.net/2001/0208/a/im-kiobuf.php3
   24. http://lwn.net/2001/0208/a/lt-alignment.php3
   25. http://lwn.net/2001/0208/a/lt-layering.php3
   26. http://lwn.net/2001/0208/a/st-layering.php3
   27. http://lwn.net/2001/0208/a/linux-ntfs.php3
   28. http://lwn.net/2001/0208/a/zerocopy.php3
   29. http://lwn.net/2001/0208/a/pci-sci.php3
   30. http://lwn.net/2001/0208/a/os-devel.php3
   31. http://lwn.net/2001/0208/a/devfs-v99.19.php3
   32. http://lwn.net/2001/0208/a/devfsd-v1.3.11.php3
   33. http://lwn.net/2001/0208/a/kernel-graph.php3
   34. http://lwn.net/2001/0208/a/fb-logo.php3
   35. http://lwn.net/2001/0208/a/posix-timers.php3
   36. http://lwn.net/2001/0208/a/lt-timers.php3
   37. http://lwn.net/2001/0208/a/usagi.php3
   38. mailto:lwn@lwn.net
   39. http://www.kernelnotes.org/
   40. http://kt.linuxcare.com/
   41. http://www.atnf.csiro.au/~rgooch/linux/docs/kernel-newsflash.html
   42. http://www.kerneltrap.com/
   43. http://lksr.org/
   44. http://www.tux.org/lkml/
   45. http://www.linux.eu.org/Linux-MM/
   46. http://www.citi.umich.edu/projects/linux-scalability/
   47. http://lwn.net/2001/0208/dists.php3
   48. http://www.eklektix.com/
   49. http://www.eklektix.com/
 
 --- ifmail v.2.14.os7-aks1
  * Origin: Unknown (2:4615/71.10@fidonet)
 
 
