 - RU.LINUX ---------------------------------------------------------------------
 From : Sergey Lentsov                       2:4615/71.10   06 Dec 2001  17:11:21
 To : All
 Subject : URL: http://www.lwn.net/2001/1206/kernel.php3
 -------------------------------------------------------------------------------- 
 
    See also: [13]last week's Kernel page.
    
 Kernel development
 
    The current development kernel release is still 2.5.0. Linus's current
    prepatch is [14]2.5.1-pre5. With recent prepatches, life has gotten
    interesting; we have a true development kernel once again. Things that
    have gone into 2.5.1 so far include:
      * The new driver model implemented by Patrick Mochel. This code
        implements a system-wide tree of all devices which will be helpful
        for system configuration and power management tasks; it was
        covered in the [15]October 25 LWN kernel page.
      * The beginnings of the block layer thrash-up (see below).
      * Richard Gooch's new devfs core code. The end result of this work
        should be a more stable devfs, but it's giving some people
        difficulties at the moment; approach with care.
        
    In general, it pays to be careful with the 2.5.1 prepatches. Some of
    the changes are truly disruptive, and a bit of instability is to be
    expected for a while yet.
    
    The current stable kernel release is 2.4.16. Marcelo ("[16]the wonder
    penguin") has released [17]2.4.17-pre4, which contains a relatively
    lengthy list of fixes and updates. Here, too, the new devfs code is
    causing difficulties for some users.
    
    On the 'design' of Linux. For those who haven't yet seen it elsewhere,
    here's Linus's [18]'Linux wasn't designed' message that was widely
    circulated. In another message, Linus [19]talked further on how he
    thinks software gets built:
    
      It's "directed mutation" on a microscopic level, but there is very
      little macroscopic direction. There are lots of individuals with
      some generic feeling about where they want to take the system (and
      I'm obviously one of them), but in the end we're all a bunch of
      people with not very good vision.
      
      And that is GOOD.
      
    It does seem that quite a bit of progress can be made, even with poor
    vision.
    
    Ripping up the block layer. It has long been understood that the 2.5
    development series would include major changes to the block (disk) I/O
    layer. The block code has no end of performance problems, especially
    on high-end systems; it's also quite ugly in a number of places. So,
    the integration of Jens Axboe's new block I/O code, while highly
    disruptive, is a good thing.
    
    Since 2.2, much of the block I/O subsystem has worked with a single
    spinlock, called io_request_lock. If the system was trying to figure
    out how to merge a request into a very long queue, or if a block
    driver was slow in figuring out what it wanted to do, all other block
    operations would have to stop and wait. This lock was serializing
    operations which had nothing to do with each other, and was an obvious
    scalability bottleneck.
    
    With 2.5.1, that lock is no more; instead, each request queue (which,
    in well-written drivers, corresponds to each device) has its own lock.
    This kind of change can be scary, since some drivers will have
    depended on the global serialization enforced by io_request_lock; its
    removal has the potential to create subtle and nasty bugs. It may be a
    little while before all the block drivers are known to be safe.
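
    The effect of moving from one global lock to a lock per request queue
    can be sketched in user space. The following is only an illustration
    under stated assumptions: sketch_lock, sketch_queue and the helper
    functions are hypothetical names, and a C11 atomic_flag stands in for
    a kernel spinlock. It is not the kernel's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a kernel spinlock. */
typedef struct { atomic_flag held; } sketch_lock;

void sketch_lock_init(sketch_lock *l)
{
    atomic_flag_clear(&l->held);
}

/* Returns true if the lock was free and is now held by the caller. */
bool sketch_trylock(sketch_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

void sketch_unlock(sketch_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* One request queue per device, each carrying its own lock. */
struct sketch_queue {
    sketch_lock lock;
    int pending;            /* number of queued requests */
};

void sketch_queue_init(struct sketch_queue *q)
{
    assert(q != NULL);
    sketch_lock_init(&q->lock);
    q->pending = 0;
}

/* Queue one request; fails only if THIS queue's lock is already held. */
bool sketch_queue_add(struct sketch_queue *q)
{
    assert(q != NULL);
    if (!sketch_trylock(&q->lock))
        return false;
    q->pending++;
    sketch_unlock(&q->lock);
    return true;
}
```

    With two queues set up this way, holding the lock of one queue makes
    sketch_queue_add() fail on that queue only; the other queue keeps
    accepting requests. Under a single global lock, both would stall.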
    
    Another problem with the old block code was its use of the "buffer
    head" ("bh") structure as the building block of the request queue.
    Higher-level code would go to some lengths to create large, contiguous
    block I/O requests, which would then be fragmented into a large number
    of single-block requests, each with its own buffer head. The elevator
    code then had the task of trying to merge the requests back together
    again.
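
    The merging work described above can be illustrated with a small
    sketch. It assumes the single-block requests are already sorted by
    starting sector and only joins requests that are exactly adjacent;
    the names sketch_req and sketch_merge are hypothetical, and the real
    elevator does considerably more (sorting, front merges, limits on
    request size).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified request: a run of sectors on one device. */
struct sketch_req {
    unsigned long sector;       /* first sector of the run */
    unsigned long nr_sectors;   /* length of the run */
};

/*
 * Merge adjacent requests in a sector-sorted array, in place.
 * Returns the number of requests left after merging.
 */
size_t sketch_merge(struct sketch_req *r, size_t n)
{
    size_t out = 0;

    assert(r != NULL || n == 0);
    if (n == 0)
        return 0;

    for (size_t i = 1; i < n; i++) {
        if (r[out].sector + r[out].nr_sectors == r[i].sector)
            r[out].nr_sectors += r[i].nr_sectors;   /* extend current run */
        else
            r[++out] = r[i];                        /* start a new run */
    }
    return out + 1;
}
```

    Five single-sector requests at sectors 100, 101, 102, 200 and 201
    would collapse into two requests under this sketch: one covering
    three sectors from 100, and one covering two sectors from 200.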
    
    Buffer heads are now a thing of the past, at least as a visible part
    of the block I/O interface. Block I/O requests are now described by a
    new bio structure which, in turn, contains a list of bio_vec
    structures describing the data to be transferred. The bh structure
    included a virtual pointer to the data to be transferred; the new
    structures, instead, contain struct page pointers directly into the
    system memory map.
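
    As a rough picture of that layout, here is a simplified user-space
    model. The type and field names (sketch_page, sketch_bio_vec,
    sketch_bio, len, offset, vcnt) are assumptions made for illustration
    and should not be read as the kernel's actual definitions; the only
    points taken from the text are that a request holds a list of
    segments and that each segment refers to a page rather than to a
    virtual address.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct page (one entry in the memory map). */
struct sketch_page {
    unsigned long pfn;          /* hypothetical page frame number */
};

/* One segment of a transfer: a page plus a byte range inside it. */
struct sketch_bio_vec {
    struct sketch_page *page;   /* page pointer, not a virtual address */
    unsigned int len;           /* bytes to transfer from this page */
    unsigned int offset;        /* starting byte within the page */
};

/* One block I/O request: a starting sector and a list of segments. */
struct sketch_bio {
    unsigned long sector;           /* first sector on the device */
    unsigned short vcnt;            /* number of entries in io_vec */
    struct sketch_bio_vec *io_vec;  /* the segment list */
};

/* Total number of bytes described by a request. */
size_t sketch_bio_bytes(const struct sketch_bio *bio)
{
    size_t total = 0;

    assert(bio != NULL);
    for (unsigned short i = 0; i < bio->vcnt; i++)
        total += bio->io_vec[i].len;
    return total;
}
```

    Because each segment names a page and not a kernel virtual address,
    a request built this way can describe memory that has no kernel
    mapping at all, which is what the high-memory discussion below
    relies on.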
    
    Much of the kernel has moved toward working with page structures,
    often as a result of the challenges of dealing with high memory, which
    has no virtual mapping into kernel space. Block drivers will now have
    to deal with high memory directly, but support code has been provided
    to make that easier. The advantages of working with page structures
    are worth the trouble; in particular, handling large, clustered
    requests from the raw I/O layer (or the pending asynchronous I/O patch
    by Ben LaHaise) will be much easier.
    
    Also included are the block-highmem patches, which enable DMA
    operations directly to and from high memory. With the 2.4 kernel, such
    operations require copying data via "bounce buffers" in low memory.
    Bounce buffers can create severe performance problems on large-memory
    systems, and they are (usually) entirely unnecessary.
    
    Finally, a whole set of support code has been added which hides much
    of the structure of the request queue from block drivers. Included is
    a nice routine for setting up DMA requests easily. The result is that
    all block drivers must be updated, but the resulting code should be
    simpler.
    
    The block work is far from done, however; quite a bit of work is still
    pending. Jens has already [20]stated his plan to break all of the
    block drivers again shortly. Upcoming changes include moving the
    building of SCSI-like commands into the generic block layer, and
    running ioctl() operations through the request queue so that they are
    automatically serialized with the I/O operations.
    
    For more information, see [21]Jens's writeup of the block I/O changes
    so far, and [22]Suparna Bhattacharya's notes on the LSE web site.
    
    Merging the new kbuild. Back at the [23]Kernel Summit, it was agreed
    that one of the first things to happen in 2.5 would be the integration
    of the new kbuild code. Block I/O has jumped in first, but kbuild
    remains on the agenda. To push things forward, Keith Owens has
    [24]proposed a schedule for the merging of kbuild. It calls for the
    new build code to be added in 2.5.2-pre1, and the old system to be
    ripped out in -pre2. The original plan called for deferring the
    integration of CML2 until 2.5.3, but Eric Raymond was less than
    thrilled with the idea. So a [25]revised version of the timeline has
    CML2 going in simultaneously with kbuild. There are just a couple of
    obstacles to overcome, like the fact that the two do not currently
    work together. One assumes these little details can be dealt with.
    
    There has been little comment on the plan to integrate the new kbuild;
    it does not appear to be a controversial change (though there is a
    little grumbling about the new kbuild being slower).
    
    Most speakers, when giving a talk, try to be well tuned to signals
    from the audience. So, when your editor was addressing folks at Linux
    Kongress about 2.5 changes, the sound of vomiting from the seats got
    his attention. The subject at hand was, of course, CML2. This
    development remains controversial, and the talk of integrating it with
    kbuild started up the same old flame wars.
    
    Said wars have been covered in this space in the past, and there is
    very little to add. In theory, Linus has said he will merge CML2 and
    the topic should be moot. Eric Raymond did not help things, however,
    with [26]his statement that he plans to try to get Marcelo to
    integrate CML2 into the 2.4 tree as well. This idea, at least, is not
    controversial - almost nobody seems to think it's a good idea. The 2.4
    kernel just does not need that sort of change.
    
    With regard to 2.5, the main stumbling point still appears to be the
    use of Python 2 as the implementation language. One would think people
    could just install Python and be done with it, but it's apparently not
    so simple. Most of the dissenters are just grumbling, but there are a
    couple of other efforts out there. Greg Banks has a [27]CML2 in C
    project going, though progress has pretty well stopped in recent
    months. Jan Harkes, instead, has put together [28]a patch which ports
    the CML2 code to Python 1.5. Since the older Python is available on
    more older systems, one would hope this patch might help reduce the
    complaining somewhat.
    
    But, then, as devfs shows, some developments never seem to reach a
    point of being accepted by everybody. (Current versions of these
    patches are [29]kbuild 1.1.0 and [30]CML2 1.9.4).
    
    Eliminating sleep_on. For years, the standard way to put a process to
    sleep within the kernel has been the sleep_on() function or its
    variants. sleep_on() simply blocks the calling process until somebody
    explicitly wakes it (or, in some cases, a signal or timeout happens).
    On SMP systems, however, sleep_on() has a serious problem. Consider a
    typical usage:
     if (something not ready)
         sleep_on(&my_wait_queue);
 
    If the "something" becomes ready between the two lines of code, the
    wakeup event will be missed and the process may sleep for much longer
    than intended.
    
    Workarounds for this problem have existed for a long time. The
    wait_event() macros handle this case without races; often semaphores
    or the newish "completion event" interface can be used. If all else
    fails, a relatively complicated "manual sleep" can be coded. All of
    these techniques are used in the kernel, but code that calls
    sleep_on() still exists.
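
    The difference between the racy pattern and a "manual sleep" can be
    shown with a deterministic, single-threaded model. This is a sketch
    under stated assumptions, not kernel code: sketch_waiter and the
    functions below are hypothetical, and the event_in_window argument
    simply forces the wakeup to arrive at the worst possible moment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one task waiting for one condition. */
struct sketch_waiter {
    bool ready;      /* the condition being waited for */
    bool on_queue;   /* task is registered on the wait queue */
    bool asleep;     /* task is marked as sleeping */
};

/* Event side: set the condition, then wake whoever is queued right now. */
void sketch_wake_up(struct sketch_waiter *w)
{
    assert(w != NULL);
    w->ready = true;
    if (w->on_queue)
        w->asleep = false;
}

/* Racy pattern: test the condition first, queue and sleep afterwards. */
void sketch_racy_wait(struct sketch_waiter *w, bool event_in_window)
{
    if (!w->ready) {
        if (event_in_window)
            sketch_wake_up(w);   /* lands between the test and the sleep */
        w->on_queue = true;      /* the sleep_on() step: queue ... */
        w->asleep = true;        /* ... and sleep; the wakeup is already gone */
    }
}

/* Safer pattern: queue first, then test, and sleep only if still needed. */
void sketch_safe_wait(struct sketch_waiter *w, bool event_in_window)
{
    w->on_queue = true;
    w->asleep = true;            /* marked as sleeping, but still running */
    if (event_in_window)
        sketch_wake_up(w);       /* the wakeup now finds the task queued */
    if (w->ready) {
        w->asleep = false;       /* condition already true: do not sleep */
        w->on_queue = false;
    }
}

/* True when the condition holds but the task is still asleep. */
bool sketch_missed_wakeup(const struct sketch_waiter *w)
{
    return w->ready && w->asleep;
}
```

    In this model the racy version ends up asleep with the condition
    already true, while the version that queues itself before testing
    the condition does not. The ordering (register, then test, then
    sleep) is the essential point; wait_event() packages the same idea.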
    
    The plan for some time has been to remove sleep_on() in the 2.5
    series, on the theory that there is no safe way to call it. Now that
    patches are going in, people have begun to ask when this removal might
    take place. The answer, for now, is [31]a patch from David Woodhouse.
    It does not yet go so far as completely removing the function; instead
    it adds some checks which detect (and complain about) unsafe calls. It
    is a gradual approach, but the intent remains the same: eventually
    sleep_on() and friends will go away, and any code that still calls
    them will have to be updated.
    
    Incremental prepatches. H. Peter Anvin has [32]announced a
    much-requested feature for the kernel.org archives: incremental
    prepatches. Until now, posted prepatches have been relative to the
    last official kernel release; users wishing to go from one prepatch
    to another have had to restart with a clean kernel, or explicitly
    back out the previous prepatch. With the new scheme, it is necessary
    only to download the
    (usually smaller) incremental patch and apply that. The incremental
    patches will also make it easier to see exactly what has changed
    between prepatches.
    
    Integrating ALSA. The [33]Advanced Linux Sound Architecture project
    has been working since [34]early 1998 to build a better sound
    subsystem for the Linux kernel. Some people were surprised that ALSA
    was not integrated into 2.4, but the fact is that the project never
    proposed its code for that release. The ALSA hackers have been taking
    their time and trying to get it right.
    
    Now, however, it appears that the time has come. ALSA maintainer
    Jaroslav Kysela has [35]indicated that he and the code are ready, and
    Alan Cox has [36]encouraged him to submit it. The last call belongs to
    Linus, of course, but chances are good that ALSA will find its way
    into a 2.5 kernel before too long. It will probably live alongside the
    OSS drivers for a while, but it seems certain that OSS will eventually
    be removed.
    
    Other patches and updates released this week include:
    
      * Peter Braam has [37]released version 1.0.6-test1 of the InterMezzo
        filesystem. There is also an InterMezzo roadmap available for
        those interested in where this distributed filesystem is going.
      * Larry McVoy has posted [38]a partial description of his
        long-standing "ccCluster" idea. Worth a read for a different
        approach to multiprocessor systems.
      * Christoph Rohland has posted [39]a document for the tmpfs
        filesystem, intended for the kernel documentation directory.
      * IBM has released [40]version 1.0.10 of the JFS journaling
        filesystem.
      * Richard Gooch has released a pile of devfs updates, including
        [41]devfsd-v1.3.20, [42]devfs-v99.21 (for 2.2 kernels),
        [43]devfs-v199.3 (for 2.4) and [44]devfs-v203 (for 2.5).
      * Davide Libenzi has posted [45]a patch which implements "task
        struct coloring." Coloring spreads out the alignment of task
        structures so that they do not all map to the same cache line
        (which is currently the case). The result should be improved
        kernel performance, especially on SMP systems. A [46]later version
        of the patch also adds kernel stack coloring.
      * Bert Hubert has posted [47]a set of documents describing the
        kernel's network traffic control capabilities. Traffic control has
        been present since 2.2, and it provides some very nice features,
        but lack of good documentation has limited its usage. This work is
        a welcome step in the right direction.
      * [48]Version v1.13 of the Dolphin PCI-SCI driver has been released
        by Jeff Merkey.
      * Keith Owens has released [49]kdb v1.9 for the 2.4.16 kernel.
      * [50]ext3 0.9.16 for 2.4 kernels was released by Andrew Morton.
      * The international kernel patch is back: a [51]beta version for
        2.4.16 was announced by Herbert Valerio Riedel.
      * Nathan Scott has [52]posted a new version of the extended
        attributes interface.
      * A patch improving the performance of kernel statistics counters
        was [53]posted by Ravikiran G Thirumalai.
      * Ian Stewart has [54]announced a new release of the AC'97
        "linmodem" driver.
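
    The "task struct coloring" item above lends itself to a short
    arithmetic sketch. Everything specific here is an assumption made
    for illustration and is not taken from the patch: a 32-byte cache
    line, eight colors, a cache modeled as direct-mapped, and task
    structures that would otherwise start on 8 KB boundaries.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_LINE   32u   /* assumed cache line size in bytes */
#define SKETCH_COLORS 8u    /* assumed number of distinct offsets */

/* Byte offset applied to the n-th task structure inside its allocation. */
size_t sketch_color_offset(size_t n)
{
    return (n % SKETCH_COLORS) * SKETCH_LINE;
}

/* Cache line index an address maps to, for a cache with `lines` lines. */
size_t sketch_cache_line(size_t addr, size_t lines)
{
    assert(lines > 0);
    return (addr / SKETCH_LINE) % lines;
}
```

    Under these assumptions, two structures at addresses 8192 and 16384
    map to the same line of a 128-line cache; adding
    sketch_color_offset(0) and sketch_color_offset(1) respectively moves
    them onto different lines.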
        
    Section Editor: [55]Jonathan Corbet
    December 6, 2001
    
    For other kernel news, see:
      * [56]Kernel traffic
      * [57]Kernel Newsflash
      * [58]Kernel Trap
    
    Other resources:
      * [59]Kernel Source Reference
      * [60]L-K mailing list FAQ
      * [61]Linux-MM
      * [62]Linux Scalability Effort
      * [63]Kernel Newbies
      * [64]Linux Device Drivers
    
    
    
    
    [66]Eklektix, Inc. Linux powered! Copyright © 2001 [67]Eklektix, Inc.,
    all rights reserved
    Linux (R) is a registered trademark of Linus Torvalds
 
 References
 
    1. http://lwn.net/
    2. http://ads.tucows.com/click.ng/pageid=001-012-132-000-000-003-000-000-012
    3. http://lwn.net/2001/1206/
    4. http://lwn.net/2001/1206/security.php3
    5. http://lwn.net/2001/1206/dists.php3
    6. http://lwn.net/2001/1206/devel.php3
    7. http://lwn.net/2001/1206/commerce.php3
    8. http://lwn.net/2001/1206/press.php3
    9. http://lwn.net/2001/1206/announce.php3
   10. http://lwn.net/2001/1206/history.php3
   11. http://lwn.net/2001/1206/letters.php3
   12. http://lwn.net/2001/1206/bigpage.php3
   13. http://lwn.net/2001/1129/kernel.php3
   14. http://lwn.net/2001/1206/a/2.5.1-pre5.php3
   15. http://lwn.net/2001/1025/kernel.php3
   16. http://marcelothewonderpenguin.com/
   17. http://lwn.net/2001/1206/a/2.4.17-pre4.php3
   18. http://lwn.net/2001/1206/a/no-design.php3
   19. http://lwn.net/2001/1206/a/mutation.php3
   20. http://lwn.net/2001/1206/a/ja-not-kidding.php3
   21. http://lwn.net/2001/1206/a/bio-writeup.php3
   22. http://lse.sourceforge.net/io/bionotes.txt
   23. http://lwn.net/2001/features/KernelSummit/
   24. http://lwn.net/2001/1206/a/kbuild-plan.php3
   25. http://lwn.net/2001/1206/a/kbuild-plan2.php3
   26. http://lwn.net/2001/1206/a/cml2-2.4.php3
   27. http://lwn.net/2001/1206/a/cml2-in-c.php3
   28. http://lwn.net/2001/1206/a/cml2-in-python1.php3
   29. http://lwn.net/2001/1206/a/kbuild.php3
   30. http://lwn.net/2001/1206/a/cml.php3
   31. http://lwn.net/2001/1206/a/sleep_on.php3
   32. http://lwn.net/2001/1206/a/incremental.php3
   33. http://www.alsa-project.org/
   34. http://lwn.net/1998/0226/a/elsa.html
   35. http://lwn.net/2001/1206/a/alsa.php3
   36. http://lwn.net/2001/1206/a/ac-alsa.php3
   37. http://lwn.net/2001/1206/a/intermezzo.php3
   38. http://lwn.net/2001/1206/a/ccCluster.php3
   39. http://lwn.net/2001/1206/a/tmpfs.php3
   40. http://lwn.net/2001/1206/a/jfs.php3
   41. http://lwn.net/2001/1206/a/devfsd-v1.3.20.php3
   42. http://lwn.net/2001/1206/a/devfs-v99.21.php3
   43. http://lwn.net/2001/1206/a/devfs-v199.3.php3
   44. http://lwn.net/2001/1206/a/devfs-v203.php3
   45. http://lwn.net/2001/1206/a/task-coloring.php3
   46. http://lwn.net/2001/1206/a/kernel-stack.php3
   47. http://lwn.net/2001/1206/a/tc-doc.php3
   48. http://lwn.net/2001/1206/a/pci-sci.php3
   49. http://lwn.net/2001/1206/a/kdb.php3
   50. http://lwn.net/2001/1206/a/ext3.php3
   51. http://lwn.net/2001/1206/a/ikp.php3
   52. http://lwn.net/2001/1206/a/ext-attrs.php3
   53. http://lwn.net/2001/1206/a/counters.php3
   54. http://lwn.net/2001/1206/a/ac97.php3
   55. mailto:lwn@lwn.net
   56. http://kt.zork.net/
   57. http://www.atnf.csiro.au/~rgooch/linux/docs/kernel-newsflash.html
   58. http://www.kerneltrap.com/
   59. http://lksr.org/
   60. http://www.tux.org/lkml/
   61. http://www.linux.eu.org/Linux-MM/
   62. http://lse.sourceforge.net/
   63. http://www.kernelnewbies.org/
   64. http://www.xml.com/ldd/chapter/book/index.html
   65. http://lwn.net/2001/1206/dists.php3
   66. http://www.eklektix.com/
   67. http://www.eklektix.com/
 
 --- ifmail v.2.14.os7-aks1
  * Origin: Unknown (2:4615/71.10@fidonet)
 
 
