 - RU.LINUX ---------------------------------------------------------------------
 From : Sergey Lentsov                       2:4615/71.10   13 Dec 2001  17:11:08
 To : All
 Subject : URL: http://www.lwn.net/2001/1213/kernel.php3
 -------------------------------------------------------------------------------- 
 
    See also: [13]last week's Kernel page.
    
 Kernel development
 
    The current development kernel release is still 2.5.0. The current
    2.5.1 prepatch is [14]2.5.1-pre10. On the surface, little has changed
    over the last week; most of the changelog entries seem to be some
    variant of "Jens Axboe: bio work." The thrashing of the block layer is
    taking some time to stabilize - as is to be expected from a change of
    this magnitude. The last of the disruptive block I/O changes have not
    yet hit the kernel, so this situation could persist for a while yet.
    
    Also included in this prepatch is a Super-H architecture update, some
    network driver work, an NTFS update, USB fixes, memory pools (see
    below), and the inevitable superblock cleanup patches from Al Viro.
    
    The current stable kernel release is 2.4.16. Marcelo's prepatches are
    up to [15]2.4.17-pre8; he has stated that the next prepatch will be
    the first 2.4.17 release candidate. Marcelo's stated plan is to have
    the final release be the same as the last release candidate; the hope
    is to be done with surprises caused by last-minute patches.
    
    Memory pools are a new addition to the kernel as of 2.5.1-pre10. The
    idea behind "mempools," which were implemented by Ingo Molnar, is to
    provide a memory allocation function that is guaranteed to work, even
    when memory is tight. Some places in the kernel cannot afford to have
    memory allocations fail. For example, memory pressure can force the
    system to swap pages out, but the swap operation itself needs memory
    in order to run. If the memory to set up the swap is not available,
    the system comes to a halt.
    
    Memory pools work by simply preallocating a bunch of memory and
    keeping it aside until it's needed. The actual allocation and freeing
    of memory is handled by somebody else (the idea seems to be for
    mempools to be layered over the slab allocator); all mempools do is
    stock up ahead of time. Their use will thus increase the kernel's
    memory consumption (by the amount of memory that is set aside). For
    certain critical paths, though, they should help to improve the
    stability of the system under heavy load.
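
    The preallocation idea can be modeled in a few lines. What follows is
    a minimal sketch in Python, not the kernel's code: the MemPool and
    FlakyAllocator names, their methods, and the choice to return None on
    failure are all invented for illustration. The pool fills a reserve
    up front, hands out elements from the underlying allocator while that
    works, and dips into the reserve only when it does not.

```python
class FlakyAllocator:
    """Hypothetical backing allocator that can be made to fail on demand."""

    def __init__(self):
        self.exhausted = False
        self.freed = 0

    def alloc(self):
        # Returns None to signal failure, as a stand-in for a NULL pointer.
        return None if self.exhausted else bytearray(64)

    def free(self, element):
        self.freed += 1


class MemPool:
    """Toy model of a preallocating pool layered over another allocator."""

    def __init__(self, min_nr, alloc_fn, free_fn):
        self.min_nr = min_nr
        self.alloc_fn = alloc_fn
        self.free_fn = free_fn
        # Stock up ahead of time; this memory is held even when unused.
        self.reserve = []
        for _ in range(min_nr):
            element = alloc_fn()
            if element is None:
                raise MemoryError("could not fill the reserve")
            self.reserve.append(element)

    def alloc(self):
        # Try the normal allocator first and keep the reserve intact.
        element = self.alloc_fn()
        if element is not None:
            return element
        # The allocator failed: fall back on the preallocated elements.
        if self.reserve:
            return self.reserve.pop()
        # Assumption: a real pool would wait for a free() at this point.
        return None

    def free(self, element):
        # Refill the reserve before giving memory back to the allocator.
        if len(self.reserve) < self.min_nr:
            self.reserve.append(element)
        else:
            self.free_fn(element)
```

    In this model the cost mentioned above is visible directly: min_nr
    elements stay allocated for the life of the pool whether or not they
    are ever needed.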
    
    Coming soon: bigger device numbers. One of the long-stated plans for
    2.5 is to increase the size of dev_t, the type which is used to
    represent device numbers. This type, as it stands now, has roots all
    the way back to the original Unix systems - it is a 16-bit quantity,
    with eight bits for the major number, and eight for the minor. It is
    inadequate for modern systems, which can have, literally, thousands of
    devices on them. So dev_t has to grow.
    
    Linus laid out the plan some time ago (see [16]the March 29, 2001 LWN
    Kernel Page): dev_t would grow to 32 bits. Of those, 12 would
    designate the major number, and 20 the minor number. A number of
    people would rather see 64-bit device numbers, but Linus is opposed to
    that.
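
    To make the arithmetic concrete, here is a small Python sketch of
    both layouts. The function names are invented for this example, and
    placing the major number in the high-order bits of the 32-bit value
    is an assumption; the text above gives only the field widths.

```python
def split_old(dev):
    """Split a 16-bit device number into (major, minor), 8 bits each."""
    return (dev >> 8) & 0xFF, dev & 0xFF


def make_new(major, minor):
    """Pack a 12-bit major and a 20-bit minor into one 32-bit number.

    Assumption: the major number occupies the top 12 bits.
    """
    if not 0 <= major < (1 << 12):
        raise ValueError("major number does not fit in 12 bits")
    if not 0 <= minor < (1 << 20):
        raise ValueError("minor number does not fit in 20 bits")
    return (major << 20) | minor


def split_new(dev):
    """Split a 32-bit device number into (major, minor)."""
    return (dev >> 20) & 0xFFF, dev & 0xFFFFF
```

    The old format allows 256 major numbers with 256 minors each; the
    proposed one allows 4096 majors with 1,048,576 minors each.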
    
    Changing device numbers raises a number of interesting compatibility
    problems. Consider, for example, a tar or dump archive containing a
    /dev directory. The archive contains the device numbers for every
    entry in that directory; if those numbers stop working after the dev_t
    change, everybody's backups have just been rendered invalid. System
    administrators, when faced with that prospect, tend to break out in a
    cold sweat, overindulge in beer, and switch to BSD.
    
    Fortunately, that particular problem [17]has a solution. In the new
    scheme, the major number zero is set aside as a marker for "legacy"
    device numbers. Any 32-bit device number with a major number of zero
    is interpreted as an old-style number and "just works." A change to
    the C library will be required before applications can exchange larger
    device numbers with the kernel, but the change should be relatively
    smooth beyond that.
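
    A rough Python sketch of that interpretation rule follows. It
    assumes the major number sits in the top 12 bits of the 32-bit value
    and that a legacy number is carried unchanged in the low 16 bits;
    neither detail is spelled out above, and the decode() name is
    invented for this example.

```python
def decode(dev32):
    """Interpret a 32-bit device number, honoring the legacy marker.

    Assumed layout: 12-bit major in the high bits, 20-bit minor below it.
    """
    major = (dev32 >> 20) & 0xFFF
    minor = dev32 & 0xFFFFF
    if major == 0:
        # Major zero marks an old-style number; assume it is stored in
        # the low 16 bits as 8 bits of major and 8 bits of minor.
        if minor > 0xFFFF:
            raise ValueError("legacy device number wider than 16 bits")
        return "legacy", (minor >> 8) & 0xFF, minor & 0xFF
    return "new", major, minor
```

    Under these assumptions an archived device number such as 0x0801
    decodes to the same major 8, minor 1 it always meant, with no
    conversion step.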
    
    On the kernel side, however, life could be more interesting. Kernel
    developers really do try to avoid breaking applications, but they are
    more willing to tear things up inside the kernel. Especially in a
    development series.
    
    The kernel version of the device number type is kdev_t. It has long
    been meant to be an opaque type, but it's really just dev_t in kernel
    drag. People had assumed that kdev_t would grow along with dev_t, but
    [18]that's not what Linus has in mind. Linus wants kdev_t to go away
    entirely. All of the interfaces in the kernel which currently use that
    type will be changed to take a pointer to an appropriate structure.
    Block drivers, thus, will see a pointer to a struct block_device
    rather than a device number. Some sort of struct char_device will also
    probably be created to handle a similar role.
    
    In other words, the kernel will no longer use device numbers at all,
    except as a means of communication with user space. Internally, device
    numbers will not exist. A lot of kernel code is going to have to
    change to make this happen; one does not have to look very hard to see
    more unstable development kernel releases in the future - see, for
    example, [19]Al Viro's description of some of the issues involved.
    But, then, that's what development kernels are for.
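
    The shape of that change can be illustrated with a toy model. The
    Python sketch below is only an analogy: BlockDevice, DeviceTable,
    open_device() and capacity_bytes() are invented names, not kernel
    interfaces. The device number is translated exactly once, at the
    boundary with user space, and everything past that point works with
    the device object.

```python
from dataclasses import dataclass


@dataclass
class BlockDevice:
    """Hypothetical stand-in for a per-device structure."""
    name: str
    sectors: int


class DeviceTable:
    """Maps (major, minor) pairs to device objects at the boundary."""

    def __init__(self):
        self._by_number = {}

    def register(self, major, minor, device):
        self._by_number[(major, minor)] = device

    def lookup(self, major, minor):
        # Raises KeyError when no such device has been registered.
        return self._by_number[(major, minor)]


def open_device(table, major, minor):
    """Boundary code: the only place a device number is interpreted."""
    return table.lookup(major, minor)


def capacity_bytes(device, sector_size=512):
    """Internal code: takes the device object, never a number."""
    return device.sectors * sector_size
```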
    
    Where do important changes get tested? One would think that, now that
    we finally have a development kernel again, non-trivial changes would
    show up there before being merged into the stable 2.4 series. Thus,
    there was [20]some surprise when support for "hyperthreading" on
    Pentium 4 processors went into 2.4.17-pre5. That support still does
    not exist in 2.5, and has thus not seen the wider testing that it
    could experience there.
    
    The reasoning behind putting this change into 2.4, as [21]explained by
    Alan Cox, is interesting. The claim that normal users will not be
    affected by the change is standard. But Alan also points out that, due
    to the ongoing block I/O work, the 2.5 series "isn't usable for that
    kind of thing in the near future." So, if a feature like
    hyperthreading is to be tried out, it must be added to the stable
    kernel series.
    
    Things will get better as the block layer stabilizes - at least, until
    the next set of disruptive changes goes in. Until then, it's a bit
    ironic that the only place to test certain kinds of changes is the
    stable kernel series.
    
    (Hyperthreading, for those who are interested, is the hardware trick
    of making a single processor appear to be multiple virtual processors
    as a way of keeping the processor busy while waiting for memory
    accesses. See [22]Intel's Hyperthreading page for details.)
    
    Work on the scheduler is also coming to a boil. It is a widely (though
    not universally) held belief that the Linux scheduler is overdue for a
    rewrite in 2.5. [23]Quoting Alan Cox again:
    
      Its a great scheduler for a single or dual processor 486/pentium
      type box running a home environment. It gets a bit flaky by the
      time its running oracle on a 4 way, it gets very flaky by the time
      its running lotus back ends on an 8 way. It doesn't take lunacy
      like java, broken JVM implementations and volcanomark to make it go
      astray.
      
    The scheduler's performance on larger systems and under load has been
    shown to be inadequate numerous times. But there is little agreement
    on what should replace it.
    
    Mike Kravetz and company at IBM have posted [24]a new multi-queue
    scheduler patch for the 2.5.0 kernel. This scheduler cuts down on
    scheduling time by maintaining a separate run queue for each processor
    on the system. It tries to improve performance while maintaining the
    same behavior as the existing scheduler.
    
    Alan Cox has [25]a new scheduler of his own which works by maintaining
    a set of eight (currently) run queues for each processor. Picking a
    process to run is just a matter of taking the first one off the
    highest priority queue.
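
    The selection step described here is simple enough to sketch. The
    Python below is an illustration of the idea only, not Alan's patch:
    the class and method names are invented, and treating queue 0 as the
    highest priority is an assumption.

```python
from collections import deque

NUM_QUEUES = 8  # "eight (currently)", per the description above


class CpuRunQueues:
    """One processor's set of run queues, one queue per priority level."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_QUEUES)]

    def enqueue(self, task, priority):
        # Assumption: priority 0 is the highest, NUM_QUEUES - 1 the lowest.
        self.queues[priority].append(task)

    def pick_next(self):
        # Take the first task off the highest-priority non-empty queue.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None  # nothing runnable on this processor
```

    The cost of picking a task is bounded by the number of queues, not
    by the number of runnable processes.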
    
    Finally, Davide Libenzi has [26]a scheduler patch which implements a
    per-CPU run queue and some load balancing code.
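
    As a rough illustration of what "a per-CPU run queue and some load
    balancing code" can mean, here is a minimal Python sketch. It is not
    based on Davide's patch; the names, the balancing policy (move tasks
    from the longest queue to the shortest), and the threshold parameter
    are all assumptions made for this example.

```python
from collections import deque


class PerCpuScheduler:
    """Toy scheduler with one run queue per processor."""

    def __init__(self, num_cpus):
        self.run_queues = [deque() for _ in range(num_cpus)]

    def enqueue(self, cpu, task):
        self.run_queues[cpu].append(task)

    def pick_next(self, cpu):
        # Each processor looks only at its own queue.
        queue = self.run_queues[cpu]
        return queue.popleft() if queue else None

    def balance(self, threshold=2):
        """Move tasks until no two queues differ by `threshold` or more.

        Returns the number of tasks moved between processors.
        """
        if threshold < 2:
            raise ValueError("threshold below 2 would bounce tasks forever")
        moved = 0
        while True:
            busiest = max(self.run_queues, key=len)
            idlest = min(self.run_queues, key=len)
            if len(busiest) - len(idlest) < threshold:
                return moved
            # Migrate from the tail, leaving the next-to-run task alone.
            idlest.append(busiest.pop())
            moved += 1
```

    Raising the threshold trades evenness of load for fewer migrations,
    which matters if keeping a process on one processor is a goal.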
    
    All of these projects share the same goals: cut down on scheduling
    overhead, work harder to keep processes from moving between
    processors, and retain good performance in low-load situations. The
    low-load performance is considered critical: it is, after all, the
    normal situation for most systems, and the current scheduler handles
    it well. No patch which impairs low-load performance is likely to get
    too far.
    
    The hyperthreading issue mentioned above is likely to throw a new set
    of complications into the mix. A processor which does hyperthreading
    looks like two independent CPUs, but it should not be scheduled as
    such - it is better to divide processes across real (hardware)
    processors first. Expect scheduling to be a hot topic for some time.
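
    One way to picture "real processors first" is as an ordering of the
    available slots. The Python sketch below is an assumption-laden
    illustration, not a description of any posted patch: place_tasks()
    is an invented name, and the policy is simply to fill one virtual
    processor on every physical package before using a second one
    anywhere.

```python
def place_tasks(num_tasks, physical_cpus, siblings_per_cpu=2):
    """Assign tasks to (physical, sibling) slots, physical CPUs first.

    Returns one (physical_cpu, sibling_index) pair per task.
    """
    # Order the slots so that sibling 0 of every physical processor
    # comes before sibling 1 of any of them.
    slots = [
        (physical, sibling)
        for sibling in range(siblings_per_cpu)
        for physical in range(physical_cpus)
    ]
    # Wrap around once every slot holds a task.
    return [slots[i % len(slots)] for i in range(num_tasks)]
```

    With two physical processors, two tasks land on different packages;
    a naive scheduler treating all four virtual processors alike could
    just as easily put both on the same one.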
    
    Linux Advanced Routing & Traffic Control Documentation Project. Bert
    Hubert has been working for some time on the documentation of the
    advanced Linux routing features. The Linux traffic control mechanism
    has been available since the 2.1 days, but is greatly underutilized.
    The quality of the available documentation has not helped here. The
    code is great, but it's hard to figure out how to use it. So an effort
    to shine some light in that direction is more than welcome.
    
    Bert's work has now grown into the [27]Linux Advanced Routing &
    Traffic Control documentation project, and a great deal of information
    is available there. The latest addition is the [28]tc-cbq man page:
    "Nearly 2500 words, 8 printed pages, of nearly unintelligible
    gobledygook, explaining mostly how CBQ works." Good stuff.
    
    Other patches and updates released this week include:
    
      * Andrea Arcangeli has [29]made available (as a tarball containing a
        magicpoint file) the slides from his PLUTO talk on the new VM
        implementation. This is the first documentation that has been made
        available on the new code. We have also made the slides available
        [30]in HTML format.
      * Daniel Phillips has [31]posted his ALS paper on ext2 directory
        indexes, along with a wealth of benchmark results. Worth a look if
        this work interests you at all.
      * [32]Kernel Traffic #145 (December 10) is available.
      * Rusty Russell has [33]posted a patch making it easy for kernel
        code to set up per-CPU data areas.
      * The [34]ltp-20011206 release is available from the Linux Test
        Project.
      * The latest User-mode Linux release from Jeff Dike is
        [35]0.53-2.4.16.
      * A new [36]preemptible kernel patch is available from Robert Love.
      * Karim Yaghmour has [37]released version 0.9.5pre4 of the Linux
        Trace Toolkit.
      * Jason Baietto has [38]released a set of "multiprocessor control
        interface" programs. These allow users to bind tasks to processors
        and perform other, similar operations.
      * Ben LaHaise has posted [39]a patch which adds his kvec type
        (essentially a lightweight replacement for kiobufs) to the kernel.
        kvecs are needed for his asynchronous I/O work, among other
        things. Also available is [40]this patch, which works the kvec
        structure into the new block I/O code.
      * Eric Raymond has released [41]CML2 1.9.7.
      * The [42]2001_12_10 release of the security module code is
        available. Also available is [43]a new security module adding
        labeled IPv4 networking to SELinux.
      * Lennert Buytenhek has [44]released version 0.0.4pre1 of his
        bridging netfilter code.
      * Jozsef Kadlecsik has been [45]added to the netfilter core team.
        
    Section Editor: [46]Jonathan Corbet
    December 13, 2001
    
    For other kernel news, see:
      * [47]Kernel traffic
      * [48]Kernel Newsflash
      * [49]Kernel Trap
    
    Other resources:
      * [50]Kernel Source Reference
      * [51]L-K mailing list FAQ
      * [52]Linux-MM
      * [53]Linux Scalability Effort
      * [54]Kernel Newbies
      * [55]Linux Device Drivers
    
    
    
    [57]Eklektix, Inc. Linux powered! Copyright © 2001 [58]Eklektix, Inc.,
    all rights reserved
    Linux (R) is a registered trademark of Linus Torvalds
 
 References
 
    1. http://lwn.net/
    2. http://ads.tucows.com/click.ng/pageid=001-012-132-000-000-003-000-000-012
    3. http://lwn.net/2001/1213/
    4. http://lwn.net/2001/1213/security.php3
    5. http://lwn.net/2001/1213/dists.php3
    6. http://lwn.net/2001/1213/devel.php3
    7. http://lwn.net/2001/1213/commerce.php3
    8. http://lwn.net/2001/1213/press.php3
    9. http://lwn.net/2001/1213/announce.php3
   10. http://lwn.net/2001/1213/history.php3
   11. http://lwn.net/2001/1213/letters.php3
   12. http://lwn.net/2001/1213/bigpage.php3
   13. http://lwn.net/2001/1206/kernel.php3
   14. http://lwn.net/2001/1213/a/2.5.1-pre10.php3
   15. http://lwn.net/2001/1213/a/2.4.17-pre8.php3
   16. http://lwn.net/2001/0329/kernel.php3
   17. http://lwn.net/2001/1213/a/hpa-dev_t.php3
   18. http://lwn.net/2001/1213/a/lt-kdev_t.php3
   19. http://lwn.net/2001/1213/a/av-kdev_t.php3
   20. http://lwn.net/2001/1213/a/ht.php3
   21. http://lwn.net/2001/1213/a/ac-2.5.php3
   22. http://developer.intel.com/technology/hyperthread/
   23. http://lwn.net/2001/1213/a/ac-scheduler.php3
   24. http://lwn.net/2001/1213/a/mq.php3
   25. http://lwn.net/2001/1213/a/8queue.php3
   26. http://lwn.net/2001/1213/a/dl-scheduler.php3
   27. http://ds9a.nl/lartc/
   28. http://lwn.net/2001/1213/a/cbq.php3
   29. http://lwn.net/2001/1213/a/aa-vm.php3
   30. http://lwn.net/2001/1213/aa-vm-talk/
   31. http://lwn.net/2001/1213/a/directory-index.php3
   32. http://kt.zork.net/kernel-traffic/kt20011210_145.html
   33. http://lwn.net/2001/1213/a/per-cpu.php3
   34. http://lwn.net/2001/1213/a/ltp.php3
   35. http://lwn.net/2001/1213/a/uml.php3
   36. http://lwn.net/2001/1213/a/pk.php3
   37. http://lwn.net/2001/1213/a/ltt.php3
   38. http://lwn.net/2001/1213/a/mci.php3
   39. http://lwn.net/2001/1213/a/kvec.php3
   40. http://lwn.net/2001/1213/a/kvec-bio.php3
   41. http://lwn.net/2001/1213/a/cml.php3
   42. http://lwn.net/2001/1213/a/sm.php3
   43. http://lwn.net/2001/1213/a/selopt.php3
   44. http://lwn.net/2001/1213/a/bridge-netfilter.php3
   45. http://lwn.net/2001/1213/a/nct.php3
   46. mailto:lwn@lwn.net
   47. http://kt.zork.net/
   48. http://www.atnf.csiro.au/~rgooch/linux/docs/kernel-newsflash.html
   49. http://www.kerneltrap.com/
   50. http://lksr.org/
   51. http://www.tux.org/lkml/
   52. http://www.linux.eu.org/Linux-MM/
   53. http://lse.sourceforge.net/
   54. http://www.kernelnewbies.org/
   55. http://www.xml.com/ldd/chapter/book/index.html
   56. http://lwn.net/2001/1213/dists.php3
   57. http://www.eklektix.com/
   58. http://www.eklektix.com/
 
 --- ifmail v.2.14.os7-aks1
  * Origin: Unknown (2:4615/71.10@fidonet)
 
 
