- RU.ALGORITHMS ----------------------------------------------------------------
 From    : Max Alekseyev               2:5015/60          06 Feb 2002 14:00:26
 To      : All
 Subject : Google programming contest w/ $10K cash prize
--------------------------------------------------------------------------------
Hi, All!
http://www.google.com/programming-contest/
===cut===
First Annual Google Programming Contest
In celebration of more than three years of delivering the best search experience
on the Internet, Google is sponsoring the first annual Google Programming
Contest.
Grand Prize
* $10,000 in cash
* VIP visit to Google Inc. in Mountain View, California
* Potentially run your prize-winning code on Google's multi-billion document
repository (circumstances permitting)
The Challenge
Google is providing a selection of about 900,000 web pages in pre-parsed and raw
format, together with a "ripper" program that provides a framework for
processing the pre-parsed data. Your mission is to write a program (most likely
by adding code to the ripper) that does something interesting with the data, in
such a way that it would scale to a web-sized collection of documents. Part of
your job is to convince us of why your program is interesting and why it will
scale; other than that, you're free to implement whatever strikes your fancy.
We suggest you fit your entry in one of two different tracks: Systems or
Applications.
1. Systems
Entries in the Systems track generally pertain to infrastructure for handling
the data, where typical goals are systems-related (i.e., speed/space
properties). Some examples of possible projects include:
* Achieving better compression for the repository (starting from either the
pre-parsed or raw formats). You might make a case for why your compression
scheme saves the most space, or saves space while still allowing quick access to
the data.
* Designing and implementing an efficient index structure to quickly find
all documents that contain a given word or phrase.
* Constructing a link graph for the data and providing fast access to it.
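As an illustration of the index-structure idea above, here is a minimal in-memory sketch, not the ripper's API and not anything from the contest code. The class name InvertedIndex and its methods are hypothetical, documents are assumed to arrive with non-negative, non-decreasing ids, and tokenization is plain whitespace splitting with lower-casing (punctuation is not stripped). It assumes a C++11 compiler.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <iterator>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical inverted index: word -> sorted list of document ids.
class InvertedIndex {
 public:
  // Adds one document. Ids must be non-decreasing across calls so that
  // every posting list stays sorted without an extra sort step.
  void AddDocument(int doc_id, const std::string& text) {
    assert(doc_id >= last_doc_id_);
    last_doc_id_ = doc_id;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
      std::transform(word.begin(), word.end(), word.begin(),
                     [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                     });
      std::vector<int>& postings = postings_[word];
      // Record each document at most once per word.
      if (postings.empty() || postings.back() != doc_id) {
        postings.push_back(doc_id);
      }
    }
  }

  // Returns the ids of documents that contain every word in `words`.
  // Query words are expected to be lower-case already.
  std::vector<int> Query(const std::vector<std::string>& words) const {
    std::vector<int> result;
    bool first = true;
    for (const std::string& w : words) {
      auto it = postings_.find(w);
      if (it == postings_.end()) return std::vector<int>();
      if (first) {
        result = it->second;
        first = false;
        continue;
      }
      // Intersect the running result with this word's posting list.
      std::vector<int> merged;
      std::set_intersection(result.begin(), result.end(),
                            it->second.begin(), it->second.end(),
                            std::back_inserter(merged));
      result.swap(merged);
    }
    return result;
  }

 private:
  std::map<std::string, std::vector<int>> postings_;
  int last_doc_id_ = -1;
};
```

Phrase queries would additionally need word positions in each posting, and a web-sized index would have to live on disk and be split across machines; this sketch only shows the word-to-documents mapping and the list intersection.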
2. Applications
Entries in the Applications track generally deal with the semantics of the data.
Some examples include:
* Detecting common templates in pages, and separating out the common
structure from the individual content.
* Classifying links on a page.
* Detecting pages that are near-duplicates of one another.
* Clustering pages by topic or type.
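One common way to approach the near-duplicate item above is to compare sets of word shingles with Jaccard similarity. The sketch below is only an illustration of that technique, not a method prescribed by the contest. The function names, the shingle length k = 3 and the 0.8 threshold are all assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Returns the set of k-word shingles of `text`: every run of k
// consecutive whitespace-separated words, joined by single spaces.
std::set<std::string> Shingles(const std::string& text, std::size_t k) {
  assert(k > 0);
  std::istringstream in(text);
  std::vector<std::string> words;
  std::string w;
  while (in >> w) words.push_back(w);

  std::set<std::string> out;
  for (std::size_t i = 0; i + k <= words.size(); ++i) {
    std::string shingle = words[i];
    for (std::size_t j = 1; j < k; ++j) shingle += " " + words[i + j];
    out.insert(shingle);
  }
  return out;
}

// Jaccard similarity |A and B| / |A or B|; two empty sets count as identical.
double Jaccard(const std::set<std::string>& a,
               const std::set<std::string>& b) {
  if (a.empty() && b.empty()) return 1.0;
  std::size_t common = 0;
  for (const std::string& s : a) common += b.count(s);
  return static_cast<double>(common) / (a.size() + b.size() - common);
}

// Hypothetical decision rule: pages are near-duplicates when their
// shingle sets overlap by at least `threshold`.
bool NearDuplicates(const std::string& x, const std::string& y,
                    std::size_t k = 3, double threshold = 0.8) {
  return Jaccard(Shingles(x, k), Shingles(y, k)) >= threshold;
}
```

Comparing every pair of pages this way is quadratic in the number of pages, so a web-sized entry would need to replace the full shingle sets with compact sketches or fingerprints before comparing.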
The supplied repository is several orders of magnitude smaller than the ultimate
target repository for the code, because of the limitations of the distribution
media and the likely resource constraints of many entrants. Keep this in mind
when designing your implementation. You should assume that your code will
ultimately run on a collection of networked machines with a reasonable amount of
memory (~2-4 gigabytes each), where the data is divided among them. You will
probably need to combine partial results from each machine to form a single
final result.
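The split-then-combine pattern described above can be sketched as follows, taking per-machine word counts as the example of a partial result. Everything here is an assumption made for illustration: the names ShardFor, Counts and MergePartialCounts are hypothetical, assigning pages to machines by URL hash is just one possible partitioning, and a real entry would pass results over the network rather than in one process.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Picks the machine responsible for a page by hashing its URL.
// The exact value depends on the standard library's std::hash.
std::size_t ShardFor(const std::string& url, std::size_t num_machines) {
  assert(num_machines > 0);
  return std::hash<std::string>()(url) % num_machines;
}

// One machine's partial result: word -> number of occurrences.
typedef std::map<std::string, long long> Counts;

// Combines the partial results of all machines into a single result
// by summing the counts of each word.
Counts MergePartialCounts(const std::vector<Counts>& partials) {
  Counts total;
  for (const Counts& part : partials) {
    for (const auto& kv : part) {
      total[kv.first] += kv.second;
    }
  }
  return total;
}
```

Summation merges cleanly because it is associative and commutative, so partial results can be combined in any order or in a tree of machines. Results such as medians or top-k lists need more care when split this way.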
The limited size of the repository being distributed and the selection of
documents may preclude certain interesting kinds of document processing. This
repository includes a selection of HTML Web pages from 100 different sites in
the "edu" domain.
How to Enter
Read the Contest Rules located at the bottom of this page. By participating in
this contest, you agree to be bound by these Contest Rules.
The code and data may be downloaded from our web site:
* http://research.google.com/contest/prog-contest-sample.tar - 57M
This .tar file contains both the source code you'll need and a sample data file
that can be used to develop and test your program. It also contains a README
file with details about the code and data, as well as links to the site where
you can download the full set of 900,000 web pages on which to run your program.
If you prefer, we will mail you the code and data on a set of five CDs. E-mail
your request for CDs, including a postal address, to
programming-contest@google.com.
We provide source code in C++. You may alternatively choose to write your code
in Java, in which case you are responsible for implementing any necessary
interface code. Your submission must include a Makefile and README, and must
compile on Linux 2.2 or 2.4 using g++ (for C++ code) or standard Sun tools (for
Java code). If your code depends on third-party packages, you must include a
complete list of all packages, including exact version information and download
URLs. Sorry, we cannot accept entries that require commercial software or other
software that is not provided as open source or under GPL.
* Entries must include an English-language explanation of the design,
including an argument that it will scale to 2 billion pages with reasonable
runtime, as well as source code for the implementation. We strongly encourage
you to include all data needed to support your claims, such as sample output
from your program. Clear instructions and an easy-to-use demo program that
allows experimentation with your system will also help.
* Your entries must also include the names, e-mail addresses, and brief
resumes (including postal addresses and telephone numbers) of everyone who
contributed to the project.
* Entries must be submitted in machine-readable format (gzipped tar file)
via e-mail to programming-contest@google.com.
* Entries will be accepted through midnight (PST) April 30, 2002.
There is no limit on the number of entries you may submit. Keep copies for your
records. Google assumes no responsibility for lost, misdirected, illegible or
late entries or for failed computer transmissions or technical failures.
Discussion Group
If you want to discuss ideas and problems related to the programming contest
with other participants, visit the Google Groups programming contest newsgroup:
google.public.programming-contest.
Judging
Winners will be selected by a panel of Google staff scientists. The judges will
grade entries using the following criteria:
* General utility and importance of output
* Scalability and elegance of design (including selection of appropriate
algorithms and data structures)
* Clarity, efficiency and portability of implementation
The judges shall have the sole authority and discretion to select the award
recipient(s).
Contest Rules
To participate in the Google Programming Contest (the "Contest"), you must be at
least 18 years old. The Contest is open to individuals or teams of up to 3
people, but not to corporate entries. Employees and contractors of Google, Inc.
("Google") and members of their immediate families are not eligible to enter.
Void where prohibited.
With regard to the software and repository that you obtain for the Contest, you
agree to the license terms as stated in files you download or receive. With
regard to an entry you submit as part of the Contest, you grant Google a
worldwide, perpetual, fully paid-up, non-exclusive license to make, sell, or use
the technology related thereto, including but not limited to the software,
algorithms, techniques, concepts, etc., associated with the entry.
If you are selected as a contest winner, you agree that Google may publicize
your name, likeness, and the description of work you did to win the contest.
Apart from the prizes associated with being selected as a winner, Google shall
not be obligated to compensate you in any way for such publicity.
One $10,000 cash prize will be awarded to the winning entry. If the winning
entry is submitted by more than one individual, the $10,000 cash prize will be
divided equally among the participants who submit the winning entry. In
addition, Google shall provide each member of the winning team a round trip
ticket for a commercial carrier flight to the San Francisco Bay Area, and will
reimburse each member of the winning team for up to 3 nights stay at a hotel to
be designated by Google, Inc.
Each entrant shall indemnify, defend, and hold Google harmless from any third
party claims arising from or related to that entrant's participation in the
Contest. In no event shall Google be liable to an entrant for acts or omissions
arising out of or related to the Contest or that entrant's participation in the
Contest.
Odds of winning depend on the number and quality of entries received. All taxes,
including income taxes, are the sole responsibility of winners. No prize
substitution is permitted. Winner(s) may be required to verify their entry.
Notification
The winning entry will be announced on the Google.com site by Google Inc. on May
31, 2002. Following the announcement, individual winners will be notified by
e-mail. Winners have 14 days from notification to claim the prize. Prize may be
claimed by return e-mail. Unclaimed prizes will not be awarded.
Questions?
Contact Google Inc. at programming-contest@google.com.
===cut===
Regards,
Max
--- FleetStreet 1.27.3.7
* Origin: (2:5015/60)