[nycphp-announce] FW: [nylug-announce] NY Linux Users Grp. 29 Mar. Meeting: Craig Nevill-Manning (Google) on Finding Needles in a 20 Terabyte Haystack

New York PHP noreply at nyphp.org
Wed Mar 16 23:44:45 EST 2005


Interesting meeting by NYLUG at our usual meeting space at IBM... remember to RSVP according to NYLUG's requirements (this is not a NYPHP meeting).


---
New York PHP
http://www.nyphp.org

AMP Technology
Supporting Apache, MySQL and PHP



> March 29th, 2005
> Tuesday
> 6:30PM-8:00PM
> IBM Headquarters Building
> 590 Madison Avenue at 57th Street
> 12th Floor, home to the IBM Linux Center of Competency
> 
> ** RSVP Instructions **
>     NEW POLICY: You must R.S.V.P. for *EVERY* meeting.
>     Register at http://rsvp.nylug.org/
>     Check in with photo ID at the lobby for badge and room number.
> 
> 
>                         Craig Nevill-Manning (Google)
>                                      -on-
>                   Finding Needles in a 20 Terabyte Haystack
> 
> 
>    Due to scheduling and venue problems, this month's meeting will be on
>    Tuesday, 29 March. Please mark your calendars. If you can help with
>    a modern (projector, connectivity), large, regular space, we would
>    like to hear from you.
> 
>    What to think when a company's name becomes a verb? When, through word of
>    mouth and no paid advertising, it becomes commonplace? We are witnessing
>    something special, no doubt, a rare bird. We are speaking of Google,
>    Inc., of course: the preeminent global entity in Net search.
> 
>    On Tuesday, March 29, Craig Nevill-Manning of Google will give a
>    presentation to the New York Linux Users Group entitled "Finding
>    Needles in a 20 Terabyte Haystack: 200 million times per day."
> 
>    In Craig's own words: ``Google faces two large technical challenges:
>    Ensuring that our search results are as relevant as possible, and
>    serving hundreds of millions of queries in a fraction of a second each
>    at a reasonable cost. To solve the first problem we perform an offline
>    matrix computation to produce PageRank, a query independent measure of
>    page reputation, and combine it with more traditional query-specific
>    scoring. To solve the distributed computing problem, we use tens of
>    thousands of commodity PCs and highly fault-tolerant software. I will
>    discuss some details of these solutions, and also share some interesting
>    statistical tidbits about search and the web.''
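
The "offline matrix computation" mentioned in the abstract above can be illustrated with a small power-iteration sketch. This is only an illustration under stated assumptions, not Google's implementation: the function name `pagerank`, the toy link graph, the damping factor of 0.85, and the fixed iteration count are all chosen here for the example and do not come from the announcement.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web used only for this example.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
```

The result is query independent: it depends only on the link structure, so it can be computed ahead of time and combined with query-specific scoring when a search arrives.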
> 
>    Google has taken an unorthodox approach to its mission, and it has paid
>    off handsomely. To excerpt a passage from a developerpipeline.com
>    article:
> 
>      To search the [Google] index quickly, Google breaks it "into pieces
>      called shards," scattered across servers so they may be searched in
>      parallel, each server coming up with part of the answer to a question
>      and feeding it back for aggregated results.
> 
>      Google's file system, indexing technology, and grid of commodity
>      servers allow it to achieve search times of a quarter of a second on
>      a typical query. The replication and constant heartbeat messaging
>      built into the file system gives it high reliability and
>      availability, he noted.
> 
>      In addition, as Google servers parse queries, they break them down
>      into smaller tasks and make one trip to the database for a result
>      that may satisfy many users. The process is called "map reduction."
>      Hoelzle said Google once "lost 1,800 of 2,000 map-reduction machines
>      in a large-scale maintenance incident." Because of the load balancing
>      built into the system, Google still completed all queries by steering
>      uncompleted tasks to the machines that showed they had processing
>      power.
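
The sharding idea in the excerpt above (split the index into pieces, search the pieces in parallel, aggregate the partial answers) can be sketched as follows. This is a toy model under stated assumptions: the helper names `build_shards`, `search_shard`, and `search`, the assignment of documents to shards by integer ID, and the use of threads in place of separate servers are all hypothetical choices for the example, not details from the article.

```python
from concurrent.futures import ThreadPoolExecutor


def build_shards(documents, shard_count):
    """Split documents across shards; each shard keeps its own inverted index.

    `documents` maps an integer document ID to its text.
    """
    shards = [{} for _ in range(shard_count)]
    for doc_id, text in documents.items():
        index = shards[doc_id % shard_count]  # assumed placement rule
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)
    return shards


def search_shard(index, word):
    """Return the IDs of documents in one shard that contain the word."""
    return index.get(word.lower(), set())


def search(shards, word):
    """Query every shard in parallel and merge the partial results.

    Assumes at least one shard exists.
    """
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda index: search_shard(index, word), shards)
        return sorted(set().union(*partials))


# Hypothetical documents used only for this example.
docs = {
    1: "linux users group",
    2: "google search index",
    3: "search the web",
    4: "php meetup",
}
shards = build_shards(docs, 2)
hits = search(shards, "search")
```

Each shard answers only for the documents it holds, so the final step is a merge of the partial answers, which is the "aggregated results" step the article describes.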
> 
>    This will be a highly attended meeting; space is limited.
> 
> For More Information Visit:
> 
>      * developerpipeline.com article
>         http://developerpipeline.com/showArticle.jhtml?articleId=60404907
>      * Interesting projects coming out of Google Labs
>         http://labs.google.com/
>      * A paper on the Google File System
>         http://www.cs.rochester.edu/sosp2003/papers/p125-ghemawat.pdf
>      * A paper on the Google MapReduce system
>         http://labs.google.com/papers/mapreduce.html
> 
> About Craig Nevill-Manning:
> 
>    Dr. Craig Nevill-Manning is a Senior Staff Research Scientist and New
>    York Engineering Director at Google. While at Google, he has led the
>    development team for Froogle, a product search engine. Prior to his four
>    years at Google, Dr. Nevill-Manning was an assistant professor in the
>    Computer Science Department at Rutgers University and a postdoctoral
>    fellow at Stanford University.
> 
> Swag (Give Away) - During the meeting... unusually terrific swag of
>    non-predetermined origin will be given out to all attendees at the
>    regular meeting for free as usual.
> 
> Stammtisch
>     After the meeting ... Join us around 8:30pm or so at TGI Friday's,
>     located at 677 Lexington Avenue and 56th Street, second floor.
>     Northeast corner.
> 
> Please see our home page at http://www.nylug.org for the HTMLized
> version of this announcement, our archives, and a lot of other good
> stuff.
> 
> Monthly Reminder!
>     Please read the NYLUG-Talk Posting Guidelines at:
>     http://www.nylug.org/mlistguide/
> 
> ________________________________________________________________________
> March 2005 - The New York Linux Users Group, NYLUG.org
> ______________________________________________________________________
> Hire expert Linux talent by posting jobs here :: http://jobs.nylug.org
> nylug-announce mailing list
> nylug-announce at nylug.org
> http://nylug.org/mailman/listinfo/nylug-announce



