Hadoop at Twitter (part 1): Splittable LZO Compression

3 months ago

Jeremy Zawodny : Hadoop at Twitter (part 1): Splittable LZO Compression - LZO sounds pretty interesting

nelson : LZO + Hadoop - Great article about using compression to make distributed computation faster

Tags : twitter lzo hadoop mapreduce systems software architecture

CloudCrowd

6 months ago

bmilleare : CloudCrowd - If Carlsberg made worker/job queue servers...

Simon Willison : cloud-crowd - New parallel processing worker/job queue system with a strikingly elegant architecture. The central server is an HTTP server that manages job requests, which are farmed out to a number of node HTTP servers which fork off worker processes to

Tags : scaling cloud-computing mapreduce

Hadoop petabyte sort

10 months ago

nelson : Hadoop petabyte sort - Hadoop sorts a petabyte in 16 hours. Compare Google's 6 hour petabyte sort, but then that's not open source

Tags : hadoop benchmark mapreduce yahoo google

Tom White: "Disks have become tapes"

24 months ago

deusx : Tom White: "Disks have become tapes" - "In essence MapReduce works by repeatedly sorting and merging data that is streamed to and from disk at the transfer rate of the disk. Contrast this to accessing data from a relational database that operates at the seek rate of the disk"

Matthew M. Boedicker : thinking of disks as a sequential device rather than a random access device - (via reddit)

Tags : disks mapreduce scaling tapes

MapReduce whitepaper

26 months ago

nelson : MapReduce whitepaper - Nice summary, link in comments to full paper. Google's apparently crunching the equivalent of 130,000 computers full time.

Tags : google mapreduce systems technology

Yahoo's Doug Cutting on MapReduce and the Future of Hadoop

30 months ago

Jeremy Zawodny : Yahoo's Doug Cutting on MapReduce and the Future of Hadoop - "In this special InfoQ interview Cutting discusses how Hadoop is used at Yahoo, the challenges of its development, and the future direction of the project."

nelson : Hadoop interview - Doug Cutting is one of the smartest programmers I know

Tags : links cluster code cutting distributed grid hadoop lucene mapreduce opensource scalability via:zawodny yahoo

Yahoo Pig and Google Sawzall

35 months ago

Jeremy Zawodny : Yahoo Pig and Google Sawzall - "I have to say, it is good to see Yahoo building these kinds of tools for large scale data manipulation."

nelson : Yahoo Pig and Google Sawzall - massively parallel data crunching platforms

Tags : links api google grid mapreduce pig programming sawzall yahoo

Google: "one trillion words from public Web pages."

44 months ago

kellan : Google: "one trillion words from public Web pages." - note to self, revisit Hadoop

Paul Hammond : Official Google Research Blog: All Our N-gram are Belong to You - We processed 1,011,582,453,213 words of running text and are publishing the counts for all 1,146,580,664 five-word sequences that appear at least 40 times

joshua : Official Google Research Blog: All Our N-gram are Belong to You - i wish this wasn't $150

Tags : public web google research ngram mapreduce big.numbers data ir
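
The counting described in Paul Hammond's excerpt above (tally every five-word sequence, publish only those seen at least 40 times) can be sketched roughly as follows. This is a minimal single-machine illustration, not Google's pipeline: the function name `count_ngrams`, the whitespace tokenization, and the toy input are all assumptions, and the threshold is lowered to 2 so the tiny sample yields a result.

```python
from collections import Counter

def count_ngrams(words, n=5, min_count=40):
    """Count every n-word sequence and keep those seen at least min_count times.

    Hypothetical helper for illustration; the defaults mirror the numbers
    quoted in the excerpt (five-word sequences, at least 40 occurrences).
    """
    counts = Counter(
        tuple(words[i:i + n]) for i in range(len(words) - n + 1)
    )
    return {gram: c for gram, c in counts.items() if c >= min_count}

# Toy corpus (an assumption): the sequence "a b c d e" occurs twice.
text = "a b c d e f a b c d e g"
result = count_ngrams(text.split(), n=5, min_count=2)
print(result)
```

At the scale quoted above the tally would not fit in one process's memory, which is presumably why the entry is tagged mapreduce: the same count-then-filter step splits naturally into a map phase that emits n-grams and a reduce phase that sums them.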

Joel on Software - Can Your Programming Language Do This?

44 months ago

Paul Hammond : Joel on Software - Can Your Programming Language Do This? - By abstracting away the very concept of looping, you can implement looping any way you want

deusx : Can Your Programming Language Do This? - Joel on Software - "I hope you're convinced, by now, that programming languages with first-class functions let you find more opportunities for abstraction, which means your code is smaller, tighter, more reusable, and more scalable."

Tags : programming abstraction functions languages mapreduce google microsoft algorithms
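
The point quoted above, that abstracting away the loop lets you implement looping any way you want, can be sketched in a few lines of Python. This is an illustration under assumptions, not code from the article: `sequential_map`, `threaded_map`, and `sum_of_squares` are hypothetical names, and a thread pool stands in for whatever parallel backend one might actually use.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def sequential_map(fn, items):
    """Apply fn to each item with an ordinary in-order loop."""
    return [fn(x) for x in items]

def threaded_map(fn, items, workers=4):
    """Apply fn to each item on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, items))

def sum_of_squares(mapper, numbers):
    """The caller supplies only functions; how the loop runs is mapper's business."""
    squares = mapper(lambda x: x * x, numbers)
    return reduce(lambda a, b: a + b, squares, 0)

print(sum_of_squares(sequential_map, [1, 2, 3, 4]))
print(sum_of_squares(threaded_map, [1, 2, 3, 4]))
```

Both calls compute the same value; only the looping strategy behind `mapper` changes, and `sum_of_squares` never needs to know which one it was given.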
