28 months ago
kellan : Google: "one trillion words from public Web pages." - note to self, revisit Hadoop
Paul Hammond : Official Google Research Blog: All Our N-gram are Belong to You - We processed 1,011,582,453,213 words of running text and are publishing the counts for all 1,146,580,664 five-word sequences that appear at least 40 times
joshua : Official Google Research Blog: All Our N-gram are Belong to You - i wish this wasn't $150