Digg's performance has declined recently and it has had some technical problems, but as a pioneering social news site the technology behind it is still worth exploring. Digg engineer Dave Beckett recently published an article on how Digg is built; the main points are summarized here.
One, the services Digg provides
a social news site
for individuals, a personalized, community-based news delivery platform
an advertising platform
an open API platform
a blog and documentation system
Two, Digg's core functions
Article submission: users submit valuable information they have discovered.
Article lists: submitted articles are listed along different dimensions (for example personalized or most recent).
Article operations: users can perform various actions on an article, including reading, clicking, digging, commenting, and rating, as well as similar actions on comments.
Top articles: Digg sometimes promotes popular articles to the Top section of the Digg home page so that more people can see them.
Three, the implementation behind Digg's features
The original article first uses a flowchart to show which Digg modules are involved when an ordinary user operates the site.
These operations fall into two main parts: synchronous and asynchronous.
Synchronous, immediate responses to users: the synchronous part gives fast, real-time responses to user requests (including API requests), including asynchronous AJAX requests made from within a page. These operations usually take at most a second or two to complete.
Offline, asynchronous batch computation: in addition to responding to requests in real time, the system sometimes needs to run batch computing jobs. These jobs may be triggered indirectly by users, but users do not wait for them. Such asynchronous computations may take seconds, minutes, or even hours.
A second chart in the original article shows how the Digg application divides into these two parts.
That is the general overview; the following sections look at each part of Digg in more depth.
1. Online system
These are the components that serve web pages and the API: the front-end CMS is built in PHP, and the API servers are written in Python and run on Tornado. Both talk to the main storage layer over the Thrift protocol, and a lot of data is kept in in-memory stores such as Memcached and Redis.
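The role of the in-memory stores in front of the storage layer can be sketched as a read-through lookup. This is only a minimal illustration, not Digg's code: `TTLCache`, `StoryService`, the `story:` key prefix, and the 30-second TTL are all hypothetical names and values, and a plain dictionary stands in for both the cache server and the Thrift-fronted backend.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-process stand-in for a Memcached/Redis-style cache (hypothetical)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Drop expired entries, as a cache TTL would.
            del self._items[key]
            return None
        return value

    def set(self, key: str, value: dict) -> None:
        self._items[key] = (self._clock() + self._ttl, value)


class StoryService:
    """Hypothetical read path: check the cache first, then the backend."""

    def __init__(self, cache: TTLCache, backend: Dict[str, dict]) -> None:
        self._cache = cache
        self._backend = backend  # stands in for the storage layer behind Thrift
        self.backend_reads = 0   # counts how often the slow path is taken

    def get_story(self, story_id: str) -> Optional[dict]:
        key = f"story:{story_id}"
        cached = self._cache.get(key)
        if cached is not None:
            return cached
        self.backend_reads += 1
        story = self._backend.get(story_id)
        if story is not None:
            # Only found stories are cached; misses always hit the backend.
            self._cache.set(key, story)
        return story


if __name__ == "__main__":
    service = StoryService(TTLCache(ttl_seconds=30.0),
                           {"42": {"title": "How Digg is Built"}})
    service.get_story("42")
    service.get_story("42")
    print(service.backend_reads)
```

In this sketch the second read of the same story is served from the cache, so the backend is only consulted once per TTL window.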
2. Messaging system
Digg uses RabbitMQ as its queuing system. Operations that do not need a synchronous response are put on the queue and handled asynchronously.
3. Asynchronous batch processing system
The messaging system above is the queue itself; this part is what takes jobs off the queue and executes them. It removes a job from the queue, performs the required computation, and then writes to the primary storage. The real-time system and the asynchronous batch system operate on the same primary storage.
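The hand-off between the online system, the queue, and the batch workers can be sketched as follows. This is a simplified illustration under stated assumptions: Python's standard `queue.Queue` stands in for RabbitMQ, a dictionary stands in for the primary storage, and `handle_digg_request`, `worker`, `run_demo`, and the `recount` job type are hypothetical names, not anything from Digg's code.

```python
import queue
import threading
from typing import Dict

_STOP = None  # sentinel that tells the worker to exit


def handle_digg_request(jobs: queue.Queue, story_id: str) -> dict:
    """Synchronous path: enqueue the slow work and answer the user at once."""
    jobs.put({"type": "recount", "story_id": story_id})
    return {"status": "accepted", "story_id": story_id}


def worker(jobs: queue.Queue, store: Dict[str, int],
           lock: threading.Lock) -> None:
    """Asynchronous path: take jobs off the queue and update primary storage."""
    while True:
        job = jobs.get()
        try:
            if job is _STOP:
                return
            with lock:
                story_id = job["story_id"]
                store[story_id] = store.get(story_id, 0) + 1
        finally:
            # Mark the item as processed so jobs.join() can return.
            jobs.task_done()


def run_demo() -> Dict[str, int]:
    jobs: queue.Queue = queue.Queue()
    store: Dict[str, int] = {}  # stand-in for the shared primary storage
    lock = threading.Lock()
    thread = threading.Thread(target=worker, args=(jobs, store, lock),
                              daemon=True)
    thread.start()

    for story_id in ("a", "b", "a"):
        handle_digg_request(jobs, story_id)

    jobs.put(_STOP)  # ask the worker to stop after the queued jobs
    jobs.join()      # wait until every queued item has been processed
    thread.join()
    return store


if __name__ == "__main__":
    print(run_demo())
```

The request handler returns as soon as the job is queued, which matches the idea that users do not wait for batch work. With a real broker the worker would normally run in a separate process or on a separate machine.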
4. Data storage layer
Digg's data storage layer uses several products for different purposes. The specific list is:
Cassandra: stores data such as articles, users, and Digg activity records. We use Cassandra 0.6. Because version 0.6 does not provide secondary indexes, we build the indexes in the application layer and then store them in Cassandra. For example, our data layer provides an interface for looking up user information by user name and by email address.
HDFS: mainly used to store log information and to run computations over it, using Hive to drive Hadoop MapReduce jobs.
MogileFS: a distributed file storage system for binary files such as user avatars and screenshots. Of course, a CDN sits on top of this unified file store.
MySQL: At present, some of the data for our Top stories feature is stored in MySQL, because this feature needs a lot of JOIN operations. HBase also looks like a good option to consider.
Redis: Because Redis offers high performance and flexible data structures, we use it to provide storage for the Digg Streaming API. We also use Redis to build real-time view and click counters.
SOLR: used to build the full-text indexing system, which provides full-text search over article content, topics, and so on.
Scribe: a log collection system that is more powerful and simpler than syslog-ng. The logs it collects are put into HDFS for analysis and computation.
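The application-layer indexing mentioned in the Cassandra entry can be sketched like this. It is a minimal illustration, not Digg's data layer: `UserStore` and its methods are hypothetical, plain dictionaries stand in for Cassandra column families, and user names and email addresses are assumed to be unique.

```python
from typing import Dict, Optional


class UserStore:
    """Hypothetical data-layer sketch: lookups by name or email without
    store-side secondary indexes. Dicts stand in for column families."""

    def __init__(self) -> None:
        self._users: Dict[str, dict] = {}    # primary rows, keyed by user id
        self._by_name: Dict[str, str] = {}   # index: user name -> user id
        self._by_email: Dict[str, str] = {}  # index: lowercased email -> user id

    def save(self, user_id: str, name: str, email: str) -> None:
        old = self._users.get(user_id)
        if old is not None:
            # Remove stale index entries before writing the new ones.
            self._by_name.pop(old["name"], None)
            self._by_email.pop(old["email"].lower(), None)
        self._users[user_id] = {"id": user_id, "name": name, "email": email}
        self._by_name[name] = user_id
        self._by_email[email.lower()] = user_id

    def get_by_name(self, name: str) -> Optional[dict]:
        user_id = self._by_name.get(name)
        return self._users.get(user_id) if user_id is not None else None

    def get_by_email(self, email: str) -> Optional[dict]:
        user_id = self._by_email.get(email.lower())
        return self._users.get(user_id) if user_id is not None else None


if __name__ == "__main__":
    store = UserStore()
    store.save("u1", "alice", "alice@example.com")
    print(store.get_by_email("ALICE@example.com"))
```

The cost of this approach is that every write has to update the index entries as well as the primary row, and in a real distributed store those writes are not atomic, so the application also has to tolerate or repair stale index entries.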
5. Operating system and configuration
Digg runs on Debian-based GNU/Linux servers, which we configure with Clusto and Puppet, together with a configuration system built on ZooKeeper.
Translation link: http://blog.nosqlfan.com/html/1575.html
Original link: http://about.digg.com/blog/how-digg-is-built