...
Component | Description |
---|---|
Content fetchers/crawlers | A set of background applications/daemons that fetch drops from the various sources, e.g. RSS, Twitter, and email. |
Metadata extractors | Perform semantic extraction (named entity extraction, followed by geocoding of any place names encountered) and media extraction (links and images) once the drops have been fetched and structured by the content fetchers. |
Data merger / Drop queue processor | Keeps track of each drop as it comes in from the content fetchers and forwards it to the various pre-processing stages: semantic extraction, media extraction, and rules processing. Once a drop has gone through all pre-processing stages, it is reassembled and posted to the API for final storage in the DB. NOTE: While drops are undergoing pre-processing, they're maintained in a persistent RabbitMQ queue. |
API | Posts and retrieves data to/from the database, handles user authentication and authorization, updates the search index |
Search Server | Handles all full-text and geo search functions and is periodically updated (by default, every 30s) with any new data |
UI (web) client | A web application for interacting with the API; fetches data and presents it to the user |
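
The drop queue processor's bookkeeping, as described in the table, can be sketched roughly as follows. This is a minimal illustration only, not the project's actual code: the class and callback names (`DropQueueProcessor`, `publish`, `post_to_api`) are hypothetical, the stages are assumed to run independently of one another, and an in-memory dictionary stands in for the persistent RabbitMQ queue.

```python
from dataclasses import dataclass, field

# Stage names come from the table above; all other names are hypothetical.
STAGES = ("semantic", "media", "rules")


@dataclass
class PendingDrop:
    """A drop plus the set of pre-processing stages it is still waiting on."""
    drop: dict
    remaining: set = field(default_factory=lambda: set(STAGES))


class DropQueueProcessor:
    def __init__(self, publish, post_to_api):
        # publish(stage, drop_id, drop): hands a drop to one stage (assumed callback).
        # post_to_api(drop): stores the finished drop (assumed callback).
        self._publish = publish
        self._post_to_api = post_to_api
        # In-memory stand-in for the persistent queue mentioned in the table.
        self._pending = {}

    def on_new_drop(self, drop_id, drop):
        """Register a drop from a content fetcher and forward it to every stage."""
        self._pending[drop_id] = PendingDrop(drop=dict(drop))
        for stage in STAGES:
            self._publish(stage, drop_id, drop)

    def on_stage_result(self, drop_id, stage, result):
        """Merge one stage's output; return True once the drop has been posted."""
        entry = self._pending.get(drop_id)
        if entry is None or stage not in entry.remaining:
            return False  # unknown drop or duplicate result: ignore it
        entry.drop.update(result)
        entry.remaining.discard(stage)
        if entry.remaining:
            return False  # still waiting on at least one stage
        del self._pending[drop_id]
        self._post_to_api(entry.drop)  # reassembled drop goes to the API
        return True
```

In this sketch a drop is posted exactly once, after the last outstanding stage reports back, and repeated results for a stage that has already completed are ignored.
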
...