Architecture#
This page explains the technical details of Okami's architecture.
Process#
Initialising#
- Spider object is created and settings are loaded
- Pipelines and middlewares are loaded and initialised
- Startup pipeline is executed and finished
- Start page from spider is queued
The scraping process then starts.
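The initialisation order can be sketched in a few lines of Python. This is only an illustration under assumptions: `ExampleSpider`, `StartupPipeline`, `initialise`, the settings dictionary and the start URL are hypothetical names invented for the sketch, not Okami's actual API.

```python
from collections import deque


class StartupPipeline:
    """Hypothetical stand-in for a startup pipeline; not Okami's real class."""

    def __init__(self):
        self.finished = False

    def run(self, spider):
        # Runs to completion before any page is queued.
        self.finished = True


class ExampleSpider:
    """Hypothetical spider holding only what the steps above mention."""

    start_url = "http://example.com/"  # assumed start page

    def __init__(self, settings):
        self.settings = settings


def initialise(settings):
    spider = ExampleSpider(settings)    # spider created, settings loaded
    startup = StartupPipeline()         # pipelines and middlewares loaded
    startup.run(spider)                 # startup pipeline executed and finished
    queue = deque([spider.start_url])   # start page queued
    return spider, startup, queue


spider, startup, queue = initialise({"throttle": 0.5})
```

The point of the ordering is that the queue is only filled once the startup pipeline has finished, so scraping never begins against a half-initialised spider.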
Processing#
- Request is created with a Task object
- Request is passed through http middleware before cycle
- Downloader processes Request and creates Response
- Response is passed through http middleware after cycle
- Errors from request/response cycle and throttling are handled
- Task and Response are passed through spider middleware before cycle
- Spider processes Task and Response and creates a list of new Task and Item objects
- Task, Response, list of Task and Item objects are passed through spider middleware after cycle
- List of Item objects is passed through items pipeline cycle
- List of Task objects is passed through tasks pipeline cycle
The processing steps are repeated for every task or page until none remain.
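As a rough illustration of the cycle above, the following Python sketch runs the same sequence of stages against a tiny in-memory site. Everything in it is an assumption made for the example: the `Task`, `Item`, `Request` and `Response` classes are simplified stand-ins, `SITE`, `download`, `parse` and `crawl` are hypothetical names, every middleware and pipeline is a pass-through that only records its name, and the duplicate check on queued tasks is added so the sketch terminates (the text above does not say how Okami handles revisits).

```python
from collections import deque
from dataclasses import dataclass


# All names below are hypothetical stand-ins, not Okami's actual classes.
@dataclass(frozen=True)
class Task:
    url: str


@dataclass(frozen=True)
class Item:
    source: str


@dataclass
class Request:
    task: Task


@dataclass
class Response:
    request: Request
    links: list


# A tiny in-memory "site": each page maps to the links found on it.
SITE = {"/": ["/a", "/b"], "/a": ["/"], "/b": ["/missing"]}


def download(request):
    # Raises KeyError for unknown pages, standing in for a failed request.
    return Response(request, SITE[request.task.url])


def parse(task, response):
    # The spider turns one Task and Response into new Tasks and Items.
    return [Task(url) for url in response.links], [Item(task.url)]


def crawl(start_url):
    queue, seen = deque([Task(start_url)]), {start_url}
    collected, trace = [], []

    def stage(name, value):
        # Pass-through middleware or pipeline: records its name, returns input.
        trace.append(name)
        return value

    while queue:
        task = queue.popleft()
        request = stage("http before", Request(task))
        try:
            response = stage("http after", download(request))
        except KeyError:
            trace.append("error handled")  # throttling would also live here
            continue
        task, response = stage("spider before", (task, response))
        tasks, items = stage("spider after", parse(task, response))
        collected.extend(stage("items pipeline", items))
        for new_task in stage("tasks pipeline", tasks):
            if new_task.url not in seen:   # assumption: skip queued URLs
                seen.add(new_task.url)
                queue.append(new_task)
    return collected, trace


items, trace = crawl("/")
```

Inspecting `trace` after the run shows the stage order for each task, and the loop ends once the queue is empty, which corresponds to the "until none remain" condition.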
Finalising#
- Session object is closed
- Pipelines and middlewares are finalised
Okami terminates.
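The shutdown order can be sketched the same way. The `Session` and `Component` classes and the `finalise` function are hypothetical names used only for this illustration; the sketch assumes each pipeline and middleware exposes some kind of finalise hook.

```python
class Session:
    """Hypothetical stand-in for the session object."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class Component:
    """Hypothetical pipeline or middleware with a finalise hook."""

    def __init__(self, name):
        self.name = name
        self.finalised = False

    def finalise(self):
        self.finalised = True


def finalise(session, components):
    session.close()                # session object is closed first
    for component in components:   # then pipelines and middlewares
        component.finalise()


session = Session()
components = [Component("items pipeline"), Component("http middleware")]
finalise(session, components)
```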