Our service was designed to accept email via SMTP and reject the SPAM directly; any remaining mail would be delivered to its intended destination.
In short, there were only two options available for each incoming email, with no middle ground:
The mail would be recognised as SPAM, which would cause it to be rejected at SMTP-time and also filed away in an online quarantine area.
The mail would be delivered to the intended recipient(s).
Each of our users would have significant input on how their email was handled, although there was a common core of tests which were applied to all mail regardless of user preference.
Users could not modify these global tests, but they could disable certain other tests, set up whitelists and blacklists, and make further changes according to their tastes and needs.
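The interaction between the global tests, the optional tests, and the per-user lists can be sketched roughly as follows. This is only an illustration in Python: the test names, the `DomainSettings` class, and the `disposition` function are hypothetical, and the order in which the lists are consulted (blacklist first, then whitelist, with a whitelisted sender skipping all tests) is an assumption rather than a description of how our service behaved.

```python
from dataclasses import dataclass, field

# Hypothetical test names; the real tests are not listed in this section.
GLOBAL_TESTS = {"malformed_headers"}      # always applied, cannot be disabled
OPTIONAL_TESTS = {"dnsbl", "bayesian"}    # users may switch these off


@dataclass
class DomainSettings:
    disabled_tests: set = field(default_factory=set)
    whitelist: set = field(default_factory=set)   # senders always accepted
    blacklist: set = field(default_factory=set)   # senders always rejected

    def active_tests(self):
        # Global tests stay on regardless of what the user tried to disable.
        return GLOBAL_TESTS | (OPTIONAL_TESTS - self.disabled_tests)


def disposition(sender, failed_tests, settings):
    """Return "reject" or "deliver"; there is no third outcome."""
    # Assumed ordering: blacklist, then whitelist, then the active tests.
    if sender in settings.blacklist:
        return "reject"
    if sender in settings.whitelist:
        return "deliver"
    if failed_tests & settings.active_tests():
        return "reject"
    return "deliver"
```

Here `failed_tests` is the set of test names a message tripped. Disabling a test only removes it from the optional set, so an attempt to disable a global test has no effect.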
Graphically our service would look something like this:
Notionally our service was split into three logical parts:
The control panel was the canonical location of the list of hosted domains and the settings which were applied to each of them. The control panel was the main part of our service which users interacted with.
In general updates to the control panel were carried out via a web-browser, but updates could also be achieved via email as described in Appendix E.
(We received considerable interest in the idea of an externally visible API to allow changes to be made but this was not implemented prior to our service termination.)
The quarantine contained a copy of each message which had been rejected for any of our hosted domains, kept for a period of 31 days.
The functionality of the quarantine was pretty minimal as it only needed to support a few operations:
1. The addition of new messages rejected at the satellite MX machines.
2. The expiry of old messages.
3. The presentation & searching of archived messages.
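The three operations above are small enough to sketch in a few lines. The following is an in-memory illustration only: the `Quarantine` class and its method names are hypothetical, and a real quarantine would sit on persistent storage. The only detail taken from the text is the 31-day retention period.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=31)  # retention period stated in the text


class Quarantine:
    def __init__(self):
        self._messages = []  # list of (received_at, domain, raw_text)

    def add(self, domain, raw_text, received_at):
        """Operation 1: store a message rejected by a satellite MX."""
        self._messages.append((received_at, domain, raw_text))

    def expire(self, now):
        """Operation 2: drop messages older than the retention period."""
        cutoff = now - RETENTION
        before = len(self._messages)
        self._messages = [m for m in self._messages if m[0] >= cutoff]
        return before - len(self._messages)  # number of messages removed

    def search(self, domain, term):
        """Operation 3: case-insensitive substring search within one domain."""
        term = term.lower()
        return [raw for _, d, raw in self._messages
                if d == domain and term in raw.lower()]
```

Passing `now` and `received_at` explicitly, rather than reading the clock inside the class, keeps the expiry logic easy to test.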
The quarantine was actually located upon the same physical host as the control-panel, a decision made in the early days of our service when there was just a single machine and never revisited.
The satellite machines were where the filtering of mail actually occurred. Each machine would receive updates from the control panel when domains were added or removed from the service, and when settings for any of the existing domains changed.
The satellite machines would receive SMTP traffic from the Internet and for each incoming mail either archive it to local storage if it were SPAM, or forward it to the final destination if it were good.
The quarantine host would be responsible for pulling rejected messages from the local spool every few hours, at which point the local copy would serve as a backup.
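The flow between a satellite and the quarantine host might be sketched like this. The function names, the `.eml` file naming scheme, and the use of a set to remember which files have been imported are all assumptions made for illustration; the text only says that spam was archived locally, good mail was forwarded, and the quarantine pulled from the spool while the local copy remained as a backup.

```python
from pathlib import Path


def handle_message(raw, is_spam, spool_dir, forward):
    """Archive SPAM to the local spool, or forward good mail onwards."""
    if is_spam:
        spool_dir.mkdir(parents=True, exist_ok=True)
        # Hypothetical naming scheme: a zero-padded counter per spool.
        name = f"{len(list(spool_dir.iterdir())):08d}.eml"
        (spool_dir / name).write_text(raw)
        return "rejected"
    forward(raw)  # caller supplies the delivery function
    return "delivered"


def pull_spool(spool_dir, already_imported):
    """Quarantine-side pull: read new files, leaving the originals in place."""
    imported = []
    for path in sorted(spool_dir.glob("*.eml")):
        if path.name not in already_imported:
            imported.append(path.read_text())
            already_imported.add(path.name)
    return imported
```

Because `pull_spool` never deletes anything, the files left on the satellite play the role of the backup copy, and a repeated pull imports nothing twice.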
In terms of physical organisation we had two classes of machines:
The master machine operated as both the quarantine storage node and the control panel host.
The fact that the machine served two roles did make it a potential single point of failure, but having access to the quarantine locally available to the master control panel and webhost was a significant advantage, so the risk was deemed worthwhile.
As described later, we had multiple machines that existed solely to test incoming email. In theory there was no limit to the number of hosts that could have this job. In practice, managing a large number of machines wasn't seen as a useful thing to do, so the decision was made to add to their number only when necessary.
Early on, the decision was made that it was preferable to have several "large" machines instead of numerous "small" machines. This was primarily a decision made to ease the administrative burden, but also a practical one. Because each satellite MX machine stored rejected messages locally before they were imported into the quarantine, we found that many hosting companies offering low-specification machines wouldn't provide the disk space we required.