Feedshot and musing about Spreading out web UI

I was struck by an urge to resume working on my RSS fetcher this weekend. Or maybe by a bus. I’m minding my own business and suddenly it seems very important to finish building the fetcher. It’s still not complete but it’s a bit further along. I implemented a bunch of infrastructure to allow me to do XML-RPC calls over the Spread bus, implemented some queuing and scheduling code (for timeouts), and integrated Spread messaging with my existing fetch event-loop. Most of the time, though, went into designing and sketching out the API for how the UI and IMAP sink will interact with the new fetcher.

The fetcher service connects to my Spread segment and exposes an XML-RPC API. It also broadcasts new/updated article notifications. The XML-RPC calls will allow other parts of the system to ask the fetcher to send new/updated article notifications to a Spread group. There are calls for subscribe/unsubscribe, retrieving previously fetched articles (to allow a new service to catch up to what’s been seen), and listing which feeds are subscribed to.

The eventual plan is to have a web UI for managing subscriptions that talks to both the fetcher and the IMAP sink. The IMAP sink (which exists now in a crude form) will be expanded to also expose a management API (so the UI can update it, instead of me editing the script by hand) to direct articles into different IMAP folders based on filtering rules (e.g. a feed category). This phase will just use the existing IMAP sink and flesh out the fetcher service and a crude web UI. I’m tired of editing a config file to subscribe to new feeds, and I want the fetcher to adapt to the update frequency of a feed so it doesn’t keep hitting a feed every hour when it gets updated every 3 weeks. The fetcher will also spread fetching out over the day so my connection isn’t slammed; the current version hits every feed (with etag/last-modified, so most of them are light) whenever it runs out of cron.
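To make the XML-RPC-over-a-message-bus idea concrete, here is a minimal sketch using only Python's standard `xmlrpc.client` marshalling. It is not the actual fetcher code: the `FetcherService` class, its method names, and the helper functions are all hypothetical, and the Spread transport is left out entirely (the byte strings are simply passed between functions where a real system would send them as messages).

```python
import xmlrpc.client


class FetcherService:
    """Hypothetical stand-in for the fetcher's subscription state."""

    def __init__(self):
        self.feeds = set()

    def subscribe(self, url):
        self.feeds.add(url)
        return True

    def unsubscribe(self, url):
        self.feeds.discard(url)
        return True

    def list_feeds(self):
        return sorted(self.feeds)


def encode_call(method, *params):
    """Serialize a call into an XML-RPC request body, as bytes for the bus."""
    return xmlrpc.client.dumps(params, methodname=method).encode("utf-8")


def handle_message(service, payload):
    """Decode a request taken off the bus, run it, and encode the reply."""
    params, method = xmlrpc.client.loads(payload.decode("utf-8"))
    handler = None
    if method and not method.startswith("_"):
        handler = getattr(service, method, None)
    if handler is None:
        # Unknown or private method: reply with an XML-RPC fault.
        fault = xmlrpc.client.Fault(1, "unknown method: %s" % method)
        return xmlrpc.client.dumps(fault, methodresponse=True).encode("utf-8")
    result = handler(*params)
    return xmlrpc.client.dumps((result,), methodresponse=True).encode("utf-8")


def decode_reply(payload):
    """Return the single result value; raises xmlrpc.client.Fault on error."""
    (result,), _ = xmlrpc.client.loads(payload.decode("utf-8"))
    return result
```

A caller would build a request with `encode_call("subscribe", url)`, send it to the fetcher's group, and pass whatever comes back to `decode_reply`. Matching replies to requests (for example with a request id and a private reply group) is a separate problem this sketch does not address.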

For the web UIs, I’ve been thinking about writing an adapter that will speak FastCGI on one side (to Apache) and FastCGIish-over-Spread on the other side. The idea would be to let me easily map a URL space on the Apache side to “a bunch of web services that run in the Spread segment”. The FastCGI code would broadcast a message when it got enough of the request to begin dispatch (headers, path, hostname) and whichever service answered first would handle the request (which would include receiving the rest of the FastCGI traffic for the request). I suppose one could use this for load balancing, but I just want it to be easy to prototype and hack on things without having to kick Apache or mod_fastcgi’s wrapper. The adapter script can be very robust and relatively simple and delegate all the ~~flakiness~~ bleeding-edge code to other processes. If the handler flakes out, the adapter can close out the request with an appropriate message without freaking out the FastCGI connector code in Apache.
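The "first service to answer wins" dispatch described above can be sketched in a few lines. This is a toy, in-process model and makes several assumptions: the `Adapter` class and its `register`/`dispatch` methods are invented for illustration, threads stand in for separate processes on the Spread segment, and the 404/502 status codes are just one plausible choice for "an appropriate message".

```python
import queue
import threading


class Adapter:
    """Toy adapter: offers a request to every handler and hands it to
    whichever one claims it first."""

    def __init__(self, claim_timeout=0.5):
        # Hypothetical registry; in the design above this would be
        # whoever is listening on the Spread group.
        self.handlers = []
        self.claim_timeout = claim_timeout

    def register(self, name, wants, handle):
        self.handlers.append((name, wants, handle))

    def dispatch(self, request):
        claims = queue.Queue()

        def offer(name, wants, handle):
            # Each service decides for itself whether to claim the request.
            if wants(request):
                claims.put((name, handle))

        # "Broadcast" the request summary to every registered service.
        for entry in self.handlers:
            threading.Thread(target=offer, args=entry, daemon=True).start()

        try:
            # The first claim to arrive wins; later claims are ignored.
            name, handle = claims.get(timeout=self.claim_timeout)
        except queue.Empty:
            return 404, "no service claimed %s" % request["path"]

        try:
            return 200, handle(request)
        except Exception as exc:
            # A flaky handler must not take the adapter down with it.
            return 502, "handler %s failed: %s" % (name, exc)
```

For example, after `adapter.register("feeds", lambda r: r["path"].startswith("/feeds"), lambda r: "feed list")`, a call to `adapter.dispatch({"path": "/feeds/"})` is routed to that handler, while a path nobody claims times out and gets the fallback response. The point of the design is visible in the two `except` branches: the adapter always finishes the request itself, whatever the handler does.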

One problem I’ve run into with Singleshot, and haven’t figured out how to deal with, is how to easily do the edit/compile/run cycle with Singleshot as a FastCGI process. When there’s a bad enough bug, Singleshot doesn’t respond and mod_fastcgi gets angry and stops talking to it. Then I have to kick Apache to reset things. For much of the time I was developing with Singleshot in CGI mode, but to work on some things I have to run Singleshot in the mode I actually use it in, because running as a persistent process is not the same thing as running as a CGI script.

Mostly, though, I just want to see if it turns out to be a good idea or makes anything easier. The idea of easily being able to map a URL space onto a bunch of different cooperating processes seems like it could make hacking up little web toys easier. Eventually one could imagine moving the adapter into an Apache module (or making it the entire web server). The only downside is that none of it would be deployable to a typical web host in this form, but obviously I’d make the meat of each web app use WSGI or something similar rather than knowing about all the adapter magic, the same way Singleshot can run as either CGI or FastCGI.
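As a rough illustration of keeping the app ignorant of its transport, here is a minimal WSGI sketch using only the standard library's `wsgiref`. The `app`, `call_directly`, and `run_as_cgi` names are made up for this example; the Spread-side adapter would need its own small driver, which is not shown.

```python
from wsgiref.handlers import CGIHandler
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """The 'meat' of the web app: knows only about WSGI."""
    body = ("hello from %s\n" % environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_directly(path):
    """Drive the app with no server at all, the way a custom adapter might."""
    environ = {}
    setup_testing_defaults(environ)   # fill in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


def run_as_cgi():
    """Serve the same app as a plain CGI script."""
    CGIHandler().run(app)
```

The same `app` callable is used in both cases; only the thin driver around it changes, which is what would keep such an app deployable on an ordinary web host.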

This project is only accidentally really useful to me, in that I no longer use any other RSS readers. The actual goal has been to play with some technologies in an app small enough to actually build but big enough to learn something. I can only go so far with reading papers about a topic.