Munin
Reddit Bot Overview
This blog post will explore the use of HTTP lanes in munin, a streaming data application built on SwimOS. munin is a real-time application that assigns answers to every submission on r/WhatsThisBird based on comments. It analyzes most comments internally but also makes external API calls (to eBird.org) for some. The application pushes every derived answer to a PostgreSQL database and interacts with Reddit by posting comments on submissions to display the ongoing analysis.
Key Functionalities of munin
- Real-time Categorization of Submissions:
- munin categorizes r/WhatsThisBird submissions into answered, unanswered, reviewed, and unreviewed, based on the analysis of comments.
- The categorization is accessible through HTTP lanes and Swim-cli commands, providing detailed insights into each submission.
- munin Health Awareness:
- munin interacts with multiple endpoints, including Reddit’s comment and submission fetch endpoints, eBird API, and a PostgreSQL database.
- The system logs errors and issues encountered during these interactions, offering granular information about the distributed system’s health.
Project Structure
The swim.munin.swim
package contains Swim-relevant logic for general-purpose Reddit applications. The swim.munin.filethesebirds
package and its subpackages contain application logic specific to FileTheseBirdsBot. Notice that within this package, there are swim.munin.filethesebirds.connect
and swim.munin.filethesebirds.swim
subpackages here that similarly breakdown into general purpose and application-specific implementations.
A design goal of munin is to provide a clean separation of concerns and make it easy to support a different subreddit. You’ll see that the concrete Web Agents in swim.munin.filethesebirds.swim
simply extend classes in swim.munin.swim
, and that while the former depends on the latter, the latter is completely unaware of the former. To make a custom app without FileTheseBirds logic, you only need the latter package.
munin Web Agents at a Glance
SubmissionsAgent
(Manager of Submissions):- Acts as an overarching manager for all active subreddit submissions.
- Categorizes submissions, tracks their status, and manages their lifecycle.
- Provides HTTP endpoints for external access to submission data, facilitating broader application integration.
SubmissionAgent
(Individual Submission Handler):- Each instance manages a specific submission.
- Tracks detailed information about the submission, including status changes and associated comments.
- Can be dynamically instantiated as needed, allowing efficient handling of multiple submissions concurrently.
PublishingAgent
(Comment Interaction Specialist):- Manages the creation, updating, and deletion of comments on submissions.
- Handles comment-related interactions in response to changes in the submission’s state.
- Efficiently manages Reddit API constraints through a queuing system.
CommentsFetchAgent
(Comment Fetcher and Router):- Fetches new comments and routes them to the relevant
SubmissionAgent
. - Filters comments based on submission status (e.g., shelving expired submissions).
- Operates periodically, ensuring the system is up-to-date with the latest comments.
- Fetches new comments and routes them to the relevant
SubmissionsFetchAgent
(Initial Submission Acquirer):- Specialized in fetching new submissions from the subreddit
r/WhatsThisBird
. - Inserts updated information about active submissions and shelves submissions that are deleted or removed.
- Interacts with the vault for data storage, upserting information about active submissions and deleting shelved ones.
- Uses a command lane to preemptively manage submissions and a timer to schedule regular fetching operations, ensuring timely updates.
- Specialized in fetching new submissions from the subreddit
Serving SwimOS Streams as Plain REST Data
Serving real-time SwimOS streams as plain REST data is simple and straightforward. In SwimOS, an HttpLane is a type of lane that can respond to HTTP requests. This is analogous to a REST endpoint in a traditional web service framework. HTTP lanes are simple and integrate seamlessly with the real-time data handling capabilities of SwimOS. In munin, four such lanes are utilized:
- https://munin.swim.services/submissions?lane=api/answered
- https://munin.swim.services/submissions?lane=api/unanswered
- https://munin.swim.services/submissions?lane=api/reviewed
- https://munin.swim.services/submissions?lane=api/unreviewed
Let’s look at the flow for api/unanswered
:
- HTTP Request Handling:
- An HTTP GET request is made to https://munin.swim.services/submissions?lane=api/unanswered.
- The SubmissionsAgent routes this request to the unansweredApiDoRespond method.
- Data Retrieval and Response:
unansweredApiDoRespond
calls a method in SubmissionsAgentLogic to retrieve the current state of unanswered submissions.- The logic determines which submissions are unanswered and formats them as a response.
- Returning the Response:
- The formatted data is returned as an HttpResponse object, which is then sent back to the client.
The corresponding code in SubmissionsAgent
looks like this:
@SwimLane("unanswered")
protected MapLane<Long, Value> unanswered = mapLane();
@SwimLane("unreviewed")
protected MapLane<Long, Value> unreviewed = mapLane();
@SwimLane("answered")
protected MapLane<Long, Value> answered = mapLane();
@SwimLane("reviewed")
protected MapLane<Long, Value> reviewed = mapLane();
@SwimLane("api/unanswered")
protected HttpLane<Value> unansweredApi = this.<Value>httpLane()
.doRespond(this::unansweredApiDoRespond);
@SwimLane("api/unreviewed")
protected HttpLane<Value> unreviewedApi = this.<Value>httpLane()
.doRespond(this::unreviewedApiDoRespond);
@SwimLane("api/answered")
protected HttpLane<Value> answeredApi = this.<Value>httpLane()
.doRespond(this::answeredApiDoRespond);
@SwimLane("api/reviewed")
protected HttpLane<Value> reviewedApi = this.<Value>httpLane()
.doRespond(this::reviewedApiDoRespond);
The objective here is to provide an HTTP API endpoint in the SubmissionsAgent
that, when requested, returns an HTML page listing all unanswered submissions. This is achieved by:
- Delegating the processing of the HTTP request to a specialized logic class (
SubmissionsAgentLogic
), following a clean separation of concerns. - Utilizing the
unanswered
MapLane inSubmissionsAgent
to access the relevant data (unanswered submission IDs). - Dynamically creating an HTML response that formats each unanswered submission as a link.
In the SubmissionsAgent, the call to unansweredApiDoRespond
is delegated to SubmissionsAgentLogic
:
- The method
unansweredApiDoRespond
is defined to handle HTTP requests specifically aimed at fetching data about unanswered submissions. - When an HTTP request is received, this method delegates the processing of the request to
SubmissionsAgentLogic.unansweredApiDoRespond
.
HttpResponse<?> unansweredApiDoRespond(HttpRequest<Value> request) {
return SubmissionsAgentLogic.unansweredApiDoRespond(this, request);
}
As for the SubmissionsAgentLogic implementation, it performs the following tasks:
- HTTP Response Creation (
unansweredApiDoRespond
):- This static method takes the
SubmissionsAgent
instance and the HTTP request as parameters. - It creates an
HttpResponse
object with a status ofHttpStatus.OK
, indicating a successful request. - The body of the response is generated by calling the
body
method, which formats the content as HTML.
- This static method takes the
- Data Retrieval and Formatting (
body
Method):- The
body
method is responsible for formatting the response body. - It retrieves the set of IDs (submission IDs) from the
unanswered
lane of theSubmissionsAgent
usingkeySet()
. This set contains the keys (IDs) of all entries in theunanswered
MapLane. - The method iterates over these IDs, creating HTML links for each unanswered submission. The
oneLink
function (not shown in the snippet) presumably formats a single submission ID into an HTML link. - The
StringBuilder
accumulates these links, and the resulting string of links is inserted into a format template (UNANSWERED_PAGE_FMT
), producing the final HTML content.
- The
Here’s the code:
static HttpResponse<?> unansweredApiDoRespond(SubmissionsAgent runtime, HttpRequest<Value> request) {
return HttpResponse.create(HttpStatus.OK)
.body(body(UNANSWERED_PAGE_FMT, runtime.unanswered.keySet()), MediaType.textHtml());
}
private static String body(String fmt, Set<Long> ids) {
final Iterator<Long> itr = ids.iterator();
String oneLink = oneLink(itr, 1);
StringBuilder links = new StringBuilder();
for (int i = 1; oneLink != null; i++, oneLink = oneLink(itr, i)) {
links.append(oneLink);
}
return String.format(fmt, links.toString());
}
SubmissionsAgent
is passed in so that its unanswered
lane can be accessed using keySet()
which is exposed by all MapLanes.
Next Steps
There is a lot more to munin, including some insightful technical patterns and practices:
- Consolidation of different read streams into entities (SubmissionAgents). Here.
- Implementation of asynchronous jobs for external API interactions. [1, 2]
- Efficient data pushing to external systems. Here.
- Use of MapLane/ListLane combined with a timer for a throttled work queue. Here.
- Policy-level prevention of bad actor instantiation. Here.
- Dynamic Web Agent state management for cleanup and maintenance. Here.
You can find the repo here: https://github.com/swimos/munin.
Parting Thoughts
munin serves as a useful example of implementing a real-time, event-driven application using SwimOS. Its architecture demonstrates how business logic can be seamlessly integrated into a Swim server, providing real-time streaming benefits with minimal effort. The application’s interaction with external systems and its ability to perform complex operations efficiently make it a valuable reference for developers looking to take full advantage of SwimOS.