Munin

Reddit Bot Overview

This blog post will explore the use of HTTP lanes in munin, a streaming data application built on SwimOS. munin is a real-time application that assigns answers to every submission on r/WhatsThisBird based on comments. It analyzes most comments internally but also makes external API calls (to eBird.org) for some. The application pushes every derived answer to a PostgreSQL database and interacts with Reddit by posting comments on submissions to display the ongoing analysis.

Key Functionalities of munin

  1. Real-time Categorization of Submissions:
    • munin categorizes r/WhatsThisBird submissions into answered, unanswered, reviewed, and unreviewed, based on the analysis of comments.
    • The categorization is accessible through HTTP lanes and Swim-cli commands, providing detailed insights into each submission.
  2. munin Health Awareness:
    • munin interacts with multiple endpoints, including Reddit’s comment and submission fetch endpoints, eBird API, and a PostgreSQL database.
    • The system logs errors and issues encountered during these interactions, offering granular information about the distributed system’s health.

Project Structure

The swim.munin.swim package contains Swim-relevant logic for general-purpose Reddit applications. The swim.munin.filethesebirds package and its subpackages contain application logic specific to FileTheseBirdsBot. Notice that within this package, there are swim.munin.filethesebirds.connect and swim.munin.filethesebirds.swim subpackages here that similarly breakdown into general purpose and application-specific implementations.

A design goal of munin is to provide a clean separation of concerns and make it easy to support a different subreddit. You’ll see that the concrete Web Agents in swim.munin.filethesebirds.swim simply extend classes in swim.munin.swim, and that while the former depends on the latter, the latter is completely unaware of the former. To make a custom app without FileTheseBirds logic, you only need the latter package.

munin Web Agents at a Glance

  1. SubmissionsAgent (Manager of Submissions):
    • Acts as an overarching manager for all active subreddit submissions.
    • Categorizes submissions, tracks their status, and manages their lifecycle.
    • Provides HTTP endpoints for external access to submission data, facilitating broader application integration.
  2. SubmissionAgent (Individual Submission Handler):
    • Each instance manages a specific submission.
    • Tracks detailed information about the submission, including status changes and associated comments.
    • Can be dynamically instantiated as needed, allowing efficient handling of multiple submissions concurrently.
  3. PublishingAgent (Comment Interaction Specialist):
    • Manages the creation, updating, and deletion of comments on submissions.
    • Handles comment-related interactions in response to changes in the submission’s state.
    • Efficiently manages Reddit API constraints through a queuing system.
  4. CommentsFetchAgent (Comment Fetcher and Router):
    • Fetches new comments and routes them to the relevant SubmissionAgent.
    • Filters comments based on submission status (e.g., shelving expired submissions).
    • Operates periodically, ensuring the system is up-to-date with the latest comments.
  5. SubmissionsFetchAgent (Initial Submission Acquirer):
    • Specialized in fetching new submissions from the subreddit r/WhatsThisBird.
    • Inserts updated information about active submissions and shelves submissions that are deleted or removed.
    • Interacts with the vault for data storage, upserting information about active submissions and deleting shelved ones.
    • Uses a command lane to preemptively manage submissions and a timer to schedule regular fetching operations, ensuring timely updates.

Serving SwimOS Streams as Plain REST Data

Serving real-time SwimOS streams as plain REST data is simple and straightforward. In SwimOS, an HttpLane is a type of lane that can respond to HTTP requests. This is analogous to a REST endpoint in a traditional web service framework. HTTP lanes are simple and integrate seamlessly with the real-time data handling capabilities of SwimOS. In munin, four such lanes are utilized:

Let’s look at the flow for api/unanswered:

  1. HTTP Request Handling:
    • An HTTP GET request is made to https://munin.swim.services/submissions?lane=api/unanswered.
    • The SubmissionsAgent routes this request to the unansweredApiDoRespond method.
  2. Data Retrieval and Response:
    • unansweredApiDoRespond calls a method in SubmissionsAgentLogic to retrieve the current state of unanswered submissions.
    • The logic determines which submissions are unanswered and formats them as a response.
  3. Returning the Response:
    • The formatted data is returned as an HttpResponse object, which is then sent back to the client.

The corresponding code in SubmissionsAgent looks like this:

@SwimLane("unanswered")
protected MapLane<Long, Value> unanswered = mapLane();

@SwimLane("unreviewed")
protected MapLane<Long, Value> unreviewed = mapLane();

@SwimLane("answered")
protected MapLane<Long, Value> answered = mapLane();

@SwimLane("reviewed")
protected MapLane<Long, Value> reviewed = mapLane();

@SwimLane("api/unanswered")
protected HttpLane<Value> unansweredApi = this.<Value>httpLane()
    .doRespond(this::unansweredApiDoRespond);

@SwimLane("api/unreviewed")
protected HttpLane<Value> unreviewedApi = this.<Value>httpLane()
    .doRespond(this::unreviewedApiDoRespond);

@SwimLane("api/answered")
protected HttpLane<Value> answeredApi = this.<Value>httpLane()
    .doRespond(this::answeredApiDoRespond);

@SwimLane("api/reviewed")
protected HttpLane<Value> reviewedApi = this.<Value>httpLane()
    .doRespond(this::reviewedApiDoRespond);

The objective here is to provide an HTTP API endpoint in the SubmissionsAgent that, when requested, returns an HTML page listing all unanswered submissions. This is achieved by:

In the SubmissionsAgent, the call to unansweredApiDoRespond is delegated to SubmissionsAgentLogic:

  HttpResponse<?> unansweredApiDoRespond(HttpRequest<Value> request) {
    return SubmissionsAgentLogic.unansweredApiDoRespond(this, request);
  }

As for the SubmissionsAgentLogic implementation, it performs the following tasks:

  1. HTTP Response Creation (unansweredApiDoRespond):
    • This static method takes the SubmissionsAgent instance and the HTTP request as parameters.
    • It creates an HttpResponse object with a status of HttpStatus.OK, indicating a successful request.
    • The body of the response is generated by calling the body method, which formats the content as HTML.
  2. Data Retrieval and Formatting (body Method):
    • The body method is responsible for formatting the response body.
    • It retrieves the set of IDs (submission IDs) from the unanswered lane of the SubmissionsAgent using keySet(). This set contains the keys (IDs) of all entries in the unanswered MapLane.
    • The method iterates over these IDs, creating HTML links for each unanswered submission. The oneLink function (not shown in the snippet) presumably formats a single submission ID into an HTML link.
    • The StringBuilder accumulates these links, and the resulting string of links is inserted into a format template (UNANSWERED_PAGE_FMT), producing the final HTML content.

Here’s the code:

  static HttpResponse<?> unansweredApiDoRespond(SubmissionsAgent runtime, HttpRequest<Value> request) {
    return HttpResponse.create(HttpStatus.OK)
        .body(body(UNANSWERED_PAGE_FMT, runtime.unanswered.keySet()), MediaType.textHtml());
  }
  
  private static String body(String fmt, Set<Long> ids) {
    final Iterator<Long> itr = ids.iterator();
    String oneLink = oneLink(itr, 1);
    StringBuilder links = new StringBuilder();
    for (int i = 1; oneLink != null; i++, oneLink = oneLink(itr, i)) {
      links.append(oneLink);
    }
    return String.format(fmt, links.toString());
  }  

SubmissionsAgent is passed in so that its unanswered lane can be accessed using keySet() which is exposed by all MapLanes.

Next Steps

There is a lot more to munin, including some insightful technical patterns and practices:

You can find the repo here: https://github.com/swimos/munin.

Parting Thoughts

munin serves as a useful example of implementing a real-time, event-driven application using SwimOS. Its architecture demonstrates how business logic can be seamlessly integrated into a Swim server, providing real-time streaming benefits with minimal effort. The application’s interaction with external systems and its ability to perform complex operations efficiently make it a valuable reference for developers looking to take full advantage of SwimOS.