A Reddit Search Which Offers Paths, Not a List

I’ve written articles about Reddit over the years, but I never did a lot of socializing on it. It just didn’t end up on my social network radar. As I’ve worked more with Wikipedia and its page view data, however, I’ve grown to appreciate how many people do use Reddit, and how just a mention on a large subreddit can bring a lot of people to a Web page.

More recently, I’ve been following a lot of Internet culture / influencer news stories, and discovered via some Web searching that Reddit has a lot of useful information and backstory that I’ve missed. I knew that Reddit offered its pages as JSON, so I decided to create my own Reddit search tool, Reddit Paths. Since I had access to full search results *and* I didn’t have to pay for them, I further decided to do a little experimenting and stunting. A single stack of results wouldn’t encompass all the ways a community could discuss a topic, so why not try some choose-your-own-adventure action?

Let me show you how Reddit Paths works using the example of the browser extension Honey and its current controversy. (Here’s some lore/backstory if you need it.)

Starting Your Search

A screenshot of the opening screen of Reddit Paths. A top form asks for a search goal, while three more forms ask for primary terms, secondary terms, and ancillary terms.

The search form is weird because this is MY search engine and I get to be weird. It starts by asking what the overarching search goal is. This should be expressed in a sentence, ideally a question (“Is Honey a scam?”). The second form asks for primary terms, the heart of your search (in this case, “Honey” and “PayPal,” because PayPal owns Honey). The third form asks for secondary terms, concepts that are part of the topic/story (“affiliate marketing” and “ecommerce”). Finally, the fourth form is for ancillary terms, terms that are relevant to the topic/story but which are minor enough that they might not appear in a search (“YouTube” and “influencers”). Dividing my search keywords into specific types allows me to perform a patterned set of searches using some of Reddit’s special syntax; it’s not just one search and you’re done. In fact, with two keywords of each type it’s about three dozen searches.
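To give a feel for what a “patterned set of searches” can look like, here is a minimal sketch. The pattern itself, the function names (`phrase`, `build_queries`, `search_url`), and the choice of operators are assumptions made for illustration; this is not Reddit Paths’ actual code, and it does not try to reproduce the exact number of searches mentioned above. It only relies on things Reddit search is known to accept: quoted phrases, the `title:` field operator, the `AND`/`OR` boolean operators, and the `search.json` endpoint.

```python
from itertools import product
from urllib.parse import urlencode


def phrase(term):
    """Wrap multi-word terms in quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def build_queries(primary, secondary, ancillary):
    """Combine the three keyword tiers into a list of query strings.

    Hypothetical pattern: primary terms get the most weight, secondary
    and ancillary terms only ever appear alongside a primary term.
    """
    queries = []
    for p in primary:
        queries.append(phrase(p))             # plain search on the primary term
        queries.append(f"title:{phrase(p)}")  # primary term in the post title
    for p, s in product(primary, secondary):
        queries.append(f"{phrase(p)} AND {phrase(s)}")
    for p, a in product(primary, ancillary):
        queries.append(f"{phrase(p)} AND {phrase(a)}")
    for p, s, a in product(primary, secondary, ancillary):
        queries.append(f"{phrase(p)} AND ({phrase(s)} OR {phrase(a)})")
    return queries


def search_url(query, limit=10):
    """Build a Reddit JSON search URL for one query (10 results by default)."""
    return "https://www.reddit.com/search.json?" + urlencode({"q": query, "limit": limit})


queries = build_queries(
    primary=["Honey", "PayPal"],
    secondary=["affiliate marketing", "ecommerce"],
    ancillary=["YouTube", "influencers"],
)
urls = [search_url(q) for q in queries]
```

With the Honey example terms, this particular pattern produces 20 queries; adding more combinations (or per-subreddit variants) is how a tool would get to a larger set.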

Even getting only 10 results per search, you end up with hundreds of results, many of them similar because you’re running similar searches. They’ve got to be cleaned up a bit before Reddit Paths can do anything with them. So the program retrieves them as JSON, filters out low-score results, filters out duplicates using Jaccard similarity, and ends up with a nice, solid set of JSON results. That JSON goes to an OpenAI call. I won’t bore you with the entire prompt (it’s a bit involved), but it starts: “Analyze the following Reddit posts and identify 2-4 distinct thematic paths, keeping the search goal and the primary terms in mind. The paths should all be relevant to the primary terms.”
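The cleanup step can be sketched roughly as follows. Jaccard similarity between two sets is the size of their intersection divided by the size of their union, so two titles with nearly the same words score close to 1. Everything specific here is an assumption for illustration: comparing on post titles, splitting on whitespace, the `min_score` and `threshold` values, and the function names. The source only says that low-score results and duplicates are filtered out.

```python
def tokens(text):
    """Lowercase a string and split it into a set of words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two sets: |A intersect B| / |A union B|."""
    if not a and not b:
        return 1.0  # treat two empty sets as identical
    return len(a & b) / len(a | b)


def clean_results(posts, min_score=5, threshold=0.7):
    """Drop low-score posts, then drop posts whose title is too similar
    to one already kept. Higher-scoring posts are considered first, so
    they win when two posts are duplicates of each other.

    Each post is assumed to be a dict with "title" and "score" keys.
    """
    kept = []
    kept_tokens = []
    for post in sorted(posts, key=lambda p: p["score"], reverse=True):
        if post["score"] < min_score:
            continue  # assumed cutoff for "low-score"
        t = tokens(post["title"])
        if any(jaccard(t, k) >= threshold for k in kept_tokens):
            continue  # too close to something we already have
        kept.append(post)
        kept_tokens.append(t)
    return kept


sample = [
    {"title": "Is Honey a scam", "score": 100},
    {"title": "is honey a scam", "score": 50},
    {"title": "PayPal affiliate links explained", "score": 20},
    {"title": "Low effort post", "score": 1},
]
cleaned = clean_results(sample)
```

In this sample, the second post is dropped as a duplicate of the first and the last is dropped for its score, leaving two posts. A real version would likely want smarter tokenizing (punctuation, stop words), but the shape of the filter stays the same.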

When I ran the Honey query as shown in my initial screenshot, I got four thematic “paths” to follow: “Honey’s Alleged Scam Practices,” “Impact on Influencers and Content Creators,” “Public Perception and Consumer Trust,” and “YouTube Investigations and Media Influence.” Each path includes a summary, keywords, and up to ten highlighted Reddit posts from that path. The summary gave me ideas for follow-up keywords, and the key posts sometimes pointed me to things I hadn’t heard about yet (like Honey and Capital One.)

To keep going, pick a path and click its “Explore this Path” button. It kicks you back to the original form with the new keywords filled in, where you can tweak them or even change the search goal if you want to go in a different direction. Then click the search button and away you go. Sometimes the paths get a little circular, but I find swapping out a couple of keywords works wonders. A couple of layers down in the Honey search I was getting paths like “Influencer Backlash and Scams” and “The Role of Influencers in Affiliate Marketing”: a little more focused, but still very relevant to the original topic and still giving me additional things to think about.

I’m very pleased with how this works. The next time I need to research something related to Internet culture or popular culture current events, this is going to be one of the first tools I reach for.
