Wikipedia and YouTube: A Beautiful Search Mashup

I generally haven’t done too much with Google’s API offerings because the ones I’m interested in are expensive, so I was surprised when I checked YouTube’s API and it was free. After I thought about it, though, it made sense: when you search YouTube’s site, you stay on YouTube to view the content — you’re not going off to some external site where Google can’t get your eyeballs for ads. So why not make that functionality free?

Earlier this week I tried putting together Wikipedia and YouTube to see if I could trace suddenly-popular Wikipedia topics to AI-generated slop on YouTube. (I wrote about it here.) That was interesting enough that I decided to see if I could make a general YouTube search that builds queries off Wikipedia pages.

It worked really well! I was pleased at how easily I could create different levels of query complexity using different parts of the Wikipedia API offering. And between Wikipedia article footnotes and Gossip Machine, I got two types of date-based search that brought useful results.

Basic Mashing

Searching YouTube with Wikipedia. On the left, the Wikipedia search interface: a basic keyword form. A search for Boeing has been done and five results are shown, including Boeing, Boeing 777, Boeing 737, and Boeing 747. Each listing includes the name of the page and a brief description.

On the right is the YouTube search interface. There's a keyword form and some filters but nothing is happening here yet.

Everything starts with a basic Wikipedia keyword search. That search gives you a list of results, each with a brief description. Clicking on one is when you start unfolding more options.
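Here's a minimal sketch of what that first step could look like, assuming the MediaWiki REST search endpoint (`/w/rest.php/v1/search/page`) on English Wikipedia. The post doesn't say which Wikipedia endpoint the tool actually calls, so treat the endpoint choice as an assumption; the function and type names (`buildWikiSearchUrl`, `toResultList`, `WikiHit`) are made up for illustration.

```typescript
// Shape of one search hit as used in this sketch.
// Real responses carry more fields (id, key, excerpt, thumbnail).
interface WikiHit {
  title: string;
  description: string | null;
}

// Build a keyword-search URL for the MediaWiki REST API.
// Assumption: English Wikipedia and a default of five results.
function buildWikiSearchUrl(keyword: string, limit: number = 5): string {
  return (
    "https://en.wikipedia.org/w/rest.php/v1/search/page" +
    `?q=${encodeURIComponent(keyword)}&limit=${limit}`
  );
}

// Reduce a parsed response to the "title: description" lines a result list might show.
function toResultList(response: { pages: WikiHit[] }): string[] {
  return response.pages.map(
    (page) => `${page.title}: ${page.description ?? "(no description)"}`
  );
}
```

In a Node project you would pass the URL to `fetch`, parse the JSON, and hand the result to `toResultList`.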

We continue our Wikipedia search. The Boeing result has been selected and is showing a description of the page (in Boeing's case pretty lengthy, but usually it's brief) along with a small cloud of relevant keywords from the page. 

Still nothing happening on the right where the YouTube search is; we haven't gotten to that yet.

When you select a result from the initial search, it expands to show a brief description (or in Boeing’s case not so brief) along with a cloud of relevant words mechanically (without AI) extracted from the page text. (It’s not a great cloud at the moment because I need to put in a better list of stopwords to filter out common terms.) Clicking on any of the terms adds it to the YouTube search. When you’ve got the query just how you like it, you can click the red search button and hey presto, YouTube videos.
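One simple way to do this kind of mechanical extraction is to count word frequencies and drop stopwords. The sketch below is an assumption about how such a cloud could be built, not the tool's actual code; `STOPWORDS` is a deliberately tiny illustrative list and `topTerms` is a hypothetical name.

```typescript
// A deliberately small stopword list for illustration; a real one would be much longer.
const STOPWORDS = new Set<string>([
  "the", "a", "an", "and", "of", "in", "to", "is", "was", "for",
  "on", "by", "with", "as", "its", "it", "that", "at", "from",
]);

// Return the `count` most frequent non-stopword terms in `text`.
// Ties are broken alphabetically so the output is deterministic.
function topTerms(text: string, count: number): string[] {
  const freq = new Map<string, number>();
  // Words here are runs of two or more letters, allowing inner apostrophes and hyphens.
  const words = text.toLowerCase().match(/[a-z][a-z'-]+/g) ?? [];
  for (const word of words) {
    if (STOPWORDS.has(word)) continue;
    freq.set(word, (freq.get(word) ?? 0) + 1);
  }
  return Array.from(freq.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, count)
    .map((entry) => entry[0]);
}
```

A longer stopword list is the main lever for improving the cloud: every common word you add to `STOPWORDS` frees a slot for a more specific term.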

The keywords "boeing" and "airplanes" have been added to the YouTube query on the right and it's showing four of the first 50 search results. Above the search results there's a set of filters to narrow the results further, though they're currently not in use.

I like this for basic topic searches, things I’m absolutely unfamiliar with. A Wikipedia keyword search gives me some background and a set of general but relevant keywords to build a query. Once I’ve run a search, I can tweak the results I’m looking at with the result filters. What you’re seeing changes a lot when you specify that you want videos of at least 10 minutes with at least 10,000 views, for example.
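A filter like "at least 10 minutes with at least 10,000 views" could be applied on the client once video details are in hand. The sketch below assumes each video carries an ISO 8601 duration string (such as `PT15M33S`) and a view count delivered as a string, which is how the YouTube Data API's `videos.list` reports `contentDetails.duration` and `statistics.viewCount`. The `VideoInfo` shape and the function names are hypothetical, and the post doesn't say whether the tool filters this way.

```typescript
// Flattened, hypothetical shape for one video's details.
interface VideoInfo {
  title: string;
  duration: string;  // ISO 8601 duration, e.g. "PT15M33S"
  viewCount: string; // numeric string, e.g. "10432"
}

// Convert an ISO 8601 duration (days, hours, minutes, seconds) to seconds.
// Strings that don't match the pattern are treated as 0 seconds.
function durationToSeconds(iso: string): number {
  const m = /^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$/.exec(iso);
  if (!m) return 0;
  const num = (v: string | undefined): number => (v ? Number(v) : 0);
  const days = num(m[1]);
  const hours = num(m[2]);
  const minutes = num(m[3]);
  const seconds = num(m[4]);
  return ((days * 24 + hours) * 60 + minutes) * 60 + seconds;
}

// Keep only videos that meet both the minimum length and the minimum view count.
function filterVideos(
  videos: VideoInfo[],
  minSeconds: number,
  minViews: number
): VideoInfo[] {
  return videos.filter(
    (v) =>
      durationToSeconds(v.duration) >= minSeconds &&
      Number(v.viewCount) >= minViews
  );
}
```

With this sketch, `filterVideos(videos, 600, 10000)` keeps videos that run at least 10 minutes and have at least 10,000 views.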

But you may not feel that’s enough for your search needs. You may want to use keywords that are more specific than what you’d get in even a properly-stopworded cloud, or you may want to do more reading on the topic. You can do both by going one level deeper into the Wikipedia information.

It’s a little hard to see, but there are a couple of small squares to the right of the “Boeing” header on the Wikipedia side. When you click on the one nearest the header, the entire Wikipedia page loads.

More Specific Search

On the left side of the page, the entire Wikipedia article for "Boeing" has loaded and been scrolled down to the "Labor Strike" heading. The phrase "Boeing machinists strike" has been highlighted. 

On the YouTube side, a search for "Boeing machinists strike" has been performed with four results of 50 showing.

The tag cloud remains over the loaded Wikipedia page, so you can still use those general terms, but you can also highlight any part of the entire Wikipedia page and click the “Add Query Term” button to use it in a YouTube search.

Pulling query terms out of full Wikipedia page text is useful (and can drop you into some turbo-powered rabbit holes if you’re not careful). But I wanted to do more with the full Wikipedia page than just add some query terms. I wanted to try some date-based YouTube searching with those sweet Wikipedia footnotes!

Date-Based Searching Using Wikipedia Footnotes

Clicking on a Wikipedia footnote with this search does not take you to the end of the Wikipedia page and the citation. Instead, the date in the citation is used to create a date-bounded search in YouTube (citation date plus seven days). The citation itself is turned into a tag cloud for building the query.
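To make the date bounding concrete, here's a sketch that turns a citation date into a YouTube Data API `search.list` URL using the API's `publishedAfter` and `publishedBefore` parameters. It assumes the window starts on the citation date and ends seven days later, and that the date arrives as `YYYY-MM-DD`. The function name `buildDateBoundedSearchUrl`, the `windowDays` parameter, and the placeholder API key are all hypothetical; the tool's real code may differ.

```typescript
// Format a Date as an RFC 3339 timestamp without milliseconds,
// e.g. "2021-01-07T00:00:00Z".
function toRfc3339(date: Date): string {
  return date.toISOString().replace(/\.\d{3}Z$/, "Z");
}

// Build a search.list URL limited to videos published in the window
// [citationDate, citationDate + windowDays).
// Assumption: citationDate is a "YYYY-MM-DD" string interpreted as UTC.
function buildDateBoundedSearchUrl(
  query: string,
  citationDate: string,
  apiKey: string,
  windowDays: number = 7
): string {
  const start = new Date(`${citationDate}T00:00:00Z`);
  const end = new Date(start.getTime() + windowDays * 24 * 60 * 60 * 1000);
  const params: Array<[string, string]> = [
    ["part", "snippet"],
    ["type", "video"],
    ["maxResults", "50"], // 50 is the most search.list returns per page
    ["q", query],
    ["publishedAfter", toRfc3339(start)],
    ["publishedBefore", toRfc3339(end)],
    ["key", apiKey],
  ];
  return (
    "https://www.googleapis.com/youtube/v3/search?" +
    params.map(([k, v]) => `${k}=${encodeURIComponent(v)}`).join("&")
  );
}
```

For a citation dated January 7, 2021, `buildDateBoundedSearchUrl("Boeing conspiracy", "2021-01-07", "YOUR_API_KEY")` bounds the search to January 7 through January 14.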

On the Wikipedia side, the full article has been scrolled down to a section starting "On January 7, 2021, Boeing settled to pay over $2.5 billion after being charged with fraud over the company's hiding of information from the safety regulators..." 

On the right side, a search for "Boeing conspiracy" has been performed. Underneath the search box is the citation for the event I quoted earlier. Beneath that

I like this feature a lot, if I do say so myself. A lot of news outlets make their content available on YouTube, but it’s hard to find sometimes among all the other YouTube content that’s more calibrated for the platform (louder thumbnails, tweaked and tested headlines, etc.). By building a search around a date and a particular event, you can get right to it.

Of course, a Wikipedia page is going to have only so many footnotes, and only so many of them are going to be relevant to YouTube’s lifespan. So I added another way to do date-based YouTube searching: Gossip Machine.

Gossip Machine for More Date-Bounded Searching

Gossip Machine is something I made a couple of years ago and sometimes add to my projects for date-based searching. It analyzes Wikipedia pages for dates with especially high page views and, in this case, turns them into YouTube searches. Just use the date form to specify the time span you want to search (Wikipedia’s page logs go back to 2017) and you’ll get a dropdown menu of high-view dates along with some viewing data:

A close-up of the Gossip Machine part of the Wikipedia-YouTube search. 2024 is the date span searched; a drop-down menu shows January 6 - January 9, 2024 as an important time span.

Choose a date from the dropdown menu and it’s turned into a date-based YouTube search. The results are often useful, and you can filter them to your heart’s content.
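The post doesn't describe how Gossip Machine decides which dates count as high-view, so the following is only one plausible approach: flag any day whose views exceed some multiple of the median for the selected span. The names `DailyViews` and `findSpikeDays` and the default `factor` of 2 are assumptions made for this sketch.

```typescript
// One day's page-view count; date is a "YYYY-MM-DD" string.
interface DailyViews {
  date: string;
  views: number;
}

// Return the days whose views exceed `factor` times the median of the series.
// The median is used instead of the mean so a few huge spikes don't raise the bar.
function findSpikeDays(series: DailyViews[], factor: number = 2): DailyViews[] {
  if (series.length === 0) return [];
  const sorted = series.map((d) => d.views).sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return series.filter((d) => d.views > median * factor);
}
```

Each flagged date could then serve as the start of a date-bounded YouTube search, and consecutive flagged days could be merged into a single span for the dropdown.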

On the left, the Wikipedia side has searched 2024 for days when Boeing's page has unusually high views. January 6-9 has been chosen from the dropdown menu.

On the right, a search for "Boeing news" has been run on YouTube. Search results show stories from Yahoo Finance, NBC Bay Area, and CBC News.

Should I put this on GitHub? I’ve never done that with a Node project before.