A collection of TILs, snippets and thoughts

Script to look for links to Twitter


Recently, I wanted to try out the BeautifulSoup Python library.

At work we have an organisation dataset that includes each organisation's Twitter handle. The field is populated from the organisation's Wikidata entry, so if Wikidata doesn't have the handle, our dataset doesn't either. A lot of handles were missing.

So I wrote a quick script that loops over the list of organisations, visits each one's website and searches (using BeautifulSoup) for any outbound links to Twitter, on the assumption that any such link points to the organisation's own Twitter account.

If it finds any potential hits, I can check them and add them to Wikidata. (Can you programmatically add things to Wikidata?)

Here is the script (loaded from a GitHub gist):
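For readers without the gist to hand, a minimal sketch of the approach described above might look like the following. It is not the original script: the function names, the `organisations` mapping, the set of Twitter hostnames and the list of paths to ignore are all assumptions made for illustration. It parses pages with BeautifulSoup and fetches them with the standard library's `urllib`.

```python
from urllib.parse import urlparse
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup

# Assumed hostnames that count as "a link to Twitter"
TWITTER_HOSTS = {"twitter.com", "www.twitter.com", "mobile.twitter.com"}
# Assumed first path segments that are Twitter features, not account handles
NON_HANDLE_PATHS = {"share", "intent", "home", "search", "hashtag", "i"}


def find_twitter_handles(html):
    """Return candidate Twitter handles linked from a page, in page order."""
    soup = BeautifulSoup(html, "html.parser")
    handles = []
    for link in soup.find_all("a", href=True):
        try:
            parsed = urlparse(link["href"])
        except ValueError:
            continue  # skip malformed URLs
        if (parsed.hostname or "").lower() not in TWITTER_HOSTS:
            continue
        # The handle is the first path segment, e.g. /example_org/status/1
        segments = [s for s in parsed.path.split("/") if s]
        if not segments:
            continue
        handle = segments[0].lstrip("@")
        if not handle or handle.lower() in NON_HANDLE_PATHS:
            continue
        if handle not in handles:
            handles.append(handle)
    return handles


def fetch_html(url, timeout=10):
    """Download a page and return its raw bytes."""
    request = Request(url, headers={"User-Agent": "twitter-link-finder"})
    with urlopen(request, timeout=timeout) as response:
        return response.read()


def scan_organisations(organisations):
    """Map each organisation name to the handles found on its website."""
    results = {}
    for name, url in organisations.items():
        try:
            results[name] = find_twitter_handles(fetch_html(url))
        except (OSError, ValueError) as error:
            # Unreachable sites and bad URLs are reported, not fatal
            print(f"{name}: could not fetch {url} ({error})")
    return results
```

Calling `scan_organisations({"Example Org": "https://example.org"})` would return a dictionary of candidate handles per organisation. These are only candidates: a site may link to someone else's account, so each hit still needs a manual check before it goes anywhere near Wikidata.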

webdev / code / gist