Tweeper: a Twitter to RSS web scraper

Tweeper is a web scraper which extracts the most recent public tweets of a given user from their home page on Twitter.com and formats them in RSS, so the information can be conveniently accessed and collected by a feed reader.

On June 11th 2013 Twitter.com retired their API v1.0, so it is no longer possible to access a user timeline via RSS, and it has also become mandatory to authenticate via OAuth to access this public information in JSON format.

Some pointers:

Some services came up to overcome this “problem”:

However, these solutions are still shady and leave the user no control over who collects information about the timelines they visit.

This is why Tweeper (also Twitter feed scraper) was born: as an open source way to keep following your friends with a certain degree of anonymity, without having to tell Twitter.com who your friends are.

Get it from the Tweeper source code repository.

Tweeper can be used via web or as a command line program, for example as a filter in your feed reader, by passing the URL of the user's public timeline as the first argument.

Example of use on the command line:

  $ php tweeper.php http://twitter.com/NSACareers

Example of use as a Liferea filter:

  $ liferea-add-feed  "|php .../path_to_tweeper/tweeper.php http://twitter.com/NSACareers"

Example of use with identi.ca:

  $ liferea-add-feed  "|php .../path_to_tweeper/tweeper.php http://identi.ca/evan"

Update, 2013-07-27

Tweeper now requires the URL of the public timeline as its argument; the Twitter screen name alone is no longer enough. This makes usage uniform across different sites.

Sorry for changing the behavior of the interface, but it makes the multi-site code a lot simpler, and I didn't want to carry around compatibility code so early in development.


Comments

Anonymous wrote:

http://rssitfor.me is another service that converts Twitter user timelines to RSS.
It works perfectly for me - tweets are updated once an hour, which is fine for RSS.

ao2 wrote:

Ah, cool, but a closed web service can always be shut down, open source code is “for ever” :P

Ciao, Antonio

Libby wrote:

Cool - this is helpful. Is there a way to include tweets that are even older? Not sure how it (or Twitter) decides which/how many/how far back tweets to include.

ao2 wrote:

Hi Libby,

I don't think there is any way to choose how many tweets to import from the web page; the server decides how many of them are rendered in HTML.

Tweeper just scrapes what's rendered in HTML.

However feed readers accumulate RSS items, so from a certain point on you will have all the “old” tweets (the ones not rendered anymore) in the feed reader.

Ciao, Antonio

mozzarella wrote:

Antonio, many thanks for your great work!

In the current version something goes wrong with the encoding, though. For example, the Twitter account "kedye13" has a mixture of characters (Thai, Roman, Korean), which get corrupted in the feed.

I've tried this and that to force UTF-8, but that didn't work. So I can't point you to the solution.

ao2 wrote:

Thanks, I'll take a look at the encoding issue.

Ciao, Antonio

Anonymous wrote:

Hi - thanks for putting the effort into this.

Can I call the script via a URL? For example:

http://www.myurl.com/tweeper.php?src_url=http://twitter.com/NSAcareers

When I do that I get a 403 error.

Any help appreciated.

Anonymous wrote:

To make the search and @ Twitter links work, change the following line in the rss_converter_twitter.com.xsl file

rss version="2.0"

to

rss version="2.0" xml:base="http://twitter.com"

ao2 wrote:

Thanks, if you want you can send a patch so you get credit for that, otherwise I'll add the change myself.

Ciao, Antonio

ao2 wrote:

BTW my feed reader successfully expands relative URLs to absolute ones by picking up the correct base URL itself, even without an explicit xml:base. I think it takes the base URL from the parent <link> element.

However, I can reproduce your issue by visiting a feed generated by tweeper via web with Firefox; in this case specifying xml:base can indeed be useful.

Was the latter also your scenario?

Ciao, Antonio

Anonymous wrote:

helpful thanks

Steve wrote:

Hi. I'm trying to use Tweeper in the terminal but it says "Could not open input file: tweeper.php". Any suggestions? Thanks.

ao2 wrote:

Hi Steve, how are you executing tweeper?

If you are using the code downloaded from the git repository, you can run it like this:

$ ./tweeper 
usage: ./tweeper [-e|-h|--help] 

Or, if you are on a Debian system, install the package and just run tweeper without the path prefix.

Ciao,
Antonio

Steve wrote:

Hi Antonio. Thanks very much for your response. I installed tweeper through the Mint software manager, and simply using "tweeper" by itself has worked. Cheers!!!

Aaron wrote:

I am not able to get this working on Ubuntu 16.04 with the liferea and tweeper packages using the instructions above (substituting plain 'tweeper' for the path above). Liferea segfaults when I try this. I haven't encountered a pipe at the beginning of a quoted command before; what does that do?

I was able to get it working by saving the tweeper output to a file in a cron job and then loading that file in Liferea's New Subscription > Advanced. This is probably closer to what I want anyway. Thanks for the work you've put in on the software. Do you have a bug tracker someplace, just so I can check if my problem above is a known issue?

Thanks again
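
The cron-based workaround described in this comment could look roughly like the sketch below. It is only an illustration and not part of Tweeper: the fetch_feed helper, the output path and the schedule are all assumptions. The only detail taken from this page is that the packaged command is invoked as tweeper followed by a timeline URL. The helper writes to a temporary file first, so a failed run does not overwrite the last good feed.

```shell
#!/bin/sh
# fetch_feed OUTPUT COMMAND [ARGS...]
# Hypothetical helper (not part of Tweeper): run a feed-generating command
# and replace OUTPUT only if the command succeeds, so that a failed run
# keeps the previously saved feed.
fetch_feed() {
    out=$1
    shift
    # Create the temporary file next to OUTPUT so that mv is a plain rename.
    tmp=$(mktemp "${out}.XXXXXX") || return 1
    if "$@" > "$tmp"; then
        mv "$tmp" "$out"
    else
        # The command failed: discard the partial output, keep the old file.
        rm -f "$tmp"
        return 1
    fi
}
```

A wrapper script ending with something like fetch_feed "$HOME/feeds/ao2it.xml" tweeper https://twitter.com/ao2it (the path is hypothetical) could then be run from a crontab entry, and the saved file loaded in the feed reader as a local file.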

ao2 wrote:

Hi Aaron,

if you have tweeper installed via packages you can use something like:

$ liferea-add-feed "|tweeper https://twitter.com/ao2it"

The leading pipe tells liferea that this is not a direct URL but rather a filter command to execute.

The same thing can be accomplished from the UI:

  1. go to New Subscription > Advanced;
  2. choose Command as the Source Type;
  3. put tweeper https://twitter.com/ao2it in the Source field (without the pipe this time).

Let me know if this fixes your problem.

As far as bug reports go, you can send me an e-mail anytime.

Ciao ciao, Antonio
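
The leading-pipe convention described in this reply can be illustrated with a small sketch. This is not Liferea's actual code; classify_source is a hypothetical helper that only shows the idea: a source string starting with | is treated as a command whose output is the feed, and anything else is treated as a URL.

```shell
#!/bin/sh
# classify_source SOURCE
# Hypothetical illustration of the leading-pipe convention:
# "|cmd args" means "run cmd and read the feed from its output",
# anything else is taken as a plain URL.
classify_source() {
    case "$1" in
        "|"*)
            # ${1#?} strips the first character, i.e. the leading pipe.
            printf 'command: %s\n' "${1#?}"
            ;;
        *)
            printf 'url: %s\n' "$1"
            ;;
    esac
}

classify_source "|tweeper https://twitter.com/ao2it"
# prints: command: tweeper https://twitter.com/ao2it
classify_source "https://twitter.com/ao2it"
# prints: url: https://twitter.com/ao2it
```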

Aaron wrote:

Thanks Antonio,

Now that I understand what the leading pipe does I see that this is a problem with liferea. Adding any feed (not just a command) from the command line causes a segfault:

$ liferea-add-feed https://blog.twitter.com/api/blog.rss?name=company
Segmentation fault (core dumped)

I'll have to check out their latest release and see if it's been fixed when I get a chance.

Best,
Aaron

Aaron wrote:

I've tried to reply a few times, but my messages are called spam. It is a bug in liferea. Thanks for the help!

https://bugs.launchpad.net/ubuntu/+source/liferea/+bug/1669117

Aaron wrote:

I tried to reply, but it thought I was spam.

Thanks for your help, it's a problem with liferea. Liferea segfaults even with the url of an rss feed.

Aaron wrote:

It is this bug:

ao2 wrote:

Thanks for the info, it looks like it has been fixed in the dev version.

And sorry for not publishing your comments sooner.

Ciao, Antonio
