Tweeper: a Twitter to RSS web scraper

Tweeper is a web scraper which extracts the most recent public tweets of a given user from their home page on Twitter.com and formats them in RSS, so the information can be conveniently accessed and collected by a feed reader.

On June 11th 2013 Twitter.com retired their API v1.0, so it is no longer possible to access a user timeline via RSS, and authenticating via OAuth has become mandatory even to access this public information in JSON format.

Some pointers:

Some services came up to overcome this “problem”:

However, these solutions are still shady and leave the user no control over who collects information about the user timelines they visit.

This is why Tweeper (also Twitter feed scraper) was born: an Open Source way to keep following your friends with a certain degree of anonymity, without having to tell Twitter.com who your friends are.

Get it from the Tweeper source code repository.

Tweeper can be used via web or as a command line program, for example as a filter in your feed reader, by passing the URL of the user's public timeline as the first argument.

Example of use on the command line:

  $ php tweeper.php http://twitter.com/NSACareers

Example of use as a Liferea filter:

  $ liferea-add-feed  "|php .../path_to_tweeper/tweeper.php http://twitter.com/NSACareers"

Example of use with identi.ca:

  $ liferea-add-feed  "|php .../path_to_tweeper/tweeper.php http://identi.ca/evan"
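
When a feed reader cannot run a filter command, one option is to run Tweeper periodically (for example from cron) and point the reader at the saved file. What follows is only a sketch under a few assumptions: save_feed is a hypothetical helper, not part of Tweeper, and the output path in the usage comment is made up for illustration. It writes to a temporary file first and replaces the old feed only when the command succeeded and produced some output, so a failed fetch does not clobber the last good copy.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tweeper): run a feed-generating command
# and save its output to a file, replacing the old file only on success.
# Usage: save_feed OUTPUT_FILE COMMAND [ARGS...]
save_feed() {
    out=$1
    shift
    # Create the temporary file next to the target so that mv is a rename
    # on the same filesystem.
    tmp=$(mktemp "${out}.XXXXXX") || return 1
    # Keep the new feed only if the command succeeded and wrote something.
    if "$@" > "$tmp" && [ -s "$tmp" ]; then
        mv "$tmp" "$out"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example call (assumes php and tweeper.php are available; the output
# path is illustrative):
# save_feed "$HOME/feeds/NSACareers.xml" php tweeper.php http://twitter.com/NSACareers
```

The saved file can then be added to the feed reader as a local feed, and the call can be scheduled with cron at whatever interval suits you.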

Update, 2013-07-27

Tweeper now requires the URL of the public timeline as the argument; the Twitter screen name is not enough anymore. This makes usage uniform across different sites.

Sorry for changing the behavior of the interface, but it makes the multi-site code a lot simpler, and I didn't want to carry around compatibility code so early in the development.


Comments

Anonymous wrote:

http://rssitfor.me is another service that converts twitter user timelines to RSS.
It works perfectly for me - tweets are updated every hour, which is fine for RSS.

ao2 wrote:

Ah, cool, but a closed web service can always be shut down; open source code is “forever” :P

Ciao, Antonio

Libby wrote:

Cool - this is helpful. Is there a way to include tweets that are even older? Not sure how it (or Twitter) decides which/how many/how far back tweets to include.

ao2 wrote:

Hi Libby,

I don't think there is any way to choose how many tweets to import from the web page; it's up to the server to decide how many of them are rendered in HTML.

Tweeper just scrapes what's rendered in HTML.

However, feed readers accumulate RSS items, so from a certain point on you will have all the “old” tweets (the ones not rendered anymore) in the feed reader.

Ciao, Antonio

mozzarella wrote:

Antonio, many thanks for your great work!

In the current version something goes wrong with the encoding though. For example, the Twitter account "kedye13" has a mixture of characters (Thai, Roman, Korean), which get corrupted in the feed.

I've tried this and that to force UTF-8, but that didn't work. So I can't point you to the solution.

ao2 wrote:

Thanks, I'll take a look at the encoding issue.

Ciao, Antonio

Anonymous wrote:

Hi - thanks for putting the effort into this.

Can I call the script via a URL? For example:

http://www.myurl.com/tweeper.php?src_url=http://twitter.com/NSAcareers

When I do that I get a 403 error?

Any help appreciated

Anonymous wrote:

To make the search and @ Twitter links work, change the following line in the rss_converter_twitter.com.xsl file:

rss version="2.0"

to

rss version="2.0" xml:base="http://twitter.com"

ao2 wrote:

Thanks, if you want you can send a patch so you get credit for that, otherwise I'll add the change myself.

Ciao, Antonio

ao2 wrote:

BTW my feed reader successfully expands relative URLs to absolute ones by picking up the correct base URL itself, even without an explicit xml:base. I think it takes the base URL from the parent <link> element.

However, I can reproduce your issue by opening a feed generated by tweeper via web in Firefox; in this case specifying xml:base can indeed be useful.

Was the latter also your scenario?

Ciao, Antonio

Anonymous wrote:

helpful thanks

Steve wrote:

Hi. I'm trying to use Tweeper in the terminal but it says "Could not open input file: tweeper.php". Any suggestions? Thanks.

ao2 wrote:

Hi Steve, how are you executing tweeper?

If you are using the code downloaded from the git repository, you can run it like this:

$ ./tweeper 
usage: ./tweeper [-e|-h|--help] 

Or, if you are on a Debian system, install the package and just run tweeper without the path prefix.

Ciao,
Antonio

Steve wrote:

Hi Antonio. Thanks very much for your response. I installed tweeper through the mint software manager and simply using "tweeper" by itself has worked. Cheers!!!

Aaron wrote:

I am not able to get this working on Ubuntu 16.04 with the liferea and tweeper packages using the instructions above (substituting plain 'tweeper' for the path above). Liferea segfaults when I try this. I haven't encountered a pipe at the beginning of a quoted command before; what does that do?

I was able to get it working by saving the tweeper output to a file in a cron job and then loading that file in liferea's New Subscription > Advanced. This is probably closer to what I want anyway. Thanks for the work you've put in on the software. Do you have a bug tracker someplace, just so I can check if my problem above is a known issue?

Thanks again

ao2 wrote:

Hi Aaron,

if you have tweeper installed via packages you can use something like:

$ liferea-add-feed "|tweeper https://twitter.com/ao2it"

The leading pipe tells liferea that this is not a direct URL but rather a filter command to execute.

The same thing can be accomplished from the UI:

  1. go to New Subscription > Advanced;
  2. choose Command as the Source Type;
  3. put tweeper https://twitter.com/ao2it in the Source field (without the pipe this time).

Let me know if this fixes your problem.

As far as bug reports go, you can send me an e-mail anytime.

Ciao ciao, Antonio

Aaron wrote:

Thanks Antonio,

Now that I understand what the leading pipe does I see that this is a problem with liferea. Adding any feed (not just a command) from the command line causes a segfault:

$ liferea-add-feed https://blog.twitter.com/api/blog.rss?name=company
Segmentation fault (core dumped)

I'll have to check out their latest release and see if it's been fixed when I get a chance.

Best,
Aaron

Aaron wrote:

I've tried to reply a few times, but my messages are flagged as spam. It is a bug in liferea. Thanks for the help!

https://bugs.launchpad.net/ubuntu/+source/liferea/+bug/1669117

Aaron wrote:

I tried to reply, but it thought I was spam.

Thanks for your help, it's a problem with liferea. Liferea segfaults even with the URL of an RSS feed.

Aaron wrote:

it is this bug:

ao2 wrote:

Thanks for the info, it looks like it has been fixed in the dev version.

And sorry for not publishing your comments sooner.

Ciao, Antonio
