Tweeper: a Twitter to RSS web scraper
Tweeper is a web scraper which extracts the most recent public tweets of a given user from their home page on Twitter.com and formats them in RSS, so the information can be conveniently accessed and collected by a feed reader.
On June 11th 2013 Twitter.com retired their API v1.0, so it is no longer possible to access a user timeline via RSS, and authenticating via OAuth has become mandatory even to access this public information in JSON format.
Some pointers:
Some services came up to overcome this “problem”:
- http://twss.55uk.net/
- http://twitter-rss.com/ (now redirecting to google.com)
However these solutions are still shady and give the user no control over who collects information about the user timelines they visit.
This is why Tweeper (a Twitter feed scraper) was born, as an Open Source way to keep following your friends with a certain degree of anonymity, without having to tell Twitter.com who your friends are.
Get it from the Tweeper source code repository.
Tweeper can be used via web or as a command line program, for example as a filter in your feed reader, by passing the URL of the user's public timeline as the first argument.
Example of use on the command line:
$ php tweeper.php http://twitter.com/NSACareers
Example of use as a Liferea filter:
$ liferea-add-feed "|php .../path_to_tweeper/tweeper.php http://twitter.com/NSACareers"
Example of use with identi.ca:
$ liferea-add-feed "|php .../path_to_tweeper/tweeper.php http://identi.ca/evan"
Update, 2013-07-27
Tweeper now requires the URL of the public timeline as the argument; the Twitter screen name is not enough anymore. This makes usage uniform across different sites.
Sorry for changing the behavior of the interface, but it makes the multi-site code a lot simpler and I didn't want to carry around compatibility code so early in the development.
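If you have scripts that still pass a bare screen name, a small wrapper can build the timeline URL before calling Tweeper. This is only a sketch: timeline_url is a hypothetical helper and not part of Tweeper, and it assumes that the timeline URL is simply the site name followed by the screen name, as in the examples above.

```shell
# Hypothetical helper, NOT part of Tweeper: build a public timeline URL
# from a screen name (a leading "@" is stripped) and an optional site name.
# Assumption: timelines live at http://<site>/<screen name>.
timeline_url() {
    site="${2:-twitter.com}"
    printf 'http://%s/%s\n' "$site" "${1#@}"
}

# Usage (adjust the path to tweeper.php to match your checkout):
#   php tweeper.php "$(timeline_url NSACareers)"
#   php tweeper.php "$(timeline_url evan identi.ca)"
```

Keeping the URL construction outside Tweeper leaves the tool itself free of per-site compatibility code.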
Comments
http://rssitfor.me is another
http://rssitfor.me is another service that converts twitter user timelines to RSS.
It works perfectly for me - tweets are every 1 hour, which is fine for RSS.
Ah, cool, but a closed web
Ah, cool, but a closed web service can always be shut down, open source code is “for ever” :P
Ciao, Antonio
Cool - this is helpful. Is
Cool - this is helpful. Is there a way to include tweets that are even older? Not sure how it (or Twitter) decides which/how many/how far back tweets to include.
Hi Libby, I don't think
Hi Libby,
I don't think there is any way to choose how many tweets to import from the web page, it's a server-side thing to decide how many of them are to be rendered in HTML.
Tweeper just scrapes what's rendered in HTML.
However feed readers accumulate RSS items, so from a certain point on you will have all the “old” tweets (the ones not rendered anymore) in the feed reader.
Ciao, Antonio
Antonio, many thanks for your
Antonio, many thanks for your great work!
In the current version something goes wrong with the encoding though. For example the twitter account "kedye13" has a mixture of characters (Thai, Roman, Korean), which get corrupted in the feed.
I've tried this and that to force UTF-8, but that didn't work. So I can't point you to the solution.
Thanks, I'll take a look at
Thanks, I'll take a look at the encoding issue.
Ciao, Antonio
Hi - thanks for putting the
Hi - thanks for putting the effort into this.
Can I call the script via a URL? i.e.
http://www.myurl.com/tweeper.php?src_url=http://twitter.com/NSAcareers
When I do that I get a 403 error?
Any help appreciated
To make the search and @
To make the search and @ twitter links work, change the following line in the rss_converter_twitter.com.xsl file to
Thanks, if you want you can
Thanks, if you want you can send a patch so you get credit for that, otherwise I'll add the change myself.
Ciao, Antonio
BTW my feed reader
BTW my feed reader successfully expands relative URLs to absolute ones by picking up the correct base URL itself, even without an explicit xml:base. I think it takes the base URL from the parent <link> element.
However I can reproduce your issue by visiting a feed generated by tweeper via web with firefox; in this case specifying xml:base can be useful indeed.
Was the latter also your scenario?
Ciao, Antonio
helpful thanks
helpful thanks
Hi. I'm trying to use Tweeper
Hi. I'm trying to use Tweeper in the terminal but it says "Could not open input file: tweeper.php". Any suggestions? Thanks.
Hi Steve, how are you
Hi Steve, how are you executing tweeper?
If you are using the code downloaded from the git repository, you can run it like this:
Or, if you are on a Debian system, install the package and just run tweeper without the path prefix.
Ciao, Antonio
Hi Antonio. Thanks very much
Hi Antonio. Thanks very much for your response. I installed tweeper through the mint software manager and simply using "tweeper" by itself has worked. Cheers!!!
I am not able to get this
I am not able to get this working on Ubuntu 16.04 with the liferea and tweeper packages using the instructions above. (Substituting plain 'tweeper' for the path above). Liferea segfaults when I try this. I haven't encountered a pipe at the beginning of a quoted command, what does that do?
I was able to get it working by saving the tweeper output to a file in a cron job and then loading that file in liferea's New Subscription > advanced. This is probably closer to what I want anyway. Thanks for the work you've put in on the software, do you have a bug tracker someplace, just so I can check if my problem above is a known issue?
Thanks again
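The cron-based workaround described above can be sketched roughly as follows. save_feed is a hypothetical helper, not something shipped with Tweeper or Liferea, and the output path in the usage lines is an assumption. The idea is to write to a temporary file first, so that a failed or empty run never overwrites the last good copy of the feed.

```shell
# Hypothetical helper: run a feed-producing command and save its output.
# The target file is replaced only if the command succeeds and prints something.
#   $1   = output file
#   $2.. = command to run (for example: tweeper https://twitter.com/ao2it)
save_feed() {
    out_file="$1"
    shift
    # Create the temporary file next to the target so mv stays on one filesystem.
    tmp_file="$(mktemp "${out_file}.XXXXXX")" || return 1
    if "$@" > "$tmp_file" && [ -s "$tmp_file" ]; then
        mv "$tmp_file" "$out_file"
    else
        rm -f "$tmp_file"
        return 1
    fi
}

# Usage from a script called by cron (path and schedule are assumptions):
#   save_feed "$HOME/feeds/ao2it.xml" tweeper https://twitter.com/ao2it
```

The saved file can then be loaded in Liferea as a local file subscription, as in the workaround above.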
Hi Aaron, if you have
Hi Aaron,
if you have tweeper installed via packages you can use something like:
The leading pipe tells liferea that this is not a direct URL but rather a filter command to execute.
The same thing can be accomplished from the UI:
- New Subscription > Advanced;
- Command as the Source Type;
- tweeper https://twitter.com/ao2it in the Source field (without the pipe this time).
Let me know if this fixes your problem.
As far as bug reports go, you can send me an e-mail anytime.
Ciao ciao, Antonio
Thanks Antonio, Now that I
Thanks Antonio,
Now that I understand what the leading pipe does I see that this is a problem with liferea. Adding any feed (not just a command) from the command line causes a segfault:
$ liferea-add-feed https://blog.twitter.com/api/blog.rss?name=company
Segmentation fault (core dumped)
I'll have to check out their latest release and see if it's been fixed when I get a chance.
Best,
Aaron
I've tried to reply a few
I've tried to reply a few times, but my messages are called spam. It is a bug in liferea. Thanks for the help!
https://bugs.launchpad.net/ubuntu/+source/liferea/+bug/1669117
I tried to reply, but it
I tried to reply, but it thought I was spam.
Thanks for your help, it's a problem with liferea. Liferea segfaults even with the url of an rss feed.
it is this bug:
it is this bug: https://github.com/lwindolf/liferea/issues/479
Thanks!
Thanks for the info, it looks
Thanks for the info, it looks like it has been fixed in the dev version.
And sorry for not publishing your comments sooner.
Ciao, Antonio