Hello all, I am working on a project and want to download Twitter data. Using the Twitter API, I am only able to download 3 tweets. Is there a way to download at least 1,000 tweets?
You probably want to connect to the public streams. There are libraries for many languages: https://dev.twitter.com/overview/api/twitter-libraries
So far I've used Perl's Net::Twitter module successfully.
Depending on your setup (i.e. the account/keywords you pass to the API to look for), you may not get many tweets at the beginning, so you will need to stay connected to the stream for quite some time.
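In case a concrete example helps, here is a minimal sketch of connecting to the streaming API in Python with tweepy (version 3.x; any of the libraries in the list above would work similarly). The credentials, the keyword and the target of 1,000 tweets are only placeholders, not anything from the original question.

```python
import tweepy

# Placeholder credentials obtained from https://apps.twitter.com
CONSUMER_KEY = "your-consumer-key"
CONSUMER_SECRET = "your-consumer-secret"
ACCESS_TOKEN = "your-access-token"
ACCESS_SECRET = "your-access-secret"

class CollectListener(tweepy.StreamListener):
    """Collects tweets from the stream until a target count is reached."""
    def __init__(self, target=1000):
        super().__init__()
        self.tweets = []
        self.target = target

    def on_status(self, status):
        self.tweets.append(status._json)
        # Returning False disconnects the stream once we have enough tweets.
        return len(self.tweets) < self.target

    def on_error(self, status_code):
        # Disconnect on rate-limit errors (HTTP 420) instead of reconnecting aggressively.
        return status_code != 420

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)

listener = CollectListener(target=1000)
stream = tweepy.Stream(auth=auth, listener=listener)
stream.filter(track=["your keyword"])  # blocks until the listener disconnects
```

How long this takes to reach 1,000 tweets depends entirely on how popular the tracked keyword is.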
Depending on what you are after, NodeXL can be a good package. It does not require programming skills; it's simply a plugin for Excel, and the people behind it are helpful if you get stuck.
@Marius: Is NodeXL useful for doing sentiment analysis? I tried NodeXL; it is useful for network analysis, but I want to use tweets for sentiment analysis.
You can install an Oracle VM as the server and then use PuTTY to connect to it. You need some Hive queries to extract the tweets, plus a JSON transformation step, before the data can be used for sentiment analysis.
Some good suggestions above, but you may also want to take a look at Chorus Tweetcatcher (TCD) for a simple GUI data collection tool. You can run queries and download user-specific timelines. It can also be used in conjunction with a prototype visual analytics tool called Tweetvis which provides summary statistics and various visualizations to help you explore your datasets.
Chorus is free to download and use: http://chorusanalytics.co.uk
Where can I download movie reviews where users have given opinions with ratings? I tried Rotten Tomatoes, but it's not working. I want them for sentiment analysis. Can anybody help me?
Please also suggest other sites that allow users to download rated opinions on movies or any other product.
I suggest you use the R package twitteR (for backward search) or streamR (for forward search). I found these packages easy to use, and you can crawl up to 1,500 tweets in one go. I have attached a screenshot of the twitteR package.
The more parameters you add, the more specific the search and the better the accuracy. I get more accurate results by specifying, for example, n=1000, since (a date), sinceID (a tweet ID) or geocode.
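For anyone working in Python rather than R, a roughly equivalent backward (REST) search with tweepy 3.x (where the method is API.search) looks like the sketch below. The credentials, query, date and geocode are illustrative placeholders; the parameters mirror the twitteR arguments mentioned above (n, since, geocode).

```python
import tweepy

# Placeholder credentials from https://apps.twitter.com
auth = tweepy.OAuthHandler("consumer-key", "consumer-secret")
auth.set_access_token("access-token", "access-secret")
api = tweepy.API(auth, wait_on_rate_limit=True)   # sleep through rate limits automatically

# Like twitteR's searchTwitter(): the more criteria you add, the more specific the results.
tweets = tweepy.Cursor(
    api.search,
    q="your keyword since:2017-01-01",   # 'since:' works as a search operator inside the query
    geocode="51.5074,-0.1278,10km",      # lat,long,radius, like twitteR's geocode argument
    lang="en",
).items(1000)                            # like n=1000 in twitteR

for tweet in tweets:
    print(tweet.id, tweet.created_at, tweet.text)
```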
@Venkata, I forgot one thing: the API cannot retrieve tweets older than 7 days. Other than that, I have never had any problem crawling Twitter data. By the way, you can give me one of your keywords and I will check from my end whether I experience the same problem.
You should specify the programming language you are using. Anyway, a list of libraries to interact with the Twitter API is available at https://dev.twitter.com/overview/api/twitter-libraries.
Depending on your needs, you could use:
the REST API (https://dev.twitter.com/rest/public), if you want to execute searches to retrieve tweets matching some specified criteria, or
the STREAMING API (https://dev.twitter.com/streaming/overview), if you need to monitor some keywords or accounts (https://dev.twitter.com/streaming/reference/post/statuses/filter) or if you want to sample the whole Twitter stream (https://dev.twitter.com/streaming/reference/get/statuses/sample).
If you are using Java as your programming language, consider Twitter4J (http://twitter4j.org), which is probably the best library for interacting with Twitter. It is also well documented.
I mean: there is ready-made software for this. You should check Deen Freelon's list: https://docs.google.com/document/d/1UaERzROI986HqcwrBDLaqGG8X_lYwctj6ek6ryqDOiQ/edit
The list is curated by Deen but compiled from the contributions of many internet researchers chatting on the AoIR list ([email protected]). "Proudly", I have contributed to it too :)
I also suggest you start following that list, as there are many helpful people there!
Hi, I'm in a similar situation. I need historical Twitter data for social network analysis. What would be the advantages/disadvantages of scraping tweets from Twitter's advanced search versus downloading data from the Internet Archive, both in terms of coverage/completeness of the data set and in terms of the tools and/or programming skills I would need? Thanks.
It seems to me that obtaining full historical data from Twitter is not possible. Please see this paper http://www.sciencedirect.com/science/article/pii/S0378873314000057 for a discussion of the biases of different data-collection methods.
You can learn the procedures by following the steps in this project https://www.researchgate.net/publication/275947996_BIG_DATA_STATISTICS_WITH_R?ev=prf_pub
@Dominik - thanks for the article! I gather that I can't use the API as it won't provide me with historical data (I need to go further back than one week). Hence I'm wondering what the bias is of using Twitter's advanced search, compared to the firehose and/or API. I've seen studies comparing the API and the firehose like the one you posted but nothing about the web search itself. Have you heard of any such studies? And am I right in guessing that the Internet Archive is based on the streaming API?
@Udeh - thanks, this will be really useful as a next step!
@Udeh: thank you very much for your R code. It works perfectly! Maybe, if someone doesn't know how to get the API and token codes, they can get them at https://apps.twitter.com.
You can get Twitter data from Podargos Data. All you have to do is tell them your needs.
Podargos can provide historical and real-time Twitter data. In addition, it can help you retrieve other mobile-app data from social networks and e-commerce, and it supports nearly 100 languages. I hope this information helps you.
Hi Venkat, I am also facing the same problem. I have a solution for downloading data from Facebook using the Netvizz app, but nowadays I can't download data from FB.
Can anyone help me download FB comments using Netvizz or some other easy way? I also need a .gdf file to import into Gephi.
Hi, you can scrape data using the "Search Twitter" operator in RapidMiner. With this operator you can specify a query and get the Twitter statuses containing it.
I'm a researcher from the University of Melbourne in Australia. The group I work in is scraping Twitter data to analyse aspects of emotion control. After many months we are now collecting tweets and user timelines in a MongoDB database using Docker, pymongo and tweepy. We currently have a dataset of 20 million tweets from 8,000 individual users, and it's growing daily.
Happy to share information as I think helping other researchers will help us.
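To give an idea, the sketch below shows the kind of tweepy + pymongo pipeline described above (not the group's actual code). It assumes placeholder credentials, a MongoDB instance reachable on localhost (for example, one started as a Docker container), and arbitrary database/collection names.

```python
import tweepy
from pymongo import MongoClient

# Placeholder credentials from https://apps.twitter.com
auth = tweepy.OAuthHandler("consumer-key", "consumer-secret")
auth.set_access_token("access-token", "access-secret")
api = tweepy.API(auth, wait_on_rate_limit=True)

client = MongoClient("mongodb://localhost:27017")   # e.g. a MongoDB container started with Docker
collection = client["twitter"]["timelines"]         # database/collection names are arbitrary

def store_timeline(screen_name, max_tweets=3200):
    """Fetch up to ~3,200 of a user's most recent tweets (the API's timeline cap)
    and upsert them by tweet id so repeated runs don't create duplicates."""
    for status in tweepy.Cursor(api.user_timeline, screen_name=screen_name).items(max_tweets):
        collection.replace_one({"_id": status.id}, dict(status._json, _id=status.id), upsert=True)

store_timeline("some_user")
```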
If you are looking for information older than a week, a current limitation of the Twitter API, you can crawl the advanced search with the script in the link below.
Real-time data collection using the Twitter Search API and Gnip PowerTrack 2.0 is available through a point-and-click (no programming) interface at https://discovertext.com. Here is a video with the top 10 reasons to use DiscoverText:
https://vimeo.com/170157685
For three free historical Twitter data estimates per day, create a free account on Sifter (https://sifter.texifter.com).
For more information about "Gnip" PowerTrack data, please refer to:
I used this tool a month ago to download Twitter data. I am not sure whether the free trial will let you export 30,000 rows at a time, but feel free to check it out: https://www.exporttweet.com/
If you try the Twitter API, you'll face the 7-day limitation. Twitter now offers a Premium API for bulk requests, though I still don't understand why they don't offer any option for educational research. You can still try other options, such as scraping data from Twitter's search pages. There's a good option here: even if you don't know Python, this tool will help you get bulk tweets, and it can cover different time spans: https://github.com/Jefferson-Henrique/GetOldTweets-python
Also follow our project SentiTweet: https://www.sentitweet.com
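For reference, using the GetOldTweets-python tool mentioned above looks roughly like the sketch below, based on its README (module and method names are from memory and may have changed, so please check the repository). The query, dates and tweet count are placeholders.

```python
import got3 as got   # 'got' for Python 2, 'got3' for Python 3, per the README

criteria = (got.manager.TweetCriteria()
            .setQuerySearch("your keyword")
            .setSince("2015-05-01")      # well beyond the official API's 7-day window
            .setUntil("2015-09-30")
            .setMaxTweets(1000))

tweets = got.manager.TweetManager.getTweets(criteria)
for t in tweets:
    print(t.date, t.username, t.text)
```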
Have you considered using a social media monitoring tool?
There are many you could test, and I'd advise using a paid tool. Free ones usually lack features that you might actually find the most valuable, whether it's analytics or data visualisation.
As far as my experience goes, the tool I've spent quite a big chunk of my time with is Brand24 (https://brand24.com). It's a media monitoring tool that allows for monitoring both social and traditional web and then exporting the data to .xls files, infographics or .pdf reports.
Coming back to crawling Twitter data, it's very effective thanks to access to Twitter's API. However, media monitoring tools can only collect mentions that are publicly available and have no access restrictions. On top of that, historical data is limited, so the tool is most accurate from the moment the monitoring project is created onwards.
As far as other tools are concerned, you might also want to have a look at Brandwatch (https://brandwatch.com) or Sysomos (https://sysomos.com).
Thank you all for your contributions; I read them with great interest. I am now able to download tweets as described in some of the links. Now I would like to build a continuously enriched database of tweets, so as to overcome the 18,000-tweet and 7-day constraints of the free API. Do you have any advice for me?