1. Crawler4j: an open-source Java crawler that provides a simple interface for crawling the Web.
https://code.google.com/p/crawler4j/
2. Apache Nutch: a highly extensible and scalable open-source web crawler.
https://nutch.apache.org/
3. Ex-Crawler: divided into three subprojects. The Ex-Crawler server daemon is a highly configurable, flexible web crawler written in Java, with distributed grid / volunteer computing features. Crawled information is stored in a MySQL, MSSQL, or PostgreSQL database.
4. Whalebot: an open-source web crawler intended to be simple, fast, and memory efficient. It was created as a targeted spider, but you can also use it as a general-purpose crawler.
https://code.google.com/p/whalebot/
You can study the source code available on the respective websites and design your own crawler accordingly.
If you have any further questions, feel free to ask.
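Before reading those code bases, it may help to see the core loop most crawlers share: a frontier queue of URLs to visit, a set of URLs already seen, and a link extractor. The following is only a minimal Python sketch under stated assumptions, not how any of the projects above is implemented. The names `LinkExtractor`, `extract_links`, and `crawl` are hypothetical, and `fetch` is an assumed callable that you supply (it takes a URL and returns the page HTML, or None on failure).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute URLs (fragments removed) linked from the page."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.links]


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl starting at seed.

    fetch is an assumed, caller-supplied function: URL -> HTML string,
    or None if the page could not be retrieved.
    Returns the list of successfully fetched URLs in visit order.
    """
    frontier = deque([seed])   # URLs waiting to be fetched
    seen = {seed}              # URLs already queued, to avoid revisits
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue           # skip pages that failed to load
        visited.append(url)
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited
```

To try it without touching the network, you can pass a dictionary lookup as `fetch`, for example `crawl("http://example.test/", pages.get)` where `pages` maps URLs to HTML strings. A real crawler would also need an HTTP client, politeness delays, and robots.txt handling, which this sketch leaves out.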
Please check the Twitter API website. They recommend some open-source code in C++, C#, Java, and Python for crawling Twitter with the new authentication. The following link points to that source code: