the startups.com platform about startups.comCheck out the new Startups.com - A Comprehensive Startup University
Education
Planning
Mentors
Funding
Customers
Assistants
Clarity
Categories
Business
Sales & Marketing
Funding
Product & Design
Technology
Skills & Management
Industries
Other
Business
Career Advice
Branding
Financial Consulting
Customer Engagement
Strategy
Sectors
Getting Started
Human Resources
Business Development
Legal
Other
Sales & Marketing
Social Media Marketing
Search Engine Optimization
Public Relations
Branding
Publishing
Inbound Marketing
Email Marketing
Copywriting
Growth Strategy
Search Engine Marketing
Sales & Lead Generation
Advertising
Other
Funding
Crowdfunding
Kickstarter
Venture Capital
Finance
Bootstrapping
Nonprofit
Other
Product & Design
Identity
User Experience
Lean Startup
Product Management
Metrics & Analytics
Other
Technology
WordPress
Software Development
Mobile
Ruby
CRM
Innovation
Cloud
Other
Skills & Management
Productivity
Entrepreneurship
Public Speaking
Leadership
Coaching
Other
Industries
SaaS
E-commerce
Education
Real Estate
Restaurant & Retail
Marketplaces
Nonprofit
Other
Dashboard
Browse Search
Answers
Calls
Inbox
Sign Up Log In

Loading...

Share Answer

Menu
Data Science: How can I aggregate data from online sources about a specific topic?
OT
OT
Olga T, Data Mining and Analytics Expert answered:

There are so many ways to do it... Do you need this data for yourself, or you are planning to make a product around it?

From what I see you can use Twitter API and Facebook Graph API (Are you comfortable programming?) Most of the students are active on social media so you will find lots of data. Facebook graph API will give you a number of likes and comments to all the posts of you competitors. You can analyze all the posts of your competitors. Using Twitter API you can get all the twits that use certain hashtags or mentions. If you are not into coding, but still want to get social media information, you can take a look at tools like IBM Watson ANalytics ($30 for personal use), it natively connects to Twitter API, and you don't have to be a programmer at all. It is intuitive and easy to learn. Analytics Canvas connects to Facebook Graph API (it's free for 30 days of trial).

Unfortunately, you would not be able to collect any personal information from social media at large scale (age, income, gender, etc.), because it violates all the laws about privacy on the Internet. You can use census data instead.

Google Sheets are a very handy tool if you are planning to use this information for personal research. You can set up a spreadsheet and add some Java script to make it collect all information from competitor's blogs, and also sites like Reddit.

Finally, you can try web scraping (it's not the best, but can speed up the process). A tool like OutWitHub will collect information from websites (such as website reviews) based on the structure you provide (select html tags). You can collect thousands of reviews in one day if you automate it (paid version). Very easy to use.

Note: not all the websites are open to this method, review their policies to make sure you are not violating their terms of service. Reviews belong to the website where they were published.

If you REALLY need personal data (like how much they earn and how much they spend, etc.), just print out 100 questionnaires and go to Student Union Building of Dalhousie University. Most of the students will share any personal data in exchange for a Tim Horton's gift card that gets them a free coffee. It is probably the least technical and fastest way to get all the data you need.

Hope this helps.

Talk to Olga Upvote • Share
•••
Share Report

Answer URL

Share Question

  • Share on Twitter
  • Share on LinkedIn
  • Share on Facebook
  • Share on Google+
  • Share by email
About
  • How it Works
  • Success Stories
Experts
  • Become an Expert
  • Find an Expert
Answers
  • Ask a Question
  • Recent Answers
Support
  • Help
  • Terms of Service
Follow

the startups.com platform

Startups Education
Startup Planning
Access Mentors
Secure Funding
Reach Customers
Virtual Assistants

Copyright © 2025 Startups.com. All rights reserved.