There are various types of text data:
- User Data
Now, let’s move on to our topic: “Where to get the data from?”
Below are a few sources of data:
- Linguistic Data Consortium (https://www.ldc.upenn.edu/)
- Web Crawling/Scraping
- APIs: Twitter, Wordnik, etc.
- University sites & academic communities.
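To make the web scraping route a little more concrete, here is a minimal sketch that pulls the visible text out of an HTML page using only Python's standard library. The class name `TextExtractor`, the helper `extract_text`, and the sample HTML string are all hypothetical and chosen for illustration. A real crawler would also need to download the pages and respect each site's terms of use, which this sketch leaves out.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML string, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page content standing in for a downloaded document.
sample_html = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>NLP</h1><p>Text is <b>everywhere</b></p></body></html>"
)
page_text = extract_text(sample_html)
print(page_text)
```

The extracted string can then be stored as raw text data for later processing.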
Below are a few classic NLP problems:
- Linguistically-motivated: Segmentation, Tagging, Parsing, etc.
- Analytical: Classification, Sentiment Analysis
- Transformation: Translation, Correction, Generation
- Conversation: Question Answering, Dialog, etc.
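As a rough illustration of two of the categories above, the sketch below performs a very simple word segmentation (a linguistically-motivated task) and then a lexicon-based sentiment classification (an analytical task). The word lists `POSITIVE` and `NEGATIVE` and the function names are hypothetical, and the approach is deliberately naive. Practical systems typically rely on trained models rather than hand-written word lists.

```python
import re

# Tiny, hypothetical sentiment lexicons used only for illustration.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def tokenize(text):
    """Segment text into lowercase word tokens (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def sentiment(text):
    """Classify text by counting positive and negative lexicon hits."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(tokenize("I love NLP!"))
print(sentiment("I love NLP!"))
```

The same tokenizer output could feed other tasks in the list, such as tagging or parsing, which need the text split into units first.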