Difference between revisions of "NLP"
Revision as of 16:41, 26 October 2019
Vocabulary
- Token
- Bag of words
- Stemming and lemmatization
Resources
- Stanford CS224n: Natural Language Processing with Deep Learning (Winter 2019)
- NLTK book
- Jurafsky book
Parts of speech tagging
- POS, word classes, syntactic categories: nouns, verbs, adjectives, conjunctions
- Categorizing and Tagging Words (NLTK book chapter 5)
- Rachiele G. 2018 Medium article
nltk syntax
- tokens = nltk.word_tokenize(text) is a more robust alternative to text.split()
- nltk.pos_tag(tokens) tags each token with its part of speech
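The two calls above can be sketched side by side with a plain whitespace split. This is a minimal sketch, not a definitive recipe: the function names naive_tokens and tag_sentence are made up for illustration, and it assumes NLTK is installed and that its tokenizer and tagger data have already been fetched with nltk.download (the resource names differ between NLTK versions, so check the error message NLTK prints if a resource is missing).

```python
def naive_tokens(text):
    """Whitespace split: punctuation stays glued to the neighbouring word."""
    return text.split()


def tag_sentence(text):
    """Tokenize with NLTK, then attach a POS tag to each token.

    Hypothetical helper for illustration; returns a list of (word, tag) pairs.
    """
    # Imported here so that naive_tokens still works without NLTK installed.
    import nltk

    tokens = nltk.word_tokenize(text)  # splits off punctuation and clitics
    return nltk.pos_tag(tokens)
```

For a sentence such as "Harrison didn't take the desk.", naive_tokens leaves "desk." and "didn't" as single tokens, whereas word_tokenize separates the final period and splits the contraction, which is what makes it the more robust choice before tagging.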
NLTK POS tagset
- CC coordinating conjunction
- CD cardinal number
- DT determiner
- EX existential there (like: “there is” … think of it like “there exists”)
- FW foreign word
- IN preposition/subordinating conjunction
- JJ adjective ‘big’
- JJR adjective, comparative ‘bigger’
- JJS adjective, superlative ‘biggest’
- LS list marker 1)
- MD modal could, will
- NN noun, singular ‘desk’
- NNS noun plural ‘desks’
- NNP proper noun, singular ‘Harrison’
- NNPS proper noun, plural ‘Americans’
- PDT predeterminer ‘all the kids’
- POS possessive ending parent’s
- PRP personal pronoun I, he, she
- PRP$ possessive pronoun my, his, hers
- RB adverb very, silently
- RBR adverb, comparative better
- RBS adverb, superlative best
- RP particle give up
- TO to, as in go ‘to’ the store
- UH interjection, errrrrrrrm
- VB verb, base form take
- VBD verb, past tense took
- VBG verb, gerund/present participle taking
- VBN verb, past participle taken
- VBP verb, sing. present, non-3rd person take
- VBZ verb, 3rd person sing. present takes
- WDT wh-determiner which
- WP wh-pronoun who, what
- WP$ possessive wh-pronoun whose
- WRB wh-adverb where, when
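Because related tags in this list share a prefix (NN, NNS, NNP, NNPS all start with NN), tagged output can be collapsed into coarse word classes by looking at the first two characters. The sketch below is one possible way to do that. The COARSE_CLASSES grouping and both function names are illustrative assumptions, not part of NLTK, and the sample pairs are written by hand in the (word, tag) shape that nltk.pos_tag returns.

```python
from collections import defaultdict

# Coarse word classes keyed by tag prefix. This grouping is an illustrative
# choice for these notes, not something NLTK defines.
COARSE_CLASSES = {
    "NN": "noun",
    "VB": "verb",
    "JJ": "adjective",
    "RB": "adverb",
}


def coarse_class(tag):
    """Map a fine-grained tag such as 'NNS' or 'VBD' to a coarse class."""
    return COARSE_CLASSES.get(tag[:2], "other")


def group_by_class(tagged):
    """Group the words of (word, tag) pairs by coarse class."""
    groups = defaultdict(list)
    for word, tag in tagged:
        groups[coarse_class(tag)].append(word)
    return dict(groups)


# Hand-written pairs for illustration; not real tagger output.
sample = [
    ("Harrison", "NNP"),
    ("took", "VBD"),
    ("the", "DT"),
    ("biggest", "JJS"),
    ("desks", "NNS"),
    ("silently", "RB"),
]
nouns = group_by_class(sample)["noun"]
```

Note that WRB (wh-adverb) starts with "WR", so a prefix lookup files it under "other" rather than "adverb"; handle it separately if that distinction matters.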