Difference between revisions of "NLP"
Revision as of 16:41, 26 October 2019
Vocabulary
- Token
- Bag of words
- Stemming and lemmatization
Resources
- Stanford CS224n: Natural Language Processing with Deep Learning (Winter 2019)
- NLTK book (https://www.nltk.org/book/)
- Jurafsky book (https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf)
Parts of speech tagging
- POS, word classes, syntactic categories: nouns, verbs, adjectives, conjunctions
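- Categorizing and Tagging Words (NLTK book chapter 5): https://www.nltk.org/book/ch05.html
- Rachiele G. 2018 Medium article: https://medium.com/@gianpaul.r/tokenization-and-parts-of-speech-pos-tagging-in-pythons-nltk-library-2d30f70af13b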
nltk syntax
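- tokens = nltk.word_tokenize(text) is a more robust alternative to text.split(): it separates punctuation and contractions from the words they are attached to
- nltk.pos_tag(tokens) returns a list of (token, tag) pairs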
List of POS
- CC coordinating conjunction
- CD cardinal number
- DT determiner
- EX existential there (like: “there is” … think of it like “there exists”)
- FW foreign word
- IN preposition/subordinating conjunction
- JJ adjective ‘big’
- JJR adjective, comparative ‘bigger’
- JJS adjective, superlative ‘biggest’
- LS list item marker ‘1)’
- MD modal could, will
- NN noun, singular ‘desk’
- NNS noun plural ‘desks’
- NNP proper noun, singular ‘Harrison’
- NNPS proper noun, plural ‘Americans’
- PDT predeterminer ‘all the kids’
- POS possessive ending ’s, as in ‘parent’s’
- PRP personal pronoun I, he, she
- PRP$ possessive pronoun my, his, her
- RB adverb very, silently
- RBR adverb, comparative better
- RBS adverb, superlative best
- RP particle give up
- TO to, as in go ‘to’ the store
- UH interjection errrrrrrrm
- VB verb, base form take
- VBD verb, past tense took
- VBG verb, gerund/present participle taking
- VBN verb, past participle taken
- VBP verb, non-3rd person sing. present take
- VBZ verb, 3rd person sing. present takes
- WDT wh-determiner which
- WP wh-pronoun who, what
- WP$ possessive wh-pronoun whose
- WRB wh-abverb where, when