pattern.it

The pattern.it module contains a fast part-of-speech tagger for Italian (identifies nouns, adjectives, verbs, etc. in a sentence) and tools for Italian verb conjugation and noun singularization & pluralization.

It can be used by itself or with other pattern modules: web | db | en | search | vector | graph.


Documentation

The functions in this module take the same parameters and return the same values as their counterparts in pattern.en. Refer to the documentation there for more details.  

Gender

Italian nouns and adjectives inflect according to gender. The gender() function predicts the gender (MALE, FEMALEPLURAL) of a given noun with about 92% accuracy: 

>>> from pattern.it import gender, MALE, FEMALE, PLURAL
>>> print gender('gatti')

(MALE, PLURAL)

Article

The article() function returns the article (INDEFINITE or DEFINITE) inflected by gender (e.g., il gatto → i gatti).

>>> from pattern.it import article, DEFINITE, MALE, PLURAL
>>> print article('gatti', DEFINITE, gender=(MALE, PLURAL))

i

Noun singularization & pluralization

For Italian nouns there is singularize() and pluralize(). The implementation is slightly less robust than the English version (accuracy 84% for singularization and 93% for pluralization).

>>> from pattern.it import singularize, pluralize
>>>  
>>> print singularize('gatti')
>>> print pluralize('gatto')

gatto
gatti 

Verb conjugation

For Italian verbs there is conjugate(), lemma(), lexeme() and tenses(). The lexicon for verb conjugation contains about 1,250 common Italian verbs, mined from Wiktionary. For unknown verbs it will fall back to a rule-based approach with an accuracy of about 86%. 

Italian verbs have more tenses than English verbs. In particular, the plural differs for each person, and there are additional forms for the FUTURE tense, the IMPERATIVE, CONDITIONAL and SUBJUNCTIVE mood and the PERFECTIVE aspect:

>>> from pattern.it import conjugate
>>> from pattern.it import INFINITIVE, PRESENT, PAST, SG, SUBJUNCTIVE, PERFECTIVE
>>>  
>>> print conjugate('sono', INFINITIVE)
>>> print conjugate('sono', PRESENT, 1, SG, mood=SUBJUNCTIVE)
>>> print conjugate('sono', PAST, 3, SG) 
>>> print conjugate('sono', PAST, 3, SG, aspect=PERFECTIVE) 

essere
sia
era 
fu   

For PAST tense + PERFECTIVE aspect we can also use PRETERITE (passato remoto) For PAST tense + IMPERFECTIVE aspect we can also use IMPERFECT (imperfetto).

>>> from pattern.it import conjugate
>>> from pattern.it import IMPERFECT, PRETERITE
>>>  
>>> print conjugate('sono', IMPERFECT, 3, SG)
>>> print conjugate('sono', PRETERITE, 3, SG)

era
fu   

 The conjugate() function takes the following optional parameters:

Tense Person Number Mood Aspect Alias Example
INFINITVE None None None None "inf" essere
PRESENT 1 SG INDICATIVE IMPERFECTIVE "1sg" io sono
PRESENT 2 SG INDICATIVE IMPERFECTIVE "2sg" tu sei
PRESENT 3 SG INDICATIVE IMPERFECTIVE "3sg" lui è
PRESENT 1 PL INDICATIVE IMPERFECTIVE "1pl" noi siamo
PRESENT 2 PL INDICATIVE IMPERFECTIVE "2pl" voi siete
PRESENT 3 PL INDICATIVE IMPERFECTIVE "3pl" loro sono
PRESENT None None INDICATIVE PROGRESSIVE "part" essendo
 
PRESENT 2 SG IMPERATIVE IMPERFECTIVE "2sg!" sii
PRESENT 3 SG IMPERATIVE IMPERFECTIVE "3sg!" sia
PRESENT 1 PL IMPERATIVE IMPERFECTIVE "1pl!" siamo
PRESENT 2 PL IMPERATIVE IMPERFECTIVE "2pl!" siate
PRESENT 3 PL IMPERATIVE IMPERFECTIVE "3pl!" siano
 
PRESENT 1 SG SUBJUNCTIVE IMPERFECTIVE "1sg?" io sia
PRESENT 2 SG SUBJUNCTIVE IMPERFECTIVE "2sg?" tu sia
PRESENT 3 SG SUBJUNCTIVE IMPERFECTIVE "3sg?" lui sia
PRESENT 1 PL SUBJUNCTIVE IMPERFECTIVE "1pl?" noi siamo
PRESENT 2 PL SUBJUNCTIVE IMPERFECTIVE "2pl?" voi siate
PRESENT 3 PL SUBJUNCTIVE IMPERFECTIVE "3pl?" loro siano
 
PAST 1 SG INDICATIVE IMPERFECTIVE "1sgp" io ero
PAST 2 SG INDICATIVE IMPERFECTIVE "2sgp" tu eri
PAST 3 SG INDICATIVE IMPERFECTIVE "3sgp" lui era
PAST 1 PL INDICATIVE IMPERFECTIVE "1ppl" noi eravamo
PAST 2 PL INDICATIVE IMPERFECTIVE "2ppl" voi eravate
PAST 3 PL INDICATIVE IMPERFECTIVE "3ppl" loro erano
PAST None None INDICATIVE PROGRESSIVE "ppart" stato
 
PAST 1 SG INDICATIVE PERFECTIVE "1sgp+" io fui
PAST 2 SG INDICATIVE PERFECTIVE "2sgp+" tu fosti
PAST 3 SG INDICATIVE PERFECTIVE "3sgp+" lui fu
PAST 1 PL INDICATIVE PERFECTIVE "1ppl+" noi fummo
PAST 2 PL INDICATIVE PERFECTIVE "2ppl+" voi foste
PAST 3 PL INDICATIVE PERFECTIVE "3ppl+" loro furono
 
PAST 1 SG SUBJUNCTIVE IMPERFECTIVE "1sgp?" io fossi
PAST 2 SG SUBJUNCTIVE IMPERFECTIVE "2sgp?" tu fossi
PAST 3 SG SUBJUNCTIVE IMPERFECTIVE "3sgp?" lui fosse
PAST 1 PL SUBJUNCTIVE IMPERFECTIVE "1ppl?" noi fossimo
PAST 2 PL SUBJUNCTIVE IMPERFECTIVE "2ppl?" voi foste
PAST 3 PL SUBJUNCTIVE IMPERFECTIVE "3ppl?" loro fossero
 
FUTURE 1 SG INDICATIVE IMPERFECTIVE "1sgf" io sarò
FUTURE 2 SG INDICATIVE IMPERFECTIVE "2sgf" tu sarai
FUTURE 3 SG INDICATIVE IMPERFECTIVE "3sgf" lui sarà
FUTURE 1 PL INDICATIVE IMPERFECTIVE "1plf" noi saremo
FUTURE 2 PL INDICATIVE IMPERFECTIVE "2plf" voi sarete
FUTURE 3 PL INDICATIVE IMPERFECTIVE "3plf" loro saranno
 
CONDITIONAL 1 SG INDICATIVE IMPERFECTIVE "1sg->" io sarei
CONDITIONAL 2 SG INDICATIVE IMPERFECTIVE "2sg->" tu saresti
CONDITIONAL 3 SG INDICATIVE IMPERFECTIVE "3sg->" lui sarebbe
CONDITIONAL 1 PL INDICATIVE IMPERFECTIVE "1pl->" noi saremmo
CONDITIONAL 2 PL INDICATIVE IMPERFECTIVE "2pl->" voi sareste
CONDITIONAL 3 PL INDICATIVE IMPERFECTIVE "3pl->" loro sarebbero

Instead of optional parameters, a single short alias, or PARTICIPLE or PAST+PARTICIPLE can also be given. With no parameters, the infinitive form of the verb is returned.

Attributive & predicative adjectives 

Italian adjectives inflect with suffixes -o → -i (masculine) and -a → -e (feminine), with some exceptions  (e.g., grande → i grandi felini). You can get the base form with the predicative() function. A statistical approach is used with an accuracy of 88%.

>>> from pattern.it import attributive
>>> print predicative('grandi') 

grande  

Parser

For parsing there is parse(), parsetree() and split(). The parse() function annotates words in the given string with their part-of-speech tags (e.g., NN for nouns and VB for verbs). The parsetree() function takes a string and returns a tree of nested objects (Text → Sentence → Chunk → Word). The split() function takes the output of parse() and returns a Text. See the pattern.en documentation (here) how to manipulate Text objects. 

>>> from pattern.it import parse, split
>>>  
>>> s = parse('Il gatto nero faceva le fusa.')
>>> for sentence in split(s):
>>>     print sentence

Sentence('Il/DT/B-NP/O gatto/NN/I-NP/O nero/JJ/I-NP/O'
         'faceva/VB/B-VP/O'
         'le/DT/B-NP/O fusa/NN/I-NP/O ././O/O')

The parser is mined from Wiktionary. The accuracy is around 92%.

Sentiment analysis

There's no sentiment() function for Italian yet.