Commit 3449a126 authored Jun 13, 2007 by Ewan Klein
preliminary changes towards using fs.py
svn/trunk@4666
parent 8a3e49a7
Showing 1 changed file with 44 additions and 38 deletions
nltk/test/featgram.doctest (+44 -38)
@@ -4,10 +4,10 @@ tested and revised. Assuming we have saved feat0cfg_ as a file named
 ``'feat0.cfg'``, the function ``GrammarFile.read_file()`` allows us to
 read the grammar into NLTK, ready for use in parsing.
->>> from nltk.parse import *
+>>> from nltk import parse, tokenize
->>> from nltk.parse.featurechart import *
+>>> from nltk import fs as featstruct
->>> import nltk.tokenize
+>>> from nltk.fs import FS
->>> cp = load_earley('feat0.cfg', trace=2)
+>>> cp = parse.load_earley('feat0.cfg', trace=2)
 >>> sent = 'Kim likes children'
 >>> tokens = list(tokenize.whitespace(sent))
 >>> tokens
@@ -49,25 +49,31 @@ read the grammar into NLTK, ready for use in parsing.
 Feature structures in NLTK are ... Atomic feature values can be strings or
 integers.
->>> fs1 = dict(TENSE='past', NUM='sg')
+>>> fs1 = FS(dict(TENSE='past', NUM='sg'))
 >>> print fs1
 {'NUM': 'sg', 'TENSE': 'past'}
 We can think of a feature structure as being like a Python dictionary,
 and access its values by indexing in the usual way.
->>> fs1 = dict(PER=3, NUM='pl', GND='fem')
+>>> fs1 = FS(dict(PER=3, NUM='pl', GND='fem'))
 >>> print fs1['GND']
 fem
 We can also define feature structures which have complex values, as
 discussed earlier.
->>> fs2 = dict(POS='N', AGR=fs1)
+>>> fs2 = FS(dict(POS='N', AGR=fs1))
 >>> print fs2
-{'AGR': {'NUM': 'pl', 'GND': 'fem', 'PER': 3}, 'POS': 'N'}
+AGR:
+GND: fem
+NUM: pl
+PER: 3
+POS: N
 >>> print fs2['AGR']
-{'NUM': 'pl', 'GND': 'fem', 'PER': 3}
+GND: fem
+NUM: pl
+PER: 3
 >>> print fs2['AGR']['PER']
 3
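For readers without the 2007-era `nltk.fs` module, the nested indexing exercised in this hunk can be illustrated with plain Python dictionaries. This is only a sketch: the `get_path` helper is hypothetical and is not part of NLTK, and ordinary dicts stand in for the `FS` wrapper used in the doctest.

```python
# Plain nested dicts standing in for feature structures (an assumption;
# the doctest itself wraps them in the FS class from nltk.fs).
agr = {"PER": 3, "NUM": "pl", "GND": "fem"}
noun = {"POS": "N", "AGR": agr}


def get_path(fs, path):
    """Follow a sequence of feature names through nested dicts (hypothetical helper)."""
    value = fs
    for feature in path:
        value = value[feature]
    return value


# Equivalent to noun["AGR"]["PER"] in the doctest.
print(get_path(noun, ["AGR", "PER"]))
```

Writing the lookup as a path makes it clear that a complex value such as `AGR` is itself a feature structure that can be indexed again.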
@@ -80,23 +86,23 @@ Representing dictionaries in YAML form is useful for making feature
 structures readable:
 >>> from nltk.parse.featurelite import *
->>> f1 = yaml.load("NUMBER: SINGULAR")
+>>> f1 = featstruct.parse("NUMBER: SINGULAR")
->>> f2 = yaml.load("PERSON: 3")
+>>> f2 = featstruct.parse("PERSON: 3")
->>> print show(unify(f1, f2))
+>>> print unify(f1, f2)
 NUMBER: SINGULAR
 PERSON: 3
->>> f1 = yaml.load('''
+>>> f1 = featstruct.parse('''
 ... A:
 ... B: b
 ... D: d
 ... ''')
->>> f2 = yaml.load('''
+>>> f2 = featstruct.parse('''
 ... A:
 ... C: c
 ... D: d
 ... ''')
->>> print show(unify(f1, f2))
+>>> print unify(f1, f2)
 A:
 B: b
 C: c
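The `unify` calls in this hunk merge two feature structures recursively. A minimal sketch of that idea over plain nested dicts is given below; it is not the implementation in `nltk.fs` or `featurelite`, it ignores variables and structure sharing, and the `UnificationFailure` exception and the exact nesting of the sample data are assumptions made for illustration.

```python
import copy


class UnificationFailure(Exception):
    """Raised when two atomic values clash (hypothetical name)."""


def unify(fs1, fs2):
    """Return a new dict combining fs1 and fs2, recursing into nested dicts."""
    result = copy.deepcopy(fs1)
    for feature, value in fs2.items():
        if feature not in result:
            # Feature only present in fs2: copy it across.
            result[feature] = copy.deepcopy(value)
        elif isinstance(result[feature], dict) and isinstance(value, dict):
            # Both values are complex: unify them recursively.
            result[feature] = unify(result[feature], value)
        elif result[feature] != value:
            # Atom vs. different atom, or atom vs. complex value.
            raise UnificationFailure(feature)
    return result


# Sample data; the nesting of D is an assumption, since the page lost indentation.
f1 = {"A": {"B": "b"}, "D": "d"}
f2 = {"A": {"C": "c"}, "D": "d"}
merged = unify(f1, f2)
print(merged)
```

Because the inputs are deep-copied, neither `f1` nor `f2` is modified, which keeps the sketch easy to reason about at the cost of losing any shared substructure.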
@@ -110,12 +116,12 @@ Feature structures are not inherently tied to linguistic objects; they are
 general purpose structures for representing knowledge. For example, we
 could encode information about a person in a feature structure:
->>> person01 = yaml.load('''
+>>> person01 = featstruct.parse('''
 ... NAME: 'Lee'
 ... TELNO: '01 27 86 42 96'
 ... AGE: 33
 ... ''')
->>> print show(person01)
+>>> print person01
 AGE: 33
 NAME: Lee
 TELNO: 01 27 86 42 96
@@ -127,7 +133,7 @@ is prefixed with an integer in parentheses, such as ``(1)``, and any
 subsequent reference to that structure uses the notation
 ``->(1)``, as shown below.
->>> fs=yaml.load("""
+>>> fs=featstruct.parse("""
 ... NAME: 'Lee'
 ... ADDRESS: &1
 ... NUMBER: 74
@@ -136,7 +142,7 @@ subsequent reference to that structure uses the notation
 ... NAME: 'Kim'
 ... ADDRESS: *1
 ... """)
->>> print show(fs)
+>>> print fs
 ADDRESS: &id001
 NUMBER: 74
 STREET: rue Pascal
@@ -147,14 +153,14 @@ subsequent reference to that structure uses the notation
 There can be any number of tags within a single feature structure.
->>> fs3 = yaml.load("""
+>>> fs3 = featstruct.parse("""
 ... A: 'a'
 ... B: &1
 ... C: 'c'
 ... D: *1
 ... E: *1
 ... """)
->>> print show(fs3)
+>>> print fs3
 A: a
 B: &id001
 C: c
@@ -162,12 +168,12 @@ There can be any number of tags within a single feature structure.
 E: *id001
->>> fs1 = yaml.load("""
+>>> fs1 = featstruct.parse("""
 ... NUMBER: 74
 ... STREET: 'rue Pascal'
 ... """)
->>> fs2 = yaml.load("CITY: Paris")
+>>> fs2 = featstruct.parse("CITY: Paris")
->>> print show(unify(fs1, fs2))
+>>> print unify(fs1, fs2)
 CITY: Paris
 NUMBER: 74
 STREET: rue Pascal
@@ -179,7 +185,7 @@ Unification is symmetric:
 Unification is commutative:
->>> fs3 = yaml.load("TELNO: 01 27 86 42 96")
+>>> fs3 = featstruct.parse("TELNO: 01 27 86 42 96")
 >>> unify(unify(fs1, fs2), fs3) == unify(fs1, unify(fs2, fs3))
 True
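The doctest line in this hunk compares two groupings of the same three structures, which is a check of associativity; swapping the argument order would check commutativity. Both properties can be tried out on flat structures with the sketch below. The `unify_flat` function is hypothetical, handles only non-nested dicts, and its convention of returning `None` on failure is an assumption of this sketch rather than a statement about NLTK.

```python
def unify_flat(fs1, fs2):
    """Unify two flat feature dicts; return None on a clash or a failed input."""
    if fs1 is None or fs2 is None:
        return None
    for feature in fs1.keys() & fs2.keys():
        if fs1[feature] != fs2[feature]:
            return None
    return {**fs1, **fs2}


fs1 = {"NUMBER": 74, "STREET": "rue Pascal"}
fs2 = {"CITY": "Paris"}
fs3 = {"TELNO": "01 27 86 42 96"}

# Grouping does not matter (associativity).
assoc = unify_flat(unify_flat(fs1, fs2), fs3) == unify_flat(fs1, unify_flat(fs2, fs3))
# Argument order does not matter (commutativity).
comm = unify_flat(fs1, fs2) == unify_flat(fs2, fs1)
print(assoc, comm)
```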
@@ -200,7 +206,7 @@ this is implemented by setting the result of unification to be
 Now, if we look at how unification interacts with structure-sharing,
 things become really interesting.
->>> fs0 = yaml.load("""
+>>> fs0 = featstruct.parse("""
 ... NAME: Lee
 ... ADDRESS:
 ... NUMBER: 74
@@ -211,7 +217,7 @@ things become really interesting.
 ... NUMBER: 74
 ... STREET: 'rue Pascal'
 ... """)
->>> print show(fs0)
+>>> print fs0
 ADDRESS:
 NUMBER: 74
 STREET: rue Pascal
@@ -222,12 +228,12 @@ things become really interesting.
 STREET: rue Pascal
 NAME: Kim
->>> fs1 = yaml.load("""
+>>> fs1 = featstruct.parse("""
 ... SPOUSE:
 ... ADDRESS:
 ... CITY: Paris
 ... """)
->>> print show(unify(fs0, fs1))
+>>> print unify(fs0, fs1))
 ADDRESS:
 NUMBER: 74
 STREET: rue Pascal
@@ -239,7 +245,7 @@ things become really interesting.
 STREET: rue Pascal
 NAME: Kim
->>> fs0 = yaml.load("""
+>>> fs0 = featstruct.parse("""
 ... NAME: Lee
 ... ADDRESS: &1
 ... NUMBER: 74
@@ -248,7 +254,7 @@ things become really interesting.
 ... NAME: Kim
 ... ADDRESS: *1
 ... """)
->>> print show(fs0)
+>>> print fs0
 ADDRESS: &id001
 NUMBER: 74
 STREET: rue Pascal
@@ -257,7 +263,7 @@ things become really interesting.
 ADDRESS: *id001
 NAME: Kim
->>> print show(unify(fs0, fs1))
+>>> print unify(fs0, fs1)
 ADDRESS: &id001
 CITY: Paris
 NUMBER: 74
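The difference between the copied and the shared (`&1` / `*1`) versions of `fs0` comes down to object identity. The sketch below mimics it with plain dicts: in the shared case both `ADDRESS` paths point at one Python object, so information added through the spouse's address also appears on the top-level address. The layout of the structure (a `SPOUSE` feature holding Kim's entry) is inferred from `fs1` and is an assumption, as is the `add_city` helper, which stands in for what unification with `fs1` contributes.

```python
address = {"NUMBER": 74, "STREET": "rue Pascal"}

# Shared: both ADDRESS features refer to the same dict object.
shared = {"NAME": "Lee", "ADDRESS": address,
          "SPOUSE": {"NAME": "Kim", "ADDRESS": address}}

# Copied: two equal but distinct dict objects.
copied = {"NAME": "Lee", "ADDRESS": dict(address),
          "SPOUSE": {"NAME": "Kim", "ADDRESS": dict(address)}}


def add_city(fs, city):
    """Add CITY under SPOUSE -> ADDRESS in place (hypothetical helper)."""
    fs["SPOUSE"]["ADDRESS"]["CITY"] = city


add_city(shared, "Paris")
add_city(copied, "Paris")

print(shared["ADDRESS"].get("CITY"))  # visible through the shared object
print(copied["ADDRESS"].get("CITY"))  # absent: only the spouse's copy changed
```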
@@ -267,18 +273,18 @@ things become really interesting.
 ADDRESS: *id001
 NAME: Kim
->>> fs1 = yaml.load("""
+>>> fs1 = featstruct.parse("""
 ... ADDRESS1:
 ... NUMBER: 74
 ... STREET: 'rue Pascal'
 ... """)
->>> fs2 = yaml.load("""
+>>> fs2 = featstruct.parse("""
 ... ADDRESS1: ?x
 ... ADDRESS2: ?x
 ... """)
->>> print show(unify(fs1, fs2))
+>>> print unify(fs1, fs2)
 ADDRESS1: &id001
 NUMBER: 74
 STREET: rue Pascal
@@ -287,7 +293,7 @@ things become really interesting.
 >>> sent = 'who do you claim that you like'
 >>> tokens = list(tokenize.whitespace(sent))
->>> cp = load_earley('feat1.cfg', trace=1)
+>>> cp = parse.load_earley('feat1.cfg', trace=1)
 >>> trees = cp.parse(tokens)
 |.w.d.y.c.t.y.l.|
 Scanner |[-] . . . . . .| NP[+WH] -> who * {}
@@ -332,7 +338,7 @@ things become really interesting.
 Let's load a German grammar:
->>> cp = load_earley('german0.cfg', trace=0)
+>>> cp = parse.load_earley('german0.cfg', trace=0)
 >>> sent = 'die katze sieht den hund'
 >>> tokens = list(tokenize.whitespace(sent))
 >>> trees = cp.parse(tokens)
@@ -349,7 +355,7 @@ Let's load a German grammar:
 First attempt at doing semantics with features:
->>> cp = load_earley('sem3.cfg', trace=2)
+>>> cp = parse.load_earley('sem3.cfg', trace=2)
 >>> sent = 'Kim barks'
 >>> tokens = list(tokenize.whitespace(sent))
 >>> trees = cp.parse(tokens)