*.universal.mention.txt for the original source file, matched against Freebase
relations
*.universal.txt for matrix factorization
dev is a subset of test (one half of it)
*.ds.txt for distant supervision
*.gz for Mihai's MIML code

data used in NAACL2013:
train file: nyt-freebase.train.triples.universal.txt
test file: nyt-freebase.dev.universal.txt


POSITIVE: the entity pair can be found in Freebase, and its Freebase
relations are added alongside the surface patterns.
Example:
POSITIVE        Edward M. Liddy Allstate        lc#'' said      rc#, said
REL$/business/person/company    trigger#executive
path#appos|->appos->executive->prep->of->pobj->|pobj
    REL$/business/company_shareholder/major_shareholder_of   
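
A minimal sketch of reading one such row in Python. It assumes the fields are
tab-separated (label, first entity, second entity, then features) and that the
example above is a single line in the data file. The function name parse_row
and the returned dictionary keys are illustrative and not part of the released
code.

```python
def parse_row(line):
    """Split one data row into label, entity pair, patterns and relations.

    Assumption: fields are tab-separated, in the order
    label, entity 1, entity 2, feature, feature, ...
    """
    # Drop the trailing newline and any empty fields left by stray tabs.
    fields = [f for f in line.rstrip("\n").split("\t") if f]
    label, arg1, arg2 = fields[:3]
    features = fields[3:]
    # Freebase relations carry the REL$ prefix; everything else is a
    # surface pattern (lc#, rc#, trigger#, path#, ...).
    relations = [f[len("REL$"):] for f in features if f.startswith("REL$")]
    patterns = [f for f in features if not f.startswith("REL$")]
    return {
        "label": label,
        "pair": (arg1, arg2),
        "patterns": patterns,
        "relations": relations,
    }
```

For the example row, this would give the pair ('Edward M. Liddy', 'Allstate'),
four surface patterns and two Freebase relations.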

NEGATIVE: both entities of the entity pair occur in Freebase, but Freebase
does not state a relation between them, so no relations from Freebase are
added.
We use 'REL$NA' for these pairs in our data.

UNLABELED: at least one of the entities in the entity pair does not occur in
Freebase. This means we cannot find an exact string match from the entity
to a Freebase entity.

In our NAACL paper, for training we use all data (POSITIVE+NEGATIVE+UNLABELED).
For testing we only use POSITIVE+NEGATIVE data, and we hide the Freebase labels
for the POSITIVE rows. For final evaluation, we pick a subset of these entity
pairs.
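
The test-time setup can be sketched as follows. This is an illustration only,
not the code used for the paper. It assumes tab-separated rows with the label
in the first field, treats every REL$ field as hidden, and keeps 'REL$NA' out
of the gold labels. The function name prepare_test_rows is hypothetical.

```python
def prepare_test_rows(lines):
    """Yield (observed_fields, gold_relations) for POSITIVE/NEGATIVE rows.

    Assumption: rows are tab-separated as label, entity 1, entity 2,
    followed by features. UNLABELED rows are skipped for testing.
    """
    for line in lines:
        fields = [f for f in line.rstrip("\n").split("\t") if f]
        if not fields or fields[0] not in ("POSITIVE", "NEGATIVE"):
            continue
        features = fields[3:]
        # Gold labels: Freebase relations, with the REL$NA marker excluded.
        gold = [f for f in features
                if f.startswith("REL$") and f != "REL$NA"]
        # Observed input: the entity pair plus surface patterns only,
        # so the Freebase labels stay hidden from the model.
        observed = fields[1:3] + [f for f in features
                                  if not f.startswith("REL$")]
        yield observed, gold
```

Training would instead read all rows, including UNLABELED ones, with their
REL$ fields left in place.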