*.universal.mention.txt   the original source file, matched against Freebase relations
*.universal.txt           for matrix factorization
                          (dev is part of test: dev is one half of test)
*.ds.txt                  for distant supervision
*.gz                      for Mihai's MIML code

Data used in NAACL 2013:
train file: nyt-freebase.train.triples.universal.txt
test file:  nyt-freebase.dev.universal.txt

POSITIVE: the entity pair can be found in Freebase, and the Freebase relations are added alongside the surface patterns.
Example:
POSITIVE Edward M. Liddy Allstate lc#'' said rc#, said REL$/business/person/company trigger#executive path#appos|->appos->executive->prep->of->pobj->|pobj REL$/business/company_shareholder/major_shareholder_of

NEGATIVE: both entities of the entity pair occur in Freebase, but Freebase does not list a relation between them, so no Freebase relations are added. We mark this case with 'REL$NA' in our data.

UNLABELED: at least one of the entities in the entity pair does not occur in Freebase, i.e. we cannot find an exact string match from the entity to a Freebase entity.

In our NAACL paper:
for training, we use all data: POSITIVE+NEGATIVE+UNLABELED.
for test, we use only POSITIVE+NEGATIVE data, and we hide the Freebase labels for the POSITIVE rows. For final evaluation, we pick a subset of these entity pairs.
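
The row format and the train/test rules described above can be sketched in Python as follows. This is a minimal sketch, not the code used for the paper. It assumes the fields of a row are tab-separated (label, first entity, second entity, then features), which this description does not state explicitly. The names Row, parse_row, hide_freebase_labels and select_rows are hypothetical helpers introduced only for illustration. The sample row keeps only the trigger#, path# and REL$ features of the example above.

```python
from dataclasses import dataclass, field
from typing import Iterable, List

REL_PREFIX = "REL$"  # prefix of Freebase relation features, e.g. REL$NA


@dataclass
class Row:
    label: str  # POSITIVE, NEGATIVE or UNLABELED
    arg1: str   # first entity of the pair
    arg2: str   # second entity of the pair
    patterns: List[str] = field(default_factory=list)   # surface patterns
    relations: List[str] = field(default_factory=list)  # REL$... features


def parse_row(line: str, sep: str = "\t") -> Row:
    """Split one row into label, entity pair, patterns and relations.

    ASSUMPTION: fields are separated by `sep` (tab by default).
    """
    fields = [f for f in line.rstrip("\n").split(sep) if f]
    if len(fields) < 3:
        raise ValueError(f"expected label and two entities, got: {line!r}")
    label, arg1, arg2, *features = fields
    patterns = [f for f in features if not f.startswith(REL_PREFIX)]
    relations = [f for f in features if f.startswith(REL_PREFIX)]
    return Row(label, arg1, arg2, patterns, relations)


def hide_freebase_labels(row: Row) -> Row:
    """Return a copy with Freebase relations removed for POSITIVE rows."""
    if row.label != "POSITIVE":
        return row
    return Row(row.label, row.arg1, row.arg2, list(row.patterns), [])


def select_rows(rows: Iterable[Row], split: str) -> List[Row]:
    """Training uses all rows; test uses only POSITIVE and NEGATIVE rows."""
    if split == "train":
        return list(rows)
    if split == "test":
        return [r for r in rows if r.label in ("POSITIVE", "NEGATIVE")]
    raise ValueError(f"unknown split: {split!r}")


# Sample row built from the example above (tab-separated by assumption).
sample = "\t".join([
    "POSITIVE", "Edward M. Liddy", "Allstate",
    "REL$/business/person/company",
    "trigger#executive",
    "path#appos|->appos->executive->prep->of->pobj->|pobj",
])
row = parse_row(sample)
hidden = hide_freebase_labels(row)
```

Relations are identified purely by the REL$ prefix, so 'REL$NA' on a NEGATIVE row ends up in `relations` as well; whether it should be treated as a label or as the absence of one is left to the caller.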